home *** CD-ROM | disk | FTP | other *** search
Text File | 1991-03-29 | 36.9 KB | 1,004 lines |
- To: oesterle@wpi.wpi.edu
- Subject: Re: Binary Tree Re-balancing
- Newsgroups: comp.lang.c
- In-Reply-To: <10403@wpi.wpi.edu>
- Organization: Harris Computer Systems, Fort Lauderdale, FL
- Cc:
- Bcc:
-
- In article <10403@wpi.wpi.edu> you write:
- >
- >Greetings!
- >
- > I would like to make a request for some references or C code
- >for balancing binary trees after insertion/deletion. I have been
- >looking at two books (_Algorithms + Data Structures = Programs_,
- >by Nicklaus Wirth; and _Algorithms: Their Complexity and
- >Efficiency_ by Lydia Kronsj:o). Both the books' code is in
- >PASCAL; I would more or less want to compare to see if I am
- >interpreting the code correctly, since both books implement the
- >balancing differently and I still want to implement it
- >differently than both of the books. I plan to keep the balancing
- >information in structures for speed. Simplicity is much
- >desirable, recursive is great.
- >
- I have spent a couple of years doing exactly this. I have implemented
- what I feel is a very elegant and easy to understand solution and the
- result is a C library for handling AVL trees as an abstract type. If
- you want me to mail my code, then let me know! I will give a bit of a
- discussion though.
-
- First of all, I use a balance factor that ranges from -2 .. 2,
- Many texts Ive seen use -1 .. 1 but I feel -2 .. 2 is easier to
- understand and provides for more simple balance updating.
- It is also UNNECESSARY to use separate routines to do left rotations,
- and right rotations, one routine that takes the direction as an
- extra parameter will do.
-
- CALCULATING NEW BALANCES AFTER A ROTATION:
- ==========================================
- To calculate the new balances after a single left rotation; assume we have
- the following case:
-
- A B
- / \ / \
- / \ / \
- a B ==> A c
- / \ / \
- / \ / \
- b c a b
-
-
- The left is what the tree looked like BEFORE the rotation and the right
- is what the tree looks like after the rotation. Capital letters are used
- to denote single nodes and lowercase letters are used to denote subtrees.
-
- The "balance" of a tree is the height of its right subtree less the
- height of its left subtree. Therefore, we can calculate the new balances
- of "A" and "B" as follows (ht is the height function):
-
- NewBal(A) = ht(b) - ht(a)
-
- OldBal(A) = ht(B) - ht(a)
- = ( 1 + max (ht(b), ht(c)) ) - ht(a)
-
-
- subtracting the second equation from the first yields:
-
-
- NewBal(A) - OldBal(A) = ht(b) - ( 1 + max (ht(b), ht(c)) )
- + ht(a) - ht(a)
-
-
- canceling out the ht(a) terms and adding OldBal(A) to both sides yields:
-
-
- NewBal(A) = OldBal(A) - 1 - (max (ht(b), ht(c)) - ht(b) )
-
-
- Noting that max(x, y) - z = max(x-z, y-z), we get:
-
-
- NewBal(A) = OldBal(A) - 1 - (max (ht(b) - ht(b), ht(c) - ht(b)) )
-
-
- But ht(c) - ht(b) is OldBal(B) so we get:
-
-
- NewBal(A) = OldBal(A) - 1 - (max (0, OldBal(B)) )
- = OldBal(A) - 1 - max (0, OldBal(B))
-
- Thus, for A, we get the equation:
-
- NewBal(A) = OldBal(A) - 1 - max (0, OldBal(B))
-
- To calculate the Balance for B we perform a similar computation:
-
- NewBal(B) = ht(c) - ht(A)
- = ht(c) - (1 + max(ht(a), ht(b)) )
-
- OldBal(B) = ht(c) - ht(b)
-
-
- subtracting the second equation from the first yields:
-
-
- NewBal(B) - OldBal(B) = ht(c) - ht(c)
- + ht(b) - (1 + max(ht(a), ht(b)) )
-
-
- canceling, and adding OldBal(B) to both sides gives:
-
-
- NewBal(B) = OldBal(B) - 1 - (max(ht(a), ht(b)) - ht(b))
- = OldBal(B) - 1 - (max(ht(a) - ht(b), ht(b) - ht(b))
-
-
- But ht(a) - ht(b) is - (ht(b) - ht(a)) = -NewBal(A), so ...
-
-
- NewBal(B) = OldBal(B) - 1 - max( -NewBal(A), 0)
-
-
- Using the fact that min(x,y) = -max(-x, -y) we get:
-
-
- NewBal(B) = OldBal(B) - 1 + min( NewBal(A), 0)
-
-
- So, for a single left rotation we have shown the the new balances
- for the nodes A and B are given by the following equations:
-
- NewBal(A) = OldBal(A) - 1 - max(OldBal(B), 0)
- NewBal(B) = OldBal(B) - 1 + min(NewBal(A), 0)
-
- Now let us look at the case of a single right rotation. The case
- we will use is the same one we used for the single left rotation
- only with all the left and right subtrees switched around so that
- we have the mirror image of the case we used for our left rotation.
-
-
- A B
- / \ / \
- / \ / \
- B a ==> c A
- / \ / \
- / \ / \
- c b b a
-
-
- If we perform the same calculations that we made for the left rotation,
- we will see that the new balances for a single right rotation are given
- by the following equations:
-
- NewBal(A) = OldBal(A) + 1 - min(OldBal(B), 0)
- NewBal(B) = OldBal(B) + 1 + max(NewBal(A), 0)
-
- Hence, the C code for single left and right rotations would be:
-
- #define LEFT 0
- #define RIGHT 1
-
- #define MAX(a,b) ( (a) > (b) ? (a) : (b) )
- #define MIN(a,b) ( (a) < (b) ? (a) : (b) )
-
- typedef struct avl_node {
- DATA_TYPE data;
- short bal;
- struct avl_node *subtree[2];
- } AVL_NODE, *AVL_TREE; /* avl_node */
-
- void rotate_left (tree)
- AVL_TREE tree;
- {
- AVL_TREE old_root = tree;
-
- /* perform rotation */
- tree = tree->subtree[RIGHT];
- old_root->subtree[RIGHT] = tree->subtree[LEFT];
- tree->subtree[LEFT] = old_root;
-
- /* update balances */
- old_root->bal -= ( 1 + MAX(tree->bal, 0) );
- tree->bal -= ( 1 - MIN(old_root->bal, 0) );
- }/* rotate_left */
-
-
- void rotate_right (tree)
- AVL_TREE tree;
- {
- AVL_TREE old_root = tree;
-
- /* perform rotation */
- tree = tree->subtree[LEFT];
- old_root->subtree[LEFT] = tree->subtree[RIGHT];
- tree->subtree[RIGHT] = old_root;
-
- /* update balances */
- old_root->bal += ( 1 - MIN(tree->bal, 0) );
- tree->bal += ( 1 + MAX(old_root->bal, 0) );
- }/* rotate_right */
-
- We can make this code more compact however by using only ONE
- rotate routine which takes an additional parameter: the direction
- in which to rotate. Notice that I have defined LEFT, and RIGHT to
- be mnemonic constants to index into an array of subtrees. I can
- pass the constant LEFT or RIGHT to the rotation routine and it can
- calculate the direction opposite the given direction by subtracting
- the given direction from the number one.
-
- It does not matter whether LEFT is 0 or RIGHT is 0 as long as one
- of them is 0 and the other is 1. If this is the case, then:
-
- 1 - LEFT = RIGHT
- and
- 1 - RIGHT = LEFT
-
- Using this and the same type definitions as before (and the same
- macros), the C source for a single rotation becomes:
-
- #define OTHER_DIR(x) ( 1 - (x) )
-
- typedef short DIRECTION
-
- void rotate (tree, dir)
- AVL_TREE tree;
- DIRECTION dir;
- {
- AVL_TREE old_root = tree;
- DIRECTION other_dir = OTHER_DIR(dir);
-
- /* rotate */
- tree = tree->subtree[other_dir];
- old_root->subtree[other_dir] = tree->subtree[dir];
- tree->subtree[dir] = old_root;
-
- /* update balances */
- if ( dir == LEFT ) {
- old_root->bal -= ( 1 + MAX(tree->bal, 0) );
- tree->bal -= ( 1 - MIN(old_root->bal, 0) );
- }/* if */
-
- else /* dir == RIGHT */ {
- old_root->bal += ( 1 - MIN(tree->bal, 0) );
- tree->bal += ( 1 + MAX(old_root->bal, 0) );
- }/* else */
-
- }/* rotate */
-
- We can compact this code even further if we play around with the
- equations for updating the balances. Let us use the fact that
- max(x,y) = -min(-x,-y):
-
- for a left rotation
- ------------------------------------------------
- old_root->bal -= ( 1 + MAX(tree->bal, 0) );
- tree->bal -= ( 1 - MIN(old_root->bal, 0) );
-
-
- for a right rotation
- ------------------------------------------------
- old_root->bal += ( 1 - MIN(tree->bal, 0) );
- tree->bal += ( 1 + MAX(old_root->bal, 0) );
-
- Using the above rule to change all occurrences of "MIN" to "MAX"
- these equations become:
-
- for a left rotation
- ------------------------------------------------
- old_root->bal -= ( 1 + MAX( +(tree->bal), 0) );
- tree->bal -= ( 1 + MAX( -(old_root->bal), 0) );
-
-
- for a right rotation
- ------------------------------------------------
- old_root->bal += ( 1 + MAX( -(tree->bal), 0) );
- tree->bal += ( 1 + MAX( +(old_root->bal), 0) );
-
-
- Note that the difference between updating the balances for our right
- and left rotations is only the occurrence of a '+' where we would
- like to see a '-' in the assignment operator, and the sign of the
- first argument to the MAX macro. If we had a function that would
- map LEFT to +1 and RIGHT to -1 we could multiply by the result
- of that function to update our balances. Such a function is
-
- f(x) = 1 - 2x
-
- "f" maps 0 to 1 and maps 1 to -1. This function will NOT map LEFT
- and RIGHT to the same value regardless of which is 1 and which is
- 0 however. If we wish our function to have this property then we
- can multiply (1 - 2x) by (RIGHT - LEFT) so that the result "switches"
- signs accordingly depending upon whether LEFT is 0 or RIGHT is 0.
- This defines a new function "g":
-
- g(x) = (1 - 2x)(RIGHT - LEFT)
-
- If LEFT = 0 and RIGHT = 1 then:
-
- g(LEFT) = (1 - 2*0)(1 - 0) = 1*1 = 1
- g(RIGHT) = (1 - 2*1)(1 - 0) = (-1)*1 = -1
-
- If LEFT = 1 and RIGHT = 0 then:
-
- g(LEFT) = (1 - 2*1)(0 - 1) = (-1)*(-1) = 1
- g(RIGHT) = (1 - 2*0)(0 - 1) = 1*(-1) = -1
-
- So, as desired, the function "g" maps LEFT to +1 and RIGHT to -1
- regardless of which is 0 and which is 1.
-
- Now, if we introduce a new variable called "factor" and assign
- it the value "g(dir)", we may update the balances in our rotation
- routine without using a conditional statement:
-
- for a rotation in the "dir" direction
- ------------------------------------------------------------
- old_root->bal -= factor * ( 1 + MAX( factor * tree->bal, 0) );
- tree->bal += factor * ( 1 + MAX( factor * old_root->bal, 0) );
-
- Using this, the new code for our rotation routine becomes:
-
- #define OTHER_DIR(x) ( 1 - (x) )
-
- typedef short DIRECTION
-
- void rotate (tree, dir)
- AVL_TREE tree;
- DIRECTION dir;
- {
- AVL_TREE old_root = tree;
- DIRECTION other_dir = OTHER_DIR(dir);
- short factor = (RIGHT - LEFT) * (1 - (2 * dir));
-
- /* rotate */
- tree = tree->subtree[other_dir];
- old_root->subtree[other_dir] = tree->subtree[dir];
- tree->subtree[dir] = old_root;
-
- /* update balances */
- old_root->bal -= factor * ( 1 + MAX( factor * tree->bal, 0) );
- tree->bal += factor * ( 1 + MAX( factor * old_root->bal, 0) );
-
- }/* rotate */
-
- However, although this second version of "rotate" is more compact and
- doesn't require the use of a conditional test on the variable "dir",
- It may actually run slower than our first version of "rotate" because
- the time required to make the "test" may well be less than the time
- required to perform the additional multiplications and subtractions.
-
- Now a double rotation can be implemented as a series of single rotations:
-
- /*
- * rotate_twice() -- rotate a given node in the given direction
- * and then in the opposite direction
- * to restore the balance of a tree
- */
- void rotate_twice( rootp, dir )
- AVLtree *rootp;
- DIRECTION dir;
- {
- DIRECTION other_dir = OPPOSITE( dir );
-
- rotate_once( &( (*rootp) -> subtree[ other_dir ] ), other_dir );
- rotate_once( rootp, dir );
-
- }/* rotate_twice */
-
-
- ANOTHER METHOD FOR CALCULATING BALANCES AFTER ROTATION:
- =======================================================
- One may use a different method than the one described above which
- is perhaps simpler. Note however that the method for updating balances
- described above works regardless of what numbers the balance factor
- may contain (as long as they are correct -- it works, no matter how
- imbalanced). If we take into account some of conditions that cause
- a rotation, we have more information to work with (such that the
- node to be rotated has a balance of +2 or -2 etc..)
-
- For a single LL rotation we have one of two possibilities:
-
-
- A B
- / \ / \
- / \ / \
- a B ==> A c
- / \ / \
- / \ / \
- b c a b
-
- ==============================================================
- BALANCE FACTORS
- BEFORE ROTATION AFTER ROTATION
- ------------------ ----------------
- case 1) A = +2 ; B = +1 A = 0 ; B = 0
- case 2) A = +2 ; B = 0 A = +1 ; B = -1
- ==============================================================
-
- so in either case NewB = OldB -1 and newA = -newB so we get
- A = - (--B) for a single left rotation.
-
- For a single RR rotation the possibilities are (The picture is a
- mirror image (swap all right and left kids of each node) of the LL one)
- ==============================================================
- BALANCE FACTORS
- BEFORE ROTATION AFTER ROTATION
- ------------------ ----------------
- case 1) A = -2 ; B = -1 A = 0 ; B = 0
- case 2) A = -2 ; B = 0 A = -1 ; B = +1
- ==============================================================
-
- so in either case NewB = OldB +1 and newA = -newB so we get
- A = - (++B) for a single left rotation.
-
- This means that we can use the following to update balances:
-
- /* Directional Definitions */
- typedef short DIRECTION;
- #define LEFT 0
- #define RIGHT 1
- #define OPPOSITE(x) ( 1 - (x) )
-
-
- /* return codes used by avl_insert(), avl_delete(), and balance() */
- #define HEIGHT_UNCHANGED 0
- #define HEIGHT_CHANGED 1
-
-
- /*
- * rotate_once() -- rotate a given node in the given direction
- * to restore the balance of a tree
- */
- short rotate_once( rootp, dir )
- AVLtree *rootp;
- DIRECTION dir;
- {
- DIRECTION other_dir = OPPOSITE( dir ); /* opposite direction to "dir" */
- AVLtree old_root = *rootp; /* copy of original root of tree */
- short ht_unchanged; /* true if height unchanged */
-
- /* Here we need to take into account the special case that occurs when a
- ** single rotation was made but the HEIGHT of the rotated tree did NOT
- ** change as a result of the rotation (we will need this later)
- */
- ht_unchanged = ( (*rootp) -> subtree[ other_dir ] -> bal ) ? FALSE : TRUE;
-
- /* assign new root */
- *rootp = old_root -> subtree[ other_dir ];
-
- /* new-root exchanges it's "dir" subtree for it's parent */
- old_root -> subtree[ other_dir ] = (*rootp) -> subtree[ dir ];
- (*rootp) -> subtree[ dir ] = old_root;
-
- /* update balances */
- old_root -> bal = -( dir == LEFT ? --( (*rootp) -> bal )
- : ++( (*rootp) -> bal ) );
-
- return ht_unchanged;
- }/* rotate_once */
-
- We get an even nicer scenario when we look at LR and RL rotations.
- For a double LR rotation we have one of three possibilities:
-
-
- A B
- / \ / \
- / \ / \
- a C ==> A C
- / \ / \ / \
- / \ / | | \
- B c a b1 b2 c
- / \
- / \
- b1 b2
-
- ==============================================================
- BALANCE FACTORS
- BEFORE ROTATION AFTER ROTATION
- ------------------------ -----------------------
- A = +2 ; C = -1 ; B = +1 A = -1 ; B = 0 ; C = 0
- A = +2 ; C = -1 ; B = 0 A = 0 ; B = 0 ; C = 0
- A = +2 ; C = -1 ; B = -1 A = 0 ; B = 0 ; C =+1
- ==============================================================
-
- So we get, in all three cases:
-
- newA = -max( oldB, 0 )
- newC = -min( oldB, 0 )
- newB = 0
-
- Now for a double RL rotation we have the following possibilities (again, the
- picture is the mirror image of the LR case):
-
- ==============================================================
- BALANCE FACTORS
- BEFORE ROTATION AFTER ROTATION
- ------------------------ -----------------------
- A = -2 ; C = +1 ; B = +1 A = -1 ; B = 0 ; C = 0
- A = -2 ; C = +1 ; B = 0 A = 0 ; B = 0 ; C = 0
- A = -2 ; C = +1 ; B = -1 A = 0 ; B = 0 ; C =+1
- ==============================================================
-
- So we get, in all three cases:
-
- newA = -max( oldB, 0 )
- newC = -min( oldB, 0 )
- newB = 0
-
- This is EXACTLY what we had for the LR case (isnt that nice!!!) so now we can
- code up a double rotation as follows:
-
- /*
- * rotate_twice() -- rotate a given node in the given direction
- * and then in the opposite direction
- * to restore the balance of a tree
- */
- void rotate_twice( rootp, dir )
- AVLtree *rootp;
- DIRECTION dir;
- {
- DIRECTION other_dir = OPPOSITE( dir );
- AVLtree old_root = *rootp,
- old_other_dir_subtree = (*rootp) -> subtree[ other_dir ];
-
- /* assign new root */
- *rootp = old_root -> subtree[ other_dir ] -> subtree[ dir ];
-
- /* new-root exchanges it's "dir" subtree for it's grandparent */
- old_root -> subtree[ other_dir ] = (*rootp) -> subtree[ dir ];
- (*rootp) -> subtree[ dir ] = old_root;
-
- /* new-root exchanges it's "other-dir" subtree for it's parent */
- old_other_dir_subtree -> subtree[ dir ] = (*rootp) -> subtree[ other_dir ];
- (*rootp) -> subtree[ other_dir ] = old_other_dir_subtree;
-
- /* update balances */
- (*rootp) -> subtree[ LEFT ] -> bal = -MAX( (*rootp) -> bal, 0 );
- (*rootp) -> subtree[ RIGHT ] -> bal = -MIN( (*rootp) -> bal, 0 );
- (*rootp) -> bal = 0;
-
- }/* rotate_twice */
-
- Now isnt that special! Now that we have the rotation routines written,
- we just need to worry about when to call them. One help is a routine called
- balance which is called when a node gets too heavy on a particular side.
- Here we need to take into account the special case that occurs when a
- single rotation was made but the HEIGHT of the rotated tree did NOT change
- as a result of the rotation (discussed in more detail in the next section):
-
-
- /* return codes used by avl_insert(), avl_delete(), and balance() */
- #define HEIGHT_UNCHANGED 0
- #define HEIGHT_CHANGED 1
-
- /* Balance Definitions */
- #define LEFT_HEAVY -1
- #define BALANCED 0
- #define RIGHT_HEAVY 1
- #define LEFT_IMBALANCE(nd) ( (nd)->bal < LEFT_HEAVY )
- #define RIGHT_IMBALANCE(nd) ( (nd)->bal > RIGHT_HEAVY )
-
- /*
- * balance() -- determines and performs the sequence of rotations needed
- * (if any) to restore the balance of a given tree.
- *
- * Returns 1 if tree height changed due to rotation; 0 otherwise
- */
- short balance( rootp )
- AVLtree *rootp;
- {
- short special_case = FALSE;
-
- if ( LEFT_IMBALANCE( *rootp ) ) { /* need a right rotation */
- if ( (*rootp) -> subtree[ LEFT ] -> bal == RIGHT_HEAVY )
- rotate_twice( rootp, RIGHT ); /* double RL rotation needed */
-
- else /* single RR rotation needed */
- special_case = rotate_once( rootp, RIGHT );
- }/* if */
-
- else if ( RIGHT_IMBALANCE( *rootp ) ) { /* need a left rotation */
- if ( (*rootp) -> subtree[ RIGHT ] -> bal == LEFT_HEAVY )
- rotate_twice( rootp, LEFT ); /* double LR rotation needed */
-
- else /* single LL rotation needed */
- special_case = rotate_once( rootp, LEFT );
- }/* elif */
-
- else return HEIGHT_UNCHANGED; /* no rotation occurred */
-
- return ( special_case ) ? HEIGHT_UNCHANGED : HEIGHT_CHANGED;
- }/* balance */
-
- This routine helps but now comes the hard part (IMHO at least), figuring
- out when the height of the current subtree has changed.
-
- CALCULATING WHEN THE HEIGHT OF THE CURRENT SUBTREE HAS CHANGED:
- ===============================================================
- After we have inserted or deleted a node from the current subtree, we
- need to determine if the total height of the current tree has changed
- so that we may pass the information up the recursion stack to previous
- instantiations of the insertion and deletion routines.
-
- Let us first consider the case of an insertion. The simplest case is
- at the point where the insertion occurred. Since we have created a node
- where one did not previously exist, we have increased the height of the
- inserted node from 0 to 1. Therefore we need to pass the value 1 (I will
- use "1" for TRUE and "0" for FALSE) back to the previous level of
- recursion to indicate the increase in the height of the current subtree.
-
- | after insertion |
- NULL ================> |
- A
-
-
- The remaining cases for an insertion are almost as simple. If a 0 (FALSE)
- was the "height-change-indicator" passed back by inserting into a subtree
- of the current level, then there is no height change at the current level.
- It is true that the structure of one of the subtrees may have changed due
- to an insertion and/or rotation, but since the height of the subtree did
- not change, neither did the height of the current level.
-
- | after insertion |
- | ================> |
- A A
- / \ / \
- / \ / \
- b c b d
-
- If the current level is balanced after inserting the node (but before
- attempting any rotations) then we just made one subtree equal in height
- to the other. Therefore the overall height of the current level remains
- unchanged and a 0 is returned.
-
- | after insertion |
- | ================> |
- A A
- / / \
- / / \
- b b c
-
- Before we go and write avl_insert, we will need some help from a few other
- small routines, most of which are needed for deletion more than for insertion.
- These routines will free/create a node, and tell the type of a node.
-
- /* ckalloc( size ) -- allocate space; check for success */
- void *ckalloc( size )
- int size;
- {
- void *malloc(), *ptr;
-
- if ( (ptr = (char *) malloc( (unsigned) size)) == NULL ) {
- fprintf( stderr, "Unable to allocate storage." );
- exit( 1 );
- }/* if */
-
- return ptr;
- }/* ckalloc */
-
-
- /*
- * new_node() -- get space for a new node and its data;
- * return the address of the new node
- */
- AVLtree new_node( data, size )
- void *data;
- unsigned size;
- {
- AVLtree root;
-
- root = (AVLtree) ckalloc( sizeof (AVLnode) );
- root -> data = (void *) ckalloc( size );
- memmove( root -> data, data, size );
- root -> bal = BALANCED;
- root -> subtree[ LEFT ] = root -> subtree[ RIGHT ] = NULL_TREE;
-
- return root;
- }/* new_node */
-
-
- /*
- * free_node() -- free space for a node and its data!
- * reset the node pointer to NULL
- */
- void free_node( rootp )
- AVLtree *rootp;
- {
- free( (void *) *rootp );
- *rootp = NULL_TREE;
- }/* free_node */
-
-
- /*
- * node_type() -- determine the number of null pointers for a given
- * node in an AVL tree, Returns a value of type NODE
- * which is an enumeration type with the following values:
- *
- * IS_TREE -- both subtrees are non-empty
- * IS_LBRANCH -- left subtree is non-empty; right is empty
- * IS_RBRANCH -- right subtree is non-empty; left is empty
- * IS_LEAF -- both subtrees are empty
- * IS_NULL -- given tree is empty
- */
- NODE node_type( tree )
- AVLtree tree;
- {
- if ( tree == NULL_TREE )
- return IS_NULL;
-
- else if ( (tree -> subtree[ LEFT ] != NULL_TREE)
- && (tree -> subtree[ RIGHT ] != NULL_TREE) )
- return IS_TREE;
-
- else if ( tree -> subtree[ LEFT ] != NULL_TREE )
- return IS_LBRANCH;
-
- else if ( tree -> subtree[ RIGHT ] != NULL_TREE )
- return IS_RBRANCH;
-
- else
- return IS_LEAF;
- }/* node_type */
-
- Now the last helper routines we need will be a dirty trick. We need a function
- to compare items while we traverse the tree. Normally, we expect this compare
- function to return a strcmp() type result (<0, =0, or >0 for <,=,> respectively)
- We will be a little sneaky and write our own compare function which will take
- the supplied compare function, and the node-type of the node being compared
- against. This will help in deletion when we want to delete the minimal/maximal
- element of a given tree. We will go about it in such a way that we can delete
- or find a given item, or the min/max item in a tree. We do this as follows:
-
- /*
- * avl_min() -- compare function used to find the minimal element in a tree
- */
- int avl_min( elt1, elt2, nd_typ )
- void *elt1, *elt2;
- NODE nd_typ;
- {
- if ( nd_typ == IS_RBRANCH || nd_typ == IS_LEAF )
- return 0; /* left subtree is empty -- this is the minimum */
-
- else return -1; /* keep going left */
- }/* avl_min */
-
-
- /*
- * avl_max() -- compare function used to find the maximal element in a tree
- */
- int avl_max( elt1, elt2, nd_typ )
- void *elt1, *elt2;
- NODE nd_typ;
- {
- if ( nd_typ == IS_RBRANCH || nd_typ == IS_LEAF )
- return 0; /* right subtree is empty -- this is the maximum */
-
- else return 1; /* keep going right */
- }/* avl_max */
-
- Now we can use avl_min/avl_max as compare functions. If the compare
- function is other than one of these two, usually it will only use the first
- two parameters (so it gets an extra one). This is not great for portability
- so we should really do something like:
-
- /*
- * avl_compare() -- compare an item with a node-item in an avl tree
- */
- int avl_compare( elt1, elt2, nd_typ, compar )
- void *elt1, *elt2;
- NODE nd_typ;
- int (*compar)();
- {
- if ( compare == avl_min || compar == avl_max )
- return (*compar)( elt1, elt2, nd_typ );
- else
- return (*compare)( elt1, elt2 );
- }/* avl_compare */
-
- If you wish to do this you may, but my code works without it, it just ignores
- any extra parameters. If you have ANSI-C you may get some compiler complaints.
- In any case, I shall proceed without using avl_compare(). In addition, avl_min
- and avl_max should only be passed to avl_find (search without inserting),
- and avl_delete (NOT avl_insert). We are now ready to write avl_insert:
-
- /*
- * avl_insert() -- insert an item into the given tree
- *
- * PARAMETERS:
- * data -- a pointer to a pointer to the data to add;
- * On exit, *data is NULL if insertion succeeded,
- * otherwise address of the duplicate key
- * rootp -- a pointer to an AVL tree
- * compar -- name of the function to compare 2 data items
- */
- short avl_insert( data, rootp, compar )
- void **data;
- AVLtree *rootp;
- int (*compar)();
- {
- short increase;
- int cmp;
-
- if ( *rootp == NULL_TREE ) { /* insert new node here */
- *rootp = new_node( *data, SIZE_OF_DATA )) );
- *data = NULL; /* set return value in data */
- return HEIGHT_CHANGED;
- }/* if */
-
- cmp = (*compar)( *data, (*rootp) -> data ); /* compare data items */
-
- if ( cmp < 0 ) { /* insert into the left subtree */
- increase = -avl_insert( data, &( (*rootp) -> subtree[ LEFT ] ), compar );
- if ( *data != NULL ) return HEIGHT_UNCHANGED;
- }/* elif */
-
- else if ( cmp > 0 ) { /* insert into the right subtree */
- increase = avl_insert( data, &( (*rootp) -> subtree[ RIGHT ] ), compar );
- if ( *data != NULL ) return HEIGHT_UNCHANGED;
- }/* elif */
-
- else { /* data already exists */
- *data = (*rootp) -> data; /* set return value in data */
- return HEIGHT_UNCHANGED;
- } /* else */
-
- (*rootp) -> bal += increase; /* update balance factor */
-
- /************************************************************************
- * re-balance if needed -- height of current tree increases only if its
- * subtree height increases and the current tree needs no rotation.
- ************************************************************************/
- if ( increase && (*rootp) -> bal )
- return ( 1 - balance( rootp ) );
- else
- return HEIGHT_UNCHANGED;
- }/* avl_insert */
-
-
- Deletion is more complicated than insertion. The height of the current level
- may decrease for two reasons: either a rotation occurred to decrease the
- height of a subtree (and hence the current level), or a subtree shortened
- in height resulting in a now balanced current level (subtree was "trimmed
- down" to the same size as the other). Just because a rotation has occurred
- however, does not mean that the subtree height has decreased. There is a
- special case where rotating preserves the current subtree height.
-
- Suppose I have a tree as follows:
-
-
- C
- / \
- A E
- / \
- D F
-
-
- Deleting "A" results in the following (imbalanced) tree:
-
-
- C
- \
- E
- / \
- D F
-
-
- This type of imbalance cannot occurr during insertion, only during
- deletion. Notice that the root has a balance of 2 but its heavy subtree
- has a balance of zero (the other case would be a -2 and a 0). Performing
- a single left rotation to restore the balance results in:
-
-
- E
- / \
- C F
- \
- D
-
-
- This tree has the same height as it did before it was rotated. Hence,
- we may determine if deletion caused the subtree height to change by
- seeing if one of the following occurred:
-
- 1) If the new balance (after deletion) is zero and NO rotation
- took place.
-
- 2) If a rotation took place but was NOT one of the special rotations
- mentioned above (a -2:0 or a 2:0 rotation).
-
- For insertion, we only needed to check if a rotation occurred to see if the
- subtree height had changed. But for deletion we need to ckeck all of the above.
- So for deletion of a node we have:
-
- /*
- * avl_delete() -- delete an item from the given tree
- *
- * PARAMETERS:
- * data -- a pointer to a pointer to the key to delete
- * On exit, *data points to the deleted data item
- * (or NULL if deletion failed).
- * rootp -- a pointer to an AVL tree
- * compar -- name of function to compare 2 data items
- */
- PRIVATE short avl_delete( data, rootp, compar )
- void **data;
- AVLtree *rootp;
- int (*compar)();
- {
- short decrease;
- int cmp;
- AVLtree old_root = *rootp;
- NODE nd_typ = node_type( *rootp );
- DIRECTION dir = (nd_typ == IS_LBRANCH) ? LEFT : RIGHT;
-
- if ( *rootp == NULL_TREE ) { /* data not found */
- *data = NULL; /* set return value in data */
- return HEIGHT_UNCHANGED;
- }/* if */
-
- /* NOTE the extra parameter to compare this time */
- cmp = compar( *data, (*rootp) -> data, nd_typ ); /* compare data items */
-
- if ( cmp < 0 ) { /* delete from left subtree */
- decrease = -avl_delete( data, &( (*rootp) -> subtree[ LEFT ] ), compar );
- if ( *data == NULL ) return HEIGHT_UNCHANGED;
- }/* elif */
-
- else if ( cmp > 0 ) { /* delete from right subtree */
- decrease = avl_delete( data, &( (*rootp) -> subtree[ RIGHT ] ), compar );
- if ( *data == NULL ) return HEIGHT_UNCHANGED;
- }/* elif */
-
- /************************************************************************
- * At this point we know that if "cmp" is zero then "*rootp" points to
- * the node that we need to delete. There are three cases:
- *
- * 1) The node is a leaf. Remove it and return.
- *
- * 2) The node is a branch (has only 1 child). Make "*rootp"
- * (the pointer to this node) point to the child.
- *
- * 3) The node has two children. We swap data with the successor of
- * "*rootp" (the smallest item in its right subtree) and delete
- * the successor from the right subtree of "*rootp". The
- * identifier "decrease" should be reset if the subtree height
- * decreased due to the deletion of the sucessor of "rootp".
- ************************************************************************/
-
- else { /* cmp == 0 */
- *data = (*rootp) -> data; /* set return value in data */
-
- switch ( nd_typ ) { /* what kind of node are we removing? */
- case IS_LEAF :
- free_node( rootp ); /* free the leaf, its height */
- return HEIGHT_CHANGED; /* changes from 1 to 0, return 1 */
-
- case IS_RBRANCH : /* only child becomes new root */
- case IS_LBRANCH :
- *rootp = (*rootp) -> subtree[ dir ];
- free_node( &old_root ); /* free the deleted node */
- return HEIGHT_CHANGED; /* we just shortened the "dir" subtree */
-
- case IS_TREE :
- decrease = avl_delete( &( (*rootp) -> data ),
- &( (*rootp) -> subtree[ RIGHT ] ),
- avl_min );
- } /* switch */
- } /* else */
-
- (*rootp) -> bal -= decrease; /* update balance factor */
-
- /**********************************************************************
- * Rebalance if necessary -- the height of current tree changes if one
- * of two things happens: (1) a rotation was performed which changed
- * the height of the subtree (2) the subtree height decreased and now
- * matches the height of its other subtree (so the current tree now
- * has a zero balance when it previously did not).
- **********************************************************************/
- if ( decrease && (*rootp) -> bal ) /* return 1 if height */
- return balance( rootp ); /* changed due to rotation */
-
- else if ( decrease && !(*rootp) -> bal ) /* or if balance is 0 from */
- return HEIGHT_CHANGED; /* height decrease of subtree */
-
- else
- return HEIGHT_UNCHANGED;
-
- }/* avl_delete */
-
- NOTICE how in the case of nd_typ == IS_TREE, I only need one statement. This
- is due to the way avl_delete sets its parameters. The data pointer passed on
- entrance points to the deleted node's data on exit. So I just delete the
- minimal element of the right subtree, and steal its data as my-own (returning
- my former data item on exit).
-
-
- And there we have it, the mantainenance of AVL tree manipulations, the brunt
- of which is covered in 5 routines, none of which (except for delete which
- is less than 1-1/2pages) is greater than 1 normal page in length, including
- comments (and there are a lot). The main routines are:
-
- rotate_once(), rotate_twice(), balance(), avl_insert(), avl_delete().
-
- All other routines are very small and auxillary in nature.
-