Data Structures and Algorithms with Object-Oriented Design Patterns in Ruby
next up previous index

AVL Search Trees

 

The problem with binary search trees is that while the average running times for search, insertion, and withdrawal operations are all tex2html_wrap_inline58549, any one operation is still O(n) in the worst case. This is so because we cannot say anything in general about the shape of the tree.

For example, consider the two binary search trees shown Figure gif. Both trees contain the same set of keys. The tree tex2html_wrap_inline61985 is obtained by starting with an empty tree and inserting the keys in the following order

displaymath63373

The tree tex2html_wrap_inline61987 is obtained by starting with an empty tree and inserting the keys in this order

displaymath63374

Clearly, tex2html_wrap_inline61987 is a better tree search tree than tex2html_wrap_inline61985. In fact, since tex2html_wrap_inline61987 is a perfect binary tree , its height is tex2html_wrap_inline63399. Therefore, all three operations, search, insertion, and withdrawal, have the same worst case asymptotic running time, tex2html_wrap_inline58549.

The reason that tex2html_wrap_inline61987 is better than tex2html_wrap_inline61985 is that it is the more balanced tree. If we could ensure that the search trees we construct are balanced then the worst-case running time of search, insertion, and withdrawal, could be made logarithmic rather than linear. But under what conditions is a tree balanced?

If we say that a binary tree is balanced if the left and right subtrees of every node have the same height, then the only trees which are balanced are the perfect binary trees. A perfect binary tree of height h has exactly tex2html_wrap_inline62373 internal nodes. Therefore, it is only possible to create perfect trees with n nodes for tex2html_wrap_inline63413. Clearly, this is an unsuitable balance condition because it is not possible to create a balanced tree for every n.

What are the characteristics of a good balance condition ?

  1. A good balance condition ensures that the height of a tree with n nodes is tex2html_wrap_inline58549.
  2. A good balance condition can be maintained efficiently. That is, the additional work necessary to balance the tree when an item is inserted or deleted is O(1).
Adelson-Velskii and Landisgif were the first to propose the following balance condition and show that it has the desired characteristics.

Definition (AVL Balance Condition)  An empty binary tree is AVL balanced  . A non-empty binary tree, tex2html_wrap_inline62349, is AVL balanced if both tex2html_wrap_inline62345 and tex2html_wrap_inline62347 are AVL balanced and

displaymath63375

where tex2html_wrap_inline63429 is the height of tex2html_wrap_inline62345 and tex2html_wrap_inline63433 is the height of tex2html_wrap_inline62347.

Clearly, all perfect binary trees are AVL balanced. What is not so clear is that heights of all trees that satisfy the AVL balance condition are logarithmic in the number of internal nodes.

Theorem  The height, h, of an AVL balanced tree with n internal nodes satisfies

displaymath63376

extbfProof The lower bound follows directly from Theorem gif. It is in fact true for all binary trees regardless of whether they are AVL balanced.

To determine the upper bound, we turn the problem around and ask the question, what is the minimum number of internal nodes in an AVL balanced tree of height h?

Let tex2html_wrap_inline63443 represent an AVL balanced tree of height h which has the smallest possible number of internal nodes, say tex2html_wrap_inline63447. Clearly, tex2html_wrap_inline63443 must have at least one subtree of height h-1 and that subtree must be tex2html_wrap_inline63453. To remain AVL balanced, the other subtree can have height h-1 or h-2. Since we want the smallest number of internal nodes, it must be tex2html_wrap_inline63459. Therefore, the number of internal nodes in tex2html_wrap_inline63443 is tex2html_wrap_inline63463, where tex2html_wrap_inline63465.

Clearly, tex2html_wrap_inline62199 contains a single internal node, so tex2html_wrap_inline63469. Similarly, tex2html_wrap_inline61965 contains exactly two nodes, so tex2html_wrap_inline63473. Thus, tex2html_wrap_inline63447 is given by the recurrence

  equation18829

The remarkable thing about Equation gif is its similarity with the definition of Fibonacci numbers  (Equation gif). In fact, it can easily be shown by induction that

displaymath63377

for all tex2html_wrap_inline62261, where tex2html_wrap_inline63479 is the tex2html_wrap_inline60669 Fibonacci number.

Base Cases

eqnarray18843

Inductive Hypothesis Assume that tex2html_wrap_inline63483 for tex2html_wrap_inline62271. Then

eqnarray18847

Therefore, by induction on k, tex2html_wrap_inline63483, for all tex2html_wrap_inline62261.

According to Theorem gif, the Fibonacci numbers are given by

displaymath60979

where tex2html_wrap_inline59121 and tex2html_wrap_inline59123. Furthermore, since tex2html_wrap_inline63497, tex2html_wrap_inline63499.

Therefore,

eqnarray18867

This completes the proof of the upper bound.

So, we have shown that the AVL balance condition satisfies the first criterion of a good balance condition--the height of an AVL balanced tree with n internal nodes is tex2html_wrap_inline63503. What remains to be shown is that the balance condition can be efficiently maintained. To see that it can, we need to look at an implementation.




next up previous index

Bruno Copyright © 2004 by Bruno R. Preiss, P.Eng. All rights reserved.