Home Page

Chapter 9: Search Trees


People like choices. We like to have different ways of solving the same problem, so that we can explore different trade-offs and efficiencies. This chapter is devoted to the exploration of different ways of solving a problem we discussed in an earlier chapter--the implementation of an ordered dictionary. Namely, we study several alternative data structures based on trees for realizing ordered dictionaries.

We begin this chapter by discussing binary search trees, and how they support a simple tree-based implementation of an ordered dictionary, but do not guarantee efficient worst-case performance. Nevertheless, they form the basis of many tree-based dictionary implementations, and we discuss several in this chapter. One of the classic implementations is the AVL tree, which is a binary search tree that achieves logarithmic-time search and update operations.

We also introduce the concept of a multi-way search tree, which is an ordered tree where each internal node can store several items and have several children. A multi-way search tree is a generalization of the binary search tree, and like the binary search tree, it can be specialized into an efficient data structure for dictionaries by imposing additional constraints. One of the advantages of using these multi-way trees is that they often require fewer internal nodes than binary search trees to store items. But, just as with binary search trees, multi-way trees require additional methods to make them efficient for all dictionary methods.

We discuss a specific multi-way tree, the (2,4) tree. This is a multi-way search tree, such that all the external nodes have the same depth and each node stores 1, 2, or 3 keys and has 2, 3, or 4 children, respectively. The advantage of this kind of trees is that it has algorithms for inserting and removing keys that are simple and intuitive. Update operations rearrange a (2,4) tree by means of natural operations that split and merge `nearby' nodes or transfer keys between them. A (2,4) tree storing n items uses O(n) space and supports searches, insertions, and removals in logarithmic worst-case time.

We present red-black trees, as well. These are binary search trees whose nodes are colored `red' and `black' in such a way that the coloring scheme guarantees logarithmic height. There is a simple, yet illuminating, correspondence between red-black and (2,4) trees. Using this correspondence, we motivate and provide intuition for the somewhat more complex algorithms for insertion and removal in red-black trees, which are based on rotations and recolorings. Like an AVL tree, a red-black tree uses linear space and supports searches, insertions, and removals in logarithmic worst-case time. The advantage that a red-black tree achieves over an AVL tree is that it can be restructured after an insertion or removal with only constant number of rotations (although at the expense of more complex operations).

Finally, we discuss external searching, that is, searching in external memory (which will usually be a disk). We introduce a type of multi-way tree called the B-tree and we show how it can be used to store and search items so as to minimize the number of input-output (I/O) operations.

There are admittedly quite a few kinds of search trees discussed in this chapter, and we recognize that a reader or instructor with limited time might be interested in studying only selected topics. For this reason, in the figure below, we show the conceptual dependencies between the sections in this chapter. Thus, for example, a reader can focus on binary search trees and then study AVL trees, (2,4) trees (possibly along with red-black trees as well), and/or external searching.