|
|
Binary Search TreeIn computer science, a binary search tree (BST) is a binary tree where every node has a value, every node's left subtree has values less than or equal to the node's value, and every right subtree has values greater. Note that this requires a linear order on the values. A new node is added as a leaf. There is a sort algorithm based on binary search trees, and also a search algorithm. An in-order traversal of a binary search tree will visit the values in increasing order. Operations Search Searching a binary tree for a specific value is easy due to its special properties. We begin by examining the root. If the value equals the root, we have found it. If it is less than the root, then it must be in the left subtree, so we recursively search the left subtree in the same manner. Similarly, if it is greater than the root, then it must be in the right subtree, so we recursively search the right subtree in the same manner. If we reach a nil leaf, then the item is not where it would be if it were present, so it is not in the tree at all. A comparison may be made with binary search, which operates in nearly the same way but using random access on an array instead of following links. Here is the search algorithm in the Python programming language: def search_binary_tree(treenode, value): if treenode is None: return None # failure left, nodevalue, right = treenode if nodevalue > value: return search_binary_tree(left, value) elif value > nodevalue: return search_binary_tree(right, value) else: return nodevalue This operation requires O(log n) time in the average case, but at worst-case is O(n). Insertion Insertion is almost identical to search: we search for the value, but if we do not find the value we continue searching along either the left or right branch. Eventually we will reach a nil leaf, and then we simply add the value at that position. Put differently, we examine the root and recursively insert the new node to the left subtree if the new value is less than or equal the root, or the right subtree if the new value is greater than the root. Here is the insert algorithm in Python: def binary_tree_insert(treenode, value): if treenode is None: return (None, value, None) left, nodevalue, right = treenode if nodevalue > value: return (binary_tree_insert(left, value), nodevalue, right) else: return (left, nodevalue, binary_tree_insert(right, value)) This operation requires O(log n) time in the average case, but at worst-case is O(n). Deletion Deleting a node which has no children is easy: we just delete it. If a node has only one child, we delete it and replace it with its child. The tricky part comes when we want to delete a node with two children. Call the node being deleted N. In order to do this, we need to replace the value of N with some other value from the tree which is greater than or equal to all nodes in N's left subtree, and less than or equal to all nodes in N's right subtree. Deleting a node with two children from a binary search tree Where will we find such a value? We observe that there are two such values: the largest value in N's left subtree, found by repeatedly following right links, and the smallest value in N's right subtree, found by repeatedly following left links. Once we find it, we copy its value into N, and then delete it. Since either of these nodes has less than two children, we already know how to delete it. In a good implementation, it's recommended to avoid consistently using one of these nodes, because this can unbalance the tree. Here is Python code for deletion: def binary_tree_delete(treenode, value): if treenode is None: return None # Value not found left, nodevalue, right = treenode if nodevalue == value: if left is None: return right elif right is None: return left else: maxvalue, newleft = find_remove_max(left) return (newleft, maxvalue, right) elif value < nodevalue: return (binary_tree_delete(left, value), nodevalue, right) else: return (left, nodevalue, binary_tree_delete(right, value)) def find_remove_max(treenode): left, nodevalue, right = treenode if right is None: return (nodevalue, left) else: (maxvalue, newright) = find_remove_max(right) return (maxvalue, (left, nodevalue, newright)) Although this operation doesn't always traverse the tree down to a leaf, this is always a possibility; thus in the worst case, it requires time proportional to the height of the tree. It doesn't require more even when the node has two children, since it still follows a single path and visits no node twice. Traversal Once we've created a binary search tree, we can traverse its elements in order by recursively traversing the left subtree, visiting the root, then recursively traversing the right subtree. The tree may also be traversed in pre order or post order. def traverse_binary_tree(treenode): if treenode is None: return left, nodevalue, right = treenode traverse_binary_tree(left) visit(nodevalue) traverse_binary_tree(right) Traversal requires Θ(n) time, since it must visit every node. Sort A binary search tree can be used to implement a simple but inefficient sort algorithm. To do this, we insert all the values we wish to sort into a new binary search tree, then traverse it in order, building our result: def build_binary_tree(values): tree = None for v in values: tree = binary_tree_insert(tree, v) return tree def traverse_binary_tree(treenode): if treenode is None: return [] else: left, value, right = treenode return (traverse_binary_tree(left) + value + traverse_binary_tree(right)) The worst-case time of build_binary_tree is O(n2) — if you feed it a sorted list of values, it chains them into a linked list with no left subtrees. For example, build_binary_tree(2, 3, 4, 5) yields the tree (None, 1, (None, 2, (None, 3, (None, 4, (None, 5, None))))). There are a variety of schemes for overcoming this flaw with simple binary trees; the most common is the self-balancing binary search tree. If this same procedure is done using such a tree, the overall worst-case time is the ideal O(nlog n). Types of binary search trees There are many types of binary search trees. AVL trees and red-black trees are both forms of self-balancing binary search trees. A splay tree is a binary search tree that automatically moves frequently accessed elements nearer to the root. In a treap ("tree heap"), each node also holds a priority and the parent node has higher priority than its children. External links
|
 |