In previous days, you saw how to create sorted lists and how to sort lists of data. ROBIN and many other programs require the capability to quickly find data based on a requested value. Although sorting the data is a good start, there are efficient ways to store your records so that retrieval time is optimized. Today you will learn
This chapter explores a number of lists and other complex data structures. To avoid retyping the same code repeatedly, and to provide a good example of how professional programs are laid out, I'll start by defining some header files that will be used for the entire chapter.
Note: The programs in this book were designed with portability in mind. Nonetheless, your compiler may be a bit more of a stickler than mine, and you may get warnings (or perhaps errors!) when you compile some of the code in this book.
There are a few things to try if you encounter errors or warnings in this case. First, if your compiler complains about the use of the BOOL type, comment out the enumerated BOOL shown in the standard definition file that follows, and replace it with these lines:
1: typedef int BOOL; 2: const int FALSE = 0; 3: const int TRUE = 1;
Be sure to contact me on Interchange or CompuServe if the code simply doesn't work for you. It may be that a typo slipped in despite our careful scrutiny, and we may be able to straighten it out for you. With this code, and every other line of code you read or create, don't assume that it is 100 percent correct as it stands.
Listing 7.1 shows stdef.hpp, the file in which some common definitions will be kept, along with a few #include statements that will be used by all subsequent programs. Virtually every listing will include stdef.hpp (standard definition).
Listing 7.1 Using Stdef.hpp to Store Standard Definitions
1: // ************************************************** 2: // PROGRAM: Standard Definitions 3: // FILE: stdef.hpp 4: // PURPOSE: provide standard inclusions and definitions 5: // NOTES: 6: // AUTHOR: Jesse Liberty (jl) 7: // REVISIONS: 10/23/94 1.0 jl initial release 8: // ************************************************** 9: 10: 11: #ifndef STDDEF_HPP // inclusion guards 12: #define STDDEF_HPP 13: 14: enum BOOL { FALSE, TRUE }; 15: 16: #include <iostream.h> 17: 18: #endif
Output:
None.
Analysis:
Stdef.hpp provides the standard definitions and inclusions that will be used in various other programs. Note the inclusions guards in lines 11 and 12 and ending in line 18. This is a standard way to prevent header files from being included twice in your program. On line 11, the definition STDEF_HPP is tested. If this file has not yet been included, this line will test true and the rest of the file will be included. If the file has not already been included, this line will test false and nothing until the endif in line 18 will be included; the file therefore will be skipped.
In line 12, STDEF_HPP is defined so that the next time through, this check will fail and the file will not be included a second time.
Much of this chapter will use the sorted list template developed on day 3 and reproduced in listing 7.2. This template will be stored in a file named linklist.hpp.
Listing 7.2 The Sorted Linked List Template
1: // ************************************************** 2: // PROGRAM: Linked List Header 3: // FILE: linklist.hpp 4: // PURPOSE: provide sorted linked list template 5: // NOTES: 6: // AUTHOR: Jesse Liberty (jl) 7: // REVISIONS: 10/23/94 1.0 jl initial release 8: // ************************************************** 9: 10: 11: #ifndef LINKLIST_HPP 12: #define LINKLIST_HPP 13: 14: // **************** Node class ************ 15: template <class T> 16: class Node 17: { 18: public: 19: Node (T*); 20: ~Node(); 21: void InsertAfter(Node *); 22: Node * GetNext() const { return itsNext; } 23: void SetNext(Node * next) { itsNext = next; } 24: T & GetObject() const { return *itsObject; } 25: int operator<(const Node &rhs) const; 26: BOOL operator==(const T& rhs) const; 27: 28: private: 29: T * itsObject; 30: Node * itsNext; 31: }; 32: 33: // **************** Object List ************ 34: template <class T> 35: class List 36: { 37: public: 38: List(); 39: ~List(); 40: long GetCount() const { return itsCount; } 41: void Insert(T &); 42: void Iterate(void (T::*f)()const); 43: T & operator[](long); 44: T * FindObject(const T& target ); 45: 46: private: 47: Node<T> itsHead; 48: long itsCount; 49: }; 50: 51: // *** node implementations **** 52: 53: template <class T> 54: Node<T>::Node(T * pObject): 55: itsObject(pObject), 56: itsNext(0) 57: {} 58: 59: template <class T> 60: Node<T>::~Node() 61: { 62: delete itsObject; 63: itsObject = 0; 64: delete itsNext; 65: itsNext = 0; 66: } 67: 68: template <class T> 69: void Node<T>::InsertAfter(Node* newNode) 70: { 71: newNode->SetNext(itsNext); 72: itsNext=newNode; 73: } 74: 75: template <class T> 76: int Node<T>::operator<(const Node &rhs) const 77: { 78: return (*itsObject < rhs.GetObject()); 79: } 80: 81: template <class T> 82: BOOL Node<T>::operator==(const T& target) const 83: { 84: return (*itsObject == target); 85: } 86: 87: // Implementations for Lists... 88: 89: template<class T> 90: List <T>::List(): 91: itsCount(0), 92: itsHead(0) // initialize head node to have no Object 93: {} 94: 95: template<class T> 96: List <T>::~List() 97: { 98: } 99: 100: template<class T> 101: T & List<T>::operator[](long offSet) 102: { 103: if (offSet+1 > itsCount) 104: return itsHead.GetObject(); // error 105: 106: Node<T>* pNode = itsHead.GetNext(); 107: 108: for (long i=0;i<offSet; i++) 109: pNode = pNode->GetNext(); 110: 111: return pNode->GetObject(); 112: } 113: 114: template<class T> 115: T* List<T>::FindObject(const T& target ) 116: { 117: for (Node<T> * pNode = itsHead.GetNext(); 118: pNode!=NULL; 119: pNode = pNode->GetNext() 120: ) 121: { 122: if ( *pNode == target) 123: break; 124: } 125: if (pNode == NULL) 126: return 0; 127: else 128: return &(pNode->GetObject()); 129: } 130: 131: template<class T> 132: void List<T>::Insert(T & Object) 133: { 134: Node<T> * NewNode = new Node<T>(&Object); 135: 136: for (Node<T> * pNode = &itsHead;;pNode = pNode->GetNext()) 137: { 138: if (pNode->GetNext() == NULL || *NewNode < *(pNode->GetNext() )) 139: { 140: pNode->InsertAfter(NewNode); 141: itsCount++; 142: break; 143: } 144: } 145: } 146: 147: template<class T> 148: void List<T>::Iterate(void (T::*func)()const) 149: { 150: for (Node<T>* pNode = itsHead.GetNext(); 151: pNode; 152: pNode=pNode->GetNext() 153: ) 154: (pNode->GetObject().*func)(); 155: } 156: 157: #endif
Output:
None.
Analysis:
This code is very similar to the code used on day 3. The significant change is that it has been made into a header file, complete with a heading of its own and inclusion guards. Save this file to your disk as linklist.hpp.
The minimal rNote object as described on day 3 also will be used, along with the String class previously described. The rNote declaration is provided in note.hpp, and the String declaration is provided in string.hpp. The String implementation is provided in string.cpp. Listing 7.3 illustrates note.hpp, listing 7.4 shows string.hpp, and listing 7.5 has string.cpp.
Listing 7.3 Note.hpp
1: // ************************************************** 2: // PROGRAM: Basic Note object 3: // FILE: note.hpp 4: // PURPOSE: provide simple note object 5: // NOTES: 6: // AUTHOR: Jesse Liberty (jl) 7: // REVISIONS: 10/23/94 1.0 jl initial release 8: // ************************************************** 9: 10: #ifndef NOTE_HPP 11: #define NOTE_HPP 12: 13: #include "stdef.hpp" 14: 15: class rNote 16: { 17: public: 18: rNote(const String& text): 19: itsText(text), itsDate(0L) 20: {itsNoteNumber = theNoteNumber++;} 21: ~rNote(){} 22: 23: const String& GetText()const { return itsText; } 24: long GetDate() const { return itsDate; } 25: long GetNoteNumber() const { return itsNoteNumber; } 26: 27: int operator<(const rNote& rhs) 28: { return itsNoteNumber < rhs.GetNoteNumber(); } 29: BOOL operator==(const rNote& rhs) 30: { return itsText == rhs.GetText(); } 31: 32: operator long() { return itsNoteNumber; } 33: 34: void Display() const 35: { cout << "Note #: " << itsNoteNumber; 36: cout << " Text: " << itsText << endl; } 37: 38: static long theNoteNumber; 39: private: 40: String itsText; 41: long itsDate; 42: long itsNoteNumber; 43: }; 44: 45: #endif
Output:
None
Analysis:
This rNote object will be used in subsequent chapters as the basis for exploring complex data-manipulation objects. Store this listing in the file note.hpp.
Listing 7.4 String.hpp
1: // ************************************************** 2: // PROGRAM: String declaration 3: // FILE: string.hpp 4: // PURPOSE: provide fundamental string functionality 5: // NOTES: 6: // AUTHOR: Jesse Liberty (jl) 7: // REVISIONS: 10/23/94 1.0 jl initial release 8: // ************************************************** 9: 10: #ifndef STRING_HPP 11: #define STRING_HPP 12: #include <string.h> 13: #include "stdef.hpp" 14: 15: 16: class xOutOfBounds {}; 17: 18: class String 19: { 20: public: 21: 22: // constructors 23: String(); 24: String(const char *); 25: String (const char *, int length); 26: String (const String&); 27: ~String(); 28: 29: // helpers and manipulators 30: int GetLength() const { return itsLen; } 31: BOOL IsEmpty() const { return (BOOL) (itsLen == 0); } 32: void Clear(); // set string to 0 length 33: 34: // accessors 35: char operator[](int offset) const; 36: char& operator[](int offset); 37: const char * GetString()const { return itsCString; } 38: 39: // casting operators 40: operator const char* () const { return itsCString; } 41: operator char* () { return itsCString;} 42: 43: // operators 44: const String& operator=(const String&); 45: const String& operator=(const char *); 46: 47: void operator+=(const String&); 48: void operator+=(char); 49: void operator+=(const char*); 50: 51: int operator<(const String& rhs)const; 52: int operator>(const String& rhs)const; 53: BOOL operator<=(const String& rhs)const; 54: BOOL operator>=(const String& rhs)const; 55: BOOL operator==(const String& rhs)const; 56: BOOL operator!=(const String& rhs)const; 57: 58: 59: // friend functions 60: String operator+(const String&); 61: String operator+(const char*); 62: String operator+(char); 63: 64: void Display()const { cout << *this << " "; } 65: friend ostream& operator<< (ostream&, const String&); 66: 67: private: 68: // returns 0 if same, -1 if this is less than argument, 69: // 1 if this is greater than argument 70: int StringCompare(const String&) const; // used by Boolean operators 71: 72: char * itsCString; 73: int itsLen; 74: }; 75: 76: #endif // end inclusion guard
Listing 7.5 String.cpp
1: // ************************************************** 2: // PROGRAM: String definition 3: // FILE: string.cpp 4: // PURPOSE: provide fundamental string functionality 5: // NOTES: 6: // AUTHOR: Jesse Liberty (jl) 7: // REVISIONS: 10/23/94 1.0 jl initial release 8: // ************************************************** 9: 10: 11: #include "string.hpp" 12: 13: // default constructor creates string of 0 bytes 14: String::String() 15: { 16: itsCString = new char[1]; 17: itsCString[0] = '\0'; 18: itsLen=0; 19: } 20: 21: String::String(const char *rhs) 22: { 23: itsLen = strlen(rhs); 24: itsCString = new char[itsLen+1]; 25: strcpy(itsCString,rhs); 26: } 27: 28: String::String (const char *rhs, int length) 29: { 30: itsLen = strlen(rhs); 31: if (length < itsLen) 32: itsLen = length; // max size = length 33: itsCString = new char[itsLen+1]; 34: memcpy(itsCString,rhs,itsLen); 35: itsCString[itsLen] = '\0'; 36: } 37: 38: // copy constructor 39: String::String (const String & rhs) 40: { 41: itsLen=rhs.GetLength(); 42: itsCString = new char[itsLen+1]; 43: memcpy(itsCString,rhs.GetString(),itsLen); 44: itsCString[rhs.itsLen]='\0'; 45: } 46: 47: String::~String () 48: { 49: Clear(); 50: } 51: 52: void String::Clear() 53: { 54: delete [] itsCString; 55: itsLen = 0; 56: } 57: 58: //non constant offset operator 59: char & String::operator[](int offset) 60: { 61: if (offset > itsLen) 62: { 63: throw xOutOfBounds(); 64: return itsCString[itsLen-1]; 65: } 66: else 67: return itsCString[offset]; 68: } 69: 70: // constant offset operator 71: char String::operator[](int offset) const 72: { 73: if (offset > itsLen) 74: { 75: throw xOutOfBounds(); 76: return itsCString[itsLen-1]; 77: } 78: else 79: return itsCString[offset]; 80: } 81: 82: // operator equals 83: const String& String::operator=(const String & rhs) 84: { 85: if (this == &rhs) 86: return *this; 87: delete [] itsCString; 88: itsLen=rhs.GetLength(); 89: itsCString = new char[itsLen+1]; 90: memcpy(itsCString,rhs.GetString(),itsLen); 91: itsCString[rhs.itsLen]='\0'; 92: return *this; 93: } 94: 95: const String& String::operator=(const char * rhs) 96: { 97: delete [] itsCString; 98: itsLen=strlen(rhs); 99: itsCString = new char[itsLen+1]; 100: memcpy(itsCString,rhs,itsLen); 101: itsCString[itsLen]='\0'; 102: return *this; 103: } 104: 105: 106: // changes current string, returns nothing 107: void String::operator+=(const String& rhs) 108: { 109: unsigned short rhsLen = rhs.GetLength(); 110: unsigned short totalLen = itsLen + rhsLen; 111: char *temp = new char[totalLen+1]; 112: for (int i = 0; i<itsLen; i++) 113: temp[i] = itsCString[i]; 114: for (int j = 0; j<rhsLen; j++, i++) 115: temp[i] = rhs[j]; 116: temp[totalLen]='\0'; 117: *this = temp; 118: } 119: 120: int String::StringCompare(const String& rhs) const 121: { 122: return strcmp(itsCString, rhs.GetString()); 123: } 124: 125: String String::operator+(const String& rhs) 126: { 127: 128: char * newCString = new char[GetLength() + rhs.GetLength() + 1]; 129: strcpy(newCString,GetString()); 130: strcat(newCString,rhs.GetString()); 131: String newString(newCString); 132: return newString; 133: } 134: 135: 136: String String::operator+(const char* rhs) 137: { 138: 139: char * newCString = new char[GetLength() + strlen(rhs)+ 1]; 140: strcpy(newCString,GetString()); 141: strcat(newCString,rhs); 142: String newString(newCString); 143: return newString; 144: } 145: 146: 147: String String::operator+(char rhs) 148: { 149: int oldLen = GetLength(); 150: char * newCString = new char[oldLen + 2]; 151: strcpy(newCString,GetString()); 152: newCString[oldLen] = rhs; 153: newCString[oldLen+1] = '\0'; 154: String newString(newCString); 155: return newString; 156: } 157: 158: 159: 160: BOOL String::operator==(const String& rhs) const 161: { return (BOOL) (StringCompare(rhs) == 0); } 162: BOOL String::operator!=(const String& rhs)const 163: { return (BOOL) (StringCompare(rhs) != 0); } 164: int String::operator<(const String& rhs)const 165: { return (BOOL) (StringCompare(rhs) < 0); } 166: int String::operator>(const String& rhs)const 167: { return (BOOL) (StringCompare(rhs) > 0); } 168: BOOL String::operator<=(const String& rhs)const 169: { return (BOOL) (StringCompare(rhs) <= 0); } 170: BOOL String::operator>=(const String& rhs)const 171: { return (BOOL) (StringCompare(rhs) >= 0); } 172: 173: ostream& operator<< (ostream& ostr, const String& str) 174: { 175: ostr << str.itsCString; 176: return ostr; 177: }
Output:
None.
Analysis:
The string implementation provided here is the same as seen previously, except that it has been changed to work with the other included files. Save this as string.cpp.
String.hpp, string.cpp, note.hpp, linklist.hpp, and stdef.hpp now provide a set of functionalities with which you can explore the complex data structures seen in the rest of this chapter. Save these all in one directory, because you will be using them and extending them over the coming days.
Before moving on, examine listing 7.6, which provides a small test program that exercises the listings provided earlier in this chapter.
Listing 7.6 Testing the Building Blocks
1: // ************************************************** 2: // PROGRAM: Test driver 3: // FILE: 0706.cpp 4: // PURPOSE: Test String, Note and Linked List 5: // NOTES: Same functionality as 0307.cpps 6: // AUTHOR: Jesse Liberty (jl) 7: // REVISIONS: 10/23/94 1.0 jl initial release 8: // ************************************************** 9: 10: #include "String.hpp" // string class 11: #include "Note.hpp" // notes 12: #include "LinkList.hpp" // linked lists 13: #include "Note.hpp" 14: #include "stdef.hpp" // standard definitions 15: 16: long rNote::theNoteNumber = 0; 17: void main() 18: { 19: List<rNote> pl; 20: rNote * pNote = 0; 21: long choice; 22: char buffer[256]; 23: 24: while (1) 25: { 26: cout << "\n(0)Quit (1)Add Note "; 27: cin >> choice; 28: if (!choice) 29: break; 30: 31: cin.ignore(255,'\n'); 32: cout << "\nText: "; 33: cin.getline(buffer,255); 34: pNote = new rNote(buffer); 35: pl.Insert(*pNote); 36: } 37: 38: cout << "\n\nResults: \n" << endl; 39: void (rNote::*pFunc)()const = rNote::Display; 40: pl.Iterate(pFunc); 41: 42: cin.ignore(255,'\n'); 43: cout << "\nFind Note with text: "; 44: cin.getline(buffer,255); 45: rNote target(buffer); 46: pNote=0; 47: pNote = pl.FindObject(target); 48: if (pNote) 49: { 50: cout << "\nFound! " << endl; 51: pNote->Display(); 52: } 53: else 54: cout << "Note not found." << endl; 55: }
Output:
(0)Quit (1)Add Note 1 Text: Walk the walk (0)Quit (1)Add Note 1 Text: Talk the talk (0)Quit (1)Add Note 0 Results: Note #: 0 Text: Walk the walk Note #: 1 Text: Talk the talk Find Note with text: Talk the talk Found! Note #: 1 Text: Talk the talk
Analysis:
Listing 7.6 reproduces the functionality of listing 3.7, "Using a Parameterized List Class," using the new .hpp files shown in the previous listings. This confirms that these listings are correct and will compile and link. Although it took a few pages to get here, you now know that you have solid, reusable code.
A hash table is a convenient way to manage large amounts of data using a small number of storage locations. You can imagine a series of "buckets" with data in each. Each bucket is numbered, and all the data you want to store must go in one or another of the buckets. You create a hashing algorithm to take all your data and reduce it to one of the available bucket values.
A simple (and simplistic!) approach to storing terms for ROBIN might be to hash the data based on the first letter of each word. Words beginning with A would be in bucket 0, words beginning with B would be in bucket 1, and so on. This method would enable you to create an array of 26 words, and each word would have its specific and easily identified location.
The problem with this approach, of course, is that more than one word might evaluate to the same bucket. Both boy and bottle begin with b, for example. You could disallow this condition, but then ROBIN would become pretty unusable, pretty quickly.
A modest alternative would be for each array location to hold a linked list of words. Thus, when two words evaluate to the same bucket (such as boy and bottle), they would each be put in the same list. Listing 7.7 illustrates this approach.
New term: A bucket is a numbered location for your data.
New term: A hashing algorithm is a formula applied to the data that renders a hash bucket number.
Listing 7.7 Hashing the Words in ROBIN
1: // ************************************************** 2: // PROGRAM: Hash Table 3: // FILE: 0707.cpp 4: // PURPOSE: Provide simple hash table for words 5: // NOTES: 6: // AUTHOR: Jesse Liberty (jl) 7: // REVISIONS: 10/23/94 1.0 jl initial release 8: // ************************************************** 9: 10: #include "String.hpp" // string class 11: #include "LinkList.hpp" // linked lists 12: #include "stdef.hpp" // standard definitions 13: #include <ctype.h> // toupper() 14: 15: void main(int argc, char ** argv) 16: { 17: List<String> myArray[26]; 18: 19: for (int i=1; i<argc; i++) 20: { 21: int offset = toupper(argv[i][0]) - 'A'; 22: String* ps = new String(argv[i]); 23: myArray[offset].Insert(*ps); 24: } 25: 26: for (i=0; i<26; i++) 27: myArray[i].Iterate(String::Display); 28: 29: }Output:
d:\>0707 Eternal Vigilance Is The Price Of Liberty Eternal Is Liberty Of Price The VigilanceAnalysis:
This small program develops a very complicated and interesting hash table. In line 17, myArray is declared to be an array of 26 String lists. In lines 19 through 24, each of the arguments to the program is analyzed, looking for the first letter. The letters are forced to uppercase and then the value of 'A' is subtracted, leaving a number between 0 and 25. That hashed value is the offset into the array of lists of strings at which this word will be kept.
A string is created in line 22 from the argument at argv[i], and that string is passed by reference into the array of lists, calling the Insert function.
In line 26, the entire array is canvassed, with Iterate() called on each list. The address of the String member function Display is passed into Iterate(), and each String of each member of each list is called.
In truth, you almost never would use a hash table as was done here. The hallmark usage of a hash table is when you have a small number of items, which you need to access wicked fast. Typically, you would hope to have more buckets than items, keeping data collisions down to a very small number.
You still would implement the linked list to handle the occasional collision, but almost all these lists would have at most one item in them.
For a first version of ROBIN, the hash algorithm shown in this chapter might be sufficient. It is a fundamental design principle to get something working, and then to worry about optimizing it. Who knows? You might find that accessing the words is not the slow part of your program.
It is true, however, that there are more efficient ways to get at a great deal of data, and that when you have tens of thousands of words you can expect this approach to bog down. After all, accessing the word Thunder means finding the bucket of T words, and then walking the linked list until you find Thunder. There must be a better way, and there is.
A tree is a linked list in which each node points to two or more next nodes. Figure 7.1 illustrates a basic tree structure.
Figure 7.1. Basic tree structure.
The topmost node in a tree is called the root node, and nodes that don't point to any other nodes are called leaf nodes.
New term: The root node is the top of a tree.
New term: A leaf node is a node that points to no other nodes.
Note that in a tree, nodes point only to other nodes farther "down" the tree. The nodes they point to are called child nodes, and thus the pointing node is called the parent node. In figure 7.1, the root node is parent to two children, each of which is a parent to two more children.
The simplest form of tree is a binary tree, in which each node has no more than two child nodes. In a binary tree, each node is associated with a value. The left child node has values less than the current value, and the right child has values greater than the current value. Figure 7.2 illustrates this idea.
Figure 7.2. A simple binary tree.
Note that each node either points to more nodes, or points to a null pointer.
You search for a node by beginning at the root, and then moving down to the left for lesser values, and down to the right for greater values. In figure 7.3, the word client is searched for. The root is Computer, and client comes earlier in the alphabet, so the node to the left is examined. The word client comes later than the word cat, so the node to the right is searched, and the target word, client, is found.
Figure 7.3. Searching for client.
In figure 7.4, the word drugs is searched for. Again, you start at the root, computer. Because drugs comes later in the alphabet, the node to the right is examined. Drugs is larger than dog, so again the node to the right is examined, obtaining eat. Because drugs is less than eat, the node to the left would be examined, but it is null. The word is not found.
Figure 7.4. Searching for drugs.
If you wanted to insert a word into the list, you would search for it, and add it when you hit the first null. In the example in figure 7.4, drugs would be added as the left node of eat. Adding the word drag would work as shown in figure 7.5.
Figure 7.5. Adding drag.
You start your search for drag at computer, go right to dog, go right to eat, go left to drugs, and then go left and get back null.
When you delete a node from the tree, it has child nodes or it is a leaf. Deleting a leaf is simple; deleting a node with child nodes, however, is a bit more complex.
To delete a leaf node from the tree, just tell its parent node not to point to it anymore, setting the parent node's pointer to null. Then delete the node, returning its memory to the free store, and removing whatever data it holds. This process is illustrated in figure 7.6, in which the leaf node boy is removed. The parent, Cat, now has its left pointer point to null; and the node, boy, is deleted.
Figure 7.6. Deleting boy--a leaf node.
To delete a node with one child node, have the parent point to the child and then delete the node. Figure 7.7 illustrates this process by deleting drugs. To do this, the parent, eat, now points to drag, and drugs is deleted.
Figure 7.7. Deleting drugs--a node with one child.
Deleting a node with two children is somewhat more complicated. Examine figure 7.8, which has a different set of nodes; these are assembled from the words four score and seven years ago our fore fathers brought forth on this continent a new nation.
Figure 7.8. A different set of nodes.
Notice that the word and has two child nodes. It is imperative that when and is removed, the tree remains in the right order, with each left child node having a lower value than the current node, and each right child node having a greater value. Here's how it is done:
Deleting and in figure 7.8 leaves the binary tree, as shown in figure 7.9.
The binary tree will sort rWord objects in alphabetical order based on their strings. The rWord object shown in listing 7.8 is only a first approximation of the final rWord object, it will be used in ROBIN and is used here only for demonstration purposes.
Figure 7.9. Deleting and.
Listing 7.8 Using an rWord Declaration
1: // ************************************************** 2: // PROGRAM: Basic word object 3: // FILE: word.hpp 4: // PURPOSE: provide simple word object 5: // NOTES: 6: // AUTHOR: Jesse Liberty (jl) 7: // REVISIONS: 10/23/94 1.0 jl initial release 8: // ************************************************** 9: 10: #ifndef WORD_HPP 11: #define WORD_HPP 12: 13: #include "stdef.hpp" 14: #include "string.hpp" 15: 16: class rWord 17: { 18: public: 19: rWord(const String& text): 20: itsText(text), reserved1(0L), reserved2(0L) {} 21: ~rWord(){} 22: 23: const String& GetText()const { return itsText; } 24: 25: int operator<(const rWord& rhs) 26: { return itsText < rhs.GetText(); } 27: 28: int operator>(const rWord& rhs) 29: { return itsText > rhs.GetText(); } 30: 31: int operator<=(const rWord& rhs) 32: { return itsText <= rhs.GetText(); } 33: 34: int operator=>(const rWord& rhs) 35: { return itsText => rhs.GetText(); } 36: 31: BOOL operator==(const rWord& rhs) 37: { return itsText == rhs.GetText(); } 38: 39: void Display() const 40: { cout << " Text: " << itsText << endl; } 41: 42: private: 43: String itsText; 44: long reserved1; 45: long reserved2; 46: }; 47: 48: #endif
Output:
None.
Analysis:
The rWord object holds a string and reserves space for two long values. These may hold record numbers or other information later, when rWord is fleshed out. For now, rWord objects serve as string holders that can be placed in the binary tree.
The demonstration program will read the command line and create an rWord object for each word in the command line. The rWord then will be added to the tree. When all the command-line words have been read, the tree will be printed. Indentation is used to show the relationship among the words, using a very simple display function provided by most compilers. Listing 7.9 provides the driver program to illustrate the binary tree.
Note: Most compilers support the ANSI function gotoxy(). If your compiler does not, however, you will need to modify this program so that the words are printed to the screen one after another.
Listing 7.9 Using the Driver Program
1: // Using the Driver Program 2: 3: #include "word.hpp" 4: #include <conio.h> 5: 6: class BinaryNode; // forward declaration 7: 8: class BinaryTree 9: { 10: public: 11: BinaryTree():myHead(0),myCount(0){} 12: ~BinaryTree(){} 13: long GetCount() const { return myCount; } 14: void Insert (rWord &); 15: void Delete (rWord &); 16: void Iterate(void (rWord::*f)() const); 17: BinaryNode* FindObject(rWord& target); 18: void PrintTree(int, char**); 19: static long x; 20: static long y; 21: 22: private: 23: BinaryNode* myHead; 24: long myCount; 25: }; 26: 27: class BinaryNode 28: { 29: public: 30: BinaryNode(rWord* word); 31: BinaryNode(rWord& word); 32: BinaryNode(const BinaryNode&); 33: ~BinaryNode() {} 34: rWord* GetWord()const{ return myWord; } 35: void SetWord(rWord* target) { myWord = target;} 36: 37: void InsertSmaller(BinaryNode* NewNode); 38: void InsertBigger(BinaryNode* NewNode); 39: 40: BinaryNode* GetSmaller() const; // { return mySmaller; } 41: BinaryNode* GetBigger() const; // { return myBigger; } 42: BinaryNode* GetParent() const; //{ return myParent; } 43: 44: void SetSmaller(BinaryNode* target) {mySmaller = target; } 45: void SetBigger(BinaryNode* target) {myBigger = target; } 46: void SetParent(BinaryNode* target) {myParent = target; } 47: 48: int operator<(const BinaryNode &rhs) const 49: {return *myWord < *(rhs.GetWord());} 50: int operator>(const BinaryNode &rhs) const 51: {return *myWord > *(rhs.GetWord());} 52: BOOL operator==(const BinaryNode &rhs) const 53: { return *myWord ==*(rhs.GetWord());} 54: BOOL operator==(const rWord& target) const 55: {return *myWord == target.GetText();} // 56: 57: private: 58: 59: BinaryNode * mySmaller; 60: BinaryNode * myBigger; 61: BinaryNode * myParent; 62: rWord * myWord; 63: }; 64: 65: BinaryNode* BinaryNode::GetSmaller() const 66: { 67: BinaryTree::y++; 68: BinaryTree::x-= 14-BinaryTree::y*2; 69: return mySmaller; 70: } 71: 72: BinaryNode* BinaryNode::GetBigger() const 73: { 74: BinaryTree::y++; 75: BinaryTree::x+= 14-BinaryTree::y*2; 76: return myBigger; 77: } 78: 79: BinaryNode* BinaryNode::GetParent() const 80: { 81: BinaryTree::y--; 82: return myParent; 83: } 84: 85: 86: BinaryNode::BinaryNode(rWord* word): 87: mySmaller(0), 88: myBigger(0), 89: myParent(0), 90: myWord(word) 91: { } 92: 93: BinaryNode::BinaryNode(rWord &word): 94: mySmaller(0), 95: myBigger(0), 96: myParent(0), 97: myWord(&word) 98: { } 99: 100: BinaryNode::BinaryNode(const BinaryNode& rhs) 101: { 102: mySmaller=rhs.GetSmaller(); 103: myBigger=rhs.GetBigger(); 104: myParent=rhs.GetParent(); 105: myWord = rhs.GetWord(); 106: } 107: 108: 109: void BinaryNode::InsertSmaller(BinaryNode* newNode) 110: { 111: newNode->SetSmaller(mySmaller); 112: newNode->SetParent(this); 113: mySmaller=newNode; 114: } 115: 116: 117: void BinaryNode::InsertBigger(BinaryNode* newNode) 118: { 119: newNode->SetBigger(myBigger); 120: newNode->SetParent(this); 121: myBigger=newNode; 122: } 123: 124: 125: void BinaryTree::Insert(rWord& rhs) 126: { 127: BinaryNode* newNode = new BinaryNode(&rhs); 128: 129: if (myHead == 0) 130: { 131: myHead = newNode; 132: return; 133: } 134: 135: BinaryNode* node = myHead; 136: while (node) 137: { 138: // duplicate? replace 139: if (newNode == node) 140: { 141: newNode->SetSmaller(node->GetSmaller()); 142: newNode->SetBigger(node->GetBigger()); 143: newNode->SetParent(node->GetParent()); 144: if (node == myHead) 145: myHead = newNode; 146: else 147: { 148: if ( node->GetParent()->GetSmaller() == node) 149: node->GetParent()->SetSmaller(newNode); 150: else 151: node->GetParent()->SetBigger(newNode); 152: } 153: delete node; 154: break; 155: } 156: if (*newNode < *node) 157: { 158: if (!node->GetSmaller()) 159: { 160: node->SetSmaller(newNode); 161: newNode->SetParent(node); 162: break; 163: } 164: else 165: node = node->GetSmaller(); 166: } 167: else 168: { 169: if (!node->GetBigger()) 170: { 171: node->SetBigger(newNode); 172: newNode->SetParent(node); 173: break; 174: } 175: else 176: node = node->GetBigger(); 177: } 178: } // end while 179: } // end function 180: 181: 182: BinaryNode* BinaryTree::FindObject(rWord& rhs) 183: { 184: BinaryNode* node = myHead; 185: x = 40; 186: y = 1; 187: 188: while (node) 189: { 190: if (*node == rhs) 191: break; 192: 193: if (*node < rhs) 194: node = node->GetBigger(); 195: else 196: node = node->GetSmaller(); 197: } 198: return node; 199: } 200: 201: void BinaryTree::Delete (rWord & target) 202: { 203: BinaryNode* node = FindObject(target); 204: 205: if (node) 206: { 207: if (!node->GetBigger()) 208: { 209: if (!node->GetSmaller()) 210: { 211: if (node == myHead) 212: myHead = 0; 213: else 214: { 215: if (node->GetParent()->GetSmaller() == node) 216: node->GetParent()->SetSmaller(0); 217: else 218: node->GetParent()->SetBigger(0); 219: } 220: } 221: else // has a smaller 222: { 223: if (node == myHead) 224: myHead = node->GetSmaller(); 225: else 226: { 227: if (node->GetParent()->GetSmaller() == node) 228: node->GetParent()->SetSmaller(node->GetSmaller()); 229: else 230: node->GetParent()->SetBigger(node->GetSmaller()); 231: } 232: } 233: } 234: else // node does have a bigger 235: { 236: if (!node->GetSmaller()) 237: { 238: if (node == myHead) 239: myHead = node->GetBigger(); 240: else 241: { 242: if (node->GetParent()->GetSmaller() == node) 243: node->GetParent()->SetSmaller(node->GetBigger()); 244: else 245: node->GetParent()->SetBigger(node->GetBigger()); 246: } 247: } 248: else // node has both!! 249: { 250: BinaryNode * next = node->GetBigger(); 251: 252: while (next->GetSmaller()) 253: next=next->GetSmaller(); 254: 255: if (next->GetParent()->GetSmaller() == next) 256: next->GetParent()->SetSmaller(next->GetBigger()); 257: else 258: next->GetParent()->SetBigger(next->GetBigger()); 259: 260: node->SetWord(next->GetWord()); 261: 262: if (next->GetBigger()) 263: next->GetBigger()->SetParent(node); 264: 265: node = next; 266: } 267: } // end else does have bigger 268: } // end if node 269: delete node; 270: } // end function 271: 272: void BinaryTree::PrintTree(int argc, char **argv) 273: { 274: for (int i = 1; i<argc; i++) 275: { 276: rWord* word = new rWord(argv[i]); 277: BinaryNode* pbn = FindObject(*word); 278: delete word; 279: // BinaryTree::x*=3; // 3 spaces per unit 280: BinaryTree::x = BinaryTree::x < 0 ? 0 : BinaryTree::x; 281: BinaryTree::x = BinaryTree::x > 80 ? 80 : BinaryTree::x; 282: BinaryTree::y = BinaryTree::y < 0 ? 0 : BinaryTree::y; 283: BinaryTree::y = BinaryTree::y > 25 ? 25 :BinaryTree::y; 284: 285: gotoxy(BinaryTree::x, BinaryTree::y+10); 286: cout << pbn->GetWord()->GetText(); 287: } 288: } 289: 290: long BinaryTree::x 291: long BinaryTree::y; 292: int main(int argc, char **argv) 293: { 294: BinaryTree tree; 295: for (int i = 1; i< argc; i++) 296: { 297: rWord* word = new rWord(argv[i]); 298: tree.Insert(*word); 299: } 300: 301: tree.PrintTree(argc,argv); 302: cout << "\n\n\n" << endl; 303: 304: return 0; 305: }Output:
d:\>0708 now is the time for all good men to come to the aid now is the for men time all good to aid comeAnalysis:
In line 3, the new word.hpp file is included, along with conio.h in line 4, which is the header file that Borland C++ 4.0 requires for the gotoxy() function. In line 6, the BinaryNode class is forward declared because BinaryTree will make reference to BinaryNode pointers, and the compiler must know that a BinaryNode is a type.
Note that in a real program, these declarations (of both BinaryNode and BinaryTree) would be moved off to a header file. The BinaryTree declares a constructor and destructor and a GetCount() method that returns the number of nodes in the tree (and that is not used in the demonstration program).
The Insert() and Delete() member functions will work as described earlier. The Iterate() function is a carryover from the linked List class, and again is not used in the demonstration program.
In line 18, PrintTree() is declared, and below it two public, static variables, x and y. These are used for demonstration purposes only, in order to print out a graph of the tree. This mechanism is rather crude and is included only to prove that the binary tree is working.
In lines 23 and 24, the private member variables are declared. Note that binary tree (unlike linked list) includes a pointer to the first link in the tree.
In lines 27 through 63, the BinaryNode class is declared. As described earlier, each node has four pointers, shown in lines 59 through 62. The node points to each of its two children (mySmaller and mybigger) as well as its parent and the rword object for which it is the node.
In addition to the usual constructors (lines 30 through 32), a variety of accessor functions is provided (lines 40 through 46) and comparison operators (lines 48 through 55). Note that the comparisons are immediately passed on to the rWord, which in turn does a comparison of its strings. Thus, comparing two nodes is the same as comparing the strings of their rWord objects.
The implementations of the accessors GetSmaller() and GetBigger() are made somewhat more complex because they keep track of the relative x and y coordinates of the node. That is, each time you go down a level, the static member BinaryTree::y is incremented, and each time you go right or left, the static member BinaryTree::x is incremented or decremented. Note that a simple but effective algorithm is used to decide how much to move right or left. As you go farther down the tree, you want to shift fewer spaces right or left before writing the word. This still does not prevent all overlaps, so it is possible for two words to print to the same space, overwriting each other. Keep this in mind as you experiment with different command-line arguments.
BinaryNode::InsertSmaller() in lines 109 through 114 and BinaryNode::InsertBigger() in lines 117 through 122 are fairly straightforward. When a node is inserted into the smaller position, it is told to set its smaller pointer to the current node's smaller pointer, and to set its parent node pointer to the current node. Then the current node sets its own smaller pointer to the new node.
The more interesting method, BinaryTree::Insert(), is shown in lines 125 through 179. Here, a word object is provided to the tree. The first thing the tree does is to create a new node for the word, in line 127. It then checks to see whether the root of the tree, myHead, already is occupied. If not, the new node is made the head and the insertion is complete.
Assuming that the head already exists, the trick now is to find the right position for the new node. A node pointer, node, is declared in line 135, and set to point to the head.
Each time through the while loop, the new node is compared to node. In the first case, this looks for a duplicate of the head node. Assuming that it fails the test in line 139, the new node is compared to the node to see whether it is larger or smaller. Assuming that it is smaller, it will pass the test in line 156; otherwise, it will fall through to the else statement in line 167.
If the new node is smaller, the next question is whether the current node already has a smaller pointer. If so, that becomes the current node, and the while loop starts again. If the current node does not have a smaller node, however, the new node is inserted as the current node's smaller node.
If the new node is a bigger node than the current node, the same test is applied: Does the current node have a bigger node? If not, the new node becomes the current node's bigger node; if there is already a bigger node, it becomes the current node and the while loop continues.
FindObject() also takes a word object. It initializes the x and y coordinates for display and then searches for the node in the tree.
Delete() follows the logic described earlier. Take the time to compare the code with the description and make sure that you follow how this works.
PrintTree(), shown in lines 272 through 277, is a quick hack to display the words in a simple graphical manner. It iterates through the command line and asks the binary tree to find each word in turn. The act of finding the word sets the x and y coordinates, which then are checked to make sure they are in range. Then gotoxy() is called, which moves the cursor to the appropriate screen coordinate, where the word is written.
In lines 290 and 291, the static member variables BinaryNode::x and BinaryNode::y are defined. Remember that declaring these in the class does not set aside memory; you must define them at global scope.
The version of the binary tree shown in listing 7.9 works quite well but brings with it some unnecessary complexity. The printing algorithm is inefficient and not guaranteed to produce good results, and the insertion method is complex and potentially confusing.
The next example rewrites the binary tree to take advantage of the inherent recursion in trees: each node can be conceived, in some sense, as the top of its own tree. This makes insertion and deletion simpler.
It is important to be able to walk the tree when you need to print or search through the tree, although the particular order in which you walk the tree depends on what you are trying to accomplish.
The next program walks the tree in the following order: looking at the smaller value, looking at the current value, and then looking at the larger value. Recursing into this routine enables you to print the entire tree sideways. I'll discuss this in detail on day 8, but for now it enables the tree to show its contents quickly and easily. Listing 7.10 demonstrates these approaches.
Listing 7.10 Another Way to Write a Binary Tree
1: // ************************************************** 2: // PROGRAM: Binary tree - Second version 3: // FILE: bintree.hpp 4: // PURPOSE: btree with recursive insertion 5: // NOTES: Special thanks to Stephen Zagieboylo 6: // AUTHOR: Jesse Liberty (jl) 7: // REVISIONS: 11/3/94 1.0 jl initial release 8: // ************************************************** 9: 10: #include "stdef.hpp" 11: #include "string.hpp" 12: #include "word.hpp" 13: 14: class BinaryNode; // Forward reference 15: 16: class BinaryTree 17: { 18: public: 19: BinaryTree():myHead(0),myCount(0){} 20: ~BinaryTree(){} 21: long GetCount() const { return myCount; } 22: 23: void Insert (const rWord& ); 24: BOOL Delete (const rWord& ); // Returns TRUE if found and deleted. 25: void Iterate(void (*f)(const rWord&, int depth)); 26: 27: private: 28: BinaryNode * myHead; 29: long myCount; 30: }; 31: 32: class BinaryNode 33: { 34: public: 35: BinaryNode(const rWord &); 36: ~BinaryNode() {} 37: const rWord & GetValue() { return myValue; } 38: void SetValue(const rWord& val) { myValue = val;} 39: 40: BinaryNode* GetSmaller() const { return mySmaller; } 41: BinaryNode* GetBigger() const { return myBigger; } 42: 43: BinaryNode * Insert(const rWord&); 44: static void ProcessDuplicateValue(const rWord& newValue, rWord& existingValue); 45: 46: // Returns the new top of this subtree. 47: // Sets the BOOL if it really deleted anything. 48: BinaryNode * Delete(const rWord&, BOOL & DidDelete); 49: 50: void SetSmaller(BinaryNode* target) { mySmaller = target; } 51: void SetBigger(BinaryNode* target) { myBigger = target; } 52: 53: BOOL operator>(const rWord& rhs) const; 54: BOOL operator==(const rWord& rhs) const; 55: 56: void Iterate(void (*f)(const rWord&, int depth), int depth); 57: 58: private: 59: BinaryNode * mySmaller; 60: BinaryNode * myBigger; 61: rWord myValue; 62: }; 63: 64: BinaryNode::BinaryNode(const rWord& word): 65: mySmaller(0), 66: myBigger(0), 67: myValue(word) 68: { } 69: 70: 71: BinaryNode * BinaryNode::Insert(const rWord& word) 72: { 73: if (*this == word) 74: ProcessDuplicateValue(word, myValue); 75: else if (*this > word) 76: { 77: if (mySmaller != 0) 78: mySmaller = mySmaller->Insert(word); 79: else 80: mySmaller = new BinaryNode(word); 81: } 82: else 83: { 84: if (myBigger != 0) 85: myBigger = myBigger->Insert(word); 86: else 87: myBigger = new BinaryNode(word); 88: } 89: 90: return this; 91: } 92: 93: void BinaryNode::ProcessDuplicateValue(const rWord& word , rWord& otherWord ) 94: { 95: cout << otherWord.GetText() << " is a duplicate of "; 96: cout << word.GetText() << endl; 97: } 98: 99: BinaryNode * BinaryNode::Delete(const rWord& word, BOOL & DidDelete) 100: { 101: if (*this == word) 102: { 103: // This is the one to remove. It might be a leaf, 104: // a single parent, or a double parent. 105: if (mySmaller == 0) // leaf or one type of single parent 106: { 107: // if myBigger == 0, return 0 anyway. 108: BinaryNode * retval = myBigger; 109: DidDelete = TRUE; 110: delete this; // Dangerous! Must return immediately. 111: return retval; 112: } 113: else if (myBigger == 0) // other type of single parent 114: { 115: BinaryNode * retval = mySmaller; 116: DidDelete = TRUE; 117: delete this; // Dangerous! Must return immediately. 118: return retval; 119: } 120: else // Double parent 121: { 122: // Find the Node with the lowest value on 123: // my Bigger subtree. Remove him and put him in my place 124: BinaryNode * smallest = myBigger; 125: BinaryNode * hisparent = 0; 126: while (smallest->GetSmaller() != 0) 127: { 128: hisparent = smallest; 129: smallest = smallest->GetSmaller(); 130: } 131: 132: // Remove him gracefully and put him in our place. 133: // Watch out for the case where he is our child. 134: if (hisparent != 0) // not our immediate child. 135: { 136: hisparent->SetSmaller(smallest->GetBigger()); 137: smallest->SetBigger(myBigger); 138: } 139: 140: smallest->SetSmaller(mySmaller); 141: 142: DidDelete = TRUE; 143: delete this; // Dangerous! Must return immediately. 144: return smallest; 145: } 146: } 147: else if (*this > word) 148: { 149: if (mySmaller != 0) 150: mySmaller = mySmaller->Delete(word, DidDelete); 151: return this; 152: } 153: else 154: { 155: if (myBigger != 0) 156: myBigger = myBigger->Delete(word, DidDelete); 157: return this; 158: } 159: } 160: 161: 162: BOOL BinaryNode::operator>(const rWord &rhs) const 163: { 164: return BOOL(myValue.GetText() > rhs.GetText()); 165: } 166: 167: 168: BOOL BinaryNode::operator==(const rWord &rhs) const 169: { 170: return BOOL(myValue.GetText = rhs.GetText()); 171: } 172: 173: 174: 175: void BinaryNode::Iterate(void (*f)(const rWord&, int depth), int depth) 176: { 177: if (mySmaller != 0) 178: mySmaller->Iterate(f, depth+1); 179: 180: f(myValue, depth); 181: 182: if (myBigger != 0) 183: myBigger->Iterate(f, depth+1); 184: } 185: 186: 187: void BinaryTree::Insert(const rWord& word) 188: { 189: if (myHead == 0) 190: myHead = new BinaryNode(word); 191: else 192: myHead = myHead->Insert(word); 193: 194: myCount++; 195: } 196: 197: 198: BOOL BinaryTree::Delete(const rWord& word) 199: { 200: BOOL DidDelete = FALSE; 201: 202: if (myHead != 0) 203: myHead = myHead->Delete(word, DidDelete); 204: 205: if (DidDelete) 206: myCount--; 207: 208: return DidDelete; 209: } 210: 211: void BinaryTree::Iterate(void (*f)(const rWord&, int depth)) 212: { 213: if (myHead != 0) 214: myHead->Iterate(f, 1); 215: } 216: 217: 218: void Display(const rWord & word, int depth) 219: { 220: for ( ;depth > 0; depth--) 221: cout << " "; 222: cout << word.GetText() << endl; 223: } 224: 225: 226: // Utility function used below. 227: void ShowTree( BinaryTree & tree ) 228: { 229: cout << "\n\nResults: \n" << endl; 230: cout << "Count is: " << tree.GetCount() << endl; 231: tree.Iterate(Display); 232: cout << "\n\n" ; 233: }
Listing 7.11 Using the Driver Program
1: // 7.11 - Using the Driver Program 2: // second version 3: 4: #include "btree.hpp" 5: 6: int main() 7: { 8: char buffer[256]; 9: BinaryTree tree; 10: static String ShowString("show"); 11: 12: while (1) 13: { 14: cout << "New String: "; 15: cin.getline(buffer,255); 16: rWord* word = new rWord(buffer); 17: 18: if (buffer[0] == '\0') 19: break; 20: 21: if (*word == ShowString) 22: { 23: ShowTree(tree); 24: continue; 25: } 26: 27: tree.Insert(*word); 28: } 29: 30: ShowTree(tree); 31: 32: while (1) 33: { 34: cout << "Word to Delete: "; 35: cin.getline(buffer,255); 36: rWord* word = new rWord(buffer); 37: 38: if (buffer[0] == '\0') 39: break; 40: 41: if (*word == ShowString) 42: { 43: ShowTree(tree); 44: continue; 45: } 46: 47: if (!tree.Delete(*word)) 48: cout << "Not Found\n\n"; 49: } 50: 51: tree.Iterate(Display); 52: }Output:
d:\112\day7>0711 New String: now New String: is New String: the New String: time New String: for New String: all New String: good New String: men New String: to New String: come New String: to to is a duplicate of to New String: the the is a duplicate of the New String: aid New String: of New String: the the is a duplicate of the New String: party Results: Count is: 16 aid all come for good is men now of party the time to Word to Delete: good Word to Delete: time Word to Delete: aid all come for is men now of party the toAnalysis:
The BinaryNode and BinaryTree classes declared in listing 7.10 store the same rWord objects as the previous examples and utilize the same stdef inclusion file.
One addition to this tree is that it detects and reports on duplicates, without adding them to the tree. The BinaryNode class assumes much more of the work in this approach.
BinaryTree::Insert() is in line 187. It takes an rWord reference and passes it to its topmost node. The tree's only other job is to increment its count.
The real work is done in BinaryNode::Insert(), which is shown in lines 71 through 91. The word is examined to see whether it is a duplicate of the current node, and processed accordingly if it is. If it is not a duplicate, the word is either smaller or larger than the current node.
If the word is smaller and if there is no pointer to a smaller node, this new node becomes the smaller node. If there is already a smaller node, the new node is passed recursively to the insert method of the smaller node.
If the new node is larger than the current node, the same process occurs: If there is no larger node, the new node becomes the larger; if there already is a larger node, the new node is passed recursively down to the larger.
In this way, the new node is passed down the tree until its appropriate insertion point is found.
Delete works in much the same way. The problem facing the tree's Delete() function is that it must get back two values: the new head pointer (in case the current one is deleted) and a Boolean as to whether the deletion was completed. Deletion can fail if the item is not found in the tree.
This problem is solved by passing a Boolean to the BinaryNode::Delete() function by reference. The node then can set the value of Delete, while returning the new head pointer.
BinaryNode::Delete() is shown in lines 99 through 159. If the target node is the current node, the current node must be examined to see whether it has children. If it has no smaller child, Delete() returns its bigger child after deleting itself.
If it has a smaller child, but no bigger child, it returns the smaller child after deleting itself. The only other alternative is that it has two children. In that case, the logic in lines 120 through 145 is used, in which the "next value" is placed into the position of the removed node.
Today you saw how to create a hash table, which depends on a hashing algorithm to put values into discrete buckets. You saw how to resolve collisions when two objects hash to the same value, and you learned what the trade-offs are in using a hash table. You also learned two ways to implement a binary tree, and how to print the contents of the tree.
Q: When would you use a hash table in the real world?
A: Although it is fairly memory-expensive, a hash table is the fastest way to retrieve data, as long as your table is big enough. You would use this technique only in times when speed is of the essence, typically because the access to the data is needed many times a second.
One example where a hash table often is used is for a software virtual memory system. You would keep in a hash table the records of which blocks of data are paged in. In a virtual memory system, the most important point of optimization is the speed of locating a block that is already in memory. If you have these in a hash table, this will be very fast, and the number of pages that can be in memory at one time is relatively low (compared to the number that might be out on disk).
Q: What are the characteristics of a good hashing algorithm?
A: The hashing algorithm determines the right bucket for a particular key. First, it should be very fast, because the whole point of using a hash table is speed. Second, given a typical set of keys, it should spread them out fairly evenly over the range, so that the buckets are equally full.
Q: Give an example of a good hashing algorithm.
A: For alphanumeric strings, XORing the first byte with the last one shifted left three bits is pretty good. Follow this by ANDing the number with an appropriate bit mask to cut it down to the right number of buckets (which has to be a power of two). This example will give you a number from 0 to 63, which you would use to index to a bucket:
inline int HashingAlgorithm(const char * s) {return (s[0] ^ (s[strlen(s)] << 3)) & 0x3f;}
Q: In a binary tree, how many compares will it take to reach any one node in the tree? What is the average for all nodes as a function of the number of nodes in the tree?
A: The number of compares it will take to reach any node is the depth or level of the tree. This will average less than log2 N, where N is the total number of nodes in the tree.
Q: How much of a risk is it that binary trees will be badly unbalanced?
A: If the data being inserted truly is random, then the binary tree will stay reasonably well balanced. However, real data sometimes already is sorted, depending on the source of the data. If sorted data is inserted into a binary tree, it creates the very worst case, a long chain of only right (or only left) branches. A solution to this problem is provided on day 8.
The Workshop provides quiz questions to help you solidify your understanding of the material covered, and exercises to provide you with experience in using what you have learned. Try to answer the quiz and exercise questions before checking the answers in Appendix A, and make sure that you understand the answers before continuing to the next chapter.
inline int HashingAlgorithm(const char * s) return (s[0] ^ (s[1] << 3) ^ (s[2] >> 3)) & 0x3f;}
What classes would be involved in a hash table in order to maintain this information?
Go to: Table of Contents | Next Page