Day 15: Iterators

An iterator is an object that points into a list and can return the first, current, last, or next item on that list. Generic iterators are difficult to write and raise many interesting programming and database issues. Today you will learn

An iterator enables you to move through a list from a given starting point, examining each node in turn. General purpose iterators can be very flexible, enabling you to walk forward and backward in the list.

Most collections let the program ask for more than one iterator on the list, which means that the user can have iterators pointing to different parts of the same list. Additionally, some iterators are read/write, which enables the user to insert and delete items in the list.

Using an Iterator

The best way to study iterators is to use one to solve a real-world problem. ROBIN, as currently written, has a significant limitation: You can search only for an exact match on a given term. If there are notes with the words computer, computing, and computerize, there is no easy way to search for all these at the same time.

Presumably, the user would like to enter the string comput and see all the notes with words that begin with these six letters. Searching the list for each of these permutations is tricky, however, because the words computing and computerize might not be on the same page.

Getting from computing to computerize requires going up through two nodes and back down to a new leaf page. Putting the capability to do this into the list makes no sense; the list's primary mode of finding and adding indexes is the existing recursive routines.

Creating an iterator for the BTree class solves this problem. The program searches for the first word that matches the target string and returns that index. The search also initializes the iterator, so that calling GetNext() will return the next index that matches the string.

Because the list is sorted alphabetically, finding all the words that match the string is as simple as finding the first match, and then iterating through the list until you find the first word that does not match the target string, at which time you are done.

Understanding How the Iterator Works

To accomplish this goal, the B-tree must be extended to include an iterator object. The job of the iterator is to answer the question What is the next entry in the list?

To simplify this task, I've tailored the iterator to meet the specific needs of the existing B-tree. The iterator keeps track of every page that matches the target string until the leaf page is found. The next index is defined as the next index in the tree after the last find().

Usually, the next index is on the same page as the current index, but not always. When the iterator must find the next page, it walks up its node list, hunting for the first node that has another index in its list. After the iterator finds that node, it works its way back down the tree to the first index on the leaf page and returns that value.

It is a design goal of the Iterator class to hide all the details of finding the next index from its clients. The client calls GetNext(), which returns the offset of the next matching record, or 0 if there is none.

Illustrating an Iterator over a B-Tree

To keep the code simple, and to allow a full and in-depth exploration of how the iterator works, I'll implement only the most fundamental functionality in the iterator, as shown in the following listings. Listing 15.1 provides the header file BTree.hpp. Listing 15.2 is BTree.cpp, listing 15.3 is Page.cpp, listing 15.4 is Iterator.cpp, and listing 15.5 is Index.cpp. The data file and wnjfile listings have not changed, but are provided here for completeness as listings 15.6 and 15.7, respectively. Listing 15.9 is the driver program.

Listing 15.1 BTree.hpp

    1:  //*********************************************
    2:      // PROGRAM:  Btree, Page and Index declarations
    3:      // FILE:     btree.hpp
    4:      // PURPOSE:  provide fundamental btree functionality
    5:      // NOTES:
    6:      // AUTHOR:   Jesse Liberty (jl)
    7:      // REVISIONS: 11/1/94 1.0 jl  initial release
    8:      // **************************************************
    9:      #ifndef BTREE_HPP                 // inclusion guards
    10:     #define BTREE_HPP
    11:    #include <time.h>
    12:     #include <string.h>
    13:     #include <fstream.h>
    14:    #include "stdef.hpp"
    15:     const int Order =      4; //31;  // 31 indices and 1 header
    16:    const int dataLen =    11; // length of a key
    17:     const int MaxPages =   500; // more than we need
    18:     const int SizeItem =   16;  // key + offset
    19:     const int SizePointer = 2; // size of offset
    20:     const int PageSize = (Order+1) * SizeItem;
    21:     const int offInt = 2;
    22:     // forward declarations
    23:    class Page;
    24:     class Index;
    25:     // all members are public
    26:     // used only by Iterator
    27:    struct History
    28:     {
    29:        History(int, int,BOOL);
    30:        int PageNo;
    31:        int OffSet;
    32:        BOOL IsNode;
    33:     };
    34:     // Iterator class keeps track
    35:     // of current (and next) index
    36:     class Iterator
    37:    {
    38:    public:
    39:        Iterator();
    40:        ~Iterator();
    41:        void RecordNode(int, int);
    42:        void RecordLeaf(int, int);
    43:        History * GetLast();
    44:        History * GetFirst();
    45:        int GetNext(const Index&);
    46:        void Reset();
    47:     private:
    48:      History* myHistory[MaxPages];
    49:      int myNext;
    50:     };
    51:     class DataFile
    52:    {
    53:     public:
    54:        // constructors & destructor
    55:        DataFile();
    56:        ~DataFile(){}
    57:       // management of files
    58:       void Close();
    59:        void Create();
    60:        void GetRecord(int, char*, int&, time_t&);
    61:        int Insert(char *);
    62:     private:
    63:     fstream myFile;
    64:     };
    65:    class WNJFile
    66:     {
    67:     public:
    68:        // constructors & destructor
    69:        WNJFile();
    70:       ~WNJFile(){}
    71:        // management of files
    72:       void Close();
    73:        void Create();
    74:        int* Find(int NextWNJ);
    75:        int Insert(int, int);
    76:       int Append(int);
    77:     private:
    78:     static int myCount;
    79:     fstream myFile;
    80:     union
    81:     {
    82:        int myints[5];
    83:       char  myArray[5*4];
    84:     };
    85:     };
    86:
    87:     // IDXManager - in memory keeps track of what pages are
    88:     // already in memory and swaps to disk as required
    89:     class IDXManager
    90:    {
    91:     public:
    92:        // constructors & destructor
    93:        IDXManager();
    94:        ~IDXManager(){}
    95:
    96:        // management of files
    97:        void Close(int);
    98:        int Create();
    99:
    100:       // Page manipulation
    101:       Page* GetPage(int);
    102:       void SetPage(Page*);
    103:       void Insert(Page * newPage);
    104:       void Save(Page* pg);
    105:       int NewPage(Index&, BOOL);
    106:       int NewPage(Index *array, int offset, BOOL leaf, int count);
    107:
    108:    private:
    109:    Page* myPages[MaxPages];
    110:    fstream myFile;
    111:    int myCount;
    112:    };
    113:
    114:    // the btree itself -has a pointer to first page
    115:    class BTree
    116:    {
    117:    public:
    118:       // constructors and destructor
    119:       BTree();
    120:       ~BTree();
    121:
    122:       // utility functions
    123:       void AddKey(char* data, int offset);//,BOOL synonym = FALSE);
    124:       void Insert(char* buffer, int, char**);
    125:       void Insert(char*);
    126:       void PrintTree();
    127:       int Find(char* data);
    128:       int FindExact(char* data);
    129:       int GetNext(char* data);
    130:
    131:       // page methods
    132:       Page* GetPage(int page)
    133:          { return theIDXManager.GetPage(page); }
    134:       int NewPage(Index& pIndex, BOOL IsLeaf)
    135:          { return theIDXManager.NewPage(pIndex,FALSE); }
    136:
    137:       static int myAlNum(char ch);
    138:       // public static member!
    139:       static  IDXManager theIDXManager;
    140:       static WNJFile   theWNJFile;
    141:       static DataFile theDataFile;
    142:       static Iterator theIter;
    143:
    144:       static BOOL GetWord(char*, char*, int&);
    145:       static void GetStats();
    146:       static int NodeIndexCtr;
    147:       static int LeafIndexCtr;
    148:       static int NodePageCtr;
    149:       static int LeafPageCtr;
    150:       static int NodeIndexPerPage[Order+1];
    151:       static int LeafIndexPerPage[Order+1];
    152:
    153:    private:
    154:       int myRoot;
    155:    };
    156:
    157:    // index objects point to pages or real data
    158:    class Index
    159:    {
    160:    public:
    161:       // constructors & destructor
    162:       Index();
    163:       Index(char *);
    164:       Index (char*, int);
    165:       Index(Index&);
    166:       ~Index(){}
    167:
    168:       // accessors
    169:       const char * GetData() const  { return myData; }
    170:       void SetData(const Index& rhs)
    171:          { strcpy(myData,rhs.GetData()); }
    172:       void SetData(const char * rhs)
    173:          { strcpy(myData,rhs); }
    174:       int GetPointer()const { return myPointer; }
    175:       void SetPointer (int pg) { myPointer = pg; }
    176:
    177:       // utility functions
    178:       void PrintKey();
    179:       void PrintPage();
    180:       int Insert(Index& ref,BOOL findOnly = FALSE);//, BOOL findExac = FALSE);
    181:
    182:       int Begins(const Index& rhs) const;
    183:
    184:       // overloaded operators
    185:       int operator==(const Index& rhs);
    186:
    187:       int operator < (const Index& rhs)
    188:        {return strcmp(myData,rhs.GetData())<0;}
    189:
    190:       int operator <= (const Index& rhs)
    191:        {return strcmp(myData,rhs.GetData())<=0;}
    192:
    193:       int operator > (const Index& rhs)
    194:        {return strcmp(myData,rhs.GetData())>0;}
    195:
    196:       int operator >= (const Index& rhs)
    197:        {return strcmp(myData,rhs.GetData())>=0;}
    198:
    199:    public:
    200:       int myPointer;
    201:       char myData[SizeItem - SizePointer];
    202:    };
    203:
    204:
    205:    // pages - consist of header and array of indices
    206:    class Page
    207:    {
    208:    public:
    209:       // constructors and destructor
    210:       Page();
    211:       Page(char*);
    212:       Page(Index&,BOOL);
    213:       Page(Index*, int, BOOL, int);
    214:       ~Page(){}
    215:
    216:       // insertion and searchoperations
    217:       int Insert(Index&, BOOL findOnly = FALSE);//, BOOL synonym = FALSE);
    218:       int Find(Index& idx) { return Insert(idx,TRUE); }
    219:       int InsertLeaf(Index&);
    220:       int FindLeaf(Index&,BOOL findOnly);//, BOOL synonym = FALSE);
    221:       int InsertNode(Index&,BOOL findOnly = FALSE);//, BOOL synonym = FALSE);
    222:       void Push(Index&,int offset=0, BOOL=TRUE);
    223:
    224:       // accessors
    225:       Index GetFirstIndex() { return myKeys[0]; }
    226:       Index GetIndex(int offset) { return myKeys[offset]; }
    227:       BOOL GetIsLeaf() const { return myVars.IsLeaf; }
    228:       int GetCount() const { return myVars.myCount; }
    229:       void SetCount(int cnt) { myVars.myCount=cnt; }
    230:       time_t GetTime() { return myTime; }
    231:       BOOL GetLocked() { return myVars.IsLocked; }
    232:       void SetLocked (BOOL state) { myVars.IsLocked = state; }
    233:
    234:       // page manipulation
    235:       int GetPageNumber() const { return myVars.myPageNumber; }
    236:       Page* GetPage(int page)
    237:          { return BTree::theIDXManager.GetPage(page); }
    238:       int NewPage(Index& rIndex, BOOL IsLeaf)
    239:          { return BTree::theIDXManager.NewPage(rIndex,FALSE); }
    240:
    241:       int NewPage(Index* arr, int off,BOOL isL, int cnt)
    242:          { return   BTree::theIDXManager.NewPage(arr,off,isL, cnt); }
    243:
    244:       // utility functions
    245:       void Nullify(int offset);
    246:       void Print();
    247:       fstream& Write(fstream&);
    248:       void ReCount();
    249:
    250:       static int GetgPageNumber(){ return gPage; }
    251:       static void SetgPageNumber(int pg) { gPage = pg; }
    252:
    253:    private:
    254:       Index * const myKeys; // will point to myVars.mk
    255:       union
    256:       {
    257:          char myBlock[PageSize];   // a page from disk
    258:          struct
    259:          {
    260:             int myCount;
    261:             BOOL IsLeaf;
    262:             int myPageNumber;
    263:             BOOL IsLocked;
    264:             char filler[8]; // we want 16 bytes of overhead
    265:             char mk[Order*sizeof(Index)];  // array of indices
    266:          }myVars;
    267:       };
    268:
    269:       // memory only
    270:       static int gPage;
    271:       time_t myTime; // for lifo queue
    272:    };
    273:
    274:    #endif

Listing 15.2 BTree.cpp

    1:      // **************************************************
    2:      // PROGRAM:  Btree
    3:      // FILE:     btree.cpp
    4:      // PURPOSE:  provide fundamental btree functionality
    5:      // NOTES:
    6:      // AUTHOR:   Jesse Liberty (jl)
    7:      // REVISIONS: 11/1/94 1.0 jl  initial release
    8:      // **************************************************
    9:
    10:     #include "btree.hpp"
    11:     #include "stdef.hpp"
    12:     #include <ctype.h>
    13:     #include <stdlib.h>
    14:
    15:     // construct the tree
    16:     // initialize myRoot pointer to nothing
    17:     // either create or read the index file
    18:     BTree::BTree():myRoot(0)
    19:     {
    20:        myRoot = theIDXManager.Create();
    21:        theDataFile.Create();
    22:        theWNJFile.Create();
    23:
    24:     }
    25:
    26:     // write out the index file
    27:     BTree::~BTree()
    28:     {
    29:        theIDXManager.Close(myRoot);
    30:        theDataFile.Close();
    31:        theWNJFile.Close();
    32:     }
    33:
    34:     // find an existing record
    35:     int BTree::Find(char * str)
    36:     {
    37:        Index index(str);
    38:        BTree::theIter.Reset();
    39:        if (!myRoot)
    40:           return 0L;
    41:        else
    42:           return GetPage(myRoot)->Find(index);
    43:     }
    44:
    45:
    46:     int BTree::GetNext(char * str)
    47:     {
    48:        Index index(str);
    49:        return BTree::theIter.GetNext(index);
    50:     }
    51:
    52:
    53:     void BTree::Insert(char * buffer, int argc, char** argv)
    54:     {
    55:        // get each word,
    56:        int offset = theDataFile.Insert(buffer);
    57:        for (int i = 1; i<argc; i++)
    58:        {
    59:           AddKey(argv[i],offset);
    60:        }
    61:
    62:     }
    63:
    64:     void BTree::Insert(char* buffer)
    65:     {
    66:
    67:        if (strlen(buffer) < 3)
    68:           return;
    69:
    70:         char *buff = buffer;
    71:         char word[PageSize];
    72:         int wordOffset = 0;
    73:         int offset;
    74:
    75:        if (GetWord(buff,word,wordOffset))
    76:        {
    77:           offset = theDataFile.Insert(buffer);
    78:           AddKey(word,offset);
    79:        }
    80:
    81:
    82:        while (GetWord(buff,word,wordOffset))
    83:        {
    84:           AddKey(word,offset);
    85:        }
    86:
    87:     }
    88:
    89:     void BTree::AddKey(char * str, int offset )
    90:     {
    91:
    92:        if (strlen(str) < 3)
    93:           return;
    94:
    95:        int retVal =0;
    96:        Index index(str,offset);
    97:        if (!myRoot)
    98:        {
    99:           myRoot = theIDXManager.NewPage (index,TRUE);
    100:       }
    101:       else
    102:       {
    103:          retVal = GetPage(myRoot)->Insert(index);
    104:          if (retVal) // our root split
    105:          {
    106:             // create a pointer to the old top
    107:             Index index(GetPage(myRoot)->GetFirstIndex());
    108:             index.SetPointer(myRoot);
    109:             // make the new page & insert the index
    110:             int PageNumber = NewPage(index,FALSE);
    111:
    112:             Page* pg = GetPage(PageNumber);
    113:
    114:             //get a pointer to the new (sibling) page
    115:             Index Sib(GetPage(retVal)->GetFirstIndex());
    116:             Sib.SetPointer(retVal);
    117:             // put it into the page
    118:             pg->InsertLeaf(Sib);
    119:
    120:             // reset myRoot to point to the new top
    121:             myRoot = PageNumber;
    122:          }
    123:       }
    124:    }
    125:
    126:    void BTree::PrintTree()
    127:    {
    128:       GetPage(myRoot)->Print();
    129:
    130:       cout << "\n\nStats:" << endl;
    131:       cout << "Node pages:   " << NodePageCtr << endl;
    132:       cout << "Leaf pages:   " << LeafPageCtr << endl;
    133:       cout << "Node indexes: " << NodeIndexCtr << endl;
    134:       cout << "Leaf indexes: " << LeafIndexCtr << endl;
    135:       for (int i = 0; i < Order +1; I++)
    136:       {
    137:          if (NodeIndexPerPage[i])
    138:          {
    139:             cout << "Pages with " << i << " nodes:  ";
    140:             cout << NodeIndexPerPage[i] << endl;
    141:          }
    142:          if (LeafIndexPerPage[i])
    143:          {
    144:             cout << "Pages with " << i << " leaves: ";
    145:             cout << LeafIndexPerPage[i] << endl;
    146:          }
    147:       }
    148:
    149:    }
    150:
    151:    BOOL BTree::GetWord(char* string, char* word, int& wordOffset)
    152:    {
    153:
    154:       if (!string[wordOffset])
    155:          return FALSE;
    156:
    157:       char *p1, *p2;
    158:       p1 = p2 = string+wordOffset;
    159:
    160:       // eat leading spaces
    161:       for (int i = 0; i<(int)strlen(p1) && !BTree::myAlNum(p1[0]); i++)
    162:          p1++;
    163:
    164:       // see if you have a word
    165:       if (!BTree::myAlNum(p1[0]))
    166:          return FALSE;
    167:
    168:        p2 = p1; // point to start of word
    169:
    170:       // march p2 to end of word
    171:       while (BTree::myAlNum(p2[0]))
    172:          p2++;
    173:
    174:       int len = int(p2-p1);
    175:    #if defined(_MSVC_16BIT) || defined(_MSVC_32BIT)
       :    {
       :       len = __min(len,(int)PageSize);
       :    }
       :    #else
       :    {
       :       len = min(len,(int)PageSize);
       :    }
       :    #endif
    176:
    177:       strncpy (word,p1,len);
    178:       word[len]='\0';
    179:
    180:       for (i = int (p2-string); i<(int)strlen(string) &&
               !BTree::myAlNum(p2[0]); i++)
    181:          p2++;
    182:
    183:       wordOffset = int (p2-string);
    184:
    185:       return TRUE;
    186:    }
    187:
    188:    int BTree::myAlNum(char ch)
    189:    {
    190:       return isalnum(ch) ||
    191:          ch == '-' ||
    192:          ch == '\'' ||
    193:          ch == '(' ||
    194:          ch == ')';
    195:    }

Listing 15.3 Page.cpp

    1:      // **************************************************
    2:      // PROGRAM:  Page
    3:      // FILE:     Page.cpp
    4:      // PURPOSE:  provide fundamental btree functionality
    5:      // NOTES:
    6:      // AUTHOR:   Jesse Liberty (jl)
    7:      // REVISIONS: 11/1/94 1.0 jl  initial release
    8:      // **************************************************
    9:
    10:     #include "btree.hpp"
    11:     #include <assert.h>
    12:
    13:     // constructors
    14:
    15:     // default constructor
    16:     Page::Page()
    17:     {
    18:     }
    19:
    20:     // create a page from a buffer read in from disk
    21:     Page::Page(char *buffer):
    22:        myKeys((Index*)myVars.mk)
    23:     {
    24:        assert(sizeof(myBlock) == PageSize);
    25:        assert(sizeof(myVars) == PageSize);
    26:        memcpy(&myBlock,buffer,PageSize);
    27:        SetLocked(FALSE);
    28:        myTime = time(NULL);
    29:     }
    30:
    31:     // create a Page from the first index
    32:     Page::Page(Index& index, BOOL bLeaf):
    33:        myKeys((Index*)myVars.mk)
    34:     {
    35:
    36:        myVars.myCount=1;
    37:        myVars.IsLeaf = bLeaf;
    38:        SetLocked(FALSE);
    39:        // if this is a leaf, this is the first
    40:        // index on the first page, set its pointer
    41:        // based on creating a new wnj. otherwise
    42:        // you are here creating a new node, do not
    43:        // set the pointer, it is already set.
    44:        if (bLeaf)
    45:           index.SetPointer(BTree::theWNJFile.Append(index.GetPointer()));
    46:        myKeys[0]=index;
    47:        myVars.myPageNumber = gPage++;
    48:        myTime = time(NULL);
    49:     }
    50:
    51:     // create a page by splitting another page
    52:     Page::Page(Index *array, int offset, BOOL bLeaf,int count):
    53:        myKeys((Index*)myVars.mk)
    54:     {
    55:        myVars.IsLeaf = bLeaf;
    56:        myVars.myCount = 0;
    57:        for (int i=0, j = offset; j<Order && i < count; i++, j++)
    58:        {
    59:           myKeys[i]= array[j];
    60:           myVars.myCount++;
    61:        }
    62:        myVars.myPageNumber = gPage++;
    63:        SetLocked(FALSE);
    64:        myTime = time(NULL);
    65:     }
    66:
    67:     void Page::Nullify(int offset)
    68:     {
    69:        for (int i = offset; i<Order; i++)
    70:        {
    71:              myKeys[i].SetPointer(0);
    72:              myVars.myCount;
    73:        }
    74:     }
    75:
    76:     // decide whether I'm a leaf or a node
    77:     // and pass this index to the right
    78:     // function. If findOnly is true, don't insert
    79:     // just return the page number (for now)
    80:     int Page::Insert(Index& rIndex, BOOL findOnly)
    81:     {
    82:        int result;
    83:        if (myVars.IsLeaf)
    84:        {
    85:           SetLocked(TRUE);
    86:           result = FindLeaf(rIndex,findOnly);
    87:           SetLocked(FALSE);
    88:           return result;
    89:        }
    90:        else
    91:        {
    92:           SetLocked(TRUE);
    93:           result = InsertNode(rIndex,findOnly);
    94:           SetLocked(FALSE);
    95:           return result;
    96:        }
    97:     }
    98:
    99:
    100:    // find the right page for this index
    101:    int Page::InsertNode(Index& rIndex, BOOL findOnly)
    102:    {
    103:       int retVal =0;
    104:       BOOL inserted = FALSE;
    105:       int i;
    106:
    107:       assert(myVars.myCount>0); // nodes have at least 1
    108:       assert(myKeys[0].GetPointer()); // must be valid
    109:
    110:       if (findOnly)
    111:       {
    112:          for (i = 0; i< myVars.myCount; i++)
    113:          {
    114:             if (rIndex.Begins(myKeys[i]))
    115:             {
    116:                BTree::theIter.RecordNode(myVars.myPageNumber,i);
    117:                return myKeys[i].Insert(rIndex,findOnly);
    118:             }
    119:          }
    120:       }
    121:
    122:
    123:       // does it go before my first entry?
    124:       if (!inserted && rIndex < myKeys[0])
    125:       {
    126:
    127:          if (findOnly)
    128:             return 0L;
    129:
    130:          myKeys[0].SetData(rIndex);
    131:          BTree::theIter.RecordNode(myVars.myPageNumber,0);
    132:          retVal=myKeys[0].Insert(rIndex,findOnly);
    133:          inserted = TRUE;
    134:       }
    135:
    136:       // find insertion point
    137:       if (!inserted)
    138:       {
    139:          for (i = myVars.myCount-1; i>=0; i)
    140:          {
    141:             assert(myKeys[i].GetPointer());
    142:             if ( (rIndex >= myKeys[i]))
    143:             {
    144:                 BTree::theIter.RecordNode(myVars.myPageNumber,i);
    145:                 retVal=myKeys[i].Insert(rIndex,findOnly);
    146:                 inserted = TRUE;
    147:                 break;
    148:             }
    149:          }
    150:       }
    151:       assert(inserted);  // change to exception if not!
    152:
    153:       // if you had to split
    154:       if (retVal && !findOnly) // got back a pointer to a new page
    155:       {
    156:          Index * pIndex = new Index(GetPage(retVal)->GetFirstIndex());
    157:          pIndex->SetPointer(retVal);
    158:          retVal = InsertLeaf(*pIndex);
    159:       }
    160:       return retVal;
    161:    }
    162:
    163:    // called if current page is a leaf
    164:    int Page::FindLeaf(Index& rIndex, BOOL findOnly)
    165:    {
    166:       int result = 0;
    167:
    168:       // no duplicates!
    169:       for (int i=0; i < myVars.myCount; i++)
    170:       {
    171:          if (findOnly)
    172:          {
    173:             if (rIndex.Begins(myKeys[i]))
    174:             {
    175:                   BTree::theIter.RecordLeaf(myVars.myPageNumber,i);
    176:                   return myKeys[i].GetPointer();
    177:             }
    178:           }
    179:           else
    180:             if (rIndex == myKeys[i])
    181:                   return BTree::theWNJFile.Insert(
    182:                               rIndex.GetPointer(),
    183:                                  myKeys[i].GetPointer());
    184:       }
    185:
    186:       if (findOnly)         // not found
    187:          return result;
    188:
    189:       // this index item does not yet exist
    190:       // before you push it into the index
    191:       // push an entry into the wnj.idx
    192:       // and set the index to point to that entry
    193:       rIndex.SetPointer(BTree::theWNJFile.Append(rIndex.GetPointer())); //
               new!
    194:       return InsertLeaf(rIndex);
    195:    }
    196:
    197:    int Page::InsertLeaf(Index& rIndex)
    198:    {
    199:       int result = 0;
    200:       if (myVars.myCount < Order)
    201:          Push(rIndex);
    202:       else      // overflow the page
    203:       {
    204:             // make sibling
    205:             int NewPg =
    206:                NewPage(myKeys,Order/2,myVars.IsLeaf,myVars.myCount);
    207:             Page* Sibling = GetPage(NewPg);
    208:             Nullify(Order/2);                     // nullify my right half
    209:
    210:             // does it fit in this side?
    211:             if (myVars.myCount>Order/2-1 && rIndex <= myKeys[Order/2-1])
    212:                Push(rIndex);
    213:             else  // push it into the new sibling
    214:                Sibling->Push(rIndex);
    215:
    216:             result = NewPg; // we split, pass it up
    217:          }
    218:        return result;
    219:    }
    220:
    221:    // put the new index into this page (in order)
    222:    void Page::Push(Index& rIndex,int offset,BOOL first)
    223:    {
    224:       BOOL inserted = FALSE;
    225:       assert(myVars.myCount < Order);
    226:       for (int i=offset; i<Order && i<myVars.myCount; i++)
    227:       {
    228:          assert(myKeys[i].GetPointer());
    229:          if (rIndex <= myKeys[i])
    230:          {
    231:                Push(myKeys[i],offset+1,FALSE);
    232:                myKeys[i]=rIndex;
    233:                inserted = TRUE;
    234:                break;
    235:          }
    236:       }
    237:       if (!inserted)
    238:          myKeys[myVars.myCount] = rIndex;
    239:
    240:       if (first)
    241:          myVars.myCount++;
    242:    }
    243:
    244:
    245:    void Page::Print()
    246:    {
    247:          if (!myVars.IsLeaf)
    248:          {
    249:             BTree::NodePageCtr++;
    250:             BTree::NodeIndexPerPage[myVars.myCount]++;
    251:             BTree::NodeIndexCtr+=myVars.myCount;
    252:          }
    253:          else
    254:          {
    255:             BTree::LeafPageCtr++;
    256:             BTree::LeafIndexPerPage[myVars.myCount]++;
    257:             BTree::LeafIndexCtr+=myVars.myCount;
    258:          }
    259:
    260:          for (int i = 0; i<Order && i < myVars.myCount; i++)
    261:          {
    262:             assert(myKeys[i].GetPointer());
    263:             if (myVars.IsLeaf)
    264:                myKeys[i].PrintKey();
    265:             else
    266:                myKeys[i].PrintPage();
    267:          }
    268:    }
    269:
    270:    // write out the entire page as a block
    271:    fstream& Page::Write(fstream& file)
    272:    {
    273:       char buffer[PageSize];
    274:       memcpy(buffer,&myBlock,PageSize);
    275:       file.flush();
    276:       file.clear();
    277:       file.seekp(myVars.myPageNumber*PageSize);
    278:       file.write(buffer,PageSize);
    279:       return file;
    280:    }

Listing 15.4 Iterator.cpp

    1:      // **************************************************
    2:      // PROGRAM:  Iterator
    3:      // FILE:     iter.cpp
    4:      // PURPOSE:  iterator for BTree
    5:      // NOTES:
    6:      // AUTHOR:   Jesse Liberty (jl)
    7:      // REVISIONS: 12/1/94 1.0 jl  initial release
    8:      // **************************************************
    9:      #include "btree.hpp"
    10:
    11:     // history class implementations
    12:     History::History(int pno, int off, BOOL node):
    13:        PageNo(pno),
    14:        OffSet(off),
    15:        IsNode(node)
    16:     {}
    17:
    18:     // Iterator implementations
    19:
    20:     // start off with blank iterator
    21:     Iterator::Iterator()
    22:     {
    23:        Reset();
    24:     }
    25:
    26:     Iterator::~Iterator()
    27:     {
    28:        for (int i = 0; i<MaxPages; i++)
    29:           delete myHistory[i];
    30:     }
    31:
    32:     void Iterator::RecordNode(int PageNo, int OffSet)
    33:     {
    34:       myHistory[myNext++] = new History(PageNo,OffSet,TRUE);
    35:     }
    36:
    37:     void Iterator::RecordLeaf(int PageNo, int OffSet)
    38:     {
    39:       myHistory[myNext++] = new History(PageNo,OffSet,FALSE);
    40:     }
    41:
    42:     History* Iterator::GetLast()
    43:     {
    44:        return myHistory[myNext-1];
    45:     }
    46:
    47:     History* Iterator::GetFirst()
    48:     {
    49:        return myHistory[0];
    50:     }
    51:
    52:     void Iterator::Reset()
    53:     {
    54:        for (int i = 0; i<MaxPages; i++)
    55:           myHistory[i] = 0;
    56:        myNext = 0;
    57:     }
    58:
    59:
    60:     int Iterator::GetNext(const Index& theIndex)
    61:     {
    62:
    63:        for (;;)
    64:        {
    65:           History * pHist = GetLast();
    66:           int pgNo = pHist-> PageNo;
    67:           int newOffSet = pHist->OffSet+1;
    68:           Page * pg = BTree::theIDXManager.GetPage(pgNo);
    69:
    70:           if (newOffSet < pg->GetCount())
    71:           {
    72:              Index Idx = pg->GetIndex(newOffSet);
    73:              pHist->OffSet = newOffSet;
    74:              for (;;)
    75:              {
    76:                 if (pg->GetIsLeaf())
    77:                 {
    78:                    // cout << "Key: " << Idx.GetData() << endl;
    79:                    if (theIndex.Begins(Idx))
    80:                       return Idx.GetPointer();
    81:                    else
    82:                       return 0;
    83:                 }
    84:                 else
    85:                 {
    86:                 pg = BTree::theIDXManager.GetPage(Idx.GetPointer());
    87:                 Idx = pg->GetFirstIndex();
    88:                 pHist = new History(pg->GetPageNumber(),0, (BOOL)
                        !pg->GetIsLeaf());
    89:                 myHistory[myNext++] = pHist;
    90:                 }
    91:              }  // end inner for loop
    92:           }
    93:           else
    94:           {
    95:              delete myHistory[myNext-1];
    96:              myHistory[myNext-1] = 0;
    97:              myNext;
    98:              if (!myNext)
    99:                 return 0;
    100:          }
    101:       }  // end outer for loop
    102:    }

Listing 15.5 Index.cpp

    1:       // **************************************************
    2:       // PROGRAM:  index
    3:       // FILE:     index.cpp
    4:       // PURPOSE:  provide fundamental btree functionality
    5:       // NOTES:
    6:       // AUTHOR:   Jesse Liberty (jl)
    7:       // REVISIONS: 11/1/94 1.0 jl  initial release
    8:       // **************************************************
    9:
    10:      #include "btree.hpp"
    11:      #include <ctype.h>
    12:
    13:      Index::Index(char* str):myPointer(1)
    14:      {
    15:         strncpy(myData,str,dataLen);
    16:         myData[dataLen]='\0';
    17:         for (int i = 0; i< strlen(myData); i++)
    18:            myData[i] = toupper(myData[i]);
    19:      }
    20:
    21:      Index::Index(char* str, int ptr):myPointer(ptr)
    22:      {
    23:
    24:         strncpy(myData,str,dataLen);
    25:         myData[dataLen]='\0';
    26:         for (int i = 0; i< strlen(myData); i++)
    27:            myData[i] = toupper(myData[i]);
    28:
    29:      }
    30:
    31:      Index::Index(Index& rhs):
    32:         myPointer(rhs.GetPointer())
    33:      {
    34:         strcpy(myData, rhs.GetData());
    35:         for (int i = 0; i< strlen(myData); i++)
    36:            myData[i] = toupper(myData[i]);
    37:
    38:      }
    39:
    40:      Index::Index():myPointer(0)
    41:      {
    42:         myData[0]='\0';
    43:      }
    44:
    45:      void Index::PrintKey()
    46:      {
    47:         cout << "  " << myData;
    48:      }
    49:
    50:      void Index::PrintPage()
    51:      {
    52:         cout << "\n" << myData << ": " ;
    53:         BTree::theIDXManager.GetPage(myPointer)->Print();
    54:      }
    55:
    56:      int Index::Insert(Index& ref, BOOL findOnly)
    57:      {
    58:         return BTree::theIDXManager.GetPage(myPointer)->Insert(ref,findOnly);
    59:      }
    60:
    61:      int Index::operator==(const Index& rhs)
    62:      {
    63:         return (strcmp(myData,rhs.GetData()) == 0); // case insensitive
    64:      }
    65:
    66:      int Index::Begins(const Index& rhs) const
    67:      {
    68:         int len = strlen(myData);
    69:         if (!len)
    70:            return TRUE;
    71:         return (strncmp(myData,rhs.GetData(),len) == 0);
    72:      }

Listing 15.6 Datafile.cpp

    1:          // **************************************************
    2:          // PROGRAM:  Data file
    3:          // FILE:     datafile.cpp
    4:          // PURPOSE:
    5:          // NOTES:
    6:          // AUTHOR:   Jesse Liberty (jl)
    7:          // REVISIONS: 11/5/94 1.0 jl  initial release
    8:          // **************************************************
    9:
    10:         #include "btree.hpp"
    11:         #include <assert.h>
    12:
    13:         // on construction, try to open the file if it exists
    14:         DataFile::DataFile():
    15:            myFile("ROBIN.DAT",
    16:            ios::binary | ios::in | ios::out | ios::nocreate | ios::app)
    17:         {
    18:         }
    19:
    20:         void DataFile::Create()
    21:         {
    22:            if (!myFile) // nocreate failed, first creation
    23:            {
    24:
    25:               // open the file, create it this time
    26:               myFile.clear();
    27:
    28:               myFile.open
    29:                  ("ROBIN.DAT",ios::binary | ios::in | ios::out | ios::app);
    30:
    31:               char Header[2];
    32:               int MagicNumber = 1234; // a number we can check for
    33:               memcpy(Header,&MagicNumber,2);
    34:               myFile.clear();
    35:               myFile.flush();
    36:               myFile.seekp(0);
    37:               myFile.write(Header,2);
    38:               return;
    39:            }
    40:
    41:            // we did open the file, it already existed
    42:            // get the numbers we need
    43:            int MagicNumber;
    44:            myFile.seekg(0);
    45:            myFile.read((char *) &MagicNumber,2);
    46:
    47:            // check the magic number. If it is wrong the file is
    48:            // corrupt or this isn't the index file
    49:            if (MagicNumber != 1234)
    50:            {
    51:                  // change to an exception!!
    52:                  cout << "DataFile Magic number failed!";
    53:            }
    54:            return;
    55:         }
    56:
    57:         // write out the numbers we'll need next time
    58:         void DataFile::Close()
    59:         {
    60:            myFile.close();
    61:         }
    62:
    63:
    64:         int DataFile::Insert(char * newNote)
    65:         {
    66:            int len = strlen(newNote);
    67:            int fullLen = len + 6;
    68:
    69:            time_t theTime;
    70:            theTime = time(NULL);
    71:
    72:            char buffer[PageSize];
    73:            memcpy(buffer,&len,2);
    74:            memcpy(buffer+2,&theTime,4);
    75:            memcpy(buffer+6,newNote,len);
    76:
    77:            myFile.clear();
    78:            myFile.flush();
    79:            myFile.seekp(0,ios::end);
    80:            int offset = (int) myFile.tellp();
    81:            myFile.write(buffer,fullLen);
    82:            myFile.flush();
    83:            return offset;
    84:         }
    85:
    86:         void DataFile::GetRecord(int offset, char* buffer, int& len,
                time_t&theTime)
    87:         {
    88:            char tmpBuff[PageSize];
    89:            myFile.flush();
    90:            myFile.clear();
    91:            myFile.seekg(offset);
    92:            myFile.read(tmpBuff,PageSize);
    93:            memcpy(&len,tmpBuff,2);
    94:            memcpy(&theTime,tmpBuff+2,4);
    95:            strncpy(buffer,tmpBuff+6,len);
    96:            buffer[len] = '\0';
    97:         }

Listing 15.7 IdxMgr.cpp

    1:       // **************************************************
    2:       // PROGRAM:  Idxmr
    3:       // FILE:     Idxmgr.cpp
    4:       // PURPOSE:  provide i/o for btree
    5:       // NOTES:
    6:       // AUTHOR:   Jesse Liberty (jl)
    7:       // REVISIONS: 11/5/94 1.0 jl  initial release
    8:       // **************************************************
    9:
    10:      #include "btree.hpp"
    11:      #include <assert.h>
    12:
    13:      // on construction, try to open the file if it exists
    14:      IDXManager::IDXManager():
    15:         myFile("ROBIN.IDX",ios::binary | ios::in | ios::out | ios::nocreate)
    16:      {
    17:         // initialize the pointers to null
    18:         for (int i = 0; i< MaxPages; i++)
    19:            myPages[i] = 0;
    20:         myCount = 0;
    21:
    22:      }
    23:
    24:      // called by btree constructor
    25:      // if we opened the file, read in the numbers we need
    26:      // otherwise create the file
    27:      int IDXManager::Create()
    28:      {
    29:         if (!myFile) // nocreate failed, first creation
    30:         {
    31:            // open the file, create it this time
    32:            myFile.open("ROBIN.IDX",ios::binary | ios::in | ios::out);
    33:
    34:            char Header[PageSize];
    35:            int MagicNumber = 1234; // a number we can check for
    36:            memcpy(Header,&MagicNumber,2);
    37:            int NextPage = 1;
    38:            memcpy(Header+2,&NextPage,2);
    39:            memcpy(Header+4,&NextPage,2);
    40:            Page::SetgPageNumber(NextPage);
    41:            char title[]="ROBIN.IDX. Ver 1.00";
    42:            memcpy(Header+6,title,strlen(title));
    43:            myFile.flush();
    44:            myFile.clear();
    45:            myFile.seekp(0);
    46:            myFile.write(Header,PageSize);
    47:            return 0;
    48:         }
    49:
    50:         // we did open the file, it already existed
    51:         // get the numbers we need
    52:         int MagicNumber;
    53:         myFile.seekg(0);
    54:         myFile.read((char *) &MagicNumber,2);
    55:
    56:         // check the magic number. If it is wrong the file is
    57:         // corrupt or this isn't the index file
    58:         if (MagicNumber != 1234)
    59:         {
    60:               // change to an exception!!
    61:               cout << "Index Magic number failed!";
    62:               return 0;
    63:         }
    64:
    65:         int NextPage;
    66:         myFile.seekg(2);
    67:         myFile.read((char*) &NextPage,2);
    68:         Page::SetgPageNumber(NextPage);
    69:         int FirstPage;
    70:         myFile.seekg(4);
    71:         myFile.read((char*) &FirstPage,2);
    72:         const int room =  PageSize - 6;
    73:         char buffer[room];
    74:         myFile.read(buffer,room);
    75:         buffer[20]='\0';
    76:         // cout << buffer << endl;
    77:         // read in all the pages
    78:         for (int i = 1; i < NextPage; i++)
    79:         {
    80:               myFile.seekg(i * PageSize);
    81:               char buffer[PageSize];
    82:               myFile.read( buffer, PageSize);
    83:               Page * pg = new Page(buffer);
    84:               Insert(pg);
    85:         }
    86:
    87:         return FirstPage;
    88:      }
    89:
    90:      // write out the numbers we'll need next time
    91:      void IDXManager::Close(int theRoot)
    92:      {
    93:
    94:         for (int i = 0; i< MaxPages; i++)
    95:            if (myPages[i])
    96:               Save(myPages[i]);
    97:         int NextPage = Page::GetgPageNumber();
    98:         if (!myFile)
    99:            cout << "Error opening myFile!" << endl;
    100:        myFile.flush();
    101:        myFile.clear();
    102:        myFile.seekp(2);
    103:        myFile.write ( (char *) &NextPage,2);
    104:        myFile.seekp(4);
    105:        myFile.write((char*) &theRoot,2);
    106:        myFile.close();
    107:     }
    108:
    109:     // wrapper function
    110:     int IDXManager::NewPage(Index& index, BOOL bLeaf)
    111:     {
    112:        Page * newPage = new Page(index, bLeaf);
    113:        Insert(newPage);
    114:        Save(newPage);
    115:        return newPage->GetPageNumber();
    116:     }
    117:
    118:     int IDXManager::NewPage(
    119:        Index *array,
    120:        int offset,
    121:        BOOL leaf,
    122:        int count)
    123:     {
    124:        Page * newPage = new Page(array, offset, leaf, count);
    125:        Insert(newPage);
    126:        Save(newPage);
    127:        return newPage->GetPageNumber();
    128:     }
    129:
    130:     void IDXManager::Insert(Page * newPage)
    131:     {
    132:
    133:        // add new page into array of page managers
    134:        if (myCount < MaxPages)
    135:        {
    136:           assert(!myPages[myCount]);
    137:           myPages[myCount++] = newPage;
    138:        }
    139:        else  // no room, time to page out to disk
    140:        {
    141:           int lowest = -1;
    142:
    143:           for (int i = 0; i< MaxPages; i++)
    144:           {
    145:              if (myPages[i]->GetLocked() == FALSE )
    146:                 lowest = i;
    147:           }
    148:           if (lowest == -1)
    149:              assert(lowest != -1); // change to exception if -1 (no page to
                      kill)
    150:
    151:           for (i = 0; i< MaxPages; i++)
    152:              if (myPages[i]->GetTime() < myPages[lowest]->GetTime() &&
                      myPages[i]->GetLocked() == FALSE)
    153:
    154:
    155:                 lowest = i;
    156:
    157:           assert(myPages[lowest]);
    158:           Save(myPages[lowest]);
    159:           delete myPages[lowest];
    160:           myPages[lowest] = newPage;
    161:
    162:        }
    163:     }
    164:
    165:     // tell the page to write itself out
    166:     void IDXManager::Save(Page* pg)
    167:     {
    168:       pg->Write(myFile);
    169:     }
    170:
    171:     // see if the page is in memory, if so return it
    172:     // otherwise get it from disk
    173:     // note: this won't scale, with lots of page managers
    174:     // you'd need a more efficient search. 10 levels of page
    175:     // managers, with 31 indexes per page gives you room for
    176:     // 800 trillion words. Even if each page is only 1/2 full
    177:     // on average, 10 levels of depth would represent 64 million
    178:     // keys alone, not to mention the actual records.
    179:
    180:     Page * IDXManager::GetPage(int target)
    181:     {
    182:
    183:        for (int i = 0; i< MaxPages; i++)
    184:        {
    185:           if (myPages[i]->GetPageNumber() == target)
    186:              return  myPages[i];
    187:        }
    188:        myFile.seekg(target * PageSize);
    189:        char buffer[PageSize];
    190:        myFile.read( buffer, PageSize);
    191:        Page * pg = new Page(buffer);
    192:        Insert(pg);
    193:        return pg;
    194:     }

Listing 15.8 WNJFile.cpp

    1:        // **************************************************
    2:        // PROGRAM:  WNJ file
    3:        // FILE:     WNJfile.cpp
    4:        // PURPOSE:
    5:        // NOTES:
    6:        // AUTHOR:   Jesse Liberty (jl)
    7:        // REVISIONS: 11/5/94 1.0 jl  initial release
    8:        // **************************************************
    9:
    10:       #include "btree.hpp"
    11:       #include <assert.h>
    12:
    13:       // on construction, try to open the file if it exists
    14:       WNJFile::WNJFile():
    15:          myFile("ROBINWNJ.IDX",
    16:          ios::binary | ios::in | ios::out | ios::nocreate)
    17:       {
    18:          for (int i = 0; i<5; i++)
    19:             myints[i]=0L;
    20:       }
    21:
    22:       void WNJFile::Create()
    23:       {
    24:          char Header[4];
    25:          int MagicNumber=0; // a number we can check for
    26:          int zero = 0;
    27:
    28:          if (!myFile) // nocreate failed, first creation
    29:          {
    30:             // open the file, create it this time
    31:             myFile.clear();
    32:             myFile.open("ROBINWNJ.IDX",
    33:                ios::binary | ios::in | ios::out);
    34:
    35:             MagicNumber = 1234;
    36:             memcpy(Header,&MagicNumber,4);
    37:             memcpy(Header+2,&zero,2);
    38:             myFile.seekp(0);
    39:             myFile.write(Header,4);
    40:             myFile.flush();
    41:             return;
    42:          }
    43:
    44:          // we did open the file, it already existed
    45:          // get the numbers we need
    46:
    47:
    48:          myFile.seekg(0);
    49:          myFile.read(Header,4);
    50:          memcpy(&MagicNumber,Header,2);
    51:          memcpy(&myCount,Header+2,2);
    52:
    53:          // check the magic number. If it is wrong the file is
    54:          // corrupt or this isn't the index file
    55:          if (MagicNumber != 1234)
    56:          {
    57:                // change to an exception!!
    58:                cout << "WNJ Magic number failed!";
    59:                cout << "Magic number: " << MagicNumber;
    60:                cout << "\nmyCount: " << myCount << endl;
    61:          }
    62:          return;
    63:       }
    64:
    65:       // write out the numbers we'll need next time
    66:       void WNJFile::Close()
    67:       {
    68:          myFile.seekg(2);
    69:          myFile.write((char*)&myCount,2);
    70:          myFile.close();
    71:       }
    72:
    73:       int WNJFile::Append(int DataOffset)
    74:       {
    75:
    76:          int newPos = 4 + myCount++ * (5*2);
    77:          int offsets[5];
    78:          offsets[0] = DataOffset;
    79:          for (int i = 1; i<5; i++)
    80:             offsets[i]=0;
    81:          myFile.seekg(newPos);
    82:          myFile.write((char*)offsets,5*2);
    83:
    84:          return newPos;
    85:       }
    86:
    87:
    88:       int WNJFile::Insert(int DataOffset,int WNJOffset)
    89:       {
    90:          int ints[5];
    91:          myFile.seekg(WNJOffset);
    92:          myFile.read((char*)ints,5*2);
    93:
    94:          int offset=WNJOffset;
    95:
    96:          while (ints[4])
    97:          {
    98:             offset = ints[4];
    99:             myFile.clear();
    100:            myFile.flush();
    101:            myFile.seekg(ints[4]);
    102:            myFile.read((char*)ints,5*2);
    103:         }
    104:         if (ints[3]) // full!
    105:         {
    106:            ints[4] = Append(DataOffset);
    107:            myFile.clear();
    108:            myFile.flush();
    109:            myFile.seekg(offset);
    110:            myFile.write((char*)ints,5*2);
    111:         }
    112:         else
    113:         {
    114:            for (int i = 0; i<4; i++)
    115:            {
    116:               if (ints[i] == 0)
    117:               {
    118:                  ints[i] = DataOffset;
    119:                  myFile.clear();
    120:                  myFile.flush();
    121:                  myFile.seekg(offset);
    122:                  myFile.write((char*)ints,5*2);
    123:                  break;
    124:               }
    125:            }
    126:         }
    127:         return 0;
    128:      }
    129:
    130:
    131:      int* WNJFile::Find(int NextWNJ)
    132:      {
    133:         int ints[5];
    134:         int * results = new int[256];
    135:
    136:         int i = 0, j=0;
    137:
    138:         while (j<256)
    139:            results[j++] = 0;
    140:
    141:         j = 0;
    142:
    143:         myFile.seekg(NextWNJ);
    144:         myFile.read((char*)ints,5*2);
    145:
    146:         while (j < 256)
    147:         {
    148:            if (ints[i])
    149:            {
    150:               if (i == 4)
    151:               {
    152:                  myFile.seekg(ints[4]);
    153:                  myFile.read((char*)ints,5*2);
    154:                  i = 0;
    155:                  continue;
    156:               }
    157:            results[j++] = ints[i++];
    158:            }
    159:            else
    160:               break;
    161:         }
    162:         return results;
    163:      }

Listing 15.9 Driver.cpp

    1:       // Listing 15.9 - Driver.cpp
    2:
    3:       #include "String.hpp"
    4:       #include "stdef.hpp"
    5:       #include "btree.hpp"
    6:
    7:       IDXManager BTree::theIDXManager;
    8:       DataFile BTree::theDataFile;
    9:       WNJFile BTree::theWNJFile;
    10:      Iterator BTree::theIter;
    11:
    12:      int WNJFile::myCount=0L;
    13:      int Page::gPage=1;
    14:      int BTree::NodeIndexCtr=0;
    15:      int BTree::LeafIndexCtr=0;
    16:      int BTree::NodePageCtr=0;
    17:      int BTree::LeafPageCtr=0;
    18:      int BTree::NodeIndexPerPage[Order+1];
    19:      int BTree::LeafIndexPerPage[Order+1];
    20:
    21:      int main()
    22:      {
    23:         BTree myTree;
    24:
    25:         for (int i = 0; i < Order +2; i++)
    26:         {
    27:            BTree::NodeIndexPerPage[i] =0;
    28:            BTree::LeafIndexPerPage[i] = 0;
    29:         }
    30:
    31:         char buffer[PageSize+1];
    32:
    33:         for (;;)
    34:         {
    35:            cout << "Enter a string (blank to stop): ";
    36:            cin.getline(buffer,PageSize);
    37:            // cin.ignore(PageSize,'\n');
    38:            if (buffer[0])
    39:               myTree.Insert(buffer);
    40:            else
    41:               break;
    42:         }
    43:         for (;;)
    44:         {
    45:            cout << "Find: ";
    46:            cin.getline(buffer,255);
    47:            if (buffer[0])
    48:            {
    49:                  int offset = myTree.Find(buffer);
    50:                  while (offset)
    51:                  {
    52:                     int* found  =  BTree::theWNJFile.Find(offset);
    53:                     int i=0;
    54:                     int len;
    55:                     time_t theTime;
    56:                     char buff[512];
    57:                     while (found[i])
    58:                     {
    59:                        BTree::theDataFile.GetRecord(found[i],buff,len,
                               theTime);
    60:                        struct tm * ts = localtime(&theTime);
    61:                        cout << "Found: ";
    62:                        cout << ts->tm_mon << "/";
    63:                        cout << ts->tm_mday << "/";
    64:                        cout << ts->tm_year << " ";
    65:                        cout <<  buff << endl;
    66:                        i++;
    67:                     }
    68:                     delete [] found;
    69:                     offset = myTree.GetNext(buffer);
    70:                  }
    71:            }
    72:            else
    73:               break;
    74:         }
    75:
    76:         myTree.PrintTree();
    77:
    78:         return 0;
    79:      }
Output:
    d:\112\day16>r1
    Enter a string (blank to stop): There is a place for us
    Enter a string (blank to stop): The quick brown fox jumps
    Enter a string (blank to stop): Is that a theater production?
    Enter a string (blank to stop): Theirs is not to question
    Enter a string (blank to stop): There's no place like home
    Enter a string (blank to stop):
    Find: home
    Found: 11/2/94 There's no place like home
    Find: the
    Found: 11/2/94 The quick brown fox jumps
    Found: 11/2/94 Is that a theater production?
    Found: 11/2/94 Theirs is not to question
    Found: 11/2/94 There is a place for us
    Found: 11/2/94 There's no place like home
    Find:

    BROWN:
    BROWN:   BROWN  FOR
    FOX:   FOX  HOME  JUMPS  LIKE
    NOT:
    NOT:   NOT  PLACE  PRODUCTION  QUESTION
    QUICK:   QUICK  THAT
    THE:   THE  THEATER
    THEIRS:   THEIRS  THERE  THERE'S

    Stats:
    Node pages:   3
    Leaf pages:   6
    Node indexes: 8
    Leaf indexes: 17
    Pages with 2 nodes:  2
    Pages with 2 leaves: 3
    Pages with 3 leaves: 1
    Pages with 4 nodes:  1
    Pages with 4 leaves: 2
Analysis:

In line 40 of BTree.hpp (refer to listing 15.1), the Iterator class is declared. It has an array of history structures, which are declared in lines 30 through 36 of the same file. Each history object knows which binary page it is on, and what its offset is. It also knows whether it is a node (or a page).

In line 155, a static iterator theIter is declared. In line 38 of listing 15.2, the iterator is reset. In line 49, you see that the BTree method, GetNext(), creates an index object and passes it in to the iterator's GetNext() method, which iterates over its list of objects.

In line 10 of the driver program (refer to listing 15.9), the iterator is defined (space is allocated). In line 45, the user is prompted to enter a string for which the program will search. The search is in line 52 and, once found, the record is displayed. In line 69, the iteratorwhich was created in the Find() methodis used to find the next matching record.

Extending the Iterator

A full-fledged iterator should support the previous operation, as well as the next operation. It also should let you start at the first or last node in the tree, and should tell you whether you have gone past the start or end of the tree.

If you were building an iterator for general use, you would want to add a method to BTree that returns a copy of its iterator. You also would want to add methods to Iterator itself (or to BTree as appropriate) to set the iterator to the first or last index in the first or last page.

Examining Issues with Iterators

As mentioned earlier, the trickiest problem when working with iterators is when you allow additions and deletions to the list through the iterator, and you also allow more than one iterator to point to the same list.

If you need to implement this functionality, you have to provide your tree with a list of all the iterators currently pointing into the tree. If the tree is changed for any reason (a node is added or deleted, for example), you need to update all the iterators so that they still are valid.

You didn't have this problem with the iterator shown in the preceding section, because the iterator never was provided as a stand-alone object; the tree maintained its own iterator and kept it up to date each time it did a search.

Summary

Today you learned what iterators are and how to create an iterator for the BTree class. You used the iterator to search the list, and you examined many of the programming trade-offs that iterators present.

Q&A

Q: Why bother with iterators?

A: An iterator enables you to walk a list that otherwise might not be presented in the desired order.

Q: Can you have more than one iterator on a list?

A: Yes, that is one of the great advantages of iterators. You can walk the list from different parts of the same set of data.

Q: What is a disadvantage of providing an iterator?

A: Writing a complete Iterator class is not easy. There are significant issues having to do with adding and deleting items in the list.

Workshop

The Workshop provides quiz questions to help you solidify your understanding of the material covered, and exercises to provide you with experience in using what you have learned. Try to answer the quiz and exercise questions before checking the answers in Appendix A, and make sure that you understand the answers before continuing to the next chapter.

Quiz

  1. What is the difference between an iterator and an index?

  2. Why does the delete operation present a challenge for the iterator?

  3. What is the principal design goal of a good iterator?

[Click here for Answers]

Exercises

  1. Design (write the public interface for) these two classes: IntLista list of integers and IntListIterator. The only public interface that IntList should provide (besides constructors and trivial accessors) is a method that returns a new IntListIterator. Do not use linked lists; instead, just provide a very simple, fixed-size array of ints. (The size should be passed in as an argument to the constructor. You do not need to provide resize capability.)

    The IntListIterator should provide methods to get and set the current value (the value in the list that the iterator is pointing to). It also should provide the capability to walk forward and backward through the list, and to jump to the beginning or the end. (Indexed access to the list is unnecessary.) All the methods that move the current position should return a Boolean that specifies whether the current position is valid.

  2. Write the body of the IntList and IntListIterator classes.

  3. Write a test program proving that your IntList works.

  4. Convert the IntList and IntListIterator to use templates and to accept the type of elements in the list as a parameter.

  5. Write a test program that proves GenList<int> works the same as IntList did. Also, you should create a GenList<char *> and demonstrate that your polymorphic list works in this case, as well.

Go to: Table of Contents | Next Page