Day 13: ROBIN, V.1

The program is almost ready for your first release. You are, by now, eager to get an early version of the program into the hands of some "friendly users" who will help you test not only the robustness of the code, but the concept itself. There are just a few changes left to make to ensure that the program is usable. Today you will learn

Testing Your Program

In a real commercial release, there are a number of very important steps between finishing your program and shipping it to your customers. Typically, you would want the program extensively tested: first by yourself, then by a quality assurance team, and finally by friendly users.

After you have thoroughly tested the program, you will pass it to a professional quality assurance engineer, whose job it is to find those bugs you simply cannot find. This is a profession unto itself, with a specific intellectual discipline. If you don't have access to a QA engineer, however, you can get at least some of the benefits of QA testing by having someone other than you use the program for a while. You will want someone who will be forgiving of your bugs, but who also will be thorough at testing the product and detailed in writing down the specific steps that re-create the bug.

After you pass quality assurance, your program is ready for use by friendly users. Friendly users are, typically, divided first into co-workers and friends, and then into people outside your firm who would like a look at a preliminary release. The first group, family, friends, and folks with whom you work, usually are called alpha testers, because they see the alpha, or first, version of your program.

After your alpha users have had the product for a while, you may want to enter beta testing, the second round of testing. In this round, you give your program to people you know less well. They will put up with a few obscure bugs, and report faithfully on what they find, in exchange for an early (and usually free) copy of the program.

You can consider a commercial release only after your program has passed all these hurdles. The usual rule of thumb is that a week or two of QA testing, a month or so of alpha, and a few months of beta will suffice for most programs. A particularly simple utility might need less time, and a very complex program might need considerably more time. Programmers call the first release version of their program gold, or shrink-wrap. Some especially complex programs may go to an extended test of many potential users just before going gold; this often is called gamma testing.

Note: The code for ROBIN has been subjected to scrutiny and extensive testing, but nothing like the level of testing that would be appropriate for a commercial program. Remember that the code provided is for illustration purposes only, and treat this program like a beta release; do not assume that it is 100 percent bug free.

Finishing ROBIN, V.1

The program, as you left it on day 12, has almost all the functionality promised for your first release. You just need to add a driver program that reads the command line and takes the appropriate action. In the version shown today, the user may type ROBIN, followed by text to enter or by one of three flags:

In this version of the program, none of this functionality, reading the command line, parsing the commands, reading in an input file, and so onis part of the class structure. All this is done in v1.cpp, a collection of small functions wrapped around the classes created in previous days.

Locking the Pages

One nasty bug waiting to happen for all these days has been the possibility that a page will be swapped out of memory by the index manager while still in use due to the recursion in the insertion methods. This has not been a problem when adding small records, but now that you will have the capability to add hundreds of records from a file, you will need to protect against this problem.

A careful examination of the code as it currently exists shows that the only time a page is deleted from memory is when the index manager is inserting a new page. New pages are inserted only during the recursive Insert() call. Thus, if you can lock a page when you call Insert() and unlock it when you finish that call, you should be safe.

A simple approach to this is to create a flag in Page and check that flag before removing the page from memory. Until now, I have not used the lastKey member variable, so I have renamed it IsLocked and I'll use that as the flag. Because this is private member data, accessor methods have been created to set and obtain the status of this lock.

Note carefully that in most applications a simple Boolean flag such as this will not be sufficient. If more than one function could lock or unlock the same page, you would be in danger of item 1 locking page 5, item 2 locking it as well, and then item 1 unlocking it and thereby losing the lock from item 2! Because there is only one source of a lock in the current case, a Boolean is quite safe, however.

Displaying Results

The specification calls for displaying a list of all notes that match the search criteria. I truncate the display so that the list stays orderly and neat. What is displayed is a menu of numbered choices and the first few letters from each note.

After the user presses the number next to the note, the entire note is displayed and the menu of choices is redisplayed. In a graphical environment, you might put the menu of choices into a list box and have a viewer display the contents of the highlighted note.

For purposes of keeping the code simple and the results readable, I restrict the set of matches to the first 20 found, but this is entirely arbitrary. In a later version of ROBIN, you may want to allow the user to page through these results.

Exercising the Program

After you use the program for a while, create a large text file and feed it to ROBIN using the -F flag. This will load in hundreds of keys and dozens of notes and give you a chance to examine the performance and reliability of the program under "load." I read in the entire text for day 12, and produced the output shown after the following listings, along with the ancillary functions as described earlier.

    Listing 13.1     Btree.hpp

    Listing 13.2     Btree.cpp

    Listing 13.3     Datafile.cpp

    Listing 13.4     IDXMgr.cpp

    Listing 13.5     Index.cpp

    Listing 13.6     Page.cpp

    Listing 13.7     WNJFile.cpp

    Listing 13.8     The driver program, R1.cpp

Listing 13.1 Btree.hpp

    1:        // **************************************************
    2:        // PROGRAM:  Btree, Page and Index declarations
    3:        // FILE:     btree.hpp
    4:        // PURPOSE:  provide fundamental btree functionality
    5:        // NOTES:
    6:        // AUTHOR:   Jesse Liberty (jl)
    7:        // REVISIONS: 11/1/94 1.0 jl  initial release
    8:        // **************************************************
    9:        #ifndef BTREE_HPP                 // inclusion guards
    10:       #define BTREE_HPP
    11:
    12:       #include <time.h>
    13:       #include <string.h>
    14:       #include <fstream.h>
    15:
    16:       #include "stdef.hpp"
    17:
    18:       const int Order =      31;  // 31 indexes and 1 header
    19:       const int dataLen =    11; // length of a key
    20:       const int MaxPages =   10; // more than we need
    21:       const int SizeItem =   16;  // key + offset
    22:       const int SizePointer = 2; // size of offset
    23:       const int PageSize = (Order+1) * SizeItem;
    24:
    25:       // forward declarations
    26:       class Page;
    27:       class Index;
    28:
    29:       class DataFile
    30:       {
    31:       public:
    32:          // constructors & destructor
    33:          DataFile();
    34:          ~DataFile(){}
    35:
    36:          // management of files
    37:          void Close();
    38:          void Create();
    39:          void GetRecord(int, char*, int&, time_t&);
    40:
    41:          int Insert(char *);
    42:
    43:       private:
    44:       fstream myFile;
    45:       };
    46:
    47:       class WNJFile
    48:       {
    49:       public:
    50:          // constructors & destructor
    51:          WNJFile();
    52:          ~WNJFile(){}
    53:
    54:          // management of files
    55:          void Close();
    56:          void Create();
    57:          int* Find(int NextWNJ);
    58:          int Insert(int, int);
    59:          int Append(int);
    60:
    61:       private:
    62:       static int myCount;
    63:       fstream myFile;
    64:       union
    65:       {
    66:          int myints[5];
    67:          char  myArray[5*4];
    68:       };
    69:       };
    70:
    71:       // IDXManager - in memory keeps track of what pages are
    72:       // already in memory and swaps to disk as required
    73:       class IDXManager
    74:       {
    75:       public:
    76:          // constructors & destructor
    77:          IDXManager();
    78:          ~IDXManager(){}
    79:
    80:          // management of files
    81:          void Close(int);
    82:          int Create();
    83:
    84:          // Page manipulation
    85:          Page* GetPage(int);
    86:          void SetPage(Page*);
    87:          void Insert(Page * newPage);
    88:          void Save(Page* pg);
    89:          int NewPage(Index&, BOOL);
    90:          int NewPage(Index *array, int offset, BOOL leaf, int count);
    91:
    92:       private:
    93:       Page* myPages[MaxPages];
    94:       fstream myFile;
    95:       int myCount;
    96:       };
    97:
    98:       // the btree itself - has a pointer to first page
    99:       class BTree
    100:      {
    101:      public:
    102:         // constructors and destructor
    103:         BTree();
    104:         ~BTree();
    105:
    106:         // utility functions
    107:         void AddKey(char* data, int offset);
    108:         void Insert(char* buffer, int, char**);
    109:         void Insert(char*);
    110:         void PrintTree();
    111:         int Find(char* data);
    112:
    113:         // page methods
    114:         Page* GetPage(int page)
    115:            { return theDiskManager.GetPage(page); }
    116:         int NewPage(Index& pIndex, BOOL IsLeaf)
    117:            { return theDiskManager.NewPage(pIndex,FALSE); }
    118:
    119:         static int myAlNum(char ch);
    120:         // public static member!
    121:         static  IDXManager theDiskManager;
    122:         static WNJFile   theWNJFile;
    123:         static DataFile theDataFile;
    124:         static BOOL GetWord(char*, char*, int&);
    125:         static void GetStats();
    126:         static int NodeIndexCtr;
    127:         static int LeafIndexCtr;
    128:         static int NodePageCtr;
    129:         static int LeafPageCtr;
    130:         static int NodeIndexPerPage[Order+1];
    131:         static int LeafIndexPerPage[Order+1];
    132:
    133:      private:
    134:         int myRoot;
    135:      };
    136:
    137:      // index objects point to pages or real data
    138:      class Index
    139:      {
    140:      public:
    141:         // constructors & destructor
    142:         Index();
    143:         Index(char *);
    144:         Index (char*, int);
    145:         Index(Index&);
    146:         ~Index(){}
    147:
    148:         // accessors
    149:         const char * GetData() const  { return myData; }
    150:         void SetData(const Index& rhs)
    151:            { strcpy(myData,rhs.GetData()); }
    152:         void SetData(const char * rhs)
    153:            { strcpy(myData,rhs); }
    154:         int GetPointer()const { return myPointer; }
    155:         void SetPointer (int pg) { myPointer = pg; }
    156:
    157:         // utility functions
    158:         void PrintKey();
    159:         void PrintPage();
    160:         int Insert(Index& ref,BOOL findOnly = FALSE);
    161:
    162:         // overloaded operators
    163:         int operator==(const Index& rhs);
    164:
    165:         int operator < (const Index& rhs)
    166:          {return strcmp(myData,rhs.GetData())<0;}
    167:
    168:         int operator <= (const Index& rhs)
    169:          {return strcmp(myData,rhs.GetData())<=0;}
    170:
    171:         int operator > (const Index& rhs)
    172:          {return strcmp(myData,rhs.GetData())>0;}
    173:
    174:         int operator >= (const Index& rhs)
    175:          {return strcmp(myData,rhs.GetData())>=0;}
    176:
    177:      public:
    178:         int myPointer;
    179:         char myData[SizeItem  SizePointer];
    180:      };
    181:
    182:
    183:      // pages - consist of header and array of indexes
    184:      class Page
    185:      {
    186:      public:
    187:         // constructors and destructor
    188:         Page();
    189:         Page(char*);
    190:         Page(Index&,BOOL);
    191:         Page(Index*, int, BOOL, int);
    192:         ~Page(){}
    193:
    194:         // insertion and searchoperations
    195:         int Insert(Index&, BOOL findOnly = FALSE);
    196:         int Find(Index& idx) { return Insert(idx,TRUE); }
    197:         int InsertLeaf(Index&);
    198:         int FindLeaf(Index&,BOOL findOnly);
    199:         int InsertNode(Index&,BOOL findOnly = FALSE);
    200:         void Push(Index&,int offset=0, BOOL=TRUE);
    201:
    202:         // accessors
    203:         Index GetFirstIndex() { return myKeys[0]; }
    204:         BOOL GetIsLeaf() const { return myVars.IsLeaf; }
    205:         int GetCount() const { return myVars.myCount; }
    206:         void SetCount(int cnt) { myVars.myCount=cnt; }
    207:         time_t GetTime() { return myTime; }
    208:         BOOL GetLocked() { return myVars.IsLocked; }
    209:         void SetLocked (BOOL state) { myVars.IsLocked = state; }
    210:
    211:         // page manipulation
    212:         int GetPageNumber() const { return myVars.myPageNumber; }
    213:         Page* GetPage(int page)
    214:            { return BTree::theDiskManager.GetPage(page); }
    215:         int NewPage(Index& rIndex, BOOL IsLeaf)
    216:            { return BTree::theDiskManager.NewPage(rIndex,FALSE); }
    217:
    218:         int NewPage(Index* arr, int off,BOOL isL, int cnt)
    219:            { return   BTree::theDiskManager.NewPage(arr,off,isL, cnt); }
    220:
    221:         // utility functions
    222:         void Nullify(int offset);
    223:         void Print();
    224:         fstream& Write(fstream&);
    225:         void ReCount();
    226:
    227:         static int GetgPageNumber(){ return gPage; }
    228:         static void SetgPageNumber(int pg) { gPage = pg; }
    229:
    230:      private:
    231:         Index * const myKeys; // will point to myVars.mk
    232:         union
    233:         {
    234:            char myBlock[PageSize];   // a page from disk
    235:            struct
    236:            {
    237:               int myCount;
    238:               BOOL IsLeaf;
    239:               int myPageNumber;
    240:               BOOL IsLocked;
    241:               char overhead[8];
    242:               char mk[Order*sizeof(Index)];  // array of indexes
    243:            }myVars;
    244:         };
    245:
    246:         // memory only
    247:         static int gPage;
    248:         time_t myTime; // for lifo queue
    249:      };
    250:
    251:      #endif
    252:
    253:
    254:

Listing 13.2 Btree.cpp

    1:      // **************************************************
    2:      // PROGRAM:  BTree
    3:      // FILE:     btree.cpp
    4:      // PURPOSE:  provide fundamental btree functionality
    5:      // NOTES:
    6:      // AUTHOR:   Jesse Liberty (jl)
    7:      // REVISIONS: 11/1/94 1.0 jl  initial release
    8:      // **************************************************
    9:
    10:     #include "btree.hpp"
    11:     #include "stdef.hpp"
    12:     #include <ctype.h>
    13:     #include <stdlib.h>
    14:
    15:     // construct the tree
    16:     // initialize myRoot pointer to nothing
    17:     // either create or read the index file
    18:     BTree::BTree():myRoot(0)
    19:     {
    20:        myRoot = theDiskManager.Create();
    21:        theDataFile.Create();
    22:        theWNJFile.Create();
    23:
    24:     }
    25:
    26:     // write out the index file
    27:     BTree::~BTree()
    28:     {
    29:        theDiskManager.Close(myRoot);
    30:        theDataFile.Close();
    31:        theWNJFile.Close();
    32:     }
    33:
    34:     // find an existing record
    35:     int BTree::Find(char * str)
    36:     {
    37:        Index index(str);
    38:        if (!myRoot)
    39:           return 0L;
    40:        else
    41:           return GetPage(myRoot)->Find(index);
    42:     }
    43:
    44:     void BTree::Insert(char * buffer, int argc, char** argv)
    45:     {
    46:        // get each word,
    47:        int offset = theDataFile.Insert(buffer);
    48:        for (int i = 1; i<argc; i++)
    49:        {
    50:           AddKey(argv[i],offset);
    51:        }
    52:
    53:     }
    54:
    55:     void BTree::Insert(char* buffer)
    56:     {
    57:
    58:        if (strlen(buffer) < 3)
    59:           return;
    60:
    61:         char *buff = buffer;
    62:         char word[PageSize];
    63:         int wordOffset = 0;
    64:         int offset;
    65:
    66:        if (GetWord(buff,word,wordOffset))
    67:        {
    68:           offset = theDataFile.Insert(buffer);
    69:           AddKey(word,offset);
    70:        }
    71:
    72:
    73:        while (GetWord(buff,word,wordOffset))
    74:        {
    75:           AddKey(word,offset);
    76:        }
    77:
    78:     }
    79:
    80:
    81:
    82:
    83:     void BTree::AddKey(char * str, int offset )
    84:     {
    85:
    86:        if (strlen(str) < 3)
    87:           return;
    88:
    89:        int retVal =0;
    90:        Index index(str,offset);
    91:        if (!myRoot)
    92:        {
    93:           myRoot = theDiskManager.NewPage (index,TRUE);
    94:        }
    95:        else
    96:        {
    97:           retVal = GetPage(myRoot)->Insert(index);
    98:           if (retVal) // our root split
    99:           {
    100:             // create a pointer to the old top
    101:             Index index(GetPage(myRoot)->GetFirstIndex());
    102:             index.SetPointer(myRoot);
    103:             // make the new page & insert the index
    104:             int PageNumber = NewPage(index,FALSE);
    105:
    106:             Page* pg = GetPage(PageNumber);
    107:
    108:             //get a pointer to the new (sibling) page
    109:             Index Sib(GetPage(retVal)->GetFirstIndex());
    110:             Sib.SetPointer(retVal);
    111:             // put it into the page
    112:             pg->InsertLeaf(Sib);
    113:
    114:             // reset myRoot to point to the new top
    115:             myRoot = PageNumber;
    116:          }
    117:       }
    118:    }
    119:
    120:    void BTree::PrintTree()
    121:    {
    122:       GetPage(myRoot)->Print();
    123:
    124:       cout << "\n\nStats:" << endl;
    125:       cout << "Node pages:   " << NodePageCtr << endl;
    126:       cout << "Leaf pages:   " << LeafPageCtr << endl;
    127:       cout << "Node indexes: " << NodeIndexCtr << endl;
    128:       cout << "Leaf indexes: " << LeafIndexCtr << endl;
    129:       for (int i = 0; i < Order +2; i++)
    130:       {
    131:          if (NodeIndexPerPage[i])
    132:          {
    133:             cout << "Pages with " << i << " nodes:  ";
    134:             cout << NodeIndexPerPage[i] << endl;
    135:          }
    136:          if (LeafIndexPerPage[i])
    137:          {
    138:             cout << "Pages with " << i << " leaves: ";
    139:             cout << LeafIndexPerPage[i] << endl;
    140:          }
    141:       }
    142:
    143:    }
    144:
    145:
    146:
    147:    BOOL BTree::GetWord(char* string, char* word, int& wordOffset)
    148:    {
    149:
    150:       if (!string[wordOffset])
    151:          return FALSE;
    152:
    153:       char *p1, *p2;
    154:       p1 = p2 = string+wordOffset;
    155:
    156:       // eat leading spaces
    157:       for (int i = 0; i<(int)strlen(p1) && !BTree::myAlNum(p1[0]); i++)
    158:          p1++;
    159:
    160:       // see if you have a word
    161:       if (!BTree::myAlNum(p1[0]))
    162:          return FALSE;
    163:
    164:        p2 = p1; // point to start of word
    165:
    166:       // march p2 to end of word
    167:       while (BTree::myAlNum(p2[0]))
    168:          p2++;
    169:
    170:       int len = int (p2 - p1);
    171:       int pgSize = PageSize;
    172:    #if defined(_MSVC_16BIT) || defined(_MSVC_32BIT)
       :    {
       :       len = __min(len,(int)PageSize);
       :    }
       :    #else
       :    {
       :       len = min(len,(int)PageSize);
       :    }
       :    #endif
    173:
    174:       strncpy (word,p1,len);
    175:       word[len]='\0';
    176:
    177:       for (i = int(p2-string); i<(int)strlen(string) && !BTree::myAlNum(p2[0]); i++)
    178:          p2++;
    179:
    180:       wordOffset = int(p2-string);
    181:
    182:       return TRUE;
    183:    }
    184:
    185:    int BTree::myAlNum(char ch)
    186:    {
    187:       return isalnum(ch) ||
    188:          ch == '-' ||
    189:          ch == '\'' ||
    190:          ch == '(' ||
    191:          ch == ')';
    192:    }
    193:

Listing 13.3 Datafile.cpp

    1:          // **************************************************
    2:          // PROGRAM:  Data file
    3:          // FILE:     datafile.cpp
    4:          // PURPOSE:
    5:          // NOTES:
    6:          // AUTHOR:   Jesse Liberty (jl)
    7:          // REVISIONS: 11/5/94 1.0 jl  initial release
    8:          // **************************************************
    9:
    10:         #include "btree.hpp"
    11:         #include <assert.h>
    12:
    13:         // on construction, try to open the file if it exists
    14:         DataFile::DataFile():
    15:            myFile("ROBIN.DAT",
    16:            ios::binary | ios::in | ios::out | ios::nocreate | ios::app)
    17:         {
    18:         }
    19:
    20:         void DataFile::Create()
    21:         {
    22:            if (!myFile) // nocreate failed, first creation
    23:            {
    24:
    25:               // open the file, create it this time
    26:               myFile.clear();
    27:
    28:               myFile.open
    29:                  ("ROBIN.DAT",ios::binary | ios::in | ios::out | ios::app);
    30:
    31:               char Header[2];
    32:               int MagicNumber = 1234; // a number we can check for
    33:               memcpy(Header,&MagicNumber,2);
    34:               myFile.clear();
    35:               myFile.flush();
    36:               myFile.seekp(0);
    37:               myFile.write(Header,2);
    38:               return;
    39:            }
    40:
    41:            // we did open the file, it already existed
    42:            // get the numbers we need
    43:            int MagicNumber;
    44:            myFile.seekg(0);
    45:            myFile.read((char *) &MagicNumber,2);
    46:
    47:            // check the magic number. If it is wrong the file is
    48:            // corrupt or this isn't the index file
    49:            if (MagicNumber != 1234)
    50:            {
    51:                  // change to an exception!!
    52:                  cout << "DataFile Magic number failed!";
    53:            }
    54:            return;
    55:         }
    56:
    57:         // write out the numbers we'll need next time
    58:         void DataFile::Close()
    59:         {
    60:            myFile.close();
    61:         }
    62:
    63:
    64:         int DataFile::Insert(char * newNote)
    65:         {
    66:            int len = strlen(newNote);
    67:            int fullLen = len + 6;
    68:
    69:            time_t theTime;
    70:            theTime = time(NULL);
    71:
    72:            char buffer[PageSize];
    73:            memcpy(buffer,&len,2);
    74:            memcpy(buffer+2,&theTime,4);
    75:            memcpy(buffer+6,newNote,len);
    76:
    77:            myFile.clear();
    78:            myFile.flush();
    79:            myFile.seekp(0,ios::end);
    80:            int offset = (int) myFile.tellp();
    81:            myFile.write(buffer,fullLen);
    82:            myFile.flush();
    83:            return offset;
    84:         }
    85:
    86:         void DataFile::GetRecord(int offset, char* buffer, int& len, time_t&
                theTime)
    87:         {
    88:            char tmpBuff[PageSize];
    89:            myFile.flush();
    90:            myFile.clear();
    91:            myFile.seekg(offset);
    92:            myFile.read(tmpBuff,PageSize);
    93:            memcpy(&len,tmpBuff,2);
    94:            memcpy(&theTime,tmpBuff+2,4);
    95:            strncpy(buffer,tmpBuff+6,len);
    96:            buffer[len] = '\0';
    97:         }

Listing 13.4 IDXMgr.cpp

    1:       // **************************************************
    2:       // PROGRAM:  Idxmr
    3:       // FILE:     Idxmgr.cpp
    4:       // PURPOSE:  provide i/o for btree
    5:       // NOTES:
    6:       // AUTHOR:   Jesse Liberty (jl)
    7:       // REVISIONS: 11/5/94 1.0 jl  initial release
    8:       // **************************************************
    9:
    10:      #include "btree.hpp"
    11:      #include <assert.h>
    12:
    13:      // on construction, try to open the file if it exists
    14:      IDXManager::IDXManager():
    15:         myFile("ROBIN.IDX",ios::binary | ios::in | ios::out | ios::nocreate)
    16:      {
    17:         // initialize the pointers to null
    18:         for (int i = 0; i< MaxPages; i++)
    19:            myPages[i] = 0;
    20:         myCount = 0;
    21:
    22:      }
    23:
    24:      // called by btree constructor
    25:      // if we opened the file, read in the numbers we need
    26:      // otherwise create the file
    27:      int IDXManager::Create()
    28:      {
    29:         if (!myFile) // nocreate failed, first creation
    30:         {
    31:            // open the file, create it this time
    32:            myFile.open("ROBIN.IDX",ios::binary | ios::in | ios::out);
    33:
    34:            char Header[PageSize];
    35:            int MagicNumber = 1234; // a number we can check for
    36:            memcpy(Header,&MagicNumber,2);
    37:            int NextPage = 1;
    38:            memcpy(Header+2,&NextPage,2);
    39:            memcpy(Header+4,&NextPage,2);
    40:            Page::SetgPageNumber(NextPage);
    41:            char title[]="ROBIN.IDX. Ver 1.00";
    42:            memcpy(Header+6,title,strlen(title));
    43:            myFile.flush();
    44:            myFile.clear();
    45:            myFile.seekp(0);
    46:            myFile.write(Header,PageSize);
    47:            return 0;
    48:         }
    49:
    50:         // we did open the file, it already existed
    51:         // get the numbers we need
    52:         int MagicNumber;
    53:         myFile.seekg(0);
    54:         myFile.read((char *) &MagicNumber,2);
    55:
    56:         // check the magic number. If it is wrong the file is
    57:         // corrupt or this isn't the index file
    58:         if (MagicNumber != 1234)
    59:         {
    60:               // change to an exception!!
    61:               cout << "Index Magic number failed!";
    62:               return 0;
    63:         }
    64:
    65:         int NextPage;
    66:         myFile.seekg(2);
    67:         myFile.read((char*) &NextPage,2);
    68:         Page::SetgPageNumber(NextPage);
    69:         int FirstPage;
    70:         myFile.seekg(4);
    71:         myFile.read((char*) &FirstPage,2);
    72:         const int room =  PageSize - 6;
    73:         char buffer[room];
    74:         myFile.read(buffer,room);
    75:         buffer[20]='\0';
    76:         // cout << buffer << endl;
    77:         // read in all the pages
    78:         for (int i = 1; i < NextPage; i++)
    79:         {
    80:               myFile.seekg(i * PageSize);
    81:               char buffer[PageSize];
    82:               myFile.read( buffer, PageSize);
    83:               Page * pg = new Page(buffer);
    84:               Insert(pg);
    85:         }
    86:
    87:         return FirstPage;
    88:      }
    89:
    90:      // write out the numbers we'll need next time
    91:      void IDXManager::Close(int theRoot)
    92:      {
    93:
    94:         for (int i = 0; i< MaxPages; i++)
    95:            if (myPages[i])
    96:               Save(myPages[i]);
    97:         int NextPage = Page::GetgPageNumber();
    98:         if (!myFile)
    99:            cout << "Error opening myFile!" << endl;
    100:        myFile.flush();
    101:        myFile.clear();
    102:        myFile.seekp(2);
    103:        myFile.write ( (char *) &NextPage,2);
    104:        myFile.seekp(4);
    105:        myFile.write((char*) &theRoot,2);
    106:        myFile.close();
    107:     }
    108:
    109:     // wrapper function
    110:     int IDXManager::NewPage(Index& index, BOOL bLeaf)
    111:     {
    112:        Page * newPage = new Page(index, bLeaf);
    113:        Insert(newPage);
    114:        Save(newPage);
    115:        return newPage->GetPageNumber();
    116:     }
    117:
    118:     int IDXManager::NewPage(
    119:        Index *array,
    120:        int offset,
    121:        BOOL leaf,
    122:        int count)
    123:     {
    124:        Page * newPage = new Page(array, offset, leaf,count);
    125:        Insert(newPage);
    126:        Save(newPage);
    127:        return newPage->GetPageNumber();
    128:     }
    129:
    130:     void IDXManager::Insert(Page * newPage)
    131:     {
    132:
    133:        // add new page into array of page managers
    134:        if (myCount < MaxPages)
    135:        {
    136:           assert(!myPages[myCount]);
    137:           myPages[myCount++] = newPage;
    138:        }
    139:        else  // no room, time to page out to disk
    140:        {
    141:           int lowest = -1;
    142:
    143:           for (int i = 0; i< MaxPages; i++)
    144:           {
    145:              if (myPages[i]->GetLocked() == FALSE )
    146:                 lowest = i;
    147:           }
    148:           if (lowest == -1)
    149:              assert(lowest != -1); // change to exception if -1 (no page to
                      kill)
    150:
    151:           for (i = 0; i< MaxPages; i++)
    152:              if (myPages[i]->GetTime() < myPages[lowest]->GetTime() &&
                      myPages[i]->GetLocked() == FALSE)
    153:
    154:
    155:                 lowest = i;
    156:
    157:           assert(myPages[lowest]);
    158:           Save(myPages[lowest]);
    159:           delete myPages[lowest];
    160:           myPages[lowest] = newPage;
    161:
    162:        }
    163:     }
    164:
    165:     // tell the page to write itself out
    166:     void IDXManager::Save(Page* pg)
    167:     {
    168:       pg->Write(myFile);
    169:     }
    170:
    171:     // see if the page is in memory, if so return it
    172:     // otherwise get it from disk
    173:     // note: this won't scale, with lots of page managers
    174:     // you'd need a more efficient search. 10 levels of page
    175:     // managers, with 31 indexes per page gives you room for
    176:     // 800 trillion words. Even if each page is only 1/2 full
    177:     // on average, 10 levels of depth would represent 64 million
    178:     // keys alone, not to mention the actual records.
    179:
    180:     Page * IDXManager::GetPage(int target)
    181:     {
    182:
    183:        for (int i = 0; i< MaxPages; i++)
    184:        {
    185:           if (myPages[i]->GetPageNumber() == target)
    186:              return  myPages[i];
    187:        }
    188:        myFile.seekg(target * PageSize);
    189:        char buffer[PageSize];
    190:        myFile.read( buffer, PageSize);
    191:        Page * pg = new Page(buffer);
    192:        Insert(pg);
    193:        return pg;
    194:     }

Listing 13.5 Index.cpp

    1:      // **************************************************
    2:      // PROGRAM:  index
    3:      // FILE:     index.cpp
    4:      // PURPOSE:  provide fundamental btree functionality
    5:      // NOTES:
    6:      // AUTHOR:   Jesse Liberty (jl)
    7:      // REVISIONS: 11/1/94 1.0 jl  initial release
    8:      // **************************************************
    9:
    10:     #include "btree.hpp"
    11:     #include <ctype.h>
    12:
    13:     Index::Index(char* str):myPointer(1)
    14:     {
    15:        strncpy(myData,str,dataLen);
    16:        myData[dataLen]='\0';
    17:        for (int i = 0; i< strlen(myData); i++)
    18:           myData[i] = toupper(myData[i]);
    19:     }
    20:
    21:     Index::Index(char* str, int ptr):myPointer(ptr)
    22:     {
    23:
    24:        strncpy(myData,str,dataLen);
    25:        myData[dataLen]='\0';
    26:        for (int i = 0; i< strlen(myData); i++)
    27:           myData[i] = toupper(myData[i]);
    28:
    29:     }
    30:
    31:     Index::Index(Index& rhs):
    32:        myPointer(rhs.GetPointer())
    33:     {
    34:        strcpy(myData, rhs.GetData());
    35:        for (int i = 0; i< strlen(myData); i++)
    36:           myData[i] = toupper(myData[i]);
    37:
    38:     }
    39:
    40:     Index::Index():myPointer(0)
    41:     {
    42:        myData[0]='\0';
    43:     }
    44:
    45:     void Index::PrintKey()
    46:     {
    47:        cout << "  " << myData;
    48:     }
    49:
    50:     void Index::PrintPage()
    51:     {
    52:        cout << "\n" << myData << ": " ;
    53:        BTree::theDiskManager.GetPage(myPointer)->Print();
    54:     }
    55:
    56:     int Index::Insert(Index& ref, BOOL findOnly)
    57:     {
    58:        return BTree::theDiskManager.GetPage(myPointer)->Insert(ref,findOnly);
    59:     }
    60:
    61:     int Index::operator==(const Index& rhs)
    62:     {
    63:        return (strcmp(myData,rhs.GetData()) == 0); // case insensitive
    64:     }

Listing 13.6 Page.cpp

    1:       // **************************************************
    2:       // PROGRAM:  Page
    3:       // FILE:     Page.cpp
    4:       // PURPOSE:  provide fundamental btree functionality
    5:       // NOTES:
    6:       // AUTHOR:   Jesse Liberty (jl)
    7:       // REVISIONS: 11/1/94 1.0 jl  initial release
    8:       // **************************************************
    9:
    10:      #include "btree.hpp"
    11:      #include <assert.h>
    12:
    13:      // constructors
    14:
    15:      // default constructor
    16:      Page::Page()
    17:      {
    18:      }
    19:
    20:      // create a page from a buffer read in from disk
    21:      Page::Page(char *buffer):
    22:         myKeys((Index*)myVars.mk)
    23:      {
    24:         assert(sizeof(myBlock) == PageSize);
    25:         assert(sizeof(myVars) == PageSize);
    26:         memcpy(&myBlock,buffer,PageSize);
    27:         SetLocked(FALSE);
    28:         myTime = time(NULL);
    29:      }
    30:
    31:      // create a Page from the first index
    32:      Page::Page(Index& index, BOOL bLeaf):
    33:         myKeys((Index*)myVars.mk)
    34:      {
    35:
    36:         myVars.myCount=1;
    37:         myVars.IsLeaf = bLeaf;
    38:         SetLocked(FALSE);
    39:         // if this is a leaf, this is the first
    40:         // index on the first page, set its pointer
    41:         // based on creating a new wnj. otherwise
    42:         // you are here creating a new node, do not
    43:         // set the pointer, it is already set.
    44:         if (bLeaf)
    45:            index.SetPointer(BTree::theWNJFile.Append(index.GetPointer()));
    46:         myKeys[0]=index;
    47:         myVars.myPageNumber = gPage++;
    48:         myTime = time(NULL);
    49:      }
    50:
    51:      // create a page by splitting another page
    52:      Page::Page(Index *array, int offset, BOOL bLeaf,int count):
    53:         myKeys((Index*)myVars.mk)
    54:      {
    55:         myVars.IsLeaf = bLeaf;
    56:         myVars.myCount = 0;
    57:         for (int i=0, j = offset; j<Order && i < count; i++, j++)
    58:         {
    59:            myKeys[i]= array[j];
    60:            myVars.myCount++;
    61:         }
    62:         myVars.myPageNumber = gPage++;
    63:         SetLocked(FALSE);
    64:         myTime = time(NULL);
    65:      }
    66:
    67:      void Page::Nullify(int offset)
    68:      {
    69:         for (int i = offset; i<Order; i++)
    70:         {
    71:               myKeys[i].SetPointer(0);
    72:               myVars.myCount--;
    73:         }
    74:      }
    75:
    76:      // decide whether I'm a leaf or a node
    77:      // and pass this index to the right
    78:      // function. If findOnly is true, don't insert
    79:      // just return the page number (for now)
    80:      int Page::Insert(Index& rIndex, BOOL findOnly)
    81:      {
    82:         int result;
    83:         if (myVars.IsLeaf)
    84:         {
    85:            SetLocked(TRUE);
    86:            result = FindLeaf(rIndex,findOnly);
    87:            SetLocked(FALSE);
    88:            return result;
    89:         }
    90:         else
    91:         {
    92:            SetLocked(TRUE);
    93:            result = InsertNode(rIndex,findOnly);
    94:            SetLocked(FALSE);
    95:            return result;
    96:         }
    97:      }
    98:
    99:
    100:     // find the right page for this index
    101:     int Page::InsertNode(Index& rIndex, BOOL findOnly)
    102:     {
    103:        int retVal =0;
    104:        BOOL inserted = FALSE;
    105:        int i,j;
    106:
    107:        assert(myVars.myCount>0); // nodes have at least 1
    108:        assert(myKeys[0].GetPointer()); // must be valid
    109:
    110:        // does it go before my first entry?
    111:        if (rIndex < myKeys[0])
    112:        {
    113:           if (findOnly)
    114:              return 0L; // not found
    115:
    116:           myKeys[0].SetData(rIndex);
    117:           retVal=myKeys[0].Insert(rIndex);
    118:           inserted = TRUE;
    119:        }
    120:
    121:        // does it go after my last?
    122:        if (!inserted)
    123:           for (i = myVars.myCount-1; i>=0; i--)
    124:           {
    125:              assert(myKeys[i].GetPointer());
    126:              if (rIndex >= myKeys[i])
    127:              {
    128:                  retVal=myKeys[i].Insert(rIndex,findOnly);
    129:                  inserted = TRUE;
    130:                  break;
    131:              }
    132:         }
    133:
    134:        // find where it does go
    135:        if (!inserted)
    136:           for (j = 0; j<i && j+1 < myVars.myCount; j++)
    137:           {
    138:              assert(myKeys[j+1].GetPointer());
    139:              if (rIndex < myKeys[j+1])
    140:              {
    141:                 retVal=myKeys[j].Insert(rIndex,findOnly);
    142:                 inserted = TRUE;
    143:                 break;
    144:              }
    145:           }
    146:
    147:        assert(inserted);  // change to exception if not!
    148:
    149:        // if you had to split
    150:        if (retVal && !findOnly) // got back a pointer to a new page
    151:        {
    152:           Index * pIndex = new Index(GetPage(retVal)->GetFirstIndex());
    153:           pIndex->SetPointer(retVal);
    154:           retVal = InsertLeaf(*pIndex);
    155:        }
    156:        return retVal;
    157:     }
    158:
    159:     // called if current page is a leaf
    160:     int Page::FindLeaf(Index& rIndex, BOOL findOnly)
    161:     {
    162:        int result = 0;
    163:
    164:        // no duplicates!
    165:        for (int i=0; i < myVars.myCount; i++)
    166:              if (rIndex == myKeys[i])
    167:              {
    168:                 if (findOnly)               // return first WNJ
    169:                    //return BTree::theWNJFile.Find(myKeys[i].GetPointer());
    170:                    return myKeys[i].GetPointer();
    171:                 else
    172:                    return BTree::theWNJFile.Insert(
    173:                                rIndex.GetPointer(),
    174:                                   myKeys[i].GetPointer());
    175:              }
    176:
    177:        if (findOnly)         // not found
    178:           return result;
    179:
    180:        // this index item does not yet exist
    181:        // before you push it into the index
    182:        // push an entry into the wnj.idx
    183:        // and set the index to point to that entry
    184:        rIndex.SetPointer(BTree::theWNJFile.Append(rIndex.GetPointer())); //
                new!
    185:        return InsertLeaf(rIndex);
    186:     }
    187:
    188:     int Page::InsertLeaf(Index& rIndex)
    189:     {
    190:        int result = 0;
    191:        if (myVars.myCount < Order)
    192:           Push(rIndex);
    193:        else      // overflow the page
    194:        {
    195:              // make sibling
    196:              int NewPg =
    197:                 NewPage(myKeys,Order/2,myVars.IsLeaf,myVars.myCount);
    198:              Page* Sibling = GetPage(NewPg);
    199:              Nullify(Order/2);                     // nullify my right half
    200:
    201:              // does it fit in this side?
    202:              if (myVars.myCount>Order/2-1 && rIndex <= myKeys[Order/2-1])
    203:                 Push(rIndex);
    204:              else  // push it into the new sibling
    205:                 Sibling->Push(rIndex);
    206:
    207:              result = NewPg; // we split, pass it up
    208:           }
    209:         return result;
    210:     }
    211:
    212:     // put the new index into this page (in order)
    213:     void Page::Push(Index& rIndex,int offset,BOOL first)
    214:     {
    215:        BOOL inserted = FALSE;
    216:        assert(myVars.myCount < Order);
    217:        for (int i=offset; i<Order && i<myVars.myCount; i++)
    218:        {
    219:           assert(myKeys[i].GetPointer());
    220:           if (rIndex <= myKeys[i])
    221:           {
    222:                 Push(myKeys[i],offset+1,FALSE);
    223:                 myKeys[i]=rIndex;
    224:                 inserted = TRUE;
    225:                 break;
    226:           }
    227:        }
    228:        if (!inserted)
    229:           myKeys[myVars.myCount] = rIndex;
    230:
    231:        if (first)
    232:           myVars.myCount++;
    233:     }
    234:
    235:
    236:     void Page::Print()
    237:     {
    238:           if (!myVars.IsLeaf)
    239:           {
    240:              BTree::NodePageCtr++;
    241:              BTree::NodeIndexPerPage[myVars.myCount]++;
    242:              BTree::NodeIndexCtr+=myVars.myCount;
    243:           }
    244:           else
    245:           {
    246:              BTree::LeafPageCtr++;
    247:              BTree::LeafIndexPerPage[myVars.myCount]++;
    248:              BTree::LeafIndexCtr+=myVars.myCount;
    249:           }
    250:
    251:           for (int i = 0; i<Order && i < myVars.myCount; i++)
    252:           {
    253:              assert(myKeys[i].GetPointer());
    254:              if (myVars.IsLeaf)
    255:                 myKeys[i].PrintKey();
    256:              else
    257:                 myKeys[i].PrintPage();
    258:           }
    259:     }
    260:
    261:     // write out the entire page as a block
    262:     fstream& Page::Write(fstream& file)
    263:     {
    264:        char buffer[PageSize];
    265:        memcpy(buffer,&myBlock,PageSize);
    266:        file.flush();
    267:        file.clear();
    268:        file.seekp(myVars.myPageNumber*PageSize);
    269:        file.write(buffer,PageSize);
    270:        return file;
    271:     }

Listing 13.7 WNJFile.cpp

    1:        // **************************************************
    2:        // PROGRAM:  WNJ file
    3:        // FILE:     WNJfile.cpp
    4:        // PURPOSE:
    5:        // NOTES:
    6:        // AUTHOR:   Jesse Liberty (jl)
    7:        // REVISIONS: 11/5/94 1.0 jl  initial release
    8:        // **************************************************
    9:
    10:       #include "btree.hpp"
    11:       #include <assert.h>
    12:
    13:       // on construction, try to open the file if it exists
    14:       WNJFile::WNJFile():
    15:          myFile("ROBINWNJ.IDX",
    16:          ios::binary | ios::in | ios::out | ios::nocreate)
    17:       {
    18:          for (int i = 0; i<5; i++)
    19:             myints[i]=0L;
    20:       }
    21:
    22:       void WNJFile::Create()
    23:       {
    24:          char Header[4];
    25:          int MagicNumber=0; // a number we can check for
    26:          int zero = 0;
    27:
    28:          if (!myFile) // nocreate failed, first creation
    29:          {
    30:             // open the file, create it this time
    31:             myFile.clear();
    32:             myFile.open("ROBINWNJ.IDX",
    33:                ios::binary | ios::in | ios::out);
    34:
    35:             MagicNumber = 1234;
    36:             memcpy(Header,&MagicNumber,4);
    37:             memcpy(Header+2,&zero,2);
    38:             myFile.seekp(0);
    39:             myFile.write(Header,4);
    40:             myFile.flush();
    41:             return;
    42:          }
    43:
    44:          // we did open the file, it already existed
    45:          // get the numbers we need
    46:
    47:
    48:          myFile.seekg(0);
    49:          myFile.read(Header,4);
    50:          memcpy(&MagicNumber,Header,2);
    51:          memcpy(&myCount,Header+2,2);
    52:
    53:          // check the magic number. If it is wrong the file is
    54:          // corrupt or this isn't the index file
    55:          if (MagicNumber != 1234)
    56:          {
    57:                // change to an exception!!
    58:                cout << "WNJ Magic number failed!";
    59:                cout << "Magic number: " << MagicNumber;
    60:                cout << "\nmyCount: " << myCount << endl;
    61:          }
    62:          return;
    63:       }
    64:
    65:       // write out the numbers we'll need next time
    66:       void WNJFile::Close()
    67:       {
    68:          myFile.seekg(2);
    69:          myFile.write((char*)&myCount,2);
    70:          myFile.close();
    71:       }
    72:
    73:       int WNJFile::Append(int DataOffset)
    74:       {
    75:
    76:          int newPos = 4 + myCount++ * (5*2);
    77:          int offsets[5];
    78:          offsets[0] = DataOffset;
    79:          for (int i = 1; i<5; i++)
    80:             offsets[i]=0;
    81:          myFile.seekg(newPos);
    82:          myFile.write((char*)offsets,5*2);
    83:
    84:          return newPos;
    85:       }
    86:
    87:
    88:       int WNJFile::Insert(int DataOffset,int WNJOffset)
    89:       {
    90:          int ints[5];
    91:          myFile.seekg(WNJOffset);
    92:          myFile.read((char*)ints,5*2);
    93:
    94:          int offset=WNJOffset;
    95:
    96:          while (ints[4])
    97:          {
    98:             offset = ints[4];
    99:             myFile.clear();
    100:            myFile.flush();
    101:            myFile.seekg(ints[4]);
    102:            myFile.read((char*)ints,5*2);
    103:         }
    104:         if (ints[3]) // full!
    105:         {
    106:            ints[4] = Append(DataOffset);
    107:            myFile.clear();
    108:            myFile.flush();
    109:            myFile.seekg(offset);
    110:            myFile.write((char*)ints,5*2);
    111:         }
    112:         else
    113:         {
    114:            for (int i = 0; i<4; i++)
    115:            {
    116:               if (ints[i] == 0)
    117:               {
    118:                  ints[i] = DataOffset;
    119:                  myFile.clear();
    120:                  myFile.flush();
    121:                  myFile.seekg(offset);
    122:                  myFile.write((char*)ints,5*2);
    123:                  break;
    124:               }
    125:            }
    126:         }
    127:         return 0;
    128:      }
    129:
    130:
    131:      int* WNJFile::Find(int NextWNJ)
    132:      {
    133:         int ints[5];
    134:         int * results = new int[256];
    135:
    136:         int i = 0, j=0;
    137:
    138:         while (j<256)
    139:            results[j++] = 0;
    140:
    141:         j = 0;
    142:
    143:         myFile.seekg(NextWNJ);
    144:         myFile.read((char*)ints,5*2);
    145:
    146:         while (j < 256)
    147:         {
    148:            if (ints[i])
    149:            {
    150:               if (i == 4)
    151:               {
    152:                  myFile.seekg(ints[4]);
    153:                  myFile.read((char*)ints,5*2);
    154:                  i = 0;
    155:                  continue;
    156:               }
    157:            results[j++] = ints[i++];
    158:            }
    159:            else
    160:               break;
    161:         }
    162:         return results;
    163:      }

Listing 13.8 R1.cpp

    1:     // **************************************************
    2:     // PROGRAM:  R1
    3:     // FILE:     r1.cpp
    4:     // PURPOSE:  Fundamental functionality for ROBIN v1.
    5:     // NOTES:
    6:     // AUTHOR:   Jesse Liberty (jl)
    7:     // REVISIONS: 11/1/94 1.0 jl  ALPHA release
    8:     // **************************************************
    9:
    10:    #include "String.hpp"
    11:    #include "stdef.hpp"
    12:    #include "btree.hpp"
    13:    #include <stdlib.h>
    14:
    15:    // static definitions
    16:    IDXManager BTree::theDiskManager;
    17:    DataFile BTree::theDataFile;
    18:    WNJFile BTree::theWNJFile;
    19:
    20:    int WNJFile::myCount=0L;
    21:    int Page::gPage=1;
    22:    int BTree::NodeIndexCtr=0;
    23:    int BTree::LeafIndexCtr=0;
    24:    int BTree::NodePageCtr=0;
    25:    int BTree::LeafPageCtr=0;
    26:    int BTree::NodeIndexPerPage[Order+1];
    27:    int BTree::LeafIndexPerPage[Order+1];
    28:
    29:
    30:    // prototypes
    31:    void parseCommandLines(char *buffer,int argc,char **argv);
    32:    void ShowMenu(long*);
    33:    void DoFind(int, char**, BTree&);
    34:    void ParseFile(int, char**, BTree&);
    35:
    36:    // driver program
    37:    int main(int argc, char ** argv)
    38:    {
    39:       BTree myTree;
    40:
    41:       for (int i = 0; i < Order +2; i++)
    42:       {
    43:          BTree::NodeIndexPerPage[i] =0;
    44:          BTree::LeafIndexPerPage[i] = 0;
    45:       }
    46:
    47:       char buffer[PageSize+1];
    48:
    49:       if (argc == 1)
    50:       {
    51:          cout << "Please provide command line arguments";
    52:          return 1;
    53:       }
    54:
    55:       // check for flags, if none add text to data file
    56:       if (argv[1][0] == '-')
    57:       {
    58:
    59:          switch (argv[1][1])
    60:          {
    61:             case '?':
    62:                DoFind(argc,argv,myTree);
    63:                break;
    64:
    65:             case '!':
    66:                   myTree.PrintTree();
    67:                   break;
    68:
    69:             case 'F':
    70:             case 'f':
    71:                   ParseFile(argc,argv,myTree);
    72:                   break;
    73:          }
    74:       }
    75:       else
    76:       {
    77:          parseCommandLines(buffer,argc,argv);
    78:          myTree.Insert(buffer,argc,argv);
    79:          cout << "Inserted.\n";
    80:       }
    81:       return 0;
    82:    }
    83:
    84:    // concatenate remaining command line arguments
    85:    void parseCommandLines(char *buffer,int argc,char **argv)
    86:    {
    87:       size_t len = 0;
    88:       size_t argLen=0;
    89:       for (int i = 1; i< argc; i++)
    90:       {
    91:          argLen = strlen(argv[i]);
    92:          if (len + argLen +2 < PageSize)
    93:          {
    94:             strncpy(buffer+len,argv[i],argLen);
    95:             strncpy(buffer+len+argLen," ",1);
    96:             len += argLen+1;
    97:          }
    98:       }
    99:       buffer[len] = '\0';
    100:   }
    101:
    102:   // having found matches, show the menu of choices
    103:   // each entry is numbered and dated
    104:   void ShowMenu(int *list)
    105:   {
    106:      int j=0;
    107:      char buffer[PageSize+1];
    108:      time_t theTime;
    109:      int len;
    110:      char listBuff[256];
    111:      struct tm * ts;
    112:      int dispSize;
    113:
    114:      while (list[j] && j < 20)
    115:      {
    116:         BTree::theDataFile.GetRecord(list[j],buffer,len, theTime);
    117:   #if defined(_MSVC_16BIT) || defined(_MSVC_32BIT)
       :   {
       :      dispSize = __min(len,32); // THIS is a DOUBLE UNDERSCORE
       :   }
       :   #else
       :   {
       :      dispSize = min(len,32);
       :   }
       :   #endif
    118:         strncpy(listBuff,buffer,dispSize);
    119:         if (dispSize == 32)
    120:            {
    121:            listBuff[29] = '.';
    122:            listBuff[30] = '.';
    123:            listBuff[31] = '.';
    124:            }
    125:         listBuff[dispSize]='\0';
    126:         ts = localtime(&theTime);
    127:         cout << "[" << (j+1) << "] ";
    128:         cout << ts->tm_mon << "/";
    129:         cout << ts->tm_mday << "/";
    130:         cout << ts->tm_year << " ";
    131:         cout <<  listBuff << endl;
    132:         j++;
    133:      }
    134:   }
    135:
    136:   // handle -? command
    137:   // find matches, show the menu, request choice
    138:   // display record and redisplay menu
    139:   void DoFind(int argc, char ** argv, BTree& myTree)
    140:   {
    141:
    142:      // create an array of total set of WNJ
    143:      // offsets. This will be used to display
    144:      // choices and to find actual text
    145:      int  list[PageSize];
    146:
    147:      // initialize the array to all zeros
    148:      for (int i = 0; i<PageSize; i++)
    149:            list[i] = 0;
    150:
    151:      int k = 0;
    152:
    153:      // for each word in the command line
    154:      // search for the matching set of records
    155:      for (i = 2; i< argc; i++)
    156:      {
    157:         int offset = myTree.Find(argv[i]);
    158:         if (offset)
    159:         {
    160:            // get the array of offsets from WNJfile
    161:            int *found  =  BTree::theWNJFile.Find(offset);
    162:            int j = 0;
    163:
    164:            // add any you don't already have
    165:            for (;k < PageSize && found[j];j++,k++)
    166:            {
    167:               for (int l = 0; l < k; l++)
    168:               {
    169:                  if (list[l] == found[j])
    170:                     continue;
    171:               }
    172:               list[k] = found [j];
    173:            }
    174:               delete [] found;
    175:         }
    176:      }
    177:
    178:         cout << "\n";
    179:
    180:         if (!list[0])
    181:         {
    182:            cout << "Nothing found.\n";
    183:            return;
    184:         }
    185:
    186:         ShowMenu(list);
    187:
    188:         int choice;
    189:         char buffer[PageSize];
    190:         int len;
    191:         time_t theTime;
    192:
    193:         for (;;)
    194:         {
    195:            cout << "Choice (0 to stop): " ;
    196:            cin >> choice;
    197:            if (!choice)
    198:               break;
    199:            BTree::theDataFile.GetRecord(list[choice-1],buffer,len, theTime);
    200:            cout << "\n>> ";
    201:            cout << buffer;
    202:            cout << "\n\n";
    203:            ShowMenu(list);
    204:         }
    205:   }
    206:
    207:   // open a file and create a new note for each line
    208:   // index every word in the line
    209:   void ParseFile(int argc,char **argv, BTree& myTree)
    210:   {
    211:
    212:      char buffer[PageSize];
    213:      char theString[PageSize];
    214:
    215:      ifstream theFile(argv[2],ios::in );
    216:      if (!theFile)
    217:      {
    218:         cout << "Error opening input file!\n";
    219:         return;
    220:      }
    221:      int offset = 0;
    222:      for (;;)
    223:      {
    224:            theFile.read(theString,PageSize);
    225:            int len = theFile.gcount();
    226:            if (!len)
    227:               break;
    228:            theString[len]='\0';
    229:            char *p1, *p2, *p0;
    230:            p0 = p1 = p2 = theString;
    231:
    232:            while (p1[0] && (p1[0] == '\n' || p1[0] == '\r'))
    233:               p1++;
    234:
    235:            p2 = p1;
    236:
    237:            while (p2[0] && p2[0] != '\n' && p2[0] != '\r')
    238:               p2++;
    239:
    240:            int bufferLen = p2 - p1;
    241:            int totalLen = p2 - p0;
    242:
    243:            if (!bufferLen)
    244:               continue;
    245:
    246:            // lstrcpyn(buffer,p1,bufferLen);
    247:            strncpy(buffer,p1,bufferLen);
    248:            buffer[bufferLen]='\0';
    249:
    250:            // for (int i = 0; i< PageSize; i++)
    251:               cout << "\r";
    252:            cout << "Parsing " << buffer;
    253:            myTree.Insert(buffer);
    254:            offset += totalLen;
    255:            theFile.clear();
    256:            theFile.seekg(offset,ios::beg);
    257:         }
    258:      }
Output:
    d:\day13>-f day12.txt
    d:\day13>r1 -? file
    [1] 10/25/94 You have seen how to create t...
    [2] 10/25/94    [bl] How to create the data ...
    [3] 10/25/94 (c)Putting the notes into a d...
    [4] 10/25/94 The next iteration of the pro...
    [5] 10/25/94 and write the entire text of ...
    [6] 10/25/94 out to the file, and the offs...
    [7] 10/25/94 3:     // FILE:     stdef.hpp
    [8] 10/25/94 3:     // FILE:     btree.hpp
    [9] 10/25/94 2:     // PROGRAM:  Data file
    [10] 10/25/94 3:     // FILE:     datafile....
    [11] 10/25/94 13:    // on construction, tr...
    [12] 10/25/94 24:          // open the file...
    [13] 10/25/94 36:       // we did open the ...
    [14] 10/25/94 42:       // check the magic ...
    [15] 10/25/94 43:       // corrupt or this ...
    [16] 10/25/94 The file is opened much as th...
    [17] 10/25/94 the file should be opened in ...
    [18] 10/25/94 the file should be opened in ...
    [19] 10/25/94 opened in append mode all wri...
    [20] 10/25/94 that you need to know the off...
    Choice (0 to stop): 16

    >> The file is opened much as the other files have been,

    [1] 10/25/94 You have seen how to create t...
    [2] 10/25/94    [lb] How to create the data ...
    [3] 10/25/94 (c)Putting the notes into a d...
    [4] 10/25/94 The next iteration of the pro...
    [5] 10/25/94 and write the entire text of ...
    [6] 10/25/94 out to the file, and the offs...
    [7] 10/25/94 3:     // FILE:     stdef.hpp
    [8] 10/25/94 3:     // FILE:     btree.hpp
    [9] 10/25/94 2:     // PROGRAM:  Data file
    [10] 10/25/94 3:     // FILE:     datafile....
    [11] 10/25/94 13:    // on construction, tr...
    [12] 10/25/94 24:          // open the file...
    [13] 10/25/94 36:       // we did open the ...
    [14] 10/25/94 42:       // check the magic ...
    [15] 10/25/94 43:       // corrupt or this ...
    [16] 10/25/94 The file is opened much as th...
    [17] 10/25/94 the file should be opened in ...
    [18] 10/25/94 the file should be opened in ...
    [19] 10/25/94 opened in append mode all wri...
    [20] 10/25/94 that you need to know the off...
    Choice (0 to stop):-!

          // NOTE: list of words left out of this listing

    Stats:
    Node pages:   2
    Leaf pages:   18
    Node indexes: 20
    Leaf indexes: 343
    Pages with 2 nodes:  1
    Pages with 15 leaves: 6
    Pages with 16 leaves: 2
    Pages with 18 nodes:  1
    Pages with 18 leaves: 1
    Pages with 19 leaves: 3
    Pages with 20 leaves: 2
    Pages with 21 leaves: 1
    Pages with 25 leaves: 1
    Pages with 29 leaves: 1
    Pages with 31 leaves: 1
Analysis:

All the changes discussed at the beginning of this chapter are implemented in this set of code. The page-locking mechanism is implemented by the IsLocked member variable of Page, declared in line 239 of listing 13.1. The accessors are in lines 207 and 208 of the same listing.

The lock actually is set and cleared in the Insert() method of Page, shown in lines 80 through 97 of listing 13.6. The check for the lock is implemented in the IDXManager code, shown in lines 131 through 165 of listing 13.4. This ensures that no page currently in use by the Insert() method will be swapped out of memory.

The code to elicit user input is shown in listing 13.8. The command line is examined, and if the first character is not a dash, indicating a flag, the entire command line other than the program name itself is turned into a note and added to the database.

If the first character is a dash, the flag is examined for the legal values of ?, !, and F or f. If the command is ?, the DoFind() method is called, which searches for a match and then displays a menu of matching titles. Each title is prepended with a menu number and the date of the entry. The user can choose from the menu and see the entire text of that note.

If the flag is -F, the next word must be a legitimate file name, in which case the file is read line by line, and each line is made into a note. This is shown in the output when a text file from day 12 is read and parsed. The result of this is reflected in the statistics, which are shown by entering the -! flag.

Note: Note that the output displayed here does not include the complete list of all of the indexed words (there are 343).

The logic for reading the contents of a file and creating a note from each line is in the method ParseFile(), shown in lines 210 through 259 of listing 13.6. The only tricky part is this: Each line is read, and the length of the line is after the new line character is found (in lines 233 through 239). After a buffer is created with all the characters up to the first new line character, it is passed to the B-tree's Insert() method, and the next line is obtained by seeking to the offset of the new line character and reading from there.

The ShowMenu() method, listed in lines 104 through 134 of listing 13.8, is a rudimentary display of the first 20 matches returned from the database. One of the first things you want to do before releasing this as a commercial program is to enhance the user's capability to page through all the matching records.

Getting It Out the Door

You now have enough of a program to show it to co-workers and others who might be interested, in an attempt to "sanity-check" the idea behind the program. Yes, it is crude, and there are a thousand things you would like to change, fix, enhance, redesign, and so on.

It is my considered opinion, after 10 years in this industry, that a heck of a lot more programs never make it to market because they are kept too long in the lab than because they are released too early. Get it out there, and then listen to the feedback. Your users may not even notice many of the things you hate, and don't be surprised if they come back and complain long and loud about the few things you thought you had right!

The point of an alpha release is to get enough out there to get feedback not to prove that your program is nearly done. It isn't nearly done--it has only just started--but it is time to hear what real people think of it.

Looking At What You Didn't Fix

The list of things that remain to be fixed for ROBIN is long and intimidating. You haven't even started to performance tune it, and you will not do this for a few more days. You still need to put a nice front end on it; remember that users react much more to the look and feel than to the neat B-tree you implemented, and ROBIN's look and feel are, to say the least, rudimentary.

You need to add stop words, not to mention the capability to delete, if not edit, entries in the database. The single biggest weakness in the current version of ROBIN is that it is not nearly robust enough. You have put in almost no error-checking, and there is much to do to make this a professional, well-crafted, hearty program. All of that will come in the next few days, however; for now, you have created a program that demonstrates the fundamental functionality, and it is time to start using it.

Even though ROBIN is a learning tool and not a commercial enterprise, and therefore you will not, in fact, give it to anyone else, you still can use it yourself. Try keeping notes with it. Put your phone numbers in, play with it on a daily basis, and keep a list of all the ways you would like to enhance it, change it, and fix it. Note where it is slow, and most important, note where it breaks!

What's Ahead

On day 14, you will take a break from ROBIN and look at some advanced bit-twiddling techniques. After the break for the second week in review, I'll review memory-management techniques. Then you will return to ROBIN and add synonyms and stop words. The remaining days look at building robust, excellent code and enhancing performance. Finally, I'll walk you through an advanced debugging session and you will add the finishing touches to this program.

Summary

Today you learned what role alpha and beta testing play in the preparation of a commercial release. You also saw how to add robustness to your program with simple page-locking, and a little bit about the reality of shipping high-quality, but "good enough" commercial products.

Q&A

Q: What is QA testing?

A: Quality assurance (QA) testing is a rigorous test of your program by trained professionals who understand how to force a program through virtually every code path. A good QA engineer can automate many of these tests, and is on the lookout not only for obvious bugs but for whether the product complies with the specification.

Q: At what point should your program go to QA?

A: There is a trade-off in this decision, as in most aspects of programming. If you send a program too early to QA, you will only get it back with a stack of bugs and everyone involved will be frustrated. If you spend too much time testing your own code, however, you will miss the opportunity to move on and do further development. My rule of thumb is this: When you cannot find a bug for hours at a time of testing, give it to QA and move on to the next problem.

Q: What is an alpha test?

A: The alpha phase is when you distribute the program only to a very small group of very friendly users. The purpose of it is more to gain feedback on the functionality of the program rather than to weed out bugs. Alpha test programs are not feature complete; you are looking for high-level feedback early in the process.

Q: What is a beta test?

A: The beta phase is distributed to more users, who are perhaps somewhat less tolerant of bugs. The purpose is more to weed out the bugs than to provide feedback. If there is significant negative feedback during the beta phase of a program (more than cosmetic changes), perhaps it is time to go back to the drawing board.

Workshop

The Workshop provides quiz questions to help you solidify your understanding of the material covered, and exercises to provide you with experience in using what you have learned. Try to answer the quiz and exercise questions before checking the answers in Appendix A, and make sure that you understand the answers before continuing to the next chapter.

Quiz

  1. What is the difference between alpha and beta testing?

  2. What kinds of things should QA test for?

  3. Why shouldn't developers test their own code?

[Click here for Answers]

Exercises

  1. How would you modify the Page class so that the locking mechanism could be re-entrant? In other words, if a page is locked twice, it should need to be unlocked twice before it is considered to be not locked. You should change only the class Page, and you should not change its public interface at all.

  2. Recompile and rerun the program with your changes from exercise 1, to prove that it still works correctly.

  3. Modify the function ParseFile to count the number of lines that are in the file, and to output a message when the file is complete, telling the user the number of lines that were added to the database.

Go to: Table of Contents | Next Page