Text File | 1997-01-21 | 212KB | 6,088 lines

The CS-Libraries
A Database Kit

ComBits
P.O. Box 3303
2280 GH Rijswijk
The Netherlands

Copyright (c) 1994-1996 by ComBits, the Netherlands.
All Rights Reserved.

1 Contents

1 Contents
2 Preface
    2.1 Contacting ComBits
    2.2 Legal Matters
        2.2.1 Disclaimer
        2.2.2 Royalties and runtime limitations
        2.2.3 Trademarks
3 Introduction
4 Overview
5 Debugging
6 Runtime Libraries
7 Compiler Options
    7.1 Hardware
    7.2 Floating point
    7.3 Watcom
    7.4 Borland
    7.5 Visual C++
8 Standard Types & Definitions
9 Runtime Errors and Messages
    9.1 What to expect?
    9.2 Changing the way messages are displayed
    9.3 Message related functions
    9.4 Database class messages
10 Temporary files
11 Buffering
12 PAGE-Class
    12.1 Introduction
    12.2 Storing data in the header-page
13 Lock files
    13.1 Name of a lock file
    13.2 Controlling lock files
14 Read-Only databases
    14.1 Class member functions
15 TBASE-class
    15.1 Introduction
    15.2 Using TBASE
    15.3 Creating a Database
    15.4 Opening
    15.5 Closing
    15.6 Appending Records
    15.7 Deleting Records
    15.8 Page Utilization
    15.9 Locating Records
    15.10 Functions in alphabetical order
16 BTREE-class
    16.1 Introduction
    16.2 BTREEx Classes
    16.3 Multiple Keys
    16.4 Current Pointer
    16.5 Using Btrees
        16.5.1 Creating
        16.5.2 Opening
        16.5.3 Inserting
        16.5.4 Searching
        16.5.5 Current
        16.5.6 Deleting
        16.5.7 Closing
    16.6 Functions in alphabetical order
17 CSDBGEN
    17.1 Introduction
    17.2 Overview
    17.3 Features
    17.4 Limitations
    17.5 Definition file
    17.6 Tokenizing
        17.6.1 How does it work?
    17.7 When is a substring indexed?
    17.8 Compound indexes
        17.8.1 A simple example
        17.8.2 A more complex example
        17.8.3 Compound & Tokenizing Indexes
        17.8.4 Locating an Entry
    17.9 Export to dBASE
    17.10 Importing from dBASE
    17.11 Exporting/Importing to/from ASCII
    17.12 Starting a new database
    17.13 Opening a database
    17.14 Current Record
    17.15 Accessing fields
    17.16 DATE fields
    17.17 Changing the record layout
    17.18 Member functions in alphabetical order
    17.19 Warning
    17.20 A Large Example
18 VRAM
    18.1 Introduction
    18.2 Creating
    18.3 Opening & Closing
    18.4 VRAM Pointers
    18.5 Fragmentation
    18.6 Root
    18.7 Functions in Alphabetical order
19 VBASE
    19.1 Introduction
    19.2 Using VBASE
    19.3 Relocating records
    19.4 Limitations
    19.5 Functions in alphabetical order
20 VBAXE
    20.1 Introduction
    20.2 Working
    20.3 Files
    20.4 Prototypes
21 OLAY
    21.1 Introduction & Overview
    21.2 Buffering
    21.3 Performance
    21.4 Core Functions
        21.4.1 Creating
        21.4.2 Opening
        21.4.3 Reading and Writing
        21.4.4 Insert & Delete
        21.4.5 Filesize & bottom
        21.4.6 Closing
    21.5 Additional functions
    21.6 Import & Export
    21.7 Sequential functions
        21.7.1 Sequential functions in alphabetical order
        21.7.2 Miscellaneous functions
22 DLAY
    22.1 Performance
    22.2 Member functions
23 IBASE
    23.1 Introduction
    23.2 Using IBASE
    23.3 Using IBASE
        23.3.1 Creating
        23.3.2 Opening
        23.3.3 Appending Records
        23.3.4 Reading
        23.3.5 Writing
        23.3.6 Inserting
        23.3.7 Deleting
        23.3.8 Closing
        23.3.9 Miscellaneous functions
24 CSDIR
25 CSINFO
26 CSERROR
27 CS4DBASE
    27.1 Introduction
    27.2 Converting
    27.3 Example
    27.4 Importing large databases
28 CSTOOLS
    28.1 Introduction
29 CSKEYS
    29.1 CSKEYS.exe
30 DATE
    30.1 Example
    30.2 Initialising
    30.3 Converting Strings
    30.4 Obtaining date info
    30.5 Comparing dates
    30.6 Arithmetic
    30.7 Miscellaneous
31 HEAP
    31.1 Purpose
    31.2 When to use it?
    31.3 Using HEAP
    31.4 Functions in alphabetical order
32 Alloc-Logging
    32.1 Introduction
    32.2 Replacements
    32.3 Logging
    32.4 Memory Leaks
33 csSTR
2 Preface

Nowadays even the simplest of applications seems to need some sort of database. Despite this, C++, and consequently most compilers, have very little, if any, support for it. Roughly speaking, you have the choice between the very basic file IO as defined by the ANSI standard, or resorting to the other extreme and using one of the truly colossal DataBase Management Systems (DBMS). Neither option is very appealing.

'Basic file IO' is so basic, it will take a year to come up with a database application. And despite its simple design it's still not all that easy to use. In particular newcomers seem to struggle with it.

On the other hand, using a DBMS is often even less fun. From our own experience we recall having written a 400 KB application but having to ship a 15 MB package due to the large X database used. Apart from that, DBMSs have their roots firmly in the late sixties and early seventies. They were designed with a mainframe in mind, and are indeed equipped with all the flexibility and user-friendliness which has made the mainframe a dying species.

With this library we believe we are offering a third option. One that is easy to use, powerful and still produces small, fast stand-alone executables.

2.1 Contacting ComBits

You can reach us, preferably, by E-mail. The address is: CSLIB@ComBits.nl.
If you don't have E-mail access you can reach us by traditional mail:

    COMBITS
    P.O. Box 3303
    2280 GH Rijswijk
    The Netherlands

Or:

    FAX:   +31703960172
    Voice: +31703932300

Please remember, it is GMT +1:00 over here!

2.2 Legal Matters

2.2.1 Disclaimer

EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDER AND/OR OTHER PARTIES PROVIDE THIS SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH YOU.
SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL THE COPYRIGHT HOLDER BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE SOFTWARE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

2.2.2 Royalties and runtime limitations

The CS-Libraries can be used in a commercial software product without any royalties as long as the number of copies sold annually does not exceed 10,000.
No part of these libraries may be used in a product which is in any way a competitor of the CS-Libraries.

2.2.3 Trademarks

IBM and OS/2 are registered trademarks of International Business Machines Corporation.
MS-DOS and Windows are registered trademarks of Microsoft Corporation.
Borland C/C++ and dBASE are registered trademarks of Borland International Inc.
WATCOM is a trademark of WATCOM International Corp.

Part One

The next part of the documentation presents an introduction and discusses topics of general interest.

3 Introduction

Both for historical and practical reasons this library is presented as two distinct, independent sections. The database part is called CSDB, which is short for Combits Software DataBases; the other section, CSA, contains general-purpose and portability functions.

This library concentrates on supplying a C++ software developer with the means to quickly implement database and database-like functionality in his/her applications. This is done by supplying a set of easy-to-use C++ classes. There are classes for fixed length records, variable length records, indexes and so forth.
In addition, a program generator, CSDBGEN, is available, which uses the database classes in this library as building blocks for other, more complex and powerful, databases.

There are also classes in CSDB which operate in the grey area between 'file-extensions' and 'databases'. Because these classes tackle the limitations of standard file IO on a very low level, they are very flexible and can therefore be of great value when the approach of the traditional database proves too rigid.

This library does not pretend to compete with the multi-user Database Management Systems. Its purpose is to supply a rich set of tools to overcome the disk-storage problems occurring in single-user applications.

4 Overview

This chapter will give a quick overview of the contents of the package.

The CSDB section contains (mainly) the following classes:

TBASE:  A class for reading and writing fixed length records.
BTREE:  A btree+ to be used as an index.
VRAM:   A 'database' organized as a heap.
VBASE:  A database class for variable length records.
VBAXE:  As VBASE but for very large databases.
OLAY:   A file system which can insert and delete!
DLAY:   As OLAY but for large files.
IBASE:  Fixed length records but with the ability to insert or delete a record anywhere in the database.

The CSA section contains general-purpose functions and classes. Among others, the following classes are included:

csDATE: To store and manipulate dates.
csTIME: To represent the time.
csSTR:  Strings.
CSDIR:  To traverse a directory.
QALLOC: Quick and dirty way to do dynamic memory allocations.
HEAP:   To efficiently allocate many small blocks from the heap.

Command line utilities (DOS, NT, OS/2, Linux):

CSDIR:    Lists the CS-databases in a directory.
CSINFO:   Displays information about a CS-database.
CSDBGEN:  Important program generator.
CSERROR:  Utility to convert the error file to C++ source.
CSKEYS:   Displays the return value of the cskey() function.
CSMALLOC: Tests the allocation log for memory leaks.
CS4DBASE: Conversion utility to read dBASE files.

5 Debugging

Of each library two versions exist, one to be used during debugging and one intended for normal 'production'. The difference is that in the debug version a lot more tests are done, and so many more errors are reported. The idea is to use the debug version during development and relink with the production version when ready.

The debug version is identical to the production version, but with additional tests. The 'working code' is 100% identical. This means there are no subtle differences between the two versions. The production version, however, can be substantially faster, up to two times, depending on the circumstances.

To give an example: the TBASE class has a function to read a record from the database. For this function to operate properly, the class/database needs to be 'opened'. In the debug version this is tested with every call to the read function. In the production version it is never tested.

Note that in a decently written and tested application this error should not occur. 'Forgetting' to call the open function should be corrected in the debugging phase, and if the open function fails it should be trapped well before the call to the read function.

There are many more errors like this: errors that should not occur when the application is tested and (almost) ready, but which can easily emerge in the development stage.

6 Runtime Libraries

There are many libraries included in the package but only one has to be linked in at any given time. They indeed have to be linked in because they are 'static libraries', not 'dynamic libraries'.
The name of a library is built up as follows:

- The first two characters are always 'cs'.
- The third character indicates the compiler:
    B: The Borland C++ compiler
    W: The Watcom C++ compiler
    V: The Visual C++ (Microsoft) compiler
    G: The GNU C compiler
- The fourth character indicates the platform:
    D: Dos
    W: Windows 16 bit
    N: NT, Windows95, Win32s
    O: OS/2
    L: Linux
- The fifth character indicates the memory model:
    M: Medium model (Dos, 16-bit Windows)
    C: Compact model (Dos, 16-bit Windows)
    L: Large model (Dos, 16-bit Windows)
    H: Huge model (Dos, 16-bit Windows)
    F: Flat model (NT, OS/2, Linux)
- The sixth character indicates debug or production version:
    P: Production
    D: Debugging
- The extension is always: .lib

Example: csBWCD.lib
This is the 16-bit Windows library for Borland, using the Compact memory model and the Debug version.

Of course not all combinations exist. Not all platforms are supported by all compilers and vice versa.

- Linux is only supported by the GNU compiler.
- The GNU compiler is only supported under Linux.
- OS/2 is only supported by the Borland and Watcom compilers.
- The Medium, Compact, Large and Huge memory models are only available under DOS and 16-bit Windows.
- The Flat memory model is only available under NT, OS/2 and Linux.

7 Compiler Options

With the ever growing number of compiler options it becomes increasingly hard to keep everybody happy. As a rule we take the compiler defaults, even if we know they aren't the fastest. This chapter shows which options were used while compiling the CS-Libraries.

Some of the options are really 'options', but a few others are crucial. In particular the 'alignment' option can spell disaster. Some structures in the CSDB header files are alignment dependent, and if the appropriate alignment option isn't used, the resulting executable will simply crash while no compiler or linker error will ever be generated!
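To illustrate why the alignment option matters, the following sketch compares a 1-byte-packed structure with the same members under the compiler's default alignment. It is written in modern C++ and is only an illustration: the structures and helper functions are hypothetical and are not taken from the CSDB header files, and `#pragma pack` is used here as a widely supported compiler extension standing in for a command-line packing option.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical on-disk header; NOT a structure from the CSDB headers.
#pragma pack(push, 1)               // request 1-byte packing for this struct
struct PackedHeader {
    std::uint8_t  version;
    std::uint32_t record_count;
    std::uint16_t record_size;
};
#pragma pack(pop)

// The same members, laid out with the compiler's default alignment.
struct DefaultHeader {
    std::uint8_t  version;
    std::uint32_t record_count;
    std::uint16_t record_size;
};

// With 1-byte packing the members follow each other without padding.
static_assert(sizeof(PackedHeader) == 7, "1 + 4 + 2 bytes, no padding");
static_assert(offsetof(PackedHeader, record_count) == 1, "no gap after version");

// Copies a header out of raw bytes that were written with 1-byte packing.
inline PackedHeader load_header(const unsigned char *raw)
{
    assert(raw != nullptr);
    PackedHeader h;
    std::memcpy(&h, raw, sizeof h);
    return h;
}

// True when the default layout differs from the packed one. In that case,
// code built with default alignment would misread bytes written by code
// built with 1-byte packing, without any compiler or linker diagnostic.
inline bool default_layout_differs()
{
    return sizeof(DefaultHeader) != sizeof(PackedHeader) ||
           offsetof(DefaultHeader, record_count) !=
               offsetof(PackedHeader, record_count);
}
```

On most current compilers default_layout_differs() returns true, because padding is inserted after the first member. This is the kind of silent mismatch the warning above refers to.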
7.1 Hardware

Because the debug version isn't compiled for speed anyway, we took the opportunity to assume 'worst case' hardware. By doing so, it can be used to support outdated hardware like the ancient 8086 processor. Because we have this escape route, the production version can make reasonable assumptions about the available CPU. As a rule the DOS libraries are compiled for an 80286 CPU, 16-bit Windows libraries for an 80386 CPU and the 32-bit libraries for an 80486 CPU.

7.2 Floating point

There is almost no floating point arithmetic used in the CS-Libraries. However, a few functions use julian-date routines to e.g. calculate the number of days between two dates. If this is the case, this documentation will clearly mention it. Because floating point is used so rarely, we make very conservative assumptions about the available hardware support.

7.3 Watcom

Next the options used with the Watcom C++ compiler.

Debug version, 32 bits:
    -3r -zld -otexn -zp1 -fpi -fp2 -bt=nt
Production version, 32 bits:
    -5r -zld -otexn -zp1 -fpi -fp2 -s -bt=nt
Debug version, DOS:
    -0 -zld -otexn -zp1 -fpi -bt=dos
Production version, DOS:
    -2 -zld -otexn -zp1 -fpi -s -bt=dos
Debug version, 16 bits Windows:
    -3 -zw -zld -otexn -zp1 -fpi -bt=windows
Production version, 16 bits Windows:
    -3 -zw -zld -otexn -zp1 -fpi -s -bt=windows

Explanation:
    -0      8086 instructions.
    -2      80286 instructions.
    -3      80386 instructions.
    -3r     Register calling, assuming 80386 or better.
    -5r     Register calling, optimized for Pentium but runs on 386 or better.
    -zld    Suppress generation of library file names.
    -otexn  Use all optimizations, except 'no pointer aliasing'.
    -zp1    1 byte packing of structures. (default) CRUCIAL!
    -fpi    Support both emulation and 80x87, depending on how you link.
    -fp2    Use 80287 instructions.
    -s      Don't check stack overflow.

7.4 Borland

Next the options used with the Borland C++ compiler.
Debug version, 32 bits:
    -3 -N -G -O2 -k -f
Production version, 32 bits:
    -4 -G -N- -O2 -k- -ff
Debug version, DOS:
    -1- -N -f -O2 -k -Ff
Production version, DOS:
    -2 -N- -f -O2 -G -k- -Ff
Debug version, 16 bits Windows:
    -3 -N -f -O2 -G -k -Ff -WSE
Production version, 16 bits Windows:
    -3 -N- -f -ff -O2 -G -k- -Ff -WSE

Explanation:
    -1-     8086 instructions.
    -2      80286 instructions.
    -3      80386 instructions.
    -4      80486 instructions.
    -f      Emulate floating point.
    -ff     Fast floating point.
    -O2     Fastest code.
    -G      Optimize for speed.
    -Ff     Automatic far data.
    -k      Standard stack frame.
    -k-     No standard stack frame.
    -N      Check stack overflow.
    -N-     Don't check stack overflow.

Defaults: Signed characters.

7.5 Visual C++

Next the options used with the Visual C++ compiler.

Debug version, 32 bits:
    -Og -Oi -Ot -Oy -Ob1 -Gf -Gy -GB -DWIN32
Production version, 32 bits:
    -Gs -Og -Oi -Ot -Oy -Ob2 -Gf -Gy -GB -DWIN32
Debug version, DOS:
    -f- -G0 -Ot -Ob1 -On -Oc -Oe -Og -Ol -Oo -Gf -Gy -D_DOS -Ge
Production version, DOS:
    -f- -Gs -G3 -Ot -Ob2 -OV9 -On -Oc -Oe -Og -Ol -Oo -Gf -Gy -D_DOS
Debug version, 16 bits Windows:
    -f- -Oc -Oe -Og -Oi -Ol -On -Oo -Ot -Ob1 -G3 -Gf -Gy -Gw -Ge
Production version, 16 bits Windows:
    -f- -Oc -Oe -Og -Oi -Ol -On -Oo -Ot -Ob2 -OV9 -G3 -Gf -Gy -Gw -Gs

Explanation:
    -Oc     Common subexpression optimization.
    -Oe     Enable register allocation.
    -Ol     Loop optimization.
    -On     Disable 'unsafe' optimizations.
    -Oo     Post code optimization.
    -Og     Global optimizations.
    -Oi     Enable intrinsic functions.
    -Ot     Favor code speed.
    -Oy     Enable frame pointer omission.
    -Ob1    Expand only '__inline' functions.
    -Ob2    Expand inline any 'suitable' function.
    -OV9    Expand even 'large' inline functions.
    -Gf     String pooling.
    -Gy     Separate functions.
    -GB     'Blended' CPU.
    -Gs     No stack checking.
    -Ge     Enable stack checking.
    -f-     Select optimizing compiler.
    -G0     8086 instructions.
    -G2     80286 instructions.
    -G3     80386 instructions.

8 Standard Types & Definitions

This chapter describes some types and definitions used throughout this library.
They are platform independent and therefore portable.

Defined in cstools.h:

    FALSE   0
    TRUE    1

Types defined in cstypes.h:

    S8:     Signed 8 bit
    U8:     Unsigned 8 bit
    S16:    Signed 16 bit
    U16:    Unsigned 16 bit
    S32:    Signed 32 bit
    U32:    Unsigned 32 bit
    csCHAR: Signed character

These definitions are used extensively in the function prototypes. However, on many occasions (particularly function returns) the range of the variable is not (all that) important. In these cases ints are used because that's normally the fastest.

The accompanying max & min values are also defined:

    S8_MIN:   -128
    S8_MAX:   127
    U8_MAX:   255
    S16_MIN:  -32768
    S16_MAX:  32767
    U16_MAX:  65535
    S32_MIN:  -2147483648
    S32_MAX:  2147483647
    U32_MAX:  4294967295

9 Runtime Errors and Messages

As a rule, errors are signaled by a return value of FALSE, rather than by a visible message. In this way control over runtime messages is placed in the hands of the developer using our libraries.

On the other hand, we try to strike a balance between ease-of-use and control. Having to check the return value of each and every function to trap highly unlikely errors rapidly becomes very cumbersome.

9.1 What to expect?

The main rule is to expect a return value of TRUE on success and a value of FALSE in case of failure. However, sometimes a message is displayed and there are even cases where the exit() function is called.

We distinguish a few categories of errors:

- Extremely Improbable Error
  Normally meaning hardware failure. Because it's so unlikely, you will be tempted to 'forget' testing the return value because it will 'never go wrong'. A visible message is generated.
- Misuse of Functions
  Calling functions in the wrong order, calling functions with wrong parameters, etc. Because this type of error should only occur in the earliest development stage, during debugging, we think it's justified to display a message or even exit the program. It is typical for the debug versions of the libraries to test for this type of error.
- Irrecoverable Error
  The kind of error that occurs many function calls deep. In particular the dreaded 'out of memory' error (often) falls in this category. Only event-handling can deal with this type of error but we don't use it because of its considerable drawbacks. A message is displayed and the exit() function is called.

9.2 Changing the way messages are displayed

All the message functions eventually display their messages through a call to csmess_disp(char *). Under DOS, this function writes the message to the screen by using the standard puts() function. With Windows (16 & 32 bits) a standard message box is called. Fortunately, this function can easily be altered!

Before being displayed by the csmess_disp() function, every message is converted into a single string. This makes changing the message function very easy. Only a single function, which accepts a character pointer, needs to be supplied. The next function is intended to do that:

void csmess_set_fun( void (* fun)(char *));

    // Example (Dos):
    #include <stdio.h>
    #include "csmess.h"

    void display(char *s)
    {
        // This function is going to be used
        // to display messages.
        printf("%s",s);
    }

    int main(void)
    {
        csmess_set_fun(display);
        // From now on, all the messages are
        // displayed by calling the 'display()' function.
        return 0;
    }

To restore the default, the next function can be used:

void csmess_reset_fun(void);

    // Example:
    #include "csmess.h"

    int main(void)
    {
        csmess_reset_fun();   // Restores the default
                              // display function.
        return 0;
    }

Function prototypes are in csmess.h.

9.3 Message related functions

The entire mechanism of displaying messages can be switched on and off with a few (global) functions. To avoid confusion: these are normal C-type functions, not C++ class member functions. Function prototypes are in csmess.h.

void csmess_off(void);

With this function, messages can be suppressed. Whether you are using the standard message function or have replaced it with your own, after a call to this function no message will be displayed.
void csmess_on(void);

To be used in conjunction with the csmess_off() function. After a call to 'csmess_on()' messages will be displayed again.

int csmess_onoff(void);

Returns TRUE if message displaying is switched on, FALSE otherwise.

void csmess_onoff(int sw);

If called with sw unequal to zero, messages will be displayed. Otherwise not.

    // Example (DOS):
    #include "csmess.h"

    void work(void)
    {
        csmess_off();    // Switch messages off.
        // Execute some critical code.
        csmess_on();     // Switch messages back on.
    }

    int main(void)
    {
        work();
        return 0;
    }

9.4 Database class messages

So far we have been discussing the global message functions. On top of that, all database classes have a few message functions too.

If a function fails, it returns a value of FALSE, but it also sets the value of a local error variable. Each class instance has its own error variable and a member function to set it, and one to read it.

U16 error_nr(void);

This function returns the value of the error variable. After the function call, the value is reset to zero.
All the database classes from the CSDB-Library have this member function.

    // Example (DOS):
    #include "iostream.h"
    #include "cstbase.h"

    int main(void)
    {
        TBASE tb;   // A database for fixed length records.
                    // Documented in chapter 15.
        if(!tb.open("example.dbf"))
        {
            cout<<"Error nr: "<<tb.error_nr()<<endl;
            return 1;
        }
        return 0;
    }

void error_nr(U16 ErrNr);

Sets the value of the error variable.
All the database classes from the CSDB-Library have this member function.
(Mainly intended for internal use.)

U16 display_error(void);

Obtains the latest error by calling 'error_nr()'. This error number is then converted into a string by reading the 'error.err' message file. Finally, it's displayed by using the global message functions described in the first part of this chapter.
If the obtained error number is zero, no message is displayed.
The return value is the error number. On return, the value of the error variable will be reset to zero.
All the database classes from the CSDB-Library have this member function.

    // Example (DOS):
    #include "iostream.h"
    #include "cstbase.h"

    int main(void)
    {
        TBASE tb;   // A database for fixed length records.
        if(!tb.open("example.dbf"))
        {
            tb.display_error();
            return 1;
        }
        return 0;
    }

10 Temporary files

Temporary files are created through the use of the cstmpname() function, discussed in the CSTOOLS chapter (chapter 28). This means the environment variables TEMP and TMP are checked to determine which subdirectory has to be used. TMP is checked first and if it doesn't exist, TEMP is checked.

Temporary files can be as large as the databases they belong to. So, make sure the environment variables don't point to some small ram-disk or an insufficiently large partition.

Part Two

Part Two of the documentation starts with a few chapters that apply to all database classes. Further on, it will explain how this library can be used to build traditional relational databases. To do so, it uses a TBASE class to store records and a BTREE class for indexes. A program generator, CSDBGEN, is discussed which 'automates' the process of building more complex databases out of TBASE and BTREE.

These two classes can also be used separately. In particular the BTREE class is very useful. It is easily expandable and can be tailored to a specific purpose by supplying one single function. Simple databases with only one index can be built with just the BTREE class.

11 Buffering

Except for RBASE, all database classes are built on top of a very solid buffer system. This buffer system lets you control how much memory can be used for buffering. All the database classes offer the opportunity to specify this amount with the call to the open() function. This implies that the buffer size can be specified for every class instance individually!

Buffers are allocated on the fly, up to the maximum specified. The advantage is that the maximum may never be reached, saving valuable memory for other purposes.
The buffering system itself will stop allocating memory when the heap is exhausted, but if dynamic memory allocation is used somewhere further on, your program may still terminate with a message of the type 'out of memory'.

It should be noted that this is only a problem with the MS-DOS operating system. Any other (real) operating system uses virtual memory, which makes sure that your program will work with any reasonable assumption about the available amount of memory.

For performance reasons, however, it is better not to rely on virtual memory. If you allocate more buffers than the OS is capable of holding in physical memory, performance will drop dramatically. Buffers which are paged out are far worse than no buffers at all!

12 PAGE-Class

12.1 Introduction

The PAGE-class constitutes a kind of 'foundation' for most of the other classes in the CSDB library. It is derived from a class 'BUFF' which takes care of the required buffering. (Described in chapter 11.)

The idea is to do disk IO in chunks of 2 KB. This is close to the optimal size for the average harddisk. These blocks are kept aligned with the sectors of the harddisk, which improves speed considerably.

A harddisk always reads an entire sector, even if you only need, let's say, 10 bytes. Things become even worse if the 10 bytes you are requesting just happen to cross a sector boundary. In that case the harddisk will read 2 entire sectors. Assuming a sector is 1024 bytes, this means that 2*1024=2048 bytes are read just to obtain your 10 bytes!

To avoid this kind of inefficiency, the PAGE-class does its disk IO in pages of 2048 bytes while making sure every page is aligned with the harddisk sectors. This also means that the indispensable file-header has to be at least one sector. To avoid complications, a file-header is used which has the same size as a page, 2048 bytes by default.
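The sector arithmetic above can be checked with a small sketch. The function names are hypothetical and are not part of the PAGE-class; the 1024-byte sector size is the same assumption the text makes.

```cpp
#include <cassert>

// Number of whole sectors the disk must read to deliver 'length' bytes
// starting at byte position 'offset' (hypothetical helper, not CSDB API).
inline unsigned long sectors_touched(unsigned long offset,
                                     unsigned long length,
                                     unsigned long sector_size)
{
    assert(sector_size > 0);
    if (length == 0)
        return 0;
    unsigned long first = offset / sector_size;                // first sector hit
    unsigned long last  = (offset + length - 1) / sector_size; // last sector hit
    return last - first + 1;
}

// Number of bytes actually transferred for that request.
inline unsigned long bytes_transferred(unsigned long offset,
                                       unsigned long length,
                                       unsigned long sector_size)
{
    return sectors_touched(offset, length, sector_size) * sector_size;
}
```

With 1024-byte sectors, a 10-byte read at offset 1020 crosses a boundary and costs 2048 bytes, while a 2048-byte page that starts on a sector boundary touches exactly two sectors. The same page starting at offset 2000 would touch three.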
It should be noted that this entire scheme is undone by using a disk compression utility, like DoubleSpace, Stacker and the like. Therefore, if you are concerned about performance, it is better not to use these utilities. Moreover, a disk compressor will slow down your application considerably when several files are used heavily 'at the same time'. This situation will almost inevitably arise with any serious application which uses more than one database, or even a single database with many indexes.


12.2 Storing data in the header-page

As explained above, the header page is quite large. This page is used to store all kinds of important variables. However, there is still much space left. Of the 2048 bytes, only about 170 are used.

An application using databases is likely to have some variables of its own which need to be saved between close/open sequences. The remaining space in the header page is a convenient place to store such data. This can save you an additional configuration file and all the error trapping involved.

To aid in this, three functions are made public:

    int data_2_header(void * ptr,U16 length);
    int header_2_data(void * ptr,U16 length);
    U16 max_data_in_header(void);


U16 max_data_in_header(void);

    This function returns the maximum number of bytes which will still fit in the header page. This is simply the size of the header-page minus what is used to store the variables of the class. The class needs to be open.

int data_2_header(void * buffer,U16 length);

    Copies data from buffer 'buffer' to the empty space in the header page. The variable 'length' indicates the number of bytes to be copied. This figure is not stored anywhere. It is the programmer's responsibility to retrieve the right number of bytes later on. The class needs to be open. TRUE is returned on success, FALSE otherwise.

int header_2_data(void * buffer,U16 length);

    The counterpart of the previous function. This function copies 'length' bytes from the header page to 'buffer'.
    The class needs to be open. TRUE is returned on success, FALSE otherwise.


13 Lock files

To prevent multiple applications from accessing the same database, lock files are created whenever a database is opened for writing. The lock files are removed when the database is closed again. If a database is opened in read-only mode, no lock file will be generated.

When an attempt is made to open a database while the lock file exists, an error is generated.


13.1 Name of a lock file

The name of the lock file is derived from the database name by replacing the first two characters of the extension by exclamation marks. Under Linux, everything after the last dot is considered the 'extension'. If no extension exists, an extension consisting of two exclamation marks will be added to the database name to form the name of the lock file.

Example:

    Database name        Lock file name
    -------------        --------------
    database.dbf         database.!!f
    db.strings.index     db.strings.!!dex
    data                 data.!!

csCHAR *lock_file_name(csCHAR *DBName,csCHAR *LockName);

    This global C function can be used to obtain the name of a lock file. 'DBName' is the name of the database, 'LockName' must be a pointer to a buffer large enough to hold the name of the lock file. The function fills the 'LockName' buffer with the name of the lock file corresponding with database 'DBName'. The return value is the 'LockName' pointer.


13.2 Controlling lock files

Next is a set of member functions, common to every database class, which can be used to control lock files.

void use_lock_file(int TrueOrFalse);

    When called with FALSE, the use of lock files is switched off. When called with TRUE, the use of lock files is switched on. The class default is to use locking.

int use_lock_file(void);

    Returns TRUE if the use of lock files is switched on, FALSE otherwise.

int lock_file_exist(csCHAR *DBNAME);

    This function accepts the name of a database file and returns TRUE if the corresponding lock file exists. Otherwise it returns FALSE.
    At least one class in this library uses two files to store its data. Therefore checking for lock files is a process which depends on the type of database used. This is why this is a class member function rather than a global C function.

int remove_lock_file(csCHAR *DBNAME);

    This function takes the name of a database, calculates the corresponding lock file name and tries to remove that file. If the lock file doesn't exist or couldn't be removed, the function returns FALSE. Otherwise it returns TRUE.

    // Example:
    // Error checking omitted for conciseness.
    #include "iostream.h"
    #include "csvbaxe.h"

    void main(void)
    {
       VBAXE vb;

       if(vb.lock_file_exist("example.dbf"))
       {
          cout<<"The lock file exists"<<endl;
          if(vb.remove_lock_file("example.dbf"))
             cout<<"The lock file is removed"<<endl;
       }
       vb.open("example.dbf");
       //
       // Use the database.
       //
       vb.close();
    }


14 Read-Only databases

It is possible to open a database in read-only mode. If this is done, no changes can be made and no time-stamps will be updated. On the level of the operating system the database file is also opened 'read-only', which makes it possible to use static databases on read-only devices like CD-ROMs.

It is valid to open a database in read-only mode while it is already in use by another program. That is, the existence of lock files will not result in a runtime error.

Trying to add, delete or write records while in read-only mode is considered an error and therefore should be avoided. Still, with the current implementation, writing a record is simply ignored while adding or deleting results in a runtime error. This behaviour may change with future releases, so please do not rely on it!


14.1 Class member functions

Next is a set of class member functions, common to all database classes, which control the application of the 'read-only' mode.

int read_only(int TrueOrFalse);

    When called with TRUE, the database is opened in read-only mode. Otherwise the database is opened for writing.
    The function has to be called before the database is opened! It is an error to call read_only() while the database is already open. The function returns TRUE on success, FALSE otherwise.

int read_only(void);

    Same as read_only(TRUE);

int is_read_only(void);

    The function returns TRUE if the database is opened in read-only mode, FALSE otherwise.

    // Example:
    // Error checking omitted for conciseness.
    #include "iostream.h"
    #include "csdlay.h"

    void main(void)
    {
       DLAY vb;                      // The miracle class.
       char some_data[100];

       vb.define("example.dbf");     // Create database.
       vb.read_only();               // Before open().
       vb.open("example.dbf");       // Open database

       // Test for read-only mode and append some data.
       if(!vb.is_read_only()) vb.append(some_data,7);
       else cout<<"Read only, cannot append data."<<endl;

       vb.close();                   // Close database.
    }


15 TBASE-class

15.1 Introduction

The TBASE class is intended as a simple, fast way to access records on disk. It assumes a fixed record size and does its IO on a record-by-record basis (as opposed to field-by-field). This means:

  1) TBASE is unaware of something like 'fields'. The idea is to use a C structure as record and to do all the accessing of fields with the standard C operators. This approach is undoubtedly faster than supporting access on a field-by-field basis as done by dBASE.

  2) No indexes. TBASE just reads or writes records, nothing else.

NOTE: From this it is clear that with the TBASE class alone no decent database application can be built. Therefore, a separate BTREE class is supplied which can be used as an index.


15.2 Using TBASE

The next small example gives an impression of how to use the class. As can be seen from this example, there is no 'record pointer' as in dBASE. The functions to read and write a record simply take an additional parameter indicating the record number.

    // A very simple example.
    // Error checking omitted for conciseness.
    #include "CSTBASE.H"

    void main(void)
    {
       typedef struct
       {
          char name[20];     // The field 'name'
          char street[40];   // The field 'street'
          long salary;       // The field 'salary'
          // All the other fields you may require.
       }record;              // The record layout is now defined.

       TBASE db;
       record rec;

       db.open("demo.dbf",110);   // Assuming the file is already created.
                                  // Use 110 Kb for buffering.
       db.read_rec(9,&rec);       // Read record number 9 into
                                  // variable 'rec'.
       rec.salary=0;              // Change salary.
       db.write_rec(9,&rec);      // Write the record back to position 9.
                                  // (Any other existing position is also possible.)
       db.close();                // Is also done automatically
                                  // by the class destructor.
    }


15.3 Creating a Database

Before a database can be used it has to be 'created'. This is done through a call to the 'define()' function. Of course this is needed only once. Because TBASE doesn't use fields, the function takes only two parameters: the filename of the database, and the record size.

Syntax: int define(char * name,U16 reclen);

    // This example creates a database 'demo.dbf'.
    #include "iostream.h"
    #include "CSTBASE.H"

    void main(void)
    {
       typedef struct
       {
          char name[20];
          char street[40];
          char city[25];
       } record;

       TBASE db;

       if(!db.define("demo.dbf",sizeof(record)))
       {
          // Return value FALSE: display the error.
          db.display_error();
       }
    }


15.4 Opening

Before a record can be read, the database has to be opened through a call to the open() function. This open() function also takes a parameter indicating the amount of memory to be used for buffering. The memory for the buffers is NOT allocated at the moment of the call to open(), but during the use of the database. Memory is allocated when needed, up to this maximum.

As explained in chapter 11 about buffering, using up too much memory for buffering is dangerous on an operating system without virtual memory like MS-DOS.

Syntax: int open(char *name, S16 kb=32);

    // Example:
    // Opening the existing database 'demo.dbf' with 40 Kb for buffers.
    #include "CSTBASE.H"

    void main(void)
    {
       TBASE db;

       if(!db.open("demo.dbf",40)) db.display_error();
    }


15.5 Closing

Closing the database involves writing all the buffered data back to disk and freeing all allocated memory. The close() function is intended for this purpose. If the close() function is not explicitly called in the application, the class destructor will call it. Because there can be a long interval between the last time the database is used and the moment the destructor is reached, it still makes sense to call the close() function 'by hand'.

Syntax: int close(void);

    // Example:
    // Error checking omitted for conciseness.
    #include "CSTBASE.H"

    void main(void)
    {
       TBASE db;

       db.open("demo.dbf",40);
       db.close();
    }


15.6 Appending Records

A special function is needed to add a record to a database: the append_rec() function.

Note: The write_rec() function can only overwrite an already existing record.

Syntax: S32 append_rec(void *data);

    'data' is a pointer to a record.

Syntax: S32 append_rec(void);

    This function can be used to add a record to the database without immediately filling it with data. For the time being, this record will contain 'garbage'.

    // Example:
    // Error checking omitted for conciseness.
    #include "string.h"      // For strcpy().
    #include "CSTBASE.H"

    void main(void)
    {
       typedef struct
       {
          char name[20];
          char street[40];
          char city[25];
       } record;

       TBASE db;
       record rec;

       db.define("demo.dbf",sizeof(record));   //Create new database
       db.open("demo.dbf",40);

       strcpy(rec.name,"J.Q. Querlis ");
       strcpy(rec.street,"Avenue 120");
       strcpy(rec.city,"Bombay");

       db.append_rec(&rec);
       // The database now contains 1 record.
       db.close();
    }


15.7 Deleting Records

Deleting a record cannot be accomplished instantaneously. A 'delete bit' is used to distinguish deleted records. Deleting a record by setting the 'delete bit' doesn't alter much. E.g. record 9 remains record 9 if you delete record 8. The function 'is_delet()' has to be called to detect whether or not a record is 'deleted'.
The 'pack()' function can be used to physically remove all the deleted records from the file.

int is_delet(long r);

    Returns 0 if the record 'r' is not deleted, and 1 otherwise.

void delet(long r);

    Sets the delete bit for record 'r'.

void undelet(long r);

    Resets the delete bit for record 'r'.

void pack(void);

    Removes all the records with the delete bit set from the database. This is done without the use of temporary files.


15.8 Page Utilization

Normally the TBASE class does its IO in pages of 2 Kb. It fits a whole number of records on these pages. This approach can lead to a large chunk of unused space on the pages, particularly if you are using large records. On average the slack will be half a record.

Solution: This waste of disk space can be avoided by using pages which have the same size as the record. This means that the pages will no longer be aligned with the sectors of your harddisk! The function to accomplish this is smallest_page().

The define() function of TBASE considers slacks up to 30% acceptable. If it doesn't manage to find a page size which produces a slack of less than 30%, it calls smallest_page().

void smallest_page(void);

    The function has to be called before the define() function. Because this changes the entire layout of the database file, this cannot be altered once the database is created.

    // Example:
    // Error checking omitted for conciseness.
    #include "CSTBASE.H"

    void main(void)
    {
       typedef struct
       {
          char name[20];
          char street[40];
          char city[25];
       } record;

       TBASE db;

       db.smallest_page();                     //Before the define!
       db.define("demo.dbf",sizeof(record));
    }


15.9 Locating Records

In the examples given so far, a record is first read into a local variable and, after being altered, written back to disk. There is room for improvement here. After the record is copied into the local variable it is in memory twice, once in the variable and again in the database buffers.
If you know what you are doing, some performance increase can be gained from obtaining a pointer directly into the buffer system. The 'locate_rec()' function does just that.

When you are working in the database buffers through a pointer, there is no way the buffer system can tell if data is altered. Therefore, it's the programmer's responsibility to indicate whether or not modifications are going to take place.

char *locate_rec(long rec);

    This function returns a pointer to record 'rec'. It assumes that no alterations are going to take place. This means that the buffer is not written back to disk!

char *locate_rec_d(long rec);

    The additional '_d' stands for dirty buffer. The function returns a pointer to record 'rec'. It is assumed alterations ARE going to take place. The buffer is marked 'dirty' and is therefore written back to disk when memory is needed to store another page.

IMPORTANT!!
The locate functions return a pointer directly into the buffer system. Nothing less and nothing more. Any member function of the same class instance which MAY cause disk IO can therefore alter the contents of the buffers, making your pointer 'point' to an entirely different record! When using these functions, it is highly advisable to do all the reading or writing to the record before calling any other TBASE member function.

    // Example of the locate_ function.
    // Error checking omitted for conciseness.
    #include "CSTBASE.H"

    void main(void)
    {
       typedef struct
       {
          char name[20];
          int  age;
       } record;

       TBASE db;
       record *rec;

       db.define("demo.dbf",sizeof(record));
       db.open("demo.dbf");                 // Use default 32 Kb for buffers.

       for(int i=1;i<=12;i++) db.append_rec();   // Append 12 records.

       rec=(record *)db.locate_rec_d(7);    // Obtain pointer to record 7.
                                            // '_d' because we will 'write'.
       rec->age=34;                         // That's all it takes to make an
                                            // alteration. No need for a
                                            // 'write' function.
       db.close();                          // Not strictly necessary.
    }


15.10 Functions in alphabetical order.

Function prototypes are in 'cstbase.h'.
S32 append_rec(void *data);

    Append a record to the database. The newly created record is filled with the data from buffer 'data'. The function returns the number of the new record (which is equal to the number of records in the database).

S32 append_rec(void);

    Same as the previous function, only this time the new record is filled with binary zeros.

int close(void);

    Closes the database. If the database is already closed, nothing happens. TRUE is returned on success, FALSE otherwise.

int define(char *name,U16 reclen);

    Creates a TBASE file named 'name'. The parameter 'reclen' indicates the size of the records. TRUE is returned on success, FALSE otherwise.

void delet(S32 rec);

    Sets the delete bit of record 'rec'.

int empty(void);

    Removes all the records from the database. Afterwards there will be zero records left, but the database will still be open. TRUE is returned on success, FALSE otherwise.

int is_delet(S32 rec);

    Returns the value of the delete bit of record 'rec'. TRUE means the delete bit is set, FALSE means the bit is not set.

U16 lengthrec(void);

    Returns the length of a record. Because TBASE works with fixed length records, this value is the same for all records. It's the same value used in the call to define().

char *locate_rec(S32 rec);
char *locate_rec_d(S32 rec);

    Functions to return a pointer to record 'rec' directly into the buffer system. Please read paragraph 15.9 about this topic before using these functions.

S32 numrec(void);

    Returns the number of records currently in the database. Whether or not a record is marked for deletion makes no difference.

int open(char *name,S16 kb=32);

    Opens the existing database 'name' while using 'kb' Kb of RAM for buffering. TRUE is returned on success, FALSE otherwise.

int open(void);

    Returns TRUE if the database is open, FALSE otherwise.

int pack(void);

    Removes all the records with the delete bit set from the database. This is done without the use of temporary files. TRUE is returned on success, FALSE otherwise.
void read_rec(S32 rec, void *buff);

    Copies the contents of record 'rec' into buffer 'buff'.

int save(void);

    Writes all buffered data back to disk. The database header block is also updated. The database remains open. TRUE is returned on success, FALSE otherwise.

void set_delet(S32 rec,int ToF);

    Changes the value of the delete bit of record 'rec'. If 'ToF' is TRUE, the delete bit is set, otherwise it is reset.

void smallest_page(void);

    Sets the page size to the smallest value possible. That is, a page will have the same size as a record. This means pages will no longer be aligned with the harddisk sectors. The function has to be called before the define() function. It changes the entire layout of the database file so it cannot be changed once the file is created.

void undelet(S32 rec);

    Resets the delete bit of record 'rec'.

void write_rec(S32 rec, void *buff);

    Overwrites the contents of record 'rec' with the data from buffer 'buff'. Record 'rec' needs to exist; the function cannot be used to append a record.


16 BTREE-class

16.1 Introduction

A btree is a system, developed several decades ago, to store data in some predetermined order. The btree is capable of maintaining this order even under insertions and deletions. Btrees are capable of locating, inserting or deleting a specific record with only a few disk-IOs, even when very large. Of course there is a price to be paid for this: the disk space occupied by the btree is not fully utilized. In the worst case only 50% of the space is used, but on average 75% is used.

Btrees are convenient as indexes on a database. Of each record in the main database, the key field and the corresponding record number are stored together in the btree. Whenever a record with a certain key value has to be located, the btree is capable of quickly (without much disk IO) finding the required entry.
Once this is done it is also clear which record from the main database has to be read, because this record number is stored together with the key value in the btree.

There are several 'flavours' of btrees. The one implemented in this library is known as a 'btree+'.


16.2 BTREEx Classes

Unfortunately, it is a bit cumbersome to use one BTREE class for every type of data. Therefore, there are several minor variations on the main BTREE class to account for the different variable types supported by C. Each type has its own class.

Classes:

    BTREEb   For binary data.
    BTREEi   For integers.
    BTREEl   For longs.
    BTREEc   For characters.
    BTREEf   For floats.
    BTREEd   For doubles.
    BTREEa   For ASCII data. (Strings)

All these classes are derived from the BTREE class. Mainly, they only differ in one function. This is the function needed to compare keys.

You can easily define a new BTREE class for a new type of variable. The only thing to do is to define an

    int t_key(void *a,void *b)

function which returns:

    >0   if a>b
     0   if a==b
    <0   if a<b

with 'a' and 'b' pointers to the new type of variables.

    // Example:
    // Say, you have your data stored in a C structure.
    // Something like:
    typedef struct
    {
       char name[20];
       int  number;
    } record;

    // Which you want to have sorted on the 'number' field.
    class BTREEnew: public BTREE
    {
       virtual int t_key(void *a,void *b)
       {
          return ((record *)a)->number-((record *)b)->number;
       }
       virtual int class_ID(void) { return -100; }
    };

That's all!

The value '-100' in the class_ID() function is not all that important. Its purpose is to give other library functions the opportunity to distinguish between the classes. The value just has to be different from any value the other classes have. This can be accomplished by choosing any negative number.


16.3 Multiple Keys

Because of its nature you would expect a 'key' to appear only once in a btree. However, on many occasions it turns out there is a need for storing the same key more than once, but with a different data part. E.g.
this happens when you use a btree as an index on a database and in the field you are indexing a certain value appears more than once. In that case, you want to store the key value several times but with a different data part, namely the record number in the database.

By default a key value can appear only once in the btree. If you try to insert a second entry with the same key value, it will simply replace the existing one. The option 'multiple_keys()' can be set to alter that.

Syntax:

    void multiple_keys(int YesNo);
    void multiple_keys_YES(void);
    void multiple_keys_NO(void);

When the function 'multiple_keys()' is called with the argument set to TRUE, the btree will store a key value more than once.

It is important to realize that the btree only keeps the key values sorted, NOT the data values. This means that when you are searching for a particular key/data value, the btree is capable of quickly locating the required key but has to find the correct data value by sequentially traversing all the inserted data belonging to that particular key.

Btrees are intended to give quick access to key values by keeping them sorted; this does not apply to data values. Expecting anything else is misusing the btree. If you want quick access to a large number of different data values, all belonging to the same key, you need a different approach. The best thing to do is to construct a new btree and use a key which is a concatenation of the original key and the original data part.

Setting the multiple_keys option has to take place before the 'define'.

    // Example
    // Error checking omitted for conciseness.
    #include "csbtree.h"

    void main(void)
    {
       typedef struct
       {
          char name[20];
          int  age;
       } record;

       BTREE bt;

       bt.multiple_keys_YES();   // Must be called before 'define'
       bt.define("btree.dat",sizeof(record),sizeof(long));

       // By now a btree 'btree.dat' is created in the current working
       // directory with the multiple-keys option switched on.
    }


16.4 Current Pointer

Contrary to TBASE, the btree class does use a 'current' pointer. When using btrees, the need to obtain the next (or previous) entry arises so often that it's inevitable. The btree class spends as little time as possible on maintaining this current pointer. Therefore you should assume it is NOT set, unless you have strong reasons to believe otherwise.

A limited set of functions can be used to set the current pointer. After that, the 'next()' and the 'previous()' functions can be used to move to the next or the previous entry, respectively. When these functions 'fail', which can be noticed from their return value, you should assume the current pointer is not set (any more).

The following functions can be used to set the current pointer:

    - all the search functions. E.g. search_gt(), find() etc.
    - all the min() and max() functions. E.g. max_key(), min() etc.
    - the insert() function.

The current pointer can be moved back and forth with the next() and previous() functions.

Once the current pointer is set, the 'current()' functions can be used to obtain the key value and/or the data part.

Any other function can, and probably will, render the value of the current pointer undefined!

    // Example
    // The next example displays the contents of a btree with
    // 'strings' as key fields.
    // It assumes that the btree 'demo.dbf' exists and that the
    // key fields are less than 100 bytes.
    // Error checking omitted for conciseness.
    #include "iostream.h"
    #include "csbtree.h"

    void main(void)
    {
       char buffer[100];
       BTREEa bt;

       bt.open("demo.dbf",250);       // Does not set the current pointer.

       // Make the first entry the 'current'.
       if(bt.min())                   // This returns FALSE only if the btree is empty.
          do
          {
             bt.current_key(buffer);  // Read the 'current' key value.
             cout<<buffer<<endl;      // Display it.
          }
          while(bt.next());           // Move the current pointer 1 position.

       bt.close();
    }


16.5 Using Btrees

This section groups the public member functions according to purpose.
16.5.1 Creating

    void multiple_keys_YES(void);
    void define(char *name,int key_length, int data_length);

16.5.2 Opening

    int open(char *name, int kb_buffer);

16.5.3 Inserting

    void insert(void *key,void *data);

16.5.4 Searching

    int search(void *key,void *Data);
    int search_gt(void *key,void *Key,void *Data);
    int search_ge(void *key,void *Key,void *Data);
    int search_lt(void *key,void *Key,void *Data);
    int search_le(void *key,void *Key,void *Data);
    int search_dat_..(void *key,void *Data);
    int search_key_..(void *key,void *Key);
    int find(void *key,void *data);
    int find(void *key);
    int max(void *Key,void *Data);
    int min(void *Key,void *Data);

16.5.5 Current

    int current(void *Key,void *Data);
    int current_key(void *Key);
    int current_dat(void *Data);
    int tBOF(void);
    int tEOF(void);

16.5.6 Deleting

    int delet(void *delete_key);
    int delet(void *delete_key,void *data_value);

16.5.7 Closing

    void close(void);


16.6 Functions in alphabetical order.

The function prototypes are in "CSBTREE.H".

void close(void);

    Closes the btree after use. All buffers are flushed and all the allocated memory is freed. This function is also called by the class destructor if needed.

int current(void *Key,void *Data);
int current_key(void *Key);
int current_dat(void *Data);

    Returns the key and/or data part of the entry the current pointer is pointing at.

    Parameters:
    Key    Pointer to the buffer to which the key value has to be copied.
    Data   Pointer to the buffer to which the data part has to be copied.

    If the current pointer has not been set, the functions return FALSE and no data is written into the buffers. If the current pointer is set, the functions return TRUE and the appropriate data is copied into the buffers.

void define(char *name, int key_length,int dat_length);

    This function is needed to initially create the file on disk. This is only needed once, before the first call to open().

    Parameters:
    name         The filename of the new BTREE.
    key_length   The number of bytes in the key.
                 This parameter is needed even when its value is 'obvious' from the btree type.
    dat_length   The number of bytes in the data part.

    Example:
    If you want to use a btree as index on a string field you will need:
    - a btree of type ASCII: BTREEa.
    - a key_length equal to the length of the string field in the database record.
    - a data part which is capable of holding the number of the record in the database. This will normally be a 'long'.

    #include "cstbase.h"
    #include "csbtree.h"

    void main(void)
    {
       typedef struct
       {
          char  name[30];
          float income;
       }record;

       TBASE  db;
       BTREEa index;

       db.define("demo.dbf",sizeof(record));
       index.define("demo.ndx",30,sizeof(long));

       // By now the database and its index are created.
    }

int delet(void *delete_key);

    Searches for key 'delete_key' and, if present, removes it from the btree. If the option multiple_keys is set to 'yes' there can be more than one entry with the specified key. Under those circumstances, all these entries will be removed. The function delet() can also accept a parameter with the value of the data part. That function should be used for deleting when multiple keys are used. The function returns TRUE if something is deleted, FALSE otherwise.

int delet(void *delete_key,void *data_value);

    The same as the previous delete function but this time the value of the data part is also specified. Only the entry which matches both the key and the data part is removed from the btree.

void empty(void);

    Removes all the entries in the btree. The btree needs to be open. Upon return, the btree will contain zero keys and will still be open.

int find(void *key,void *data);
int find(void *key);

    These functions 'test' if a certain key value is present in the btree. When a btree is used with multiple keys, it can be necessary to specify the data part to uniquely identify an entry. TRUE is returned when the required entry is found, FALSE otherwise.

void insert(void *key,void *data);

    Inserts a new entry in the btree.
    'key' is a pointer to the key field and 'data' is a pointer to the data part.

    // Example:
    // Error checking omitted for conciseness.
    #include "string.h"      // For strcpy().
    #include "cstbase.h"
    #include "csbtree.h"

    void main(void)
    {
       typedef struct
       {
          char  name[30];
          float income;
       }record;

       TBASE  db;
       BTREEa index;

       db.define("demo.dbf",sizeof(record));        //Creating.
       index.define("demo.ndx",30,sizeof(long));    //Creating.

       db.open("demo.dbf",30);      // Opening the database with 30 Kb buffers
       index.open("demo.ndx",80);   // Opening the index with 80 Kb buffers.

       record rec;
       strcpy(rec.name,"John Wayne");
       rec.income=7000;                    // Filling the record.

       long recnr=db.append_rec(&rec);     // Insert in the database.
       index.insert(rec.name,&recnr);      // Update the index.

       db.close();                         // Close the database.
       index.close();                      // Close the index.
    }

int max(void *Key,void *Data);
int max_dat(void *Data);
int max_key(void *Key);
int max(void);

    These functions return the last (highest) entry in the btree. The parameters 'Key' and 'Data' have to be pointers to a buffer for the key part and a buffer for the data part, respectively. These buffers will be filled with the appropriate data upon function return. The functions max_dat() and max_key() can be used when only one of the two items is required. TRUE is returned on success, FALSE otherwise.

int min(void *Key,void *Data);
int min_dat(void *Data);
int min_key(void *Key);
int min(void);

    These functions return the first (lowest) entry in the btree. The parameters 'Key' and 'Data' have to be pointers to a buffer for the key part and a buffer for the data part, respectively. These buffers will be filled with the appropriate data upon function return. The functions min_dat() and min_key() can be used when just one of the two items is required. TRUE is returned on success, FALSE otherwise.

void multiple_keys(int TrueOrFalse);

    With this function the use of multiple keys is controlled.
    multiple_keys(TRUE) will allow for multiple keys.
    multiple_keys(FALSE) will not allow for multiple keys.
    Important: This function has to be called before the define() function is invoked. It is not possible to alter the setting of the multiple key parameter later on. For more information about multiple keys please read paragraph 16.3.

int multiple_keys_YES(void);

    Same as multiple_keys(TRUE);

int multiple_keys_NO(void);

    Same as multiple_keys(FALSE);

int multiple_keys(void);

    This function returns TRUE if multiple keys is set to 'YES' and FALSE otherwise.

int next(int n);
int next_key(int n,void *Key);
int next_dat(int n,void *Data);
int next(int n,void *Key,void *Data);

    A set of functions to move the current pointer closer to the 'end' of the btree. Apart from that, they are similar to the prev() functions. For more information, please see the description of prev().

int next(void);

    Same as next(1); but more efficient.

long numkey(void);

    This function returns the number of different keys in the btree.

int open(char *name, int kb_buff);

    Opens an existing btree 'name'. The parameter 'kb_buff' indicates how many Kb of RAM may be used for buffering. The function returns TRUE on success, FALSE otherwise.

    // Example:
    #include "iostream.h"
    #include "csbtree.h"

    void main(void)
    {
       BTREEa index;

       if(!index.open("demo.ndx",100))
       {
          cout<<"Error"<<endl;
       }
    }

    This will open the btree 'demo.ndx' and will, at most, use 100 Kb of RAM for buffering.

    NOTE: read chapter 11 about buffering before using really large amounts of buffers.

void pack(void);

    A function which optimizes disk usage. Due to many insertions and deletions it is possible for blocks with zero keys to emerge. There are no pointers to these blocks but they will still be part of the btree file because they cannot be removed unless they are the last block in the file. These blocks will be used as soon as the need for a new block arises. The pack function will remove these empty blocks and rearrange all the keys. This is done by writing all the data to a temporary file and reloading the btree.
    int prev(void);
    int prev(int n);
    int prev_key(int n,void *Key);
    int prev_dat(int n,void *Data);
    int prev(int n,void *Key,void *Data);

A set of functions to move the current pointer closer to the 'beginning' of the btree.

Parameters:

    n     The number of entries the current pointer needs to be moved.
    Key   A pointer to the buffer which is going to be filled with the
          value of the key field.
    Data  A pointer to the buffer which is going to be filled with the
          value of the data part.

If the current pointer is NOT moved, the key and data buffers are not filled. Otherwise they are filled with the values of the entry to which the current pointer has moved. The key field, the data field, or both can be obtained by selecting the appropriate function. The prev(void) function is a more efficient version of prev(1).

Important: The current pointer needs to be set first. Please read paragraph 16.4 about the 'current pointer' for more information. When one of the prev() functions is called while the current pointer is not set, the function returns 0.

The current pointer cannot be moved before the beginning of the btree, therefore the number of positions moved can differ from the number requested. The return value is the number of positions the current pointer has actually moved.

    int search(void *key,void *Data);

Searches for 'key' and fills buffer 'Data' with the corresponding data part if 'key' is found. The function returns TRUE if found, FALSE otherwise.

    int search_gt(void *key,void *Key,void *Data);
    int search_ge(void *key,void *Key,void *Data);
    int search_lt(void *key,void *Key,void *Data);
    int search_le(void *key,void *Key,void *Data);

The purpose of this set of functions is to search for a key value close to a given value. 'key' is the key value searched for, while 'Key' is the key value actually found. 'Data' is the data part belonging to 'Key'.
The suffix '_xx' has the same meaning as the corresponding FORTRAN operators:

    gt:  Greater Than   >
    ge:  Greater Equal  >=
    lt:  Less Than      <
    le:  Less Equal     <=

The functions return TRUE if such a key could be found, FALSE if not.

Example:
Assume the next table represents a btree.

    Entry   Key value   Data value
      1     Blue        123
      2     Green        45
      3     Red         678

    search_gt("Blue",Key,Data)      Return value: TRUE
                                    Key:  Green
                                    Data: 45

    search_ge("Blue",Key,Data)      Return value: TRUE
                                    Key:  Blue
                                    Data: 123

    search_lt("Blue",Key,Data)      Return value: FALSE
                                    Key:  undefined
                                    Data: undefined

    search_ge("Orange",Key,Data)    Return value: TRUE
                                    Key:  Red
                                    Data: 678

    int search_dat_..(void *key,void *Data);
    int search_key_..(void *key,void *Key);

The previously described functions return both the key value and the data value. In some cases this is a waste of memory, therefore there are two similar sets of functions which return either the found key value or the data part.

    search_dat_..()  returns only the found data value.
    search_key_..()  returns only the found key value.

Example:
With the same btree as in the previous example:

    search_dat_gt("Blue",Data)      Return value: TRUE
                                    Data: 45

    search_key_ge("Blue",Key)       Return value: TRUE
                                    Key:  Blue

    int skip(int n);
    int skip(int n,void *Key,void *Data);
    int skip_key(int n,void *Key);
    int skip_dat(int n,void *Dat);

A set of functions to move the current pointer. These functions are front ends for the next() and prev() functions. E.g. this is how the skip_key() function is implemented:

    int skip_key(int n,void *Key)
    {
        if(n>0) return  next_key( n,Key);
        else    return -prev_key(-n,Key);
    }

So, with these functions the argument 'n' may be positive or negative. With 'n' positive, the current pointer is moved towards the 'end' of the btree. This is done by calling next(n). With 'n' negative, the current pointer is moved towards the 'beginning' of the btree. This is done by calling -prev(-n). The return value can be either positive or negative, depending on the value of 'n'.
When 0 is returned, the current pointer has not been moved or was not set. See also the prev() functions for more information.

    int tBOF(void);

Returns TRUE if 'current' is pointing to the first entry of the btree, FALSE otherwise. Behaviour is undefined if 'current' is NOT set!

    int tEOF(void);

Returns TRUE if 'current' is pointing to the last entry of the btree, FALSE otherwise. Behaviour is undefined if 'current' is NOT set!


17 CSDBGEN

This chapter describes the CSDBGEN program generator.

17.1 Introduction

So far we have discussed the TBASE class, which is capable of reading and writing records, and the BTREE class, which can be used as an index. CSDBGEN helps to build an indexed database out of these: a database which is capable of manipulating fields, maintaining indexes and the like. As input it takes a 'definition file' which describes the fields and the indexes. From this, it produces the source for a new C++ class. Member functions of this newly created class are used to access the fields. CSDBGEN does not generate a user interface.

There are several good reasons for using a program generator.

- The TBASE class can concentrate on manipulating records rather than fields. Because of that, it remains a universal and efficient way to read and write blocks of data.

- With this approach it is easier to deal with field types that are not supported by the C programming language, particularly dates.

- It is relatively easy for the program generator to convert to dBASE format because it has all the required knowledge at hand. Figuring out this conversion at runtime is a lot more complicated and also makes your executable larger, because the knowledge to do the conversion is then in the application instead of in the program generator.

- Or more in general: everything the program generator does can be left out of the application, making the executable smaller.
- Without a program generator, the differences between the field types have to be dealt with at runtime, perhaps even with every call that accesses a field. Doing so will inevitably result in some sort of interpreter.

17.2 Overview

Using CSDBGEN starts with creating a database definition file. This file describes the layout of a record, the indexes, the name of the new class and the name of the files. When this file has been created, CSDBGEN is called with its filename as a parameter. In return it produces the source for a brand new C++ class. This source is ready to compile, without the need for manual editing.

This new class has public member functions for reading a field, setting the active index, exporting to ASCII, exporting to dBASE, reindexing, packing and so on. These member functions are very easy to use because they take very few parameters, and often none. This is possible because a lot of the information which is specific to the database is already in the source of the functions.

The generated class controls one database and all the indexes that come with it. If you need more than one database, you have to repeat this procedure for the other databases as well. An elaborate example is at the end of this chapter.

17.3 Features

- Indexes with more than one reference to a record!
  This is an innovation! It creates the possibility to locate a record by searching for a substring of the field, rather than the entire field. This topic is discussed in more detail in paragraph 17.6, called 'Tokenizing'.

- Conversion to and from ASCII.
  This is convenient for backups, conversions to other systems and also for making changes in the record layout.

- Export to dBASE compatible file format.
  CSDBGEN generates a function (member of the new class) capable of writing the contents of the database to a file in the dBASE format.

- Import from dBASE.
  There is no direct way to import a dBASE file, but the command-line utility CS4DBASE can be used to convert a dBASE file into an ASCII file which can be read by the import() function.

- Can manage very large databases.
  The libraries were designed to avoid limitations. As a result, databases of up to 2 billion records are theoretically feasible.

- No overhead.
  Due to the use of a program generator, the overhead involved in accessing fields is next to none.

- Large buffers.
  This system is capable of effectively using large amounts of RAM for buffering.

- Fast.
  The previous two points, together with the efficient BTREEs, make for a very fast database.

17.4 Limitations

- No record locking.

- No multi-user support.
  This system was designed to be used in single-user applications. For the time being, there is no support for network/shared databases. Perhaps there will be in the future, but if so, it will take the form of a new series of classes.

17.5 Definition file

The information needed to generate the new class is obtained from a 'definition file'. To get started, CSDBGEN is capable of generating an example.

Example:

    c:\borlandc>csdbgen /example>example.def

Something like this will generate an example definition file 'example.def'. To get acquainted, let's look at what's in it.

Example database definition file:

    class:  NAM
    record: NAM_record
    file:   demo
    field:  name      s 30 Y
    field:  city      s 20
    field:  birthday  d Y
    field:  salary    f

Explanation:

line 1:     class: NAM

The program generator generates a class, which of course has to have a name. In this example 'NAM'.

line 2:     record: NAM_record

As explained before, we use a C structure as a record. The name of this structure is defined in the second line.
In the generated header file the following (among others) will appear:

    #define NAME_LENGTH 30
    #define CITY_LENGTH 20

    typedef struct
    {
        char  _name[NAME_LENGTH+1];
        char  _city[CITY_LENGTH+1];
        long  __birthday;
        float _salary;
    } NAM_record;

line 3:     file: demo

This line indicates the name of the files the database system will use. In this example three files will be used:

- demo.dbf, the TBASE database file.
- demo01.idx, the BTREE index file on field 'name'.
- demo02.idx, the BTREE index file on field 'birthday'.

lines 4-7:  Field definitions.

The syntax for a field definition is:

    field: <field_name> <field_type> [length] [format] [index]

With:

    field_name  The name of the field.

    field_type  The type of the field, which can be:
                    i: integer
                    l: long
                    f: float
                    F: double
                    c: character
                    s: string
                    d: date (as used for the 'birthday' field in the example)

    length      Only for strings. Indicates the number of characters the
                field needs to be able to store. One additional byte is
                reserved to store the null terminator.

    format      Only for date fields. Please see the documentation on DATE
                fields. To give a quick example: MDY4 means Month, Day and
                4 positions for the Year.

    index       'Y' means a normal index. 'T' means a 'tokenized' index.
                Nothing means no index at all. See also the documentation
                on 'tokenizing' further on.

In the example:

    field: name s 30 Y

A field 'name' of type string. 31 bytes are reserved, 30 for the characters and 1 for the null terminator. An index is maintained for this field because of the additional 'Y'.

    field: city s 20

A field 'city' of type string with a length of 20 characters. There is NO index on this field.

    field: birthday d Y

A field 'birthday' of type DATE. An index is placed on this field.

    field: salary f

A field 'salary' of type float, without an index.

17.6 Tokenizing

This is a new concept! Let's explain things with an example. Say you have a database with a record for the 'World Health Organization'. Normally, the only way to locate this record would be by entering a search string starting off with 'World ....'.
But now, with tokenizing, this record could also be located by searching for 'Healt..' or 'Organization..'. It's important to know that this is not implemented by traversing the entire database from the first record to the last, as is done by some toy applications, but by maintaining an extensive index.

17.6.1 How does it work?

Traditionally, the index stores the entire field. In this example that would mean 'World Health Organization' is stored in the index together with a reference to the record number in the main database. With tokenizing, the index stores 3 entries, namely 'World', 'Health' and 'Organization'. This approach means that there will be a lot more entries in the btree than there are records in TBASE. In other words, an index is maintained on every suitable substring, rather than on the entire field. To save disk space, the length of the key field in the BTREE is only half of the field length in the main database.

17.7 When is a substring indexed?

First of all: tokenizing only applies to string fields. E.g. 'tokenizing' a float field seems pointless. Whether or not a substring is placed in the index is controlled by two things:

- the way it is separated from the rest of the field.
- the length of the substring.

For every field 'tokenized', CSDBGEN generates two symbolic constants: one for the token delimiters and a second for the minimal token length.

    // Example:
    // Suppose you use tokenizing on a field 'NAME'.
    // In the generated .cpp file two #defines will appear:

    #define TD_NAME "\t(),- "  // Token delimiters for field 'name'.
    #define TL_NAME 4          // Minimal Token length for field 'name'.

The tokenize function separates the field into substrings according to the characters in the TD_????? string. Notice that the '.' is not a delimiter. This is to prevent abbreviations from being split up. A substring has to be at least TL_????? bytes long to appear in an index.
By default this is four bytes. That's not too long for most cases, but it means that three-letter abbreviations like 'IRS' are not indexed. Of course you can alter these two defines when needed.

Example definition file:

    class:  NAM
    record: NAM_record
    file:   demo
    field:  name      s 30 T
    field:  city      s 20
    field:  birthday  d
    field:  salary    f

Notice the 'T' behind the 'name' field. This is short for 'tokenize'. The generated class will apply tokenizing to the 'name' field.

17.8 Compound indexes

On several occasions there is a need for an index on a concatenation of fields, rather than on one single field. E.g. the combination of a string field and a date field is very popular.

Example:
Suppose you are building a database for sport events. Some events like 'Heavy Weight Champion Match' may appear more than once. Therefore, you want the events listed first alphabetically and second by date. In this way all the heavy weight champion matches appear together and in order of their scheduled date.

CSDBGEN can generate indexes on a concatenation of fields. For that, it requires an 'index' line in the database definition file.

17.8.1 A simple example

    class:  NAM
    record: NAM_record
    file:   demo
    field:  name      s 30 Y
    field:  city      s 20
    field:  birthday  d Y
    index:  nabi      a:name+a:birthday

Note the last line. This will generate the index nabi, which is an index on the concatenation of the name field and the birthday. The a: indicates 'ascending'. In case you need 'descending', use d:.

There is no need to limit the indexes to a concatenation of just two fields. Many more fields can be used, and the same field may even appear more than once. Also, there is no reason why a field which already has a 'normal' index shouldn't be used in a compound index. E.g. in the example, the name field is already indexed because of the 'Y' behind its definition; nevertheless it's also used in the nabi index.
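The ordering a compound index imposes can be illustrated with an ordinary in-memory sort. The sketch below is not what CSDBGEN generates; Event, CompoundLess and sort_events are hypothetical names, the date is represented by an arbitrary long, and the two boolean flags merely play the role of the a: and d: prefixes of an 'index:' line.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record for the sport events example. The date is a plain
// long; any representation that sorts chronologically would do.
struct Event
{
    std::string name;
    long        date;
};

// Orders events on name first and on date second. 'name_asc' and
// 'date_asc' stand in for the a: / d: prefixes (true = ascending).
struct CompoundLess
{
    bool name_asc;
    bool date_asc;

    CompoundLess(bool n, bool d) : name_asc(n), date_asc(d) {}

    bool operator()(const Event& x, const Event& y) const
    {
        if (x.name != y.name)
            return name_asc ? x.name < y.name : x.name > y.name;
        if (x.date != y.date)
            return date_asc ? x.date < y.date : x.date > y.date;
        return false;   // equal keys: neither sorts before the other
    }
};

// Returns the events in the order the compound key would list them.
std::vector<Event> sort_events(std::vector<Event> events,
                               bool name_asc, bool date_asc)
{
    std::sort(events.begin(), events.end(),
              CompoundLess(name_asc, date_asc));
    return events;
}
```

Calling sort_events(events, true, true) corresponds to 'a:name+a:birthday': events with the same name end up next to each other, ordered by date. Passing false for the first flag corresponds to 'd:name+a:birthday'.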
17.8.2 A more complex example

    class:  NAM
    record: NAM_record
    file:   demo
    field:  name      s 30 T
    field:  city      s 20
    field:  birthday  d MDY2
    index:  nabi      d:name+a:birthday
    index:  binaci    a:birthday+d:name+a:city

This will generate a NAM class with a tokenized index on the name field and two compound indexes, nabi and binaci. The nabi index sorts the name descending and the birthday ascending. The binaci index sorts first ascending on the birthday, secondly descending on the name and last ascending on the city.

Example:

    #include "iostream.h"
    #include "nam.h"

    void main(void)
    {
        NAM nam;

        nam.open();                // Opening
        nam.order(NAME_INDEX);     // Use the name index.
        ....                       // Do something
        nam.order(NABI_INDEX);     // Use the nabi index.
        ....                       // Do something
        nam.order(BINACI_INDEX);   // Use the binaci index.

        nam.top();                 // Display the records in order
        do                         // of the binaci index.
        {
            cout<<nam.birthday()<<" ";
            cout<<nam.name()<<" ";
            cout<<nam.city()<<endl;
        }
        while(nam.skip(1));

        nam.close();
    }

17.8.3 Compound & Tokenizing Indexes

It's also possible to use a compound index while a tokenizing technique is applied to one or more of its fields. The syntax to generate such an index can be seen in the next example.

Example database definition file:

    class:  NAM
    record: NAM_record
    file:   demo
    field:  name      s 30
    field:  birthday  d MDY2
    index:  nabi      aT:name+a:birthday

Notice the 'T' in the last line. The nabi index will now 'tokenize' the name field, concatenate the birthday and use this as a reference for the record.

Example:

    Record 1: Bjorn Esverald   04/07/70
    Record 2: Bjorn Gensjeng   03/23/55
    Record 3: Gata Esverald    05/11/93

Assume the above three records make up the database. Which references will the nabi index contain?
    Entry nr:   Key:                   Record pointed to:
    ──────────────────────────────────────────────────────
        1       Bjorn     03/23/55             2
        2       Bjorn     04/07/70             1
        3       Esverald  04/07/70             1
        4       Esverald  05/11/93             3
        5       Gata      05/11/93             3
        6       Gensjeng  03/23/55             2

Just as with the ordinary tokenized fields, the length of the token placed in the index is at most half of that of the original field.

Although it's not easy to come up with an application in which it is useful, it's possible to have a compound index on the concatenation of more than one tokenized field:

Example:

    class:  NAM
    record: NAM_record
    file:   demo
    field:  name       s 60
    field:  interests  s 50
    index:  nabi       aT:name+dT:interests

This creates a vast index because a reference is generated for every possible combination of a token in the name field and a token in the interests field. Let's say the name field contains three tokens and the interests field four. Then there will be 3*4=12 references in the index, just for this single record!

Example:

    Record 1: Roger Sander Barkakati   Bonsai Trees

Suppose this is the only record in the database. Then the nabi index will contain the following 6 references:

    Entry nr:   Key:                   Record pointed to:
    ──────────────────────────────────────────────────────
        1       Barkakati Trees                1
        2       Barkakati Bonsai               1
        3       Roger     Trees                1
        4       Roger     Bonsai               1
        5       Sander    Trees                1
        6       Sander    Bonsai               1

Note: 'Trees' appears before 'Bonsai' because the interests field is tokenized descending.

17.8.4 Locating an Entry

With a compound index, locating a specific entry is not all that easy. Normally, supplying a pointer to the search argument is sufficient, but this time the pointer needs to point to the concatenation of two or more values which don't even have to be of the same type. Therefore, CSDBGEN typedefs a record structure for every compound index.
Example:

    class:  NAM
    record: NAM_record
    file:   demo
    field:  name       s 60
    field:  interests  s 50
    index:  nabi       aT:name+dT:interests

    // This definition file will generate a header file
    // which, among others, contains the following lines:

    #define NAME_LENGTH      60
    #define INTERESTS_LENGTH 50

    #define UNSORTED   0
    #define NABI_INDEX 1

    typedef struct
    {
        char name[NAME_LENGTH/2];
        char interests[INTERESTS_LENGTH/2];
    }NAM_rec_nabi;

Note that the field lengths in the NAM_rec_nabi structure are only half of the database field lengths. NAM_rec_nabi is the structure which can be used to search for an entry in the compound index nabi.

Example:

    #include ....

    void main(void)
    {
        NAM_rec_nabi nrn;
        NAM nam;

        nam.open();
        nam.order(NABI_INDEX);   // Make the NABI index active.

        strcpy(nrn.name,"Gandalf");
        strcpy(nrn.interests,"Witchcraft");

        // Locate. Case INsensitive.
        if(nam.find(&nrn)) cout<<" Found ";
        else               cout<<" Not Found ";

        nam.close();
    }

17.9 Export to dBASE

The program generator also produces a member function:

    int to_DBASE(char *filename);

When called, this function produces a file 'filename' which can be read by dBASE. Index files are NOT written.

17.10 Importing from dBASE

CSDBGEN does not generate a function to directly import a dBASE file. However, the CS4DBASE utility, discussed in chapter 27 of this documentation, can read a dBASE file. It produces an ASCII file which is formatted to be read by the import() function. The next paragraph explains how to read ASCII files using the import() function.

17.11 Exporting/Importing to/from ASCII

    int export(char *filename);

This writes out the contents of the database to an ASCII file 'filename'. That file also contains information about the fields. In this way the import() function knows how to process this data, even after changes in the record layout.

    int import(char *filename);

This member function reads the ASCII file 'filename' and appends the data to the current database. It is meant to be used in conjunction with the export() function.
The export function starts off by writing the entire definition file. The import function uses this information to skip fields which are not in the database and to read fields in the right order. (Only 'familiar' fields are read, the others are ignored.) This mechanism can be used to make changes in the record layout.

Note: Because it's an ASCII file, the names of the fields can be changed with a normal editor.

17.12 Starting a new database

A member function

    int define(void);

is available to create a new database. If the database already exists, it is overwritten!

17.13 Opening a database

The member function

    int open(void);

opens an existing database. Index files are automatically rebuilt if they don't exist.

17.14 Current Record

At any moment there is exactly one record which is the 'current record'. The functions to read and write fields all work with this current record.

- After opening, the first record becomes the current record.
- The go_to(), skip(), top(), bottom() and search() functions can be used to make another record 'current'.

17.15 Accessing fields

CSDBGEN generates two member functions for each field: one to read the field, and a second to write the field. The names are the same but the arguments differ.

Example:

    // A part of the definition file:

    class:  NAM
    record: nam_record
    file:   dbtest
    field:  name    s 40
    field:  number  i

    // We have a 'name' field consisting of a string with 40 characters
    // and a 'number' field which is an integer.
    // Among others, the next member functions are generated by CSDBGEN:

    class NAM
    {
    public:
        // For reading
        char *name(void);
        int   number(void);

        // For writing
        void  name(char *s);
        void  number(int i);
    };

The next example gives an impression of how the generated class could be used.

Example:

    void main(void)
    {
        NAM db;             // We now have a class 'NAM'.

        if(!db.open())      // Opens database, assuming it already exists.
        {                   // No file names have to be entered.
            cout<<"error";  // All the indexes are opened automatically.
        }

        puts(db.name());          // Displays the name field of the first record.
                                  // After 'open' the first record is 'current'.

        db.name("Pjotr Idaho");   // Changes the contents of the 'name' field to
                                  // 'Pjotr Idaho'. Indexes are updated automatically.

        if(!db.close())           // Close database.
        {
            cout<<"error";
        }
    }

17.16 DATE fields

Standard C doesn't support date variables. Therefore, this library has its own DATE class.

The functions to read and write date fields use a string representation of a date. These strings can represent a date in several formats. CSDBGEN uses a default of DMY4. This means 2 positions for the Day, 2 positions for the Month and 4 positions for the Year.

Example:

    "02/04/1994"   // By default interpreted as: April the 2nd 1994.

When a two-position representation of the year is wanted, use Y2 instead of Y4. Roughly speaking: every sensible order of M, D and Y2 or Y4 is accepted.

Example:
If you want "02/04/94" to be interpreted as February the 4th 1994, use the format MDY2. The line in the database definition file has to be:

    field: birthday d MDY2

If you want the field to be indexed, add an additional 'Y':

    field: birthday d MDY2 Y

On disk, dates are stored as longs. The sem_jul() function of the DATE class is used to convert a date to a long. For more information about the date formats and the DATE class, please see chapter 30.

17.17 Changing the record layout.

Even when the database is already in use, the need to make changes in the record layout may occur. With the next procedure this can be accomplished quite easily, without the need to re-enter any data manually.

To put it in a nutshell: save your data to an ASCII file with the old export() function and reload it with the new import() function.

Or in more detail:

a) Export with the 'old' export() function.
   This will produce an ASCII file which fully represents the database.

b) Make a new definition file or alter the old one.

c) Generate a new class with CSDBGEN.

d) Compile & link.
   The last three steps are simply the procedure for creating a database using CSDBGEN.

e) Use the new import() function to reload the data.
   Import the ASCII file created in step 'a'.

The import() function does the actual conversion. It can do this because it has knowledge of both the old and the new definition file. The old one is at the top of the ASCII file, and knowledge about the new one is hard coded in the import() function by CSDBGEN.

17.18 Member functions in alphabetical order

Next is a list of the public member functions as they appear in the generated class. With the sole exception of open() and define(), the database needs to be open for these functions to work properly.

    void append_blank(void);

Appends an additional record to the database. The record is filled with binary zeros and becomes the current record.

    int bottom(void);

The current record is set to the last record according to the active index. The function returns FALSE if the database is empty, otherwise TRUE is returned.

    int close(void);

Closes the open files. All buffers are flushed and all allocated memory is released. This function is called automatically by the class destructor if needed. The function always returns TRUE.

    long curr_rec(void);

Returns the number of the current record. The first record is number 1.

    int define(void);

Creates a new database. Files are generated for the database and all the indexes. If a file already exists, it's overwritten. TRUE is returned on success. Otherwise FALSE is returned and the error_nr() function will return the error generated.

    void delet(void);

Marks the current record for deletion. When the pack() function is called, all the marked records are removed from the database.

    int export(char *filename);

Writes the contents of the database to an ASCII file 'filename'. This file is meant to be read back by the import() function. The exported file contains a header which resembles the 'database definition file'.
The function returns TRUE on success, FALSE otherwise.

    int go_to(long rec_nr);

The record 'rec_nr' becomes the current record. Whether or not the record is marked for deletion makes no difference. The function returns TRUE on success, FALSE otherwise.

    int import(char *filename);

Reads records from an ASCII file 'filename' generated by the export() function and appends these records to the database. TRUE is returned on success, FALSE otherwise.

    int is_delet(void);

This function returns TRUE if the current record is marked for deletion, FALSE otherwise.

    long numrec(void);

Returns the number of records currently in the database. The records marked for deletion are also counted.

    int open(void);

Opens the database for use. The define() function must have been called before, that is, the database file needs to exist. Index files are automatically generated if they are missing. TRUE is returned on success. Otherwise FALSE is returned and the error_nr() function will return the error generated.

    int order(void);

Returns the number of the currently active index.

    int order(int index_number);

This function controls the use of indexes. The variable 'index_number' indicates which index has to become the active index. All the indexes, however, are updated when a record is altered.

In the header file a preprocessor constant is defined for each index. The name of this constant is generated by converting the field name to upper case and adding _INDEX.

Example:

    An index on field:       Street
    Preprocessor constant:   STREET_INDEX

    <Class>.order(STREET_INDEX);  will make the index on the street field
                                  the active index.
    <Class>.order(UNSORTED);      makes all the indexes inactive.

Changing the active index does not alter the current record. The preprocessor constant UNSORTED can be used to render all the indexes inactive; the database is then browsed in its 'natural' order. The function returns TRUE on success, FALSE otherwise.

    int pack(void);

Removes all the records marked for deletion. No temporary files are used!
The function returns TRUE on success, FALSE otherwise.

    int reindex(void);

Rebuilds all the indexes of the database. The function returns TRUE on success, FALSE otherwise.

    int search(void *key);

The active index is searched for value 'key'. The current record becomes the first record which matches the search value. The function accepts a pointer to the search argument.

When the search argument is not exactly matched, the current record becomes the record with the 'next higher' value. In this case the function returns TRUE. If no 'higher' value is available, the last record becomes the current record and the function returns FALSE. This strategy proves to work fine when searching for names etc.

    int skip(int delta=1);

Moves the current record pointer delta positions.

Examples:

    skip(1);   // The next record becomes the current record.
    skip(-1);  // The previous record becomes the current record.
    skip(0);   // Nothing happens.
    skip(10);  // The record 10 positions to the end becomes
               // the current record.
    skip();    // Same as skip(1);

If an attempt is made to go 'before' the first record, record number 1 becomes the current record. Similarly, the last record becomes the current record if an attempt is made to pass beyond the last record. The order in which the records are traversed is controlled by the currently active index. The function returns the number of positions actually moved.

    int tBOF(void);

Test for Beginning Of File.

    int tEOF(void);

Test for End Of File.

Each function returns TRUE if the respective end of the database has been reached (according to the active index), FALSE otherwise.

    int to_DBASE(char *filename);

Exports the database to a file 'filename', which can be read by dBASE. Index files (for dBASE) are NOT generated. The function returns TRUE on success, FALSE otherwise.

    int top(void);

The current record is set to the first record according to the active index. The function returns TRUE on success, FALSE otherwise.
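The navigation rules given for search() and skip() above can be summarised in a small stand-alone model. This is a sketch, not generated CSDBGEN code: IndexCursor is a hypothetical type, the 'database' is just a sorted list of keys for one active index, it is assumed to be non-empty, and returning a negative count for backward moves is an assumption borrowed from the BTREE skip() description.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a generated class with one active index. 'keys' holds
// the index entries in sorted order; 'current' is the position of the
// current record in that order.
struct IndexCursor
{
    std::vector<std::string> keys;
    std::size_t              current;

    explicit IndexCursor(const std::vector<std::string>& sorted_keys)
        : keys(sorted_keys), current(0)
    {
        assert(!keys.empty());   // the model assumes a non-empty database
    }

    // Move 'delta' positions, stopping at the first or the last entry.
    // Returns the number of positions actually moved.
    int skip(int delta = 1)
    {
        long last   = static_cast<long>(keys.size()) - 1;
        long target = static_cast<long>(current) + delta;
        if (target < 0)    target = 0;
        if (target > last) target = last;
        int moved = static_cast<int>(target - static_cast<long>(current));
        current = static_cast<std::size_t>(target);
        return moved;
    }

    // Position on the first entry that matches 'key' or, failing that,
    // on the next higher one (returns true). If nothing is higher, the
    // last entry becomes current and false is returned.
    bool search(const std::string& key)
    {
        for (std::size_t i = 0; i < keys.size(); ++i)
        {
            if (keys[i] >= key) { current = i; return true; }
        }
        current = keys.size() - 1;
        return false;
    }
};
```

Because skip() reports 0 once the last entry has been reached, a loop of the form do { ... } while(cursor.skip()); visits every entry from the current one to the end exactly once.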
    void undelet(void);

If the current record is marked for deletion, this function removes the marker.

17.19 Warning

The program generator is not 'fool proof'. This means that you should avoid using names which are reserved C++ keywords. E.g. if you try to define a field with the name 'delete', the resulting source will not compile.

17.20 A Large Example

Let's say we want to build a database which stores a person's name and his/her birthday.

Step 1

First we need to construct a definition file. Next is a working example.

    class:  BIRTH
    record: BRecord
    file:   bdays
    field:  name      s 30 T
    field:  birthday  d Y4MD Y

Assume the name of this definition file is 'birth.def'.

Step 2

From the definition file we have to generate the source for the database. We do that by calling CSDBGEN.

    c:\borlandc\csutil\test> csdbgen birth.def

This produces two output files: 'birth.cpp' and 'birth.h'. These names are derived from the name of the definition file, not from the class name as one might expect from this example.

Step 3

We are now ready to start compiling. Normally, creating the database will be an option in the main menu of the application, but because this is a demonstration we do things differently.

    #include "iostream.h"
    #include "birth.h"

    void main(void)
    {
        BIRTH db;          // Declare an instance of the new BIRTH class.
        if(!db.define())   // Create the database and its indexes.
        {
            cout<<"Error "<<endl;
        }
    }

Compile this together with the 'birth.cpp' file and link it. When run, it should create three files:

- 'bdays.dbf'    The TBASE main database file.
- 'bdays01.idx'  The BTREEa index on the field name.
- 'bdays02.idx'  The BTREEl index on the field birthday.
                 Remember, dates are stored as longs.

If you run CSDIR in the same directory it will show something like
this:

    Directory C:\BORLANDC\CSUTIL\TEST\

    Name          Size  Type    Entries  Created      Updated
    --------------------------------------------------------------------------
    BDAYS.DBF      174  TBASE         0  Nov 01 1994  Nov 01 1994
    BDAYS01.IDX    174  BTREEa        0  Nov 01 1994  Nov 01 1994
    BDAYS02.IDX    174  BTREEl        0  Nov 01 1994  Nov 01 1994
    --------------------------------------------------------------------------
    Total: 522 bytes in 3 files.

Step 4

By now, we have created the database files and we have the class to
work with them. In other words, we are ready to write an
'application'.

    // Some error checking omitted for conciseness.

    #include "iostream.h"
    #include <stdlib.h>             // exit()
    #include "birth.h"

    void main(void)
    {
        BIRTH db;                   // Declare an instance of the new BIRTH class.

        if(!db.open())              // Open it.
        {
            cout<<" Error "<<endl;
            exit(1);
        }

        db.append_blank();          // Because it's empty, add a blank record.
        db.name("Luke Skywalker");  // Modify the name.
        db.birthday("2015/07/03");  // Modify the birthday.

        db.append_blank();          // Add a new record. Becomes the current.
        db.name("Al Bundy");        // Modify the name.
        db.birthday("1945/11/30");  // Modify the birthday.

        db.reindex();               // Reindexing. For demonstration purposes.
                                    // Shouldn't be necessary.

        db.order(BIRTHDAY_INDEX);   // Make BIRTHDAY the active index.
        db.top();                   // Go to the oldest person.
        do
        {
            cout<<db.name()<<endl;  // Display his name.
        }
        while(db.skip());           // Skip to the next.

        db.go_to(1);                // Make record 1 the current record.
                                    // Index INdependent!
        db.delet();                 // Mark it for deletion.
        db.pack();                  // Remove it from the database.

        db.close();                 // Close database and indexes.
    }

If you run CSDIR again afterwards, you will see something like this:

    Directory C:\BORLANDC\CSUTIL\TEST\

    Name          Size  Type    Entries  Created      Updated
    --------------------------------------------------------------------------
    BDAYS.DBF     4096  TBASE         1  Nov 01 1994  Nov 01 1994
    BDAYS01.IDX   3072  BTREEa        1  Nov 01 1994  Nov 01 1994
    BDAYS02.IDX   3072  BTREEl        1  Nov 01 1994  Nov 01 1994
    --------------------------------------------------------------------------
    Total: 10240 bytes in 3 files.


Part Three

Next are some classes which can be used where the traditional
database will not do. A VRAM class is discussed which makes it
possible to maintain pointer structures on disk. Two other classes,
VBASE and VBAXE, are presented which deal with variable length
records.


18 VRAM

18.1 Introduction

VRAM is without any doubt the most flexible and versatile class in
this library. Contrary to the traditional database, this one doesn't
suffer from fixed record sizes and doesn't have problems with
deletions. In other words: it isn't a database at all!

Assuming a C++ programmer has a good understanding of a 'heap', it
shouldn't take long to explain this class. In one sentence, VRAM
mimics a 'heap on disk'.

The idea is simple: use functions like 'malloc' and 'free' to
manipulate the necessary space, just like with an ordinary heap,
only this time the heap is in fact a file. In this way the data is
not lost when the program exits, while all the flexibility of a heap
is still there!

18.2 Creating

    int define(char *name,U16 struclen);

This is the function needed to create a VRAM system. Contrary to
what you might have expected, it takes two parameters. The first is,
as usual, the file name. The second, however, is the maximum size
you are planning to allocate. This differs from the ordinary heap,
which simply accepts allocations of any size right from the start.
(Which also explains why the ordinary heap allocations are so
amazingly inefficient.)

In a way, the second parameter 'struclen' is a performance parameter.
If you like, you can always use the maximum, which is 32 Kb, but
this would yield a highly inefficient VRAM. The more accurately
'struclen' reflects the true state of affairs, the better the VRAM
system will perform.

However, performance option or not, 'struclen' is an upper limit to
what you are allowed to allocate. Any attempt to allocate more will
be answered with a runtime error.

    // Example VRAM define()
    // Error checking omitted for conciseness.

    #include "CSVRAM.H"

    void main(void)
    {
        VRAM vr;
        vr.define("VRAM.TST",614);  // Allocating at most 614 bytes.
    }

The CSDIR utility recognizes VRAM files. When the example program
has run, it will display something like:

    Directory C:\BORLANDC\TEST\VRAM\

    Name          Size  Type    Entries  Created      Updated
    --------------------------------------------------------------------------
    VRAM.TST       174  VRAM          0  Nov 18 1994  Nov 18 1994
    --------------------------------------------------------------------------
    Total: 174 bytes in 1 files.

18.3 Opening & Closing

Like all the database classes in this library, VRAM needs to be
'opened' before it can be used and, consequently, 'closed'
afterwards.

    syntax: int open(char *name,U16 kb_buf);

This opens the VRAM file 'name' and uses 'kb_buf' Kb for buffering.

    syntax: int close(void);

This closes the VRAM system. This function is also called by the
class destructor when needed.

    // Example VRAM
    // Error checking omitted for conciseness.

    #include "CSVRAM.H"

    void main(void)
    {
        VRAM vr;
        vr.define("VRAM.TST",614);  // Allocating at most 614 bytes.
        vr.open("VRAM.TST",300);    // Opens VRAM.TST using 300 Kb buffers.

        // Doing something interesting.

        vr.close();                 // Close VRAM system.
    }

18.4 VRAM Pointers

The normal malloc() function returns a void pointer. Unfortunately,
VRAM cannot do that. It uses its own type of pointer: VPOI, which is
short for Virtual POInter. VPOI is a simple 32 bit unsigned long,
defined in 'CSVRAM.H'. The VPOI also limits the size of a VRAM
system to 4 Gb. ( Of course you can always use more than one VRAM...
)

There is another important difference between VRAM and a normal
heap. VRAM distinguishes between reading and writing. The buffer
system used cannot tell whether you are making changes. Therefore,
the programmer needs to supply that information by calling different
functions for reading and writing.

    Reading: char *R(VPOI p);
    Writing: char *W(VPOI p);

    // Example
    // Error checking omitted for conciseness.

    #include <string.h>             // strcpy()
    #include "CSVRAM.H"

    void main(void)
    {
        VRAM vr;                    // A VRAM system.
        VPOI vp;                    // A VRAM pointer.
        char *cp;                   // A normal character pointer.

        vr.define("VRAM.TST",614);  // Initially create it.
        vr.open("VRAM.TST",50);     // Opening with 50 Kb buffers.

        vp=vr.malloc(20);           // Allocate 20 bytes from the virtual heap.
        cp=vr.W(vp);                // Obtaining a character pointer to
                                    // the allocated space. We are planning to
                                    // write, so the 'W' function is used.
        strcpy(cp,"Some Data");     // Write data into it.

        vr.close();                 // Close the VRAM system.
                                    // "Some Data" is now on disk!
    }

From the above example it becomes clear how the VPOI pointers can be
used. The method is simple: convert them into normal pointers and
apply standard C++ programming techniques.

Only the last 2 converted VPOI pointers are guaranteed to be valid.

VRAM has a limited number of buffers, so you cannot expect all data
to be in RAM forever. Every time you convert a VPOI pointer into a
character pointer by using the W() or the R() function, VRAM
calculates the corresponding position in the file and loads the
required page into RAM. The pointer returned points directly into
this page. Because only the last two pages are guaranteed to be in
RAM under all circumstances, the third time you convert a VPOI
pointer, it can overwrite a previously loaded page. Because at least
two pointers are valid, you can copy data from one VRAM position to
another without using temporary storage.

With the W() function, the loaded page is marked 'dirty', which
makes sure it is written back to disk when the page is removed from
the buffer system.
This is not so for the R() function. In that case the page is simply
discarded.

    // Example, copying between two VPOI pointers.
    // Error checking omitted for conciseness.

    #include <string.h>             // strcpy(), memcpy()
    #include "CSVRAM.H"

    void main(void)
    {
        VRAM vr;
        VPOI vp1,vp2;

        vr.open("VRAM.TST",50);     // Opening with 50 Kb buffers.

        strcpy(vr.W(vp1=vr.malloc(20)),"Some Data");  // Allocate and fill
                                                      // one VPOI.
        vp2=vr.malloc(100);                           // Allocate a second.
        memcpy(vr.W(vp2),vr.R(vp1),20);               // Copy!

        vr.close();                 // Close the VRAM system.
    }

18.5 Fragmentation

Just as with an ordinary heap, VRAM can suffer from fragmentation.
The normal heap can become prematurely exhausted because of
fragmentation, while for the VRAM system it only means the file
becomes larger than strictly necessary. On the other hand: the
normal heap gets a fresh start every time the program is run, while
the VRAM files may be in use for years. Therefore a defrag()
function is available. If you decide to use it, it is best to use it
regularly.

It mainly does three things:

    a) Joining free space wherever possible. This is not done during
       normal operation because it may involve additional IO.

    b) Sorting the empty-data-chains. When space is needed, it is
       taken from the beginning of an empty-chain. After sorting the
       chains, the empty blocks at the beginning of the file will
       also be at the beginning of the chain. Eventually this leads
       to pages at the end becoming completely free and pages at the
       beginning (almost) full.

    c) Empty pages above the highest used location are stripped from
       the file.

The defrag() function links in a lot of code: it uses an entire
btree and a temporary file. In a way this makes the defrag()
function 'bigger' than the rest of the VRAM class combined!

18.6 Root

Under some circumstances you may need a 'starting point' in the
VRAM. Example: Let's say you are writing some flowcharting program
and you have decided that VRAM is a great help in storing and
manipulating a flowchart.
The flowchart probably consists of several independent parts linked
together by pointers. Once in it, each part can be reached by the
VPOIs stored in the data structure. This leaves you with just one
problem: where does the flowchart start? It takes just one VPOI to
store that location and it would be a shame if you needed an
additional configuration file for that. Therefore two very simple
functions are implemented to store and retrieve a 'special' VPOI.

void root(VPOI p);
    Stores VPOI 'p'.

VPOI root(void);
    Obtains the VPOI stored with the previous function.

These functions just manipulate this single VPOI. They have
absolutely no effect on the rest of the VRAM system.

18.7 Functions in Alphabetical order.

Prototypes are in 'csvram.h'. With the exception of the define(),
open() and zap() functions, the class needs to be opened for the
functions to work.

U16 alloc(VPOI p);
U16 alloc(void *p);
    Returns the number of allocated bytes at a certain location. The
    pointer may be either a VPOI pointer or a normal pointer to the
    same location.

int close(void);
    Closes the VRAM system.
    Returns TRUE on success, FALSE otherwise.

int define(char *name,U16 struclen);
    Creates the VRAM system 'name' with 'struclen' being the maximum
    size of any allocation.
    Returns TRUE on success, FALSE otherwise.

int defrag(void);
    Defragments the virtual heap.
    Returns TRUE on success, FALSE otherwise.

int empty(void);
    Makes the VRAM system empty. The class remains open but all
    allocations will be undone.
    Returns TRUE on success, FALSE otherwise.

U32 number(void);
    Returns the number of allocations currently done. This is the
    number of malloc()'s minus the number of free()'s.

int open(char *name,S16 kb_buf);
    Opens VRAM 'name' using 'kb_buf' Kb RAM for buffering.
    Returns TRUE on success, FALSE otherwise.

char *R(VPOI p);
    Converts a VPOI pointer into a character pointer. It is assumed
    no modifications are going to take place.

void root(VPOI p);
    Stores VPOI 'p'.

VPOI root(void);
    Obtains the VPOI stored with the previous function.

int save(void);
    Saves all buffered data to disk. All the buffers are flushed and
    the header page is updated.
    Returns TRUE on success, FALSE otherwise.

void free(VPOI p);
    Frees the VPOI p.

VPOI malloc(U16 size);
    Allocates 'size' bytes from the virtual heap. The corresponding
    VPOI is returned.

char *W(VPOI p);
    Converts a VPOI pointer into a character pointer. It is assumed
    modifications are going to take place.

int zap(void);
    Closes the VRAM system when needed and restores all class
    defaults.
    Returns TRUE on success, FALSE otherwise.


19 VBASE

19.1 Introduction

The use and purpose of the VBASE class are very similar to those of
the TBASE class. There is, however, one huge difference: VBASE
supports variable length records! The 'V' in VBASE stands for
'variable'.

Compared with TBASE, the differences in the public member functions
are minimal. The append() function now takes an additional parameter
indicating the length of the record. The same goes for the
write_rec() function. Apart from the 'normal' read_rec() function
there is now an additional read_rec() which returns the length of
the obtained record.

VBASE is a 'stand alone' class. It has nothing to do with the
databases produced by CSDBGEN.

19.2 Using VBASE.

Using VBASE is very straightforward.

    - Initially create the VBASE system by calling define().
    - Open it through a call to open().
    - Read, write and append records.
    - Close VBASE by calling close().

That's all!

    // Example
    // Error checking omitted for conciseness.

    #include <string.h>                 // strlen(), strcpy()
    #include "csvbase.h"

    void main(void)
    {
        VBASE vb;

        vb.relocate_when_shrunk(TRUE);  // Move the record to a better
                                        // fitting position when shrunk.
        vb.define("VBASE.dbf",1230);    // Maximum record length 1230 bytes.
        vb.open("VBASE.dbf", 200);      // Open with 200 Kb buffers.

        char *s="Some chunk of data. ";
        vb.append_rec(s,strlen(s)+1);   // Append a record. Notice the length
                                        // parameter which is not needed with
                                        // TBASE.
        char d[200];
        vb.read_rec(1,d);               // Read record 1 into array 'd'.

        strcpy(d,"New Data");
        vb.write_rec(1,d,strlen(d)+1);  // Overwrite record 1 with a new
                                        // block of data. This does not have
                                        // to have the same length!

        vb.close();                     // Ready. Close VBASE. Also called by
                                        // the class destructor if needed.
    }

For more information, please read the documentation on the TBASE
class (Chapter 15).

19.3 Relocating records

When an existing record is overwritten with a new, bigger record, it
no longer fits in its original slot, which means the record has to
be relocated. This is not necessarily so when the record shrinks. In
that case you have the choice between relocating, which saves disk
space but is relatively slow, and leaving the record where it is,
which wastes some disk space.

The function 'relocate_when_shrunk()' is there to choose between
these two strategies. Calling 'relocate_when_shrunk(TRUE);' will
relocate a record when it becomes smaller.
'relocate_when_shrunk(FALSE);' will leave the records in place when
possible. The default is relocate_when_shrunk(TRUE). The function
has to be called before 'define()' and its setting cannot be altered
afterwards.

19.4 Limitations.

VBASE was designed for databases up to around a million records.
This is not a 'hard' limit: it is possible to add many more records,
but under some unfavourable conditions memory utilization can get
out of control. The way the class uses the available RAM is
controlled by the open() function. Therefore, adding a huge number
of records in one go poses a problem. If the records are not
appended all at once but with several close/open sequences in
between, VBASE can easily store 16 million records.

So:
    - Under worst case conditions 1 million records.
    - Under favourable conditions 16 million records.
    - Avoid more than 16 million records.

The above limitations stem from RAM utilization.
For those drowning in memory, there are also software limitations:

    - maximum file size 4 Gb.
    - 4 billion records.

Because there are so many 'buts' and 'ifs', there is another class,
VBAXE, discussed in the next chapter, to deal with the larger
databases. As a rule of thumb, use VBASE for databases up to
1 million records and VBAXE for more than 1 million records.

19.5 Functions in alphabetical order.

The function prototypes are in csvbase.h.

U32 append_rec(void *data,U16 len);
    Appends a record to the database. 'data' is a pointer to the
    data and 'len' is the number of bytes of data.
    The function returns the number of the newly created record.

int close(void);
    Closes the database. All buffers are flushed and all allocated
    memory is freed.
    TRUE is returned on success, FALSE otherwise.

int define(char *name,U16 struclen);
    Creates a new database. 'name' is the name of the file and
    'struclen' is the maximum length of a record. Do not make
    'struclen' unnecessarily large because its value controls space
    efficiency. The maximum value of struclen is 32767.
    TRUE is returned on success, FALSE otherwise.

void delet(U32 record);
    Marks record 'record' for deletion. Only the 'delete bit' is
    set. The pack() function needs to be called to actually remove
    the record from the file.

void empty(void);
    Removes all records from the database. Upon return the database
    will contain zero records but will still be 'open'.

int is_delet(U32 record);
    Returns TRUE if record 'record' is marked for deletion, FALSE
    otherwise.

char *locate_rec(U32 rec);
char *locate_rec_d(U32 rec);
    Functions to return a pointer to record 'rec' directly into the
    buffer system. The returned pointer can be used to change the
    contents of a record but not the length. Please read paragraph
    15.8.2 about locating before using these functions.

U32 numvrec(void);
    Returns the number of records currently in the database.

int open(char *name,U16 kb_buf);
    Opens database 'name', using 'kb_buf' Kb RAM for buffering.
    Returns TRUE on success, FALSE otherwise.

int pack(void);
    Removes all records marked for deletion. A temporary file is
    used.
    TRUE is returned on success, FALSE otherwise.

void read_rec(U32 rec,void *ptr,U16 &length);
    Reads record 'rec' and copies it into the buffer 'ptr' is
    pointing at. The variable 'length' is set to the length of the
    retrieved record.

void read_rec(U32 pos,U16 maxlen,void *ptr,U16 &length);
    The same as the previous function but with an additional
    parameter 'maxlen' specifying the maximum number of bytes that
    can be copied into the buffer 'ptr'. If the record proves to be
    longer than 'maxlen', only 'maxlen' bytes will be copied to
    'ptr'.

U16 rec_len(U32 rec);
    Returns the length of record 'rec'.

void relocate_when_shrunk(int TrueOrFalse);
    When called with 'TrueOrFalse' set to TRUE, records will be
    relocated when shrunk. When called with FALSE the records will
    stay at the same place. The function has to be called before
    define(). For more information, please see the paragraph about
    this topic.

int save(void);
    As a precautionary measure, all 'dirty' buffers are written to
    disk and the header page is updated. The database remains open.
    Returns TRUE on success and FALSE otherwise.

void undelet(U32 rec);
    Removes the 'delete' marking from record 'rec'.

void write_rec(U32 rec,void *data,U16 len);
    Overwrites the existing record 'rec'. 'len' bytes are copied
    from 'data'. Afterwards the record will be of length 'len'.


20 VBAXE

20.1 Introduction

As explained in the previous chapter, VBAXE is similar to VBASE but
is intended for larger databases, that is, more than 1 million
records. The public member functions of the classes are 100%
identical. The inner workings, however, are completely different.
VBAXE uses two files for a database whereas VBASE uses only one.
VBAXE is built from two other classes, namely TBASE and VRAM.
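
To picture how a class can be built from a fixed-record table plus a
heap, here is a minimal in-memory sketch. It is not ComBits' code and
does not touch any file: 'TwoPartStore', 'IndexEntry' and 'numrec()'
are hypothetical names, a std::vector stands in for the role TBASE
plays and a std::string for the role VRAM plays.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model: one fixed-size index entry per record (the role
// TBASE plays) pointing into a byte heap (the role VRAM plays).
struct IndexEntry {
    std::size_t offset;  // stands in for the VRAM pointer
    std::size_t length;  // length of the variable-length record
};

class TwoPartStore {
public:
    // Appends a record and returns its 1-based record number.
    std::size_t append_rec(const std::string& data) {
        IndexEntry e = { heap_.size(), data.size() };
        heap_ += data;
        index_.push_back(e);
        return index_.size();
    }

    // Two lookups and no searching: the index entry first, then the heap.
    std::string read_rec(std::size_t rec) const {
        assert(rec >= 1 && rec <= index_.size());
        const IndexEntry& e = index_[rec - 1];
        return heap_.substr(e.offset, e.length);
    }

    // Length of a record, taken from the index entry alone.
    std::size_t rec_len(std::size_t rec) const {
        assert(rec >= 1 && rec <= index_.size());
        return index_[rec - 1].length;
    }

    // Number of records currently stored.
    std::size_t numrec() const { return index_.size(); }

private:
    std::vector<IndexEntry> index_;  // fixed-size entries, directly addressable
    std::string heap_;               // variable-length data, back to back
};
```

Because every index entry has the same size, the entry for record n
is found by arithmetic alone. The sketch leaves out deletion,
relocation and buffering, which the real classes have to handle.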

Building a class for variable length records is not easy, but
writing one that can store millions of records, is fast, uses little
RAM, doesn't use unnecessary disk space and still stores everything
in one file is next to impossible. So, rather than coming up with
something slow & clumsy, VBAXE gives up on storing everything in one
file.

20.2 Working.

The working of VBAXE is very simple. It allocates the necessary
space from VRAM and stores the VRAM pointer together with the length
in a TBASE record. E.g. to obtain record 714 it starts with
retrieving record 714 from TBASE. Because TBASE uses fixed size
records, the position of record 714 can easily be calculated. Once
this record is obtained, the VRAM pointer to the data of record 714
is known. From this pointer the position in the VRAM file can again
be easily calculated. If nothing is in the buffers, it takes two
IO's to obtain the data, but at least no searching is done. The
positions in the files are always known through simple arithmetic.
(Which, by the way, also holds for the VBASE class.)

20.3 Files

As explained above there are two files to every VBAXE database. The
TBASE part stores its data in a file with extension '.vbi'. The VRAM
part uses extension '.vbd'. If define() or open() is called with a
name which already has an extension, that extension will be removed.
The CSDIR utility recognizes these files and will display the TBASE
class as VBASEi and VRAM as VBASEd.

    // Example
    // Error checking omitted for conciseness.

    #include <stdlib.h>                     // random() (Borland)
    #include "csvbaxe.h"

    void main(void)
    {
        char buf[1000];
        VBAXE vb;

        vb.define("demo",390);              // Max record length 390 bytes.
        vb.open("demo",200);                // 200 Kb buffers.

        for(int i=1;i<=100;i++)
        {
            vb.append_rec(buf,1+random(390));  // Append 100 records
                                               // with random length
                                               // and random contents.
        }

        vb.close();                         // Close database.
    }

Afterwards CSDIR will display something like:

    Directory C:\BORLANDC\DEMO

    Name          Size  Type    Entries  Created      Updated
    --------------------------------------------------------------------------
    DEMO.VBI      4096  VBASEi      100  Dec 14 1994  Dec 14 1994
    DEMO.VBD     26624  VBASEd      100  Dec 14 1994  Dec 14 1994
    --------------------------------------------------------------------------
    Total: 30720 bytes in 2 files.

However, CSINFO will still say DEMO.vbi is a TBASE file and DEMO.vbd
a VRAM file.

20.4 Prototypes.

The class definition and its function prototypes are in "CSVBAXE.H".


Part Four

Part four discusses the low-level OLAY and DLAY classes. Basically,
these classes work as a normal sequential file but with two whopping
differences: insertions and deletions!! It also covers IBASE, which
implements a database class with the ability to insert and delete
records anywhere in its file.


21 OLAY

21.1 Introduction & Overview

The OLAY class performs the same functions as a normal sequential
file but with two major additions: insertions and deletions! This
makes it possible to insert or delete data anywhere in the file.
This is not done by copying the entire file, but by moving data
'around' inside.

It takes a while to realise the potential of such a system! It seems
to us that several very basic problems in computer programming are
related to the limitations of the file system. E.g. word processing
would be a lot easier if you could simply add or delete every
character/sentence directly in the file. In databases it would also
be a great help, making it possible to delete a record straight
away, instead of using a tag/pack technique.

The OLAY class encapsulates the standard file system and adds insert
& delete functionality. Still, through the public member functions,
the data will appear as a contiguous stream of bytes.

21.2 Buffering

Just like the (other) database classes in this library, the OLAY
class is derived from the PAGE class. This means it has a built-in
buffering system.

21.3 Performance

Due to its built-in buffering, the OLAY system normally outperforms
the traditional file system. However, the OLAY files are not 100%
full. It depends on the type of application, but 70% effectively
used disk space seems a typical value.

21.4 Core Functions

The features of the OLAY class are implemented through a small set
of functions called "core functions". Several more functions are
discussed in the remaining parts of the chapter but these functions
are not strictly necessary for using the OLAY class. They are merely
implemented for convenience. The core functions produce the smallest
and fastest code. This section will only discuss the core functions.
These functions are:

    int define()     To create an OLAY file.
    int open()       To open it.
    U32 read()       To read from it.
    int write()      To overwrite existing data.
    int delet()      To delete data.
    int insert()     To insert data.
    int append()     To append data.
    int close()      To close the file.
    U32 filesize()   To return the file size.
    U32 bottom()     To return the last position in the file.

The function prototypes are in "CSOLAY.H".

First a working example.

    // Error checking omitted for conciseness.

    #include "iostream.h"
    #include <string.h>             // strcpy(), strlen()
    #include "CSOLAY.H"

    void main(void)
    {
        char buf[100];              // A text buffer.
        OLAY db;                    // An instance of the OLAY class.

        db.define("demo.fil");      // Creating the file.
        db.open("demo.fil",100);    // Open the file.
                                    // Use 100 Kb RAM for buffering.

        strcpy(buf,"Some chunk of data");
        db.append(buf,strlen(buf)+1);  // The file is empty.
                                       // Append some data.
        db.insert(5," larger",7);      // Insert 7 bytes at position '5'.
                                       // (The first byte is at position '1'.)
        db.read(1,buf,db.filesize());  // Read everything back.
        cout<<buf;                     // Displays: Some larger chunk of data
    }

This program will create the file 'demo.fil' on your hard disk. Only
the OLAY class can make sense of it. E.g. the DOS command 'type'
will only produce garbage. The OLAY files have to be treated as any
other database file.
That is: use them with the application they belong to, nothing else.

21.4.1 Creating

The OLAY class requires a file to be explicitly created before it
can be opened.

int define(char *name);
    This creates the file 'name' on your hard disk and inserts the
    correct header block. If the file already exists, it is
    overwritten!
    The function returns TRUE on success and FALSE otherwise.

21.4.2 Opening

Before an OLAY file can be used it has to be opened. The open
function does not distinguish between reading, writing or appending.

int open(char *name,U16 kb_buf=30);
    This opens the file 'name' using 'kb_buf' Kb RAM for buffering.
    'kb_buf' has a default value of 30 Kb.
    The function returns TRUE on success and FALSE otherwise.

21.4.3 Reading and Writing

If you like, you can think of the OLAY class as a database with
records of only one byte. The first record/byte is, as always, at
position 1. Fortunately, we are not forced to read and write only
one byte at a time.

U32 read(U32 pos,void *p,U32 length);
    Reads 'length' bytes, starting from position 'pos'. The data is
    copied to pointer 'p', which should be pointing to a buffer
    large enough to hold 'length' bytes. If the end of file is
    reached before 'length' bytes are read, the copying process
    stops without an error.
    The function returns the number of bytes actually copied to 'p'.

int write(U32 pos,void *p,U32 length);
    This function writes, or to be more precise, overwrites 'length'
    bytes starting from position 'pos'. The data is copied from 'p'.
    The data already present is overwritten. write() cannot append
    data to the file. Trying to write more data than exists between
    'pos' and end-of-file is an error.
    The function returns TRUE on success, FALSE otherwise.

21.4.4 Insert & Delete

The beauty of the OLAY class lies in its ability to instantly insert
or delete data in/from its file. Inserting and deleting also implies
that the remaining data in the file changes position. E.g.
if you insert 10 bytes at position 5, the data which was originally
at position 120 is now at 130!

S32 delet(U32 pos,S32 length);
    Deletes 'length' bytes, starting from position 'pos'. The byte
    at position 'pos' itself is also deleted. Remember: the first
    byte is position 1. If 'length' is less than or equal to 0 the
    function returns 0. When an attempt is made to delete more data
    than is left in the file, all the remaining data will be
    deleted.
    The function returns the number of bytes actually deleted.

int insert(U32 pos,void *buffer,U32 len);
    The insert() function inserts 'len' bytes copied from 'buffer'
    at position 'pos'. The byte at position 'pos' itself is also
    moved. This means that a call to insert() with 'pos' equal to 1
    inserts new data before ALL other data.

    // Example
    // Error checking omitted for conciseness.
    // This program will display the string
    // 'Led Zeppelin' on the screen.

    #include <stdio.h>                   // puts()
    #include <string.h>                  // strcpy(), strlen()
    #include "CSOLAY.H"

    void main(void)
    {
        char buff[200];
        OLAY db;

        db.define("test.dbf");
        db.open("test.dbf",40);

        strcpy(buff,"Zeppelin");
        db.append(buff,strlen(buff)+1);   // Write terminating zero also.
        strcpy(buff,"Led ");
        db.insert(1,buff,strlen(buff));   // Insert before everything.

        db.read(1,buff,db.filesize());    // Read it all back.
        puts(buff);
    }

21.4.5 Filesize & bottom

U32 bottom(void);
    The function returns the position directly after the last byte
    in the file: the first 'free' position. If the OLAY file is
    empty, bottom() will return 1.

U32 filesize(void);
    This function returns the number of bytes in the OLAY file. This
    is not the size of the file on disk, but the number of bytes
    that can be read by the 'read()' function. This value is equal
    to bottom()-1.

21.4.6 Closing

The OLAY class does a lot of buffering. A close() function is needed
to save all the data to disk.

int close(void);
    Closes the class and the associated file. All buffers are
    flushed and all allocated memory is freed. When needed, the
    function is automatically called by the class destructor.
    The function returns TRUE on success and FALSE otherwise.

21.5 Additional functions

Next are some functions to 'make life easy'. They are not essential
for working with the OLAY class.

int writea(U32 pos,void *p,U32 len);
    The normal write() function cannot append data. This function
    can. It uses the write() function to overwrite existing data and
    calls the append() function when data has to be added to the
    file. It (over)writes 'len' bytes starting from position 'pos'.
    The data is copied from pointer 'p'.
    The function returns TRUE on success, FALSE otherwise.

    // Example
    // Error checking omitted for conciseness.

    #include "CSOLAY.H"

    void main(void)
    {
        OLAY db;
        int i=3;
        long l=4;

        db.define("example.dbf");      // Create an empty file.
        db.open("example.dbf",100);    // Use 100 Kb for buffering.

        db.append(&i,sizeof(int));     // Append 'i'. (2 bytes)
        db.writea(1,&l,sizeof(long));  // Overwrite 'i' with 'l'. (4 bytes)
                                       // The normal write() cannot do this,
                                       // because data has to be appended!
        db.close();
    }

int replace(U32 pos,U32 old_len,void *buffer,U32 new_len);
    This function makes it possible to replace a block of data with
    a new block of data of a different size. (The new block can be
    smaller or bigger.)
    'pos':     the position of the first byte which is to be
               replaced.
    'old_len': length of the chunk which needs to be replaced.
    'buffer':  pointer to the buffer which holds the new data.
    'new_len': number of bytes which have to replace the original
               'old_len' bytes.
    TRUE is returned on success, FALSE otherwise.

int inserta(U32 pos,void *p,U32 len);
    An inline function which calls insert() or append() depending on
    the value of 'pos'. Its purpose is to overcome a limitation of
    the basic insert() function, which cannot properly handle
    inserts beyond the end of the file (which of course are in fact
    appends). The function inserts or appends data at position
    'pos'. If 'pos' is equal to 'bottom()' it calls append(),
    otherwise insert(). 'len' bytes are copied from pointer 'p'.
    TRUE is returned on success, FALSE otherwise.
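
The position rules of the core and additional functions are easy to
get wrong by one, so a small model may help. The sketch below mimics
the documented behaviour in memory only, using a std::string. It is
not how OLAY stores or moves data, and 'StreamModel', 'bytes()' and
'led_zeppelin_demo()' are hypothetical names. Where the manual is
silent (e.g. a position beyond bottom()), the model simply returns
false; that is an assumption, not documented behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical in-memory model of the documented OLAY byte-stream
// semantics. Positions are 1-based, as in the manual.
class StreamModel {
public:
    std::size_t filesize() const { return bytes_.size(); }
    std::size_t bottom() const { return bytes_.size() + 1; }

    void append(const std::string& data) { bytes_ += data; }

    // Overwrite only: fails if the data would run past end-of-file.
    bool write(std::size_t pos, const std::string& data) {
        if (pos < 1 || pos + data.size() > bottom()) return false;
        bytes_.replace(pos - 1, data.size(), data);
        return true;
    }

    // Insert before the byte currently at 'pos'; that byte moves up.
    bool insert(std::size_t pos, const std::string& data) {
        if (pos < 1 || pos > filesize()) return false;  // assumption
        bytes_.insert(pos - 1, data);
        return true;
    }

    // Delete up to 'length' bytes from 'pos'; returns the count removed.
    long delet(std::size_t pos, long length) {
        if (length <= 0 || pos < 1 || pos > filesize()) return 0;
        std::size_t n = static_cast<std::size_t>(length);
        if (n > bottom() - pos) n = bottom() - pos;  // clamp to what is left
        bytes_.erase(pos - 1, n);
        return static_cast<long>(n);
    }

    // insert() inside the data, append() when pos == bottom().
    bool inserta(std::size_t pos, const std::string& data) {
        if (pos == bottom()) { append(data); return true; }
        return insert(pos, data);
    }

    // Overwrite what exists, append whatever does not fit.
    bool writea(std::size_t pos, const std::string& data) {
        if (pos < 1 || pos > bottom()) return false;  // assumption
        std::size_t room = bottom() - pos;
        std::size_t over = data.size() < room ? data.size() : room;
        bytes_.replace(pos - 1, over, data, 0, over);
        bytes_.append(data, over, std::string::npos);
        return true;
    }

    // Swap 'old_len' bytes at 'pos' for a block of any length.
    bool replace(std::size_t pos, std::size_t old_len, const std::string& data) {
        if (pos < 1 || pos + old_len > bottom()) return false;  // assumption
        bytes_.replace(pos - 1, old_len, data);
        return true;
    }

    const std::string& bytes() const { return bytes_; }

private:
    std::string bytes_;
};

// Replays the 'Led Zeppelin' example against the model.
void led_zeppelin_demo() {
    StreamModel db;
    db.append("Zeppelin");
    db.insert(1, "Led ");               // insert before everything
    assert(db.bytes() == "Led Zeppelin");
    assert(db.filesize() == db.bottom() - 1);
}
```

Calling led_zeppelin_demo() from a test program checks the model
against the example in 21.4.4. A model like this can also serve as a
reference when testing code that calls the real class.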

21.6 Import & Export

The OLAY class stores its data in a format that is not compatible
with anything else. Therefore two sets of functions are available to
convert to and from a normal sequential file.

int export_bin(char *name);
int export_asc(char *name);
int export(char *name,int bin_mode=TRUE);
    Exports all data to the file 'name'. 'name' will be a normal
    sequential file. If it already exists it will be overwritten. If
    not, it is created. The variable 'bin_mode' controls the mode in
    which 'name' is opened. 'bin_mode' equal to TRUE, as is the
    default, will open the export file in binary mode. A value of
    FALSE opens it in ascii mode. In addition, two inline functions
    are defined, export_bin() and export_asc(), which call export()
    with 'bin_mode' set to TRUE and FALSE respectively. The status
    of the OLAY class will remain unchanged.
    TRUE is returned on success, FALSE otherwise.

int import_asc(char *name);
int import_bin(char *name);
int import(char *name,int bin_mode=TRUE);
    The import function appends the data from the file 'name' to the
    current OLAY system. The file 'name' can be opened in binary or
    in ascii mode, controlled by the parameter 'bin_mode'.
    'bin_mode' equal to TRUE, as is the default, will open the
    import file in binary mode. A value of FALSE opens it in ascii
    mode. Two inline functions are defined, import_asc() and
    import_bin(), which call the import() function with 'bin_mode'
    set to FALSE and TRUE respectively.
    TRUE is returned on success, FALSE otherwise.

    // Example
    // Error checking omitted for conciseness.

    OLAY db;

    db.define("example.bin");       // Start off with a new file.
                                    // (Not a prerequisite for applying
                                    // the import() function.)
    db.open("example.bin",300);     // Open the file.
    db.import_bin("somefile.bin");  // Load it with the data from
                                    // 'somefile.bin', assuming this
                                    // exists.
    db.close();                     // Close the OLAY class.
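
The contract described above (export overwrites the target file,
import appends to what is already there, and 'bin_mode' selects
binary or text mode) can be sketched with the standard C++ file
streams. This is only an illustration under assumptions: 'ByteStream',
'export_sketch()' and 'import_sketch()' are hypothetical names, a
std::string stands in for the contents of an OLAY file, and nothing
here reads or writes the real OLAY format.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical stand-in for the bytes held by an OLAY file.
typedef std::string ByteStream;

// Write all bytes to a normal sequential file, replacing any existing
// file of that name. The stream itself is left unchanged.
bool export_sketch(const ByteStream& data, const std::string& name,
                   bool bin_mode = true)
{
    assert(!name.empty());
    std::ios_base::openmode mode = std::ios_base::out | std::ios_base::trunc;
    if (bin_mode) mode |= std::ios_base::binary;   // otherwise text mode
    std::ofstream out(name.c_str(), mode);
    if (!out) return false;
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
    return out.good();
}

// Append the contents of a normal sequential file to the stream.
// Existing bytes are kept; the new data lands at the bottom.
bool import_sketch(ByteStream& data, const std::string& name,
                   bool bin_mode = true)
{
    assert(!name.empty());
    std::ios_base::openmode mode = std::ios_base::in;
    if (bin_mode) mode |= std::ios_base::binary;   // otherwise text mode
    std::ifstream in(name.c_str(), mode);
    if (!in) return false;
    data.append(std::istreambuf_iterator<char>(in),
                std::istreambuf_iterator<char>());
    return true;
}
```

In text mode some platforms translate line endings on the way in and
out, so a file exported in one mode and imported in the other need
not come back byte for byte. That is why binary mode is the safer
choice for anything that is not plain text.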
21.7 Sequential functions

Completely independent of the functions discussed so far, another set of
functions is implemented which follows, as closely as possible, the
standards set by the ANSI committee. Again, these functions are not
strictly necessary to use the OLAY class, but they have some advantages
when traversing a file sequentially. Therefore they are called 'sequential
functions'.

These functions are built around a 'file pointer'. This file pointer is
automatically moved forward by the amount of data read or written.
The OLAY class itself doesn't use a file pointer; it has no need for it.
To implement the sequential functions, a file pointer is simulated by a
variable. This variable is referred to as 'VFP', which is short for
'virtual file pointer'.
However, the OLAY class is capable of using the VFP to optimize the process
of locating a particular byte. (E.g. it is easier for the OLAY class to
locate position 'p' if it already knows where position 'p-1' is.)

Only the sequential functions use the VFP. All the other functions DO NOT
use it and consequently don't update it! Because of this, it's probably
best not to mix the sequential functions with the others. But if you do,
you have to reposition the VFP with a call to the fseek() function every
time you have called a function which does not belong to the set of
sequential functions!

The sequential functions are:

    int   fseek()     // To position the VFP.
    long  ftell()     // To return the position in the file.
    int   feof()      // To test for end-of-file.
    int   fgetc()     // To read a character.
    int   fputc()     // To write a character.
    int   fread()     // To read blocks of data.
    int   fwrite()    // To write blocks of data.
    char *fgets()     // To read strings.
    int   fputs()     // To write strings.
    long  fdelete()   // To delete data.
    int   finsert()   // To insert data.
    void  fflush()    // To flush the buffers.

There are no sequential functions for opening, closing or creating the file.
You still have to use the 'normal' open(), close() and define() functions
for that.

21.7.1 Sequential functions in alphabetical order

long fdelete(long amount);

It works exactly like the 'delet()' function, except that the position from
which the data is deleted is controlled by the virtual file pointer. The
first byte deleted is the one pointed at by the VFP. The VFP remains
unchanged.
The function returns the number of bytes actually deleted.

int feof(void);

Inline function which returns TRUE if the VFP is beyond the last byte, and
FALSE otherwise. Because moving the VFP beyond 'bottom()' produces a
runtime error, this can only mean that the VFP points exactly to
'bottom()'.

    // Example
    // Error checking omitted for conciseness.
    OLAY db;
    char c;

    db.open("example.dbf");  // Accept the default 30 Kb for buffering.
    db.fseek(0);             // Don't assume the VFP is set.
    while(!db.feof())        // Read until the end.
    {
       c=db.fgetc();
       putchar(c);
    }
    db.close();              // If omitted, called by the class destructor.

void fflush(void);

All dirty buffers are written back to disk. It is important to understand
that this is all it does. Afterwards the file on disk is still in an
undefined state. Only the close() function produces a disk file which is
valid input for the next call to open().
Mainly implemented for completeness.

int fgetc(void);

The function reads an unsigned char and returns this as an integer. If the
end of the file is reached, fgetc() returns -1. NOT EOF.

char *fgets(char *str,int num);

The function reads up to 'num'-1 characters and copies them to 'str'. Bytes
are read until a newline character is encountered or end-of-file is
reached. Upon success 'str' is returned, otherwise a NULL pointer is
returned. The VFP is increased by the number of bytes read.

int finsert(void *buf,long amount);

It works exactly like the 'insert()' function, except that the position at
which the data is inserted is controlled by the virtual file pointer.
The VFP will be increased by the number of bytes inserted.
TRUE is returned on success, FALSE otherwise.

int fputc(int character);

The function accepts an integer which is converted into an unsigned
character and written to the position indicated by the VFP. (So, only one
byte is written.) If the VFP points beyond the last byte the character is
appended, otherwise it overwrites the existing value. The VFP is increased
by one byte.
The return value is the value written.

int fputs(char *str);

It is an inline function which calls fwrite(). The contents of string 'str'
are written to disk. The terminating zero however is NOT written. The VFP
is increased by strlen(str).
TRUE is returned on success, FALSE otherwise.

int fread(void *buf,int size,int count);

It reads 'count' blocks, each of size 'size'. The data is copied into
'buf'. The function returns the number of blocks actually read. This
differs from 'count' when an error has occurred or the end of file was
reached. The total amount of data read can exceed 64Kb.
The VFP is increased by the amount of data copied to 'buf'.

int fseek(long offset,int origin=SEEK_SET);

Its purpose is to position the VFP. Depending on the value of 'origin',
'offset' is taken from:

  a) the beginning;         origin=SEEK_SET
  b) the current position;  origin=SEEK_CUR
  c) the end;               origin=SEEK_END

Fseek() has a default value for 'origin' of SEEK_SET.
Note that 'offset' indeed means offset and NOT position. Fseek(2) makes the
VFP point to the third byte!
The next two examples both read the first 10 bytes.

    Example 1:
       db.fseek(0);
       db.fread(buffer,10,1);

    Example 2:
       db.read(1,buffer,10);

Fseek may be used to move the VFP one byte beyond the end. This makes the
VFP point precisely to 'bottom()'. Trying to move beyond that is an error.
Also, the VFP cannot be positioned before the beginning of the file.
When 'origin' is set to SEEK_END the value of 'offset' needs to be positive.
Fseek(0,SEEK_END) makes the VFP point to 'bottom()'.
TRUE is returned on success, FALSE otherwise.

long ftell(void);

It is an inline function which returns the number of bytes the VFP is
removed from the beginning of the file. E.g.: if the VFP points to the very
first byte, ftell() will return zero.

int fwrite(void *buf,int size,int count);

It writes 'count' blocks, each of size 'size'. The function returns the
number of blocks actually written. This differs from 'count' only when an
error has occurred. The total amount of data written can exceed 64Kb.
The VFP is increased by the amount of data written to disk.

21.7.2 Miscellaneous functions

int already_open(void);

This function returns 1 if the class is 'open' and 0 otherwise.

int data_2_header(void * ptr,U16 length);

Inherited function.

int empty(void);

Makes the file empty. The class needs to be open and will still be open
afterwards. Upon return the system will contain 0 bytes of data and
bottom() points to 1.
TRUE is returned on success, FALSE otherwise.

int header_2_data(void * ptr,U16 length);

Inherited function.

void header_page_size(U16 n);

Inherited function.

U16 max_data_in_header(void);

Inherited function.

int pack(void);

After a long series of inserts and/or deletes the data in the OLAY file can
become scattered. Some pages will contain relatively little data while
other pages are still 100% filled. To put everything back in order it is
advisable (although not strictly necessary) to call the pack() function
once in a while.
The pack() function uses a temporary file. This file will be about the same
size as the OLAY file. So, make sure sufficient free disk space is
available. If the free space is inadequate, the function will return 0, the
temporary file is removed, and the OLAY file will be unaltered.

void page_size(U16 t);

Inherited function.

22 DLAY

The DLAY class performs the same functions as the previously discussed OLAY
class. DLAY however, is meant for far larger files.
Both classes need a complex data structure to locate a specific byte in the
file. The OLAY class keeps this data structure in ram while DLAY stores it
on disk, in the same file it uses for the data. As a consequence DLAY can
handle files of 'unlimited' size whereas OLAY files should be kept below
5 Mb. Because it has its data structure in ram, OLAY is somewhat faster
than DLAY.
So:

  - Use OLAY for files below 5 Mb.
  - Use DLAY for files above 5 Mb.
  - If you are short on ram, use DLAY.
  - If you are not sure, use DLAY.

22.1 Performance

DLAY performs quite well. To test its usefulness as a 'data structure' for
an editor or word processor, the class has been tested on a 100 Mb file.
No matter the typing speed, the DLAY class was able to individually insert
every typed character in the middle of its file!
(Tests done on a 486DX2 66Mhz.)

22.2 Member functions

The public member functions of the DLAY class are 100% identical to those
of the OLAY class. Please refer to the documentation of the OLAY class for
more details.
The function prototypes are in: CSDLAY.H.

23 IBASE

23.1 Introduction

IBASE is a class similar to TBASE. That is: an easy to use class for
reading and writing records, without indexes. The 'I' in IBASE stands for
'insert'. Contrary to TBASE, which can only append a record, IBASE can
insert a record anywhere in its file.
The IBASE class is derived from DLAY, which explains why it is in this
section of the documentation. If you have a file-system which can insert
and delete, then a database system which can insert and delete records is
all of a sudden easy to implement! Deleting records no longer requires the
dreaded tag/pack sequence, but can be accomplished instantaneously.

23.2 Using IBASE

Using IBASE very much follows the same lines as using TBASE. Deleting is
now instantaneous, and records can be inserted.
The performance of IBASE is nowhere near that of TBASE. If speed is an
issue, you should consider using TBASE.
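
Why an insertable byte store makes record insertion and deletion easy can
be shown with a toy model. The sketch below is an illustration only: the
class 'ToyRecords' and its member names are hypothetical, the store is a
plain in-memory string instead of a DLAY file, and it assumes fixed-length
records numbered from 1.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical toy: fixed-length records kept in one insertable byte string.
// Record 'rec' (1-based) simply occupies bytes [(rec-1)*reclen, rec*reclen).
class ToyRecords {
    std::string bytes_;
    std::size_t reclen_;

    // 0-based byte offset of the 1-based record number 'rec'.
    std::size_t offset(long rec) const {
        return static_cast<std::size_t>(rec - 1) * reclen_;
    }
public:
    explicit ToyRecords(std::size_t reclen) : reclen_(reclen) {}

    long numrec() const { return static_cast<long>(bytes_.size() / reclen_); }

    // Append one record; returns the new record number.
    long append_rec(const char *data) {
        bytes_.append(data, reclen_);
        return numrec();
    }

    // Insert before record 'rec': a single byte-level insert, no index needed.
    bool insert_before(long rec, const char *data) {
        if (rec < 1 || rec > numrec()) return false;
        bytes_.insert(offset(rec), data, reclen_);
        return true;
    }

    // Delete record 'rec': later records move up at once, no tag/pack step.
    bool delete_rec(long rec) {
        if (rec < 1 || rec > numrec()) return false;
        bytes_.erase(offset(rec), reclen_);
        return true;
    }

    std::string read_rec(long rec) const {
        return bytes_.substr(offset(rec), reclen_);
    }
};
```

In the model every record operation maps onto exactly one insert or delete
on the byte store, which is the point the introduction makes about building
on DLAY.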
23.3 Using IBASE

IBASE works very much like TBASE. Please read the documentation on TBASE
(chapter 15) also. Because the classes are so much alike, IBASE will be
discussed in far less detail.

23.3.1 Creating

int define(CSCHAR *name,U16 reclen);

Creates the IBASE file 'name' for records of length 'reclen'.
TRUE is returned on success, FALSE otherwise.

23.3.2 Opening

int open(CSCHAR *name,S16 kb=32);

Opens the IBASE file 'name' for use. It will use at most 'kb' Kb of ram for
buffering. 'kb' has a default value of 32.
TRUE is returned on success, FALSE otherwise.

int open(void);

This function returns TRUE if the class is already opened, FALSE otherwise.

23.3.3 Appending Records

S32 append_rec(void *data);

Appends record 'data' to the database. The function returns the record
number of the newly added record. E.g. if a record is added to an empty
database the function will return 1. Don't forget: the first record is
record '1'.

S32 append_rec(void);

Extends the database with one record. Contrary to the previous function
this one doesn't fill the new record with data. For the time being, the
record will contain garbage.

23.3.4 Reading

Contrary to TBASE, IBASE doesn't have 'locate' functions. The reason is
that a record can now be scattered over two (or even more) different
database pages. When such a record is in the buffer system it no longer
lies at contiguous memory addresses, making it impossible to access the
record through a single pointer.

void read_rec( S32 rec, void *data);

Reads record 'rec' into 'data'. 'data' should be a buffer large enough to
hold the record.

23.3.5 Writing

void write_rec( S32 rec, void *data);

Overwrites the existing record 'rec' with the record pointed to by 'data'.
Record 'rec' must already exist. The function cannot be used to add
records.

23.3.6 Inserting

Inserting a record means just that. A record is inserted between two
existing records.
Let's say you already have a record '2' and a record '3' but want a new
record in between, because that's where it should be according to the
alphabet. The traditional approach is to add the record at the end of the
database and to maintain an index for the alphabetical order. But now,
thanks to the DLAY class, we can do without the index. The record can be
inserted directly at its correct position.

int insert_rec_b(S32 rec,void *p);

Inserts a new record before record 'rec'. That is: the new record will
become record number 'rec' and the old record 'rec' becomes record number
'rec'+1. The pointer 'p' points to the data of the new record.
TRUE is returned on success, FALSE otherwise.

int insert_rec_a(S32 rec,void *p);

Inserts a new record after record 'rec'. That is: the new record will
become record number 'rec'+1. The old record 'rec' will stay in its place.
The pointer 'p' points to the data of the new record.
TRUE is returned on success, FALSE otherwise.

23.3.7 Deleting

Deleting records is now done instantaneously. E.g. the moment you delete
record '8', the old record '9' will become the new record '8' and so on.
In IBASE, there is no such thing as a 'delete bit'.

void delet(S32 rec);

Deletes record 'rec'. No subsequent 'pack()' is needed.

23.3.8 Closing

int close(void);

Closes the database. All buffers are written back to disk and all allocated
memory is freed.
TRUE is returned on success, FALSE otherwise.

23.3.9 Miscellaneous functions

For the next functions to work properly, the database has to be opened.

U16 lengthrec(void);

Returns the length of a record. This is the same value as used in the call
to the define() function, when the database was created.

S32 numrec(void);

Returns the number of records currently in the database.

int pack(void);

This is the pack() function inherited from the DLAY class! This function
has nothing to do with deleting records. Its purpose is to compress the
file DLAY uses to store its data.
TRUE is returned on success, FALSE otherwise.

int empty(void);

Makes the database empty. Upon function return the database will contain
zero records but the database is still open.
TRUE is returned on success, FALSE otherwise.


Part Five

Part Five presents some command-line utilities, most notably CSDIR, which
gives a quick survey of the databases in the current directory. It also
discusses the demonstration application CSADD, a DOS application to store
addresses.

24 CSDIR

CSDIR is a command-line utility similar to the well-known MS-DOS dir. Its
purpose is to list the CS-databases. By default it ignores all other files.

SYNTAX: csdir [filename] [/A] [/?]

  filename: the file(s) to be listed. Wildcards allowed.
  /A        List all files.
  /?        Display help.

Example of its output:

c:\bin\address>csdir

Directory C:\BIN\ADDRESS\

Name            Size  Type    Entries  Created      Updated
--------------------------------------------------------------------------
CSADR.DBF      98382  TBASE       298  Sep 20 1994  Oct 31 1994
CSADR01.IDX    40960  BTREEa      403  Oct 29 1994  Oct 31 1994
CSADR02.IDX    10752  BTREEa      104  Oct 29 1994  Oct 31 1994
CSADR03.IDX     4608  BTREEl       28  Oct 29 1994  Oct 31 1994
CSADR04.IDX     5120  BTREEa       22  Oct 29 1994  Oct 31 1994
--------------------------------------------------------------------------
Total:        159822 bytes in 5 files.

As can be seen from this example, CSDIR displays:

  - the name of the class involved.
  - the number of entries in the database.
  - in case of a btree, the number of different keys. If the same key is
    entered twice, it is counted as one entry.
  - date of creation.
  - date of last update.

Example of the /a option.
c:\bin\adres>csdir /a

Directory C:\BIN\ADRES\

Name            Size  Type    Entries  Created      Updated
--------------------------------------------------------------------------
ADRES.EXE     137872  DOS              Oct 29 1994
CSDEMIO.DEF      277  DOS              Apr 17 1994
BACKUP.TXT     34478  DOS              Oct 29 1994
CSADR.DBF      98382  TBASE       298  Sep 20 1994  Oct 31 1994
CSADR01.IDX    40960  BTREEa      403  Oct 29 1994  Oct 31 1994
CSADR02.IDX    10752  BTREEa      104  Oct 29 1994  Oct 31 1994
CSADR03.IDX     4608  BTREEl       28  Oct 29 1994  Oct 31 1994
CSADR04.IDX     5120  BTREEa       22  Oct 29 1994  Oct 31 1994
ERROR.ERR      12964  DOS              Oct 27 1994
--------------------------------------------------------------------------
Database files:   159822 bytes in   5 files.
Other files:      185591 bytes in   4 files.
                --------   +      ---   +
Total:            345413 bytes in   9 files.

Another example:

c:\bin\adres>csdir cs*.* /a

Directory C:\BIN\ADRES\

Name            Size  Type    Entries  Created      Updated
--------------------------------------------------------------------------
CSDEMIO.DEF      277  DOS              Apr 17 1994
CSADR.DBF      98382  TBASE       298  Sep 20 1994  Oct 31 1994
CSADR01.IDX    40960  BTREEa      403  Oct 29 1994  Oct 31 1994
CSADR02.IDX    10752  BTREEa      104  Oct 29 1994  Oct 31 1994
CSADR03.IDX     4608  BTREEl       28  Oct 29 1994  Oct 31 1994
CSADR04.IDX     5120  BTREEa       22  Oct 29 1994  Oct 31 1994
--------------------------------------------------------------------------
Database files:   159822 bytes in   5 files.
Other files:         277 bytes in   1 files.
                --------   +      ---   +
Total:            160099 bytes in   6 files.

25 CSINFO

CSINFO is a command-line utility to display information about a particular
database. It only recognizes the databases made with the CSDB-library.

An example of its output:

c:\adres>csinfo csadr01.idx

Information about database: csadr01.idx.

Type..................: BTREEa
Version...............: 1.1.b
Class compiled at.....: Apr 25 1994, 04:28:24
With..................: Borland C++ 3.1

NOTE: The above information refers to the version of the class
      used during the CREATION of the database file.
Btree created at......: September 20 1994, 10:02:11,47
Btree last updated at.: September 26 1994, 23:25:19,96
Multiple keys allowed.: YES
Number of keys........: 622
Number of blocks......: 111
Block size............: 511 bytes
Key size..............: 41 bytes
Data size.............: 4 bytes
Data degree...........: 10
Index degree..........: 10
Number of levels......: 4

26 CSERROR

Normally all the errors are read from the file 'error.err'. It has to be in
the current working directory or it cannot be found.
Using a runtime error file produces smaller executables because the error
messages are not linked in. However, the error file is not kept open all
the time, and opening a file requires some dynamic memory allocations. This
can lead to problems when the error message that has to be displayed
results from an 'out of memory' condition. (It needs memory to say 'there
is no more memory'.)

To overcome this, and other problems, the command-line utility CSERROR can
be used. It generates a C source file which, when compiled and linked in,
makes the runtime error file redundant.

Example:

c:\borlandc>cserror error.err

This will produce a file 'error.cpp' in the current directory. Compile this
and link it in with the rest of your application. Make sure it's linked in
before the libraries. In this way the csmess_read() function, which is in
'error.obj', will replace the one in the library.

    // Example of how the resulting 'error.cpp' file could look:
    // Many errors are left out.
    #include <stdlib.h>   // ltoa()
    #include <string.h>   // strstr(), strcmp()
    #include "csmess.h"

    char *_csa_error[]=
    {
    "Error 9370: TBASE: %s Can't write report file %s. Disk full?",
    "Error 9390: TBASE: %s Out of memory during pack().",
    "Fatal Error 9545: PAGE: %s Header_2_data(): can't perform fseek.",
    "Fatal Error 9550: PAGE: %s Write_header: can't perform fwrite.",
    "Fatal Error 9555: PAGE: %s Header_2_data(): can't perform fread.",
    "Fatal Error 9560: PAGE: %s Can't open file during definition.",
    "Error 9562: PAGE: %s Can't open report file %s.",
    "TheEnd" //THIS HAS TO BE THE LAST LINE!!
    };

    /////////////////////////////////////////////////////////////////////

    char *csmess_read(long error)
    {
       char tmp[25];
       ltoa(error,tmp,10);             // Error number as a string.
       char **p=_csa_error;
       for(;;)
       {
          if(strstr(*p,tmp)) return *p;           // Message found.
          if(!strcmp(*p,"TheEnd")) return NULL;   // End of table reached.
          p++;
       }
    }

Notice the 'TheEnd' line, which was not in the original 'error.err' file.
Never remove that line!

27 CS4DBASE

27.1 Introduction

The database classes generated by CSDBGEN cannot import a dBASE file
directly. However, they do have a function to import an ASCII file. This
function is called import(), and it expects its input file to be in a
specific format.
CS4DBASE is a command-line utility which converts a dBASE file into the
format required by the import() function. The generated intermediate ascii
file can be manually edited to adjust field names.

27.2 Converting

Example:

c:\bin\dbase> cs4dbase person.dbf

Assuming there is a dBASE file 'person.dbf', this will produce the ascii
file 'person.txt'. At the top of the file are some lines indicating field
names and types. The import() function needs this information to correctly
parse the remainder of the file.
In the example ('person.txt') it looked like this:

    Class:  CONV
    Record: CONV_rec
    File:   conver.dbf
    field:  NAME     s 40
    field:  ADRE     s 32
    field:  CITY     s 35
    field:  UPDATED  d

    Museum Langeveld   Langevelderweg 27     Noordwijkerhout   1994/02/18
    The Truck Giant    Goeverneurlaan 471    Den Haag          1993/02/16

The lines before the first record resemble a 'database definition file'.
The import() function ignores the first three lines but it uses the 'field'
lines.
They have to match the names of the fields in the database definition file
which generated the import() function, or these fields will be skipped.
If these names do not match, it may be necessary to manually edit the field
names in the ASCII file. The order of the fields never matters.

27.3 Example

Suppose the next database definition file, 'person.def', is used to
generate the class PERSON.

    Class:  PERSON
    Record: PERSON_rec
    File:   pers.dbf
    field:  NAME     s 40  T
    field:  HOBBY    s 32  Y
    field:  PHONE    s 15
    field:  UPDATED  d MDY2

From this you created a PERSON class by calling CSDBGEN.

c:\test> CSDBGEN person.def

Now you want to import some data from an old dBASE database called
'member.dbf'. However, 'member.dbf' has different fields. The dBASE
'display structure' command reveals something like this:

    Structure for database : C:member.dbf
    Number of data records :       3
    Date of last update    : 04/19/95
    Field  Field name  Type       Width    Dec
        1  MEMBER      Character     30
        2  TELEPHONE   Character     13
        3  INTERESTS   Character     50
    ** Total **                      94

Despite the differences, call CS4DBASE.

c:\test>CS4DBASE member.dbf

This produces the file 'member.txt'.

    Class:  CONV
    Record: CONV_rec
    File:   conver.dbf
    field:  MEMBER     s 30
    field:  TELEPHONE  s 13
    field:  INTERESTS  s 50

    Rudolf Mandrake    0592-24-2379    Ancient Building Techniques
    Rachel Labrosse    913-814-1378    Extraterrestrial Encounters
    Victor Plauger     312-241-2808    Home Gardening

It doesn't make much sense to import this file as it is. All the field
names are different, so the import() function will skip each and every
line. To change that, the file (saved here as 'import.txt') has to be
edited to make the field names match those in 'person.def'. After editing
it should resemble something like:

    Class:  CONV
    Record: CONV_rec
    File:   conver.dbf
    field:  NAME   s 30
    field:  PHONE  s 13
    field:  HOBBY  s 50

    Rudolf Mandrake    0592-24-2379    Ancient Building Techniques
    Rachel Labrosse    913-814-1378    Extraterrestrial Encounters
    Victor Plauger     312-241-2808    Home Gardening

Note that the order of the fields is still not the same.
However, import() is smart enough to overcome that obstacle.
Now, 'import.txt' can be safely loaded. The following code could be used
for that.

    #include "person.h"

    void main(void)
    {
       PERSON pers;

       // pers.define();   // Uncomment this to create the database.
       pers.open();
       pers.import("import.txt");
       pers.close();
    }

27.4 Importing large databases

As can be seen from the previous example, it is sometimes required to load
the entire ASCII file in an editor to change the field names. This is fine
for small databases, but when the number of records increases the ASCII
file can become too large.
Solution:

c:\test>CS4DBASE member.dbf /2

Note the '/2' at the end of the command. This instructs CS4DBASE to
generate two files instead of one, namely 'member.def' and 'member.txt'.
'member.def' will contain:

    Class:  CONV
    Record: CONV_rec
    File:   conver.dbf
    field:  MEMBER     s 30
    field:  TELEPHONE  s 13
    field:  INTERESTS  s 50

and 'member.txt':

    Rudolf Mandrake    0592-24-2379    Ancient Building Techniques
    Rachel Labrosse    913-814-1378    Extraterrestrial Encounters
    Victor Plauger     312-241-2808    Home Gardening

The only things which need editing are the field names, and they are
conveniently placed together in one small ASCII file!
When the editing is done, the two files have to be joined again to create
valid input for the import() function. Like this:

c:\test>copy member.def+member.txt member.asc

'member.asc' can now be read by the import() function!


Part Six

Part Six discusses the classes and functions implemented in the
CSA-library. They have nothing to do with databases, so you can use or
ignore them at will. However, two chapters may deserve some attention:
alloc-logging, which deals with heap corruption and memory leaks, and the
HEAP class, for efficiently allocating large numbers of small blocks.

28 CSTOOLS

28.1 Introduction

Cstools is a collection of odds & ends, merely intended to support the
other classes, but if you see something to your liking, please feel free to
use it.
The function prototypes are in cstools.h.

int add_path(char *filen,char *path);

Adds path 'path' to filename 'filen'. Afterwards 'filen' contains the new
name. It returns TRUE if successful, FALSE otherwise.

int csrand(int amount);
U32 csrand(U32 amount);

Returns a random number in the range 0..(amount-1), including both 0 and
(amount-1).

int cstmpname(char *name);

Generates the name of a non-existing file in the 'temp' directory. It first
searches for the environment variable 'TMP' and, if that is not found, for
'TEMP'. A filename is generated which does not already exist in this
directory. The function has to be called with a parameter 'name' pointing
to a buffer large enough to hold the complete drive, path and filename. If
none of the environment variables exist, a filename for the current
directory is produced. It only generates a filename; no file is actually
created.
The function returns TRUE if a unique filename was found, FALSE otherwise.

int disk(char *s);

Sets the current drive and path as indicated by string 's'. If 's' is the
empty string, 's' is set to the current drive and path!
It returns TRUE if successful, FALSE otherwise.

void empty_kb(void);

Empties the keyboard buffer.

int file_exist(char *fnaam);

Returns TRUE if file 'fnaam' exists.

char *file_ext(char *name,char *ext);

Adds an extension to filename 'name'. It returns a pointer to an internal
buffer which contains the new name. If 'name' already has an extension, it
is overwritten. The string 'name' itself is not changed!

long filesize(char *name);

Returns the size of file 'name'.

void filter_string(char *source,char *allowed);

All the characters in 'source' which are not in 'allowed' are removed from
'source'.

void lower_upper(char *ptr);

Converts the entire string to upper case.

long lrandom(long amount);

Returns a long random number in the range 0..(amount-1), including both 0
and (amount-1).

size_t next_prime(size_t pri);

Calculates the next higher prime number.
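
What 'next higher prime' could mean in practice is sketched below. This is
a hypothetical stand-in, not the code behind next_prime(): the names
'is_prime_sketch' and 'next_prime_sketch' are made up, plain trial division
is used, and the result is assumed to be strictly greater than the
argument.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: trial division by 2 and by odd numbers up to sqrt(n).
static bool is_prime_sketch(std::size_t n)
{
    if (n < 2) return false;
    if (n % 2 == 0) return n == 2;
    for (std::size_t d = 3; d <= n / d; d += 2)   // d <= n/d avoids overflow
        if (n % d == 0) return false;
    return true;
}

// Assumed behaviour: the smallest prime strictly greater than 'pri'.
// (No guard against wrapping around at the maximum size_t value.)
std::size_t next_prime_sketch(std::size_t pri)
{
    std::size_t candidate = pri + 1;
    while (!is_prime_sketch(candidate)) ++candidate;
    return candidate;
}
```

A typical use for such a function is choosing a prime table size slightly
above a requested capacity.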
char *notabs(char *s);

Replaces every occurrence of a tab character in 's' with a single space.
That is: tabs are not expanded, but simply replaced.

char *remove_space(char *s);

Removes ALL the blanks from the string 's'. It returns character pointer
's'.

unsigned int sqrti(unsigned int n);
unsigned long sqrtl(unsigned long n);

Calculates the square root of 'n' WITHOUT USING FLOATING POINT arithmetic.

void str_split(char *source,char ch,char *first,char *last);

Splits string 'source' at the first occurrence of character 'ch'. 'Ch' is
included in neither the 'first' nor the 'last' string.

void str_strip(char *source,char *remove);

Characters in 'source' which are also in 'remove' are removed from
'source'.

int str_equal(char *s1, char *s2);

Returns TRUE if 's1' is equal to 's2', discriminating between upper and
lower case.

void str_left(char *source,char *dest,int len);

Copies at most 'len' characters from the left of 'source' to 'dest'.

char *string_replace_ones(char *source,char *d,char *r);

Replaces the first occurrence of 'd' in 'source' with 'r'.

int string_replace(char *s,char *d,char *r);

Replaces every occurrence of 'd' in 's' with 'r'. It returns the number of
substitutions made.

long time_stamp(void);

Returns a higher long number on each successive call, starting with zero
again when MAXLONG is reached.

void trim_string(char *s);

Removes leading and trailing blanks from string 's'.

void wait(long msec);

Waits 'msec' milliseconds.

void waitkb(long msec);

Waits 'msec' milliseconds or until the next keyboard hit.

29 CSKEYS

Almost all the input for the library functions is done through the cskey()
function.
Syntax: int cskey(void);

The return value can be one of the following symbolic constants
(defined in CSKEYS.H):

    CTRL_A  KEY_A  KEY_a  ALT_A  DELETE              CURSOR_UP
    CTRL_B  KEY_B  KEY_b  ALT_B  END                 CURSOR_DOWN
    CTRL_C  KEY_C  KEY_c  ALT_C  HOME                CURSOR_RIGHT
    CTRL_D  KEY_D  KEY_d  ALT_D  PAGE_UP             CURSOR_LEFT
    CTRL_E  KEY_E  KEY_e  ALT_E  PAGE_DOWN
    CTRL_F  KEY_F  KEY_f  ALT_F  INSERT
    CTRL_G  KEY_G  KEY_g  ALT_G  BACKSPACE
    CTRL_H  KEY_H  KEY_h  ALT_H  TAB
    CTRL_I  KEY_I  KEY_i  ALT_I  SHIFT_TAB
    CTRL_J  KEY_J  KEY_j  ALT_J  ENTER
    CTRL_K  KEY_K  KEY_k  ALT_K  ESC
    CTRL_L  KEY_L  KEY_l  ALT_L  SPACE
    CTRL_M  KEY_M  KEY_m  ALT_M
    CTRL_N  KEY_N  KEY_n  ALT_N  CTRL_DELETE
    CTRL_O  KEY_O  KEY_o  ALT_O  CTRL_HOME
    CTRL_P  KEY_P  KEY_p  ALT_P  CTRL_CURSOR_UP
    CTRL_Q  KEY_Q  KEY_q  ALT_Q  CTRL_CURSOR_DOWN
    CTRL_R  KEY_R  KEY_r  ALT_R  CTRL_CURSOR_RIGHT
    CTRL_S  KEY_S  KEY_s  ALT_S  CTRL_CURSOR_LEFT
    CTRL_T  KEY_T  KEY_t  ALT_T  CTRL_PAGE_UP
    CTRL_U  KEY_U  KEY_u  ALT_U  CTRL_PAGE_DOWN
    CTRL_V  KEY_V  KEY_v  ALT_V  CTRL_END
    CTRL_W  KEY_W  KEY_w  ALT_W
    CTRL_X  KEY_X  KEY_x  ALT_X
    CTRL_Y  KEY_Y  KEY_y  ALT_Y
    CTRL_Z  KEY_Z  KEY_z  ALT_Z

    F1   SHIFT_F1   CTRL_F1   ALT_F1   KEY_1  ALT_DELETE
    F2   SHIFT_F2   CTRL_F2   ALT_F2   KEY_2  ALT_HOME
    F3   SHIFT_F3   CTRL_F3   ALT_F3   KEY_3  ALT_CURSOR_UP
    F4   SHIFT_F4   CTRL_F4   ALT_F4   KEY_4  ALT_CURSOR_DOWN
    F5   SHIFT_F5   CTRL_F5   ALT_F5   KEY_5  ALT_CURSOR_RIGHT
    F6   SHIFT_F6   CTRL_F6   ALT_F6   KEY_6  ALT_CURSOR_LEFT
    F7   SHIFT_F7   CTRL_F7   ALT_F7   KEY_7  ALT_PAGE_UP
    F8   SHIFT_F8   CTRL_F8   ALT_F8   KEY_8  ALT_PAGE_DOWN
    F9   SHIFT_F9   CTRL_F9   ALT_F9   KEY_9  ALT_END
    F10  SHIFT_F10  CTRL_F10  ALT_F10  KEY_0
    F11  SHIFT_F11  CTRL_F11  ALT_F11
    F12  SHIFT_F12  CTRL_F12  ALT_F12

The predefined values for the 'normal' keys like 'A', 'a' or '1' are the
same as the ASCII values. This means you are not forced to type things like

    if( KEY_A==cskey() ) ....

but can also use:

    if( 'A'==cskey() ) ......

29.1 CSKEYS.exe

This simple command-line utility displays an integer value corresponding to
the pressed key. It is the same value the cskey() function returns and can
therefore be used to make additions to the list of symbolic constants. It
is also a simple utility to test the return value of the cskey() function.

Syntax: c:\tmp>cskey

To exit the program press CTRL-END.

30 DATE

The csDATE class is implemented to aid in dealing with dates. It has
built-in functions to convert to and from julian dates. There are also
functions to read and write dates as strings.
The julian-date routines require floating point, therefore the csDATE class
tries to avoid these routines whenever possible. As a consequence, using
the csDATE class doesn't necessarily mean a floating point library has to
be linked in; it depends on the functions used.

30.1 Example

A quick example to show what we are talking about:

    #include "iostream.h"
    #include "csdate.h"

    void main(void)
    {
       csDATE d;
       d.format(Y4MD);   // Choose the Year/Month/Day format.
       d="1967/04/23";   // Set 'd' to April 23rd 1967
       d+=100;           // Add 100 days. (Requires floating point)

       csDATE e;         // A new instance of the csDATE class.
       e.now();          // Set 'e' to the system date.
       cout<<endl<<e-d;  // Print the number of days in between.
                         // (Requires floating point)
       cout<<endl<<(char *)e;   // Print 'e'.
    }

30.2 Initialising

The next functions can be used to initialise a DATE instance:

void month(int m);

Sets the month to 'm'. January is 1, February is 2, etc.

void year(int y);

Sets the year, using a 4 digit format.

void day(int d);

Sets the day.

void now(void);

Sets the date to the system clock.

void julian(long j);

Sets the date to the julian date 'j'.

30.3 Converting Strings

The assignment operator is overloaded to be able to assign a string to the
date instance. For this to work properly, the class has to know the format
in which the date is represented. The format used has to be indicated by a
call to the format() function.
Example:

    #include "csdate.h"

    void main(void)
    {
        csDATE date;
        date.format(DMY4);
        date="27/02/1994";       // February 27th, 1994
    }

In the header file 'CSDATE.H', constants are defined for the following
formats:

    MDY2  MY2D  Y2MD  Y2DM  DMY2  DY2M
    MDY4  MY4D  Y4MD  Y4DM  DMY4  DY4M

There is logic behind these formats!

    'M'  means Month
    'D'  means Day
    'Y4' means Year with 4 positions
    'Y2' means Year with 2 positions

So:
    Y2MD means: year (two positions)/month/day.
    MDY4 means: month/day/year (four positions).

When 'Y2' is used, years above 75 are interpreted as 19xx (20th
century) and years of 75 or below as 20xx (21st century)!

Example:

    #include "csdate.h"

    void main(void)
    {
        csDATE date;
        date.format(Y2MD);
        date="80/04/15";         // April 15th, 1980
        date="20/04/15";         // April 15th, 2020
    }

The default format is DMY4.

30.4 Obtaining date info

The following functions can be used to 'read out' information about
the DATE instance.

int week_day(void);
    Returns the day of the week. Monday=1, Tuesday=2, etc.

int month(void);
    Returns the month [1..12].

int month(csCHAR *);
    Also returns the name of the calendar month
    [January, February ... December].

int year(void);
    Returns the year in 2 or 4 digits, depending on the format used.
    Remember, the default format is DMY4, which implies 4 digits.

int year4(void);
    Returns the year with 4 digits, independent of the format chosen.

long julian(void);
    Returns the Julian date.

operator char*();
    Casts the date to a string, according to the format used.

Example:

    #include "iostream.h"
    #include "csdate.h"

    void main(void)
    {
        csDATE date;
        date.now();
        date.format(DMY2);
        cout<<(char *)date<<endl;   // Displays, for example, 25/04/95
    }

30.5 Comparing dates

All the comparison operators, like <=, != etc., are overloaded to
compare instances of the csDATE class. The csDATE instances do NOT
have to use the same format for the comparison operators to work
properly.
Example:

    #include "iostream.h"
    #include "csdate.h"

    void main(void)
    {
        csDATE d1,d2;
        d1.now();
        d2.format(MDY4);
        d2="01/01/2000";
        if(d1>=d2) cout<<" The turn of the century!"<<endl;
    }

30.6 Arithmetic

It is possible to apply simple arithmetic to dates. That is, adding or
subtracting days and subtracting one date from the other. This
requires the use of Julian dates and consequently the use of floating
point.

Example:

    #include "iostream.h"
    #include "csdate.h"

    void main(void)
    {
        csDATE d1,d2;
        d1.now();
        d1+=100;                 // Add 100 days
        d1-=300;                 // Subtract 300 days.
        d2.format(Y4MD);
        d2="1999/04/20";
        cout<<d2-d1<<endl;       // Display the number of days in between.
    }

30.7 Miscellaneous

int format(void);
    Returns the format used.

int leap_year(void);
    Returns TRUE if the year of the date is a leap year, FALSE
    otherwise.

int long_year(void);
    Returns TRUE if 4 positions are used to represent the year, FALSE
    otherwise.

S32 sem_jul(void);
    'sem_jul' is short for 'semi julian'. It converts each date into a
    unique 32-bit number, using the formula: year*512+month*32+day.
    This number is convenient for storing or comparing dates.

void sem_jul(S32 l);
    The reverse of the previous function. It sets the DATE instance
    according to the 'sem_jul' number 'l'.

int valid(void);
    Returns TRUE if the date is a valid calendar date, FALSE
    otherwise.

31 HEAP

31.1 Purpose

Some types of applications allocate numerous small blocks from the
heap. This is inefficient in terms of RAM utilization, can lead to
heap fragmentation and is also slow.

To overcome these problems this library contains a special HEAP class.
The idea is to do allocations in chunks of about 2 KB and to take the
small amounts from those chunks when needed.

This approach has considerable advantages.

- You can release all allocated memory with just one function call
  instead of freeing many small blocks separately.
- It is a lot faster because normal heap operations are relatively
  slow.
- Heap efficiency is also improved.
  It is much easier for the heap to deal with relatively few
  allocations of about 2 KB than it is to deal with numerous small
  allocations.
- It can save valuable memory. There is considerable overhead involved
  in using the heap. Apart from what you need, several additional
  bytes are used to link the allocated blocks together. In addition,
  allocations are done in multiples of 16 bytes. This can lead to a
  situation where you need only 13 bytes while 32 bytes are used!

31.2 When to use it?

The HEAP class is particularly useful when dealing with pointer
structures in RAM. Pointer structures are small, all of the same size,
and an application will probably use a lot of them.

The BUFFER class described earlier in this documentation also uses the
HEAP class. This means you can use it without enlarging your
application.

The HEAP class assumes allocations of a fixed size. This limits its
usefulness but improves efficiency. It is perfectly possible to use
more than one instance of the HEAP class in an application, for
example a different HEAP for every size of allocation needed.

The class was designed with small allocations in mind, something below
50 bytes. It is doubtful whether the HEAP class still makes sense for
allocations above 100 bytes.

Summarizing:

- Allocations have to be of a fixed size.
- Allocations have to be small, below 50 bytes.
- Many allocations of this type are going to take place.

31.3 Using HEAP.

Using the HEAP class starts with an initialization, stating the size
of the allocations. Afterwards the class has to be 'opened'. From then
on allocations can be made, and blocks can be freed again. When the
work is done, the close or the zap function can be called to free all
allocated memory.

    // Example
    #include "csheap.h"

    void main(void)
    {
        typedef struct
        {
            void *next;
            void *prev;
            int   number;
        } pStruct;               // A typical pointer structure.

        HEAP heap;               // HEAP class instance.
        heap.init(sizeof(pStruct));   // Initialize it for the size of the
                                      // pointer structure.
        heap.open();                  // Open the class so it can be used.

        pStruct *p,*q;
        p=(pStruct *)heap.malloc();   // Allocation.
        q=(pStruct *)heap.malloc();   // Allocation.

        p->next=q->prev=NULL;         // Doing something.
                                      // Doing much more.

        heap.free(q);                 // Freeing q.

        heap.close();                 // Finally finished.
                                      // Freeing all allocated memory.
    }

31.4 Functions in alphabetical order.

The function prototypes are in CSHEAP.H.

void close(void);
    Closes the class. All allocated memory is freed. The
    initialisation parameters are retained, which makes it possible to
    reopen the class without calling init(). This function is also
    called by the class destructor.

void empty(void);
    Frees all allocated memory, but the class remains open.

void free(void *p);
    Frees the previously allocated block 'p' is pointing at.

void init(U16 alloc_size,U16 page_size=2048);
    Initializes the class. 'alloc_size' is the size of the allocations
    needed. 'page_size' is the size of the chunks which are going to
    be allocated from the heap. This parameter has a default value of
    2048 bytes.

void *malloc(void);
    Allocates a block and returns a pointer to it. If no memory is
    available, a NULL pointer is returned.

int open(void);
    Opens the class. The init() function has to be called first. The
    function returns TRUE on success and FALSE otherwise.

void zap(void);
    Frees all allocated memory and closes the class. The
    initialization parameters set by the init() function are reset.
    This means the class has to be initialized again before it can be
    reopened.

32 Alloc-Logging

32.1 Introduction

Dynamic memory allocations can create problems which are difficult to
trace. Therefore, this library contains a set of functions which can
be used to replace the normal malloc() and free() functions. The
replacements can be made to write a record to a log. This log can be
used later to check for memory leaks.
The replacements also test for things like freeing a NULL pointer or a
malloc which returns NULL.

The replacements are preprocessor macros which can be switched on and
off with the preprocessor symbol CS_DEBUG. If CS_DEBUG is not defined,
the normal functions are called.

32.2 Replacements

Replacements are available for the following functions (prototypes in
csmalloc.h):

    Function:      Replacement:
    malloc         csmalloc
    calloc         cscalloc
    realloc        csrealloc
    free           csfree
    farmalloc      csfarmalloc
    farcalloc      csfarcalloc
    farrealloc     csfarrealloc
    farfree        csfarfree

The replacements are used in exactly the same way as the originals.

The allocations in the CS-libraries are always done by calling the
replacements. The production version of the libraries was compiled
without CS_DEBUG being defined, so the normal functions are used and
are called without any additional overhead. When the debug version was
compiled, CS_DEBUG was defined and as a result all the allocations
done by the library functions can be logged!

32.3 Logging

Logging of allocations can be switched on and off through the use of
two functions.

void alloc_logging(int TrueFalse);
    After a call to alloc_logging(TRUE) all the allocations are logged
    in the ASCII file 'malloc.log'. Calling alloc_logging(FALSE)
    switches the logging off.

void alloc_logging(int TrueFalse,char *name);
    This is basically the same as the previous function but has the
    additional option of specifying the name of the log file.

The next example displays a part of an allocation log.
    4E79:0004 file csedst30.cpp line 8: malloc() 8 bytes
    4E7A:0004 file csedst30.cpp line 8: malloc() 8 bytes
    4E78:0004 file csedst30.cpp line 23: free()
    4E78:0004 file csedst30.cpp line 8: malloc() 9 bytes
    4E79:0004 file csedst30.cpp line 23: free()
    4E7B:0004 file csedst30.cpp line 8: malloc() 22 bytes
    4E7B:0004 file csedst28.cpp line 12: realloc free()
    4E7B:0004 file csedst28.cpp line 12: realloc malloc()
    4E7A:0004 file csedstr.cpp line 15: free()
    4E7B:0004 file csedstr.cpp line 15: free()

As can be seen, the first column displays the pointer involved; the
second and third display the file and the line where the call was
made. For an allocation, the size is also displayed. A realloc appears
as two lines.

32.4 Memory Leaks.

With a log like this it is easy to check for memory leaks. In fact, a
command-line utility is supplied to do just that. It is called
CSMALLOC. Only one parameter needs to be supplied: the name of the
allocation log.

Example

    c:\test>CSMALLOC malloc.log

If it encounters a malloc which is not matched by a free, it displays
the pointer involved. Like:

    UNMATCHED address: 29CC:0004

If all mallocs are matched by a free it says something like this:

    NO ERRORS encountered!!
    Number of addresses: 23
    Lowest address:  29CC:0004
    Highest address: 2EAE:0004

If malloc logging is kept on for longer periods, the log file can
become extremely large. However, this poses no problem for CSMALLOC.

If you are planning to use this method to detect memory leaks, it is
essential to switch on the logging before the first allocation is
done. Don't forget that class constructors do allocations as well!

33 csSTR

The class for manipulating strings. Instead of providing the formal
syntax we will clarify things by supplying a large number of examples.

Class name: csSTR. Prototypes are in CSSTR.H.
    #include "csstr.h"

    void main(void)
    {
        csSTR str;

        str="  A test  ";        // Assign a string
        str.upper();             // Convert to upper case
        str.lower();             // Convert to lower case
        str.trim();              // Remove all leading and trailing blanks
                                 // str now contains: "a test"

        str+=" At the end ";     // APPEND at the end
                                 // str now contains: "a test At the end "

        int i=-345;
        str=i;                   // Convert the integer value to string
                                 // str now contains: "-345"

        str="1001";
        i=str;                   // Assign string to integer

        str="A line";
        str.strip("ijkl");       // Strip the characters i,j,k,l from str.
                                 // str now contains: "A ne"

        str="A line";
        str.filter("ijkl");      // Allow only the characters i,j,k,l in str.
                                 // str now contains: "li"

        str="The quick brown fox";
        str[4]='Q';              // str now contains: "The Quick brown fox"

        csSTR str2="by C++ !";
        str="Made possible ";
        str=str+str2;            // str: "Made possible by C++ !"

        // Comparisons are possible.
        if( str<str2 )  { /* ... */ }
        if( str>str2 )  { /* ... */ }
        if( str<=str2 ) { /* ... */ }
        if( str>=str2 ) { /* ... */ }
        if( str==str2 ) { /* ... */ }
    }