home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
World of Shareware - Software Farm 2
/
wosw_2.zip
/
wosw_2
/
PASCAL
/
TBTREE16.ZIP
/
TBTREE16.DOC
< prev
next >
Wrap
Text File
|
1989-07-15
|
85KB
|
1,849 lines
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
TBTREE16
( Turbo BTree version 1.6 )
Yet another implementation of BTrees for Turbo Pascal 4.0/5.0/5.5
Copyright (c) 1988,1989 Dean H. Farwell II
All source code and documentation are considered to be shareware and are
copyrighted. The following rules of engagement apply to all documentation and
code which make up TBTREE16:
All users can and are encouraged to distribute this software freely. This
includes the uploading of this software onto applicable Bulletin Boards etc.
The entire TBTREE16 product must be distributed as a complete product. Pieces
can not be distributed individually. The documentation must be distributed
with the source code. No copyright statements may be modified.
All users can and are encouraged to use these utilities to develop and/or
enhance their private, shareware, or commercial software applications.
All users are encouraged to suggest enhancements, make suggestions and
generally lead to the development of a better product to the benefit of all.
Users may change the code and documentation as desired for their own use but
may not then distribute those changes.
Users may distribute these routines as part of another SHAREWARE package. For
instance, a user could develop a DBMS using this as a basis and can freely
distribute the source code provided that:
1. None of this documentation or code is changed
2. ALL code and documentation is distributed (not just parts)
3. The user is not selling or distributing this for profit
4. The user be registered.
The restriction that the code and documentation not be changed is to ensure
that I can maintain some control over version releases. I do not want several
versions running around that I had nothing to do with.
1
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
The bottom line is that you can not sell this or distribute it for profit
(except for a small fee not to exceed $10 to cover cost of mailing,
duplication and handling). A user developing a commercial application can not
distribute the SOURCE code or documentation but a user can use these routines
to compile a commercial application.
Users are encouraged to register their copy of this software. The
registration fee is a very nominal $25. You may be aware that this is an
increase in my earlier $15 fee. The justification I use is that the product
has greatly matured and I honestly believe that you are getting a good bang
for the buck. Also, for the $25 you will get all future upgrades mailed FREE.
I am no longer charging a $5 shipping and handling fee for each upgrade.
Anyone using this software to develop a commercial application must register
and state their intentions to use it in a commercial application. I will not
charge royalties (except for the registration fee). I just want to know what
people are doing with it. Also, if someone is distributing it as part of a
shareware product, the same restrictions apply.
Most of all I want any feedback that you have the time and desire to submit.
Suggestions, complaints, etc. are welcome. I especially want feedback on
errors. I do not know of any bugs (or I'd fix them) but I probably missed an
item or two. Send me notice immediately so that I can distribute a fix if
required. Since you have the source code, do your best to isolate the
problem. I'll try to take it from there.
To reach me use CompuServe ID - 73240,3335 or drop me a line at:
P.O. Box 4731
San Bernardino, CA 92409
(zip code is essential!!!!!)
My phone number is:
714-887-9847
For hopefully obvious reasons, I will not accept collect calls. Otherwise,
feel free to call as I enjoy hearing from users.
As a final motivation to register, all registered users will receive future
upgrades free (via mail - no postage due or anything!). I will normally mail
these well before I put the version out on CompuServe or anywhere else. The
registered users will, in effect, become beta testers to make sure that all is
well before I release to the world. By registering you will ensure that you
always have the latest version and is probably less than the cost to download
off of a bulletin board. I will do my best to not release trivial changes but
only significant upgrades.
Liability:
This software is not warranted. There are no warranties expressed or implied
as to the quality, correctness, or applicability of this software. The author
is not responsible for any loss or damages due to the use of any software or
documentation distributed as TBTREE16.
2
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
Product Description
TBTREE16 is version 1.6 of a shareware btree and database product for use with
Turbo Pascal 4.0, Turbo Pascal 5.0 and Turbo Pascal 5.5. It is intended to
give a user the capabilities to develop pascal applications for the creation
and manipulation of databases. This product is extremely powerful and FAST!!!
However, it is not the easiest thing to get a grasp on. If you understand the
concept of btrees or at least indexing then this will not be complicated to
use. I can just about guarantee that it is the best product of its type for
the money and maybe the best period. If you have any feedback on how I can
take the mystery out of using this please let me know. I also solicit any
feedback which will enhance the quality of the product.
I am presently working on a second product (albeit at a much more leisurely
pace than I had anticipated) which will be called TBASE and will use these
routines as the database engine but will have a much easier to understand
interface for code developers. It will use SQL like calls instead of those
used in this product. All registered users of TBTREE16 will be shipped a beta
version of TBASE as soon as one becomes available and will get the first whack
at using it. This in itself is worth the $25.
Although this product provides many of the same general database capabilities
as Borland's Turbo Pascal Database Toolbox it is not intended to be
compatible. There are several reasons for this. First, I do not want to
create any impression of any type of copyright infringement. Secondly, I
needed to implement things differently than they were implemented in Borland's
product to be able to do the enhancements I am planning. Thirdly, I wanted to
implement things that Borland does not. For instance, I support data types in
my indexes. Other differences will be more apparent as you read through this
documentation. Lastly, I did not have a copy of Borland's product when I was
doing this. This really started out as more of an academic exercise than
something that I was going to release for others. This turned out to be a
more extensive job than I thought (big surprise there, huh?). I figured that
I might as well make it available for others.
This product has been greatly enhanced since version 1.0 was released. As it
presently stands, it has the following features:
1. The ability to create data files and indexes for those files.
2. Any number of indexes can be used for a single data file.
3. Any number of data files can be used.
4. All data files and index files use a virtual memory scheme in which the
most recently used pages are kept in memory using the heap. This is
different than many other implementations which use a buffer or stack only
for index pages. The buffer is shared by all files. In this way separate
space does not need to be set aside for different files. Space utilization
is maximized and performance is enhanced.
5. All management of pages in memory is automatically controlled although the
user can exercise limited control if desired.
6. Routines are provided to dump the page buffer to the printer to facilitate
viewing pages if a user is curious or masochistic.
7. The size of the buffer is determined by the user and can be changed
3
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
dynamically as the program runs. For example a 128K buffer could be used
initially. If a user had a large heap requirement the buffer size could be
reduced. The buffer size could later be increased if desired. This can
all be done while the application is running.
8. Problems related to keeping track of numbers of open files are alleviated.
The user initially specifies the maximum number of files allowed to be open
at once and never has to address it again. Routines are provided to manage
the opening and closing of files dynamically. This increases performance
because files do not have to be continually opened and closed
unnecessarily.
9. All data types supported by Turbo Pascal are explicitly supported. In other
words, values do not have to be converted to strings. This is a great
improvement over previous systems (MY UNBIASED OPINION). For example an
index can be set up for type Byte, another for type ShortInt, another of
type String and even for type Real (but wait you say .. Reals in an
index?). See next note!
10. Indexes can be searched for equality, inequality, greater than, greater
than or equal to, less than, less than or equal to, and existence. In
other words you can ask for a list of all records for which some field of
type real is LE (less than or equal to) XXX where XXX is a var of type
real and XXX := 3.141592 etc. An index can also searched on a range of
values. For instance an index of type byte can be checked for a range >=
10 and < 20 etc. Indexes containing strings can be searched for partial
string matches. Partial string matches can be done three ways: find a
string which is at the beginning of a string, anywhere in the string, or
at the end of a string.
11. As physical records are deleted space is reused automatically on future
inserts. Files should never have to be reorganized to recover space
(unless a large number of deletes were performed without any inserts and
even in this case there is not really a good reason to reorganize unless
your disk is filling up).
12. Included with the package is a very crude but effective source code
printing routine which can be used to dump the code in a more usable
format. Specifically, it handles page breaks and prints a header of
useful information. It will also only print out the interface section if
you instruct it to do so.
13. The user does not need to perform any calculations or make any estimates
prior to data file or index creation.
14. Code was designed using an Object Oriented Design approach to a large
extent (Turbo 4.0 and Turbo 5.0 lend themselves to this approach much
more than Turbo Pascal 3.0 did). This is one reason that there are so
many units. Much of the code is reusable for other applications.
15. Sorts can be accomplished using any number of fields in any order of
precedence as determined by you. A Quicksort algorithm is used to give
high performance. Sort fields can be any type supported (Byte, Real,
Integer, String etc.).
16. Added the following set operations: Union, Intersection and Difference.
These allow easy queries on multiple fields. For example, retrieving all
personnel who have a last name starting with 'N' and who's job is
4
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
'MANAGER', etc. is greatly simplified. This enhancement is a definite
plus and not available in many other products.
17. Error handling added for trapping I/O errors
18. Capability to handle files with fixed size records and files with
variable sized records
List Of Files
The following files are provided as part of the TBTREE16 product:
Units - Source Code For All Units
BTREE.PAS - This unit contains the interface and implementation for
the btrees themselves. Although the implementation is fairly
complex, only a few routines are visible to the user and give the
user all the functionality required. The actual working of the
btree unit itself is not critical to the user. The user is provided
with routines to add an entry to an index, delete one entry from an
index, delete all entries with a matching key value from an index
and retrieve record numbers which match some selection criterion.
The ability to create an index is also provided. To delete an index
just delete the file using a routine provided in BTREE.PAS. The
BTREE unit is different than any other unit in the product in that
it consists of not one but four source files. See entry for
BTREE1.INC, BTREE2.INC and BTREE3.INC.
BTREE1.INC, BTREE2.INC and BTREE3.INC - These are additional source
files needed for the BTREE unit. I split up the source code for the
unit because the source code ended up greater than 64K bytes. The
three additional files are include files and will automatically be
compiled when BTREE.PAS is compiled.
BYTEDATA.PAS - This unit was added to version 1.4 for use with a
future product. It can be used to create concatenated indexes,
although I am still working on the specifics.
COMPARE.PAS - This unit contains no inherent object although a
couple of types are declared and put in the interface. It is made
of several routines which give the capability of comparing two
values of the same type or searching for a substring within a
string. This routine alleviates the need for the user to develop a
routine which can be used for comparisons. All Turbo Pascal 4.0
scaler types are handled. Strings and ByteArrays are the only
structured (composite) types handled. Several routines are used to
handle partial string searches.
FILEBUFF.PAS - This unit contains the interface and implementation
for a files open list. This list will contain information on all
files opened (using this unit). A user can specify how many files
can be on the list at once (how many the unit can allow to be open).
The unit takes it from there. From then on all requests to use a
file must be preceded by a call to the unit. The unit will
determine if the file is open and will open it if it is not. The
routine is exceptionally useful for alleviating the need to open and
close files for every operation. Continually opening and closing
files causes unacceptable performance problems. The file types
5
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
handled are only Untyped files and Text files. Typed files are not
currently handled. This is due to the fact that strong type
checking precludes the use of typed files (or at least I couldn't
break the code and figure out how to make it work. If anyone can
figure out how to make it work, it would make this even more
useful).
FILEDECS.PAS - This unit is nothing more than a collection of
declarations required to support several of the other units. There
is no code in the implementation section.
FILES.PAS - This unit contains the interface and implementation for
the manipulation of files. This fulfills the role of file manager
(mid level management of files as opposed to the low level
management found in FILEBUFF.PAS). It is the place where files are
created, deleted and where the existence of a file is checked. Most
of the routines are for internal use only. There are several
exceptions, however. See the unit documentation for details.
HEX.PAS - This unit contains the interface and implementation for
dealing with a hex array which is a 2 byte array used for storing a
hex character (such as 7F). A couple of routines are provided for
converting bytes and bits to hex array representations. It is
intended for internal use, but may prove useful for your specific
application, as well.
LOGICAL.PAS - This unit contains the interface and implementation
for dealing with logical (data) files and records. It also handles
the conversion between logical and physical records. In TBTREE16
all physical records (records stored on disk) are the same size (512
bytes). This is how all files can share the same page buffer (See
PAGE.PAS). However, the logical records (data records in the data
files) can be any size from one byte up to 65520 (although they will
not be that large in practice). This unit takes the user's data
records stored in a variable and puts them in the number of physical
records required. It also takes the appropriate data from physical
record(s) and puts it in the variable of the users choice. A great
deal of flexibility is available here. One caution however!
This unit only handles FIXED LENGTH LOGICAL RECORDS. All data
records (within one data file) must be of the same size (obviously
each data file can have a different logical record size). If
variable length data is required then the size of the logical record
is the largest number of bytes required as determined by the user.
A good example of this is a record containing a string declared as
String[20]. The logical record size is 21 bytes (one byte for the
length). 21 bytes should be stored and retrieved even though a
string such as 'howdy' is stored (howdy uses 6 bytes). Most products
of this nature work in the same way. This unit handles the creation
and deletion of data files as well. See unit for details. See also
the VLOGICAL unit information below. VLOGICAL is the other unit
which handles data records. It is an alternative to this unit and
is designed specifically for records of varying size.
LRECLIST.PAS - This unit contains the interface and implementation
to handle logical record lists. A logical record list is a concept
which is different than you will find in most other implementations
of btrees. When a query is done on an index, most btrees set a
cursor to the node where the first matching value is found
6
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
(equality). This is very efficient but does not lead to the
flexibility of my system. TBTREE16 builds a list of logical record
numbers corresponding to the logical records which meet the criteria
of the query. TBTREE16 can handle all boolean operations such as
equality, inequality, greater than, greater than or equal to, etc.
Therefore, you can get a list of all logical records which meet some
criteria and deal with that list independently of the btree. Once
you have the list you can compare it to other lists and do
intersections, unions, joins, etc. This gives you some of the power
of a real database management system. You can relate a file to
itself or to other files. This system is the basis of my future
work to bring true relational capabilities to programs written in
Turbo Pascal. It gives the capability to do set at a time
processing. Unfortunately, nothing is free. A query on a database
of 10000 logical records on existence will give a list of 10000
logical records numbers (40000 bytes required to store the list).
Lists like this take time to build. A fact of life is that it takes
time to process a lot of data in a large database. The user is
cautioned to structure queries in such a way that lists are not
huge. Queries on equality are not normally a problem (unless the
index provides little selectivity). Queries on inequality will be
faster if a range is used. No matter what type of query is used,
the page buffer alleviates a great deal of performance problems. An
alternative to using these logical record lists can be found in
BTREE3.INC.
MATH.PAS - This unit contains the interface and implementation to
manipulate bytes at the bit level. Only two routines are visible as
that is all that is required in TBTREE16. This unit has been redone
in assembly language to enhance performance. See the source listing
for further details.
MYPRINT.PAS - This routine is a collection of printer codes and
routines to print the codes. This is sufficient for my purposes and
is included only because I needed a couple of routines for
TP4PRINT.PAS and PAGE.PAS. This unit is not very elegant and will
need to be modified if it does not support your printer.
NUMBERS.PAS - This unit contains no code in the implementation. It
is simply a repository for often used types which are not part of
the original Turbo Pascal product.
PAGE.PAS - This unit contains the interface and implementation to
manipulate the page buffer. The page buffer is stored internally
on the heap and is used to temporarily hold physical records. The
buffer greatly enhances up the performance of the product. As the
program runs, physical records are created or are read in from disk.
As this happens, they are stored in the buffer. Therefore, the
buffer grows and shrinks as the demand for pages grows. You can set
the maximum size of the buffer. By doing this, you can ensure that
the buffer does not eat up all of the heap and leave nothing for you
own needs. To function the buffer can be set to as small as one
page (512 bytes). However, performance is going to be dismal. The
amount of buffer space to allocate is dependent on many factors and
the size needs to be played with to see what works best. One rule
stands though .. bigger is better! One interesting note - routines
are provided to change the size (both increase and decrease)
dynamically as often as desired. There are other routines available
7
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
as well. For example, a routine is provided to allow the user to
force a page to be written to disk immediately when it is changed
(stored in the buffer) as opposed to waiting until it gets swapped
out or all pages are written to disk. Immediately writing to disk
ensures that nothing gets lost on a system or program crash but it
has severe performance implications. Use of this is not normally
required, but the option is there and usable.
SETOPS.PAS - This unit contains routines that support set
operations. They facilitate queries using multiple fields from one
data file. This greatly reduces the workload on the programmer.
SORT.PAS - This unit contains three procedures for sorting. This
unit was was created in version 1.2 and completely reworked in
version 1.5. It uses a quicksort algorithm to provide good
performance. Just as importantly, it supports all data types
supported by TBTREE16 and is extremely easy to use.
STRINGS.PAS - This unit contains routines to manipulate strings.
They are for internal use, but may prove useful for your specific
applications, as well.
TIME.PAS - This unit contains the implementation and interface to
manipulate a time counter which will keep track of a sequence of
events. It is really intended for internal use only, although you
can use it if you find a need.
VLOGICAL.PAS - This unit is much the same as the LOGICAL unit except
that it is designed to handle variable length record files. In
other words, each record can have a different number of bytes
associated with it.
Programs - Source Code For All Programs
TP4PRINT.PAS - This program is provided merely to permit my source
code to be printed out in a better format than provided by the DOS
Print command. It looks for page control codes embedded in my
source (my control codes are comments to the compiler) and controls
paging. It puts a header which includes file name, page, print time
and file creation time. It provides top and bottom margins as well.
Give it a try by printing out one of the source code files to see if
you want to use it or another source code printer. Syntax is :
TP4PRINT filename where filename includes path if source is
not in the current directory
After the file name, a an optional parameter can be used. Only a
small i or capital I will be recognized. If this is present, only
the opening comments and the entire interface section will be
printed. The implementation section will not be printed. This
allows you to look at what you need without printing out everything.
An example of this follows:
TP4Print logical.pas I
IFACE.BAT - A batch file which will print out the comments and
interface section of every unit in TBTREE16. This is the easiest
8
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
way to print out what you need from the code without printing out
stuff you probably don't need (implementation sections). One word
of warning though, over 80 pages will be printed. Of course, you
can go in an modify the batch file to only print the ones you want.
BUILDTPU.PAS - When this program is compiled, all units associated
with TBTREE16 will also be compiled. This simply provides a
convenient way to build all the units at once without compiling them
individually.
EXAM0.PAS - This is really a unit that is used a global section for
the other examples. It contains the record declaration for the test
record used for all examples.
EXAM1.PAS thru EXAM3.PAS and EXAM5.PAS thru EXAM11.PAS
demonstrate the capabilities of TBTREE16 to handle fixed
length data records
EXAM4.PAS has nothing to do with data records and is
explained below
EXAM51.PAS thru EXAM54.PAS demonstrate the capabilities of
TBTREE16 to handle variable length records
EXAM1.PAS - This example program demonstrates how to create a data
file and a couple of indexes. It then demonstrates how to insert
data into the data file and the indexes. Several other routines are
demonstrated as well. Important to note is that this program has
been changed from version 1.2 and prior. It now uses
StoreNewLogicalRecord to store the record as opposed to using
StoreALocicalRecord. Look at the source and this may make sense.
EXAM2.PAS - This program shows how to perform retrievals using both
the logical record lists and the internal cursor. The files created
in EXAM1.PAS are used.
EXAM3.PAS - This program shows how to do deletes and updates. The
files created in EXAM1.PAS are used.
EXAM4.PAS - This program shows how to use routines in FILEBUFF.PAS
with text files.
EXAM5.PAS - This program shows how to use the retrieval on range
routine. The files created in EXAM1.PAS are used.
EXAM6.PAS - This program shows how to use the retrieval on partial
string match routine. The files created in EXAM1.PAS are used.
EXAM7.PAS - This program shows how to build/rebuild indexes from an
existing data file. The files created in EXAM1.PAS are used.
EXAM8.PAS - This program shows use of routines to delete entries
from a logical records list. The files created in EXAM1.PAS are
used.
EXAM9.PAS - This program shows an example of sorting using the SORT
unit. The files created in EXAM1.PAS are used.
9
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
EXAM10.PAS - This program shows an example of using one of the set
operations found in the SETOPS unit. The files created in EXAM1.PAS
are used.
EXAM11.PAS - This program shows an example of the use of the
internal cursor. Of note is the fact that the cursor keeps track of
its position in the index even after inserts and deletes have been
done in the index. The files created in EXAM1.PAS are used.
EXAM51.PAS - This routine is the direct counterpart to EXAM1.PAS
except that it creates variable length records.
EXAM52.PAS - This shows how to do retrievals using the variable
length records. Notice that the use of indexes is the same as in
EXAM2.PAS. I did not use the cursor in this example, but they work
the same as in EXAM2.PAS.
EXAM53.PAS - This is an example of doing a delete using variable
length records. Notice that it is not really very fast. I am
working on improving the speed of deletes for variable length
records.
EXAM54.PAS - This is a direct counterpart to EXAM9.PAS except that
it sorts variable length records.
10
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
Instructions For Use of TBTREE16
The only way to truly understand how TBTREE works is to look through the
Interface sections of the various units. As noted above, I have supplied a
batch file which can be used to print all of the Interface sections at once.
The first two units to try to figure out are the LOGICAL unit and the BTREE
unit. These really are the heart of TBTREE16. Then look over the PAGE unit
(concentrate on the few routines which are used to set parameters such as
buffer size etc. Also try to understand the buffer and why it is important to
TBTREE16. Then glance over the FILEDECS unit. You will need to use a couple of the
routines in the FILEBUFF unit as well. Once you grasp these units, you are
well on your way to understanding TBTREE16. The following provides a short
set of instructions for performing the more common database type operations
using TBTREE16. The Quick Reference guide, which is provided in a separate
file, will also prove helpful.
Uses Statements - All of the units were compiled with the appropriate
uses statement included. Therefore, the user does not need to be
concerned about order when specifying units in the uses statements in the
user's programs. As you can see in my examples, I put the units in
alphabetical order. The user only needs to put the units which have
type statements and/or routine calls which must be visible within the
user's program. My example programs demonstrate which ones are probably
required. One method which actually works is to not include any
initially and let the compiler find the syntax errors. When an error is
found because a type or routine is not found then see which unit it's in
and put that unit in the uses statement. It's not a great way to do it,
but it works.
Setting Initial Parameters - Several parameters should be set by the user
although they have defaults if a value is not set by the user. All of
the parameters are only accessible by calling routines supplied to check
and set values. The variables themselves are not global, so they are not
visible to the user. The following parameters can be set by the user:
Open Files Allowed - Contained in FILEBUFF.PAS. Determines number
of open files allowed exclusively for use within TBTREE16. TBTREE16
will never have more that this number of files open at a time.
However, if TBTREE16 needs to deal with more files than this, it
will open and close its files and function properly. See
FILEBUFF.PAS for details. To set the parameter to a value of 5 do
the following:
SetMaxOpenFiles(5);
Maximum Buffer Pages Allowed - Contained in PAGE.PAS. Determines
the maximum number of 512 byte pages to keep in memory at one time.
Set this to something that makes sense for your machine (memory size
available). Since this is set at run time, you can set this after
you determine how much memory is available. You can reset this at
any time to a higher or lower value, and it will handle it properly.
To set this parameter to 50 pages (25K bytes) use the following:
SetMaxBufferPages(50);
One note is that the buffer is initialized to 128K bytes which may
be too large for your application. You should reset the default to
whatever you want.
11
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
Immediate Disk Write - This parameter if set to TRUE will force a
write of a page to disk immediately after it is stored in the page
buffer. Otherwise, the page will only be written when it gets
swapped out or is explicitly forced out by the user (all pages for a
file or all pages). To set this value to TRUE use:
SetImmediateDiskWrite(TRUE);
Creating Data Files - One of the first things we would want to do is
create data files to hold our data. This is extremely simple. We only
need to make one call for each data file. The following statements would
create a data file with fixed length records (LOGICAL unit). Using the
VLOGICAL unit is very similar.
type
MyRecord = record
field1 : Byte; (* for this example this
is the only field
indexed (Index1) although
you could index all fields
if desired. *)
field2 : Word;
field3 : String[10];
end;
var
xxx : FnString; (* holds name of data file *)
begin
xxx := 'MYFILE'; (* name of file *)
CreateDataFile(xxx,SizeOf(MyRecord)); (* create the file *)
end;
Notice that the xxx is not a file type but an 80 character string type
which holds the name of the file including path (if desired).
Creating Index Files - The next step is to create one or more index files
for each data file. You will not normally have a data file with no
indexes (although for small files, you may want to). To create an index
file the user must know the type and size of the key (item to be stored
in the index). The following statements will create an index file for
the data file created above. The index will hold Byte values (values
from 0 - 255).
var
yyy : FnString;
begin
yyy := 'Index1';
CreateIndexFile(yyy,SizeOf(Byte),BYTEVALUE);
end;
Notice use of the SizeOf routine to pass in the size of the item to be
stored in the index. This is usually more accurate than passing in a
constant value. This will especially help avoid errors when using
strings. Remember, strings use one extra byte for the length.
12
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
Inserting Data - Once files and indexes exist we can insert data.
Inserting data consists of two separate steps. The first step is to
insert the data into the data file. The second step is to insert the
applicable data into each index which is associated with that data file.
To insert data we first need to have the data record stored in memory
somewhere. The most logical place to have the data record is in a
variable which is of type record of some kind. For example, the
declaration of a data record (MyRecord) as defined above will suffice.
var
thisRec : MyRecord;
lrNum : LrNumber;
We could use the following sequence to insert a record in the data file
and the index associated with this data file:
thisRec.field1 := 20;
thisRec.field2 := 60000;
thisRec.field3 := 'Hi Mom';
lrNum := StoreNewLogicalRecord(xxx,thisRec);
InsertValueInBTree(yyy,lrNum,thisRec.field1);
Notice that StoreNewLogicalRecord was used because it will find an unused
logical record number and assign it to this new record. This logical
record number is the one inserted (along with the associated value) into
the BTree. This is a change from versions prior to version 1.3. This
simplifies things and makes storing much more straightforward. Hopefully,
it is obvious that data records are not stored sequentially or in any
other ordered manner. In other words, there is not a relationship
between logical record 1 and logical record 2. To ensure order, you can
either retrieve using the indexes or create a logical record list and
sort it.
Retrieving Data - Retrieving consists of getting the data for one or more
logical records from a data file. To determine which logical records are
needed, one or more indexes are normally used. TBTREE supports two
different methods for performing a retrieval using an index. The first
and most powerful is to build a logical record list and the traverse that
list. Routines to do the retrieval are found in the file BTREE2.INC
(part of the BTREE unit). The routines to traverse the list are found in
the FILEBUFF unit. The second method is to use a cursor which is built
into each index. This method is more traditional and has the advantage of
being dynamic but is not as powerful/flexible. Saying that the second
method is dynamic refers to the fact that as index entries are inserted
and deleted, the cursor continues to point to the same entry, unless that
entry is deleted. The routines to support this type of retrieval are
found in the BTREE3.INC file. Both methods are discussed further below:
Method 1 - Using the logical record lists - Retrieving data consists of
making a query to an index. The query is a procedure call which will
build and return a list of logical records which match the selection
criteria. The query conditions are defined in NUMBERS.PAS and are of
type Condition. Once the list is retrieved, the user can step through the
list starting at either end and going in either direction. The logical
record numbers are stored in the index in such a way that their
corresponding key values (values in the index) are in ascending order. A
13
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
logical record list created using an index will reflect this order. If
there are multiple occurrences of the same key then the last one added is
in the front. The list can be traversed front to back (ascending) or
back to front (descending). A cursor is kept internally in the list (not
to be confused with the cursor used in the other method) to keep track of
where the user is in the list. The user can not get to the cursor but
can set it to the front or back and move it forward or backward (by
retrieving values from the list). Normally, to traverse the list, set
the cursor to the front of the list and then set up a loop which will
traverse the list and terminate when a logical record number of 0 is
returned. The list is not disturbed as you move through it. It will
exist until you destroy it or until the program ends. If the program ends
before the list is destroyed explicitly by the user then there may or may
not be a file on disk with an extension of '.lrl'. If you terminate and
find any of this type of file laying around then you did not destroy it
prior to ending the program (shame on you). You should fix this since
leaving a list hanging around just takes up space and does nothing
productive. If you find files with this extension ('lrl') hanging around
after your program terminates, you can delete them. An example of a data
retrieval follows:
var
key : Byte;
lrLst : LrList;
lrNum : LRNumber;
begin
key := 20;
GetValuesFromBTree(yyy,key,LE,lrLst); (* retrieve all record
numbers for records
having an index entry less
than or equal to 20. *)
(* lrLst is the newly created list *)
lrNum := GetFirstLr(lrLst); (* set cursor to front *)
while lrNum <> 0 do (* traversal loop *)
begin
GetALogicalRecord(xxx,lrNum,thisRec,);
(* we can do whatever with the record *)
.
.
.
lrNum := GetNextLr(lrLst);
end; (* this will continue until end of list *)
You can also do a retrieval for a range of values or retrieve records on
a partial string match. An example of each follows:
var
key1,
key2 : Byte;
lrLst : LrList;
lrNum : LRNumber;
begin
key1 := 10;
key2 := 75;
GetRangeFromBTree(yyy,key1,GE,key2,LE,lrLst);
14
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
(* build list of logical
record numbers which
are between 10 and 75 *)
(* list is in lrLst *)
lrNum := GetFirstLr(lrLst);
while lrNum <> 0 do
begin
GetALogicalRecord(xxx,lrNum,thisRec,);
(* we can now print the record or look at it or whatever it is
that you would want to do with it *)
.
.
.
lrNum := GetNextLr(lrLst);
end; (* this will continue until end of list *)
One note about the above example is in order. The first Condition must
be either GE or GT and the second Condition must be LE or LT. If they
are reversed, the retrieval will not work properly. Also, one of the
conditions can not be NE or EQ or EX. This should make sense and is not
really a restriction. This routine is designed to handle ranges, not all
query types. If you have a query such as LE 10 and EQ 88 then you must
do two queries and perform an Intersection. See the SETOPS unit for
details.
------------------------------
var
keyString : String[10];
lrLst : LrList;
lrNum : LRNumber;
begin
keyString := 'C';
(* assume zzz is an index holding values of type STRINGVALUE *)
GetSubstringFromBTree(zzz,keyString,ST,lrLst);
(* build list of logical
record which start with
'C' *)
(* list is in lrLst *)
lrNum := GetFirstLr(lrLst);
while lrNum <> 0 do
begin
GetALogicalRecord(xxx,lrNum,thisRec,);
(* we can do whatever with the record *)
.
.
.
lrNum := GetNextLr(lrLst);
end; (* this will continue until end of list *)
Method 2 - Using the cursor internal to the index - Each index contains
one internal cursor. The cursor can be set to the beginning or the end
of the index. It can also be set to a specific place in the index (first
occurrence of a value). Once the cursor is in place, it can be moved in
either direction retrieving logical record numbers along the way.
15
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
Although it is not nearly as powerful as the logical record list
approach, it has the advantage of simplicity as well the fact that the
cursor stays valid even after inserts and deletes are performed on the
index. Rather than presenting the expanded discussion here, refer to the
BTREE3.INC file.
Retrieving Data Records - As noted in the examples above, once the
logical record number for the desired logical record is determined, use
the GetALogicalRecord routine found in the LOGICAL unit to perform the
retrieval. It should be noted that you do not need to use and index to
retrieve a data record. You can retrieve all records by building a
logical record list of all valid record number. Use the
GetValidLogicalRecords routine for this.
Deleting Data - Deleting data is the inverse of inserting data. There
are three steps involved. The first step is to retrieve the record.
This is only required if you do not know the values for all fields which
have a corresponding index. Once we have the record we can delete the
entries from the index(es). The last step is to delete the logical
record from the data file. For the following example assume that the
same definitions used above apply but that field2 also has an index with
the file name residing in a variable zzz. Also assume that we know that
we want to delete logical record number 24:
GetALogicalRecord(xxx,24,thisRec);
DeleteValueFromBTree(yyy,24,thisRec.field1);
DeleteValueFromBTree(zzz,24,thisRec.field2);
DeleteDataRecord(xxx,24); (* delete value from data record *)
Updating Data - There are no separate commands to update data. Updating
is a two step process. The first step is to fetch the appropriate data
record(s). Once this is accomplished, the appropriate fields can be
updated and the data file can be stored. A third step may be required.
If any indexed fields (fields which have an index associated with them)
are changed then the old value will have to be deleted and then the new
value will have to be inserted into the index. Rather than show an
example here, I refer you to EXAMPLE3.PAS. You may notice that in that
example, I store the updated data record after updating the index rather
than before. The order of these two operations is not important.
Deleting Files - The operation of deleting a file is not going to be done
very often but will be necessary from time to time. The following is an
example:
DeleteDataFile(xxx);
DeleteIndexFile(yyy);
Notice that there is a separate routine for data and index files. Also,
deleting a data file will not automatically delete the associated index
files. You must do this explicitly as appropriate.
Sorting - Sorting can be done on any number of fields as you desire.
Sorting requires two steps. The first is to build a sort field list, a
list containing information on each of the fields to sort on. The second
step is to use this sort field list to sort a logical records list.
After sorting is completed, the user should destroy the sort field list
16
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
using the provided routine unless the list will be required later. An
example follows:
type
MyRecord = record
field1 : Byte;
field2 : Word;
field3 : String[10];
end;
var
lrLst : LrList;
sPtr : SortFieldList;
success : Boolean;
begin
GetValidLogicalRecords(xxx,lrLst); (* list of all records *)
sPtr := NIL; (* EXTREMELY CRITICAL !!! - you must pass in a
pointer variable whose value is NIL. If
you forget this little step all kinds of
bad things will probably happen!!! *)
AddFieldToSortList(sPtr,4,STRINGVALUE);
AddFieldToSortList(sPtr,2,WORDVALUE);
SortList(lrLst,xxx,sPtr,success);
if not success then
begin
(* there was not enough heap space to facilitate the sort. Free
up some space and try again if desired *)
end;
DestroySortList(sPtr); (* retrieve the heap space *)
end;
Notice that the second parameter is the BYTE position of the sort field
within the data record. It is not the field number. This is extremely
important!! See the SORT unit for details.
Building/Rebuilding Indexes - Once in awhile you may manage to wipe out
an index or in some other way cause it to be out of whack with your data
file. Or, you may want to add an index to a field which previously did
not have an index. This can easily be accomplished using a routine added
in version 1.1. Rather than repeating the example here, please refer to
EXAMPLE7.PAS which will show how to build a list of existing records and
rebuild an index from that list. The added routine will also allow you
to build a list of all logical records associated with a data file
without using an index. To check out an index to be sure that certain
logical record numbers are in fact in the index, you can use the
FindLrNumInBTree routine.
Viewing Status Information - I have included several routines for
checking on the values of certain parameters and for allowing the user to
delve into the depths if desired. The first routine to discuss is one
which allows you to find out how many entries are in a logical record
list once it has been created. To accomplish this the following call is
used:
cnt := GetCountLr(lrLst); (* lrLst is your variable where
17
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
the list resides *)
A second valuable utility is a routine which will return the number of
files which are presently open using the FILEBUFF unit. In other words
this returns the number of files which are on the files open list. This
is not necessarily the number of files which DOS thinks is open. Only
files opened using the create file or open file routines provided in
FILEBUFF.PAS will be included in this number. Not included will be other
files opened by the user and the files such as standard input and output
etc. An example of a call is below:
cnt := GetNumberOpenFiles;
Several routines are provided in FILES.PAS which have not been discussed
yet. The first is FileExists which will determine the existence of any
file. Another routine of use is SaveUserDataArray. This will store a
256 byte array in the parameter record for a data or index file. This
can be used to store whatever the user desires. FetchUserDataArray will
retrieve the 256 byte array for the user. Another routine of use is
FetchFileVersion which will return a 6 character string which determines
which version was used to create the index or data file.
The largest number of utilities lies in PAGE.PAS. The first routines of
interest are the ones for writing part or all of the buffer to disk. The
demand paging used in the PAGE.PAS unit is critical to the performance of
TBTREE16. Since a large number of pages are in the memory buffer, the
need to constantly go out to disk is reduced. Disk seeks (head movement)
is what slows down most database operations. The secret is to do
everything you can to alleviate the need to go out to disk. Memory is
many times faster than external storage. However, if changes are made to
pages in the buffer we want to ensure that they do not get lost. We need
to write them to disk at some point. Normally, a page will stay in
memory until another page needs the space. The unit uses a least recently
used algorithm which in nothing earth shattering (but it works). When a
page needs to be swapped out, a check is made to see if it is dirty
(changes have been made since it was read in or last written to disk).
If it is clean no problem. If it is dirty it is written to disk prior to
allowing the page to be reused. The user can force pages to be written
out at any time. One method is to set a parameter which forces a write
to disk anytime a page is written to in memory. Essentially, a page will
never be dirty this way. To accomplish this the SetImmediateDiskWrite
routine is used. The user can set this parameter to TRUE or FALSE.
FALSE is the default, but the user should probably make an explicit call
anyway on initialization so there is no doubt or confusion. If FALSE is
set, the user may still want to periodically write pages to disk. The
user can write all pages for a given file or all pages in the buffer.
For example:
WriteEntireBufferToDisk;
or
WriteBufferToDisk(xxx); where xxx is a file name
The user should always write all pages to disk prior to termination.
Otherwise, changes made to pages still in memory will be lost. A handy
way to do this is to call it from an exit procedure. See EXAMPLE2.PAS
for an example.
18
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
A second set of routines are provided to get rid of (release) pages in
the buffer. The pages should be written to disk prior to release. These
routines can be dangerous. I need them internally for operations such as
changing the size of the buffer dynamically and also for deleting a file.
They are visible and free to use, but understand what you're doing first.
If you're looking for a convenient way to wreck havoc with your data
these two are perfect candidates. The only reason I can foresee why you
would want to attempt to use these is to release some pages for files you
won't need for a while in order to free up heap space (such as for a
sort). Before releasing, write the affected pages to disk!! Except for
freeing up heap space, there is not really a need to micro manage the
resources like this and these routines can lead to problems if misused.
Accidentally, losing a node (page) in the middle of an index will leave
it in an unrepairable state and will ruin your whole day.
A third set of routines are much friendlier and I can stay off my soap
box. These routines manipulate and check the size of the buffer. All
values are in terms of pages. A page is 512 bytes. To digress slightly,
512 worked well because a smaller number caused inefficiency in the least
recently used page algorithm and also meant that index keys had to be
smaller. A larger number means that less pages can be in memory at once.
Do not try to change this constant as a lot of things are affected. Just
changing the constant will not work properly. This is due to the fact
that a constant declaration can not be declared using a simple
expression (version 4.0). For example:
const
xxx = 10;
yyy = xxx/2; (* illegal in Turbo Pascal version 4.0 *)
Therefore, other constants that depend on xxx have to be set explicitly.
In the above, yyy would have to be set to 5. Anyway, back to the matters
at hand. To set the size of the buffer the following call should be
made:
SetMaxBufferPages(50); (* 25K bytes *)
If a user does not set the page limit explicitly, the value will default
to one page. Everything still works, but performance will be extremely
poor. A minimum of 10 pages is required to have this work well at all.
One hundred or so pages is preferred. A great deal depends on your
application as well! A user can check this value at any time by doing
the following:
cnt := CheckMaxBufferPages;
The user can check the number actually in use by doing the following:
cnt := CheckPagesInUse;
One routine not mentioned earlier is CheckImmediateDiskWrite which
returns TRUE if dirty pages will be immediately written to disk, FALSE
otherwise.
One interesting routine is PrintPagebuffer. This routine is one that I
developed for my own debugging purposes and is included to allow you to
see what the page buffer looks like. One warning though, if the page
19
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
buffer is large, the output will be quite lengthy.
The PrintBufferStats routine is more for curiosity's sake and for tuning
the size of the buffer. A call to this routine will print the statistics
for buffer use. The number of pages in use, the maximum number of pages
allowed, the number of requests for a page (fetches), the number of
fetches in which the page was in the buffer (hits) and the ratio of hits
to fetches will be output. A hit ratio of much below 95% could signal a
requirement to increase the buffer size. You really need to keep this
number in perspective though. If you are waltzing through a file in
which none of the records are in the buffer, the hit ratio will be poor.
Other operations will have higher hit ratios. At least it's there to use
as desired.
A recently added routine contained in the BTREE unit is
NumberOfBTreeLevels which returns the number of levels currently in an
index. If that number gets above 10 you are doing something rather
strange and may run into stack problems. The cause of such an occurrence
is that you have a huge number of index entries (many many thousands) or
your index entries are large. The only entries over 4 bytes are strings
or arrays of type ByteData. These types of indexes should use entries as
small as possible. Don't use first name, last name, and middle initial
as all one indexed filed unless there is a really good reason to
(doubtful). Any string index entry over about 30 bytes is suspect.
Remember what an index is for .. to store something to allow quick
retrievals. The index is not designed to be the data file. A deep index
will simply not be as fast as a shallow index. The smaller the index
entry size, the shallower the index.
Another new routine is the FindLrNumInBTree which can be used for
debugging and checking the consistency of an index file.
Other Routines - I discussed above the routines which are directly
applicable to a user who wants to use the indexing and data manipulation
capabilities of TBTREE16. There are some lower level general routines
such as those found in FASTMOVE.PAS, MATH.PAS and STRINGS.PAS which are
used internally. Since they are fairly general in nature, they may have
applicability for your other applications. I won't go into details on
their inner workings here, but you have the source code, so you can look
in the units to see if there is anything else useful for whatever you
happen to be doing. The quick reference will provide assistance as well.
Error Handling - In version 1.5 I have added a unit (ERROR unit) which is
designed to handle I/O errors. This unit handles only I/O errors and
allows these errors to be trapped. TBTREE16 I/O is now accomplished with
I/O checking off. If an I/O error occurs, the ERROR unit comes into
play. A routine is called with pertinent information being passed to it.
The routine should be modified by the user, as desired, to handle the
error. As written, if an I/O error occurs, the program will terminate
after an error message is written. This parallels what would happen in
versions prior to TBTREE15. However, as stated above, you are encouraged
to change the routine to handle the error as desired.
20
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
Inner Workings
I am trying to avoid bogging you down with too much detail concerning the
inner workings of this product. However, there are a few things which you may
need to be aware of. One important area is how TBTREE16 manages the heap.
Turbo Pascal has two different methods for heap management. TBTREE16 uses
New, GetMem, FreeMem and Dispose exclusively. Mark and Release are not used
and CAN NOT be used in your application if TBTREE16 is used. Mark and/or
Release will cause a disaster!!! Also, TBTREE16 handles the heap well, but
you need to provide it with a sufficient amount of heap to operate with. Heap
is needed by the page buffer (PAGE unit), it is needed for sorting (SORT unit)
and it greatly enhances the speed of set operations (SETOPS unit). If your
application hogs all of the heap, sorts may be impossible and disk I/O caused
by a small buffer will greatly reduce throughput. Also, set operations will
slow down considerably. One more note - If the heap is almost full, an error
can occur since there will not be room for the free list to expand when
FreeMem is used (which is done extensively throughout TBTREE16). To preclude
this from happening, you can set the freeMin variable (internal to TURBO
PASCAL and has nothing to do with TBTREE) to a value which will reserve some
number of bytes for the free list. Set freeMin to some multiple of 8 bytes to
reserve space. The Turbo Pascal manual discusses this in depth in the section
concerning the FREE LIST (p 198 of the version 5.0 manual). Again, this error
has nothing to do with TBTREE (but it does affect it) and occurs in Turbo
Pascal programs which use the whole heap and then try to dispose of space.
You need to be aware of the tradeoff of using the LOGICAL unit versus the
VLOGICAL unit. VLOGICAL obviously provides flexibility. However, there is a
cost in performance and file size. There is an overhead of 10 bytes per
logical record associated with variable record length files whereas there is
an overhead of only one bit (not byte) for each record in a fixed length data
file. The same holds true for records which have been deleted. This overhead
is not as significant if the size of the records themselves is large. In the
area of performance, you will find the variable length record routines to be a
little slower. Which way to go is up to you. Remember, you can use both
types in a single application, But once a file is declared to be of one type,
you must not use the other type's routines with it. Doing that will cause
problems.
21
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
Current Product Limitations
The following limitations apply to TBTREE16. Only limitations inherent to
TBTREE16 are discussed. DOS limits and practical limits like disk drive
capacity are not discussed. These limitations may change (become less
restrictive) in future additions.
Number of Data Files - No Limit
Number of Index Files Per Data File - No Limit
Size of Data File - MAXLONGINT (2,147,483,647) Logical or Physical
Records (whichever comes first) or disk space (the
real limit!!)
Size of Index File - MAXLONGINT Physical Records. The real restriction
is going to be stack size. I use a recursive
routine to traverse the BTrees. Recursion is
very efficient for this but has the problem that
each level keeps a couple of thousand bytes on
the stack. It is very important to use a large
stack size if indexes are going to be deep. Most
indexes will not be more than 4 or 5 levels deep
unless a HUGE number of records are involved or
the keys in the indexes are large. Large is
relative, but you need to try to keep entries
below 30 bytes or so. This really only affects
strings presently. I would like to know if
anyone is running into problems with this as I
can handle this iteratively rather than
recursively which will make the problem disappear
at the expense of some added coding complexity.
I recently did a test to see to see how critical
this is. I did a test using an index entry size
of 100 bytes (100 byte strings). I entered over
27000 records before running out of disk space
(took over one day on my archaic 4.77 MHz
computer!). I had somewhere in the area of 15
levels which is more than you should reasonably
have. I also used a 32K stack (64K being the
maximum allowed). Anyway, I couldn't blow it up
so I don't think you will either!
Size of Page Buffer - Limited by heap space. I have not used this
with any type of heap extender or other program
which uses other memory (extended, expanded) for
the heap. I would enjoy seeing feedback if
anyone tries it. If I need to do something to
facilitate use of a heap extender let me know
and I'll try to accommodate.
Number of Files Open At Once - Actual number set by user, but user
does not have to worry about keeping
track of files used by TBTREE16 since
it opens and closes files as required.
Size of Key (Index Entry) - 245 bytes. Versions prior to 1.6 had a
22
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
higher number. However, no version ever
worked for values higher than 245. I
apparently missed that during testing.
245 is still much higher than you can
practically use anyway.
Size of Data Record - 65520 bytes since this is the limit for
structured (composite) types in Turbo Pascal.
Max File Name Length - 80 characters including drive designator, path,
file name and extension.
Logical List Limit - Unlimited
Number Of Sort Fields - Unlimited
23
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
Future Enhancements / Directions
One of the advantages of using the shareware concept to distribute this
software is that I can be very flexible and can tailor this product to the
needs of users. I am soliciting as much feedback as possible. I need to know
what is really important to users. I hope to be able to accomplish the
following in the future:
I may change the delete routines in the BTREE.PAS unit to force the
combining of nodes when each is half full or less. Presently, a node is
not deleted until the last entry is deleted. This will slightly increase
performance on queries and will reduce the size of the indexes if deletes
are done often, but will otherwise be transparent.
I may pursue adding multitasking capabilities.
The most significant change for the future will be an abandonment of Turbo
Pascal 4.0 support. Version 5.0 has some nice features which I can not
use yet because I wanted this version to support Turbo Pascal 4.0. If I
do not get negative feedback, future versions will support only version 5
(and later when there is one) of Turbo Pascal. I would like to get
feedback concerning peoples feelings on this (registered and nonregistered
alike). I also need feedback to determine if most people are moving to
version 5.5. Mine is still in the mail, so I need to see what good
things that will do for me before I make any determinations.
Version Release Notes
Version 1.1
1. Added capability to do a search of an index on a range of values.
2. Added capability to do partial string match searches (strings only).
Added one routine to the BTREE unit and several routines to the COMPARE
unit and one type declaration to the NUMBERS unit to facilitate this
change.
3. Corrected bug in BTREE unit. Documentation stated that duplicate entries
for a given logical record number and field value were prevented from
being entered. This was not previously the case although that is now
true.
4. Added a routine (LOGICAL unit) to build a list of all current records for
a file (all valid ie not deleted records). This is essential for
rebuilding indexes or accessing data without an index. In actuality, a
user could have built this routine, but it was not intuitively obvious.
5. Added the capability to delete an entry from an existing logical record
list.
6. Divided BTREE.PAS into BTREE.PAS and two include files : BTREE1.INC and
BTREE2.INC.
7. Added several more examples.
Version 1.2
1. Added Sorting capabilities.
24
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
2. Added several routines to LRECLIST.PAS to facilitate sorting. I also made
a couple of internal cosmetic changes to LRECLIST.PAS.
3. Added one example.
4. Removed CalculateRecordSize and several unused data types from
LOGICAL.PAS. These were for future projects and now are now targeted for
TBASE.
Version 1.3
1. Added SETOPS unit which provides Intersection, Union and Difference.
2. Added one example.
3. MATH.PAS redone in assemble language. See listing for details and credit
to the author.
4. Added one routine (FindLrInList) to LRECLIST.PAS.
5. In LOGICAL.PAS changed MAXDATASIZE to 65520 since this is the maximum size
for a structured type in Turbo Pascal 4.0.
6. TP4PRINT.PAS now prints the file creation date as part of the header.
7. Added NumberOfBTreeLevels routine to the BTREE unit.
8. Added StoreNewLogicalRecord routine to LOGICAL.PAS. This makes it easier
to store new records. You no longer need to find the next available
logical record number using the FirstUnUsedRecord routine located in
FILES.PAS. StoreNewLogicalRecord does this for you and returns the
associated logical record number.
Version 1.4
1. Biggest change is a redesign of the data and index files. Both file types
have a parameter record (previously only the index files did). There is
more information contained in the parameter records, including a place for
you to store 256 bytes of user data for your own requirements. I also did
away with the bitmap files. Bitmaps are contained in the individual
files. Adds some speed and cuts in half the number of files floating
around. However, all files that were created using previous versions will
not work with this new version of software. You will need to rebuild all
data and index files. I realize that this will cause some problems. The
index files will be easy, as EXAM7.PAS shows you how. However, the data
files will be tough. The easiest method will be to write a small
application using the old software and dump the data to to a flat file.
Then you can create an application with the new version and read in the
old data to the new files. Make sure you have a backup of your old files
before doing anything!!!
2. Many routine calls have been changed because of the above changes.
Specifically, stores and retrievals of logical records no longer require
the size of the data record to be passed in. However, the size is needed
when the file is created. Specific routines were developed for creating
and deleting the data files and other routines were developed for creating
25
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
and deleting the index files.
3. The ByteData unit was added, although at this point it is of limited
value. It is really for work I am doing on TBASE. It is available for
your use, however.
4. Corrected several errors. Specifically, logical record sizes > 512 bytes
did not work properly and the previous size limit for an index entry was
really 255. This has been corrected. The limit is now 494 bytes as
specified in the documentation.
5. Added a second way to perform retrievals of logical record numbers from an
index. See the BTREE unit for details.
6. Added GetSubstringAtPosition routine.
7. GetSubstringFromBTree routine did not work as advertised because of an
error in the COMPARE unit which has been fixed.
8. Added ByteData unit
9. Moved ValueType to NUMBERS unit and added BYTEARRAYVALUE
10. Corrected EndsWithSubstring routine
11. Added ContainsSubstringAtPosition routine
12. Now use {$IFOPT N+} conditional compilation directive to handle 8087 types
13. All INDEX and DATA files now contain the version used to create the file.
This is contained in the index.
14. Changed definition for RecordNumber and FileTypes type
15. Added UserDataArray type
16. Added FileExists routine
17. Redesigned entire FILES unit. See unit for details.
18. Many changes to LOGICAL unit. See unit for details.
19. Deleted PowerInt from MATH unit.
20. Changed SortList procedure call so that size is no longer required.
21. Added MAXSTRINGLENGTH constant and StringLengthRange type to STRINGS unit
22. Redesigned TIME unit. See unit for details.
23. Many changes in BTREE unit. See unit for details.
24. Updated the examples to implement the changes. One note - I had an error
in the MyExit routine in the examples. I forgot to to reinstall the value
of ExitProc which essentially broke the chain for calling exit routines.
Examples are now correct.
Version 1.5
26
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
1. Fixed documentation errors in the examples.
2. Cleared up documentation in the FILEBUFF unit.
3. Added FASTMOVE unit to speed up moves.
4. Made internal changes to use Inc and Dec to increase performance
5. Completely reworked the SORT unit. Performance has increased threefold.
6. Added ERROR unit to trap I/O errors. See error unit for details.
7. Changed the PAGE unit and FILEBUFF units to keep heap allocation errors
from ever occurring. Units now reserve a minimum amount of space during
initialization to ensure that there is always at least some minimum amount
of space available. See units for details.
8. Added FindLrNumInBTree and UsingCursorAndGEValueGetLr routine to BTREE
unit.
9. Changed way records are added as data and index files grow. As files grow
larger, the number of records added at one time increases. This will
speed up the insert process.
10. Reworked FILEBUFF unit to handle Text files better.
11. Changed problems in LOGICAL unit associated with calling routines with
logical record number equal to 0. See LOGICAL unit for details.
12. Added LastDataRecord to LOGICAL unit.
13. Added CopyLrList routine to LRECLIST unit.
14. Changed definition of SortList routine. See SORT unit for details.
15. Added Asciiz2Str routine to STRINGS unit.
16. Rewrote TotalString routine to increase performance.
17. Parts of TIME unit redone using INLINE statements to increase performance.
18. Fixed error in GetEqualsFromBTree routine which would cause random
failures on queries where the Condition was EQ
19. Fixed error in GetRangeFromBTree routine which caused problems when the
upper range and lower range were the same
Version 1.6
1. Fixed an error which caused the wrong entries to be deleted from an index
when using DeleteValueFromBTree
2. Changed the value of MAXVALSIZE to 245 which is the correct limit on the
maximum size of an index entry
3. Added VLRDATA to the FileTypes enumerated type
27
TBTREE16 Copyright (c) 1988,1989 Dean H. Farwell II
4. Added FetchFileType routine
5. Corrected error in LOGICAL unit which caused an error when dealing with
records larger than 512 bytes
6. Added VLOGICAL unit. This unit is an alternative to the LOGICAL unit and
it handles variable length records
7. Fixed error in the SORT unit which would lead to an infinite loop when the
heap became full and memory was being allocated for a sort
8. Changed SORT unit to facilitate sorting variable length record data files
9. Fixed an error in the SORT unit which caused the sort field list to be
modified upon completion of the sort. The sort field list is now returned
unmodified which allows it to be used again to sort on the same fields
10. Fixed a major error in the SORT unit which caused logical record numbers
to be lost from a logical record list when that list was sorted
Turbo Pascal is a registered trademark of Borland International
CompuServe is a registered trademark of CompuServe Incorporated
28