DISAM3
Dynamic Indexed Sequential Access Method
Version 3.5
Apr. 20, 1990
Written by:
Robert Pearce
2326 W. Cabana
Mesa, AZ. 85202
(602) 835-9189
LICENSE AGREEMENT
The author, Robert Pearce, grants you without charge
the right to reproduce, distribute and use copies of this
"shareware" version of the DISAM file handler software product
(including the on disk documentation), on the express condition
that you do not receive any payment, commercial benefit,
other consideration for such reproduction or distribution, or
change this license agreement.
Support from users such as yourself enables the author
to develop additional features and future versions of the
DISAM product. Your contribution of $10.00 would be greatly
appreciated and should be mailed to:
Robert Pearce
2325 W. Cabana
Mesa, AZ. 85202
By sending your contribution, along with your name and
address you will become a registered user of DISAM and
eligible to receive technical support, announcements of new
releases and fixes to problems as they become known.
THIS PRODUCT IS LICENSED WITHOUT ANY WARRANTY OF
MERCHANTABILITY, FITNESS OF PARTICULAR PURPOSE, PERFORMANCE,
OR OTHERWISE; ALL WARRANTIES ARE DISCLAIMED. BY USING THE
DISAM PRODUCT, YOU AGREE THAT THE AUTHOR WILL NOT BE LIABLE TO
YOU OR ANY THIRD PARTY FOR ANY USE OF (OR INABILITY TO USE)
THIS SOFTWARE, OR FOR ANY DAMAGES WHATSOEVER.
This software was tested as follows:
DISAM and DFH3 were assembled using Microsoft MASM V5.1.
User programs were compiled using Microsoft GW-BASIC
Compiler V3.2 as sold by Zenith Data Systems, and assembler
programs were also assembled using Microsoft MASM V5.1.
Introduction
In the beginning, there was punched paper tape and fly
readers. Records were stored and read sequentially. Then came
magnetic tape drives for mass storage, but records were still
stored and read sequentially.
Records were blocked by one and it took 6 seconds to
read or write them. Then came blocked record formats and many
records could be gotten for that 6 seconds. I/O was getting
faster but it was still sequential.
When the diskette came along, sequential access was
preserved and random access was added. Random access provides
a way to access a specific record in a file. However, there
is no intelligence built into the random access file handler
and records are fixed length.
The main difference between DISAM and Random access is
the way data is accessed.
Random access reads and writes records by relative numbered
data blocks. Each block is fixed length without regard to how
much data is stored in it.
DISAM accesses records by an assigned character key.
Record lengths are variable and can be up to the GWBASIC limit
of 255 bytes. Records can be longer if DISAM is accessed
through an assembler or "C" interface.
Random access will read records sequentially in block
number order.
DISAM access will read records sequentially in
ascending key order. (0-9,A-Z,a-z)
Random access allows fields to be defined within the
data block via the FIELD command.
DISAM relies on the user to provide field delimiters
and a parsing routine. Possible delimiters could be "\", "^",
or "*" or any other special character not being used as data.
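The delimiter approach described above can be modeled in a few lines. This is a minimal Python sketch, not DISAM code: the helper names and the choice of "^" are assumptions made for illustration, and DISAM itself only ever sees the finished record string.

```python
# Hypothetical helpers; DISAM stores the record string and does no field parsing.
DELIM = "^"  # assumed delimiter; it must never occur inside field data


def build_record(fields):
    """Join field values into one record, rejecting data that contains the delimiter."""
    for value in fields:
        if DELIM in value:
            raise ValueError("field contains the delimiter: %r" % value)
    return DELIM.join(fields)


def parse_record(record):
    """Split a record back into its list of fields."""
    return record.split(DELIM)


record = build_record(["SMITH", "JOHN", "MESA"])
print(record)
print(parse_record(record))
```

Because the record key sits at a fixed offset, a real layout would normally keep the key field at a fixed width in front of the delimited fields.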
DISAM
DISAM is a resident file handler initially designed
for use with GWBASIC. The need for this type of file
handler arose when ASCII data files were required to be
accessed by a specific alphanumeric key.
DISAM stores records in key ascending order regardless
of data entry. The record length is limited by GWBASIC's
limit of 255 bytes, although DISAM can handle much longer
records. The size of the file is limited to the amount of free
space on the recording media.
The record key has a maximum length of 125 bytes and
can be anywhere within the first 255 bytes of the record. The
key's position in the record has no effect on the response
time of the file handler.
Because of the nature of the file handler, there is a
special utility, DISAM.COM, that is used to define the file.
There is also a special set of GWBASIC code statements used to
access the file handler. Other than the above restrictions, a
DISAM file can be copied, renamed, or deleted through the
normal MS-DOS commands.
DISAM records can be retrieved either sequentially, by
full key or by a shortened (generic) key. You might use a
generic key to set the position in the file and then read
sequentially until the generic key changes, or you might read
a specific record by full key, or just read the entire file
sequentially. DISAM is designed for all of these.
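The generic-key pattern (position on a shortened key, then read sequentially until the generic part changes) can be modeled on a sorted list of keys. This is a Python sketch of the access pattern only; the function name is hypothetical and nothing here calls the file handler.

```python
import bisect


def generic_read(sorted_keys, generic_key):
    """Return every key that starts with generic_key, in ascending key order."""
    # Position at the first key >= the generic key (models the generic keyed read).
    start = bisect.bisect_left(sorted_keys, generic_key)
    matches = []
    for key in sorted_keys[start:]:  # models the following sequential reads
        if not key.startswith(generic_key):
            break  # the generic part of the key has changed, so stop
        matches.append(key)
    return matches


keys = sorted(["ADAMS", "BAKER", "BAKEWELL", "BALL", "CLARK"])
print(generic_read(keys, "BAK"))
```

Python compares strings by character code, which gives the same 0-9, A-Z, a-z ordering the document describes for ASCII keys.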
Adding records to DISAM can be done in any order. The
file handler will insert them in their proper place. An
existing record may be replaced with a record of a different
length. The file handler will make additional room for the
replaced record or free up the extra space if the new record
is shorter than the original.
The file handler buffers its data blocks so that if
records being accessed or added fall within the same block,
unnecessary I/O is not performed. However, if the user's
program should fail, recently added records may be lost. To
provide for additional data integrity, there is an "Immediate"
mode that can be invoked to force the file handler to write
the data buffer after each record is added, changed or
deleted. This is a great help during program development but
should be used sparingly after the program is completed
because of the increased time it takes to write the data
buffers.
Installation
There are no special installation needs for DISAM. The
module, DFH3.COM, is executed. Its execution will make it
resident in low memory and set up an entry pointer at 12:0
Hex. The user has the opportunity at this time to provide
multiple file access buffers. Based on using the index and
data block defaults, one file buffer is needed to access each
DISAM file. One is provided by default. If there is to be
more than one DISAM file open at the same time, a buffer must
be provided for each file. The limit is 5 DISAM files opened
at the same time. E.g., DFH3 3 will provide buffers for three
DISAM files.
If there is a second attempt to load the file handler
the user will get a message that it is already loaded. If the
address, 12:0, is being used by another module, then the user
will get a message for that also and DISAM cannot be loaded.
This address is also known as INT 48H and the code uses 5
bytes. This also includes 1 byte of INT 49H.
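The relationship between the entry pointer at 12:0 and INT 48H follows from 8086 real-mode addressing. The Python sketch below only checks that arithmetic; the function names are illustrative.

```python
def physical_address(segment, offset):
    """8086 real-mode physical address: segment * 16 + offset."""
    return segment * 16 + offset


def vector_address(interrupt):
    """Each interrupt vector occupies 4 bytes, starting at physical address 0."""
    return interrupt * 4


entry = physical_address(0x0012, 0x0000)
print(hex(entry))                     # 0x120
print(entry == vector_address(0x48))  # True

# A 5-byte far JMP occupies 0x120 through 0x124, so its last byte
# is the first byte of the INT 49H vector.
print(entry + 5 - 1 == vector_address(0x49))  # True
```

The value 234 that the GWBASIC check expects at the entry point is 0EAH, the opcode of the 8086 far JMP instruction.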
More on the DFH3 buffers. The buffers are made up of
four dynamic parts. Based on the defaults this amounts to
about 3K per buffer.
     128 = File Control Block
     512 = File Index Block
    2048 = File Data Block
     256 = User Add Record Buffer
    ----
    2944 = Default Buffer size
If your file needs more than this you can define more
buffers. If you do and you need to have more than one DISAM
file open at the same time then use the odd numbered ones.
Define 4 buffers and use 1 and 3. This will give you 6K of
buffer space. If you really need a larger buffer size, you can
"zap" the space allocation size using PC-ZAP and file
DFH3.Z03.
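The memory cost of the resident buffers can be estimated from the four default parts listed above. This is a Python sketch of the arithmetic only; the names are made up for illustration and the figures are the documented defaults.

```python
# Default sizes in bytes of the four parts of one DFH3 buffer.
DEFAULT_PARTS = {
    "file control block": 128,
    "file index block": 512,
    "file data block": 2048,
    "user add record buffer": 256,
}


def buffer_size(parts=DEFAULT_PARTS):
    """Size in bytes of one file buffer."""
    return sum(parts.values())


def resident_buffer_bytes(buffer_count):
    """Total buffer space for DFH3 loaded with buffer_count buffers (1 to 5)."""
    if not 1 <= buffer_count <= 5:
        raise ValueError("DFH3 supports 1 to 5 buffers")
    return buffer_count * buffer_size()


print(buffer_size())             # 2944
print(resident_buffer_bytes(3))  # 8832
```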
The following code is placed in the GWBASIC program.
nnn REM Make sure that the file handler is in memory.
nnn DEF SEG=&H0012 'This sets the segment address
nnn X=PEEK(&H0) 'Get the code byte at the entry
nnn DEF SEG 'Restore the data segment to GWBASIC
nnn IF X<>234 THEN STOP 'Expect to find a long JMP inst.
'If it is not, DFH3 is not loaded.
nnn REM Set up variables for file handler calls
nnn F$="x[,n]" ' "x" is the file handler command
nnn R$="y" ' "y" depends on the command
nnn DEF SEG=&H0012 'Set the segment address
nnn DFH3=&h0 'Set offset address
nnn CALL ABSOLUTE (F$,R$,DFH3) 'Compiled BASIC call, or
nnn CALL DFH3 (F$,R$) 'Interpreted BASIC call (use one form only)
nnn IF LEN(R$)=1 .... 'Return code from file handler
'else R$=data record
COMMANDING DISAM
OPEN
F$="O[,n]" (for all programs except Quick-BASIC)
F$="Q[,n]" (for Quick-BASIC programs)
R$="filename.ext"+""
The open command opens the file and reads the control
block, the first index block and the first data block into the
file handler's buffer. Quick-BASIC does not allow the size in
the string descriptor block to be changed, presumably for
chaining reasons. If you use "O" to open a file from a
Quick-BASIC program you will get a string corrupt message. The
"Q" open sets an internal flag that stops the file handler from
modifying the length of the string passed back to the calling
program. This means that if you send 80 bytes to DFH3 your
response will be 80 bytes, so you must test the first character
of the response for a return-code. This applies to the "Q" open
only. E.g. nnn IF LEFT$(R$,1)="0" GOTO ...
RETURN-CODES
"0" Normal response to the open.
"5" The file is currently in use by another program and
the share option is set to "N". No sharing allowed.
"7" The file was not found.
"8" The buffer specified was found in use. This is
because it has been opened by another file or the
program using the buffer ABENDed leaving it open.
Use the FREE command to close the buffer if it was
left open by a program ABEND.
"9" This is a general error usually accompanied by a
register dump and error address.
CLOSE
F$="C[,n]"
R$=" "
The close command writes the file buffers to disk, if
required, and closes the file to the system.
RETURN-CODES
"0" Normal response.
"9" An invalid buffer number was used.
ADD
F$="A[,n]"
R$="Data record to be added to the DISAM file"
The add command checks for an existing record of the
same key. If it is not found, the record is added to the DISAM
file.
RETURN-CODES
"0" Normal response.
"2" Record already exists.
"4" Record length invalid. The record length must be
longer than the key offset plus the key length and
shorter than Data Block Size - 10.
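The length rule behind return-code "4" can be checked before calling the file handler. The Python sketch below is a model under stated assumptions: the function name is hypothetical, and both limits are treated as inclusive, following the DEFine description (offset plus key length is the minimum record length, and a 2048-byte data block allows records up to 2038 bytes). The exact boundary behavior of DFH3 is not confirmed by the text.

```python
def record_length_ok(record, key_offset, key_length, data_block_size=2048):
    """Model of the length check behind return-code "4" (bounds assumed inclusive)."""
    min_len = key_offset + key_length  # the record must contain the whole key
    max_len = data_block_size - 10     # assumed 10 bytes of data block overhead
    return min_len <= len(record) <= max_len


print(record_length_ok("SMITH      JOHN", key_offset=0, key_length=11))  # True
print(record_length_ok("SMITH", key_offset=0, key_length=11))            # False
```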
DELETE
F$="D[,n]"
R$="full-key"
The delete command requires a full key. Generic deletes
are not permitted. The delete removes the record and makes
room in the data block for another record.
RETURN-CODES
"0" Normal response.
"1" Record not found.
GET
F$="G[,n]"
R$="full-key"+SPACE$(255-keylength) 'Keyed read
R$="generic-key"+SPACE$(255-keylength) 'generic keyed read
R$=SPACE$(255) 'sequential read
Space must be provided for the record to be inserted
into the GWBASIC string work space. The file handler will not
corrupt the basic string space. Only as much space sent to the
file handler will be used. 255 bytes is the max. for a GWBASIC
program. The file handler will return a shorter record and
make the necessary changes to the string descriptor. It will
not use more space than provided by the calling program.
Quick-BASIC users, be careful here. You must know how
long the record is and pass that many bytes to DFH3.
Return-codes of "1" and "3" will also be "record length" long.
Be aware of records that start with the above numerics.
RETURN-CODES
"<------data record------>" Normal response.
"1" Record not found.
"3" End of file. (sequential read)
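Telling a return-code apart from a data record differs between the "O" and "Q" opens. The Python sketch below models that decision; the function name is hypothetical and the logic is only what the text describes (a one-byte reply for normal programs, a first-character test for Quick-BASIC).

```python
GET_RETURN_CODES = {"1": "record not found", "3": "end of file"}


def classify_get_reply(reply, quick_basic=False):
    """Return ("code", c) for a return-code or ("data", record) for a record."""
    if quick_basic:
        # "Q" open: the string length never changes, so only the first
        # character can signal a return-code.
        first = reply[:1]
        if first in GET_RETURN_CODES:
            return ("code", first)
        return ("data", reply)
    # "O" open: the handler shortens the string, so a one-byte reply is a code.
    if len(reply) == 1:
        return ("code", reply)
    return ("data", reply)


print(classify_get_reply("3"))
print(classify_get_reply("SMITH      JOHN"))
print(classify_get_reply("1234 MAIN ST", quick_basic=True))
```

The last call shows the ambiguity the text warns about: under the "Q" open, a record that begins with "1" or "3" cannot be told apart from a return-code by this test.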
PUT
F$="P[,n]"
R$="Existing record that is to be changed"
The put command searches out the existing record on
the file, deletes it and adds the replacement record. This
way there is no need for the existing and the new record to
be the same length.
RETURN-CODES
"0" Normal response.
"1" Record not found.
"4" Record length invalid. The record length must be
longer than the key offset plus the key length and
shorter than Data Block Size - 10.
IMMEDIATE
F$="I[,n]"
R$=" "
The immediate command sets a file handler flag to
update the data block immediately after each update. This will
provide maximum file integrity and maximum response time. This
command is provided to maintain the DISAM file while the
calling program is being developed and should be removed when
all program fixes are in place. The flag is reset when the
file is closed.
RETURN-CODE
"0" Only response
FREE
F$="F[,n]"
R$=" "
The free command makes the specified buffer available
for use. This command is provided to free a buffer after a
program has ABENDed, leaving the file and the buffer open.
RETURN-CODE
"0" Normal response
THE DISAM UTILITY PROGRAM
DISAM.COM is the utility program that is used to set up
a DISAM file. When the program is started the screen will
display:
DISAM UTILITY PROGRAM Ver 3.n
Written by R. Pearce. dd-mmm-yy
Enter Function:
CATalog listing of DISAM file statistics
DEFine a new DISAM file
DELete a file
FREe DISAM file handler buffer space
LOAd a DISAM file from a sorted source
UNLoad all or part of a DISAM file
VERify the condition of a DISAM file
END
The following defines the functions of DISAM:
DEFine:
This function defines the DISAM file. You will be prompted
for the file name. A check will be made for an existing DISAM
file. Then you will be prompted for the key length and the key
offset.
The key length can be from 1 to 125 characters in length.
The key offset is a place within the record where the key
starts. The first position in the record is offset 0 (zero). The
offset plus the key length will establish the minimum record
length.
Normally a key length of 11 to 15 characters is adequate
for most data retrieval.
After the key length and offsets are entered you will be
prompted to specify the Index Block Size (IBS), or take the
default of 512 bytes.
The minimum IBS is four times the sum of the key length
(FKL) and 4, plus eight: IBS = (4*(FKL+4))+8. If you specify a
smaller size, DISAM will calculate the minimum size for you.
Next you will be prompted for the Data Block Size (DBS), or
you may take the default of 2048 bytes.
The minimum size of the DBS is equal to the IBS. The
maximum size to stay within the buffer default is 2048. This will
allow you to write a record up to 2038 bytes long. DISAM will
round up the DBS to the next multiple of the IBS if you specify a
size that is not a multiple of the IBS.
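The block size rules above can be worked through numerically. This Python sketch models the documented formulas only; the function names are hypothetical and the utility's actual prompts and internals are not shown.

```python
def minimum_ibs(key_length):
    """Minimum Index Block Size: IBS = (4*(FKL+4))+8."""
    if not 1 <= key_length <= 125:
        raise ValueError("key length must be 1 to 125")
    return 4 * (key_length + 4) + 8


def effective_ibs(requested, key_length):
    """A request below the minimum is raised to the minimum."""
    return max(requested, minimum_ibs(key_length))


def effective_dbs(requested, ibs):
    """The DBS is at least the IBS and is rounded up to a multiple of the IBS."""
    dbs = max(requested, ibs)
    return -(-dbs // ibs) * ibs  # ceiling division, then scale back up


print(minimum_ibs(11))           # 68
print(minimum_ibs(125))          # 524
print(effective_dbs(2000, 512))  # 2048
```

With the maximum key length of 125 the minimum IBS is 524 bytes, which is larger than the 512-byte default.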
Next you will be prompted for the load free space in whole
percentage points. This is the amount of free space to be kept
available in each data block during the load of the DISAM file.
At this point you have to know what the DISAM file is going to be
used for.
If there are no data records to be loaded after the define
of the DISAM file, then the value is not used and it does not
matter what you enter. A C/R will provide 0%. If this is a
read-mostly reference file then you could use 0 to 2%. If you are
going to be changing and adding data on a regular basis then 10 to
20 percent would be good.
The intent is to provide file growth space before it
becomes necessary to split the data blocks. A file load function
with 0% load free space will pack the records into the smallest
space, so that any record adds will cause the data blocks to be
split and will increase file response time for adds. Fifty percent
load free space will cause the load function to double the size of
the DISAM file. Records added on a random basis will then not cause
as many block splits, hence faster file response time. The maximum
value is 50%.
Lastly you will be asked if this file may be shared with
other programs. You must answer "Y" or "N". In such multitasking
environments as DESQview, several virtual 8086 windows may be
opened, each running tasks that may include DISAM files. If you
specify "N" only one program may access a DISAM file at a time. If
you specify "Y" then no data integrity is ensured. Several
programs can update the same DISAM file, even the same record. It
is the user's responsibility to ensure that if "Y" is specified,
only one program updates and all other programs only read the
DISAM file.
With the file definition complete a CATalog display will be
presented. Take note of the maximum record length on this display.
This is the value that record lengths are checked against when
records are added.
LOAd:
The load function loads the DISAM file from a sequential file. A
program such as QSORT may be used to sort the sequential file
prior to the load. DISAM expects the input file to be sorted in
key sequence.
You will be prompted for the sequential file name and the
DISAM file name.
At the end of the load the number of records loaded will be
displayed.
UNLoad:
The unload function copies the DISAM file to a sequential
file. You will be prompted for the DISAM file name, the sequential
file name, the number of records to unload (default is ALL), the
number of records to skip (default is NONE), and the starting
DISAM record key (default is NONE). The skip count and the
starting key are mutually exclusive. Use one or the other. Not
both.
The DISAM file will be unloaded and the unload count will
be displayed.
The sequential file may also be the printer (PRN) or the
display (CON).
DELete:
The delete function allows you to delete a file while using
the DISAM utility program. It is used during repacking of the
DISAM file.
CATalog:
The catalog function displays the current information about
the DISAM file. You will be prompted for the DISAM file name.
Below is a display of the pertinent information about the DISAM
file.
1) the key length
2) the key offset
3) control block size
4) index block size
5) data block size
6) next available block number
7) the maximum record length allowed
8) number of index records
(same as the number of data blocks)
9) number of index splits
10) number of data records
11) number of data splits
12) percent free space at load time
13) the share option
VERify:
The verify command is used to close files left open by
program ABENDs. You will be prompted for the DISAM file name.
Verify will display a message indicating how the file was found
and the catalog record count. It then will read the file
sequentially and display the actual record count.
If there is a difference between the two record counts then
records were lost as a result of the program ABEND. Use the
"Immediate" command while debugging the program to prevent record
losses.
FREe:
The free command frees all 5 of the file handler's
buffers. This command is useful during program development to
clean up the buffers used by DFH3 after program ABENDs.
THE COMPRESS PROCEDURE
Because of the way the records are added, described above,
the DISAM file must be compressed from time to time. The following is
the recommended procedure.
run DISAM
select CAT to get the file key length and offset
and the Index and Data buffer sizes.
select UNL to unload the DISAM file to a seq. file
select DEL to delete the DISAM file
select DEF to redefine the DISAM file
select LOA to reload the DISAM file
select CAT to verify record counts
select DEL to delete the seq. file
select END
Note:
If you are an assembler programmer, there is a subroutine
included in this package that will allow you to access DISAM from
the program as easily as:
mov si,offset datrec ;point to record key
call dsmget ;get the record
datrec db max_rec_len dup (?) ;record buffer
ZAPS......
Zap fixes documented and applied to this product use the
PC-SIG software "PC-ZAP" located on disk #355. PC-ZAP is available
at your local authorized PC-SIG dealer.
SYSTEM ERROR messages:
You may get a system error message due to an internal
error. It will be in the form:
System Error occurred in DFH3 near 027D
AX=0005 BX=.. CX=.. DX=.. SI=.. DI=.. CS=264B DS=2724 ES=2830
AX= INT 21 RETURN-CODE
CS= SEGMENT ADDRESS FOR DFH3
DS= SEGMENT ADDRESS FOR THE BUFFER SPECIFIED
ES= SEGMENT ADDRESS OF THE CALLING PROGRAM (e.g. GWBASIC)
In the above message, 027D is the address of the open file
interrupt +3. AX=0005 is an access denied error.
SOME INTERNAL THOUGHTS
DISAM uses three buffers within the file handler. The first
is the control block buffer. It contains pointers, counters, and
the necessary things to manipulate the other two buffers. This
buffer is 128 bytes long and is built when the file is defined.
The second buffer is the index block buffer. This buffer is
by default, 512 bytes in length and contains key records that
point to data blocks. There is also a pointer to the next index
block. The first index block is numbered block zero (0).
The third buffer is the data block buffer. It is by
default, 2048 bytes in length and contains the data records. The
data block also has a pointer to the next data block. The first
data block is numbered block 1.
Records are stored in the data block in ascending key
order. The last record in the data block has its key stored in the
index block with a pointer to the data block.
When a record is added to the data block, the free space in
the data block is used. When the free space is used up, the data
block is split in half. A new index record is created pointing to
the new data block. The new record is then added to one of the
data blocks. The other remains half used. Over time the DISAM
file will grow bigger than necessary and will have to be
compressed. There is a process using the utility program to
compress the file.
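The split behavior described above can be modeled with sorted lists standing in for data blocks. This is a simplified Python sketch, not the DFH3 algorithm: the capacity is counted in records rather than bytes, the index is rebuilt from the last key of each block, duplicate keys are not checked, and all names are assumptions.

```python
import bisect

CAPACITY = 4  # assumed records per block; real data blocks are sized in bytes


def add_record(blocks, key):
    """Insert key into blocks (sorted key lists in ascending order).

    Returns True when the target block was full and had to be split.
    """
    # Model of the index block: the last (highest) key of each data block.
    index = [block[-1] for block in blocks if block]
    pos = min(bisect.bisect_left(index, key), len(blocks) - 1)
    split = False
    if len(blocks[pos]) >= CAPACITY:
        half = len(blocks[pos]) // 2
        lower, upper = blocks[pos][:half], blocks[pos][half:]
        blocks[pos:pos + 1] = [lower, upper]  # new block gets its own index entry
        if key > lower[-1]:
            pos += 1                          # the record belongs in the upper half
        split = True
    bisect.insort(blocks[pos], key)           # keep the block in ascending key order
    return split


blocks = [[]]
for k in ["D", "A", "C", "B", "E"]:
    if add_record(blocks, k):
        print("split while adding", k)
print(blocks)
```

After the split one block holds two records and the other three, which illustrates why a file that grows by random adds ends up with partly used blocks and eventually benefits from the compress procedure.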
The addition of new records is the most costly in response
time. Normally a record is added by simply rearranging the data
block in the buffer. No I/O required. When a data block split
occurs there are four writes performed. If a data block split
causes an index block split, six writes are required before the
record is added. After a split occurs, all blocks will have been
written to the file.
An example using the defaults: if the key length of a file
is 11 bytes, the index block can hold 33 data block keys. If the
average length of a data record is 80 bytes, each data block can
hold 24 records. 24 x 33 = 792. Since the index block is already
in the buffer, with 1 I/O you have access to any one of 792
80-byte records.
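The figures in this example can be reproduced with a short calculation. The Python sketch below rests on an assumption inferred from the minimum IBS formula, IBS = (4*(FKL+4))+8: each index entry takes the key length plus 4 bytes, and each index block carries 8 bytes of overhead. The 24 records per data block is taken from the text rather than derived, and the function names are illustrative.

```python
def keys_per_index_block(key_length, ibs=512):
    """Index entries per block, assuming key + 4 bytes each and 8 bytes overhead."""
    return (ibs - 8) // (key_length + 4)


def records_reachable(key_length, records_per_data_block, ibs=512):
    """Records reachable with one data block read from a single index block."""
    return keys_per_index_block(key_length, ibs) * records_per_data_block


print(keys_per_index_block(11))   # 33
print(records_reachable(11, 24))  # 792
```

Under the same assumption, an index block of exactly the minimum size holds four entries, which matches the factor of four in the formula.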