Text File | 1989-10-29 | 19KB | 470 lines
CHAPTER 6. SEQUENTIAL FILES
6.1. SEQUENTIAL FILES IN F-PC
More than 90% of what is in the F-PC kernel came from F83, so much of what
you see should be somewhat familiar. Things that are different are
normally different because they need to be. There are a significant
number of things about F83 that should have been changed, but were not
for compatibility reasons. However, since BLOCK is not present (it is
available as a load-on utility in the file BLOCK.SEQ in the ZIMMER
archive) and is replaced by sequential files, you will have to adjust to
them. Nevertheless, many of the familiar file manipulation words from
F83 are still present, and those that are will work in a very similar if
not identical way. Some of these are:
OPEN, CLOSE, VIEW, OK, ED, EDIT, LOAD, LIST
Some words, like BLOCK and BUFFER, simply did not fit into the new scheme
of things in a logical manner, and so were omitted. To load an entire
file, use the sequence
FLOAD <filename>
Scanning through a file is best done with VIEW, although of course the
file must have already been loaded. To scan through a file which has not
been loaded, you can use LIST, which lists from a line number, rather
than from a block, but the following sequence is easier:
     OPEN <filename>     Open a file
     n LIST              List 23 lines from n'th line in the SED window
These words are very fast, using indices into the file to maintain their
line pointer information. The text of the entire file is stored in a
text segment, and the line number indices are stored in a line segment.
Similar to LIST, LOAD and EDIT also take a line number as input to start
the loading or editing function in the middle of the current file. The
line number before LIST, LOAD, or EDIT is optional. If the line number
is omitted, the listing or loading starts from line 1.
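As a short illustration of the line number argument, the following
commands could be typed at the F-PC prompt. SAMPLE.SEQ is a hypothetical
file name and 40 is an arbitrary line number; neither comes from the
F-PC distribution.

```forth
OPEN SAMPLE.SEQ         \ make SAMPLE.SEQ the current file (hypothetical name)
40 LIST                 \ list the file starting at line 40
40 LOAD                 \ compile from line 40 to the end of the file
```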
The compiler in F-PC has been tailored to be as fast as it could be made.
Since much of the functionality of WORD has been reworked into assembly
code for performance and all text data are in RAM, F-PC compiles
sequential text files much faster than standard F83 compiles BLOCKs. The
compilation speed averages 20,000 lines per minute on a 10 MHz AT.
Utilities are provided to convert large BLOCK files into sequential files
and vice versa. Converting a block file to a sequential file typically
saves more than 60% in mass storage.
6.2. HANDLES
According to the DOS manual, a handle is a 16-bit number used by the
operating system to identify a file or a device when it is opened or
created. In F-PC, a handle is a data structure which contains several
fields. Words have been defined to traverse to the various fields.
The data structure of a handle is shown in Figure 6.1.
Figure 6.1. The file handle array
Each of the words shown in Figure 6.1 after 'handle array' steps from
the address returned by the handle name, to the field indicated. The word
HANDLE followed by <name> creates and initializes the above structure.
When <name> is later used, it returns the address labeled +0 above.
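A minimal sketch of creating and using a handle from the F-PC prompt may
help here. It uses !HCB, HOPEN and HCLOSE as they are described in the
glossary of section 6.3. The handle name DATAFILE and the file name
SAMPLE.TXT are hypothetical. The text only says that the return code
comes from DOS, so the sketch prints it rather than interpreting it.

```forth
HANDLE DATAFILE                 \ create a handle array named DATAFILE
DATAFILE !HCB SAMPLE.TXT        \ store the file name in the handle
DATAFILE HOPEN .                \ try to open the file, print the return code
DATAFILE HCLOSE DROP            \ close it again, discarding the return code
```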
A file handle stack is used in F-PC to allow a file to load other files.
SEQHANDLE is a value which contains the pointer to the currently opened
file in the handle stack. A sequential line read word LINEREAD is
provided, which reads one line at a time from the file whose handle is in
SEQHANDLE, returning an address of a counted string which also includes
the CRLF characters at the end of the line. You will have to strip them
off if you don't want them. The LINEREAD word is used as follows:
     : sample        ( -- )          \ followed by <filename>
             open                    \ open a file
             0.0 seek                \ reset file pointer buffer
             begin
                     lineread        \ read a line, returns an address of counted $
                     dup c@          \ check for length,
                     0 <> while      \ while buffer contains something
                     cr count 2- type \ type line just read without the CRLF chars.
             repeat  drop            \ repeat till file empty.
             close ;                 \ close the file.
This simple example may seem complicated, but it really is easy to read
the lines of a sequential file, and writing is just as easy. The word
LINEREAD automatically buffers the reads from disk in a 4K buffer to
minimize the number of DOS calls performed. Lines up to 250 characters
can be read with LINEREAD. Longer lines or lines not terminated by a
CRLF will be split at 250 characters.
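The same read loop can serve other purposes. The following sketch, which
follows the pattern of the sample word above, counts the lines in a file
instead of typing them. The names LINE-TOTAL and COUNTLINES are
hypothetical and are not part of F-PC. A line longer than 250 characters
would be counted more than once, because LINEREAD splits it.

```forth
variable line-total             \ hypothetical counter, not part of F-PC

: countlines    ( -- )          \ followed by <filename>
        open                    \ open the file named next in the input
        0.0 seek                \ start reading at the beginning
        0 line-total !          \ clear the counter
        begin   lineread        \ a1 = counted string holding the next line
                c@ 0 <>         \ a non-zero length means a line was read
        while   1 line-total +! \ count it
        repeat
        close                   \ close the file
        cr line-total @ . ." lines" ;
```

Typing COUNTLINES followed by a file name would then print the number of
lines read from that file.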
For more detailed and comprehensive examples, please consult the files
SEQTOBLK.SEQ and BLKTOSEQ.SEQ. These two files contain code which
converts sequential files to block files and vice versa. You will find
useful word sequences which open and close files, read data from a file,
write data to a file, move the file pointer around, push and pop
the file stack, etc.
6.3. SEQUENTIAL FILE WORD SET
The file system interface in F-PC uses handles to talk to DOS, and
only the number representing the File ID is passed to DOS from Forth. To
make the interface as simple and clean as possible, you, the Forth
programmer, need never deal with the details of how this works. You only
need to know that handles are created with the word HANDLE. Handles are
arrays within which special file information is stored, and when a
handle's name is executed, it returns an address which is the address
that is passed to the handle file control words. Word definitions and
usage in this set of file handling words are as follows:
.FILES ( -- )
Print to the screen a list of the files currently open.
.LOADED ( -- )
Print a list of the files that have been loaded. This list is used to
locate the source file for a particular word that has been compiled.
">$ ( char-addr count -- counted-string )
Convert a string compiled as a 'quoted string' to a counted string, by
dropping the count and decrementing the char-addr by one.
$>HANDLE ( a1 handle-addr -- )
Move the counted string at a1 into the filename field in a handle.
$>EXT ( a1 handle-addr -- )
Move the counted string at a1 into the extension field of handle. The
extension string should not contain a decimal point, and should be
exactly three (3) characters long.
$HOPEN ( a1 -- return-code )
Close the current file if one is open. Move the counted string from
address a1 to the current handle on the handle stack. Return the result
code from DOS as return-code.
!HCB <filename> ( handle-addr -- )
Pick up text from the input stream with WORD, and place the name into the
handle array.
?DEF.EXT ( -- )
Conditionally apply the extension specified in the array DEFEXT to the
filename in handle. DEFEXT is a counted string, three characters long
plus a count.
CHARREAD ( -- c1 )
Read a character from the currently open file specified by SEQHANDLE.
Before using this word, you will need to initialize the sequential input
buffer to empty (to force a refill from the currently selected file) by
saying IBRESET. This will force a disk read on the next call to
CHARREAD, ensuring that you get data from the file you selected.
CLOSE ( -- )
Close the currently open file on SEQHANDLE. Move down one level on the
handle stack, so another file may be open after performing this
operation. Normally you will be able to operate on the handle in SHNDL
as an empty handle after performing CLOSE.
CLR-HCB ( handle-addr -- )
Clear the handle to nulls, and reset the handle identifier field to -1 to
indicate no file is open.
CURPOINTER ( handle-addr -- double-current-ptr )
Return the current 32-bit double pointer into the file specified by
handle.
DEFEXT ( -- a1 )
Return the address of the default file extension that will be applied to
any file to be opened, if no extension is specified in the filename when
the HOPEN occurs. The address a1 is the address of a 4 byte array
containing a count byte, and three extension bytes following. In no case
should a string longer than 3 characters plus count be placed in DEFEXT.
ENDFILE ( handle-addr -- double-end-ptr )
Return the double number double-end-ptr, which represents the length of
the file specified by handle. The file must already be open.
EXHREAD ( a1 n1 handle-addr segment -- n2 )
Read from file handle into the buffer specified by segment and address a1
for a length of bytes n1, and return n2 the length of bytes actually
read. The file must already be open. Useful for reading from a file
into memory other than Forth's Code Segment. A read from a file is
limited to 65535 bytes.
EXHWRITE ( a1 n1 handle-addr segment -- n2 )
Write from segment and address a1 for a length of n1 bytes to file
handle, and return n2, the length of bytes actually written. The file
must already be open. Useful for writing from memory other than Forth's
code segment to a file. A write to a file is limited to 65535 bytes.
FILE>TIB ( a1 -- )
Move the counted string filename from address a1 to the Terminal Input
Buffer (TIB), available for use by !HCB.
FLOAD <filename> ( -- )
Open and load the file specified by filename.
HANDLE <handlename> ( -- )
Create a handle with name <handlename>. When <handlename> is later
executed, it returns the address of the handle array created.
HANDLE>EXT ( handle-addr -- a1 )
Step from the handle address to the address of the file extension in the
handle, if an extension exists; else it steps to the null following the
filename. The address a1 will be the address of a decimal point
character if the file contains an extension, or the address of a null if
no extension was contained in the handle.
HCLOSE ( handle-addr -- return-code )
Close the file currently open on handle, and return the result code from
DOS as return-code.
HCREATE ( handle-addr -- return-code )
Create a file with the filename specified by the handle array, and return
the DOS result code as return-code.
HDELETE ( handle-addr -- return-code )
Delete the file whose name is specified by the handle, and return the
result code from DOS as return-code.
HIDELINES ( -- )
Specify that lines loaded with FLOAD NOT be displayed to the display
screen.
HOPEN ( handle-addr -- return-code )
Given the handle address, open the filename in it, and return the result
code from DOS as return-code.
HREAD ( a1 n1 handle-addr -- n2 )
Read from file handle into the buffer address a1 for length of bytes n1,
and return n2 the length of bytes actually read. The file must already
be open. A read from a file is limited to 65535 bytes.
HRENAME ( handle1 handle2 -- return-code )
Rename the filename specified by handle1 to be the name specified in
handle2, and return the DOS result code as return-code.
HWRITE ( a1 n1 handle-addr -- n2 )
Write from address a1 for length n1 bytes to file handle, and return n2
the length of bytes actually written. The file must already be open. A
write to a file is limited to 65535 bytes.
IBRESET ( -- )
Clear the input read buffer in preparation for reading a new file. This
is done by OPEN.
LINEREAD ( -- a1 )
Read one line from the current file, buffered by INBUF, which holds 4k
bytes. Return a1 the address of OUTBUF, a 256 byte buffer used to hold
lines read. When switching to a new file, use MOVEPOINTER to reset the
file pointer to the beginning of the file, and execute IBRESET so the
next LINEREAD will cause a read from the disk file. The read line length
is limited to 250 bytes. Lines read with LINEREAD are terminated with
LF=$0A.
LIST ( line-number -- )
List 18 lines starting at line-number from the currently open file.
Default to line 1 if line-number is omitted.
LOAD ( line-number -- )
Start loading the currently open file, at line-number. Used alone
without a line number, it defaults to load from line 1. Load through the
end of the file or until \S if no errors are encountered.
MOVEPOINTER ( double-offset handle-addr -- )
Move the file pointer of the file on handle to the offset location
specified by double-offset. The file must already be open.
PATHSET ( handle -- f1 )
Check the file contained in handle. If it does not contain a path, then
apply the current drive and path to the handle. Return f1 FALSE if it
succeeded; else TRUE if it failed to read the path from DOS.
RWMODE ( -- a1 )
A variable which holds the read/write attributes for any file to be
opened by HOPEN. It normally contains a two (2) for read/write. It may be
set to one (1) for write only, or to zero (0) for read only.
SEEK ( d1 -- )
Position the file pointer for the file currently open on SEQHANDLE, to
d1, that is SEEK to position d1 relative to the beginning.
SEQDOWN ( -- )
Close the current file on the current level of the handle stack, and step
down one level to the previous handle. The handle stack is five levels
deep allowing files to load files.
SEQHANDLE ( -- a1 )
A VALUE that returns the address of the current file handle in the handle
stack.
SEQHANDLE+ ( -- a1 )
A VALUE that returns the address of the next available handle in the
handle stack. This handle is used by the SED editor for temporary file
operations like renaming, or deleting. You can use SEQHANDLE+ as well,
but its usage duration should be kept very short to avoid problems with
other sections of the program using it.
SEQUP ( -- )
Step up one handle on the handle stack; if there is a file open on that
stack level, close it. The handle stack is five levels deep. FLOAD can
thus be nested for four levels.
SHOWLINES ( -- )
Specify that lines loaded with FLOAD will be displayed to the display
screen.
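To show how several of the words above fit together, here is a minimal
sketch of a file copy built from HANDLE, !HCB, HOPEN, HCREATE, HREAD,
HWRITE and HCLOSE, using the stack effects given in the glossary. The
names SRCFILE, DSTFILE, COPYBUF and COPYFILE and the file names are
hypothetical. The sketch assumes that a non-zero return code from HOPEN
or HCREATE indicates failure, and that HCREATE leaves the new file open
for writing; the glossary does not state either point, so check the F-PC
source before relying on them.

```forth
HANDLE SRCFILE                  \ hypothetical handle for the source file
HANDLE DSTFILE                  \ hypothetical handle for the destination
CREATE COPYBUF 1024 ALLOT       \ hypothetical 1K transfer buffer

: COPYFILE      ( -- )          \ file names must already be in the handles
        SRCFILE HOPEN           \ assumption: non-zero code means failure
        ABORT" cannot open source file"
        DSTFILE HCREATE         \ assumption: file is left open for writing
        ABORT" cannot create destination file"
        BEGIN   COPYBUF 1024 SRCFILE HREAD      \ n2 = bytes actually read
                DUP 0 <>                        \ zero bytes means end of file
        WHILE   COPYBUF SWAP DSTFILE HWRITE     \ write the bytes just read
                DROP                            \ ignore the count written
        REPEAT  DROP                            \ discard the final zero
        SRCFILE HCLOSE DROP                     \ close both files,
        DSTFILE HCLOSE DROP ;                   \ ignoring the return codes

SRCFILE !HCB OLD.TXT            \ hypothetical source file name
DSTFILE !HCB NEW.TXT            \ hypothetical destination file name
COPYFILE
```

A more careful version would compare the count returned by HWRITE with
the count requested, since a shorter write suggests a full disk.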
6.4. CONVERSION BETWEEN SEQUENTIAL AND BLOCK FILES
To ease your transition from F83 to F-PC, utilities are provided to
convert block files used by F83 to the sequential files required by F-PC.
The command sequence is:
FLOAD BLKTOSEQ <enter> \ load converter
CONV <filespec> <enter>
The file specified after CONV will be converted to a sequential file of
the same name with .SEQ extension. If a file specification is not given
after CONV, CONV will request a valid file name to be converted. CONV
also assumes that the block file has shadow blocks in its second half if
the source file has an extension of .BLK. It then brackets the text in
the shadow blocks between COMMENT: and COMMENT; , and inserts them after
the corresponding source block. However, if the source file has an
extension of .SCR, the entire file will be converted as source blocks and
the second half of the file is not treated as shadow blocks.
The utility package SEQTOBLK.SEQ will do the opposite: converting a
sequential file back to a block file. The commands are:
FLOAD SEQTOBLK <enter>
CONV <enter>
CONV will request the names of the sequential and block files. The
extensions must not be given, as the sequential file defaults to .SEQ and
the block file to .SCR. CONV does a line-by-line conversion and does
not try to detect boundaries between definitions. It tends to break long
definitions and put them in consecutive blocks. However, as all
experienced Forth programmers tend to write short, single-line
definitions, the block file thus produced should be compilable under F83
without trouble, we hope.
These two conversion utility files contain the best examples of how to
open/create files and how to read/write them. If you have to use files
in your application, pay special attention to these two files for advice
and guidance. The meanings of the words in the glossary become clear as
you see them in action.
6.5. PROGRAMMING STYLE AND SEQUENTIAL FILES
The use of sequential files for source code instead of blocks has its
most significant influence on the style of Forth programming. Using
blocks, the favored style of Forth code is the 'Horizontal code style' by
which code is arranged in horizontal lines with many Forth words spread
in the same line. This style emphasizes the structure of words, whose
definitions appear to be small, modular units. These units are then used
to form other words which are again small units to build other words.
Word definitions expressed in the horizontal style are concise and
powerful. The problem is that it tends to leave little space for
comments and documentation. The lack of comments and in-line
documentation makes Forth code difficult to understand and hard to
maintain.
There are only 16 lines in a block, while each line has 64 characters.
The aspect ratio of a block is 4:1, a much elongated rectangle. The
block seems to exert great pressure on the source code in the vertical
direction, pressing the code into flat, horizontal lines. In a text
file, the vertical pressure is completely released because the file has
no practical limit on the number of lines it can contain. The
residual pressure is from the limited number of characters that one can
put in a line. This pressure in the horizontal direction tends to press
the code into the 'Vertical code style' favored by traditional
programming languages. In the vertical code style, each line is limited
to have one statement, leaving plenty of white space for comments and
documentation.
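The difference between the two styles is easiest to see side by side.
The sketch below writes the same small definition both ways. The words
SQUARED, SUMSQ-H and SUMSQ-V are hypothetical examples and do not come
from the F-PC source.

```forth
\ Horizontal style: each definition squeezed onto one line of a block
: SQUARED ( n -- n*n ) DUP * ;
: SUMSQ-H ( a b -- c ) SQUARED SWAP SQUARED + ;

\ Vertical style: the same computation, one step per line
: SUMSQ-V       ( a b -- c )    \ c = a*a + b*b
        SQUARED                 \ square b, which is on top of the stack
        SWAP                    \ bring a to the top
        SQUARED                 \ square a
        + ;                     \ add the two squares
```

Both words compute the same result; 3 4 SUMSQ-H and 3 4 SUMSQ-V each
leave 25 on the stack. Only the vertical form has room to explain each
step.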
It is thus natural that you will find most of the source files in F-PC
follow the vertical code style. It is especially evident in the low
level code definitions. Each line contains only a small number of Forth
words, allowing much space for documentation, which may or may not be
filled in by the author. Nevertheless, this mechanism apparently
favors less code per line. Indentation can also be used with much
more freedom than within the confines of block boundaries.
By removing blocks from the system, F-PC also eliminates the last excuse
for Forth programmers not to write clear and well-documented code.
Henceforth, if we see bad, undocumented code in F-PC, we will shoot the
programmer and leave Forth in peace.