5-6 FILE PORTING AND FTP
*************************
For the purpose of file porting, computers can be divided into
the following groups:
Hardware            Floats  Endianity  Unformatted
------------------  ------  ---------  ------------
UNIX workstations   IEEE    BIG        Variable (4)
CRAY                CRAY    BIG
DEC VAX             DEC     LITTLE     Segmented
DEC ALPHA           IEEE+   LITTLE     Variable (4)
IBM PC compatibles          LITTLE
IBM mainframes      IBM     BIG
File porting may be a problem when you transfer unformatted files
between different groups.
Methods of file transfer
------------------------
The usual file transfer methods are:
1) Direct transfer of files by FTP
2) Transfer by FTP of archived files
3) Standard ANSI magnetic tape
A short digression on FTP
-------------------------
The File Transfer Protocol (FTP) is usually used interactively
by invoking a program with that name.
Many of the transfer options proposed by Postel and Reynolds in
RFC 959 were not implemented, and FTP programs can properly handle
only text file transfers. Binary transfers are properly handled
only in the simplest case, between two byte-oriented (e.g. UNIX)
file-systems.
FORTRAN requires record-oriented files. On byte-oriented systems
the FORTRAN compiler has to support this requirement itself, so it
produces and reads files with variable-length records.
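The record structure added on a byte-oriented system can be made visible
by writing one unformatted sequential record and reading the file back as
a raw byte stream. The following is a minimal sketch, not a definitive
implementation. It assumes a Fortran 2003 compiler (ACCESS='STREAM') that
uses 4-byte count fields, as in the "Variable (4)" entries of the table
above. The module name, the routine name and its arguments are made up
for the illustration.

```fortran
module record_layout_demo
  use iso_fortran_env, only: int32, real32
  implicit none
contains
  ! Write one sequential unformatted record holding three reals, then
  ! reopen the file as a byte stream and return the two count fields.
  ! ASSUMPTION: the compiler writes 4-byte count fields.
  subroutine show_markers(fname, head, tail)
    character(*), intent(in)    :: fname
    integer(int32), intent(out) :: head, tail
    real(real32) :: xyz(3), copy(3)
    integer :: u

    xyz = [1.0_real32, 2.0_real32, 3.0_real32]

    open (newunit=u, file=fname, form='unformatted', &
          access='sequential', status='replace')
    write (u) xyz
    close (u)

    ! Stream access shows every byte in the file, count fields included.
    open (newunit=u, file=fname, form='unformatted', &
          access='stream', status='old')
    read (u) head, copy, tail
    close (u, status='delete')
  end subroutine show_markers
end module record_layout_demo
```

Under the stated assumption, both HEAD and TAIL come back as 12, the
length in bytes of the three 4-byte reals. A different value suggests
that the compiler uses another count-field size.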
However, binary FTP transfers between a record-oriented system
(e.g. VMS) and a byte-oriented one are not supported, and all or
some of the control information of each record is discarded in one
direction, and is passed without proper translation in the other.
FTP shortcomings can be worked around by suitable modifications to
the FORTRAN source code. When writing files intended to be
transferred from a record-oriented system to a byte-oriented one,
a count-field value can be prefixed to each record. In the other
direction a routine that understands the foreign record format
should be used for reading.
FTP of archived files
---------------------
Archiving programs like the VMS version of ZIP (used with "-V")
store the control information of each record. When the compressed
file is decompressed by GZIP on a UNIX system, that information
can be retrieved in a useful form.
Porting formatted files
-----------------------
This is relatively simple. The possible problems are:
1) Different character codes (EBCDIC on IBM mainframes,
ASCII on all others).
2) File type translations (Variable-size-records on VMS,
some Stream type on almost all others).
Direct FTP can take care of both these problems. Character codes are
transformed into a standard character set (standard 8-bit Network
Virtual Terminal ASCII) before transmission, and are transformed again
to the local character set upon reception.
Similarly, records are translated to a standard form (stream CR/LF)
before transmission, and transformed to the local structure upon reception.
Formatted files are the recommended way to transfer information
between different systems. The disadvantages are that formatted
files are larger, and that some precision is lost in the radix
conversions (binary to decimal and back).
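The precision loss can be kept to a minimum by writing enough decimal
digits. For IEEE floats, 9 significant digits identify any single-precision
value and 17 any double-precision value, provided the run-time library
rounds correctly in both directions. The sketch below assumes IEEE floats
and a Fortran 2003 compiler; the module and function names are
illustrative only.

```fortran
module roundtrip_demo
  use iso_fortran_env, only: real32, real64
  implicit none
contains
  ! Convert X to text with 9 significant digits and read it back.
  function via_text32(x) result(y)
    real(real32), intent(in) :: x
    real(real32) :: y
    character(len=16) :: buf
    write (buf, '(ES16.8E2)') x      ! 1 digit + 8 decimals = 9 digits
    read  (buf, '(ES16.8E2)') y
  end function via_text32

  ! The same for double precision, with 17 significant digits.
  function via_text64(x) result(y)
    real(real64), intent(in) :: x
    real(real64) :: y
    character(len=25) :: buf
    write (buf, '(ES25.16E3)') x     ! 1 digit + 16 decimals = 17 digits
    read  (buf, '(ES25.16E3)') y
  end function via_text64
end module roundtrip_demo
```

With fewer digits in the format (e.g. ES16.6E2 for single precision) the
value read back may differ from the original in the last bits.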
Porting unformatted files
-------------------------
Here the problems start:
1) Different endianity (DEC machines and PCs are little
endian, all others are big endian).
2) Different integer sizes / float formats (integers have
the same general format, most floats are now IEEE).
3) Different character codes (EBCDIC on IBM mainframes,
ASCII on all others).
4) File type translations (Variable-size-records on VMS,
some Stream type on all others).
Porting unformatted files is less frightening than the list of
problems suggests, e.g. UNIX workstations are compatible except
for the endianity problem.
Problems #1 and #2 make porting unformatted files content dependent,
i.e. you need to know the contents of a file in order to port it.
In the general case each variable has to be converted separately,
so the converting program has to know in detail the layout of
variables in the file.
Endianity conversion
--------------------
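Converting between big-endian and little-endian data means reversing the
bytes of every numeric item, while character data is left alone. A minimal
sketch follows, assuming 4-byte integers and IEEE 4-byte reals on both
machines. The module and function names are made up for the illustration.
Foreign reals are read as integers, swapped, and only then reinterpreted
as REAL, so that a byte-swapped bit pattern is never held in a
floating-point variable.

```fortran
module byte_swap
  use iso_fortran_env, only: int8, int32, real32
  implicit none
contains
  ! Reverse the 4 bytes of a 32-bit integer.
  elemental function swap_int32(v) result(r)
    integer(int32), intent(in) :: v
    integer(int32) :: r
    integer(int8) :: b(4)
    b = transfer(v, b)
    r = transfer(b(4:1:-1), r)
  end function swap_int32

  ! BITS holds a foreign-endian IEEE single that was read as an integer.
  ! Swap the bytes first, then reinterpret the bit pattern as a real.
  elemental function real32_from_swapped(bits) result(x)
    integer(int32), intent(in) :: bits
    real(real32) :: x
    x = transfer(swap_int32(bits), x)
  end function real32_from_swapped
end module byte_swap
```

Both functions are elemental, so a whole array read from the file can be
converted in one assignment, e.g. X = REAL32_FROM_SWAPPED(IBUF). The
8-byte case is analogous, with an 8-element byte array.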
Integer/Float format conversion
-------------------------------
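As an illustration of a float format conversion, the sketch below decodes
one DEC VAX single-precision (F_floating) number from its 4 bytes, taken
in the order in which they are stored in the file. It relies on the usual
description of that format: sign and 8-bit exponent (excess 128) in the
first 16-bit word together with the upper 7 fraction bits, the lower 16
fraction bits in the second word, a hidden leading bit, and a fraction of
the form 0.1f. Check this against the DEC documentation before relying on
it. The module and function names are made up for the example.

```fortran
module vax_float
  use iso_fortran_env, only: real32
  implicit none
contains
  ! B(1:4) are the bytes of one F_floating value in file order,
  ! each given as an integer in the range 0..255.
  pure function vax_f_to_ieee(b) result(x)
    integer, intent(in) :: b(4)
    real(real32) :: x
    integer :: expo, frac

    ! 8-bit exponent: low 7 bits of byte 2, then the top bit of byte 1.
    expo = 2*iand(b(2), 127) + ishft(b(1), -7)
    ! 23-bit fraction: low 7 bits of byte 1, then byte 4, then byte 3.
    frac = ishft(iand(b(1), 127), 16) + ishft(b(4), 8) + b(3)

    if (expo == 0) then
      x = 0.0_real32      ! zero (a reserved operand is also mapped to 0)
    else
      ! value = 0.1f * 2**(expo-128) = (2**23 + frac) * 2**(expo-152)
      x = scale(real(frac + 2**23, real32), expo - 152)
      if (b(2) >= 128) x = -x       ! sign is the top bit of byte 2
    end if
  end function vax_f_to_ieee
end module vax_float
```

For example, the byte sequence 128, 64, 0, 0 decodes to 1.0. The bytes
can be obtained by reading the file with stream access into 1-byte
integers and adding 256 to the negative ones.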
Control information conversion
------------------------------
If you have the program source you can do it with a few modifications;
in the general case you'll need a conversion program.
1) Unformatted file from VMS to UNIX:
On VMS you can use unformatted variable records if your records
are no longer than 32764 bytes. Specify RECORDTYPE='VARIABLE'
in the OPEN statement, as the default for unformatted I/O is
'SEGMENTED'.
FTP discards the 2-byte count-field of the variable records, but
you can re-prefix (and re-suffix) the record length to the data
in each WRITE statement:
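      INTEGER RECLEN
      REAL X, Y, Z
      ......................
C     Record length in bytes (SIZEOF is a DEC FORTRAN extension)
      RECLEN = SIZEOF(X) + SIZEOF(Y) + SIZEOF(Z)
C     Write the length before and after the data
      WRITE (10) RECLEN, X, Y, Z, RECLEN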
If your unformatted records have to be longer, write each record
in parts, each one smaller than 32764 bytes. Write the record
length at the beginning of the first part, and at the end of
the last part.
An option that is not recommended is the C Run-Time-Library function
'write', which converts VAX/ALPHA little-endian longwords (4 bytes)
to big-endian. However, 'write' doesn't add the prefix and suffix
count-fields, and it creates a stream/LF file, which is an unsuitable
type. If you use it, include the unixio.h and file.h standard headers;
they contain the function prototype and associated argument constants.
2) Unformatted file from UNIX to VMS:
By default FTP will create fixed-length records, 512 bytes long.
If the records are all of the same known length, you may change the formal
record-length to that value, without changing anything in the file.
Use either: SET FILE/ATTRIBUTES=(LRL:size) filespec
or Joe Meadows' FILE utility. Then use OPEN with RECORDTYPE='FIXED'
and RECL=size, and read and ignore the count field (first 4 bytes).
If the records are not the same length, you'll need a routine that
can reconstruct the original structure.
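The core of such a routine is a loop that follows the count fields from
record to record. The sketch below only counts the records and finds the
longest one. It is not VMS-specific: it assumes a Fortran 2003 compiler
with ACCESS='STREAM', and 4-byte count fields before and after each
record, already in the byte order of the reading machine. All names in it
are hypothetical.

```fortran
module record_walker
  use iso_fortran_env, only: int8, int32, iostat_end
  implicit none
contains
  ! Walk through a file of [count][data][count] records.
  ! NREC is the number of complete records, MAXLEN the longest record
  ! in bytes, OK is true if the file ended cleanly after a record.
  subroutine scan_records(fname, nrec, maxlen, ok)
    character(*), intent(in) :: fname
    integer, intent(out) :: nrec, maxlen
    logical, intent(out) :: ok
    integer(int32) :: head, tail
    integer(int8), allocatable :: payload(:)
    integer :: u, ios

    nrec = 0
    maxlen = 0
    ok = .false.
    open (newunit=u, file=fname, form='unformatted', &
          access='stream', status='old', iostat=ios)
    if (ios /= 0) return

    do
      read (u, iostat=ios) head
      if (ios /= 0) exit               ! end of file, or a read error
      if (head < 0) then
        ios = 1                        ! not a plain count field
        exit
      end if
      allocate (payload(head))
      tail = -1
      read (u, iostat=ios) payload, tail
      deallocate (payload)
      if (ios /= 0 .or. tail /= head) then
        ios = 1                        ! truncated file or wrong format
        exit
      end if
      nrec = nrec + 1
      maxlen = max(maxlen, int(head))
    end do

    close (u)
    ok = (ios == iostat_end)
  end subroutine scan_records
end module record_walker
```

A real conversion routine would process PAYLOAD instead of discarding it,
for example by writing it out again with the local record structure. If
the file comes from a machine with the opposite byte order, the count
fields have to be byte-swapped before they are used.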
EBCDIC/ASCII conversion
-----------------------
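Character data in an unformatted file is not translated by a binary FTP
transfer, so EBCDIC text has to be converted by the program. The sketch
below covers only a subset of the EBCDIC codes (blank, digits, letters and
the characters '.', '+' and '-'), which are common to the usual EBCDIC
code pages; a complete routine would use a full 256-entry table for the
code page in question. It assumes that the machine running it uses ASCII.
The module and function names are made up for the example.

```fortran
module ebcdic_subset
  implicit none
contains
  ! Translate one EBCDIC code (0..255) to an ASCII character.
  ! Codes outside the supported subset are returned as '?'.
  elemental function ebcdic_to_ascii(code) result(c)
    integer, intent(in) :: code
    character(len=1) :: c
    character(len=*), parameter :: upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    character(len=*), parameter :: lower = 'abcdefghijklmnopqrstuvwxyz'
    integer :: k

    k = 0                 ! letter index: > 0 upper case, < 0 lower case
    c = '?'
    select case (code)
    case (64)             ! hex 40
      c = ' '
    case (75)             ! hex 4B
      c = '.'
    case (78)             ! hex 4E
      c = '+'
    case (96)             ! hex 60
      c = '-'
    case (240:249)        ! hex F0-F9: digits 0-9
      c = achar(iachar('0') + code - 240)
    case (193:201)        ! hex C1-C9: A-I
      k = code - 192
    case (209:217)        ! hex D1-D9: J-R
      k = code - 199
    case (226:233)        ! hex E2-E9: S-Z
      k = code - 207
    case (129:137)        ! hex 81-89: a-i
      k = -(code - 128)
    case (145:153)        ! hex 91-99: j-r
      k = -(code - 135)
    case (162:169)        ! hex A2-A9: s-z
      k = -(code - 143)
    end select
    if (k > 0) c = upper(k:k)
    if (k < 0) c = lower(-k:-k)
  end function ebcdic_to_ascii
end module ebcdic_subset
```

Because the function is elemental it can be applied to a whole array of
codes at once, e.g. the codes 200, 197, 211, 211, 214 give the characters
H, E, L, L, O.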
File type conversion
--------------------
FTP options
-----------
FTP transfer options can be divided into 5 categories; most
of the options are unimplemented:
File structure (e.g. stru file)
-------------------------------
file     Byte-oriented file system
record   Record-oriented file system
page     ----
mount    ----
vms      [On VMS/MULTINET] Preserve all VMS characteristics
         automatically negotiated.
Transfer type (e.g. type ascii)
-------------------------------
ascii             For text files, default.
ebcdic            For IBM mainframes
backup            [On VMS/MULTINET] for VMS/BACKUP files
binary            Same as IMAGE
image             For unformatted data files and executables.
local-byte-size
logical-byte      Same as LOCAL-BYTE-SIZE
tenex
Transfer mode (e.g. mode stream)
--------------------------------
stream       Usual mode
compressed   Supported by TGV/MULTINET only?
block
Form formats
------------
non-print
telnet format effectors
carriage control (ASA)
Auxiliary
---------
record-size        [On VMS/MULTINET]
site rms recsize   [On VMS/MULTINET]
block              [On VMS/MULTINET]
case