Note to the reader: This document gives a rationale for the Zoo project.
It also answers recent criticism. To help provide a balanced picture to
users, please upload this file to your local BBSs. Please give it a name of
the form ZOOPLAN1.* to help users identify it as the first file in a series
of possibly more than one. A suggested upload description is, "Zoo author
answers critics." Thank you. -- R.D.
The entire contents of this document are (C) Copyright 1986 Rahul Dhesi.
Permission is granted to reproduce it in any form, for any purpose, on any
medium, whether for commercial or noncommercial use, provided only that it
is not modified.
A Zoo Manifesto
by
Rahul Dhesi
1 November 1986
INTRODUCTION. The Zoo archiver has raised a controversy. Vociferous
criticism has appeared and users are being advised to "forget it".
For Zoo to have been ignored and to have come and gone in complete obscurity
would have been a punch in my face. Instead, Zoo's critics have felt it
worthy of their attention.
But much of the criticism of Zoo is short-sighted. Some important facts may
be omitted. This document gives you my perspective.
ARCHIVE/LIBRARY FORMATS
First, we must understand archive formats. The traditional LBR format file
(often called a "library") has the following structure:
[dir. entry]------ LBR format
[dir. entry]----- | ---
[dir. entry]----- | -- | ---
. | | |
. | | |
. | | |
[file data] <---- | |
[file data] <--------- |
[file data] <--------------
.
.
Each "directory entry" contains information about the file, such as its
name, size, and its position within the library, and optionally the time-
stamp of the file. All the directory entries are at the beginning of the
library.
The ARC format, by contrast, does not reserve a separate area for directory
entries. Each file is stored as a directory entry for that file immediately
followed by the file data.
[dir. entry] ARC format
[file data]
[dir. entry]
[file data]
[dir. entry]
[file data]
.
.
Both formats have their advantages and disadvantages. The LBR format,
having all the directory entries clustered together at the beginning, allows
very fast listing of names of files in the library. The ARC format utili-
ties must go through the archive step by step to read and list all the
filenames. A disadvantage of the LBR format is that space may be wasted
maintaining empty directory slots to allow addition of new files. Other-
wise, when a new file is added, the entire library must be reorganized to
create additional space for the directory entry. The ARC format allows new
files, and their directory entries, to be simply appended to the archive.
Another disadvantage of the LBR format is that it remembers the size of a
stored file rounded up to the nearest 128 bytes. Both ARC and Zoo store the
exact size of files.
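The listing trade-off described above can be pictured with a toy model. This is only a sketch under stated assumptions: the 16-byte entry (a 12-byte name plus a 4-byte size) and the function names are invented for illustration and are not the real LBR or ARC on-disk layouts.

```python
import io
import struct

# Hypothetical toy entry: 12-byte name field plus 4-byte little-endian size.
# This is NOT the real LBR or ARC directory entry.
ENTRY = struct.Struct("<12sI")


def build_lbr_like(files):
    """All directory entries first, then all file data."""
    out = io.BytesIO()
    out.write(struct.pack("<H", len(files)))  # entry count
    for name, data in files:
        out.write(ENTRY.pack(name.encode("ascii"), len(data)))
    for _, data in files:
        out.write(data)
    return out.getvalue()


def build_arc_like(files):
    """Each directory entry immediately followed by its file data."""
    out = io.BytesIO()
    for name, data in files:
        out.write(ENTRY.pack(name.encode("ascii"), len(data)))
        out.write(data)
    return out.getvalue()


def list_lbr_like(blob):
    """The directory is contiguous, so no file data is skipped."""
    (count,) = struct.unpack_from("<H", blob, 0)
    names = []
    for i in range(count):
        raw, _size = ENTRY.unpack_from(blob, 2 + i * ENTRY.size)
        names.append(raw.rstrip(b"\0").decode("ascii"))
    return names, 0


def list_arc_like(blob):
    """Every entry found requires hopping over that file's data."""
    names, pos, skips = [], 0, 0
    while pos < len(blob):
        raw, size = ENTRY.unpack_from(blob, pos)
        names.append(raw.rstrip(b"\0").decode("ascii"))
        pos += ENTRY.size + size
        skips += 1
    return names, skips
```

Both listing functions return the same names; the second value counts how many times the reader had to skip over file data, which is zero for the LBR-like layout and one per file for the ARC-like layout.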
Now, let's look at the Zoo archive format.
[dir. entry]--- Zoo format (current)
|
[file data] <-
[dir. entry]---
|
[file data] <-
[dir. entry]---
|
[file data] <-
All currently existing Zoo archives have the format shown above. Like ARC
archives, they contain the directory entry for each file followed by its
data. In the Zoo format, however, each directory entry contains a pointer to the
file data, which can be anywhere. This makes it possible to reorganize a
Zoo archive to look like this:
[dir. entry]--- Zoo format (alternative
[dir. entry]-- | -- format -- fast directory
[dir. entry]-- | -- | --- listings)
. | | |
. | | |
[file data] <- | |
[file data] <------ |
[file data] <-----------
This reorganized format makes fast directory listings possible. All exis-
ting versions of Zoo will understand the reorganized format. A separate
utility, or a new option in Zoo itself, will allow the user to reorganize
Zoo archives.
Despite the LBR-like organization here, though, files can still be easily
added to the archive. This is because each directory entry contains another
pointer, which points to the next directory entry. If we added another file
to the archive represented above, it would become:
[dir. entry]--- Zoo format (reorganized,
[dir. entry]-- | -- then another file added)
[dir. entry]-- | -- | ---
. | | |
. | | |
[file data] <- | |
[file data] <------ |
[file data] <-----------
[dir. entry]---
|
[file data] <-
The Zoo format thus attempts to provide the advantages of both the LBR and
ARC formats without the disadvantages of either.
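The two pointers in each directory entry can be sketched as a linked list inside a byte buffer. The layout below (a 4-byte header and a 24-byte entry) and all function names are assumptions made for illustration; the real Zoo directory entry holds more fields than this.

```python
import struct

# Hypothetical layout for illustration only, not the real Zoo format.
HEADER = struct.Struct("<I")      # offset of first directory entry (0 = none)
ENTRY = struct.Struct("<III12s")  # next entry, data offset, data size, name


def new_archive():
    return bytearray(HEADER.pack(0))


def _entry_offsets(arc):
    """Follow the chain of 'next' pointers, yielding each entry's offset."""
    (pos,) = HEADER.unpack_from(arc, 0)
    while pos:
        yield pos
        pos = ENTRY.unpack_from(arc, pos)[0]


def add_file(arc, name, data):
    """Append entry and data at the end, then patch the previous pointer."""
    offsets = list(_entry_offsets(arc))
    entry_pos = len(arc)
    data_pos = entry_pos + ENTRY.size
    arc += ENTRY.pack(0, data_pos, len(data), name.encode("ascii"))
    arc += data
    if offsets:
        struct.pack_into("<I", arc, offsets[-1], entry_pos)
    else:
        HEADER.pack_into(arc, 0, entry_pos)


def list_names(arc):
    return [ENTRY.unpack_from(arc, p)[3].rstrip(b"\0").decode("ascii")
            for p in _entry_offsets(arc)]


def read_file(arc, name):
    for p in _entry_offsets(arc):
        _next, data_pos, size, raw = ENTRY.unpack_from(arc, p)
        if raw.rstrip(b"\0").decode("ascii") == name:
            return bytes(arc[data_pos:data_pos + size])
    raise KeyError(name)


def reorganize(arc):
    """Rewrite with all entries first (the fast-listing layout)."""
    files = [(n, read_file(arc, n)) for n in list_names(arc)]
    out = bytearray(HEADER.pack(HEADER.size if files else 0))
    data_pos = HEADER.size + ENTRY.size * len(files)
    for i, (name, data) in enumerate(files):
        last = i + 1 == len(files)
        nxt = 0 if last else HEADER.size + ENTRY.size * (i + 1)
        out += ENTRY.pack(nxt, data_pos, len(data), name.encode("ascii"))
        data_pos += len(data)
    for _, data in files:
        out += data
    return out
```

In this sketch, calling add_file on the output of reorganize reproduces the last diagram: the clustered entries stay at the front, and the new entry and its data are simply appended, reached through the patched "next" pointer.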
Since the file data can be anywhere in the archive, improved versions of Zoo
can include additional information in an archive without affecting extrac-
tion by prior versions of Zoo. For example, Zoo version 1.10 introduced the
ability to attach comments to archived files. Versions earlier than 1.10 do
not understand or interpret comments, yet all versions of Zoo will extract
files from all existing Zoo archives. Adding comments only meant adding a
new pointer to the directory entry that points to the comment. Adding
comments to two of the files of the example above gives the following
archive structure:
----[dir. entry]--- Zoo format (after
| | adding comments to
| [file data] <- two files)
|
--- | ---[dir. entry]---
| | |
| | [file data] <-
| |
| | [dir. entry]---
| | |
| | [file data] <-
| |
| --> [comment ]
-------> [comment ]
Since Zoo doesn't care exactly where the file data and comments go, we need
only append all new data to the archive and update a pointer in the direc-
tory entry. Older versions of Zoo ignore the additional data. Older ver-
sions still recognize that a newer format is being used and they will refuse
to manipulate the archive in a way that could cause loss of information.
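A rough model of this backward compatibility follows. It is a sketch under assumptions: the field names, the numeric "level" scheme, and the Reader class are hypothetical and only mimic the behaviour described above (old readers still extract, ignore comments, and refuse to modify).

```python
from dataclasses import dataclass


@dataclass
class Entry:
    name: str
    data_pos: int
    data_len: int
    comment_pos: int = 0   # extra pointer introduced by the newer format
    comment_len: int = 0   # 0 means "no comment attached"
    level_needed: int = 1  # hypothetical: level a program needs to modify


def add_file(blob, name, data):
    """Append file data to the archive bytes and return its entry."""
    entry = Entry(name, len(blob), len(data))
    blob += data
    return entry


def add_comment(blob, entry, text):
    """Append the comment at the end and update only the entry's pointer."""
    raw = text.encode("ascii")
    entry.comment_pos, entry.comment_len = len(blob), len(raw)
    entry.level_needed = 2
    blob += raw


class Reader:
    """A program that understands archive features up to a given level."""

    def __init__(self, level):
        self.level = level

    def extract(self, blob, entry):
        # Uses only the original pointer, so it works at every level.
        return bytes(blob[entry.data_pos:entry.data_pos + entry.data_len])

    def comment(self, blob, entry):
        if self.level < 2 or entry.comment_len == 0:
            return None  # older readers simply ignore the extra data
        end = entry.comment_pos + entry.comment_len
        return bytes(blob[entry.comment_pos:end]).decode("ascii")

    def may_modify(self, entry):
        # Refuse changes that could silently drop data we do not understand.
        return entry.level_needed <= self.level
```

The file data is never moved when the comment is added, which is why a level-1 reader in this model extracts exactly the same bytes before and after.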
Now, with an understanding of the various archive/library formats, we can
discuss some of the strengths of the Zoo format.
ADVANTAGES OF THE ZOO FORMAT
The key issue here is one of extendability. One of the features planned for
Zoo is the ability to store long filenames and pathnames. The MS-DOS file-
name standard of 8+dot+3 characters is restrictive. Other systems offer
longer filenames, and allow more special characters to be part of filenames.
For example, Berkeley UNIX allows filenames to be up to 255 characters long,
and they may contain any printable or unprintable 7-bit character except
slash ("/") and null (0x00). Other implementations of the UNIX system also
allow 8-bit character codes in filenames. The Macintosh allows filenames
to contain spaces. A truly standard archive format should not arbitrarily
restrict filename size or syntax. Yet it should be possible to extract
every file from every archive on every system.
Another desirable feature is the ability to store the full pathname of a
file. This is especially useful if one makes backups of files in compressed
form and then wishes to restore each file to its original directory.
A truly standard archiving format should also allow text files to be cor-
rectly interpreted on all systems. Under CP/M and MS-DOS, a carriage ret-
urn/line feed pair terminates each line of text. On the Macintosh, the
terminator is a carriage return. On the Amiga and under all versions of
UNIX, the terminator is a linefeed. User convenience requires that the
archiver store text files in a canonical format that will be converted to
the appropriate host system format at extraction time.
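The conversion described above might look like the following. This is a minimal sketch: the text does not say which canonical form would be used, so a bare linefeed is assumed here, and the host names and function names are invented for the example.

```python
# Line terminators as described in the text above.
HOST_TERMINATORS = {
    "cpm": b"\r\n",
    "msdos": b"\r\n",
    "macintosh": b"\r",
    "amiga": b"\n",
    "unix": b"\n",
}

# Assumption: a single linefeed is the canonical stored form.
CANONICAL = b"\n"


def to_canonical(data, host):
    """Convert host text to the canonical form before archiving."""
    term = HOST_TERMINATORS[host]
    return data if term == CANONICAL else data.replace(term, CANONICAL)


def from_canonical(data, host):
    """Convert canonical text to the host form at extraction time."""
    term = HOST_TERMINATORS[host]
    return data if term == CANONICAL else data.replace(CANONICAL, term)
```

A text file archived on MS-DOS and extracted on a Macintosh would pass through to_canonical(..., "msdos") and then from_canonical(..., "macintosh"). The sketch assumes the archiver knows the file is text; applying it to binary data would corrupt it.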
When one of these features is added to the archive format, all existing
implementations of the archive software should still be able to extract
files from the new archive format. It was the goal of the Zoo archiver
project to create such a truly extendable and portable format.
Another desirable characteristic of a good archive utility is the ability to
hold multiple versions of the same file. Some mainframe systems automati-
cally maintain a version number for each file. Users of other systems,
which do not do so, would benefit if this feature were provided by an
archive utility. This would be especially useful to programmers who would
like to conveniently save multiple versions of source files without having
to give them different names.
Finally, security of data is an often-neglected issue. A bad telephone
connection, or a bad sector on a floppy disk, can cause file corruption.
This should not make an entire archive unusable. It should be possible for
a repair utility to search for files within the archive and extract those
that are not corrupted. The ARC scheme has a potential advantage over the
LBR format, since corruption of a part of the archive seldom affects more
than one file. If an LBR format file is corrupted near the beginning, all
directory entries can be lost. However, existing ARC utilities do not allow
recovery of files from a corrupted archive beyond the point of damage. This
is because there is no reliable way to recognize where a file begins when
starting from an arbitrary position in the archive.
The Zoo format includes enough redundancy that it is usually possible to
locate individual directory entries and individual file data areas in a
corrupted archive. A planned repair utility will allow recovery of un-
damaged portions of a Zoo archive. This is an impossibility with existing
library/archive formats because they contain no redundant information.
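One way such redundancy could support a repair utility is sketched below. Everything specific here is an assumption for illustration: the 4-byte marker value, the record layout, and the use of a CRC-32 are not taken from the Zoo format, and the planned repair utility may work quite differently.

```python
import struct
import zlib

MARKER = b"ZENT"                 # hypothetical tag that starts every record
FIXED = struct.Struct("<HII")    # name length, data length, CRC-32 of data


def make_record(name, data):
    """Build one self-describing record: marker, lengths, CRC, name, data."""
    raw_name = name.encode("ascii")
    header = MARKER + FIXED.pack(len(raw_name), len(data), zlib.crc32(data))
    return header + raw_name + data


def salvage(blob):
    """Scan for markers; keep records whose lengths and CRC check out."""
    found, pos = [], 0
    while True:
        pos = blob.find(MARKER, pos)
        if pos < 0:
            return found
        try:
            name_len, data_len, crc = FIXED.unpack_from(blob, pos + len(MARKER))
            start = pos + len(MARKER) + FIXED.size
            name = blob[start:start + name_len]
            data = blob[start + name_len:start + name_len + data_len]
            intact = (len(name) == name_len and len(data) == data_len
                      and zlib.crc32(data) == crc)
            if intact:
                found.append((name.decode("ascii"), bytes(data)))
                pos = start + name_len + data_len
                continue
        except (struct.error, UnicodeDecodeError):
            pass
        pos += 1  # damaged or false marker: resume the search one byte on
```

Because the scan does not depend on any earlier record being readable, damage in the middle of the archive costs only the record it touches; records after the damage are still found.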
None of these features can be easily added to the ARC format. One important
reason for this is the fact that extending the directory structure of an ARC
archive makes all existing ARC utilities unable to understand the improved
archive structure. If an ARC format utility adds additional information to
an archive in a transparent way, then this information is likely to be lost
if any other utility manipulates the archive.
For example, the PKARC utility can add brief comments to archived files. It
does so by appending comments to the end of the archive. However, other ARC
format utilities will silently strip all comments from the archive when they
manipulate it. The result can be a nasty surprise for the user.
To summarize, then, here is a list of specific features planned for Zoo:
o Long filenames. Each file will be optionally stored with the original
filename, regardless of its length. When extraction is performed, a
shorter name will be used if necessary. Any unacceptable special
characters will be converted to acceptable characters.
o Pathnames. The original directory name will be optionally stored.
During extraction, the archive program will optionally restore files to
their original directories. If the original pathname uses a syntax
different from that of the current system, the file will be restored to
the current (default) directory.
o Text files. These will be stored in a canonical format. At extraction
time the archive program will convert the canonical format to the
format used by the host system.
o Multiple versions. Optionally, when a file is added to an archive, the
added file and already-archived file(s) with the same name will be
assigned version numbers. It will be possible to selectively extract a
specific version of a file from an archive.
o Archive repair. A utility will allow extraction of undamaged data from
a damaged archive.
o Fast directory listings. It will be possible to reorganize archives to
bring all directory entries to the beginning. This will allow instan-
taneous directory listings sorted by name, extension, size, and date.
o Portable code. A truly standard archiver should be easily implementa-
ble on a new machine. The portable implementation of Zoo is discussed
later in this document.
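As an illustration of the long-filename item in the list above, here is one possible way to derive an MS-DOS style 8+dot+3 name at extraction time. The rules (truncate, replace anything outside letters, digits, and underscore with an underscore, fold to lowercase) are assumptions for this sketch, not the rules Zoo is planned to use.

```python
import re


def _clean(part):
    """Replace characters outside a conservative safe set with '_'."""
    return re.sub(r"[^A-Za-z0-9_]", "_", part)


def to_dos_name(long_name):
    """Map an arbitrary filename to an 8.3 name (hypothetical rules)."""
    base, dot, ext = long_name.rpartition(".")
    if not dot:  # no dot at all: the whole name is the base
        base, ext = long_name, ""
    base = _clean(base)[:8] or "_"
    ext = _clean(ext)[:3]
    return (f"{base}.{ext}" if ext else base).lower()
```

For example, to_dos_name("My Long Document.text") gives "my_long_.tex". The sketch does not handle two long names that shorten to the same short name; a real extractor would need some rule for that case.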
CURRENT FEATURES
Since the purpose of this document is partly to defend Zoo against its
critics, I describe here some of the features of Zoo that distinguish it
from the competition. Version 1.31 of Zoo for MS-DOS is discussed. In some
cases samples of captured screen output are given to give you a feel for the
software. It is my hope that you, the reader, will try Zoo out for
yourself.
The following special features of Zoo are discussed.
Comments: A comment may be attached to any file and optionally listed in a
directory listing.
Enhanced wildcards: The "*" character can be used anywhere in a filename.
Character ranges are accepted (e.g. "a-d").
Convenient cataloging: An entire directory or disk may be catalogued with
a single command.
Two-step deletion: Accidental replacement of a precious archived file is
prevented.
Batch operation: Unattended batch operation is possible.
User interrupts: Control C is quickly and gracefully handled.
Z format files: This format allows convenient reorganization of archives.
It also lets BBS users selectively download files without losing the benefit
of compression.
FEATURES IN DETAIL
COMMENTS. The user may optionally attach a comment of up to 65,535 charac-
ters to one or more archived files. Comments may be optionally shown in a
directory listing.
Here is an example:
C:/scr>zoo v tvx_exe
Archive TVX_EXE.ZOO:
Name           Length  CF Size Now Date      Time
------------ -------- --- -------- --------- --------
readme.doc        272 16%      228 29 Jul 86 11:17:08
tv.exe          40100 33%    26948 29 Jul 86 10:05:22  C
|The original tvx.
tv0.exe         39502 32%    26695 29 Jul 86 10:21:26  C
|The modeless tvx.
tv0_cfg.exe     17905 28%    12838 29 Jul 86 10:22:46  C
|Run this to create key configuration file for tv0.exe.
tv0_ptch.exe    13830 27%    10147 29 Jul 86 10:23:56  C
|Run this to patch tv0.exe with permanent key definitions.
tv_cfg.exe      17905 28%    12838 29 Jul 86 10:06:42  C
|Run this to create key configuration file for tv.exe.
tv_ptch.exe     13830 27%    10150 29 Jul 86 10:07:50  C
|Run this to patch tv.exe with permanent key definitions.
tvm.exe         40190 33%    27100 29 Jul 86 10:34:16  C
|The Emacs emulator tvx.
tvm_ptch.exe    13803 27%    10113 29 Jul 86 10:35:56  C
|Run this to patch tvm.exe with permanent key definitions.
tvv.exe         40622 32%    27631 29 Jul 86 11:00:32  C
|The vi emulator tvx.
tvv_ptch.exe    13830 26%    10175 29 Jul 86 11:02:14  C
|Run this to patch tvv.exe with permanent key definitions.
------------ -------- --- -------- --------- --------
TOTAL     11   251789 31%   174863
------------
Attached comments are meant to be pointers to help the user decide which
files to extract. Users can ease the busy Sysop's workload by adding brief
but meaningful comments to files before uploading a Zoo archive. Comments
also allow the addition of brief editorial information without in any way
changing the package of archived files. This avoids offending authors of
shareware.
The presence of comments does not preclude an uncommented directory listing:
C:/scr>zoo l tvx_exe
Archive TVX_EXE.ZOO:
Name           Length  CF Size Now Date      Time
------------ -------- --- -------- --------- --------
readme.doc        272 16%      228 29 Jul 86 11:17:08
tv.exe          40100 33%    26948 29 Jul 86 10:05:22  C
tv0.exe         39502 32%    26695 29 Jul 86 10:21:26  C
tv0_cfg.exe     17905 28%    12838 29 Jul 86 10:22:46  C
tv0_ptch.exe    13830 27%    10147 29 Jul 86 10:23:56  C
tv_cfg.exe      17905 28%    12838 29 Jul 86 10:06:42  C
tv_ptch.exe     13830 27%    10150 29 Jul 86 10:07:50  C
tvm.exe         40190 33%    27100 29 Jul 86 10:34:16  C
tvm_ptch.exe    13803 27%    10113 29 Jul 86 10:35:56  C
tvv.exe         40622 32%    27631 29 Jul 86 11:00:32  C
tvv_ptch.exe    13830 26%    10175 29 Jul 86 11:02:14  C
------------ -------- --- -------- --------- --------
TOTAL     11   251789 31%   174863
------------
C: file has attached comment.
Zoo still gently reminds you that comments exist should you wish to read
them.
Comments are preserved when an updated file replaces an older version with
the same name. The user may update, add, or delete comments at any time,
either when the archive is initially created, or later.
ENHANCED WILDCARDS. The character "*" may be used anywhere within a file-
name. A character range of the form "a-d" may be used to select all files
beginning with letters "a" through "d". Thus, to add selected files to an
archive, one could proceed as follows:
C:/scr>zoo a sample /bin/g-i /bin/x-z
Zoo: hdchek.exe -- (23%) added
Zoo: hdtest.exe -- (23%) added
Zoo: ibu.com -- (19%) added
Zoo: indent.com -- ( 1%) added
Zoo: xpc.com -- (44%) added
Zoo: ye.exe -- (17%) added
Zoo: yhp.exe -- (15%) added
Zoo: zoo.exe -- (20%) added
This example showed how one could add to an archive all files in the c:/bin
directory that begin with letters "g" through "i" and "x" through "z".
A selective directory listing using wildcards shows all files whose names
have a "t" immediately before the dot:
C:/scr>zoo l sample *t.*
Archive SAMPLE.ZOO:
Name           Length  CF Size Now Date      Time
------------ -------- --- -------- --------- --------
hdtest.exe      32546 23%    25091 28 Aug 86 19:44:14
indent.com        458  1%      453  1 Jun 85 00:00:00
------------ -------- --- -------- --------- --------
TOTAL      2    33004 23%    25544
The wildcards accepted by Zoo are intended to make it easier to reorganize
large numbers of archives. They give the user more powerful selection
options than have been traditional on microcomputer systems.
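The selection rules shown in the examples above can be approximated as follows. This is a sketch of the behaviour as the examples present it, not Zoo's actual matching code: it assumes that a pattern of the exact form "x-y" selects names by first character, that "*" matches any run of characters, that "?" matches one character, and that matching ignores case.

```python
import re


def zoo_like_match(pattern, name):
    """Return True if name is selected by pattern (assumed semantics)."""
    pattern, name = pattern.lower(), name.lower()
    rng = re.fullmatch(r"([a-z0-9])-([a-z0-9])", pattern)
    if rng:
        # Range form such as "g-i": compare only the first character.
        return bool(name) and rng.group(1) <= name[0] <= rng.group(2)
    # Otherwise translate "*" and "?" and match everything else literally.
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, name) is not None
```

Applied to the eight files added to SAMPLE.ZOO, the pattern "*t.*" selects hdtest.exe and indent.com under these assumptions, which agrees with the listing shown above.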
Zoo will also select groups of archives for directory listings. Here's a
directory listing of all archives with names beginning with "z":
C:/new>zoo l z*.zoo
Archive ZCOMMDOC.ZOO:
Name           Length  CF Size Now Date      Time
------------ -------- --- -------- --------- --------
zcomm.bug        1387 42%      798  4 Oct 86 12:34:18
zcomman.aa     149915 62%    57692  4 Oct 86 05:15:14
zcomman.ab     186384 64%    66581  4 Oct 86 05:24:08
zcomman.ac     148480 63%    55538  4 Oct 86 05:30:44
zcomman.ad       2910 68%      942  4 Oct 86 05:30:46
zcomman.ai      14823 55%     6686  4 Oct 86 05:36:04
------------ -------- --- -------- --------- --------
TOTAL      6   503899 63%   188237
Archive ZCOMMEXE.ZOO:
Name           Length  CF Size Now Date      Time
------------ -------- --- -------- --------- --------
cdemo            7785 46%     4178 27 Mar 86 15:42:42
cisnodes.lst    11436 44%     6382 27 Mar 86 15:30:22
phodir.t        13223 45%     7250 14 Oct 86 02:16:10
revv              708 33%      476 17 Aug 86 15:28:48
zcomm.exe      122510 30%    85895 13 Oct 86 14:24:22
------------ -------- --- -------- --------- --------
TOTAL      5   155662 33%   104181
CONVENIENT CATALOGING. Any number of archives may be cataloged with a
single command.
C:/scr>zoo la *
hdchek.exe      25456 23%    19644 28 Aug 86 19:40:42     SAMPLE.ZOO
hdtest.exe      32546 23%    25091 28 Aug 86 19:44:14     SAMPLE.ZOO
ibu.com          9216 19%     7420 18 May 85 15:48:34     SAMPLE.ZOO
indent.com        458  1%      453  1 Jun 85 00:00:00     SAMPLE.ZOO
xpc.com         40562 44%    22754 18 Jun 86 08:42:22     SAMPLE.ZOO
ye.exe          11396 17%     9504 11 Aug 86 19:26:02     SAMPLE.ZOO
yhp.exe         14936 15%    12694  8 Jan 86 22:20:24     SAMPLE.ZOO
zoo.exe         31168 20%    24799 18 Oct 86 21:46:50     SAMPLE.ZOO
readme.doc        272 16%      228 29 Jul 86 11:17:08     TVX_EXE.ZOO
tv.exe          40100 33%    26948 29 Jul 86 10:05:22  C  TVX_EXE.ZOO
tv0.exe         39502 32%    26695 29 Jul 86 10:21:26  C  TVX_EXE.ZOO
tv0_cfg.exe     17905 28%    12838 29 Jul 86 10:22:46  C  TVX_EXE.ZOO
tv0_ptch.exe    13830 27%    10147 29 Jul 86 10:23:56  C  TVX_EXE.ZOO
tv_cfg.exe      17905 28%    12838 29 Jul 86 10:06:42  C  TVX_EXE.ZOO
tv_ptch.exe     13830 27%    10150 29 Jul 86 10:07:50  C  TVX_EXE.ZOO
tvm.exe         40190 33%    27100 29 Jul 86 10:34:16  C  TVX_EXE.ZOO
tvm_ptch.exe    13803 27%    10113 29 Jul 86 10:35:56  C  TVX_EXE.ZOO
tvv.exe         40622 32%    27631 29 Jul 86 11:00:32  C  TVX_EXE.ZOO
tvv_ptch.exe    13830 26%    10175 29 Jul 86 11:02:14  C  TVX_EXE.ZOO
------------ -------- --- -------- --------- --------
TOTAL     19   417527 29%   297222
The output from the "la" command contains all selected files in all selected
archives, sorted by archive name. Each file is listed on a line that
includes the archive name. Running this output through any sort utility
will yield a list of all archived files sorted by filename.
We could also choose the multicolumn format, which lets over a hundred
filenames fit on the screen:
C:/scr>zoo lf *
Archive SAMPLE.ZOO:
hdchek.exe    hdtest.exe    ibu.com       indent.com    xpc.com
ye.exe        yhp.exe       zoo.exe
Archive TVX_EXE.ZOO:
readme.doc    tv.exe        tv0.exe       tv0_cfg.exe   tv0_ptch.exe
tv_cfg.exe    tv_ptch.exe   tvm.exe       tvm_ptch.exe  tvv.exe
tvv_ptch.exe
TWO-STEP DELETION. When a file is added to an archive, any existing ar-
chived file with the same name is replaced. However, Zoo does not erase the
replaced file's data until explicitly asked to do so. Inadvertent deletion
of the replaced file is avoided. For example, we create an archive:
C:/scr>zoo a sample2 /*.bat
Zoo: auto2.bat -- (13%) added
Zoo: autoexec.bat -- (30%) added
Zoo: single.bat -- (16%) added
We add the same files again, replacing the previous ones:
C:/scr>zoo a sample2 /*.bat
Zoo: auto2.bat -- (13%) replaced
Zoo: autoexec.bat -- (30%) replaced
Zoo: single.bat -- (16%) replaced
Here's a directory listing:
C:/scr>zoo l sample2
Archive SAMPLE2.ZOO:
Name           Length  CF Size Now Date      Time
------------ -------- --- -------- --------- --------
auto2.bat         225 13%      196 10 Jul 86 12:45:22
autoexec.bat     1210 30%      851 22 Oct 86 16:08:08
single.bat        287 16%      242  6 Jul 86 19:00:42
------------ -------- --- -------- --------- --------
TOTAL      3     1722 25%     1289
------------
There are 3 deleted files.
The replaced files are still there and can be made visible:
C:/scr>zoo ld sample2
Archive SAMPLE2.ZOO:
Name           Length  CF Size Now Date      Time
------------ -------- --- -------- --------- --------
auto2.bat         225 13%      196 10 Jul 86 12:45:22  D
autoexec.bat     1210 30%      851 22 Oct 86 16:08:08  D
single.bat        287 16%      242  6 Jul 86 19:00:42  D
auto2.bat         225 13%      196 10 Jul 86 12:45:22
autoexec.bat     1210 30%      851 22 Oct 86 16:08:08
single.bat        287 16%      242  6 Jul 86 19:00:42
------------ -------- --- -------- --------- --------
TOTAL      6     3444 25%     2578
------------
D: deleted file.
The deleted files may still be extracted with the appropriate command. The
archive may be packed when the user is sure the deleted files are no longer
needed. (The Novice commands simulate the behaviour of other archive utili-
ties and automatically pack the archive when a file gets replaced. However,
a backup copy of the original archive remains so replaced files can still be
recovered.)
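The two-step behaviour can be modelled in a few lines. This is a sketch only: the Archive class and its method names are invented for illustration and keep entries in memory rather than in an archive file.

```python
class Archive:
    """Toy model of two-step deletion: replace marks, pack removes."""

    def __init__(self):
        self.entries = []  # each entry: {"name", "data", "deleted"}

    def add(self, name, data):
        """Add a file; an existing copy is only marked deleted."""
        replaced = False
        for entry in self.entries:
            if entry["name"] == name and not entry["deleted"]:
                entry["deleted"] = True
                replaced = True
        self.entries.append({"name": name, "data": data, "deleted": False})
        return "replaced" if replaced else "added"

    def listing(self, show_deleted=False):
        """Return (name, deleted) pairs; deleted ones only on request."""
        return [(e["name"], e["deleted"]) for e in self.entries
                if show_deleted or not e["deleted"]]

    def extract(self, name, deleted=False):
        """Return the data of the live copy, or of a deleted copy."""
        for entry in self.entries:
            if entry["name"] == name and entry["deleted"] == deleted:
                return entry["data"]
        raise KeyError(name)

    def pack(self):
        """Second step: discard deleted entries for good."""
        before = len(self.entries)
        self.entries = [e for e in self.entries if not e["deleted"]]
        return before - len(self.entries)
```

Until pack is called, the older copy in this model remains available through extract(name, deleted=True), which mirrors the "ld" listing above.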
BATCH OPERATION. The "Overwrite?" question may be suppressed while extrac-
tion is being done so that a batch file never has to pause for input. When
Zoo terminates, it returns an error code if any errors occurred. The error
code may be tested by the MS-DOS "if errorlevel..." command.
USER INTERRUPTS. Zoo may be interrupted at any time. The interrupt is
recognized within a few seconds, even if extraction or addition of a long
file was in progress. No garbage files are left behind and archive integ-
rity is not harmed.
Z FORMAT FILES. Like the traditional squeezed format, each Z format file
contains one compressed file. Files may be extracted from any Zoo archive
into Z format:
C:/scr>zoo xz sample
Zoo: hdchek.exe ==> hdchek.eze -- extracted
Zoo: hdtest.exe ==> hdtest.eze -- extracted
Zoo: ibu.com ==> ibu.czm -- extracted
Zoo: indent.com ==> indent.czm -- extracted
Zoo: xpc.com ==> xpc.czm -- extracted
Zoo: ye.exe ==> ye.eze -- extracted
Zoo: yhp.exe ==> yhp.eze -- extracted
Zoo: zoo.exe ==> zoo.eze -- extracted
Each selected file is extracted into a Z format file with a name of the form
*.?Z?. Since no uncompression is done, the extraction is extremely fast.
All file attributes (date, time, length, compression factor, and any
attached comments) are preserved. Z format files may now be selectively
added to other archives:
C:/scr>zoo az new h*.?z? *t.?z?
Zoo: hdchek.exe <== hdchek.eze -- added
Zoo: hdtest.exe <== hdtest.eze -- added
Zoo: indent.com <== indent.czm -- added
This has created a new archive with only three files in it. A directory
listing shows this:
C:/scr>zoo l new
Archive NEW.ZOO:
Name           Length  CF Size Now Date      Time
------------ -------- --- -------- --------- --------
hdchek.exe      25456 23%    19644 28 Aug 86 19:40:42
hdtest.exe      32546 23%    25091 28 Aug 86 19:44:14
indent.com        458  1%      453  1 Jun 85 00:00:00
------------ -------- --- -------- --------- --------
TOTAL      3    58460 23%    45188
Since all file attributes are retained in Z format, reorganization of
archives via this format does not risk changing a file's timestamp or losing
any attached comment.
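The *.?Z? naming seen in the extraction example can be expressed as a small function. This sketch covers only the case the examples show, a three-character extension whose middle character becomes "z"; how other extensions are handled is not stated here, so the function rejects them rather than guess.

```python
def z_name(name):
    """Derive a Z format filename, e.g. hdchek.exe -> hdchek.eze."""
    base, dot, ext = name.rpartition(".")
    if not dot or len(ext) != 3:
        # Assumption: only three-character extensions are modelled.
        raise ValueError(f"unsupported name for this sketch: {name!r}")
    return f"{base}.{ext[0]}z{ext[2]}"
```

The mapping loses the middle character of the extension, so the original name cannot be rebuilt from the Z format filename alone. The "az" example above reports the original names, so the name must be carried inside the Z format file along with the other attributes.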
The real beneficiary of the Z format is likely to be the bulletin board user
who wishes to avoid downloading a huge archive solely to get selected files
from it. Bulletin board systems allowing online execution of programs can
let users selectively extract archived files into Z format without giving up
the advantage of compression.
Because of the availability of a version of Zoo written entirely in portable
C, authors of bulletin board software can integrate Zoo extraction code into
their own. This makes it unnecessary to go to awkward extremes to allow
online execution in a safe manner.
THE PORTABLE ZOO
Zoo is written in portable C. The MS-DOS version uses some assembly
language modules and some compiler-specific code for speed and convenience.
The portable Zoo as it currently exists includes all features of the MS-DOS
version except the following: addition of files to an archive, preservation
of file timestamps, and Z format files. Porting of these remaining features
is in progress. The target date for the availability of the full portable
Zoo archiver is December 31, 1986.
The code for the current portable version is completely independent of
machine architecture, byte ordering, and the way the compiler packs struc-
tures. It currently compiles and runs on three systems: (a) MS-DOS using
the Microsoft C compiler version 3.0; (b) An Intel 310 system with an 80286
CPU running Xenix release 3.0; and (c) An AT&T 3B2 system running System V
Release 2.1.
The eventual objective is to create a portable version that will compile and
execute on a variety of systems with no changes other than in a separate
machine-dependent header file.
COMMON USER QUESTIONS
Here are my responses to commonly-raised questions.
Q. Why can't Zoo be compatible with ARC?
A. Compatibility with the ARC format would make most of Zoo's major goals
impossible to achieve. The ARC format is not extensible. Any change
in the directory structure of an ARC archive makes it immediately
unreadable by all existing ARC format utilities.
ARC extractors are available on several systems. Invariably, each
implementation was an independent effort. Any change in the ARC format
would therefore make it necessary for all those independent implemen-
tors to separately rewrite their code.
Using the ARC format in modified form would only confuse and frustrate
users who would expect their current utilities to work with any file
with an extension of ".ARC".
Q. Why can't you at least make Zoo read and extract ARC files?
A. ARC extractors already exist. It would just make Zoo a little more
complicated.
Q. What is the overworked Sysop going to do? It's bad enough having both
LBR and ARC formats to deal with, without making things worse by adding
a third.
A. Some busy Sysops have already solved the problem by banning all Zoo-
related files from their systems. It's not an approach I encourage,
but it's effective.
Another approach is to recognize that it's possible for several archive
formats to co-exist while the user population decides what the right
balance is. You will still see LBR and LQR files on many systems, and
users do manage quite well.
It's also possible for a Sysop to simply convert to the Zoo format.
Q. I remember when Sysops converted from LBR to ARC. It was a mess. It
took ages to perform the conversion and it was confusing keeping track
of which file belonged in which archive.
A. It certainly was, but not any more. The Atoz utility allows conversion
of a large number of files in LBR or ARC format to Zoo format
relatively easily.
For each selected archive, Atoz executes an external extraction program
of your choice. Then it runs Zoo to create a Zoo archive from the
extracted files. Atoz will ask permission before creating each new Zoo
archive. Optionally, it will convert an entire batch of LBR or ARC
files to Zoo archives without user attention.
On a 4.77 MHz MS-DOS system, with a hard disk, conversion speed is
roughly 3 megabytes per hour, or most of a 10 megabyte collection in 3
hours. This does not include time for any disk swapping needed.
Q. This is all very well for experienced users, but what about the poor
novice users?
A. They will learn more easily than you might suspect. The world of
public domain software has always depended on the more experienced
users helping out the less experienced ones. Zoo isn't going to change
that.
Q. I hate to see Sysops spending more time than they have explaining
another archiving technique. Even if it takes very little time to
explain it to each user, it begins to add up.
A. True, and that's why most Sysops have a bulletin describing the various
file formats such as LBR, LQR, and ARC. Individual Sysops have dif-
ferent policies about users who won't read bulletins. Some Sysops
ignore such users; others gently point out which bulletin answers the
question.
Most Sysops run bulletin boards as a hobby and a public service. If
they can save their users download time by making Z format extraction
available, that can only enhance the operation of the bulletin board.
Many BBSs cater to several computer systems. The availability of a
truly universal standard will help.
Q. Why do I care about whether or not a Macintosh or a UNIX System V
machine can extract archives? I only use MS-DOS and that's good enough
for me.
A. You don't have to care, but users of those machines do. Perhaps they
would like to download some text files from an IBM-oriented BBS, or
perhaps they have some thoughts they would like to share with you by
uploading a file or two. There are enough text files that are of use
to users of many different machines to make a universal standard very
appealing. In particular, programs written in high-level languages can
be adapted to run on a variety of machines. Making such exchange
possible without sacrificing the advantages of data compression will
help extend the dwindling supply of public domain software in source
form.
Users of specific computer systems have frequently expressed disdain
for users of other computer systems. First it used to be the TRS-80
versus the Apple. Then, for a while, it was CP/M versus MS-DOS. Now
it's often the Atari versus the Amiga. And sometimes, it's IBM
compatibles versus everything else. On the entire landscape of tech-
nological development, however, none of these is more than a speck.
Historically, few computer systems have survived much longer than about
five years. Some lasted a little longer. Others were fading away even
as they were introduced. The Zoo format is designed to grow as the
needs of users grow. It goes far beyond any one machine or any one
operating system.
Individual users will have to decide how narrow or how wide they want
their horizons to be.
Q. Talking about software in source form...what about the source code for
Zoo?
A. The full source for Zoo will be distributed when the portable version
is ready. In the meantime, C programmers who would like to port Zoo to
a system of interest are invited to join in the effort. Required is
the ability to exchange data with IBM format diskettes, and the availa-
bility of a full C compiler. The file addition part of Zoo still
remains to be re-coded in portable C and it should be ready in a few
weeks as I write this. The current Zoo documentation tells you how to
contact the author.
PERFORMANCE
Bundled with Zoo version 1.20 was a file called ZOOBENCH.TXT, which gave the
results of my own tests. I tested Zoo 1.20 against ARCA 1.18, ARCE 1.18,
PKARC 1.0, and PKXARC 2.5. I used three sets of files: (a) binary files
only; (b) text files only; and (c) a "typical" mix. My conclusion was
that Zoo 1.20 produced smaller archives than the other programs, and did it
faster.
The competition apparently has not stood still. In a file called
ZOOBAD.TXT, Bob Mahoney, Sysop of the Exec-PC BBS, says that Zoo gives only
a 1% compression improvement and is much slower.
A couple of days after I came across ZOOBAD.TXT, I decided to confirm Bob
Mahoney's results for myself. At about 0350 EST on the morning of Octo-
ber 31, I called the Exec-PC BBS. From the files area that is accessible
for download to all callers, I downloaded the file PKX32A11.COM. As best as
I can remember, it was entitled: "The best, fastest, file archiver".
The benchmarks quoted by Bob Mahoney are for PKARC 1.2 and PKXARC 3.3. When
I executed the self-extracting file PKX32A11, what came out from it was
PKARC version 1.1 (not 1.2) and PKXARC version 3.2 (not 3.3). That concerns
me. Sysop Mahoney is quoting figures for versions of these programs that as
far as I can tell aren't listed on his BBS. I cannot confirm or disprove
the claimed results.
Here's a test using the versions that I could download. I used my IBM PC,
which has a 10 megabyte hard disk. I also included the ARCA utility in the
test. I used the contents of file ZCOMMDOC.ARC downloaded from GEnie's IBM
PC RoundTable (it's file number 2134 in software library 6). The files in
this archive were (in Zoo listing form):
Archive ZCOMMDOC.ZOO:
Name           Length  CF Size Now Date      Time
------------ -------- --- -------- --------- --------
zcomm.bug        1387 42%      798  4 Oct 86 12:34:18
zcomman.aa     149915 62%    57692  4 Oct 86 05:15:14
zcomman.ab     186384 64%    66581  4 Oct 86 05:24:08
zcomman.ac     148480 63%    55538  4 Oct 86 05:30:44
zcomman.ad       2910 68%      942  4 Oct 86 05:30:46
zcomman.ai      14823 55%     6686  4 Oct 86 05:36:04
------------ -------- --- -------- --------- --------
TOTAL      6   503899 63%   188237
Here are my results:
program       archiving time    final size of archive    extraction time
-------       --------- ----    ----- ---- -- -------    ---------- ----
Zoo 1.31      2 min 44 s        188,665 bytes            2 min 0 s
PKARC 1.1     2 min 20 s        198,086 bytes
PKXARC 3.2                                               1 min 18 s
ARCA 1.22     2 min 38 s        199,438 bytes
The PK utilities are clearly fast. ARCA is faster than Zoo too. But notice
the size of each archive.
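To put numbers on that comparison, the figures from the table can be run through a few lines of arithmetic. The variable and function names are just for this example; the times and sizes are the ones reported above.

```python
# Archiving time in seconds and archive size in bytes, from the table above.
RESULTS = {
    "Zoo 1.31": {"seconds": 2 * 60 + 44, "bytes": 188_665},
    "PKARC 1.1": {"seconds": 2 * 60 + 20, "bytes": 198_086},
    "ARCA 1.22": {"seconds": 2 * 60 + 38, "bytes": 199_438},
}


def percent_more(value, baseline):
    """How much larger value is than baseline, as a percentage."""
    return 100.0 * (value - baseline) / baseline


def compare_to_zoo(results=RESULTS):
    """Return {program: (size % larger than Zoo, Zoo time % longer)}."""
    zoo = results["Zoo 1.31"]
    return {
        name: (
            percent_more(r["bytes"], zoo["bytes"]),
            percent_more(zoo["seconds"], r["seconds"]),
        )
        for name, r in results.items()
        if name != "Zoo 1.31"
    }


if __name__ == "__main__":
    for name, (size_pct, time_pct) in compare_to_zoo().items():
        print(f"{name}: archive {size_pct:.2f}% larger than Zoo's; "
              f"Zoo took {time_pct:.2f}% longer to archive")
```

On these figures the PKARC archive comes out about 5.0% larger than the Zoo archive and the ARCA archive about 5.7% larger, while Zoo took roughly 17% longer than PKARC and about 4% longer than ARCA to build it.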
I make this suggestion to you: Be very careful when you evaluate claims of
performance, whether from me or from anybody else. It will be a tragedy if
users decide against using Zoo without ever themselves trying it out. Re-
member, Zoo is currently public domain and it costs you nothing save perhaps
a phone call or some connect time charges to get a copy for yourself. Try
it out, get a feel for it, experience its power.
And then, Gentle Reader, before you make your decision, be sure you have
understood the philosophy behind the Zoo project. Remember too that other
archivers have reached their current state after a long period of develop-
ment. They can go little further without violating the standard they are
trying to adhere to.
This is just the beginning for Zoo.
As my online acquaintance Tom Herman says: Let ZOO go at its own pace, and
see if all the promised features appear. If you set out to kill it through
defamation, we may never see what it could grow to be.
Thanks, Tom. And I thank you, the reader, for reading this far.
-- Rahul Dhesi 1986/11/01