->docs.compress
Name: compress
Purpose: Data compression
Usage: compress [-dfvcV] [-b maxbits] [file ...]
-V => print Version
-e => erase old file
-d => uncompress
-v => verbose
-f => force overwrite of output file
-n => no header: useful to uncompress old files
-b maxbits => maximum number of bits per code. If -b is specified, then
maxbits MUST be given also.
-c => cat all output to stdout
-C => generate output compatible with compress 2.0
Overview
========
Compress is a UNIX program for squashing files so that they take up less
room, either for transfer or for storage. Compressed files can be identified
by the fact that they usually have a '.Z' at the end of their filename and
are indecipherable by conventional means. Basically, compress is fed an
ordinary file, which it compresses and writes back to disc with a '.Z'
(or, in the case of the Archimedes, a '_Z') tacked onto the end of the old
file name. File attributes, load and exec addresses etc. are preserved on the
new file. When you actually want to use the file, you feed the _Z form into
either uncompress or compress with the -d flag (these are identical programs)
and the original file without the '_Z' is recreated.
Compress uses the LZW technique, like arc (elsewhere on this disc). A crucial
parameter in this is the 'number of bits' used for codes. Whereas arc uses
a fixed number, 13, compress can use a variable number. This has advantages
because, for long files, the more bits are used, the more efficient the
compression is. However, more bits means more memory, and some machines do not
have as much memory as one might guess. All this implies that either you or
someone else may end up with a 'Z' file that can't be expanded on a given
machine.
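To give a feel for why more bits cost more memory, the following Python
sketch (an illustration only, not part of compress) counts how many codes the
dictionary can hold for each code width from 9 to 16 bits, the range given in
the Description. The memory needed per entry is not stated in this document,
so the sketch counts entries rather than bytes.

```python
def dictionary_entries(bits):
    """Number of distinct codes available with the given code width."""
    return 1 << bits

# Each extra bit doubles the number of substrings the table may have to hold.
for bits in range(9, 17):
    print(f"{bits:2d} bits -> {dictionary_entries(bits):6d} codes")
```

A 16-bit table can hold sixteen times as many entries as a 12-bit one, which
is why a file encoded with 16 bits may not expand on a smaller machine.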
This version of compress needs 500K of memory to run, over and above the
program memory, and will cope with compressed files encoded using up to and
including 16 bits. If you want to be on the safe side, encode your files
using 12 bits, which is something of a standard, being the best that PDP11
owners can manage. It is not, it should be said, too much of a standard, in
case you were thinking things were getting simple at this point. There have
been several older formats for .Z files, which compress can cope with if the
correct flags are set, so look out for these.
Description
===========
Compress reduces the size of the named files using adaptive Lempel-Ziv coding.
Whenever possible, each file is replaced by one with the extension _Z while
keeping the same attributes. If no files are specified, the standard input is
compressed to the standard output. Compressed files can be restored to their
original form using uncompress or compress -d.
Compress uses the modified Lempel-Ziv algorithm popularized in
"A Technique for High Performance Data Compression", Terry A. Welch, "IEEE
Computer", vol. 17, no. 6 (June 1984), pp. 8-19.
Common substrings in the file are first replaced by 9-bit codes 257 and up.
When code 512 is reached, the algorithm switches to 10-bit codes and
continues to use more bits until the limit specified by the -b flag is reached
(default 16). Bits must be between 9 and 16. After the bits limit is attained,
compress periodically checks the compression ratio. If it is increasing,
compress continues to use the existing code dictionary. However, if the
compression ratio decreases, compress discards the table of substrings and
rebuilds it from scratch. This allows the algorithm to adapt to the next
"block" of the file.
Note that the -b flag is omitted for uncompress, since the bits parameter
specified during compression is encoded within the output, along with
a magic number to ensure that neither decompression of random data nor
recompression of compressed data is attempted.
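The coding scheme described above (9-bit codes from 257 upward, widening as
the table grows) can be sketched in Python. This is a simplified illustration
and not the code used by compress: the function name lzw_codes is made up for
the example, code 256 is simply left unused, the point at which the width
changes follows the wording above rather than the real program, and the
periodic compression-ratio check that discards the table is omitted. The
output is a list of (code, width) pairs rather than packed bits, so it is not
compatible with real _Z files.

```python
def lzw_codes(data, maxbits=16):
    """Return a list of (code, width) pairs for the bytes in data.

    Codes 0-255 stand for single bytes; new substrings get codes
    from 257 upward. Widths start at 9 bits and grow up to maxbits.
    """
    if not 9 <= maxbits <= 16:
        raise ValueError("maxbits must be between 9 and 16")
    table = {bytes([i]): i for i in range(256)}
    next_code = 257
    width = 9
    limit = 1 << maxbits          # no codes are assigned beyond this
    out = []
    current = b""
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate   # keep extending the known substring
            continue
        out.append((table[current], width))
        if next_code < limit:
            table[candidate] = next_code
            next_code += 1
            # Widen the codes once the newest code no longer fits.
            if next_code > (1 << width) and width < maxbits:
                width += 1
        current = bytes([value])
    if current:
        out.append((table[current], width))
    return out
```

For example, lzw_codes(b"ABABABA") returns
[(65, 9), (66, 9), (257, 9), (259, 9)]: the repeated "AB" and "ABA" are each
replaced by a single code once they have been seen.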
The amount of compression obtained depends on the size of the input, the number
of bits per code, and the distribution of common substrings. Typically, text
such as source code or English is reduced by 50-60%. Compression is generally
much better than that achieved by Huffman coding or adaptive Huffman coding
and takes less time to compute. Under the -v option, a message is printed
yielding the percentage of reduction for each file compressed. If the -V
option is specified, the current version and compile options are printed on
stderr.
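The figure reported under -v can be thought of as the fraction of the input
that was saved. A small Python sketch, assuming the percentage is computed as
(original - compressed) / original; the exact formula and rounding used by
compress are not given in this document, and the function name is made up for
the example.

```python
def percent_saved(original_size, compressed_size):
    """Percentage of the input saved by compression (negative if it grew)."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return 100.0 * (original_size - compressed_size) / original_size
```

Under this assumption a 1000-byte text file that compresses to 450 bytes shows
a saving of 55%, which sits in the 50-60% range quoted above.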
The -f flag causes existing files on disc to be overwritten, whilst the
-e flag makes compress delete the source file after using it.
Diagnostics
===========
Usage: compress [-dfvcV] [-b maxbits] [file ...]
Invalid options were specified on the command line.
Missing maxbits
Maxbits must follow -b.
file :not in compressed format
The file specified to uncompress has not been compressed.
file :compressed with xx bits, can only handle yy bits.
File was compressed by a program that could deal with more bits than the
compress code on this machine. Recompress the file with a smaller maxbits.
file :already has _Z suffix -- no change
The file is assumed to be already compressed.
Rename the file and try again.
file :filename too long to tack on _Z
The file cannot be compressed because its name is longer than 10 characters.
Rename and try again.
file already exists; do you wish to overwrite (y or n)?
Respond "y" if you want the output file to be replaced; "n" if not.
uncompress: corrupt input
A memory access violation was detected which usually means that the input file
has been corrupted.
Compression: "xx.xx%"
Percentage of the input saved by compression.
(Relevant only for -v.)
-- not a regular file: unchanged
When the input file is not a regular file (e.g. a directory), it is left
unaltered.
-- file unchanged
No saving is achieved by compression. The input file is left unchanged.
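The "not in compressed format" and "compressed with xx bits" diagnostics both
come from inspecting the start of the file. A minimal Python sketch of such a
check follows. It assumes the three-byte header used by the common UNIX
compress (magic bytes 0x1F 0x9D, then a byte whose low five bits hold maxbits
and whose top bit marks block mode). This document does not spell the layout
out, so treat the constants as assumptions; headerless files of the kind
handled by -n would not pass this check.

```python
MAGIC = b"\x1f\x9d"   # assumed magic number (UNIX compress convention)
BITS_MASK = 0x1F      # assumed: low five bits of the third byte hold maxbits
BLOCK_MODE = 0x80     # assumed: top bit of the third byte marks block mode

def read_header(data, supported_bits=16):
    """Return (maxbits, block_mode) for compressed data, or raise ValueError."""
    if len(data) < 3 or data[:2] != MAGIC:
        raise ValueError("not in compressed format")
    maxbits = data[2] & BITS_MASK
    if maxbits > supported_bits:
        raise ValueError(
            f"compressed with {maxbits} bits, "
            f"can only handle {supported_bits} bits"
        )
    return maxbits, bool(data[2] & BLOCK_MODE)
```

A decompressor limited to 12 bits would call read_header(data, 12) and reject
a 16-bit file before trying to expand it.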
Notes
=====
Release 1.00 October 1988.
Archimedes implementation (c) David Pilling 1988.