home *** CD-ROM | disk | FTP | other *** search
- This is version 2.0b1 of macutil (26-APR-1992).
-
- This package contains the following utilities:
- macunpack
- hexbin
- macsave
-
- Requirements:
- a. Of course a C compiler.
- b. A 32-bit machine with large memory (or at least the ability to 'malloc'
- large chunks of memory). For reasons of efficiency and simplicity the
- programs work 'in-core', also many files are first read in core.
- If somebody can take the trouble to do it differently, go ahead!
- There are also probably in a number of places implicit assumptions that
- an int is 32 bits. If you encounter such occurrences feel free to
- notify me.
- c. A Unix (tm) machine, or something very close. There are probably quite
- a lot of Unix dependencies. Also here, if you have replacements, feel
- free to send comments.
- d. This version normally uses the 'mkdir' system call available on BSD Unix
- and some versions of SysV Unix. You can change that, see the makefile for
- details.
-
- File name translation:
-
- The programs use a table driven program to do Mac filename -> Unix filename
- translation. When compiled without further changes the translation is as
- follows:
- Printable ASCII characters except space and slash are not changed.
- Slash and space are changed to underscore, as are all characters that
- do not fall in the following group.
- Accented letters are translated to their unaccented counterparts.
- If your system supports the Latin-1 character set, you can change this
- translation scheme by specifying '-DLATIN1' for the 'CF' macro in the
- makefile. This will translate all accented letters (and some symbols)
- to their Latin-1 counterpart. This feature is untested (I do not have
- access to systems that cater for Latin-1), so use with care.
- Future revisions of the program will have user settable conversions.
-
- Another feature of filename translation is that when the -DNODOT flag is
- specified in the CF macro an initial period will be translated to underscore.
-
- Appleshare support:
-
- Optionally the package can be compiled for systems that support the sharing
- of Unix and Mac filesystems. The package supports AUFS (AppleTalk Unix File
- Server) from CAP (Columbia AppleTalk Package) and AppleDouble (from Apple).
- It will not support both at the same time. Moreover this support requires
- the existence of the 'mkdir' system call. And finally, as implemented it
- probably will work on big-endian BSD compatible systems. If you have a SysV
- system with restricted filename lengths you can get problems. I do not know
- also whether the structures are stored native or Apple-wise on little-endian
- systems. And also, I did not test it fully; having no access to either AUFS
- or AppleDouble systems.
-
- Acknowledgements:
- a. Macunpack is for a large part based on the utilities 'unpit' and 'unsit'
- written by:
- Allan G. Weber
- weber%brand.usc.edu@oberon.usc.edu
- (wondering whether that is still valid!). I combined the two into a
- single program and did a lot of modification. For information on the
- originals, see the files README.unpit and README.unsit.
- b. The crc-calculating routines are based on a routine originally written by:
- Mark G. Mendel
- UUCP: ihnp4!umn-cs!hyper!mark
- (this will not work anymore for sure!). Also here I modified the stuff
- and expanded it, see the files README.crc and README.crc.orig.
- c. LZW-decompression is taken from the sources of compress that are floating
- around. Probably I did not use the most efficient version, but this
- program was written to get it done. The version I based it on (4.0) is
- authored by:
- Steve Davies (decvax!vax135!petsd!peora!srd)
- Jim McKie (decvax!mcvax!jim) (Hi Jim!)
- Joe Orost (decvax!vax135!petsd!joe)
- Spencer W. Thomas (decvax!harpo!utah-cs!utah-gr!thomas)
- Ken Turkowski (decvax!decwrl!turtlevax!ken)
- James A. Woods (decvax!ihnp4!ames!jaw)
- I am sure those e-mail addresses also will not work!
- d. Optional AUFS support comes from information supplied by:
- Casper H.S. Dik
- University of Amsterdam
- Kruislaan 409
- 1098 SJ Amsterdam
- Netherlands
-
- phone: +31205922022
- email: casper@fwi.uva.nl
- This is an e-mail address that will work.
- See the makefile.
- Some caveats are applicable:
- 1. I did not fully test it (we do not use it). But the unpacking
- appears to be correct. Anyhow, as the people who initially compile
- it and use it will be system administrators I am confident they are
- able to locate bugs! (What if an archive contains a Macfile with
- the name .finderinfo or .resource? I have had two inputs for AUFS
- support [I took Caspers; his came first], but both do not deal with
- that. Does CAP deal with it?) Also I have no idea whether this
- version supports it under SysV, so beware.
- 2. From one of the README's supplied by Casper:
- Files will not appear in an active folder, because Aufs doesn't like
- people working behind it's back.
- Simply opening and closing the folder will suffice.
- Appears to be the same problem as when you are unpacking or in some
- other way creating files in a folder open to multifinder. I have seen
- bundle bits disappear this way. So if after unpacking you see the
- generic icon; check whether a different icon should appear and check
- the bundle bit.
- The desktop isn't updated, but that doesn't seem to matter.
- I dunno, not using it.
- e. Man pages are now supplied. The base was provided by:
- Douglas Siebert
- ISCA
- dsiebert@icaen.uiowa.edu
- f. Because of some problems the Uncompactor has been rewritten, it is now
- based on sources from the dearchiver unzip (of PC fame). Apparently the
- base code is by:
- Samuel H. Smith
- I have no further address available, but as soon as I find a better
- attribution, I will include it.
- g. UnstuffIt's LZAH code comes from lharc (also of PC fame) by:
- Haruhiko Okumura,
- Haruyasu Yoshizaki,
- Yooichi Tagawa.
- h. Zoom's code comes from information supplied by Jon W{tte
- (d88-jwa@nada.kth.se). The Zoo decompressor is based on the routine
- written by Rahul Dhesi (dhesi@cirrus.COM). This again is based on
- code by Haruhiko Okumura. See also the file README.zoom.
- i. MacLHa's decompressors are identical to the ones mentioned in g and h.
- j. Most of hexbin's code is based on code written/modified by:
- Dave Johnson, Brown University Computer Science
- Darin Adler, TMQ Software
- Jim Budler, amdcad!jimb
- Dan LaLiberte, liberte@uiucdcs
- ahm (?)
- Jeff Meyer, John Fluke Company
- Guido van Rossum, guido@cwi.nl (Hi!)
- (most of the e-mail addresses will not work, the affiliation may also
- be incorrect by now.) See also the file README.hexbin.
- k. The dl code in hexbin comes is based on the original distribution of
- SUMacC.
- l. The mu code in hexbin is a slight modification of the hcx code (the
- compressions are identical).
- m. The MW code for StuffIt is loosely based on code by Daniel H. Bernstein
- (brnstnd@acf10.nyu.edu).
-
- -------------------------------------------------------------------------------
- Macunpack will unpack PackIt, StuffIt, Diamond, Compactor/Compact Pro, most
- StuffItClassic/StuffItDeluxe, and all Zoom and LHarc/MacLHa archives.
- Also it will decode files created by BinHex5.0, MacBinary, UMCP,
- Compress It, ShrinkToFit, MacCompress and DiskDoubler.
-
- (PackIt, StuffIt, Diamond, Compactor, Compact/Pro, Zoom and LHarc/MacLHa are
- archivers written by respectively: Harry R. Chesley, Raymond Lau, Denis Sersa,
- Bill Goodman, Jon W{tte* and Kazuaki Ishizaki. BinHex 5.0, MacBinary and
- UMCP are by respectively: Yves Lempereur, Gregory J. Smith, Information
- Electronics. ShrinkToFit is by Roy T. Hashimoto, Compress It by Jerry
- Whitnell, and MacCompress and DiskDoubler are both by Lloyd Chambers.)
-
- * from his signature:
- Jon W{tte - Yes, that's a brace - Damn Swede.
- Actually it is an a with two dots above; some (German inclined) people
- refer to it (incorrectly) as a-umlaut.
-
- It does not deal with:
- a. Password protected archives.
- b. Multi-segment archives.
- c. Plugin methods for Zoom.
- d. MacLHa archives not packed in MacBinary mode (the program deals very
- poorly with that!).
-
- Background:
- There are millions of ways to pack files, and unfortunately, all have been
- implemented one way or the other. Below I will give some background
- information about the packing schemes used by the different programs
- mentioned above. But first some background about compression (I am no
- expert, more comprehensive information can be found in for instance:
- Tomothy Bell, Ian H. Witten and John G. Cleary, Modelling for Text
- Compression, ACM Computing Surveys, Vol 21, No 4, Dec 1989, pp 557-591).
-
- Huffman encoding (also called Shannon-Fano coding or some other variation
- of the name). An encoding where the length of the code for the symbols
- depends on the frequency of the symbols. Frequent symbols have shorter
- codes than infrequent symbols. The normal method is to first scan the
- file to be compressed, and to assign codes when this is done (see for
- instance: D. E. Knuth, the Art of Computer Programming). Later methods
- have been designed to create the codes adaptively; for a survey see:
- Jeremy S. Vetter, Design and Analysis of Dynamic Huffman Codes, JACM,
- Vol 34, No 4, Oct 1987, pp 825-845.
- LZ77: The first of two Ziv-Lempel methods. Using a window of past encoded
- text, output consists of triples for each sequence of newly encoded
- symbols: a back pointer and length of past text to be repeated and the
- first symbol that is not part of that sequence. Later versions allowed
- deviation from the strict alternation of pointers and uncoded symbols
- (LZSS by Bell). Later Brent included Huffman coding of the pointers
- (LZH).
- LZ78: While LZ77 uses a window of already encoded text as a dictionary,
- LZ78 dynamically builds the dictionary. Here again pointers are strictly
- alternated with unencoded new symbols. Later Welch (LZW) managed to
- eliminate the output of unencoded symbols. This algorithm is about
- the same as the one independently invented by Miller and Wegman (MW).
- A problem with these two schemes is that they are patented. Thomas
- modified LZW to LZC (as used in the Unix compress command). While LZ78
- and LZW become static once the dictionary is full, LZC has possibilities
- to reset the dictionary. Many LZC variants are in use, depending on the
- size of memory available. They are distinguished by the maximum number
- of bits that are used in a code.
- A number of other schemes are proposed and occasionally used. The main
- advantage of the LZ type schemes is that (especially) decoding is fairly fast.
-
- Programs background:
-
- Plain programs:
- BinHex 5.0:
- Unlike what its name suggest this is not a true successor of BinHex 4.0.
- BinHex 5.0 takes the MacBinary form of a file and stores it in the data
- fork of the newly created file.
- Although BinHex 5.0 does not create BinHex 4.0 compatible files, StuffIt
- will give the creator type of BinHex 5.0 (BnHq) to its binhexed files,
- rather than the creator type of BinHex 4.0 (BNHQ). The program knows
- about that.
- MacBinary:
- As its name suggests, it does the same as BinHex 5.0.
- UMCP:
- Looks similar, but the file as stored by UMCP is not true MacBinary.
- Size fields are modified, the result is not padded to a multiple of 128,
- etc. Macunpack deals with all that, but until now is not able to
- correctly restore the finder flags of the original file. Also, UMCP
- created files have type "TEXT" and creator "ttxt", which can create a
- bit of confusion. Macunpack will recognize these files only if the
- creator has been modified to "UMcp".
-
- Compressors:
- ShrinkToFit:
- This program uses a Huffman code to compress. It has an option (default
- checked for some reason), COMP, for which I do not yet know the
- meaning. Compressing more than a single file in a single run results
- in a failure for the second and subsequent files.
- Compress It:
- Also uses a Huffman code to compress.
- MacCompress:
- MacCompress has two modes of operation, the first mode is (confusingly)
- MacCompress, the seecond mode is (again confusingly) UnixCompress. In
- MacCompress mode both forks are compressed using the LZC algorithm.
- In UnixCompress mode only the data fork is compressed, and some shuffling
- of resources is performed. Upto now macunpack only deals with MacCompress
- mode. The LZC variant MacCompress uses depends on memory availability.
- 12 bit to 16 bit LZC can be used.
-
- Archivers:
- ArcMac:
- Nearly PC-Arc compatible. Arc knows 8 compression methods, I have seen
- all of them used by ArcMac, except the LZW techniques. Here they are:
- 1: No compression, shorter header
- 2: No compression
- 3: (packing) Run length encoding
- 4: (squeezing) RLE followed by Huffman encoding
- 5: (crunching) LZW
- 6: (crunching) RLE followed by LZW
- 7: (crunching) as the previous but with a different hash function
- 8: (crunching) RLE followed by 12-bit LZC
- 9: (squashing) 13-bit LZC
- PackIt:
- When archiving a file PackIt either stores the file uncompressed or
- stores the file Huffman encoded. In the latter case both forks are
- encoded using the same Huffman tree.
- StuffIt and StuffIt Classic/Deluxe:
- These have the ability to use different methods for the two forks of a
- file. The following standard methods I do know about (the last three
- are only used by the Classic/Deluxe version 2.0 of StuffIt):
- 0: No compression
- 1: Run length encoding
- 2: 14-bit LZC compression
- 3: Huffman encoding
- 5: LZAH: like LZH, but the Huffman coding used is adaptive
- 6: A Huffman encoding using a fixed (built-in) Huffman tree
- 8: A MW encoding
- Diamond:
- Uses a LZ77 like frontend plus a Fraenkel-Klein like backend (see
- Apostolico & Galil, Combinatorial Algorithms on Words, pages 169-183).
- Compactor/Compact Pro:
- Like StuffIt, different encodings are possible for data and resource fork.
- Only two possible methods are used:
- 0: Run length encoding
- 1: RLE followed by some form of LZH
- Zoom:
- Data and resource fork are compressed with the same method. The standard
- uses either no compression or some form of LZH
- MacLHa:
- Has two basic modes of operation, Mac mode and Normal mode. In Mac mode
- the file is archived in MacBinary form. In normal mode only the forks
- are archived. Normal mode should not be used (and can not be unpacked
- by macunpack) as all information about data fork size/resource fork size,
- type, creator etc. is lost. It knows quite a few methods, some are
- probably only used in older versions, the only methods I have seen used
- are -lh0-, -lh1- and -lh5-. Methods known by MacLHa:
- -lz4-: No compression
- -lz5-: LZSS
- -lzs-: LZSS, another variant
- -lh0-: No compression
- -lh1-: LZAH (see StuffIt)
- -lh2-: Another form of LZAH
- -lh3-: A form of LZH, different from the next two
- -lh4-: LZH with a 4096 byte buffer (as far as I can see the coding in
- MacLHa is wrong)
- -lh5-: LZH with a 8192 byte buffer
- DiskDoubler:
- The older version of DiskDoubler is compatible with MacCompress. It does
- not create archives, it only compresses files. The newer version (since
- 3.0) does both archiving and compression. The older version uses LZC as
- its compression algorithm, the newer version knows a number of different
- compression algorithms. Many (all?) are algorithms used in other
- archivers. Probably this is done to simplify conversion from other formats
- to DiskDoubler format archives. I have seen actual DiskDoubler archives
- that used methods 0, 1 and 8:
- 0: No compression
- 1: LZC
- 2: ???
- 3: RLE
- 4: Huffman (or no compression)
- 5: ???
- 6: ???
- 7: An improved form of LZSS
- 8: Compactor/Compact Pro compatible RLE/LZH or RLE only
- 9: ???
- The DiskDoubler archive format contains many subtle twists that make it
- difficult to properly read the archive (or perhaps this is on purpose?).
-
- Naming:
- Some people have complained about the name conflict with the unpack utility
- that is already available on Sys V boxes. I had forgotten it, so there
- really was a problem. The best way to solve it was to trash pack/unpack/pcat
- and replace it by compress/uncompress/zcat. Sure, man uses it; but man uses
- pcat, so you can retain pcat. If that was not an option you were able to feel
- free to rename the program. But finally I relented. It is now macunpack.
-
- When you have problems unpacking an archive feel free to ask for information.
- I am especially keen when the StuffIt part detects Fixed Huffman or bigram
- compression methods. If you encounter such an archive, please, mail a
- 'binhexed' copy of the archive to me so that I can deal with it. Password
- protected archives are (as already stated) not implemented. I do not have much
- inclination to do that. Also I feel no inclination to do multi-segment
- archives.
-
- -------------------------------------------------------------------------------
- Hexbin will de-hexify files created in BinHex 4.0 compatible format (hqx)
- but also the older format (dl, hex and hcx). Moreover it will uudecode
- files uuencoded by UUTool (the only program I know that does UU hexification
- of all Mac file information).
-
- There are currently many programs that are able to create files in BinHex 4.0
- compatible format. There are however some slight differences, and most
- de-hexifiers are not able to deal with all the variations. This program is
- very simple minded. First it will intuit (based on the input) whether the
- file is in dl, hex, hcx or hqx format. Next it will de-hexify the file.
- When the format is hqx, it will check whether more files follow, and continue
- processing. So you can catenate multiple (hqx) hexified files together and
- feed them as a single file to hexbin. Also hexbin does not mind whether lines
- are separated by CR's, LF's or combinations of the two. Moreover, it will
- strip all leading, trailing and intermediate garbage introduced by mailers
- etc. Next, it does not mind if a file is not terminated by a CR or an LF
- (as StuffIt 1.5.1 and earlier did), but in that case a second file is not
- allowed to follow it. Last, while most hexifiers output lines of equal length,
- some do not. Hexbin will deal with that, but there are some caveats; see the
- -c option in the man page.
-
- Background:
-
- dl format:
- This was the first hexified format used. Programs to deal with it came
- from SUMacC. This format only coded resource forks, 4 bits in a byte.
- hex format:
- I think this is the first format from Yves Lempereur. Like dl format,
- it codes 4 bits in a byte, but is able to code both resource and
- data fork. Is it BinHex 2.0?
- hcx format:
- A compressing variant of hex format. Codes 6 bits in a byte.
- Is it BinHex 3.0?
- hqx format:
- Like hcx, but using a different coding (possibly to allow for ASCII->EBCDIC
- and EBCDIC->ASCII translation, which not always results in an identical
- file). Moreover this format also encodes the original Mac filename.
- mu format:
- The conversion can be done by the UUTool program from Octavian Micro
- Development. It encodes both forks and also some finder info. You will
- in general not use this with uudecode on non Mac systems, with uudecode
- only the data fork will be uudecoded. UU hexification is well known (and
- fairly old) in Unix environments. Moreover it has been ported to lots of
- other systems.
- -------------------------------------------------------------------------------
- Macsave will read a MacBinary stream from standard input and writes the
- files according to the options.
- -------------------------------------------------------------------------------
- This is an ongoing project, more stuff will appear.
-
- All comments are still welcome. Thanks for the comments I already received.
-
- dik t. winter, amsterdam, nederland
- email: dik@cwi.nl
-
- --
- Note:
- In these programs all algorithms are implemented based on publicly available
- software to prevent any claim that would prevent redistribution due to
- Copyright. Although parts of the code would indeed fall under the Copyright
- by the original author, use and redistribution of all such code is explicitly
- allowed. For some parts of it the GNU software license does apply.
- --
- Appendix.
-
- BinHex 4.0 compatible file creators:
-
- Type Creator Created by
-
- "TEXT" "BthX" BinHqx
- "TEXT" "BNHQ" BinHex
- "TEXT" "BnHq" StuffIt and StuffIt Classic
- "TEXT" "ttxt" Compactor
-
- Files recognized by macunpack:
-
- Type Creator Recognized as
-
- "APPL" "DSEA" "DiskDoubler" Self extracting
- "APPL" "EXTR" "Compactor" Self extracting
- "APPL" "Mooz" "Zoom" Self extracting
- "APPL" "Pack" "Diamond" Self extracting
- "APPL" "arc@" "ArcMac" Self extracting (not yet)
- "APPL" "aust" "StuffIt" Self extracting
- "ArCv" "TrAS" "AutoSqueeze" (not yet)
- "COMP" "STF " "ShrinkToFit"
- "DD01" "DDAP" "DiskDoubler"
- "DDAR" "DDAP" "DiskDoubler"
- "DDF." "DDAP" "DiskDoubler" (any fourth character)
- "DDf." "DDAP" "DiskDoubler" (any fourth character)
- "LARC" "LARC" "MacLHa (LHARC)"
- "LHA " "LARC" "MacLHa (LHA)"
- "PACT" "CPCT" "Compactor"
- "PIT " "PIT " "PackIt"
- "Pack" "Pack" "Diamond"
- "SIT!" "SIT!" "StuffIt"
- "SITD" "SIT!" "StuffIt Deluxe"
- "Smal" "Jdw " "Compress It"
- "TEXT" "BnHq" "BinHex 5.0"
- "TEXT" "GJBU" "MacBinary 1.0"
- "TEXT" "UMcp" "UMCP"
- "ZIVM" "LZIV" "MacCompress(M)"
- "ZIVU" "LZIV" "MacCompress(U)" (not yet)
- "mArc" "arc*" "ArcMac" (not yet)
- "zooM" "zooM" "Zoom"
-
-