- From: guy@sun.com (Guy Harris)
-
- > As the moderator of this newsgroup, I solicit comments about what should
- > be done with section 10.
-
- One thing that should not be done, under any circumstances, is to
- replace "tar" with "cpio" - *especially* if it includes the old
- non-"-c" form. The non-portable form is completely useless for
- moving data between systems with different byte orders unless you
- have a clever "cpio" that figures out that the byte order is
- backwards and undoes the damage.
-
- I discovered this when trying to read a "cpio" tape made on a VAX in
- the old format; no combination of "cpio" byte-swapping options and
- "dd conv=swab" would help. I finally ended up fixing our "cpio" to
- do the aforementioned look-at-the-header-and-undo-the-damage stuff.
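-
- A rough sketch of what such a look-at-the-header check could be, in
- Python purely for illustration. This is not the actual change made
- to "cpio"; it assumes the check keys off the old-format magic number
- 070707 held in the first 16-bit field, that the header is 13 such
- fields, and the function names are made up.
-
```python
import struct

CPIO_MAGIC = 0o070707   # magic number that opens an old-format header
HEADER_SHORTS = 13      # assumed layout: 13 16-bit fields, 26 bytes


def header_byte_order(header: bytes) -> str:
    """Return the struct prefix ('<' or '>') under which the magic matches."""
    if len(header) < 2 * HEADER_SHORTS:
        raise ValueError("too short to be an old-format cpio header")
    for prefix in ("<", ">"):
        (magic,) = struct.unpack(prefix + "H", header[:2])
        if magic == CPIO_MAGIC:
            return prefix
    raise ValueError("magic number not found in either byte order")


def decode_header(header: bytes) -> tuple:
    """Decode all 16-bit fields in whichever byte order the writer used."""
    prefix = header_byte_order(header)
    fmt = prefix + "%dH" % HEADER_SHORTS
    return struct.unpack(fmt, header[:2 * HEADER_SHORTS])
```
-
- Only the binary header fields need this treatment; file names and
- file contents are byte streams and can be left alone.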
-
- The X/OPEN standard uses "cpio". The rationale given exhibits a
- distressing degree of incompetence:
-
- If an exchange medium is to be read on a target machine that
- is architecturally different from the source machine,
- problems may arise concerning the ordering of bytes within a
- word and words within a long word (see the portability guides
- in Part III). These can easily be handled when using "cpio"
- as an exchange utility, while with "tar" it may be a little
- more difficult.
-
- Now, I will first note here that the *only* time I had a problem
- moving "tar" tapes between machines was when I had to move things to
- a Plexus. The problem was *not* that the machines had different byte
- orders; the problem was that the Plexus had a typical brain-damaged
- Multibus tape controller that swapped bytes when it transferred data
- to and from memory.
-
- "cpio" would not have made this any easier; the System III
- byte-swapping option did not swap the bytes on *all* blocks read, but
- just swapped the bytes on data blocks and in file names. The intent
- here was clearly that you would read a tape written on a machine with
- a different byte order by doing something like
-
-     dd if=/dev/rmt0 conv=swab | cpio -ids
-
- "dd" would swap everything; "cpio -s" would un-swap everything but
- the binary data in the header. (We pause to note that merely
- swapping the binary data in the header would be much more efficient,
- especially given that "dd" is somewhat of a pig.) This works, but is
- less than wonderful. (And it doesn't solve the problem with the
- Plexus; to solve that you just stick the "dd" in front of "cpio" and
- don't bother with "-s" at all.)
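-
- The double swap can be seen in miniature with a short Python sketch.
- The two-byte "header" and the file name are made-up stand-ins for a
- real tape record, and "swab" here handles only even-length input.
-
```python
import struct


def swab(data: bytes) -> bytes:
    """Swap each adjacent pair of bytes, as "dd conv=swab" does."""
    if len(data) % 2:
        raise ValueError("this sketch only handles even-length input")
    out = bytearray(len(data))
    out[0::2] = data[1::2]
    out[1::2] = data[0::2]
    return bytes(out)


# One 16-bit header field as a little-endian machine would write it,
# followed by a file name padded to an even length.
header = struct.pack("<H", 0o070707)
name = b"README\0\0"
tape = header + name

swapped = swab(tape)                          # output of the "dd" stage
(magic,) = struct.unpack(">H", swapped[:2])   # now right for a big-endian reader
scrambled_name = swapped[2:]                  # the name is now wrong
restored_name = swab(scrambled_name)          # the second swap puts it back
```
-
- The header field comes out right after one swap, while the name
- needs the second swap to be readable again.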
-
- The System V "cpio" byte-swapping and word-swapping options work
- *only* on data blocks; they have no effect whatsoever on binary data
- in the header or on file names. This means that the trick that
- worked with the System III "cpio" wouldn't work at all - and the
- problem with the Plexus still isn't fixed, if that was the intent.
- The S5 options are useless for old-style non-"-c" tapes. They are of
- some use with "-c" tapes - but only if all the files on the tape
- consist solely of "short"s or "long"s, since the data in the data
- blocks are all byte-swapped or word-swapped in the same fashion.
- Most files I tend to put on or extract from "cpio" tapes are text
- files, which obviously need no swapping.
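-
- To see why uniform swapping only helps when a file is all "short"s
- or all "long"s, here is a small Python sketch. The record layout (a
- 4-character tag followed by a 32-bit count) is invented for the
- example, and "reverse_longs" stands for the combined effect of
- swapping bytes within halfwords and halfwords within words.
-
```python
import struct


def reverse_longs(data: bytes) -> bytes:
    """Reverse the byte order inside every 4-byte word of the block."""
    if len(data) % 4:
        raise ValueError("this sketch only handles whole 4-byte words")
    return b"".join(data[i:i + 4][::-1] for i in range(0, len(data), 4))


# A made-up 8-byte record as a big-endian machine would write it:
# a 4-character text tag followed by a 32-bit count.
record = b"name" + struct.pack(">l", 1000)

converted = reverse_longs(record)
(count,) = struct.unpack("<l", converted[4:])   # the count survives the trip
tag = converted[:4]                             # the text does not
```
-
- The count reads correctly on the little-endian side, but the tag
- comes out reversed, because the same swap was applied to both.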
-
- In short, the arguments offered by X/OPEN in favor of "cpio" are
- completely bogus.
-
- Now for the arguments against "cpio" format:
-
- 1) It is somewhat more UNIX-specific, in that the "mode"
- field of the "stat" structure is written out numerically.
- POSIX does not specify required numeric values for this
- field. "tar" indicates the file type with a standard
- symbolic code, so you can read "tar" tapes even if the
- machine on which the tape was written and the machine on
- which it is being read do not have the same values for
- this field.
-
- 2) It does not handle hard links particularly elegantly.
- "cpio" knows nothing of files with multiple hard links
- when it writes a tape; if it is told to write "foo" and
- "bar" to the tape, and they are both hard links to the
- same file, it writes two copies of this file to the tape.
- The hard links are established when the tape is read.
- If the files appear on the tape in the order "foo" and
- then "bar", "foo" will be read in first. Once "bar" has
- been read in, "cpio" will check to see if it has already
- read in a file with the same dev/inumber value. If so, it
- will delete "bar" and make a hard link to "foo" called
- "bar".
-
- 3) It is less common. Almost all UNIX systems that support
- "cpio" also support "tar"; many UNIX systems that support
- "tar" do not support "cpio".
-
- 4) POSIX has already chosen "tar" format; why should it
- change horses in midstream, especially given that the new
- horse is lame and, despite the claims made by the person
- selling the horse, is not capable of pulling any heavier
- loads than the existing one?
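-
- Point 1 can be illustrated with a short Python sketch. The type
- codes below are the ones later fixed in the "ustar" header, which is
- an assumption as far as the draft under discussion goes; the function
- names are made up.
-
```python
import stat

# Type codes as later fixed by the POSIX "ustar" header (an assumption
# here; the draft under discussion may differ in detail).
TAR_TYPES = {
    "0": "regular file",
    "\0": "regular file",
    "1": "hard link",
    "2": "symbolic link",
    "3": "character special",
    "4": "block special",
    "5": "directory",
    "6": "FIFO",
}


def tar_file_type(typeflag: str) -> str:
    """Look the type up by its code; no host-specific numbers are involved."""
    return TAR_TYPES.get(typeflag, "unknown")


def cpio_file_type(mode: int) -> str:
    """Decode a numeric mode using the reading host's S_IF* values."""
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISCHR(mode):
        return "character special"
    if stat.S_ISBLK(mode):
        return "block special"
    if stat.S_ISFIFO(mode):
        return "FIFO"
    return "unknown"
```
-
- The "tar" lookup gives the same answer on any reader. The "cpio"
- decode is only right if the writer and the reader happen to agree on
- the numeric values.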
-
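- The read-time linking described in point 2 amounts to something like
- the following Python sketch. It is a simplification, not the "cpio"
- source: the function name, the "seen" table and the dev/inumber
- values are invented for the example.
-
```python
import os
import tempfile


def extract_member(path, dev, ino, data, seen):
    """Write one archive member, then relink it if (dev, ino) was seen before."""
    with open(path, "wb") as f:
        f.write(data)                  # the full copy is always read in first
    first = seen.get((dev, ino))
    if first is None:
        seen[(dev, ino)] = path        # remember the first name for this file
    else:
        os.unlink(path)                # throw away the copy just written
        os.link(first, path)           # and link the name to the first one


with tempfile.TemporaryDirectory() as d:
    seen = {}
    foo = os.path.join(d, "foo")
    bar = os.path.join(d, "bar")
    # Two members with the same made-up dev/inumber pair, as on the tape.
    extract_member(foo, 1, 42, b"hello\n", seen)
    extract_member(bar, 1, 42, b"hello\n", seen)
    same_file = os.path.samefile(foo, bar)
    link_count = os.stat(foo).st_nlink
```
-
- Both names end up on one file, but only after the second copy has
- been written out and then deleted.
-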
- Anyway, I'll have to dig up the proposal made to POSIX that "cpio"
- supplement or replace "tar" and cast a very strong "no" vote citing
- the above.
-
- Now, as for the proposal for handing the whole thing off to P1003.2 -
- I have some inclination to support this. It could, in some ways, be
- considered neither part of the scope of P1003.1 nor of P1003.2, but
- to be a separate standardization topic entirely. However, if I had
- to choose which of the two items - C-language bindings to OS
- system calls and library functions, or command-language functions -
- the data interchange standard belonged to, I'd vote in favor of the
- latter. There is no library of functions for reading or writing
- "tar" tapes, but there is a command (namely, "tar") for reading and
- writing them, so I think it belongs in that category - especially
- given that Section 10 currently says "A conforming system shall
- implement a user utility..." which really sounds a lot more like
- a P1003.2 requirement than a P1003.1 requirement.
-
- Volume-Number: Volume 11, Number 9
-
-