- Submitted-by: jsh@usenix.org (Jeffrey S. Haemer)
-
- An Update on UNIX*-Related Standards Activities
-
- September 1990
-
- USENIX Standards Watchdog Committee
-
- Jeffrey S. Haemer <jsh@usenix.org>, Report Editor
-
- ANSI X3B11.1: WORM File Systems
-
- Andrew Hume <andrew@research.att.com> reports on the July 17-19, 1990
- meeting in Murray Hill, NJ:
-
- Introduction
-
- X3B11.1 is working on a standard for file interchange on write-once
- media (both sequential and non-sequential (random access)): a portable
- file system for WORMs. The fifth meeting was held at Murray Hill, NJ
- on July 17-19, 1990. We adopted a working paper and set to work on a
- list of issues suggested by the chair.
-
- Data Compression
-
- Despite the huge capacities of WORM disks, people always want more.
- Data compression is an easy way to supply more, and on current machine
- architectures it can probably speed data access by trading CPU cycles
- for I/O bandwidth. Its main problem is that you need to support more
- than one algorithm, and thus you need some way to specify algorithms.
- This is a purely administrative issue, but luckily, it appears that X3
- may soon act as a registry for compression algorithms (driven by the
- need to register compression algorithms for IBM 3480 cartridge tape
- work in X3B5). (How does this fit in with the rumblings about
- compress from POSIX.2? I'm not certain. I think that joining the
- registry means giving up patent rights or allowing liberal licensing,
- but maybe not. After all, the CD formats are now an ISO standard, but
- I still think you have to be licensed to make them.)
-
- Path Tables and Extended Attributes
-
- Path tables were removed from the working paper. We agreed to support
- hard and symbolic links. The next question was how to handle
- ``secret'' files: files primarily intended for system use. Examples
- might include the file describing free space, associated files (like
- the resource fork of a Macintosh file), and extended attributes (of a
- Microsoft HPFS file). We agreed that the latter two cases should be
- handled by regular files that probably are not in the directory tree
- but are pointed to by the ``inode'' for a file. (Note that this
- implies there is a way to scan all the files in a volume set without
- traversing the directory tree(s), analogous to running down the inodes
- in UNIX.)
-
- __________
-
- * UNIX is a Registered Trademark of UNIX System Laboratories in
- the United States and other countries.
-
- September 1990 Standards Update ANSI X3B11.1: WORM File Systems
-
- Given this, we have decided to support extended attributes as a
- ``secret'' or system file (and probably include pointers to things
- like resource forks among those attributes). This also gives us an
- extensible way of handling non-standard or non-essential inode fields.
- One of the important tasks remaining is to decide which fields are
- more-or-less mandatory (such as modify time, owner) and which can
- safely be pushed off into the extended attributes (access control
- lists, file valid after date). Please send us your suggestions!
-
- Space Allocation and Management
-
- We agreed that we have to support preallocating space for files,
- freeing some or all of that space and then reusing that space for
- other files. After much discussion about extent lists and bit maps,
- we compromised on a scheme based on extent lists (the details to be
- worked out by the working paper editor). The idea is that the
- free space is described by an extent list (of small but specifiable
- size) of the ``best'' (probably largest) free spaces, and if this
- overflows, ``worst'' free spaces are added to a system file
- representing all the free spaces not in the above extent list.
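-
- As a rough illustration of the scheme just described (not the working
- paper's actual layout, which is still to be worked out), the Python
- sketch below keeps a bounded list of the largest free extents and
- spills the smallest ones into an overflow list that stands in for the
- system file. The names Extent, FreeSpace, capacity, and release are
- hypothetical, and ``best'' is assumed to mean ``largest''.
-
```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Extent:
    start: int    # first logical block of the free run
    length: int   # number of free blocks in the run


@dataclass
class FreeSpace:
    capacity: int                                  # max entries in the small extent list
    best: list = field(default_factory=list)       # largest free extents, biggest first
    overflow: list = field(default_factory=list)   # stand-in for the free-space system file

    def release(self, extent: Extent) -> None:
        """Record a newly freed extent, spilling the smallest on overflow."""
        self.best.append(extent)
        self.best.sort(key=lambda e: e.length, reverse=True)
        while len(self.best) > self.capacity:
            self.overflow.append(self.best.pop())


fs = FreeSpace(capacity=2)
for ext in (Extent(0, 10), Extent(100, 50), Extent(200, 5), Extent(300, 30)):
    fs.release(ext)
print([e.length for e in fs.best])      # lengths kept in the bounded list
print([e.length for e in fs.overflow])  # lengths pushed to the overflow file
```
-
- Re-sorting on every release is only for brevity; a real
- implementation would presumably also merge adjacent extents.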
-
- Checksums
-
- It was decided that all system data structures would include a 16-bit
- checksum (CRC-16). We anticipate that most errors would be transient
- (cabling or memory) and not be media errors.
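-
- For readers who have not met CRC-16, here is a minimal Python sketch
- of sealing and checking a structure. The committee has not said which
- CRC-16 polynomial or byte order it intends; the sketch assumes the
- common CRC-16/ARC variant, and the helper names seal and verify are
- hypothetical.
-
```python
def crc16(data: bytes, crc: int = 0) -> int:
    """Bitwise CRC-16/ARC (reflected polynomial 0xA001, initial value 0)."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc


def seal(payload: bytes) -> bytes:
    """Append the 16-bit CRC of payload (the byte order is an arbitrary choice)."""
    return payload + crc16(payload).to_bytes(2, "big")


def verify(record: bytes) -> bool:
    """Recompute the CRC over the payload and compare it with the stored one."""
    payload, stored = record[:-2], int.from_bytes(record[-2:], "big")
    return crc16(payload) == stored


record = seal(b"hypothetical inode bytes")
damaged = bytes([record[0] ^ 0x01]) + record[1:]   # flip one bit in the payload
print(verify(record), verify(damaged))
```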
-
- Multi-Volume Sets
-
- I had thought the last meeting had settled just about all the
- questions about multi-volume sets; I was wrong. It took most of a day
- to agree on the following points.
-
- - You have to have the last volume in order to grok the whole
- volume set (access any/all of the directories and files).
-
- - You can extend volume sets at any time. This and the previous item
- taken together imply the existence of ``terminal'' volumes (which
- can act as master volumes of a volume set) and ``nonterminal''
- volumes (the rest). For example, if I extend a single-volume
- volume set by two volumes, then volumes 1 and 3 are terminal and
- volume 2 is not.
-
- - You can extract file data from any volume by itself. This is
- meant only for disaster recovery (I dropped the master volume
- down the stairwell) and doesn't imply any requirements on
- directory tree information (much as fsck restores unattached
- inodes to /lost+found).
-
- - Volumes can refer to data (say, extents) on other volumes (both
- earlier and later volumes). Preallocated space on any volume in
- a volume set can be returned for future reuse.
-
- - The address space of logical blocks for the volume set will be 48
- bits; 16 bits for the volume number and 32 bits for the logical
- block number within a volume. Media can be big (200GB helical
- scan media exist now) so 32 bits may seem barely big enough, but
- in such cases you can use a big logical block size. For example,
- a logical block size of 16KB implies a limit of 64 terabytes per
- volume; this should be ample for a few years.
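-
- The arithmetic behind the 48-bit address can be checked with a short
- Python sketch. Only the 16/32 split comes from the meeting; placing
- the volume number in the high-order bits is an assumption, and the
- function names are hypothetical.
-
```python
VOLUME_BITS = 16
BLOCK_BITS = 32


def pack_address(volume: int, block: int) -> int:
    """Combine a volume number and a block number into one 48-bit address."""
    if not 0 <= volume < 1 << VOLUME_BITS:
        raise ValueError("volume number does not fit in 16 bits")
    if not 0 <= block < 1 << BLOCK_BITS:
        raise ValueError("block number does not fit in 32 bits")
    return (volume << BLOCK_BITS) | block


def unpack_address(address: int) -> tuple:
    """Split a 48-bit address back into (volume, block)."""
    return address >> BLOCK_BITS, address & ((1 << BLOCK_BITS) - 1)


def volume_limit_bytes(block_size: int) -> int:
    """Largest volume addressable with 32-bit block numbers."""
    return (1 << BLOCK_BITS) * block_size


addr = pack_address(3, 12345)
print(hex(addr), unpack_address(addr))
print(volume_limit_bytes(16 * 1024) // 2**40, "terabytes per volume")
```
-
- With 16KB blocks the limit works out to 2^32 * 2^14 = 2^46 bytes,
- which is the 64 terabytes quoted above.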
-
- Defect Management
-
- We spent a lot of time on this and learned a lot, but basically put it
- off to the next meeting. What we mean by ``defect management'' is
- ``How do we deal with write errors from the file system's point of
- view?'' (We ignore the disk controller and the device driver, both of
- which do some unknown amount of more-or-less transparent error
- management.)
-
- We discussed the ``sane'' approach: insert a layer between the file
- system and the device that handles errors, allowing the file-system
- code to assume an error-free interface. This apparently good idea is
- ruled out by slip-sectoring, a (to my mind bogus) technique, which
- says, ``if writing block n fails, then try subsequent blocks (n+1,
- n+2, ...) until we succeed.'' Slip-sectoring is mainly used to enhance
- performance (it does ensure that blocks are more-or-less contiguous),
- and some disk controllers use it as their error-management technique.
- (This really screws up your logical address space; it is legitimate
- for a SCSI disk, your typical error-free, logical-address-space disk
- interface, to write logical block 5 at physical block 5, then logical
- block 1 at physical block 4 (1-3 were write errors), then disallow I/O
- to logical blocks 2, 3, and 4 because there is no place to put them -
- these blocks just vanish!)
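-
- The vanishing-block example can be reproduced with a toy Python
- model. This is a sketch only: it assumes a region of five physical
- blocks, and that the controller slips past occupied blocks as well as
- failed ones. The function slip_write and its parameters are
- hypothetical, not any controller's real behavior.
-
```python
def slip_write(logical, bad, used, last_physical):
    """Return the physical block that receives `logical`, or None.

    Starts at the physical block with the same number and slips forward
    past blocks that fail to write or are already occupied.
    """
    for physical in range(logical, last_physical + 1):
        if physical not in bad and physical not in used:
            used.add(physical)
            return physical
    return None


bad = {1, 2, 3}          # physical blocks that give write errors
used = set()             # physical blocks already holding data
placement = {n: slip_write(n, bad, used, last_physical=5)
             for n in (5, 1, 2, 3, 4)}
print(placement)
```
-
- Logical block 5 lands on physical block 5, logical block 1 slips to
- physical block 4, and logical blocks 2, 3, and 4 map to None: there
- is nowhere left to put them.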
-
- As preparation for the next meeting, Don Crouse, who deals mainly with
- high-end machines like Crays and large IBMs, is writing a position
- paper on performance, and members of the committee, many of whom are
- drive manufacturers or integrators, are collecting estimates of error
- rates we have to deal with. (This matters; I see one bad block out of
- 100,000, but some people have used drives with a bad block in every
- 100.) The problem is that WORMs have really slow seek times, and when
- you are pouring a 50MB/s Cray channel at a set of WORMs, you can't
- afford to spend 1-2 seconds seeking to the bad block area. I
- personally think we should just do regular bad-block mapping (like
- most SMD disk drivers) out of a special system file, and people with
- performance concerns should arrange to have this space spread over the
- disk.
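-
- To see why the error-rate estimates matter, here is a
- back-of-the-envelope model in Python. It is a sketch under stated
- assumptions, not a measurement: it assumes 16KB blocks, charges one
- extra seek per bad block, and ignores the seek back. Only the 50MB/s
- channel rate, the 1-2 second seek, and the two error rates come from
- the discussion above; effective_rate is a hypothetical helper.
-
```python
def effective_rate(channel_rate, block_size, bad_fraction, seek_time):
    """Average bytes/second when each bad block costs one extra seek."""
    transfer_time = block_size / channel_rate          # seconds per good block
    average_time = transfer_time + bad_fraction * seek_time
    return block_size / average_time


CHANNEL = 50e6       # 50MB/s channel, from the text
BLOCK = 16 * 1024    # assumed block size
SEEK = 1.0           # low end of the 1-2 second seek quoted above

for bad_fraction in (1 / 100_000, 1 / 100):
    rate = effective_rate(CHANNEL, BLOCK, bad_fraction, SEEK)
    print(f"1 bad block in {round(1 / bad_fraction):>7}: {rate / 1e6:5.1f} MB/s")
```
-
- Under these assumptions, one bad block in 100,000 costs only a few
- percent of the channel rate, while one in 100 cuts throughput to
- under 2MB/s.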
-
- Endian-ness
-
- A poll was taken of who really cared which way integer fields were
- stored; the results were LSB - 1, MSB - 1, Don't Care - 11. It is
- awkward to specify one of LSB and MSB; this puts half the systems out
- there at a competitive (performance) disadvantage (though I am
- skeptical of whether it's significant). Even though we're specifying
- an interchange standard, the group felt that most interchange would be
- between systems of the same endian-ness, so we should, somehow, allow
- native byte order. Accordingly, we agreed that endian-ness will be
- specified in the volume header (for the whole volume set). In
- retrospect, I think this was silly; we should have just picked one
- way. In order that everyone important be evenly disadvantaged, we
- could have used some byte order like 3-0-1-2 that no one uses.
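-
- In practice, a reader of such a volume set would pick its decoding
- from the header flag before touching any integer field. The Python
- sketch below shows the idea; the flag values "LSB" and "MSB" and the
- function read_u32 are hypothetical, since the encoding of the header
- field has not been defined.
-
```python
import struct

# Hypothetical header flag values; the real encoding is up to the standard.
FORMATS = {"LSB": "<", "MSB": ">"}


def read_u32(raw: bytes, byte_order: str) -> int:
    """Decode a 32-bit unsigned field using the volume set's byte order."""
    return struct.unpack(FORMATS[byte_order] + "I", raw)[0]


field = bytes([0x01, 0x00, 0x00, 0x00])
print(read_u32(field, "LSB"), read_u32(field, "MSB"))
```
-
- The same four bytes decode to 1 on an LSB-first volume set and to
- 16777216 on an MSB-first one, so every implementation has to carry
- both code paths.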
-
- Finale
-
- The committee is trying to nail down a firm proposal for balloting.
- We anticipate a substantial amount of change at the next meeting (Oct
- 16-18 in Nashua, NH) and have reserved time (Dec 11-13, but no place)
- for an additional meeting so that we can ballot after the following
- meeting (Jan 29-31, Bay area). We now have a working paper (available
- by the end of September or so); I think it likely we can meet this
- schedule, but who knows.
-
- Anyone interested in attending any of the above meetings should
- contact either the chairman, Ed Beshore (edb@hpgrla.hp.com), or me
- (andrew@research.att.com, research!andrew, (908)582-6262). I am also
- soliciting your comments on necessary inode fields and defect
- management. I will present anything you give me at the next meeting.
-
- Volume-Number: Volume 21, Number 116
-
-