- .\" Use -mm macros
- .\"
- .ds Rh ANSI X3B11.1: WORM File Systems
- .ds Au Andrew Hume <andrew@research.att.com>
- .ds Dt January 22\-24, 1991
- .ds Lo Murray Hill, NJ
- .ds Ed Jeffrey S. Haemer <jsh@usenix.org>
- .ds Wd U\s-3SENIX\s0 Standards Watchdog Committee
- .if '\*(Su'' \{\
- .ds Su the \*(Dt meeting in \*(Lo:
- .\}
- .if n \{\
- .tm Subject: Standards Update, \*(Rh
- .tm From: \*(Ed
- .tm Reply-To: std-unix@uunet.uu.net
- .tm Organization: \*(Wd
- .tm
- .\}
- .S 12
- .TL
- An Update on U\s-3NIX\s0\u\s-41\s0\d-Related Standards Activities
- .FS 1.
- UNIX\u\(rg\d is a Registered Trademark of UNIX System Laboratories
- in the United States and other countries.
- .FE
- .nr :p 1
- .EQ
- delim $$
- .EN
- .sp
- \*(Rh
- .AF "\*(Ed, Report Editor"
- .AU "\*(Wd"
- .MT 4
- .if n \{\
- .nh
- .na
- .\}
- .PF "'\*(DT Standards Update' '\*(Rh'"
- \*(DT
- .sp
- .P
- \fB\*(Au\fP reports on \*(Su
- .HU "Introduction"
- X\s-13B11.1\s0 is working on a standard for file interchange
- on write-once media
- (both sequential and non-sequential, i.e., random access):
- a portable file system for \s-1WORM\s0s.
- First let me apologize for laggardly snitching;
- we have had an extra meeting (in December)
- to accelerate our progress with the draft proposal
- and I have been busy writing
- a programmer's guide to the draft proposal.
- I shall describe the results of the last three meetings,
- October (Nashua, \s-1NH\s0),
- December (Murray Hill, \s-1NJ\s0),
- and January (San Jose, \s-1CA\s0),
- not in chronological order,
- but rather as a summary of where we are now.
- Although many details remain to be ironed out,
- we have broad agreement on the current proposal.
- .HU "Multi-volume file systems"
- The draft proposal supports multi-volume file systems.
- To avoid the confusion that reigned at our meetings,
- I will define what this means.
- A
- .I volume
- is a logical address space
- (on some medium).
- Thus, a typical \s-1WORM\s0 disk is two volumes,
- as each side is addressed separately.
- A
- .I "volume partition"
- is simply a contiguous subset of a volume's address space.
- A
- .I "logical volume"
- is simply a set of (volume) partitions
- upon which a file system is recorded.
- Finally,
- a
- .I "logical volume set"
- is a set of volumes with a single volume set identifier.
- (That is,
- it is simply a publishing concept.)
- Note, however,
- that when I say file system,
- I mean a set of files and directories
- described by possibly multiple directory hierarchies
- (typically each would be in a different character set).
- The (logical) block size, not the physical sector size, is $2 sup i$ bytes,
- $ 9<=i<65536$,
- and implementations would have to support at least a block size
- of 64\s-1KB\s0.
- The various size limits are generous;
- internal block addresses allow 64K volumes,
- 64K partitions per volume,
- and $2 sup 32$ blocks per partition.
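- .P
- As a rough illustration of those limits,
- here is a minimal Python sketch of an address with a 16-bit volume number,
- a 16-bit partition number and a 32-bit block number.
- The field order, the 8-byte packing and the function names are assumptions
- made for this example; the report gives only the ranges.
```python
import struct

# Hypothetical layout: the report gives only the field ranges (64K volumes,
# 64K partitions per volume, 2**32 blocks per partition). The field order
# and the 8-byte packing are assumptions made for illustration.
ADDRESS = struct.Struct("<HHI")  # "<" means least significant byte first


def pack_address(volume, partition, block):
    """Pack a (volume, partition, block) triple into 8 bytes."""
    return ADDRESS.pack(volume, partition, block)


def unpack_address(raw):
    """Recover the (volume, partition, block) triple from 8 bytes."""
    return ADDRESS.unpack(raw)


def byte_offset(block, log2_block_size):
    """Byte offset of a block within its partition for a 2**i block size."""
    if log2_block_size < 9:
        raise ValueError("block size must be at least 2**9 bytes")
    return block << log2_block_size
```
- .P
- Values outside the stated ranges are rejected by the packing step,
- which is one cheap way to keep the limits honest.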
- .HU "Volume Headers"
- The location of the volume header
- (the analog of the superblock)
- is a tricky issue
- because of the requirement that systems be able to boot off a disk
- in our format
- and there is simply no consensus
- on the size or location of the boot area.
- Accordingly,
- pointers to the volume header
- (actually a sequence of various descriptor records)
- are recorded at one or more of
- 0,
- 16,
- 64,
- 128,
- 192,
- 256,
- $N - 16$,
- $N - 4$
- (where $N$ is the size of the disk).
- The seek speed
- (or rather the lack of seek speed)
- of \s-1WORM\s0 disks
- encouraged us to put these at both ends of the disk.
- The volume header record,
- like all the other major control structures,
- has a 16-bit \s-1CRC\s0
- and a unique 8-byte tag,
- which should prevent misrecognition.
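- .P
- The search for a header pointer might be sketched as follows.
- This is only an illustration:
- the tag value, the record layout (tag, then \s-1CRC\s0, then payload),
- the choice of the \s-1CCITT\s0 polynomial
- and the helper names are all assumptions,
- and the report does not say in what units the locations are counted.
```python
import binascii
import struct

TAG = b"EXAMPLE1"  # hypothetical 8-byte tag; the real values are in the draft


def candidate_locations(n):
    """Addresses the report lists as possible homes for a header pointer."""
    fixed = [0, 16, 64, 128, 192, 256]
    from_end = [n - 16, n - 4]
    in_range = [a for a in fixed + from_end if 0 <= a < n]
    return list(dict.fromkeys(in_range))  # drop duplicates, keep the order


def make_record(payload):
    """Build a record: 8-byte tag, 16-bit CRC (LSB first), then payload."""
    crc = binascii.crc_hqx(payload, 0)  # CRC-CCITT, an assumed polynomial
    return TAG + struct.pack("<H", crc) + payload


def is_valid_record(raw):
    """Accept a record only if both the tag and the CRC match."""
    if len(raw) < 10 or raw[:8] != TAG:
        return False
    (stored,) = struct.unpack_from("<H", raw, 8)
    return stored == binascii.crc_hqx(raw[10:], 0)


def find_header_pointer(read_block, n):
    """Return the first candidate address holding a valid record, or None."""
    for address in candidate_locations(n):
        if is_valid_record(read_block(address)):
            return address
    return None
```
- .P
- Here
- .I read_block
- stands for whatever routine the host uses to fetch one address from the disk.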
- .HU "Volume/Partition Structure"
- The volume layer handles space allocation for the volume,
- definitions of partitions,
- and bad-block mapping.
- The partition layer does its own space allocation,
- supports the file system,
- and does partition-access logging.
- Partitions have file-system-type tags;
- the intent is to allow partition $w$ to be an X3B11.1 file system,
- partition $x$ to be a \s-1CDROM\s0 file system,
- partition $y$ to be an \s-1MS-DOS\s0 floppy file system
- and partition $z$ to be of unknown type.
- There should be a registry for this type field;
- vendors may want to register their file-system formats.
- .HU "Bad-Block Handling"
- A simple defect-management scheme has been adopted;
- it is similar to the bad-block remapping scheme
- used for most \s-1SMD\s0 disks.
- There was considerable resistance to such a scheme,
- particularly from the representatives of the hardware vendors,
- as the (\s-1SCSI\s0) \s-1WORM\s0 disks
- already do as much error detection/correction as is possible.
- However,
- defect management
- (above the disk driver level)
- is still necessary because
- .AL
- .LI
- error correction/detection in the drive can be,
- and for performance reasons often is,
- turned off,
- .LI
- errors can easily occur between the disk
- and the host's main memory
- (have you ever heard of \s-1DMA\s0 or bus errors?),
- and
- .LI
- even though \s-1SCSI\s0 disks present an ``error free'' interface,
- most drives have a limited number of errors they can cope with,
- and many early drives did little or no error correction.
- .LE
- .HU "FCB Format"
- As you may recall,
- multiple versions of the
- .I "direct entry"
- (the equivalent of the inode)
- are stored in a data structure
- called the file control block (\s-1FCB\s0).
- The original proposal involved various levels of indirect blocks
- exactly like classic Unix file systems.
- We adopted my proposal
- (adapted from an observation by Dennis Ritchie)
- for a simpler, more general format
- that allows arbitrary structures,
- which can be specialized for different applications.
- .HU "Partition Access Records"
- Partition access records are more like a log of changes to the file system
- than a security mechanism such as access control lists.
- The idea is to have periods of writing to the partition
- bracketed by specific control records
- so that it will be possible to tell
- if a system closed out that partition gracefully.
- (More bluntly, did we unmount the partition gracefully
- or did the system crash in the middle of a session?)
- These records are kept on a per-file-system basis
- and are recorded as variants of direct entries
- in a structure identical to \s-1FCB\s0s.
- Another side issue is support for a so-called ``stable'' record,
- which is analogous to the proposed stable sync feature of \s-1BSD\s0 Unix.
- (The control structures such as inodes and indirect blocks
- are written to disk
- but the user's data may not be, yet.)
- This peculiar state avoids the need to run
- .I fsck
- (or its equivalent)
- on the disk
- but you still have to get the user's data from somewhere.
- [Ed: does anyone really need this ``stable'' state?]
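- .P
- The bracketing of write sessions can be pictured with a small Python sketch.
- The record kinds and the function name are hypothetical;
- the draft's actual records are variants of direct entries,
- as described above.
```python
OPEN, CLOSE = "open", "close"  # hypothetical record kinds


def closed_gracefully(records):
    """True if every write session that was opened was also closed.

    `records` is the sequence of record kinds in the order recorded.
    A session left open at the end means the writer never closed out
    the partition, for example because the system crashed.
    """
    in_session = False
    for kind in records:
        if kind == OPEN:
            if in_session:
                return False  # a new session began before the last ended
            in_session = True
        elif kind == CLOSE:
            if not in_session:
                return False  # a close with no matching open is malformed
            in_session = False
    return not in_session
```
- .P
- A reader mounting the partition would run such a check
- before trusting the most recently written structures.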
- .HU "Recording Directories"
- For performance reasons,
- it is proposed that directories,
- or rather the records (\s-1FID\s0s)
- identifying the files (and subdirectories) in that directory,
- be kept in optionally sorted order.
- This would be in binary and not lexicographic order
- (thus evading nettlesome character-set-collating-order issues).
- It is not trivial to support this
- but is probably worth it.
- Related to this is the issue of system areas in directories and \s-1FID\s0s.
- It is expected that these areas will contain accelerator structures,
- such as B-tree indices and so on.
- Here, and elsewhere in the standard,
- the governing principle is to allow systems to use such structures
- but to neither mandate nor standardize their use.
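- .P
- To show what binary ordering buys,
- here is a minimal Python sketch that sorts names by their encoded bytes
- and then finds one by binary search.
- The use of Latin-1 as the encoding and the function names
- are assumptions for the example, not part of the draft.
```python
import bisect


def binary_sorted(names, encoding="latin-1"):
    """Sort names by their encoded bytes, ignoring any collation rules."""
    return sorted(name.encode(encoding) for name in names)


def lookup(sorted_names, name, encoding="latin-1"):
    """Binary search; return the index of name, or -1 if it is absent."""
    key = name.encode(encoding)
    i = bisect.bisect_left(sorted_names, key)
    if i < len(sorted_names) and sorted_names[i] == key:
        return i
    return -1
```
- .P
- Note that upper-case names sort before lower-case ones
- and accented letters sort last;
- that is the price of avoiding collating-order issues.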
- .HU "Anonymous Files"
- There are numerous \s-1FCB\s0s,
- or file-like objects,
- that have no \s-1FID\s0.
- An example might be a Macintosh resource fork.
- The question is whether to make these visible to the user.
- This is a serious issue,
- and one not confined to this standard.
- It is an issue for the system supporting access to the file system on the disk.
- Do we rely on this system to do the right thing
- or should we mandate a mechanism?
- For example,
- consider a Macintosh file (with its resource fork)
- on a system (say Unix) that doesn't have that concept.
- We can either trust that the vendor supplying your Unix
- has implemented an
- .I fcntl
- (or
- .I ioctl )
- to access the resource fork,
- or we can evade the issue completely
- by mandating that the resource fork be available for normal access
- by a reserved name such as
- \f(CRfoo.\s-1RFORK\s0\fP.
- The general feeling is that users will not allow a standard
- to reserve parts of the file name space for its own use.
- Thus,
- it seems likely that access would have to be via standardized
- .I fcntl
- calls,
- but these are outside the scope of our standard.
- .HU "Byte Order"
- I have pressed the issue of the byte order for numeric fields.
- The previous notion was to allow the recording system to choose the byte order.
- The issue is not technical
- (everyone seems happy to pick just one and stick with it)
- but political.
- We picked \s-1LSB\s0 order:
- the order used by the low-end (and slowest) systems.
- We measured the performance degradation for low-end \s-1MSB\s0 systems
- (the slowest Macintosh we could find),
- using straightforward C code.
- Converting the byte order for the worst case
- (a block of integer block numbers)
- cost about 10ms of \s-1CPU\s0 time
- \(em comparable to doing a single disk \s-1I/O\s0
- and one or two orders of magnitude less
- than the cost of doing a disk seek.
- (Careful assembly code would be much faster than this.)
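- .P
- For readers who want to see the conversion itself,
- here is a small Python sketch of decoding a block of block numbers
- recorded least significant byte first.
- The 32-bit width and the function names are assumptions for the example;
- the second function spells out the shifts
- that straightforward C code would perform on any host.
```python
import struct


def decode_block_numbers(raw):
    """Decode a block of 32-bit block numbers recorded LSB first."""
    if len(raw) % 4:
        raise ValueError("length must be a multiple of 4 bytes")
    return list(struct.unpack("<%dI" % (len(raw) // 4), raw))


def decode_one(raw, offset=0):
    """The same decoding spelled out byte by byte, host order irrelevant."""
    b0, b1, b2, b3 = raw[offset:offset + 4]
    return b0 | (b1 << 8) | (b2 << 16) | (b3 << 24)
```
- .P
- Because the decoding is written in terms of byte values and shifts,
- the same code gives the same answer on \s-1LSB\s0 and \s-1MSB\s0 hosts.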
- .HU "Extended Attributes"
- The direct entry for a file has many attributes or fields.
- Some of these will be stored directly in the direct entry
- and so be faster to access.
- The rest will be stored in an extended attribute record area
- much like resources in a Macintosh resource fork.
- There are two issues:
- which attributes get faster access
- and
- how do you access the other attributes?
- The former is something the standard specifies;
- our guiding principle was to include the fields needed for a Unix
- .I stat
- or an \s-1MS-DOS\s0 (or \s-1VMS\s0)
- .I dir
- command.
- Unfortunately,
- the issue of access is beyond the domain of our
- standard
- and needs to be addressed by \s-1POSIX\s0,
- probably best by 1003.8.
- Internally within our standard,
- the extended attributes are identified by a 32-bit number,
- some of which are set in the standard
- and the rest by a registry maintained by some authority
- (like \s-1ANSI\s0).
- The current list of extended attributes is given below;
- treat it as very preliminary and subject to change.
- .nf
- .DS
- .TS
- center;
- a a.
- information creation	file abstract
- information modification	file type
- information expiration	associated file
- information effective	data compression
- file creation	protection
- file access	application-specific data segment
- file modification	implementation segment
- file backup	escape sequences segment
- file expiration	action history
- file attribute	icon
- file effective	environment type
- .TE
- .DE
- .fi
- .HU "Character Sets"
- We have adopted a somewhat simpler way of dealing with character sets
- than the \s-1CD-ROM\s0 standard (\s-1ISO\s0 9660).
- The current schemes available are
- .nf
- .DS
- .TS
- box, center;
- n | a.
- 0	\f(CR0-9A-Z_.\fP from Latin-1 (ISO 8859-1),
- 1	portable filename character set \f(CR0-9A-Za-z_.-\fP (POSIX 1003.1),
- 2	$G sub 0$ set from Latin-1,
- 3	all graphic characters from Latin-1, and
- 255	T{
- defined via escape sequences \(em the full scale mechanisms
- \ of ISO 2022, which are only rarely implemented.
- T}
- .TE
- .DE
- .fi
- .HU "International Activity"
- The appropriate \s-1ISO\s0 committee (\s-1SC15\s0)
- has been reconstituted with Japan supplying secretariat duties.
- A meeting is expected in July or September
- and it is hoped that there will be close cooperation between X3B11.1
- and \s-1SC15\s0.
- There is some concern that \s-1ANSI\s0 might awaken the long-dormant
- file structure committee
- and that this might delay acceptance of X3B11.1's work.
- Also,
- because of a request by a working group
- involved in the Philips \s-1CD-WO\s0 device
- (a combination medium
- that is a 5.25in \s-1WORM\s0 with a \s-1CD-ROM\s0 portion),
- \s-1ECMA\s0 might also reconstitute
- its file structure committee (\s-1TC15\s0).
- .HU "Finale"
- What can, or should, you do?
- As always, I welcome any feedback,
- specific or general, on the work our committee does.
- (I must express my appreciation to USENIX for publishing these reports;
- nearly all the mail I have received about X3B11.1's work
- starts off like,
- ``I read your report in the so-and-so
- .I login; ''.)
- In particular,
- I invite comments on any fields or attributes you would like standardized and
- \(em perhaps more important to the Unix community \(em
- how to access auxiliary information about a file in
- .I "a standard way" .
- Plenty of ad hoc solutions already exist
- for the cases of versioned files
- (\s-1VMS\s0 file systems on Ultrix systems),
- Macintosh files mounted as \s-1NFS\s0 file systems,
- and \s-1CD-ROM\s0 file systems.
- The number of these problems will certainly increase over time;
- we need to address the solutions now
- before we standardize on file system interfaces (such as 1003.8)
- that omit such mechanisms.
- .P
- If you would like more details on X3B11.1's work,
- you should contact either me
- (\f(CRandrew@research.att.com\fP,
- (908)\ 582-6262)
- or the committee chair,
- Ed Beshore (\f(CRedb@hpgrla.hp.com\fP).
- I think the two most useful documents
- are the current draft of the working paper
- (about 80 pages)
- and a programmer's guide to the draft
- (about 12 pages written by me).
- I will send you copies of the latter document;
- requests for other documents or more general inquiries about X3B11.1's work
- would be best sent to Ed Beshore.
- .P
- The next meeting is in North Falmouth, \s-1MA\s0 on April 23\-26, 1991.
- Anyone interested in attending should contact either me or Ed Beshore.
-
-