The summary is somewhat technical but, more importantly, it is factual: I wrote it after reading the original CD standards documents that Sony and Philips make available to CD licensees. If you are interested in the standards documents, you need to contact them directly -- sorry, I don't have a specific contact or phone number.
I do work for Apple but this summary contains a minimum of Apple references. I hope everyone agrees that the result is in keeping with net policy on the matter.
Andy Poggio
To meet goals (2) to (4), it is not possible to encode arbitrary binary data. For example, the integer 0 expressed as thirty-two bits of zero would have too long a run length to satisfy goal (3). To accommodate these goals, each eight-bit byte of actual data is encoded as fourteen bits of channel data. There are many more combinations of fourteen bits (16,384) than there are of eight bits (256). To encode the eight-bit combinations, 256 combinations of fourteen bits are chosen that meet the goals. This encoding is referred to as Eight-to-Fourteen Modulation (EFM) coding.
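The counting argument can be made concrete with a small sketch. It assumes the commonly cited EFM run-length rule (at least two and at most ten zeros between successive ones), which this summary does not spell out; the function name is hypothetical.

```python
def satisfies_run_length(word: int, width: int = 14) -> bool:
    """Check one candidate channel word against the assumed run-length rule."""
    bits = format(word, f"0{width}b")
    # Assumed rule, part 1: ones must be separated by at least two zeros,
    # so neither "11" nor "101" may appear.
    if "11" in bits or "101" in bits:
        return False
    # Assumed rule, part 2: no run of zeros may be longer than ten.
    return "0" * 11 not in bits


# Try all 16,384 fourteen-bit patterns and keep the ones that pass.
candidates = [w for w in range(2 ** 14) if satisfies_run_length(w)]
print(len(candidates))  # number of patterns passing the assumed rule
```

Under that assumed rule, 267 of the 16,384 patterns qualify, which leaves enough to choose the 256 needed for the eight-bit combinations.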
If fourteen channel bits were concatenated directly with another set of fourteen channel bits, the above goals might once again not be met. To avoid this possibility, three merging bits are included between each set of fourteen channel bits. These merging bits carry no information but are chosen to limit run length, keep data signal DC content low, etc. Thus, an eight-bit byte of actual data is encoded into a total of seventeen channel bits: fourteen EFM bits and three merging bits.
To achieve a reliable self-clocking system, periodic synchronization is necessary. Thus, data is broken up into individual frames, each beginning with a synchronization pattern. Each frame also contains twenty-four data bytes, eight error correction bytes, a control and display byte (carrying the subcoding channels), and merging bits separating them all. Each frame is arranged as follows:
    Sync Pattern                  24 + 3 channel bits
    Control and Display byte      14 + 3
    Data bytes                    12 * (14 + 3)
    Error Correction bytes         4 * (14 + 3)
    Data bytes                    12 * (14 + 3)
    Error Correction bytes         4 * (14 + 3)
    TOTAL                         588 channel bits

Thus, 192 actual data bits (24 bytes) are encoded as 588 channel bits.
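As a quick check on the frame layout above, the arithmetic can be written out. This is only a sketch; the variable names are made up for illustration.

```python
EFM_BITS = 14      # channel bits per encoded byte
MERGING_BITS = 3   # merging bits that follow each group
PER_BYTE = EFM_BITS + MERGING_BITS

# (field, channel bits) in the order the fields appear in a frame
frame_layout = [
    ("Sync Pattern", 24 + MERGING_BITS),
    ("Control and Display byte", 1 * PER_BYTE),
    ("Data bytes", 12 * PER_BYTE),
    ("Error Correction bytes", 4 * PER_BYTE),
    ("Data bytes", 12 * PER_BYTE),
    ("Error Correction bytes", 4 * PER_BYTE),
]

total_channel_bits = sum(bits for _, bits in frame_layout)
data_bits = 24 * 8  # twenty-four data bytes of eight bits each

print(total_channel_bits, data_bits)  # 588 192
```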
Editorial: A CD physically has a single spiral track about 3 miles long. CDs spin at about 500 RPM when reading near the center down to about 250 RPM when reading near the circumference.
Disc with a 'c' or disk with a 'k'? A usage has emerged for these terms: disk is used for erasable disks (e.g. magnetic disks) while disc is used for read-only (e.g. CD-ROM discs). One would presumably call a frisbee a disc.
As each frame is read from the disc, each group of fourteen channel bits is first decoded into an eight-bit data byte (the three merging bits are ignored). Then the bytes from each frame (twenty-four data bytes and eight error correction bytes) are passed to the first Reed-Solomon decoder, which uses four of the error correction bytes and is able to correct one byte in error out of the 32. If there are no uncorrectable errors, the data is simply passed along. If there are uncorrectable errors, the data is marked as being in error at this stage of decoding.
The twenty-four data bytes and four remaining error correction bytes are then passed through unequal delays before going through another Reed-Solomon decoder. These unequal delays result in an interleaving of the data that spreads long error bursts among many different passes through the second decoder. The delays are such that error bursts up to 450 bytes long can be completely corrected. The second Reed-Solomon decoder uses the last four error correction bytes to correct any remaining errors in the twenty-four data bytes. At this point, the data goes through a de-interleaving process to restore the correct byte order.
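To illustrate why unequal delays spread a burst, here is a simplified model, not the actual CIRC circuit. It assumes that byte i of a 28-byte codeword is recorded i * delay_step frames later than byte 0, and that a burst destroys every byte of some consecutive frames. The delay step of 4 frames and the function name are assumptions for illustration.

```python
def max_bytes_hit_per_codeword(n_bytes, delay_step, burst_start, burst_len):
    """Return the worst-case number of damaged bytes landing in one codeword.

    Assumed model: byte i of codeword c is carried in recorded frame
    c + i * delay_step, and the burst wipes out burst_len consecutive
    recorded frames starting at burst_start.
    """
    burst = range(burst_start, burst_start + burst_len)
    first_codeword = burst_start - (n_bytes - 1) * delay_step
    worst = 0
    for codeword in range(first_codeword, burst_start + burst_len):
        hits = sum(1 for i in range(n_bytes)
                   if codeword + i * delay_step in burst)
        worst = max(worst, hits)
    return worst


short_burst = max_bytes_hit_per_codeword(28, 4, 100, 4)
long_burst = max_bytes_hit_per_codeword(28, 4, 100, 16)
print(short_burst, long_burst)
```

In this model a burst covering four whole frames touches any single codeword at most once, and a sixteen-frame burst at most four times, so the damage seen by each pass through the second decoder stays small.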
Required CD digital audio data rate: 44.1 K samples per second * 16 bits per sample * 2 channels = 1,411,200 bits/sec.
CD data rate: 8 bits per byte * 24 bytes per frame * 98 frames per subcoding block * 75 subcoding blocks per second = 1,411,200 bits/sec.
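The two rates above can be checked directly. The channel bit rate in the last line is derived here from the 588 channel bits per frame given earlier, and the variable names are illustrative only.

```python
# Rate the audio format requires
audio_rate = 44_100 * 16 * 2           # samples/sec * bits/sample * channels

# Rate the disc format delivers
frames_per_second = 98 * 75            # frames per block * blocks per second
disc_rate = 8 * 24 * frames_per_second

# Raw channel bits read from the disc each second (588 per frame)
channel_bit_rate = 588 * frames_per_second

print(audio_rate, disc_rate, frames_per_second, channel_bit_rate)
```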
The eight subcoding channels are labeled P through W and are encoded one bit for each channel in a control and display byte. Channel P is used as a simple music track separator. Channel Q is used for control purposes and encodes information like track number, track type, and location (minute, second, and frame number). During the lead-in track of the disc, channel Q encodes a table of contents for the disc giving track number and starting location. Standards have been proposed that would use the remaining channels for line graphics and ASCII character strings, but these are seldom used.
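A rough sketch of how one control and display byte splits into its eight channel bits follows. The bit order (P as the most significant bit, W as the least) is an assumption not stated in this summary, and the function names are hypothetical.

```python
CHANNELS = "PQRSTUVW"


def split_subcode_byte(byte: int) -> dict:
    """Map one control and display byte to its eight channel bits.

    Assumption: P is the most significant bit and W the least significant.
    """
    return {name: (byte >> (7 - i)) & 1 for i, name in enumerate(CHANNELS)}


def collect_channel(control_bytes, name):
    """Gather the bits of one channel across a run of frames."""
    return [split_subcode_byte(b)[name] for b in control_bytes]


bits = split_subcode_byte(0b1100_0000)
q_bits = collect_channel([0b0100_0000, 0b0000_0000, 0b0100_0000], "Q")
print(bits, q_bits)
```

Since each frame contributes one bit per channel, a channel such as Q is rebuilt by collecting its bit from the control and display bytes of successive frames.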
The other type of track specified by the subchannel Q control bit field is the data track. These must conform to the CD-ROM standard described below. In general, a disc can have a mix of CD digital audio tracks and a CD-ROM track, but the CD-ROM track must come first.
Editorial: This first level error correction (the only type used for CD Audio data) is extremely powerful. The CD specification allows for discs to have up to 220 raw errors per second. Every one of these errors is (almost always) perfectly corrected by the CIRC scheme for a net error rate of zero. For example, our tests using Apple's CD-ROM drive (which also plays audio) show that raw error rates are around 50-100 per second these days. Of course, these are perfectly corrected, meaning that the original data is perfectly recovered. We have tested flawed discs with raw rates up to 300 per second. Net errors on all of these discs? Zero! I would expect a typical audio CD player to perform similarly. Thus I expect this raw error rate to have no audible consequences.
So why did I say "almost always" corrected above? Because a sufficiently bad flaw may produce uncorrectable errors. These very unusual errors are "concealed" by the player rather than corrected. Note that this concealment is likely to be less noticeable than even a single scratch on an LP. Such a flaw might be a really opaque finger smudge; CDs do merit careful handling. On the two (and only two) occasions I have found these, I simply sprayed on a little Windex glass cleaner and wiped it off using radial strokes. This restored the CDs to zero net errors. One can argue about the quality of the process of conversion of analog music to and from digital representation, but in the digital domain CDs are really very, very good.
Mode 0 -- all data bytes are zero.

Mode 1 -- (CD-ROM Data):
    Sync Field             -   12 bytes
    Header Field           -    4
    User Data Field        - 2048
    Error Detection Code   -    4
    Reserved               -    8
    Error Correction       -  276

Mode 2 -- (CD Audio or Other Data):
    Sync Field             -   12 bytes
    Header Field           -    4
    User Data Field        - 2048
    Auxiliary Data Field   -  288

Thus, mode 1 defines separately addressable, physical 2K byte data blocks making CD-ROM look at this level very similar to other digital mass storage devices.
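The field sizes listed for modes 1 and 2 can be checked against the frame structure with a short sketch. The field keys and function names are made up for illustration; the sizes and the 75 blocks per second come from the text.

```python
MODE1_FIELDS = [("sync", 12), ("header", 4), ("user_data", 2048),
                ("edc", 4), ("reserved", 8), ("ecc", 276)]
MODE2_FIELDS = [("sync", 12), ("header", 4), ("user_data", 2048),
                ("auxiliary", 288)]


def block_size(fields):
    """Total bytes in one block with the given field layout."""
    return sum(size for _, size in fields)


def split_block(block: bytes, fields):
    """Slice a raw block into its named fields, in order."""
    if len(block) != block_size(fields):
        raise ValueError("block length does not match the layout")
    parts, offset = {}, 0
    for name, size in fields:
        parts[name] = block[offset:offset + size]
        offset += size
    return parts


bytes_per_subcoding_block = 24 * 98   # data bytes per frame * frames per block
net_rate = 2048 * 75                  # mode 1 user bytes per second

parts = split_block(bytes(2352), MODE1_FIELDS)
print(block_size(MODE1_FIELDS), block_size(MODE2_FIELDS),
      bytes_per_subcoding_block, net_rate)
```

Both layouts add up to 2,352 bytes, the same as the 24 * 98 data bytes carried by one subcoding block, and 2,048 user bytes at 75 blocks per second gives 153,600 bytes per second, i.e. 150 Kbytes/sec.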
The error detection code is a cyclic redundancy check (CRC) on the sync, header, and user data. It occupies the first four bytes of the auxiliary data field and provides a very high probability that uncorrected errors will be detected. The error correction code is essentially the same as the first level error correction in that interleaving and Reed-Solomon coding are used. It occupies the final 276 bytes of the auxiliary data field.
Editorial: This extra level of error correction for CD-ROM blocks is one of the many reasons that CD-ROM drives are much more expensive than consumer audio players. To perform this error correction quickly requires substantial extra computing power (sometimes a dedicated microprocessor) in the drive.
This is also one reason that consumer players like the Magnavoxes which claim to be CD-ROM compatible (with their digital output jack on the back) are useless for that purpose. They have no way of dealing with the CD-ROM error correction. They also have no way for a computer to tell them where to seek.
Another reason that CD-ROM drives are more expensive is that they are built to be a computer peripheral rather than a consumer device, i.e. like a combination race car/truck rather than a family sedan. One story, probably apocryphal but not far from the truth, has it that a major Japanese manufacturer tested some consumer audio players to simulate computer use: they made them seek (move the optical head) from the inside of the CD to the outside and back again. These are called maximum seeks. The story says they managed to do this for about 24 hours before they broke down. A CD-ROM drive needs to be several orders of magnitude more robust. Fast and strong don't come cheap.
A typical disadvantage in hierarchical systems is that to read a file (which must be a leaf of the hierarchy tree) given its full path name, it is necessary to begin at the root directory and search through each of its ancestral directories until the entry for the file is found. For example, given the path name "Wine Regions:America:California:Mendocino", three directories (the first three components of the path name) would need to be searched. Typically, a separate seek would be required for each directory. This would result in relatively poor performance.
To avoid this, High Sierra specifies that each volume contain a path table in addition to its directories and files. The path table describes the directory hierarchy in a compact form that may be cached in computer memory for optimum performance. The path table contains entries for the volume's directories in a breadth-first order; directories with a common parent are listed in lexicographic order. Each entry contains only the location of the directory it describes, its name, and the location in the path table of its parent. This mechanism allows any directory to be accessed with only a single CD seek.
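The idea can be sketched with a toy path table held in memory. The entries, block locations, and names below are invented for illustration, and the layout is simplified: it is not the on-disc record format.

```python
from dataclasses import dataclass


@dataclass
class PathEntry:
    name: str
    location: int   # hypothetical block address of the directory
    parent: int     # index of the parent's entry in the path table


# Breadth-first order; siblings sorted by name. The root lists itself as parent.
PATH_TABLE = [
    PathEntry("Wine Regions", 20, 0),
    PathEntry("America", 30, 0),
    PathEntry("Europe", 40, 0),
    PathEntry("California", 50, 1),
    PathEntry("New York", 60, 1),
    PathEntry("France", 70, 2),
]


def find_directory(table, components):
    """Resolve a directory path purely in memory; return its disc location."""
    if not components or table[0].name != components[0]:
        return None
    current = 0
    for name in components[1:]:
        match = next((i for i, entry in enumerate(table)
                      if i != current and entry.parent == current
                      and entry.name == name), None)
        if match is None:
            return None
        current = match
    return table[current].location


path = "Wine Regions:America:California:Mendocino".split(":")
location = find_directory(PATH_TABLE, path[:-1])  # directory holding the file
print(location)
```

With the table cached, the three directory lookups for the example path cost no disc access at all; only the final read of the California directory at the returned location needs a seek.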
Directories contain more detailed information than the path table. Each directory entry contains:
    Directory or file location.
    File length.
    Date and time of creation.
    Name of the file.
    Flags:
        Whether the entry is for a file or a directory.
        Whether or not it is an associated file.
        Whether or not it has records.
        Whether or not it has read protection.
        Whether or not it has subsequent extents.
    Interleave structure of the file.

Interleaving may be used, for example, to meet realtime requirements for multiple files whose contents must be presented simultaneously. This would happen if a file containing graphic images were interleaved with a file containing compressed sound that describes the images.
Files themselves are recorded in contiguous (or interleaved) blocks on the disc. The read-only nature of CD permits this contiguous recording in a straightforward manner. A file may also be recorded in a series of noncontiguous extents with a directory entry for each extent.
The specification does not favor any particular computer architecture. In particular, all significant multibyte numbers are recorded twice, once with the most significant byte first and once with the least significant byte first.
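A minimal sketch of such a doubly recorded number follows, assuming a 32-bit value with the least-significant-byte-first copy placed ahead of the other. The ordering of the two copies and the function names are assumptions, not taken from this summary.

```python
import struct


def encode_both_orders(value: int) -> bytes:
    """Record a 32-bit number twice, once in each byte order.

    Assumption: the least-significant-byte-first copy comes first.
    """
    return struct.pack("<I", value) + struct.pack(">I", value)


def decode_both_orders(field: bytes) -> int:
    """Read both copies and make sure they agree."""
    (little,) = struct.unpack("<I", field[:4])
    (big,) = struct.unpack(">I", field[4:8])
    if little != big:
        raise ValueError("the two recorded copies disagree")
    return little


field = encode_both_orders(2048)
print(field.hex())  # 0008000000000800
```

A real reader would normally pick up only the copy that matches its own byte order and skip the other, which is what makes the format neutral between architectures.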
Programs like HyperCard, with its ease of authoring and broad extensibility, are very useful for this purpose. HyperCard stacks, with related information such as color images and sound, can be easily and inexpensively stored on CDs despite their possibly very large size.
Editorial: The High Sierra file system gets its name from the location of the first meeting on it: the High Sierra Hotel at Lake Tahoe. It is much more commonly referred to as ISO 9660, though the two specifications are slightly different.
It has gotten very easy and inexpensive to make a CD-ROM disc (or audio CD). For example, you can now take a Macintosh hard disk and send it with $1500 to one of several CD pressers. They will send you back your hard disk and 100 CDs with exactly the same content as what's on your disk. This is the easy way to make CDs with capacity up to the size of your hard disk (Apple's go up to 160 megabytes). True, this is not a full CD but CDs don't need to be full. If you have just 10 megabytes and need 100 copies, CDs may be the best way to go.
If you are buying a CD-ROM drive, there are several factors you might consider in making your choice. Two factors NOT to consider are capacity and data rate. The capacity of all CD-ROM drives is determined solely by the CD they are reading. Though you will see a range of numbers in manufacturers' specs (e.g. 540, 550, 600, and 650 Mbytes), any drive can read any disc and so they are all fundamentally the same. All CD-ROM drives read data at a net 150 Kbytes/sec for CD-ROM data. Other data rates you see quoted may include error correction data (not included in the net rate) or may be a mode 2 data rate (faster than mode 1). All drives will be the same in all of these specs.