NetNews Usenet Archive 1992 #20

home *** CD-ROM | disk | FTP | other *** search

/ NetNews Usenet Archive 1992 #20 / NN_1992_20.iso / spool / comp / unix / question / 10842 < prev next >

Wrap

Text File | 1992-09-08 | 5.5 KB | 123 lines

Newsgroups: comp.unix.questions Path: sparky!uunet!spool.mu.edu!decwrl!pa.dec.com!nntpd2.cxo.dec.com!nabeth!alan From: alan@nabeth.enet.dec.com (Alan Rollow - Alan's Home for Wayward Tumbleweeds.) Subject: Re: Berkely Fast File System Message-ID: <1992Sep9.025316.19275@nntpd2.cxo.dec.com> Lines: 110 Sender: alan@nabeth (Alan Rollow - Alan's Home for Wayward Tumbleweeds.) Reply-To: alan@nabeth.enet.dec.com (Alan Rollow - Alan's Home for Wayward Tumbleweeds.) Organization: Digital Equipment Corporation References: <1992Sep7.085336.22530@nntp.uoregon.edu> Date: Wed, 9 Sep 1992 02:53:16 GMT In article <1992Sep7.085336.22530@nntp.uoregon.edu>, swynne@cie.uoregon.edu.uoregon.edu (Stephen Wynne) writes: > >I'm having difficulties understanding the Berkeley Fast File System. >I can see how fragmentation would be a problem under the old method, >but I'm getting lost in the details of cylinder groups, bitmaps, etc... >Would anyone care to start a discussion on this topic with me? Sure. The Fast File System offers a variety of solutions to a variety of problems not all related. Two of the more prominent problems solved were: A. Low bandwidth used compared to what was available. B. Very low locality of reference. The solution to A was straight forward, increase the block size. This presented the problem of wasting space on files that used very little of a block. Hence file fragments were born. This presented the additional problem of how to quickly find a good fragment of the right size when you needed one. By keeping each free block list in a table on a per cylinder group basis, one you had decided what cylinder group to put a block in, it was easy to do a pattern search for the a fragment that matched the one you wanted. This was aided by a VAX instructions that did this search very well. Problem B had lots of pieces of solutions. In the old file system blocks were allocated off the free list in the simplest way possible. When block were freed, they were put back on the list in the simplest way possible (I'm doing this from memory, if I'm wrong please correct me). As the result the free list got quickly scrambled. There was also another problem with the old file system. The arrangement of the superblock, inode table and data block was: +------------+ | Superblock | +------------+ | Inodes | +------------+ | Data | +------------+ If the data you were using happened to be at the far end of the disk from the inodes, any update to the inode result in a LONG seek. Now the invention of cylinder groups. A cylinder group resembles the old file system, with the addition of the cylinder group data. The file system is a collection of these groups of cylinders. The file allocation algorithms were changed so that when a new directory was created it was created in a cylinder with an above average amount of free space. As files were created in this directory the preference was to allocate them in the same cylinder group as the directory. As blocks were allocated it was preferred to allocate them in the same cylinder group as the file, except when the file was very large. In this way most files would be kept close together and if they were small enough in a single contiguous block. Even today a large percentage of files will fit in an 8 KB block. Large files would be scattered across the disk, in such a way that big pieces of the file would be close together, but these pieces might scattered completely across the disk. Further the blocks were allocated in such a way that they would be in a rotationally optimal location. One of the problems with many disk controllers is that two separate I/O requests for block next to each other will probably result in getting the first one, and having to wait a rotation for the 2nd block. In this case it is better to put the later block in a location such that the I/O has a chance to get to the controller before the block rotates by. The affect is very pronounced on DEC MSCP controllers with SDI disks. The rotational delay can sometimes double the bandwidth being used for sequential I/Os. This is not as much a problem with disks that have their own read-ahead caches. These block and file allocation methods greatly improved the locality of reference for a single user, especially if the applications used followed the assumptions under which the improvements were designed. The next problem that came up was what to do on a multi-user system. While one user may see most of his disk activity in one little area of the disk a number of user would cause the activity to be scattered all over the disk. The obvious solution was the to do seek ordering. I/O requests would be put on queues handled by the disk driver. When an entry was added the location would be noted so that it could be added with other requests for the same area of the disk. As long as many I/O requests are going to the disk, this will allow an through-put improvement. If there aren't many outstanding I/O requests, then disk probably isn't busy enough to worry about. This particular optimization was disabled for the DEC MSCP controllers that did their own seek ordering. I don't know of any performance tests to compare the two. I have heard one or two complaints from people that would have preferred the controller seek order feature be disabled and the Berkeley disksort kept. Well, I guess that's enough for now. I should probably go re-read the Fast File System paper and see how much I got wrong. > >Thanks, > >STEVE > -- Alan Rollow alan@nabeth.cxo.dec.com