- Newsgroups: comp.os.linux
- Path: sparky!uunet!munnari.oz.au!uniwa!oreillym
- From: oreillym@tartarus.uwa.edu.au (Michael O'Reilly)
- Subject: Thoughts on the kernel/mm/fs interface
- Message-ID: <1992Jul23.074530.8051@uniwa.uwa.edu.au>
- Sender: news@uniwa.uwa.edu.au (USENET News System)
- Nntp-Posting-Host: tartarus.uwa.edu.au
- Organization: University of Western Australia
- Date: Thu, 23 Jul 1992 07:45:30 GMT
- Lines: 81
-
- Having just thrown out my second attempt at implementing the BSD FFS, I
- would like to toss a few ideas into the ring regarding the current
- interfaces between the mm, fs and kernel proper.
-
- In my humble opinion, the current memory manager/file system interface
- seems a little clumsy. In particular, three things are very difficult to
- implement cleanly.
-
- 1) Block sizes other than 1K. This is due to the code in /fs all
- assuming a 1K block size. :( This really does need fixing.
-
- 2) A variable-sized buffer cache, i.e. one that can use all pages not
- currently used by the kernel + user processes. This isn't too bad to
- implement, but it doesn't look clean. In particular, the mm has some
- awkward semantics to deal with.
-
- 3) Anything that requires a larger super block than the minix one. :(
-
- Suggestions for improvement.
-
- 1) The current buffer system should be separated a bit more from the fs
- system, with one main interface:
- long get_disk_page(struct inode *i, long offset);
-
- Note that it returns the page number of the resultant page. Note also
- that the offset is in bytes, but should be a multiple of the page
- size. This cleans up two things in particular: shared pages and
- demand paging. It also frees the mm from dependence on the disk block
- size.
-
- There is no reason why the page just read from the disk should
- be copied over to the user process's space. (Well, there is under the
- current model, but we want to improve.) So let's just get the page
- number and map it directly in, without copying. The nice part is
- that the page is still in the disk buffer (albeit locked). If
- another copy of the same application is run, get_disk_page() is called,
- which maps the same page right into the new process's space. No copying
- of memory, and cleaner semantics for the memory manager. The thought
- just strikes me that this would probably improve mmap() as well. I
- haven't checked that ;)
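
A minimal user-space sketch of the sharing behaviour described above, not kernel code. Only the get_disk_page() signature comes from the post; the cut-down struct inode, the cached_page table, DISK_PAGE_SIZE, CACHE_PAGES and the disk_reads counter are all assumptions made up for illustration, and the "disk read" is only counted, never performed.

```c
#include <stddef.h>

#define DISK_PAGE_SIZE 4096L    /* assumed page size */
#define CACHE_PAGES    8        /* assumed (tiny) cache size */

struct inode { int i_dev; long i_ino; };    /* hypothetical stand-in */

struct cached_page {
    const struct inode *owner;  /* NULL while the slot is free */
    long offset;                /* byte offset, multiple of DISK_PAGE_SIZE */
    int users;                  /* number of mappings sharing the page */
};

static struct cached_page cache[CACHE_PAGES];
static int disk_reads;          /* counts simulated disk reads */

/* Return the page number holding (i, offset), or -1 on error. */
long get_disk_page(struct inode *i, long offset)
{
    long n, free_slot = -1;

    if (i == NULL || offset < 0 || offset % DISK_PAGE_SIZE != 0)
        return -1;

    for (n = 0; n < CACHE_PAGES; n++) {
        if (cache[n].owner == i && cache[n].offset == offset) {
            cache[n].users++;   /* already cached: share it, no copy */
            return n;
        }
        if (cache[n].owner == NULL && free_slot < 0)
            free_slot = n;      /* remember the first free slot */
    }

    if (free_slot < 0)
        return -1;              /* cache full; a real one would evict */

    cache[free_slot].owner = i;
    cache[free_slot].offset = offset;
    cache[free_slot].users = 1;
    disk_reads++;               /* pretend the page was read from disk */
    return free_slot;
}
```

Calling get_disk_page(&prog, 0) twice for the same inode hands back the same page number both times and counts only one disk read, which is the behaviour wanted when a second copy of an application starts. A real version would also need locking and a way to drop a mapping again.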
-
- 2) The VFS really needs to get away from its dependence on 1K blocks.
- This is reasonably difficult to do. The best way I have thought of
- is to change b_dev in the buffer_head structure to be a pointer
- to the super block instead of a device number. The only other way I can
- think of is to add the actual buffer size to the buffer_head structure and
- set it in getblk(). Note that getblk() only gets passed the device number,
- not the super block, so we need to search the mount list to find the
- buffer size. This would need to be fixed.
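
The two options can be compared in a small sketch. Everything here is an assumption for illustration: the cut-down struct super_block, its sb_dev and sb_next fields, the mount_list head, add_mount(), dev_blocksize(), the b_sb field and bh_blocksize() are hypothetical names, and the 1K fallback simply mirrors the current default mentioned above.

```c
#include <stddef.h>

/* Cut-down stand-in for an in-memory super block. */
struct super_block {
    int sb_dev;                     /* device number (hypothetical field) */
    int sb_blen;                    /* block length in bytes */
    struct super_block *sb_next;    /* next mounted fs (hypothetical) */
};

static struct super_block *mount_list;  /* hypothetical mount list head */

/* Put a mounted file system at the head of the mount list. */
void add_mount(struct super_block *sb)
{
    sb->sb_next = mount_list;
    mount_list = sb;
}

/*
 * Option B: getblk() only knows the device number, so it has to walk
 * the mount list to learn the block size. Falls back to 1K (assumed
 * default) when the device is not mounted.
 */
int dev_blocksize(int dev)
{
    const struct super_block *sb;

    for (sb = mount_list; sb != NULL; sb = sb->sb_next)
        if (sb->sb_dev == dev)
            return sb->sb_blen;
    return 1024;
}

/*
 * Option A: the buffer head points at its super block, so the block
 * size is one dereference away and no search is needed.
 */
struct buffer_head {
    struct super_block *b_sb;       /* would replace b_dev */
    long b_blocknr;
};

int bh_blocksize(const struct buffer_head *bh)
{
    return bh->b_sb->sb_blen;
}
```

The list walk in dev_blocksize() is the cost the pointer version avoids, paid on every lookup instead of once at mount time.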
-
- 3) The super block and inode structures need a lot of cleaning. The
- main problem is that there are a lot of elements in the current super
- block struct that are only meaningful to linux. The FFS in particular
- needs a fair whack of info in memory, which should be stored in the super
- block.
-
- The easiest way of fixing this is just to make the fs-independent parts
- of the struct lie at the front, and let individual file systems cast it
- to access fs-dependent variables.
-
- #define COMMON_SUPER \
- int sb_blen; /* block length */ \
- int sb_magic /* magic number; the ';' comes from the use site */
-
- i.e.
-
- struct super_block {
- COMMON_SUPER;
- char sb_dummy[2048]; /* space for fs-dependent info */
- };
-
- struct ffs_super_block {
- COMMON_SUPER;
- int sb_cgrps; /* number of cyl groups. */
- ...
- };
-
- Looks nice to me. Comments?
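
A sketch of how the common-prefix layout might be used. Note the swap: instead of a bare pointer cast it goes through a union, which keeps access to the shared leading fields within ISO C's "common initial sequence" rule. The union sb_slot, ffs_fill_super(), generic_blen() and FFS_MAGIC_PLACEHOLDER are hypothetical names, the magic value is a placeholder, and _Static_assert assumes a C11 compiler.

```c
#include <stddef.h>

#define COMMON_SUPER \
    int sb_blen;    /* block length */ \
    int sb_magic    /* magic number */

struct super_block {
    COMMON_SUPER;
    char sb_dummy[2048];        /* space for fs-dependent info */
};

struct ffs_super_block {
    COMMON_SUPER;
    int sb_cgrps;               /* number of cylinder groups */
};

/* One in-memory super block slot, viewable either way. */
union sb_slot {
    struct super_block generic;
    struct ffs_super_block ffs;
};

/* The fs-specific layout must fit in the space the generic one reserves. */
_Static_assert(sizeof(struct ffs_super_block) <= sizeof(struct super_block),
               "ffs_super_block does not fit in super_block");

#define FFS_MAGIC_PLACEHOLDER 0x1234    /* illustrative value only */

/* FFS code fills in the slot through its own view. */
void ffs_fill_super(union sb_slot *slot, int blen, int cgrps)
{
    slot->ffs.sb_blen = blen;
    slot->ffs.sb_magic = FFS_MAGIC_PLACEHOLDER;
    slot->ffs.sb_cgrps = cgrps;
}

/* Generic code reads only the shared leading fields. */
int generic_blen(const union sb_slot *slot)
{
    return slot->generic.sb_blen;
}
```

The static assertion is the useful part: if a file system ever outgrows the 2048 bytes of sb_dummy, the build fails instead of the kernel scribbling past the end of the struct.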
-
- I have got down to here, and am certain there are things I have forgotten
- to say. Any comments? Further suggestions/improvements I have missed?
-
- Michael
-