Newsgroups: comp.protocols.nfs
Path: sparky!uunet!cs.utexas.edu!sun-barr!ames!biersch!aga
From: aga@Comtech.com (Alan G. Arndt)
Subject: Re: Sun PC-NFS performance (again)
Message-ID: <1992Dec16.204443.803@Comtech.com>
Organization: Comtech Labs Inc, Palo Alto
References: <1g3673INNa64@seven-up.East.Sun.COM> <1992Dec15.015951.20329@Comtech.com> <1gksggINNjmh@seven-up.East.Sun.COM>
Date: Wed, 16 Dec 1992 20:44:43 GMT
Lines: 225

In article <1gksggINNjmh@seven-up.East.Sun.COM> geoff@tyger.Eng.Sun.COM (Geoff Arnold @ Sun BOS - R.H. coast near the top) writes:
>
>????? PC-NFS is just a BWOS (big wad of software). Its performance is
>determined by all of those things that you just said were *not*
>limiting factors. (OK. Let me cover *all* the bases. It is
>theoretically possible for PC-NFS to self-time in such a way that the
>performance is independent of, e.g., CPU speed. I can assure you that
>it doesn't do this.) Obviously the design of PC-NFS will affect the
>performance, but the assertion that there is an absolute limit of 277KB
>because of PC-NFS is incongruous.

Sure, its performance is determined by ALL those things, but when every
other item is sped up and there is NO performance increase, what does one
look to?

>Your memory is faulty. PC-NFS has always used a read size of 1K.

OK, I will certainly concede this one; it was many years ago and, heck,
I could even have been looking at the wrong thing then.

>(Initially this was because most/all PC Ethernet cards were incapable
>of dealing with back-to-back packets. Unfortunately it got ingrained

I know they were; both of the cards we used previously (3c501s and
3c503s) have small buffers.

>into the buffer management logic, and it's still there. But
>I digress...) As for the sizes of the request (and response) packets,
>these are NFS issues rather than anything to do with PC-NFS.

I realize that. The Sun requests were the same lengths. Basically
just a lot of overhead, and that will never change.

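To put that overhead in concrete terms, here is some back-of-envelope arithmetic (my illustration only, not a measurement from the thread): if throughput is limited purely by issuing one synchronous 1K read RPC at a time, the observed 277 KB/sec implies roughly 3.6 ms per round trip, and the same round trip carrying 8K per read would move about eight times as much data.

```python
# Back-of-envelope arithmetic only (an illustration, not a measurement):
# if throughput is limited purely by one synchronous read RPC per block,
# the observed rate implies a per-RPC round-trip time.

def rpc_round_trip_ms(throughput_kb_s, read_size_kb):
    """Milliseconds per read RPC, assuming one RPC per read_size_kb block."""
    rpcs_per_sec = throughput_kb_s / read_size_kb
    return 1000.0 / rpcs_per_sec

rtt = rpc_round_trip_ms(277, 1)      # 277 KB/s with 1K reads
print(round(rtt, 2))                 # ~3.61 ms per request

# The same ~3.6 ms round trip carrying 8K per read would allow
# 8 * 277 = 2216 KB/s before some other limit is hit.
print(round(8 * 1000 / rtt))         # 2216
```
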
>#When one
>#starts to add up the requests and individual 1K bytes I can belive
>#that the performance is ONLY 277 KB/sec and that the machines are
>#doing their BEST the get that. That is why there is a noticeable
>#difference between 50 Meters away and a machine RIGHT next to it's
>#client.
>
>OK, but see below. (And the last sentence only makes sense if you are
>going through some painfully slow routers. Bridges or repeaters
>shouldn't introduce any noticeable delay.)

I would agree you shouldn't notice a delay; I just couldn't imagine any
other reason for the performance differences. If I assume that the
4/670 is slower than the SS1+ at serving files, for whatever reason,
then it is explained. And I guess I can certainly believe there is
some sort of interaction that actually makes the 4/670 slower.

>#We also noticed that the PC was only creating WRITES of apparently
>#512 bytes (734).
>
>Small writes strike again.

Yep, that seems to always be the case. How much software actually makes
small writes? From what I can determine, almost everything does.

Is it possible to have an option on the config.sys line, for whatever
driver is responsible, to specify the write buffer size and the read
request size? If my card handled 8K reads, or even 4K reads, it would
most certainly be a lot faster. I can put up with the loss of the extra
4, 8, or even 16K of DOS memory; I agree that some people can't. Even
if it gets shoved into upper memory it will still be faster.

>On an NFS server, any write other than an optimally aligned write of
>exactly bsize bytes may require a read-modify-write to get the disk
>updated. Small transfers are clearly much less efficient than large
>ones in this mode, and in fact using a 512 byte write size will reduce
>performance noticeably. I'm posting as a separate follow-up copies of
>my "reader" and "writer" programs which do 8K writes and reads. Under
>PC-NFS, 8K writes are performed using a single RPC (assuming that the
>server can support this tsize); each UDP datagram is obviously
>fragmented based on the network MTU.
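The read-modify-write point can be sketched numerically. The following toy model is mine, not SunOS or PC-NFS internals: it assumes a hypothetical server with an 8K bsize and no block caching, and simply counts the disk operations implied by partial-block writes.

```python
# Toy model of server-side read-modify-write (hypothetical: 8K bsize,
# no block caching), counting disk I/Os implied by one client write.

BSIZE = 8192  # assumed server filesystem block size

def disk_ops_for_write(offset, length, bsize=BSIZE):
    """Return (block_reads, block_writes) needed to apply one write."""
    first = offset // bsize
    last = (offset + length - 1) // bsize
    reads = writes = 0
    for blk in range(first, last + 1):
        blk_start = blk * bsize
        covered = min(offset + length, blk_start + bsize) - max(offset, blk_start)
        if covered < bsize:
            reads += 1   # partial block: read it, patch it, write it back
        writes += 1
    return reads, writes

print(disk_ops_for_write(0, 8192))   # one aligned 8K write: (0, 1)

# The same 8K sent as sixteen 512-byte writes costs 16 reads + 16 writes
# in this worst-case model (a caching server would absorb some of these).
ops = [disk_ops_for_write(i * 512, 512) for i in range(16)]
print(sum(r for r, w in ops), sum(w for r, w in ops))   # 16 16
```
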

How do all networks work? Novell, LAN Manager, etc.? Do they all have
this same problem? Are all the benchmarks for them bull**** because they
are not based on real world applications? Or do they even buffer up
writes that are LARGER than 256 bytes into a full 8K transfer? In one of
our applications we need to write blocks that are somewhat over 4K but
not precisely 4K. We could munge all this around to create actual 8K
writes, but this seems like something a program shouldn't have to
know about or deal with.

Is this just another issue of space? Your writes are direct from their
respective locations in memory, so if the write is larger than 256 bytes
you don't have a buffer for it? If one were to allocate an 8K buffer and
deal with it as you currently do with <256 byte writes, would that work?
Assuming, of course, you have to take the extra precious DOS memory.
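The buffering I am asking about would look something like this sketch (the names and structure here are mine, a hypothetical design, not PC-NFS internals): sequential small writes accumulate client-side and go out as one large write RPC.

```python
# Sketch of client-side write coalescing (hypothetical design, not
# PC-NFS internals): buffer sequential small writes and flush one
# large write RPC when 8K accumulates or the stream breaks.

class WriteCoalescer:
    def __init__(self, send_rpc, bufsize=8192):
        self.send_rpc = send_rpc   # callback: send_rpc(offset, data)
        self.bufsize = bufsize
        self.buf = bytearray()
        self.base = None           # file offset of buf[0]

    def write(self, offset, data):
        if self.base is not None and offset != self.base + len(self.buf):
            self.flush()           # non-sequential write: flush first
        if self.base is None:
            self.base = offset
        self.buf += data
        while len(self.buf) >= self.bufsize:
            self.send_rpc(self.base, bytes(self.buf[:self.bufsize]))
            self.base += self.bufsize
            del self.buf[:self.bufsize]

    def flush(self):
        if self.buf:
            self.send_rpc(self.base, bytes(self.buf))
        self.buf = bytearray()
        self.base = None

rpcs = []
w = WriteCoalescer(lambda off, data: rpcs.append((off, len(data))))
for i in range(16):                # sixteen sequential 512-byte writes
    w.write(i * 512, b"x" * 512)
w.flush()
print(rpcs)                        # [(0, 8192)] - one 8K RPC, not 16 small ones
```
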

>#Now we looked at the manuals and the only option I found was TSIZE
>#which WAS set at 8Kbytes. The TSIZE is supposed to be only for
>#writes but as there is only one option I assumed it might also be
>#used for the read request size. It is however obvious that PC-NFS
>#pay's NO ATTENTION to this parameter and it wasting a tremendous
>#amount of time dealing with small block sizes.
>
>[Watch those capital letters - it reads as though you're shouting.]

Well at that point I probably was shouting, sorry.

>The tsize parameter is not "supposed to be only for writes". Consulting
>AnswerBook, I find that it defines tsize as
>
>    The optimum transfer size of the server in bytes. This is
>    the number of bytes the server would like to have in the data
>    part of READ and WRITE requests.
>
>This is actually misleading. A client should treat tsize as an upper
>bound on the read and write size, since this is the only way a
>server with a marginal network interface can discourage the client
>from sending back-to-back packets.

Well, I don't have AnswerBook, but I did just go read the User Manual and
the Reference Manual for 4.0, and in at least 4 places the wsize
parameter was defined and it ONLY referred to a write size. I
couldn't ever find a definition of tsize in the manuals. And based on
what you said earlier about the reads always being 1K, it seems obvious
that tsize and wsize do in fact only refer to the write size. As I
said above, I would love for them to control the read and write size,
but you would also need to specify that on the config.sys line so the
driver could allocate the memory needed.

>
>#So what can be done to PC-NFS to get it to use the LARGE block sizes?
>#Is there some parameter I have missed? The SMC8013 cards have
>#buffers of 16K on the board and certainly could handle 4K blocks if
>#not full 8K blocks. The 3C509 cards will stream the data off the
>#card as it comes in so the block size is irrelavent.
>
>These factors are only a tiny part of the whole problem. Let's assume
>that we're never going to drop packets, and that we can get the data
>off the board as fast as memory will take it. The more difficult issues
>revolve around buffer management. How many buffers? How big? In
>conventional memory, UMB, EMS? If EMS, do you transfer directly to EMS
>from the net or copy it up? If the former, what happens to interrupt
>latency? If the latter, how do you avoid excessive copying of data? How
>do you avoid the EMS performance hit in the simple case? Do you
>optimize around an EMS model, and if so what do you do about the
>millions of PCs which don't/can't support EMS? But wait - there's
>more! How do you handle ReadDir response buffering? Do you use the
>same buffer pool as for file data? If so, how do you deal with the
>radically different buffer aging and re-use patterns for data and
>directories? If not, where do you copy directory information to? Do you
>hard-code a single policy and configuration, or do you make the whole
>thing infinitely configurable? If the latter, how much does the
>configurability cost in terms of code size and performance?

I understand there are a lot of issues, and I appreciate the time and
effort you have put into making things as fast as they are. I also
understand the many tradeoffs. I would certainly like it to be as
fast as it can be and run with the buffers in upper memory. If that isn't
a feasible option I would settle for them in DOS memory, as I am sure
many people would. 8 or 16K of memory isn't that bad for a 2-6x
performance increase. And if I am typical of many setups, I already have
so many things loaded that I can't put everything into upper memory;
all I would do is rearrange things, put my mouse driver up high, and
leave the pcnfs buffer down low. If however this can't go in upper
memory and forces pcnfs.sys into low memory, that would increase low
memory usage by 80+K. That may not be acceptable in all our
applications, but it certainly would be in some of them. However, if
those buffers are already in pcnfs.sys, I see no reason why it can't
stay in upper memory as it is now.

>Over the lifetime of PC-NFS, the most consistent demand has been for
>size reduction. We've added features over the years, but we've tried to

I can understand that. I too, like many others I am sure, complained
about how little memory I had left after loading PC-NFS. If however I
was given the choice of performance or RAM, I know I would choose the
performance.

>keep the footprint more or less constant. What avenues are open to us
>to improve performance? More buffering of small writes; support for
>larger (up to 8K) reads; more read buffers. All these will increase the
>size significantly. Putting buffers in EMS will slow some things down
>(EMS access times plus at least one extra copy). The present design
>and configuration represents our current best shot at balancing all of
>these conflicting demands. You (and others) want better performance:
>we'll obviously consider that in our product plans, but please
>recognize that there are competing demands that we must consider.

I appreciate the effort you have put into making a common configuration
that meets the demands of everyone; unfortunately, not everyone has
the same extremes of needs. I, for instance, no longer use many DOS
apps, and the memory available in DOS is quickly becoming irrelevant.

Now for the interesting news. I ran your test programs, and I now know
what I have always believed: my SS1+ is a lot faster than anyone gives
it credit for, as are my PCs.

On a WD8013:
Reading a 10M file
total time 36140 msec  total bytes 10485760  bytes/msec 293

Writing 1280 blocks
total time 18840 msec  total bytes 10485760  bytes/msec 556

On a 3C509:
Reading a 10M file
total time 36030 msec  total bytes 10485760  bytes/msec 291

Writing 1280 blocks
total time 17300 msec  total bytes 10485760  bytes/msec 606
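For anyone comparing these against the 277 KB figure earlier in the thread, the timing totals above convert to KB/sec (taking 1 KB = 1024 bytes) as follows:

```python
# Converting the timing totals above into KB/sec (1 KB = 1024 bytes).

def kb_per_sec(total_bytes, total_msec):
    return total_bytes / (total_msec / 1000.0) / 1024.0

print(round(kb_per_sec(10485760, 36140)))   # WD8013 read:  ~283 KB/s
print(round(kb_per_sec(10485760, 18840)))   # WD8013 write: ~544 KB/s
print(round(kb_per_sec(10485760, 17300)))   # 3C509 write:  ~592 KB/s
```
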

And in fact DOS COPY and Windows file copies do use 8K writes. Due to
the slowness of actually reading and copying a file on a DOS machine,
the performance for a 10 Meg file topped out at 275K on writes and 154K
on reads (writing to a DOS disk). The annoying thing is that a DOS read
to nul: isn't any faster; I don't know if that is because DOS really is
shoving it somewhere or what. It should also be noted that your same
programs compiled as actual Windows apps were about 5% slower, and the
Windows File Manager was about 20-30% slower than DOS COPY.
However, there wasn't a significant difference between running your
test in DOS and in a DOS window.

I did look at etherfind for the writes, and they were 8.5K total in a
group. From the performance numbers it is obvious that writes were 8K
and that reads are still stuck at 1K. Obviously a server should be able
to provide a file that is cached in memory faster than it can do the
writes. Oh, and of course those writes are fast because we patched the
kernel to perform async NFS writes.

----
Alan Arndt        Comtech Labs
415-813-4500      4005 Miranda Avenue, Suite 150
aga@Comtech.com   Palo Alto, CA 94304-1218