- Path: sparky!uunet!olivea!sgigate!odin!mips!sdd.hp.com!cs.utexas.edu!qt.cs.utexas.edu!yale.edu!yale!mintaka.lcs.mit.edu!ai-lab!life.ai.mit.edu!burley
- From: burley@geech.gnu.ai.mit.edu (Craig Burley)
- Newsgroups: comp.os.linux
- Subject: Re: Buffer corruption problems.
- Message-ID: <BURLEY.92Aug13153840@geech.gnu.ai.mit.edu>
- Date: 13 Aug 92 19:38:40 GMT
- References: <16078@ucdavis.ucdavis.edu> <1992Aug13.163854.21617@midway.uchicago.edu>
- Sender: news@ai.mit.edu
- Followup-To: comp.os.linux
- Organization: Free Software Foundation 545 Tech Square Cambridge, MA 02139
- Lines: 50
- In-reply-to: ace3@quads.uchicago.edu's message of 13 Aug 92 16:38:54 GMT
-
- In article <1992Aug13.163854.21617@midway.uchicago.edu> ace3@quads.uchicago.edu (Tony 'LLama' Acero) writes:
-
- I have no idea what's going on and would appreciate any input! :-)
- (The smiley is to indicate I'm not complaining and half-expecting
- that I've done something bone-headed)
-
- I'm not sure about your problem, or that of the person whose post you followed up on,
- but...
-
- ...I believe there is a bug in Linux that has the following behavior:
-
- - causes Linux to "misread" one 1024-byte (1KB) chunk of data from a disk-based file
- so that what your app ends up with is some _other_ 1024-byte chunk
- (apparently from the same file)
-
- - occurs only during very heavy disk access, such as megabytes accessed
- continually
-
- - is intermittent, but happens enough to reproduce fairly easily
-
- - might be SCSI-related (I have a SCSI system) but, based on responses I've
- gotten from others saying they've seen the same behavior, probably isn't
-
- - is still in 0.97 and perhaps happens somewhat more often there (though of
- course it's hard to measure this)
-
- I keep putting off exploring this bug myself for various reasons, such as:
- I'm not a Linux-kernel hacker yet; I keep hoping someone else will fix it
- first; it's hard to debug when your own dev system _has_ the intermittent
- failure; I'm too busy with GNU Fortran; I'm waiting for the newer SCSI
- code with 4K blocks to see how that affects the bug (since that might be an
- important clue); I'm waiting until I have extfs &c up and running reliably
- so I can make use of my currently wasted disk space before tackling this
- bug; I'm just plain lazy; I'd rather play tennis with my wife; etc.
-
- I'm convinced that when I finally decide to tackle the bug (rather than just
- write a shell script and program to create a test-case to demonstrate it,
- as I've done so far), I'll blow 72 hours on it and _then_ find that someone
- else found and fixed it! (Unfortunately, nobody responded to my posted
- test case saying they'd reproduced the problem, much less had the know-how
- and desire to look into it themselves. If anyone wants, I could repost it
- to the mailing list as I did last time. Email me, don't post here, to
- keep traffic low, if you want me to post the test case.)
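
A test along the lines described above might look like the following sketch. It is not the test case that was posted, which is not reproduced here, and it is written in Python for brevity rather than as a shell script plus program. The idea is to stamp every 1024-byte block of a file with its own block number, then reread the file repeatedly and report any block whose contents belong to a different block. The names (make_block, write_stamped_file, find_misread_blocks, run_test), the 1024-byte block size, and the default file size and pass count are all assumptions made for illustration. On a system with a large cache the rereads may never reach the disk, so a real test would probably need a file much larger than memory.

```python
import os
import struct
import tempfile

BLOCK_SIZE = 1024  # assumed block size, matching the 1024-byte chunks described


def make_block(index):
    """Return a block in which every 4-byte word holds the block's own index."""
    return struct.pack("<I", index) * (BLOCK_SIZE // 4)


def write_stamped_file(path, num_blocks):
    """Write num_blocks self-identifying blocks and push them out to disk."""
    with open(path, "wb") as f:
        for i in range(num_blocks):
            f.write(make_block(i))
        f.flush()
        os.fsync(f.fileno())


def find_misread_blocks(path, num_blocks):
    """Reread the file; return (expected_index, found_index) for each bad block.

    found_index is the stamp in the first word of what was actually read,
    which shows whether the data came from some other block of the same file.
    """
    bad = []
    with open(path, "rb") as f:
        for i in range(num_blocks):
            data = f.read(BLOCK_SIZE)
            if data != make_block(i):
                found = struct.unpack("<I", data[:4])[0] if len(data) >= 4 else None
                bad.append((i, found))
    return bad


def run_test(num_blocks=2048, passes=3):
    """Write a stamped temporary file, reread it several times, collect mismatches."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_stamped_file(path, num_blocks)
        mismatches = []
        for _ in range(passes):
            mismatches.extend(find_misread_blocks(path, num_blocks))
        return mismatches
    finally:
        os.remove(path)


if __name__ == "__main__":
    for expected, found in run_test():
        print("block %d contained data stamped %r" % (expected, found))
```

Because each block carries its own index, a mismatch reports not only that a read went wrong but also which block's data turned up in its place, which is the symptom described in the list above. An empty result from run_test() only means no misread was observed on that run; given that the failure is intermittent, it would have to be run under heavy disk load, probably many times.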
-
- Of course it could be a hardware bug in my system, but seeing as others
- _seem_ to have the same problem on wildly different hardware, I doubt it.
- --
-
- James Craig Burley, Software Craftsperson burley@gnu.ai.mit.edu
- Member of the League for Programming Freedom (LPF)
-