- Path: sparky!uunet!elroy.jpl.nasa.gov!news.claremont.edu!nntp-server.caltech.edu!SOL1.GPS.CALTECH.EDU!CARL
- From: carl@SOL1.GPS.CALTECH.EDU (Carl J Lydick)
- Newsgroups: comp.os.vms
- Subject: Re: Question about RMS and MSCP-pair
- Date: 9 Jan 1993 11:38:21 GMT
- Organization: HST Wide Field/Planetary Camera
- Lines: 87
- Distribution: world
- Message-ID: <1imdfdINNg7a@gap.caltech.edu>
- References: <1993Jan4.115621.29717@news.th-darmstadt.de>
- Reply-To: carl@SOL1.GPS.CALTECH.EDU
- NNTP-Posting-Host: sol1.gps.caltech.edu
-
- In article <1993Jan4.115621.29717@news.th-darmstadt.de>, sysgaertner@cygnus.frm.maschinenbau.th-darmstadt.de (M. Gaertner, FRM, TH Darmstadt, Germany) writes:
- >Hi there,
- >happy new year!!!
- >Now to my question. I discovered some oddity with RMS and MSCP-served
- >disk, at least I think so.
- >I understand that MSCP-served disks (aka RQDX3) may do bad-block-revectoring
- >on the attached disk. I also understand that, when the original block
- >shows unrecoverable READ-errors, the "forced error flag" on the copied
- >block will be set.
- >What I don't understand is why the RMS then refuses to read the file
- >where the forced error-block is in. It always returns the (correct)
- >error (xxx, forced error flag set). So why does the revectoring take
- >place? I don't care if the file is unreadable because of read-errors or
- >forced errors. Or, asked differently, why isn't RMS showing the file as it
- >is with a warning about the damaged block?
- >Can anybody give an explanation to me?
- >Oh, I know that revectoring bad blocks on the controller-level has its
- >advantages and I like the concept with the forced error flag BUT I don't
- >like RMS's handling of these blocks.
-
- (Bet everyone's expecting a flame here; sorry to disappoint you).
-
- OK, when a bad block is revectored on a READ operation, it's because the error
- correcting code built in to the system couldn't handle the problem. This means
- that there's no reason to believe that the data in the new block are in any way
- similar to the data in the original block. So the block in question is
- flagged with a forced error (in my not-so-humble opinion, it should've been
- called a "software error," but that would cause a whole new set of
- misunderstandings).
-
- OK. So here you are with a file containing a block of what's almost certainly
- gibberish. What do you do about it? Well, if you're a fairly low-level I/O
- routine, you return the data in the block, BUT YOU TELL THE CALLING ROUTINE
- THAT YOU'VE PROBABLY JUST HANDED IT GIBBERISH. Now, if you're a high level
- routine and have been handed gibberish, then unless you're intended
- specifically to handle such gibberish, your only reasonable alternative is to
- say "I quit!" That's what most VMS utilities do (BACKUP being the notable
- exception). Do you, as a user, *REALLY* want to, e.g., COPY a corrupted file?
-
- Now, why does RMS "refuse to read the file"? Well, let's take the canonical VMS
- file type: Variable length records with implied carriage control. We had a
- beginning-of-record going into the bad block. The next record is supposed to
- start with a byte count. But the block is gibberish. Can we trust this byte
- count for the next record? *_NO_*!!!!! Can we ever be confident that we'll
- find the actual beginning of any subsequent record? NO! Is *ANY* data we
- might return from the rest of the file to be trusted? (all together now) NO!
- What can we do? We can refuse to help the user shoot himself in the foot.
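
The loss of record boundaries described above can be sketched in a few lines of Python. This is a simplified model, not RMS itself: it assumes each record is stored as a 2-byte little-endian byte count followed by the data, padded to an even length, and it uses a fixed filler pattern as a stand-in for the gibberish in the revectored block. The helper names (`pack_records`, `unpack_records`) and the filler are made up for illustration.

```python
import struct

BLOCK_SIZE = 512  # bytes per disk block

def pack_records(records):
    """Encode each record as <2-byte little-endian count><data>, padded to even length."""
    out = bytearray()
    for rec in records:
        out += struct.pack("<H", len(rec)) + rec
        if len(rec) % 2:
            out += b"\x00"  # pad byte keeps the next count word-aligned
    return bytes(out)

def unpack_records(data):
    """Follow the byte counts; stop when a count points past the end of the data."""
    records, pos = [], 0
    while pos + 2 <= len(data):
        (count,) = struct.unpack_from("<H", data, pos)
        pos += 2
        if pos + count > len(data):
            break  # the count cannot be right, so we have lost the record boundary
        records.append(data[pos:pos + count])
        pos += count + (count % 2)
    return records

# 120 short records, about 2.3 blocks of data in total.
records = [b"line %03d" % i for i in range(120)]
image = pack_records(records)

# Overwrite the second block with filler, as if it had been revectored.
filler = b"\x05\x00" * (BLOCK_SIZE // 2)
damaged = image[:BLOCK_SIZE] + filler + image[2 * BLOCK_SIZE:]

recovered = unpack_records(damaged)
first_bad = next(i for i, (a, b) in enumerate(zip(records, recovered)) if a != b)
print(f"{first_bad} records match the original; nothing after that can be trusted")
```

Note that the third block is untouched, yet the reader still cannot recover its records: by the time it gets there, it is no longer looking at a real byte count.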
-
- Now, you, as a user, can open the original file in block mode and copy it to
- another file (BACKUP is a simple way to do this, if you don't want to write
- your own code), and you can then try to decipher the result. But VMS isn't
- going to let you do this without making it abundantly clear that you're doing
- something quite risky.
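
As a rough illustration of the split between low-level and high-level routines, here is a Python sketch over a simulated disk. None of this is the actual RMS or QIO interface: the disk is just a list of (data, forced_error) pairs, and `read_block`, `strict_read`, `salvage_copy` and `ForcedError` are hypothetical names. The low-level read always hands back the bytes along with a flag; the strict reader quits at the first flagged block; the block-mode copy keeps everything but tells you which blocks are suspect.

```python
BLOCK_SIZE = 512  # bytes per disk block

class ForcedError(Exception):
    """Raised by the strict reader when a block is flagged as untrustworthy."""

def read_block(disk, n):
    """Low-level read: return the bytes plus a flag saying whether to trust them."""
    data, forced_error = disk[n]
    return data, forced_error

def strict_read(disk):
    """High-level read: give up at the first flagged block."""
    out = bytearray()
    for n in range(len(disk)):
        data, forced_error = read_block(disk, n)
        if forced_error:
            raise ForcedError(f"block {n}: forced error flag set")
        out += data
    return bytes(out)

def salvage_copy(disk):
    """Block-mode copy: keep every block and report which ones were flagged."""
    out = bytearray()
    suspect = []
    for n in range(len(disk)):
        data, forced_error = read_block(disk, n)
        if forced_error:
            suspect.append(n)
        out += data
    return bytes(out), suspect

# Simulated three-block file whose middle block was revectored.
disk = [
    (b"A" * BLOCK_SIZE, False),
    (b"?" * BLOCK_SIZE, True),   # contents are not to be trusted
    (b"C" * BLOCK_SIZE, False),
]

copy, suspect = salvage_copy(disk)
print(f"copied {len(copy)} bytes; suspect blocks: {suspect}")
```

The caller of `salvage_copy` gets all the bytes, but has to deal explicitly with the list of suspect blocks, which is the "going out of your way" part.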
-
- In short, the behavior of RMS in this case is to prevent someone who's totally
- clueless about computers from shooting himself in the foot. You *CAN*
- circumvent this, but you've got to go out of your way to do so.
-
- Hope this helps. Btw, VAX/VMS has *MANY* such safeguards built in. That's one
- of the main reasons it's so much slower than many other platforms/systems. A
- few months ago I had a DECstation go south on me. How did I know? Well, one
- of the users started getting anomalous results out of one of his programs. He
- eventually tracked it down to the fact that FORTRAN code of the form:
- X = 97
- resulted in X being assigned a value of 532.0 (numbers made up on the spot here
- for illustrative purposes; the basic point that a conversion from long to float
- was royally screwed up is, however, correct). Now, what did the DECstation do
- when that happened? Why, it just went merrily on its way (and by the way, the
- system tests available from the console didn't catch this; we sent the machine
- in for repair, and it came back, having "passed diagnostics" still suffering
- from the same problem). What would a VAX have done? It would've generated a
- machine check and dumped you back to the DCL level (unless, of course, the
- conversion had been attempted in kernel mode [I can't, however, think of a good
- reason why that would ever happen], in which case the system would've crashed).
- Now, the microcode required to permit machine checks slows things down
- considerably. After all, after every operation, the system has to perform some
- sort of sanity check. Me, I'd rather get the CORRECT result in twice the time
- it takes to get the spurious result. A lot of people out there, however, seem
- to be of the opinion that computers are infallible.
- --------------------------------------------------------------------------------
- Carl J Lydick | INTERnet: CARL@SOL1.GPS.CALTECH.EDU | NSI/HEPnet: SOL1::CARL
-
- Disclaimer: Hey, I understand VAXen and VMS. That's what I get paid for. My
- understanding of astronomy is purely at the amateur level (or below). So
- unless what I'm saying is directly related to VAX/VMS, don't hold me or my
- organization responsible for it. If it IS related to VAX/VMS, you can try to
- hold me responsible for it, but my organization had nothing to do with it.
-