- Newsgroups: comp.unix.sysv286
- Path: sparky!uunet!cs.utexas.edu!hermes.chpc.utexas.edu!jonathan
- From: jonathan@chpc.utexas.edu (Jonathan Thornburg)
- Subject: (long) Microport/286 won't boot (was: Re: Microport cactus, newsfeed dead, spouse rabid)
- Message-ID: <1992Jul31.074716.21851@chpc.utexas.edu>
- Summary: big trouble
- Keywords: Microport 286 System V/AT usenet boot primary disk failure broken
- Sender: jonathan@einstein.ph.utexas.edu
- Organization: U of Texas at Austin / Physics Dept / Center for Relativity
- References: <1992Jul31.014347.14547@pta.pyramid.com.au>
- Date: Fri, 31 Jul 92 07:47:16 GMT
- Lines: 200
-
- In article <1992Jul31.014347.14547@pta.pyramid.com.au>
- russ@pta.pyramid.com.au (Russell Day) writes:
-
- >I am in trouble.
- Alas, yes.
-
- > [ sad story of computer failing to boot, apparently
- > due to disk corruption including partition table
- > and/or primary boot block ]
-
- Unfortunately, you don't say precisely what software you're running.
- I presume "Microport System V/AT", but you don't say what version.
- There are a lot of known bugs in Microport, especially in the 1.*
- releases -- knowing the version number might help track things down.
-
-
-
- >Naturally, no-one in his right mind would want to back up 38
- >MB onto floppies
-
- Well, I currently have full backups of 56 MB on floppies (5.25"
- high density, not compressed) from my Microport 286 system. That
- said, you're not the first person to question my state of mind. :-)
- (And there will doubtless be several more such questioners when
- my committee gets to see my thesis next month. :-) )
-
- >... so I dont have a recent backup.
-
- This is not good. Backups are like any other form of insurance --
- they're a boring waste of time and money, *until* you need them.
- Then they become worth their weight in silicon. I do a complete
- dump every 6 months to a year, and incrementals every 2 weeks or
- so. (Sometime after my thesis is done (October?) I'll post my
- "backup system". It's fewer than 50 non-comment lines, mostly in
- very simple awk, yet it can do incremental dumps and selective restores,
- and if a read/write error occurs on one floppy, it can still access
- the others in a backup set.) I have never had to recover anything
- due to hardware failure, but about once every few months I manage
- to delete an important file or files by mistake, even with the lines
-     set limbo='/usr/limbo'
-     alias rm "mv -f \!* $limbo"
- in the interactive-shell part of my .cshrc .
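-
- A minimal sketch of the incremental half of such a scheme (this is
- not the awk script mentioned above, which isn't shown here). It
- assumes a timestamp file that is updated at each dump; the function
- name, the stamp paths, and the $FLOPPY variable are all hypothetical.

```shell
#!/bin/sh
# list_changed ROOT STAMP
# Print every regular file under ROOT modified more recently than STAMP.
list_changed() {
    find "$1" -type f -newer "$2" -print
}

# Possible usage (paths and $FLOPPY are placeholders, not real names):
#   touch /usr/adm/dump.new                  # mark the start of this dump
#   list_changed /usr /usr/adm/dump.last | cpio -oc > "$FLOPPY"
#   mv /usr/adm/dump.new /usr/adm/dump.last  # becomes the next reference point
```

- Taking the new timestamp before the dump starts means a file changed
- while the dump is running gets picked up again next time.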
-
-
-
- >Some test were run:
- > All BIOS read and seek tests pass.
- > I can boot unix from the floppy.
- > fdisk describes the Unix and Dos partitions correctly.
- > I can boot dos (first 100 cyls of disk).
- >
- >So the drive and the partition table are both ok.
-
- It's possible that some hardware component is getting marginal,
- so some timing is being violated. Unix works the hardware a lot
- harder than DOS, and tends to be less tolerant of flaky hardware.
- This probably applies to the disk as well -- remember Unix *doesn't*
- use the DOS BIOS once it's up and running, so a marginal disk sector
- might work ok under DOS but fail under Unix. Have you tried using
- fdisk to rebuild your disk's bad track table? Oops, that might
- erase your disk, I'm not sure. But if not, it's certainly worth
- doing. Ditto for resetting the partition table -- even if it reads
- ok, the bits might not be all the way on :-), so refreshing them
- might be a good idea.
-
- Another idea: If your machine can run at several different clock
- rates, you might try switching to the slowest of them and see if
- it makes any difference. I suspect it won't, but it might be
- worth a try.
-
-
-
- >Having booted DOS, I try the following:
-
- I presume you meant "Having booted *Unix* from the floppy, ...".
-
- > mount /dev/dsk/0s0 onto /mnt
- >
- > It says 'Drive 0 Type 15'
- > Error reading super block.
- > It is getting a disk error trying to read head 5 sector 1.
- >
- >This is a no-no because the drive has only 5 heads numbered
- >0-4. And it is a type 17, not a type 15 as stated. I can
- >understand that it thinks that the super block is creamed. I
- >dont know why it wants a type 15 drive, when the cmos setup,
- >and fdisk, all agree that it is a type 17. Is it on drugs?
-
- I had some hardware trouble about 7 months ago with some symptoms
- in common with yours. When I restarted my machine (Microport System
- V/AT 2.3.0U, Apco 286, 8 MHz, 640K base + 2048K extended memory,
- 72 MB disk) after a transcontinental move, it had several
- hardware/setup problems. Among them was a scrambled cmos -- it said that
- the date was "January 56 1992" [sic]. I don't remember all the
- details of how I got it going again (several service trips never
- did find what was (is) broken with the hardware, and it's running
- fine now, though I ended up having to leave 128K of base memory
- disabled), but there were two key things:
-   - I found that the battery in my computer had come loose during
-     shipping. I tried to reconnect it myself, but couldn't find
-     the right connections amongst several other bare-metal-terminals
-     within reach. (This was before the machine's first trip to the
-     computer hospital.) So you might try checking that your battery
-     is properly hooked up, and think about whether it's going dead
-     on you. (That can cause all kinds of flaky problems.)
-   - I found that the Unix /etc/setup program would sometimes fail
-     to properly store into the cmos even when the DOS setup program
-     would work fine. So you might try reinitializing all your cmos
-     parameters with the *DOS* setup program.
-
- >So I muck about some more, and I get to a stage where
- > mount /dev/dsk/0s0
- I presume you actually meant
- mount /dev/dsk/0s0 /mnt -r
- >returns
- > Not a file system.
-
- This is pretty clear -- your superblock is corrupt, or there's some
- disk problem (like trying to read from a nonexistent head on a
- wrong-type drive) that makes the kernel think the superblock is
- corrupt.
-
- (Aside #1 -- It would be nice if Microport supported
- the Berkeley Fast File System, which includes duplicate superblocks
- scattered at intervals through the disk, so one can try booting from
- an alternate if the primary one is corrupted. But alas FFS support
- isn't part of System V Release 2 (which is what Microport SysV/AT is),
- and nobody's doing any development work on ancient software like this
- anyway, so don't hold your breath... Moreover, the FFS uses more
- CPU than the Unix Version 7 FS (which is what Microport has), and
- it's not clear whether the improved disk usage would compensate for
- this.)
-
- (Aside #2 -- does anyone have a disk defragmenter for Microport that
- works without having to dump and restore everything?)
-
-
- >And
- > divvy -d
- >returns
- > Invalid partition end record.
- >(This may be because the selected partition was somehow changed
- >to be the dos partition - I didnt think of this at the time, and
- >now I am here, not there, and cant check. )
-
- Hmm, I don't know, but I wouldn't be surprised if the superblock is
- used to calculate where to look on the disk for the partition end
- record, so this too might be a bad-superblock problem.
-
- That said, you should certainly check that divvy is looking at the
- Unix partition, and if not retry this there.
-
- By the way, my manual pages for "divvy(1M)" don't mention any options
- at all. I just tried "divvy -d" and it dumps all sorts of "interesting"
- information -- thanks for telling all of us dwindling tribe of
- Microport-ers out in net-land about it. How did *you* find out about
- its existence? (I.e. what other manuals should I have read to learn
- about it?)
-
- >So the question of the day is:
- > Why does the boot floppy decide my disk is a type 17
- > when all the indications are that everything else
- > thinks it is a type 15?
-
- Bad karma? Unfavorable morphic resonance? Wrong phase-of-moon?
- Unkind remarks about Microport in comp.unix.sysv286?
-
- Seriously, though, this certainly resembles the battery problems
- I had. I'd look at testing the battery, or taking it to a computer
- service shop and having them do so.
-
- >Any hints?
- Another thing to try: Boot Unix from floppy. Use dd on /dev/dsk/...
- to take a look at the superblock, and try to figure out what's corrupted.
- The superblock format is described in the manual pages, see fs(4).
- In fact, Microport has a program /etc/fsdb(1M) to grub around in damaged
- filesystems, printing and possibly repairing things. But alas it
- doesn't seem to be on the boot disk. If you want, I could mail you
- a copy of it on a floppy -- E-mail me if you want to try this, though
- unless you have a 2-floppy machine we'd have to somehow copy it onto
- a boot floppy.
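-
- A minimal sketch of the dd approach, assuming the layout that fs(4)
- describes, where the superblock is the second 512-byte sector of the
- filesystem (sector 0 being the boot block). The function name is made
- up, and /dev/dsk/0s0 is just the device from the original question.

```shell
#!/bin/sh
# dump_superblock DEVICE
# Skip sector 0 (the boot block), read the next 512 bytes (the
# superblock), and print them in hex with od.
dump_superblock() {
    dd if="$1" bs=512 skip=1 count=1 2>/dev/null | od -x
}

# Possible usage, after booting Unix from the floppy:
#   dump_superblock /dev/dsk/0s0
```

- The output can then be compared field by field against fs(4). Note
- that od -x prints 16-bit words in the machine's own byte order, so
- multi-byte fields need to be read with that in mind.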
-
- That said, I suspect the problem is somewhere "lower down", since
- it looks like your system is trying to access the disk with the
- wrong drive type and head count. This info comes from the cmos.
-
- When I was trying to figure out why my cmos wasn't working properly
- (before I knew that the battery was disconnected), one thing that I
- found is that a write to cmos (from either DOS setup or Unix /etc/setup)
- would "succeed", and the new data would stay there so long as the
- machine stayed on. But if I shut the machine down, actually turned
- the power off, then restarted, the cmos would be wrong again. You
- might check whether this is happening to you.
-
- Good luck,
-
- - Jonathan Thornburg
- <jonathan@einstein.ph.utexas.edu> or <jonathan@hermes.chpc.utexas.edu>
- University of Texas at Austin / Physics Dept / Center for Relativity
- and (for a few more months) U of British Columbia / {Astronomy,Physics}
-