- Newsgroups: comp.unix.bsd
- Path: sparky!uunet!gatech!destroyer!caen!hellgate.utah.edu!fcom.cc.utah.edu!gateway.univel.com!gateway.novell.com!ithaca!terry
- From: terry@ithaca.npd.Novell.COM (Terry Lambert)
- Subject: Satanic boot problem tracked to CMOS, wd.c
- Message-ID: <1992Jul23.152046.13374@gateway.novell.com>
- Keywords: 386bsd wd.c boot CMOS satan
- Sender: terry@ithaca (Terry Lambert)
- Nntp-Posting-Host: ithaca.eng.sandy.novell.com
- Organization: Novell NPD -- Sandy, UT
- Date: Thu, 23 Jul 1992 15:20:46 GMT
- Lines: 126
-
-
- Well, curiosity got the better of me... and I found what I believe
- to be *the* boot problem... well, several boot problems, actually.
-
-
- The magic file? usr/src/sys.386bsd/i386/i386/machdep.c!
-
-
- 1) The variable 'maxmem' is global, so it should be auto-initialized
- to 0, if the compiler is a compiler. If either 'biosbasemem' or
- 'biosextmem' is "invalid", then "maxmem = min (maxmem, 640/4);"
- sets maxmem to the min of 0 and 160, which is zero. That is clearly
- incorrect, as the boot code is obviously running in RAM
- somewhere... besides, maxmem is recalculated from 'Maxmem'
- directly after the if statement, blowing the value to 0-1, which
- puts us at 0xffffffff for our amount of memory.
-
- Correction: First, the wrong variable is assigned; the value being
- set in the default case should be 'Maxmem', not 'maxmem'. Second,
- the min of 0 and anything non-negative is zero, so why is the
- 'min()' function called at all in this case? It is also arguable
- that a machine with less than 640K of base memory cannot boot
- 386BSD, so the forced default should be 640K in the "bad CMOS"
- case. If the machine actually has less than 640K, it will fail
- anyway; but if the thing *has* 640K, this will allow it to boot.
-
- 2) If the amount of extended memory is not greater than 0, or the
- biosbasemem is not equal to 640, 'Maxmem' is *never* set. This
- is the missing "not handled" case which would more correctly be
- the second "else".
-
- Correction: I suggest prompting the user for the amount of memory
- in the machine at this point, and jumping to just after the
- "#endif" for "NDDB" to avoid reiterating the boundary check code.
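
The two corrections above can be sketched in C. This is only an illustration under stated assumptions, not the machdep.c code: the function name, the 0xffff "invalid" marker, and the constants are hypothetical, and instead of prompting the user (which needs console input) the sketch gives the unhandled case the same 640K default.

```c
#define PAGE_KB       4       /* i386 page size, in KB */
#define DEFAULT_BASE  640     /* base memory assumed when CMOS looks bad */
#define CMOS_INVALID  0xffff  /* hypothetical "unreadable CMOS" marker */

/*
 * Hypothetical helper: given the base and extended memory sizes (in KB)
 * read from CMOS, return the number of 4K pages to use for 'Maxmem'.
 */
static unsigned int
size_memory_pages(unsigned int biosbasemem, unsigned int biosextmem)
{
	/* Bad CMOS: force the 640K default rather than min(0, 640/4) == 0. */
	if (biosbasemem == CMOS_INVALID || biosextmem == CMOS_INVALID)
		return DEFAULT_BASE / PAGE_KB;

	/* Normal case: 640K of base memory, extended memory starts at 1MB. */
	if (biosextmem > 0 && biosbasemem == DEFAULT_BASE)
		return (1024 + biosextmem) / PAGE_KB;

	/* The formerly unhandled case: always return an explicit value. */
	return DEFAULT_BASE / PAGE_KB;
}
```

Every path returns a value, so the caller can never be left with an unset 'Maxmem'; a prompt for the real memory size could replace the last return.
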
-
-
- I suspect that one of these two (fatal) cases is being triggered
- by my CMOS having "incorrect" values. There are several reasons this
- might occur:
-
- 1) The CMOS truly has "incorrect" values. A diagnostic to this effect,
- along with what the values retrieved were, and a "Hit any key to
- continue" message immediately following the "degraded mode" message
- would greatly help debugging this. This is, I believe, the case,
- although the reason the values are "incorrect" is that "_rtcin" is
- broken.
-
- 2) The CMOS has the correct values, but the read of the CMOS fails
- due to timing; most likely, this is related to the reset rate of
- various items on my bus. I suspect that the longest delay reset
- items, specifically the built-in bus mouse, are the most likely
- suspects if this is indeed the cause. Again, the modified
- machdep.c would help me narrow this.
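
A minimal sketch of the diagnostic asked for above, written as ordinary C so it can be tried outside the kernel. The function name and message wording are hypothetical; in the kernel the text would go through the kernel printf, followed by a wait for a keypress, which is not shown here.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format the values retrieved from CMOS (in KB),
 * in decimal and hex, followed by the pause message. Returns what
 * snprintf() returns: the length the full message needs.
 */
static int
format_cmos_diag(char *buf, size_t len,
    unsigned int basemem, unsigned int extmem)
{
	return snprintf(buf, len,
	    "CMOS: base memory %uK (0x%04x), extended memory %uK (0x%04x)\n"
	    "Hit any key to continue\n",
	    basemem, basemem, extmem, extmem);
}
```

Printing the raw hex alongside the decimal value makes a garbage read easy to tell apart from a merely unusual configuration.
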
-
-
- The HP Vectra problems could easily be related. Dollars to donuts
- says that my AT&T machines and the Vectra store their CMOS values in a
- strange place, unexpected by BSDI. There is code to the effect that
- "probing breaks certain 386 AT relics"; I suppose *NOT* probing is the
- cause of our problems. I suspect that only one location is being used,
- and that the entire memory is being listed there. Again, without a boot
- diagnostic with sufficient delay, I have no way of telling.
-
-
- Additional notes on boundary conditions:
-
- I would suggest that the expression "maxmem = Maxmem - 1;" be
- checked against minimum and maximum bounds (it immediately follows the "if"
- on line 876 of machdep.c). This is more likely to be the intent of the
- misuse of the "min()" expression for the first case of the "if".
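
The bounds check suggested here might look like the following sketch. The function name and both limits are assumptions for illustration: the floor is the 640K default argued for earlier, and the 16MB ceiling is an arbitrary placeholder. Without the floor, an unsigned 'Maxmem' of 0 wraps to 0xffffffff when 1 is subtracted.

```c
#define MIN_PAGES  (640 / 4)        /* 640K of base memory, in 4K pages */
#define MAX_PAGES  (16 * 1024 / 4)  /* hypothetical ceiling: 16MB */

/*
 * Hypothetical helper: clamp the page count to [MIN_PAGES, MAX_PAGES]
 * before computing the last usable page, so "Maxmem - 1" cannot wrap.
 */
static unsigned int
last_page(unsigned int Maxmem)
{
	if (Maxmem < MIN_PAGES)
		Maxmem = MIN_PAGES;
	if (Maxmem > MAX_PAGES)
		Maxmem = MAX_PAGES;
	return Maxmem - 1;
}
```
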
-
-
-
- What I suspect: '_rtcin' in locore.s is broken. Specifically,
- it reads as follows:
-
- .globl _rtcin
- _rtcin: movl 4(%esp),%eax
- outb %al,$0x70
- subl %eax,%eax # clr eax
- inb $0x71,%al # Compaq SystemPro
- ret
-
-
- This should probably look like the following to guarantee that it
- is more generic (and therefore more likely to work):
-
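- .globl _rtcin
- _rtcin: movl 4(%esp),%eax # CMOS register index (first argument)
- outb %al,$0x70 # select that register via the index port
- inb $0x71,%al # Compaq SystemPro/ATT/HP
- andl $0x000000ff, %eax # Fix big nasty bug: keep only the byte read
- ret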
-
-
- I believe that the zeroing of eax is detrimental, and have removed
- it; only a byte of the value returned is defined... the rest is undefined,
- and is set by the setup program to whatever.
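
To show why only the low byte may be trusted, here is a small C model of how two single-byte CMOS reads would be combined into a 16-bit memory size. The function names are hypothetical and the model assumes the caller builds the value as low byte plus high byte shifted left by 8; on the standard PC/AT CMOS layout the base memory size lives in registers 0x15/0x16 and the extended memory size in 0x17/0x18.

```c
#include <stdint.h>

/* Combine two reads, trusting only the low byte of each. */
static unsigned int
cmos_word(uint32_t lo, uint32_t hi)
{
	return (lo & 0xff) + ((hi & 0xff) << 8);
}

/* The same combination with no masking, for comparison. */
static unsigned int
cmos_word_unmasked(uint32_t lo, uint32_t hi)
{
	return lo + (hi << 8);
}
```

With clean reads of 0x80 and 0x02 both functions yield 640; if any upper bits of a read are garbage, only the masked version still does.
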
-
-
- One of the reasons I need these fixes is to rebuild the kernel: the machine
- 386BSD currently runs on at Weber State University (1 whole box) has a
- problem with both memory and disk space. The machine on which I was doing
- file system development under 0.0 has been confiscated to teach NetWare classes
- (somewhat ironic, considering that I work for Novell), and this has brought
- me to a halt. My cross-compilation environment has died on a couple of the
- new header files; I really can't justify the time to fix this until I have
- 386bsd up on at least one real box, and I can't get it up on a real box until
- I have fixed binaries 8-(.
-
-
- Any of the partial (machdep.c) or full (locore.s) fixes suggested
- on a dist.fs disk would be greatly appreciated! I'm sure that this would be
- very helpful in diagnosing the HP Vectra problem, if it didn't fix it
- outright, and would certainly serve to expose a lot of internals students
- to BSD as well as System V.
-
-
- Regards,
- Terry Lambert
- terry_lambert@gateway.novell.com
- terry@icarus.weber.edu
- ---
- Disclaimer: Any opinions in this posting are my own and not those of
- my present or previous employers.
-