- Newsgroups: comp.unix.bsd
- Path: sparky!uunet!spool.mu.edu!agate!dog.ee.lbl.gov!hellgate.utah.edu!fcom.cc.utah.edu!cs.weber.edu!terry
- From: terry@cs.weber.edu (A Wizard of Earth C)
- Subject: Re: Program dies with FP Exception
- Message-ID: <1992Sep13.083846.6134@fcom.cc.utah.edu>
- Sender: news@fcom.cc.utah.edu
- Organization: Weber State University (Ogden, UT)
- References: <STARK.92Sep13002650@sbstark.cs.sunysb.edu>
- Date: Sun, 13 Sep 92 08:38:46 GMT
- Lines: 67
-
- In article <STARK.92Sep13002650@sbstark.cs.sunysb.edu> stark@cs.sunysb.edu (Gene Stark) writes:
- >Here's a tough one I've been trying to track down -- maybe somebody out there
- >who knows more can guess what is going on.
- >
- >I am running 386BSD on a 486/33 system with 4MB RAM and a 210MB Connor IDE
- >drive. A program I was working on dies on Signal 8 (Floating point exception)
- >in a perfectly repeatable fashion. It is not so easy to tell where the
- >exception actually comes from, though, because the signal seems to be getting
- >delivered to the process much later, when it is leaving the system after
- >a call to "write". I haven't been able to get a small test program that
- >repeats the bug, however there seem to be several crucial elements involved:
- >
- > (1) A call to "atof", which returns a double that is then
- > stored in a temporary on the stack. Removing the call
- > removes the error.
- >
- > (2) The actual magnitude of the number being converted by "atof".
- > I found that the string "1e10" and "1e12" cause the error,
- > but "1e9", "1e6", and "0.0" do not.
- >
- > (3) Some later "write" system calls. The signal is actually
- > delivered on the fourth call to write after the atof.
- > What is happening in the interim is just C code without
- > any other system calls. I do not know what causes the
- > signal to get delivered when it actually does.
-
- First of all, like all other signals, the SIGFPE gets delivered to a process
- by way of the sigtrampoline code. A signal is only delivered as the process
- returns from the kernel to user mode, which in your case means on return
- from a system call; that is why it shows up at the "write". The problem is
- that there appears to be no code in the library which forces a check for the
- exception *immediately* after the floating point function call. This is
- aggravated by the fact that GCC likes to in-line 386 floating point (from
- what little experimentation I've done), which defeats any fix made at the
- library level to hit the sigtrampoline code to check for an exception.
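-
- One way to see where the signal actually arrives is to catch it and record
- that it was seen. What follows is only a sketch, using ISO C signal(); the
- names fpe_seen, on_fpe and install_fpe_handler are made up for illustration
- and are not part of any 386BSD library.

```c
#include <signal.h>

/* Set to 1 by the handler; sig_atomic_t is safe to write from a handler. */
volatile sig_atomic_t fpe_seen = 0;

/* Hypothetical handler: only records that SIGFPE was delivered. */
void on_fpe(int signo)
{
    (void)signo;
    fpe_seen = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_fpe_handler(void)
{
    return signal(SIGFPE, on_fpe) == SIG_ERR ? -1 : 0;
}
```

- The handler can be exercised with raise(SIGFPE), which delivers the signal
- synchronously. For a real hardware exception, returning from the handler is
- not safe in general, so treat this purely as a debugging aid.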
-
- Second, are you using a real FPU, or are you using the emulation? I know
- that I *could* try it myself, but I prefer to arrive at an expected answer
- before experimenting (I guess my physics background shows).
-
- Third, are you aware that when a 16 bit value is multiplied/divided, you
- have to have a 32 bit area to receive the result, and for a 32 bit value,
- a 64 bit receiver? Perhaps you are truly getting an exception.
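-
- As a sketch of the receiver-width point: widen the operands before the
- multiply, and range-check a double before narrowing it to a 32 bit int.
- The function names are hypothetical, and whether a narrowing conversion is
- what trips your program is only a guess on my part; it would at least fit
- "1e9" being harmless and "1e10" not, since 1e10 does not fit in 32 bits.

```c
#include <stdint.h>

/* Multiply two 32 bit values into a 64 bit receiver; cannot overflow. */
int64_t mul32_wide(int32_t a, int32_t b)
{
    return (int64_t)a * (int64_t)b;
}

/* Return 1 if d converts to a 32 bit int without overflow, else 0.
 * A NaN fails both comparisons and so also yields 0. */
int fits_in_int32(double d)
{
    return d >= (double)INT32_MIN && d <= (double)INT32_MAX;
}
```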
-
- Fourth, I believe that the math stuff is actually not being done at the
- highest floating point resolution (I read this in the newsgroup here, so
- I could be totally wrong 8-)). This would lend credence to the idea that
- you are actually getting an exception.
-
- Fifth, there is a well known problem that causes 'ps' to die with the same
- exception -- the problem occurs when a function that has not been declared
- (and is therefore assumed to return int) has its result assigned to a double
- lvalue. Are you sure that atof() is declared extern double somewhere?
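-
- A minimal sketch of the safe form: pull the declaration in from <stdlib.h>
- instead of relying on the implicit int. The wrapper name parse_magnitude
- is made up for illustration.

```c
#include <stdlib.h>   /* declares: double atof(const char *); */

/* With the declaration in scope, the compiler knows atof() returns a
 * double and reads the result from the right place. */
double parse_magnitude(const char *text)
{
    return atof(text);
}
```

- Without the declaration, an old-style compiler treats atof() as returning
- int and converts that int to double, leaving the real result unread.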
-
- Hope this helps narrow the problem.
-
-
- Terry Lambert
- terry_lambert@gateway.novell.com
- terry@icarus.weber.edu
- ---
- Any opinions in this posting are my own and not those of my present
- or previous employers.
- --
- -------------------------------------------------------------------------------
- "I have an 8 user poetic license" - me
- Get the 386bsd FAQ from agate.berkeley.edu:/pub/386BSD/386bsd-0.1/unofficial
- -------------------------------------------------------------------------------
-