- Xref: sparky comp.unix.sysv386:13090 comp.unix.bsd:4076 comp.os.mach:1033 news.answers:2475
- Path: sparky!uunet!cbmvax!snark!esr
- From: esr@snark.thyrsus.com (Eric S. Raymond)
- Newsgroups: comp.unix.sysv386,comp.unix.bsd,comp.os.mach,news.answers
- Subject: Known Bugs in the USL UNIX distribution
- Message-ID: <1hQSyQ#5kbT1S8FRbB41gD9ZK5vKTn2=esr@snark.thyrsus.com>
- Date: 12 Aug 92 17:25:48 GMT
- Expires: 5 Nov 92 00:00:00 GMT
- Sender: esr@snark.thyrsus.com (Eric S. Raymond)
- Followup-To: comp.unix.sysv386
- Distribution: world
- Lines: 734
- Approved: news-answers-request@MIT.Edu
-
- Archive-name: usl-bugs
- Last-update: Wed Aug 12 13:15:22 1992
- Version: 6.0
-
- [My apologies to all for being a week late. Press of work and all that...]
-
- What's new in this issue:
- * Lots more on the origins of various bugs.
- * New bugs in tar, symbolic links, csh, SCSI support.
- * An explanation for many BSD-compatibility library bugs.
- * Some quasi-official USL responses to bug reports.
-
- I. Introduction
-
- This posting lists known bugs in System V Release 4 implementations, and known
- fixes applied by various porting houses. It was formerly part of the
- 386-buyers-faq issues 1.0 through 4.0, and is still best read in conjunction
- with the pc-unix/software FAQ descended from that posting.
-
- This document is maintained and periodically updated as a service to the net by
- Eric S. Raymond <esr@snark.thyrsus.com>, who began it for the very best
- self-interested reason that he was in the market and didn't believe in plonking
- down several grand without doing his homework first (no, I don't get paid for
- this, though I have had a bunch of free software and hardware dumped on me as a
- result of it!). Corrections, updates, and all pertinent information are
- welcomed at that address.
-
- This posting is periodically broadcast to the USENET group comp.unix.sysv386
- and to a list of vendor addresses. If you are a vendor representative, please
- check to make sure the information on your company is current and correct. If
- it is not, please email me a correction ASAP. If you are a knowledgeable user
- of any of these products, please send me a precis of your experiences for the
- improvement of future issues.
-
- The bug descriptions often include indications of fixes by the various porting
- houses to their current releases. These are:
-
- Consensys UNIX Version 1.3 abbreviated as "Cons" below
- Dell UNIX Issue 2.2 abbreviated as "Dell" below
- Esix Revision A abbreviated as "Esix" below
- Micro Station Technology SVr4 UNIX abbreviated as "MST" below
- Microport System V Release 4.0 version 4 abbreviated as "uPort" below
- UHC Version 3.6 abbreviated as "UHC" below
- SCO Open DeskTop 1.1 abbreviated as "SCO" below
-
- II. General Bugs
-
- 1. Dropout problems with tty devices
- The most serious problem anyone has reported is that the USL asy driver is
- flaky and occasionally drops characters at speeds above 4800 baud.
- Microport, Dell, Esix, and UHC say that they believe they've fixed this.
- However, Dell, at least, was mistaken when they first made this claim; a more
- detailed description of the problem is given below. I have been assured that
- this is on the fix list for the next Dell release.
- Bela Lubkin at SCO comments "386 interrupt latency vs. unbuffered UARTs.
- This is a tough problem. Nobody's driver should drop characters with a
- turned-on 16550. It's not so easy with a 16450. Anyone with 16450s or lower
- should be able to solve their problems by dropping in a 16550."
-
- 2. Suid programs dump core when signalled
- Mark Snitily of SGCS says that under many SVr4s, signalling a
- process that is running suid root will cause it to core-dump. He says
- Dell and MST have fixed this, and SCO doesn't suffer from this.
- On the other hand, David Wexelblat writes "In Known Bugs in the USL Code,
- regarding core dumps from signalling suid-root SVR4 programs, Microport does
- not display this problem (either 3.1 or 4.1). I have reason to believe that
- Mark Snitily is incorrect about this. I am almost positive that he was seeing
- a bug in X386 (both 1.1 and 1.2) that we have fixed in X386 1.2E that caused
- X386 to dump core if you tried to kill it while it was not on the active VT (we
- have provided him the fix)." However, David notes that the bug allegedly
- involved has not been confirmed by third parties.
- More data on this as it becomes available.
-
- 3. DMAs on large ISA machines may fail
- On ISA machines with more than 16MB of RAM, SVr4 may try to do DMA
- from outside the bus's address space, causing serious problems. UNIX ought
- to do an in-memory copy to within the low 16MB but the USL base code doesn't.
- Dell says they've fixed this, and that's been confirmed by a user.
- UHC says they've fixed this; they add that the special buffer-allocation
- logic to handle the problem can be turned off with a tunable kernel parameter
- if you've got less than 16M.
- Microport says they've fixed this in their new 4.1 release, shipping early
- March.
- Esix offers a patch to correct this problem.
- SCO used to have a similar bug but fixed it long ago.
- John Sully <jms@mport.com> writes: "This was due to a bug in pre version 4
- dma code. The USL code has always at least attempted to do a copy from low
- memory to high memory on systems with more than 16Mb of RAM. By the way UHC is
- wrong; the buffer allocation code only comes into play if you have more than
- 16Mb of memory. You can turn it off if you have a machine (ie. an EISA bus)
- which will allow you to do DMA above 16Mb. You *must* have this tunable
- (MAXDMAPAGE) turned on if you are using *ISA* bus masters in a system with more
- than 16Mb of ram. Unfortunately doing this will affect all drivers which do
- dma as there is no good way to do this on a per-driver basis."
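
As a rough illustration of the check discussed above, the sketch below decides whether a transfer can be handed to an ISA DMA engine directly or must first be copied through a buffer below 16MB. The function name and the constant are hypothetical; this is not the USL kernel code, and it says nothing about how MAXDMAPAGE is really implemented.

```c
/* ISA DMA can address only the low 16MB of physical memory. */
#define ISA_DMA_LIMIT 0x1000000UL   /* hypothetical name for the 16MB boundary */

/*
 * Return 1 if the physical range [phys, phys + len) lies entirely below
 * the ISA limit, or 0 if the caller would have to bounce the data through
 * a buffer in low memory first.
 */
int isa_dma_reachable(unsigned long phys, unsigned long len)
{
    if (phys >= ISA_DMA_LIMIT)
        return 0;
    /* Compare against the remaining room so phys + len cannot overflow. */
    return len <= ISA_DMA_LIMIT - phys;
}
```

A driver-side caller would copy to a low buffer whenever this returns 0; on a bus that can DMA above 16MB the check is unnecessary.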
-
- 4. There is a cylinder limit on disk size
- Stock USL code is limited to 1,024 cylinders per Winchester, which
- might cause problems with some disk drives.
- Microport, Dell, Esix, MST, and UHC have fixed this.
- Bela Lubkin says "SCO's boot filesystem must lie below 1024 cylinder mark;
- anything else can be anywhere. This is more-or-less a limitation of the BIOS
- interface that the bootstrap loader must use. Could be circumvented by going
- directly to controller hardware in the bootstrap loader, but that would be
- horrendously complex with all the controllers & host adapters to be supported."
- This limit probably applies to all other UNIXes as well.
-
- 5. shmat(2) vs. vfork(2)
- The shmat(2) call is known to interact badly with vfork(2). Specifically,
- if you attach a shared-memory segment, vfork(), and then the child releases
- the segment, the parent loses it too! Workaround: use fork(2).
- UHC and Microport both suspect that they still have this bug and opine that
- anyone who uses vfork deserves to lose. Dell has no plans to fix it.
-
- John Sully <jms@mport.com> writes: "This is not a bug. It is completely
- consistent with the semantics of a change to the address space of the child.
- Think about it: any change to the address space of a child process created by
- vfork(2) is reflected in the parent since the child is actually executing in
- the parent's address space. Therefore if the child changes the address space
- (in this case by releasing the shared memory segment) what should happen?
- Right, the parent should have the same change happen. And what does happen?
- The segment is released in the parent. One can argue about the braindead
- semantics of vfork(2) all day, but the fact remains that this is exactly what
- one would expect to happen. To quote from the manual page:
-
- [...] vfork differs from fork in
- that the child borrows the parent's *memory* and thread of
- control until a call to execve or an exit (either by a call
- to exit or abnormally.) [ emphasis added ]
-
- and later:
-
- It does not work, however, to return while
- running in the child's context from the procedure which
- called vfork since the eventual return from vfork would then
- return to a no longer existent stack frame.
-
- Please note that the entire address space of the parent is used by
- the child created by vfork(2). The manual page also points out
- several other caveats involved in doing anything to the parent's
- address space except successfully calling an exec family function or
- _exit (note it specifically says *not* to call exit(2)). I do not believe
- that having a shared memory segment disappear from the parent's address
- space is out of line after reading the man page for vfork(2).
-
- It is interesting to note that Sun after implementing its new VM system in
- SunOS 4.0 initially had no plans to support vfork, since they felt that the COW
- semantics of the new fork would provide the necessary efficiency gain. Indeed
- they found that most programs which used vfork worked just fine by doing
- -Dvfork=fork. All that is, except for a certain popular command interpreter
- [ed: can you say C shell?]. So we are stuck with the legacy of this braindead
- system call.
-
- BTW, Microport has no plans to fix this :-)."
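
The fork(2) workaround can be sketched as follows. The parent attaches a private segment, a fork()ed child detaches its own copy of the mapping, and the parent then asks the kernel how many attachments remain. The function name is made up for illustration, and the sketch assumes a system with standard System V shared memory; it has not been run on any of the SVr4 ports listed here.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Attach a segment, let a fork()ed child detach it, and return the number
 * of attachments the kernel still reports afterwards. With fork() the
 * parent's attachment survives, so the expected answer is 1.
 * Returns -1 if any setup step fails.
 */
int attaches_after_child_detach(void)
{
    struct shmid_ds ds;
    int id, status, result = -1;
    void *seg;
    pid_t pid;

    id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (id < 0)
        return -1;

    seg = shmat(id, NULL, 0);
    if (seg == (void *)-1) {
        shmctl(id, IPC_RMID, NULL);
        return -1;
    }

    pid = fork();               /* fork(), not vfork(): child gets its own address space */
    if (pid == 0) {
        shmdt(seg);             /* detaches only the child's mapping */
        _exit(0);
    }

    if (pid > 0 && waitpid(pid, &status, 0) == pid &&
        shmctl(id, IPC_STAT, &ds) == 0)
        result = (int)ds.shm_nattch;

    shmdt(seg);                 /* clean up the parent's attachment */
    shmctl(id, IPC_RMID, NULL); /* and remove the segment itself */
    return result;
}
```

Swapping fork() for vfork() here would reproduce the behaviour described above, since the child's shmdt() would then act on the parent's address space.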
-
- 6. X11R4 performance problem
- Stock X11R4 is said to hog the processor if you use the
- LOCALCONNECT option.
-
- 7. UFS file system problems
- In stock USL 4.0.3, you can't use a UFS file system as the root; the
- system hangs if you try. Dell, Esix, Microport, MST and UHC have fixed this.
- David Aitken, the UNIX product manager at UHC, writes "The ufs as root file
- system [problem] was not really a bug, just a little oversight on USL's part -
- we have fixed it completely by adding one line to the /stand/boot script:
- rootfstype=ufs!" He adds that they've been using ufs on their lab machines for
- over 10 months with no trouble, and the latest UHC release defaults to ufs if
- you have more than 120MB of disk.
-
- 8. A security hole in login
- David Wexelblat <dwex@mtgzfs3.att.com> reports: "There is a HUGE security
- hole in /bin/login in all USL derived SVR4s before 4.0.4. Refer to CERT
- advisory CA-91:08, dated 5/23/91. This is known to be present in AT&T SVR4
- 2.1, and Microport SVR4 3.1. ESIX claims to have fixed it, Microport reports
- that it is fixed in 4.1. I won't give any more details unless necessary.
- Suffice to say that this bug allows any non-privileged user on an SVR4 system
- to get read-write access to any file on the system."
-
- 9. COFF problems with long filenames
- A source at Dell urges: "Our SVR4v2 did some stuff that USL didn't get
- around to until SVR4v4. Try Dell UNIX 2.1 with a COFF program on a large UFS
- filesystem in a directory with long names. Runs on Dell UNIX. Breaks on
- others." I don't have more definite info yet.
-
- 10. Flakeouts in the Wangtek device driver
- Dell reports that USL's Wangtek device driver is seriously flaky. "How'd
- you like a multi volume backup where the second and subsequent volumes don't
- follow on from the previous volumes?" UHC confirms this and is actively
- working on the problem.
- An anonymous SCOer says "The QIC02 tape controller `standard' is seriously
- flaky. Our driver's in pretty good shape but nobody will ever have a truly
- solid driver that supports every QIC02 controller you can find."
-
- 11. A kernel declaration bug
- A botch in USL's /etc/conf/pack.d/kernel/space.c (which is present in Dell,
- Microport 4.0.3 and 4.0.4 and may also be present in other SVr4s) can step on
- the linesw[] table. The problem is that the domain name array initialization
- is wrong and too short; thus, when it's set, data past the end of the array can
- be stomped. To fix this, find the following near line 247:
-
- char srpc_domain[] = SRPC_DOMAIN;
-
- and change it to
-
- char srpc_domain[SYS_NMLN] = SRPC_DOMAIN;
-
- then rebuild the kernel.
- Microport officially knows about this bug and plans to fix it in a
- near-future update release.
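
To see why the explicit dimension matters, here is a small user-space sketch. DEMO_NMLN and DEMO_DOMAIN are stand-ins whose values are assumed for illustration; they are not copied from the real SYS_NMLN or SRPC_DOMAIN definitions.

```c
#include <stddef.h>

#define DEMO_NMLN   257   /* stand-in for SYS_NMLN; value assumed */
#define DEMO_DOMAIN ""    /* stand-in for SRPC_DOMAIN; value assumed */

/* Sized by its initializer: room for the initial string and nothing more. */
char unsized_domain[] = DEMO_DOMAIN;

/* Sized explicitly: room for any name up to DEMO_NMLN - 1 characters. */
char sized_domain[DEMO_NMLN] = DEMO_DOMAIN;

size_t unsized_capacity(void) { return sizeof unsized_domain; }
size_t sized_capacity(void)   { return sizeof sized_domain; }
```

With an empty initializer the unsized array holds a single byte, so copying in any real domain name later writes past its end and into whatever the linker placed next.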
-
- 12. fread(3) does the wrong thing on pipes and FIFOs
- Ed Hall <edhall@rand.org> writes: "Unlike the raw read() system call,
- fread() is supposed to be able to make several partial read's to satisfy the
- data requested by its arguments. The exceptions are an EOF or an error on the
- stream. This characteristic is quite useful when moving data through pipes or
- over network connections, since partial reads are quite common in these cases.
- Well, the version of fread() in ESIX 4.0.3 (and likely other Sys5R4's) only
- does a single physical read, and if it only satisfies part of the requested
- number of bytes, that's all you get. This can sting you even if you carefully
- check the value returned by fread(), since the value returned is rounded down
- to the number of complete "nitems" read, although your position in the stream
- can be up to size-1 bytes beyond that point. Neither ferror() nor feof()
- indicate anything is wrong when this happens."
- This bug (which is also present in 4.0.4) is serious and nasty and should
- be high on every porting house's list to fix. It appears to be peculiar to
- USL 4.0.3 and 4.0.4; 4.0.2 does *not* have it, nor does SCO.
- A USL source claims it has been fixed in 4.1.
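
One possible application-level workaround is to loop until the full count arrives. The helper below is a sketch with a made-up name, not something taken from any vendor fix. It asks for items of size 1 so that the return value is always an exact byte count, which sidesteps the rounding problem described above. It has not been tried on an affected system.

```c
#include <stdio.h>

/*
 * Keep calling fread() until nbytes have arrived or the stream reports
 * end-of-file or an error. Returns the number of bytes actually stored.
 */
size_t fread_fully(void *buf, size_t nbytes, FILE *fp)
{
    char *p = buf;
    size_t got = 0;

    while (got < nbytes) {
        /* size 1 means the item count returned is a byte count */
        size_t n = fread(p + got, 1, nbytes - got, fp);
        if (n == 0)
            break;              /* EOF or error: caller checks feof()/ferror() */
        got += n;
    }
    return got;
}
```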
-
- 13. Process accounting is broken
- In 4.0.3, process accounting doesn't work. From examining the accounting
- scripts, it appears that /usr/lib/acct/accton is supposed to set a return code
- depending on whether accounting was switched on already or not. However, it
- always returns the same result - accounting switched off. This means that the
- /usr/lib/acct/ckpacct script, which is run every hour to keep the process
- accounting log in check, instead turns off accounting the first time it is run
- after booting. The same happens with the nightly /usr/lib/acct/monacct
- program.
- I don't yet know whether this bug is present in 4.0.4. It is definitely
- un-fixed in Dell 2.1.
-
- 14. tar(1) foos up in the presence of symbolic links
- Tar can get the names of symbolic links wrong when creating an archive.
- This bug can be demonstrated by doing the following:
-
- mkdir t
- cd t
- touch a 1234567890
- ln -s 1234567890 b
- ln -s a c
- tar vcf ../t.tar .
-
- The output generated by tar is:
-
- a ./ 0 tape blocks
- a ./a 0 tape blocks
- a ./1234567890 0 tape blocks
- a ./b symbolic link to 1234567890
- a ./c symbolic link to a234567890
-
- (Note the above commands should be done in the order shown and in a new
- directory) This bug is nasty. Recommended solution: use GNU tar.
- This is reported from Esix 4.0.3, but probably exists on other SVr4s
- as well.
-
- 15. Symbolic links can interfere with shellscript execution
- There is a problem running #! scripts when symbolic links are involved.
- Typing in the following from a command shell demonstrates the problem:
-
- mkdir a b
- ln -s a c
- cd a
- cat > script <<!
- #!/bin/sh
- echo Hello
- !
- chmod 755 script
- cd ../b
- ln -s ../c/script .
- ./script
-
- The message generated from the last line is:
-
- a/script: a/script: cannot open
-
- This is reported from Esix 4.0.3, but probably exists on other SVr4s
- as well.
-
- 16. Piping a csh builtin causes the shell to hang
- While running csh, this can be demonstrated by some of the following:
-
- echo Hello | cat
- history | more
-
- (A solution to this one is use tcsh-6.02.)
- This is reported from Esix 4.0.3, but probably exists on other SVr4s
- as well.
-
- 17. Quick port setup option in sysadm is broken
- In 4.0.3 sysadm, the quick port setup option, which is used to add and
- delete terminal ports, is seriously broken. The script modifies /etc/conf/*
- files, and has incorrect minor numbers, sets the 5th field of sdevice.d to Y
- when it should be N, and is missing columns for node.d. See
- /usr/sadm/sysadm/bin/q-add.
-
- 18. COFF binaries linked with curses(3) and shared libc hang, eating the CPU
- Cause unknown.
-
- 19. shl hangs, sxt devices bad
- shl(1) does not work. Try creating a layer and doing an 'ls'. Your session
- hangs. Bruce Momjian <root%candle.uucp@bts.com>, who reported this bug, says
- he believes it is the sxt devices which are broken.
-
- 20. num-lock prevents mouse from working properly
- When using the Motif window manager, if your num lock is on, your mouse
- clicks are not recognized by the window manager. The mouse still works in
- xterm(1). This is allegedly fixed in Destiny (4.2).
-
- 21. adjtime() doesn't work
- Hugh Stearns <hoyt@isus.tnet.com> reports that in 4.0.3.6 adjtime() doesn't.
- Calling `date -a' works to adjust the time slowly.
-
- 22. ttymon drops DTR
- Hugh Stearns <hoyt@isus.tnet.com> reports that in 4.0.3.6 the ttymon(1)
- utility for HDB uucp drops DTR every few weeks. The workaround is to disable
- and re-enable it.
-
- 23. cron mail doesn't go through aliasing
- Hugh Stearns <hoyt@isus.tnet.com> reports that in 4.0.3.6 cron mail to adm
- doesn't get redirected by an
-
- 23. fragility in xterm
- Hugh Stearns <hoyt@isus.tnet.com> reports that in 4.0.3.6, doing ~! from
- a cu in xterm kills xterm.
-
- 24. byte-order problem with NFS when accessing Sun disks
- Christoph Badura <bad@flatlin.ka.sub.org> notes that the stock USL resolver
- library suffers from serious confusion about the byte order in the
- sockaddr_in structure. This bug is acknowledged by USL for the 4.0.4
- release. A symptom of this bug is that Sun disks will not mount correctly over
- NFS. As a workaround, try removing the references to /usr/lib/resolv.so from
- /etc/netconfig and rebooting your system. Unfortunately, this will mean
- you can't use nameservers.
-
- 25. Under weird circumstances, lseek on UFS may cause corruption
- Christoph Badura <bad@flatlin.ka.sub.org> reports that a UFS lseek() to an
- offset which is a multiple of 4096 but not a multiple of 8192, followed by a
- write(), may corrupt the file being written. The bug shows up only if the
- file has no pages in the page pool associated with it at the seek offset and at
- 4k before the seek offset. He has sent USL a kernel fix for this.
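
The offset condition in this report is easy to test for. The predicate below is a sketch with a hypothetical name. It covers only the arithmetic part of the condition, not the page-pool state, so a hit means "possibly affected", nothing stronger.

```c
/*
 * Return 1 if offset is a multiple of 4096 but not of 8192, which is
 * the arithmetic precondition given in the report; 0 otherwise.
 */
int ufs_seek_offset_suspect(long offset)
{
    if (offset < 0)
        return 0;               /* not a valid absolute seek target */
    return offset % 4096 == 0 && offset % 8192 != 0;
}
```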
-
- 26. FIONREAD fails on regular files
- Christoph Badura <bad@flatlin.ka.sub.org> reports that the FIONREAD ioctl()
- fails on regular (disk) files. He has sent USL a one-line kernel fix.
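
Until a kernel fix is in place, a program that only needs the number of unread bytes in a regular file can compute it without FIONREAD. This is a sketch of that alternative using fstat(2) and lseek(2); the function name is made up, and it is my suggestion, not part of Christoph's fix.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Return the number of bytes between the current file offset and the end
 * of a regular file, or -1 if fd is not a regular file or a call fails.
 */
long regular_file_bytes_left(int fd)
{
    struct stat st;
    off_t pos;

    if (fstat(fd, &st) != 0) {
        perror("fstat");
        return -1;
    }
    if (!S_ISREG(st.st_mode))
        return -1;              /* pipes, ttys etc. still need FIONREAD */

    pos = lseek(fd, 0, SEEK_CUR);
    if (pos < 0) {
        perror("lseek");
        return -1;
    }
    return st.st_size > pos ? (long)(st.st_size - pos) : 0;
}
```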
-
- III. SCSI Support Problems
-
- 1. sar is confused by SCSI
- Sar -d doesn't work on SCSI drives. Dell fixed this in 2.1; no report of
- any other SVr4 having fixed this yet. SCO fixed it in 3.2.4.
-
- 2. A configuration problem
- Stock USL requires you to jumper your SCSI devices to fixed IDs
- during installation (it can be changed to any other ID after).
- Dell and UHC have fixed this. The requirement is definitely still present
- in Esix.
-
- 3. Synchronous SCSI hang problem
- David Wexelblat <dwex@mtgzfs3.att.com> reports: "Stock SVR4.0.3 will hang
- the SCSI bus with a 1542 in synchronous mode. Dell fixed this, and this has
- been given to Microport [ed note: Microport 4.0.4 fixed the problem; MST UNIX
- and Esix 4.0.3 still have this problem; I have not yet been able to determine
- if ESIX 4.0.4 does]. In the file /sbin/bcheckrc, change the line:
-
- echo MARK > /dev/rswap
-
- to
-
- echo MARK | dd of=/dev/rswap bs=512 conv=sync > /dev/null 2>&1
-
- The magic is apparently the conv=sync, which forces a 512 byte block
- to be written. The original echo writes 4 bytes, which apparently causes
- synchronous SCSI to go out to lunch.
-
- Now, you ask, how can I fix this, since the system won't boot? There are
- a couple of methods. First, if possible, disable synchronous negotiation
- (1542 jumper J5-1 removed, plus whatever you may need to do to your drive).
- Then boot up, edit /sbin/bcheckrc, then shutdown, restrap for synchronous,
- then reboot. Everything should be OK.
-
- That's the easy way. Unfortunately, some hard drives will only work
- in synchronous mode. Well, you can still recover from this phenomenon.
- Here's how:
-
- 1) Install on your hard drive
- 2) Boot from the first boot floppy. When it tells you to, insert
- the second boot floppy. At the first prompt, hit <DEL> to
- break out to a shell.
- 3) Mount your hard drive under /mnt with the following command
- (replace FS-TYPE with s5, s52, or ufs, whichever you used for
- your root partition):
-
- /etc/fs/FS-TYPE/mount /dev/dsk/c0t0d0s1 /mnt
-
- 4) Now edit /mnt/sbin/bcheckrc:
-
- ed /mnt/sbin/bcheckrc
-
- You may want the 'ed' man page handy (I barely remember how to
- use 'ed' :->). For simplicity, you can delete/comment out
- the offending line, then replace it with the correct line later.
- 5) Unmount the hard drive:
-
- umount /mnt
-
- 6) Reboot from the hard drive. Everything should come up OK, and
- you can finish editing /sbin/bcheckrc, if necessary.
-
- Note that you perform these actions at your own risk. The first version was
- performed by me on Microport SVR4, and the second was performed by someone
- else (on my suggestion) on ESIX SVR4."
-
- 4. ps chokes on commands that do SCSI I/O
- Hugh Stearns <hoyt@isus.tnet.com> reports that in 4.0.3.6, ps
- doesn't work when a SCSI command is in progress. It stops printing at the
- process executing the SCSI command.
-
- 5. csh lossage due to bad optimization
- If a csh user sources a non-existent file in their .cshrc (eg, source .alias,
- where .alias doesn't exist), then the system will hang for a couple of minutes.
- Eventually the user gets an "Out of memory" error and the console logs "NOTICE:
- out of swap space - Insufficient memory to allocate 2 pages - system call
- failed".
- This appears to be due to over-optimization of code surrounding a longjmp
- call.
-
- IV. Development Tools Problems
-
- 1. General UCB library brokenness
- The BSD compatibility libraries were badly broken in USL code. A Dell
- source adds "That meant that almost all the apps derived from them were broken
- too. Most stuff like automount will die when you send a SIGHUP, instead of
- rereading the map file. You can get a system into very strange states when
- that happens."
- John Sully <jms@mport> of Microport opines: "This is a bug in automount
- itself rather than BSD compatibility, since the automount which comes with SVR4
- is not compiled with the BSD libraries. (isn't this comforting?? :-()."
-
- Esix and UHC's BSD libraries are USL stock. I don't yet know
- the status of other ports. Microport has run into things they think may be
- symptoms of this but have no fix yet.
-
- John Sully <jms@mport> of Microport counters with: "One common thread I find
- on reading of these problems is that the BSD compatibility libraries are
- *misused*. [...] The problem is that BSD and SYSV have similarly named .h files
- which sometimes contain different definitions for objects with the same name.
- This has been known to cause all sorts of problems because the SYSV headers are
- picked up and then the calls are satisfied from the BSD library rather than the
- shared object library. I have found that if you use /usr/ucb/cc that the BSD
- compatibility is much less broken than it would seem at first because it
- ensures that the correct headers are picked up."
-
- However, note that there is at least one *real* bug known --- as of 4.0.4
- the signal emulation cannot explicitly set a handler to SIG_DFL or SIG_IGN.
-
- Ron Guilmette <rfg@ncd.com> writes "[Library lossage] may be easily
- demonstrated by attempting to build and link the GNU C compiler with
- `-L/usr/ucblib -lucb'. The resulting compiler will most certainly
- crash and die." John Sully thinks this is because the /usr/ucb/cc
- compiler should have been used, but wasn't.
-
- 2. USL emulation of BSD signals doesn't work
- A different source reports that the USL implementation of BSD signals
- is broken in both 4.0.3 and 4.0.4; in particular, the sigvec() family doesn't
- work properly. It is possible to make minor tweaks to source to make such apps
- work properly with the native USL signals implementation.
-
- Here's more on the signals problem, thanks to Richard <rc@siesoft.co.uk>:
- ------------------------------------------------------------------------------
- The problem is to do with the signal() function that is within the BSD
- compatibility libc.
-
- To reproduce the problem do the following:
-
- #include <stdio.h>
- #include <sys/types.h>
- #include <signal.h>
- #include <sys/siginfo.h>
-
- main()
- {
-     signal(SIGPIPE,SIG_IGN);
-     pause();
- }
-
- and compile it with cc xx.c -o xx /usr/ucblib/libucb.a
-
- (John Sully observes that this is definitely wrong; /usr/ucb/cc should have
- been used rather than "cc ... -L/usr/ucblib -lucb" or the equivalent "cc ...
- /usr/ucblib/libucb.a".)
-
- If you run the program and then signal it with a SIGPIPE, the program
- will die, even though you've told it to ignore SIGPIPE.
-
- The fix is difficult unless you've got source cos there's a missing 'else'
- clause from the signal() code. This is the only signal fault I've found in
- the BSD signal functions, details of the rumoured sigvec problem would be
- useful?
-
- If you're trying to compile an application you could change the application
- code to do the following, this does work..
-
- void
- catch(s)
-     int s;
- {
-     /* DO NOTHING */
-     ;
- }
-
- main()
- {
-     signal(SIGPIPE,catch);
-     pause();
- }
-
- SUMMARY
- You can only change a signal handler to a function handler, any number of
- times. Any attempt to set the handler to SIG_DFL, or SIG_IGN will fail.
-
- This bug has given some people working with X11R5 aggro, causing the X server
- to die when you close a client.
-
- Christoph Badura <bad@flatlin.ka.sub.org> confirms this bug.
- He has sent USL a source fix.
- ------------------------------------------------------------------------------
-
- 3. Possible string library problems
- There are also persistent rumors of problems in the BSD-emulation string
- libraries. I have not been able to pin down specifics on this.
-
- 4. Compiler problems
- Ronald Guilmette <rfg@ncd.com> also reports the following:
-
- ------------------------------------------------------------------------------
- /* Here is a bug in the original SVR4 C compiler (aka C Issue 5) which
- effectively prevents you from making good use of the `const' and
- `volatile' qualifiers defined by ANSI C in conjunction with pointer
- types and typedef statements. Compile this code and you will get:
-
- "qualifiers.c", line 23: left operand must be modifiable lvalue: op "="
-
- ...if your copy of the svr4 C compiler still has the bug. Note that
- given these declarations, the ANSI C standard says that the thing pointed
- to by the variable `pci' should be considered to be constant... not the
- variable `pci' itself. (The GCC compiler, either version 1.x or version
- 2.x, correctly compiles this example without complaint.)
- */
-
- typedef const int *ptr_to_const_int;
-
- ptr_to_const_int pci;
-
- int i;
-
- void main ()
- {
- pci = &i;
- }
- ------------------------------------------------------------------------------
- /* Here is a subtle bug in the original SVR4 C compiler (aka C Issue 5)
- which prevents you from first declaring a tagged type (i.e. a struct
- type or a union type) in a parameter list, and then defining that tagged
- type later on within the same scope. (Note that according to the ANSI C
- standard, the scope in which parameters get declared and the outermost
- block of a function body are one and the same scope. Thus, this really
- is legal ANSI C code!)
-
- Try compiling this with your C compiler on SVR4. If your compiler still
- has the bug, you will get:
-
- "tagged_type.c", line 24: warning: dubious tag declaration: struct S
- "tagged_type.c", line 28: warning: improper member use: i
- "tagged_type.c", line 28: warning: improper member use: i
- "tagged_type.c", line 31: warning: dubious tag declaration: struct S
- "tagged_type.c", line 35: warning: improper member use: i
- "tagged_type.c", line 35: warning: improper member use: i
-
- (The GCC compiler also had this bug in version 1.x, but it has been fixed
- in version 2.x.)
- */
-
- void foobar1 (arg) /* use old-style without prototypes */
- struct S *arg;
- {
- struct S { int i; }; /* define the type `struct S' */
-
- arg->i = arg->i; /* legal according to ANSI C rules! */
- }
-
- void foobar2 (struct S *arg) /* use new-style with prototypes */
- {
- struct S { int i; }; /* define the type `struct S' */
-
- arg->i = arg->i; /* legal according to ANSI C rules! */
- }
- ------------------------------------------------------------------------------
- /* Here is a serious bug in the original SVR4 `dump' program which dumps
- out parts of object files in either plain hex form or symbolically.
-
- To see the `dump' program get a segfault and die, save this code under
- the name `dump-bug.c' and then do:
-
- cc -g -c dump-bug.c
- dump -v -D dump-bug.o
-
- The bug arises whenever `dump' tries to read Dwarf debugging information
- for an array of pointers to any "user defined" type (e.g. `struct S' in
- this example). Past that point, `dump' is totally confused, so further
- Dwarf debugging information finally causes it to go belly-up.
- */
-
- struct S { int i; };
- struct S *array[10];
- int j;
- ------------------------------------------------------------------------------
- It appears that the svr4 C compiler (for x86 machines) doesn't conform real
- well to either the letter or the spirit of the IEEE 754 floating-point
- standard. In particular, "unordered comparisons" and other operations on
- NaNs don't always produce the result that the IEEE 754 standard calls
- for.
-
- An AT&T source comments: "This is documented in the SVID as a future direction.
- We do not support NaNs in -Xa and -Xt modes, only in -Xc. Try
- isnan(sqrt(-1.0)) to determine which modes support it."
- ------------------------------------------------------------------------------
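
A quick way to probe the unordered-comparison behaviour in a given compiler mode is sketched below. The function name is made up. It manufactures a NaN by dividing zero by zero at run time, not with the sqrt(-1.0) call suggested above, purely so that the example does not need the math library. IEEE 754 requires every ordered comparison involving a NaN to be false and requires x != x to be true.

```c
/*
 * Return 1 if comparisons on a NaN behave as IEEE 754 requires,
 * 0 if any of them gives the wrong answer.
 */
int nan_comparisons_follow_ieee(void)
{
    volatile double zero = 0.0;     /* volatile defeats compile-time folding */
    double x = zero / zero;         /* NaN on IEEE 754 hardware */

    if (x == x)
        return 0;                   /* a NaN must not equal itself */
    if (!(x != x))
        return 0;                   /* ...and must compare unequal to itself */
    if (x < 1.0 || x > 1.0 || x <= 1.0 || x >= 1.0)
        return 0;                   /* ordered comparisons must all be false */
    return 1;
}
```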
-
- Both 4.0.3 and 4.0.4 USL versions are missing the documented dial.h
- file from their /usr/include directory. Dell 2.1 has it.
-
- V. The FUBYTE bug
-
- (Thanks to Christoph Badura <bad@flatlin.ka.sub.org> for this info)
-
- The kernel function fubyte() is documented to return a positive value when
- given a valid user space address and -1 otherwise. In the latter case u.u_error
- is set to EFAULT. USL SysV R4.0.3 has a sign extension bug in the
- implementation of fubyte() for local file descriptors (i.e. not opened via
- RFS), which causes fubyte() to return negative values if the byte fetched has
- its high bit set. This bug doesn't affect STREAMS drivers, as they don't call
- (and in fact are normally unable to call) fubyte(). Thus writing a byte with
- the high bit set to certain character device drivers returns with -1 and errno
- set to EFAULT.
-
- The bug may affect any character device driver that calls fubyte(). It's not
- limited to serial card drivers. The bug is noticed most often with serial card
- drivers, since uucp uses byte values > 127 very early during g-protocol setup
- and drivers for serial cards tend to use fubyte() quite often.
-
- Note also that the bug's effect is different if the driver checks for a -1
- return value of fubyte() or just a negative one. In the former case it is
- possible to pass bytes with the 8th bit set through fubyte(), except for 0xff
- which is -1 in two's complement. That makes the bug more obscure.
-
- The fix is easy. First, make a backup copy of the kernel object file
- /etc/conf/pack.d/kernel/vm.o! A disassembly of vm.o(lfubyte) should reveal
- *exactly* one mov[s]bl (move byte to long w/sign extend). That one needs to be
- patched into a movzbl (zero extend). The difference is one bit in the second
- byte of the opcode.
-
- The movsbl has the bit pattern 00001111 1011111w mod/rm-byte.
- The movzbl has the bit pattern 00001111 1011011w mod/rm-byte.
-
- The 'w' bit is 0 for the instruction in question. So the opcodes are 0f be and
- 0f b6. Here is the diff -c from dis -F lfubyte showing the patch applied to
- the Dell 2.1 kernel:
-
- *** vm.o Mon Mar 9 00:31:38 1992
- --- vm.o.org Mon Mar 9 00:32:40 1992
- ***************
- *** 22,28 ****
- 11c90: 85 c0 testl %eax,%eax
- 11c92: 75 09 jne 0x9 <11c9d>
- 11c94: 8b 45 08 movl 8(%ebp),%eax
- ! 11c97: 0f b6 00 movzbl (%eax),%eax
- 11c9a: 89 45 fc movl %eax,-4(%ebp)
- 11c9d: c7 05 d8 13 00 00 00 00 00 00 movl $0x0,0x13d8
- 11ca7: 83 3d dc 13 00 00 00 cmpl $0x0,0x13dc
- --- 22,28 ----
- 11c90: 85 c0 testl %eax,%eax
- 11c92: 75 09 jne 0x9 <11c9d>
- 11c94: 8b 45 08 movl 8(%ebp),%eax
- ! 11c97: 0f be 00 movsbl (%eax),%eax
- 11c9a: 89 45 fc movl %eax,-4(%ebp)
- 11c9d: c7 05 d8 13 00 00 00 00 00 00 movl $0x0,0x13d8
- 11ca7: 83 3d dc 13 00 00 00 cmpl $0x0,0x13dc
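
The one-byte change in the diff can be expressed as a small helper that refuses to touch anything other than the expected 0f be pair. This is only a sketch operating on an in-memory copy of the object file; the function and macro names are made up, and reading and writing vm.o is left to the caller.

```c
#include <stddef.h>

#define OP_ESCAPE 0x0f   /* first opcode byte shared by both instructions */
#define OP_MOVSBL 0xbe   /* second byte: move byte to long, sign extend */
#define OP_MOVZBL 0xb6   /* second byte: move byte to long, zero extend */

/*
 * Turn the movsbl whose first opcode byte sits at image[off] into a movzbl.
 * Returns 0 on success, -1 if the two bytes there are not 0f be.
 */
int patch_movsbl_to_movzbl(unsigned char *image, size_t len, size_t off)
{
    if (len < 2 || off > len - 2)
        return -1;                      /* both opcode bytes must be in range */
    if (image[off] != OP_ESCAPE || image[off + 1] != OP_MOVSBL)
        return -1;                      /* not the instruction we expected */
    image[off + 1] = OP_MOVZBL;         /* clears the single differing bit, 0x08 */
    return 0;
}
```

Checking the bytes before writing is the programmatic equivalent of verifying the disassembly first: if the object file differs from the one described, the patch is refused.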
-
- Of course there is a workaround at the driver level. Canonically, one would do
- this by checking for fubyte() returning -1 *and* u.u_error being set to EFAULT
- (u.u_error is cleared upon entering a system call). However, in R4.0.3
- fubyte() does NOT set u.u_error. It *does* set u.u_fault_catch.fc_errno.
-
- Christoph reports that Dell V.4 can be object-patched successfully to fix this.
- I do not know the status of the other ports.
-
- Another poster (Marc Boucher <marc@cam.org>) adds:
-
- On ESIX SVR4.0.3 Rev. A, the instruction movsbl in question can be changed to
- movzbl (as described above) with a binary-editor on file
- /etc/conf/pack.d/kernel/vm.o. At offset 0x11eb0, change 0xbe to 0xb6.
-
- Before patching, verify that your /etc/conf/pack.d/kernel/vm.o is the same as
- mine! On my system, the /bin/sum generated checksum of vm.o was "4440 222".
-
- The problem results from a sign-extension bug. The function lfubyte(), which
- is called by fubyte(), is declared as
-
- int lfubyte(char *addr); /* actually caddr_t */
-
- The byte is fetched with
-
- val = *addr;
-
- which triggers sign extension. Casting addr to an unsigned char * or declaring
- it as such solves the problem.
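
The effect is easy to reproduce in user space. The two functions below are illustrative stand-ins, not the kernel's lfubyte(). The first reads through a signed char pointer, which is what a plain char pointer amounts to on the 386 compilers discussed here, and the second reads through an unsigned char pointer as the suggested fix does.

```c
/* Fetch a byte the buggy way: values 0x80..0xff come back negative. */
int fetch_sign_extended(const char *addr)
{
    return *(const signed char *)addr;
}

/* Fetch a byte the fixed way: the result is always in 0..255. */
int fetch_zero_extended(const char *addr)
{
    return *(const unsigned char *)addr;
}
```

With the first version a byte of 0xff is indistinguishable from the -1 error return, and any byte with the high bit set looks like an error to a driver that tests for a negative result.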
-
- This bug is still present in stock USL 4.0.4.
-
- DESTINY AND DELL
-
- A source at UNIX System Labs Europe claims that `Destiny' (the new Release
- 4.2) incorporates all of Dell UNIX's fixes to 4.0.3; thus, any bug for which a
- Dell fix is indicated above should be gone in Destiny.
- --
- Send your feedback to: Eric Raymond = esr@snark.thyrsus.com
-