- I seem to have run into a real hardware bug in the Dell System 325
- Chips & Technologies 8259 clone interrupt controller.
-
- Summary:
-
- Sending this interrupt controller a Non Specific End of Interrupt
- (EOI) command causes it to reset all In Service Register (ISR) bits
- instead of only the highest-priority bit that is currently set.
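-
- The difference can be modeled in a few lines of C. This is only a
- sketch under assumptions: the function names are made up for
- illustration, the model assumes the default fully nested priority
- order (IR0 highest), and it touches no real hardware.

```c
#include <stdint.h>

/* Non-specific EOI as a real 8259 in fully nested mode handles it:
 * only the highest-priority in-service bit is cleared. IR0 has the
 * highest priority by default, so that is the lowest set bit. */
uint8_t eoi_nonspecific_8259(uint8_t isr)
{
    return (uint8_t)(isr & (isr - 1));  /* drop the lowest set bit */
}

/* The behavior reported in this post for the C&T clone:
 * every in-service bit is cleared at once. */
uint8_t eoi_nonspecific_clone(uint8_t isr)
{
    (void)isr;
    return 0;
}
```

- With the ISR at 21h (IR5 and IR0 in service), the first function
- leaves 20h, so IR5 stays in service; the second leaves 0.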
-
- Long Winded Explanation:
-
- I had a serious bug with my high-performance Western Digital wd80x3
- packet driver. Transmitting and receiving on it at high rates caused
- it to go west in many different ways. After tearing my hair out for a
- while, I added logging code which logged all procedure entries and
- exits along with detailed chip status in a large ring. I finally
- discovered that the impossible was happening. During my packet copy
- routine (which can take > 1.5ms to copy a 1500 byte packet), my
- interrupt handler was being reentered and was trashing the stack.
- This "shouldn't happen" since I was not giving an EOI command to the
- interrupt controller until the very end of the interrupt handler. The
- handler did, however, reenable processor and ethernet chip interrupts
- fairly early.
-
- After tearing my hair out some more and checking my code thoroughly,
- I decided that I must be getting some other interrupt in the middle of
- my code somewhere. I added some more logging code to record interrupt
- controller status, and changed my packet copy routine to enable
- processor interrupts AFTER the copy instead of before it. Sure
- enough, at the point the bug hit, my log indicated that the timer had
- fired and a timer interrupt was now pending. Also, I had received a
- new packet, and the ethernet chip also had a new interrupt pending
- although it was still blocked because it already had an interrupt in
- service. However, immediately after reenabling processor interrupts,
- my log indicated that my interrupt handler was reentered. This
- indicated to me that the timer interrupt handler was somehow resetting
- not only its ISR bit but mine also.
-
- After disassembling the timer interrupt handler, I determined that the
- only thing it was doing was sending a Non Specific EOI to the primary
- interrupt controller (using mov al,20h; out 20h,al). To make my case
- for a hardware bug even stronger, I next coded my own timer interrupt
- handler. Just before and just after the mov/out instructions, I made
- log entries. Sure enough, while my log showed the ISR register
- reading 21h (IR5 and IR0 in service) just before the EOI was sent, it
- read 0 just after. I then changed the code to use a specific EOI
- instruction to reset the timer interrupt instead of a non specific
- EOI. The problem went away!
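-
- For reference, the two commands differ only in the OCW2 byte written
- to port 20h: a non specific EOI is 20h, while a specific EOI is 60h
- plus the IR level (so 60h for the timer on IR0). The C sketch below
- builds that byte and models its effect on the ISR. The helper names
- are hypothetical, and this is a model, not the code from the driver.

```c
#include <stdint.h>

#define OCW2_NONSPECIFIC_EOI 0x20  /* what the BIOS timer handler sends */
#define OCW2_SPECIFIC_EOI    0x60  /* low three bits select the IR level */

/* Build the OCW2 byte that resets exactly one in-service bit. */
uint8_t ocw2_specific_eoi(unsigned level)
{
    return (uint8_t)(OCW2_SPECIFIC_EOI | (level & 7u));
}

/* Model of a specific EOI: only the named ISR bit is cleared. */
uint8_t apply_specific_eoi(uint8_t isr, uint8_t ocw2)
{
    return (uint8_t)(isr & ~(1u << (ocw2 & 7u)));
}
```

- In assembler terms, the timer handler would send
- mov al,60h; out 20h,al instead of mov al,20h; out 20h,al.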
-
- Finally, I tested the code with a non specific EOI on a stock IBM
- PC/AT with a real Intel 8259. It didn't exhibit the problem.
-
- Since I can't change the real timer interrupt handler (it's in the BIOS), I
- had to use a different workaround. Just before reenabling processor
- interrupts, I now disable further ethernet device interrupts by
- setting its Interrupt Mask Register (IMR) bit. At the end of the
- interrupt handler, I reset the bit. This ensures that I can't get
- further device interrupts even if the timer interrupt clears my ISR bit.
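-
- Why the mask helps can be seen in a small software model of the
- controller. This is a sketch under assumptions: struct pic and the
- function names are invented for illustration, priorities are the
- default fully nested order, and a real driver would read and write
- the IMR through port 21h on the primary controller instead.

```c
#include <stdint.h>

/* Hypothetical model of one 8259: request, in-service, mask registers. */
struct pic { uint8_t irr, isr, imr; };

/* Would the controller pass this IR line on to the processor?
 * It must be requested, unmasked, and not blocked by an in-service
 * interrupt of equal or higher priority (bits 0..irq of the ISR). */
int can_deliver(const struct pic *p, unsigned irq)
{
    uint8_t bit = (uint8_t)(1u << irq);
    uint8_t blockers = (uint8_t)((bit << 1) - 1);  /* bits 0..irq */

    if (!(p->irr & bit)) return 0;   /* nothing pending on this line */
    if (p->imr & bit)    return 0;   /* masked in the IMR */
    if (p->isr & blockers) return 0; /* blocked by an in-service bit */
    return 1;
}

/* Set or clear one IMR bit, as done around the packet copy. */
void mask_irq(struct pic *p, unsigned irq)   { p->imr |= (uint8_t)(1u << irq); }
void unmask_irq(struct pic *p, unsigned irq) { p->imr &= (uint8_t)~(1u << irq); }
```

- In this model, once the clone wipes the ISR, a pending IR5 is
- deliverable again unless its IMR bit is set, which is the case the
- workaround closes off.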
-
- Synopsis:
-
- If you write high performance drivers where:
- 1. The interrupt handler runs with other interrupts enabled at the
- processor and at the interrupt controller,
- 2. The interrupt handler reenables interrupts at the device while 1.
- is true and further device interrupts are possible before the
- interrupt handler again disables interrupts and returns,
- 3. You want the driver to work in clones with C&T chips,
-
- then you had better use a technique like the one I use to guarantee that
- you can't get reentrant interrupts.
-
- Drew
-
-
-