- <!-- Forthmacs Formatter generated HTML output -->
- <html>
- <head>
- <title>System Debugging Techniques</title>
- </head>
- <body>
- <h1>System Debugging Techniques</h1>
- <hr>
- <p>
- This chapter relates some of my techniques for debugging computer system
- problems, both hardware and software. In many cases, the hardest problem is to
- decide whether hardware or software is at fault.
- <p>
- The discussion focuses on debugging "generic" problems that happen on more than
- one system. Fixing individual bad boards is a different problem.
- <p>
- There are no rules or procedures, because every problem is different. Instead,
- this is a "bag of tricks".
- <p>
- <p>
- <h2>Find the WORST system and hang on to it</h2>
- <p>
- The system which fails the MOST frequently is your best clue.
- <p>
- <p>
- <h2>Eliminate irrelevant factors</h2>
- <p>
- Simplify the setup. Try to find the simplest configuration that still fails: remove
- boards that don't seem to be related to the problem, and turn off test processes
- that don't affect it.
- <p>
- <p>
- <h2>Make it fail stand-alone</h2>
- <p>
- It is very difficult to debug hardware problems under Unix because Unix takes a
- long time to boot, exercises a whole lot of the system all the time, and often
- crashes when the hardware is broken.
- <p>
- Running RiscOS makes life much easier.
- <p>
- Try to cause the failure running stand-alone Forth. Write a Forth program to
- exercise the interesting parts of the system (or better yet, adapt an existing
- program). Even when running stand-alone, turn off tests that seem to be
- irrelevant.
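<p>
As a rough illustration of such an exerciser, the sketch below fills a block of
memory with a pattern and counts the cells that read back wrong. It uses
ANS-style Forth words and may need adapting to your Forth. The names "buf",
"#cells", "fill-buf", "check-buf" and "exercise" are hypothetical, and on a real
system you would aim the test at the suspect memory or device rather than at a
buffer in the dictionary.
<pre>
\ Minimal sketch; all names here are made up for illustration.
256 constant #cells                 \ size of the region under test, in cells
create buf  #cells cells allot      \ stand-in for the suspect memory

: fill-buf  ( pattern -- )          \ write the pattern to every cell
   #cells 0 do  dup buf i cells + !  loop  drop ;

: check-buf  ( pattern -- #errors ) \ count cells that do not read back
   0  #cells 0 do
      over buf i cells + @ =  0= if  1+  then
   loop  nip ;

: exercise  ( pattern -- #errors )  \ one write-then-verify pass
   dup fill-buf check-buf ;
</pre>
<p>
Typing "hex 55aa exercise . decimal" at the keyboard runs one pass and prints
the error count, which is 0 when every cell reads back correctly.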
- <p>
- Forth can still run when the hardware is crippled, and it touches only a small
- part of the hardware unless you tell it to do more.
- <p>
- <p>
- <h2>Try to increase the failure rate</h2>
- <p>
- The faster it fails, the easier it is to see on a scope and the easier it is to
- trigger a logic analyzer.
- <p>
- If it only fails once an hour, you aren't likely to solve it, so try to make it
- fail faster.
- <p>
- Try out lots of ideas for making it fail faster. Forth is good for this because
- you can try out stuff directly from the keyboard.
- <p>
- The things you have to do to make it fail faster give you important clues to the
- problem.
- <p>
- After you have figured out the cause of the problem, the test program that makes
- it fail quickly is a good testbed for proposed fixes.
- <p>
- <p>
- <h2>Attack with both software and hardware weapons</h2>
- <p>
- Excessive reliance on a logic analyzer will slow you down. The logic analyzer
- is most useful after you have already narrowed down the problem pretty far.
- <p>
- Try software first. Increasing the failure frequency is the number one
- priority. Once you can get it to fail quickly and repeatably, you can get
- out the logic analyzer, or, if it fails quickly enough, a scope may be
- sufficient.
- <p>
- Try simple tests first. Often a problem can be triggered by simply repeating a
- command over and over.
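<p>
One way to repeat a command over and over from the keyboard is a small looping
word like the sketch below. It assumes ANS-style Forth, and it assumes the
command is wrapped in a test word that leaves true when all is well and false
on failure. The names "hammer" and "passes" are hypothetical.
<pre>
variable passes                     \ number of successful repetitions so far

: hammer  ( xt -- )                 \ xt: test word with stack effect ( -- ok? )
   0 passes !
   begin  dup execute  while        \ run the test; keep going while it passes
      1 passes +!
   repeat
   drop  ." failed after " passes @ . ." passes" cr ;
</pre>
<p>
With a test word of your own called, say, "my-test", typing "' my-test hammer"
runs it until the first failure and then reports how many passes it survived.
The loop does not stop by itself if the test never fails.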
- <p>
- Sometimes it is very difficult to get the logic analyzer to trigger on the right
- event. Software can be very useful here. Think of ways to make a very specific
- event happen concurrently with the problem. Special data patterns may be
- helpful.
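<p>
For example, software can write a distinctive value to a known location at the
moment it detects the failure, giving the analyzer something unambiguous to
trigger on. The sketch below assumes ANS-style Forth with cells of at least 32
bits. The names "trig-cell", "trigger-pattern", "mark" and "watch" are
hypothetical, and "trig-cell" stands in for whatever address your analyzer can
actually see.
<pre>
hex
variable trig-cell                  \ stand-in for an address the analyzer watches
deadbeef constant trigger-pattern   \ distinctive value; needs 32-bit or wider cells
decimal

: mark  ( -- )                      \ make the trigger event happen
   trigger-pattern trig-cell ! ;

: watch  ( xt -- )                  \ xt: test word with stack effect ( -- ok? )
   begin  dup execute  0= until     \ repeat until the test reports a failure
   mark  drop ;                     \ then write the pattern right away
</pre>
<p>
Set the analyzer to trigger on a write of the pattern to that address, then
start the loop with "' my-test watch", where "my-test" is your own test word.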
- <p>
- Don't be content with "quick answers".
- <p>
- If you haven't discovered EXACTLY what is going wrong, at the lowest level, the
- problem will come back to haunt you.
- <p>
- "Well, if we do this and this, the problem seems to go away". This is not a
- solution to the problem; it is hand-waving.
- <p>
- <p>
- <h2>Assumptions are your enemy. Collect data, not assumptions</h2>
- <p>
- "We think the problem is thus-and-so" is not data.
- <p>
- "I performed this test under these conditions and the result was x" is data.
- <p>
- <p>
- <h2> Be wary when giving status reports!!!</h2>
- <p>
- Before the problem is really solved, if you tell people what you think it MIGHT
- be, the rumor mill will sometimes twist your words. You can end up spending
- more time combating twisted rumors than solving the problem.
- <p>
- Every few hours, your boss will ask you what the solution is. Say "I don't know
- yet". He will ask again, using different wording. Say "I don't know yet, but I
- will tell you as soon as I know".
- <p>
- <p>
- <h2>Last Resort</h2>
- <p>
- If all else fails, try to push the problem off onto someone else.
- <p>
- </body>
- </html>
-