@DATABASE AmigaMail
@NODE MAIN "I-47: Debugging with Enforcer and Mungwall"
@TOC "Table_of_Contents/I"

by Carolyn Scheppner

It is almost impossible for a human being to develop software without including some bugs. Some bugs make themselves known rather quickly. Other bugs are not so easy to notice. For example, two hard-to-find bugs that sometimes slip past developers are using uninitialized pointers and using memory which has already been freed. These bugs often reference semi-random memory, which contains semi-random data. Typically the behavior of the software depends on the data passed to it. If that data is erratic, the behavior of the software will also be erratic. Because the behavior is erratic and unpredictable, the problem is very hard to spot. Quite often, bugs like these go unnoticed until software is running in a real user's multitasking environment.

Fortunately, debugging tools like Enforcer and Mungwall (and for assembler programmers, Scratch by Bill Hawes) help uncover hidden bugs.

Enforcer is a debugging tool written by Bryce Nesbitt. Its job is to report any attempts to access regions of the Amiga address space that are off-limits to applications. These off-limits accesses include reading and writing to the lowest 1K range of memory (except for reading ExecBase from $4), writing to the Kickstart ROM, and reading and writing to non-existent memory ranges.

Enforcer requires that the CPU has access to an MMU (Memory Management Unit) to catch reads and writes to memory. Most 68020 based accelerator boards have MMU chips on them. An MMU is built into each 68030 and 68040 CPU (currently Enforcer does not support the 68040).

When an application tries to read a memory location that has been read-protected by Enforcer, Enforcer intercepts that memory read, reports the illegal memory access (known as an ``Enforcer hit''), and shows the application a zero instead of the contents of the memory address. The application has no idea that Enforcer did anything.
When an application tries to write to a memory location that has been write-protected by Enforcer, Enforcer prevents the illegal memory write and issues an Enforcer hit.

Enforcer is even more powerful when used in combination with Mungwall. Mungwall was written by Ewout Walraven and is based on Memmung by Bryce Nesbitt and Memwall by Randell Jesup.

The ``mung'' part of Mungwall fills all of free memory (and all subsequently freed memory) with a large, odd, 32-bit value. An odd value is likely to cause serious problems for any program that uses wild or uninitialized pointers, or uses memory after it has been freed.

Unlike Enforcer, Mungwall does not require any special hardware. Mungwall can run without Enforcer and on non-MMU machines.

Mungwall uses several special 32-bit values to ``mung'' memory, which helps diagnose problems:

    Except when Enforcer is running, Mungwall sets location zero to $C0DEDBAD. Normally, location zero is $00000000. By putting an odd, non-zero value in location zero, any erroneous references to location zero are much more likely to show themselves. For example, a program that references location zero as a character array will see a string that starts with the ASCII values $C0 $DE $DB $AD, rather than seeing a NULL string.
    When Mungwall starts up, it sets all free memory to $ABADCAFE. If this number shows up while an application is running, it is likely that someone is referencing memory in the free list.
    When a program allocates memory, Mungwall sets that memory to $DEADF00D (except when the memory is allocated with MEMF_CLEAR). When an application accidentally accesses its memory before initializing it, the application will find the well-known value $DEADF00D rather than some random value that happened to be left in memory.
    Mungwall fills deallocated memory with $DEADBEEF, which makes bugs that use freed memory much more obvious.
The ``wall'' part of Mungwall allocates extra memory before and after every memory allocation and fills this memory ``wall'' with a fill pattern (normally $BB) and some information Mungwall uses to perform certain tests on an application's memory blocks:

    When an application deallocates memory, Mungwall reports when the size of the deallocated memory block does not match its size when it was allocated.
    Mungwall reports when a memory block's ``walls'' have been overwritten.
    Mungwall reports allocations and deallocations of memory blocks that are zero bytes long and deallocations of memory blocks that start at location zero.

Mungwall has an option to ``snoop'' and report on all memory allocations and deallocations for all tasks or specific tasks. This feature can be useful when tracking down memory losses. Because the snoop option can generate so many reports, the output can be run through the snoopstrip program which will throw away all matching allocate/deallocate pairs.

 @{" The Debugging Setup " link I-47-1}
 @{" Sample Enforcer Output " link I-47-2}
 @{" Sample Mungwall Output " link I-47-3}
 @{" Using Enforcer and Mungwall Together " link I-47-4}
 @{" A Sample Debugging Session " link I-47-5}
 @{" More Remote Debugging Tips " link I-47-6}
 @{" Who Should Use Enforcer and Mungwall " link I-47-7}

@ENDNODE

@NODE I-47-1 "The Debugging Setup"

Enforcer and Mungwall both output their debugging information to the serial port at the baud rate to which the Amiga's serial hardware is currently set. After powerup, the serial hardware is set to 9600 baud. You can use a terminal package to alter the current baud rate setting.

To set up an Amiga for serial debugging, connect your Amiga's serial port via a NULL-modem serial cable to a terminal. The best debugging setup is to connect your Amiga to another computer running a terminal package.
Ideally the terminal package should have an ASCII capture mode so it can capture all the serial debugging output and save it to a file for examination. The terminal package should also beep when it receives a CTRL-G, as both Enforcer and Mungwall send a CTRL-G beep with each Enforcer/Mungwall hit.

There are several less effective alternatives to using Enforcer and Mungwall with a remote terminal. Both can send their output to a serial printer. There are special versions of both Enforcer and Mungwall that send their output to the parallel port (Enforcer.par and Mungwall.par) for output to a parallel printer.

In a pinch, attach a modem to the serial port and run a terminal package set to the modem's baud rate. As long as the modem has not made any telephone connection, the modem will bounce back any Enforcer or Mungwall hits that come through the serial port, which the terminal package can capture. Because the debugging information has to move at the modem's baud rate, the modem method tends to lose data, especially when there are a lot of Enforcer/Mungwall hits.

@ENDNODE

@NODE I-47-2 "Sample Enforcer Output"

Here is a sample Enforcer hit which was caused by a program called Lawbreaker which tried to read from location $14.

    Program Counter (approximate)= 343F4A    Fault address = 14
    User stack pointer = 348734              DOS process address = 339590
    Data: DDDD0000 DDDD1100 DDDD2200 DDDD3300 DDDD4400 DDDD5500 DDDD6600 DDDD7700
    Addr: AAAA0000 AAAA1100 AAAA2200 AAAA3300 AAAA4400 AAAA5500 AAAA6600 00002E28
    Stck: 00210D70 00000FA0 00339F84 BBBBBBBB BBBBBBBB BBBBBBBB BBBBBBBB BBBBBBBB
    READ-WORD (---)(-)(-) SR=0008 SSW=0161
    Background CLI, "lawbreaker", Hunk #0, Offset $5A

Program Counter - Normally, this is the address of the instruction that was executing when the Enforcer hit occurred. For some types of Enforcer hits, this is the address of the instruction that executed after the hit.
Note that if a program passes a bad pointer or an improperly initialized structure to a system ROM routine, it can cause the ROM code to read or write to an illegal address.

Fault Address - This is the address where the illegal access occurred. In this example, the illegal access occurred at address $14, and as specified later in the debugging output, this access was a READ-WORD access. So the illegal memory access was an attempt to read a word (2 bytes) at address $14. Low memory accesses are often caused by NULL pointers to structures. If, for example, a ROM routine references a WORD-sized structure member at offset $20, and an application passes the ROM routine a NULL pointer as a pointer to that structure, Enforcer will report that the ROM routine tried to read a word at address $20.

User Stack Pointer - This is the value that was in register A7 when the Enforcer hit occurred. It points to the top of the stack for the task that was running when the Enforcer hit occurred.

DOS Process Address - This points to the Task structure of the task that was running when the Enforcer hit occurred.

Data/Addr (Register Dump) - These are the contents of registers D0-D7 and A0-A7. This information can help assembly programmers and programmers who like to debug at a low level. Notice that register A7, the stack pointer, no longer contains the stack pointer for the task that caused the Enforcer hit (A7 does not match the user stack pointer above). Ignore the value in A7.

Stck (Stack Dump) - These are the eight long words at the top of the offending program's stack. For those who need to see more of the stack, there is a special version of Enforcer called Enforcer.megastack that shows the last 32 long words on the stack.

Access Type - This tells what kind of memory access the Enforcer hit is. In this example the access is a READ-WORD, which most likely is a result of Lawbreaker accessing a word-sized structure member.
There are two other read access types: READ-BYTE, which is generally a result of a bad string pointer, and READ-LONG, which is normally a result of a bad pointer or a bad pointer within a structure. When an errant program directly or indirectly writes over Enforcer-protected memory, Enforcer reports a WRITE-BYTE, WRITE-WORD, or WRITE-LONG type access. When Enforcer issues an INSTRUCTION type memory access, the CPU tried to load an instruction from an invalid address. Some common causes for this type of Enforcer hit are:

    An errant program has overwritten a program's instructions.
    An errant program has overwritten a subroutine return address that was on the stack.
    The CPU tried to execute a library function using an invalid library base.

Notice that the errant program is not necessarily the program that caused the Enforcer hit.

Interrupt/Forbid/Disable - The series of parentheses after the access type indicates whether the Enforcer hit occurred in an interrupt (and if so, the interrupt level), Forbid, or Disable state:

    (I-n)(-)(-)  The hit occurred in an n level interrupt.
    (---)(F)(-)  The hit occurred while the Amiga was in a Forbid state.
    (---)(-)(D)  The hit occurred while the Amiga was in a Disable state.

SR (Status Register) - The CPU's status register (see 680x0 manual for more information).

SSW (Special Status Word) - The special status word for 68010 through 68040 CPUs (see 680x0 manual for more information).

Program Name - The program name is the name of the task or command that was executing when the Enforcer hit occurred.

Hunk Number and Offset - If the hit occurred within the program's own code, Enforcer also tries to report the hunk number and the offset of the program counter within that hunk. This only works if the errant program was started from a shell.

@ENDNODE

@NODE I-47-3 "Sample Mungwall Output"

When Mungwall reports a hit, it lists:

    The function that triggered the hit (either AllocMem() or FreeMem()), including the function's arguments.
    The name of the offending program (shown in quotes after ``attempted by'').
    The address of the offending program's Task structure (shown after ``at 0x'').
    The address from which AllocMem()/FreeMem() was called. Mungwall reports two addresses: one labelled ``A:'' and one labelled ``C:''. The ``A:'' address is the address if AllocMem()/FreeMem() was called directly (i.e., using assembler or #pragmas) by the offending program. The ``C:'' address is the address if AllocMem()/FreeMem() was called from an amiga.lib C stub by the offending program. Since Mungwall patches the memory allocation functions, it can only guess the caller's address based on the return address it finds on the stack.
    The stack pointer at the time of the Mungwall hit. It is labelled ``SP:''.

Note that Mungwall ignores the layers.library's partial deallocations. If any other debugging tools patch AllocMem() or FreeMem(), Mungwall's ``A:'' and ``C:'' addresses may be thrown off by additional information pushed on the stack, and Mungwall will also be unable to screen out the layers.library's partial deallocations (which will often show up as Mungwall hits on your task, CON:, or input.device).

Here are some sample Mungwall hits that were produced by a program called mungwalltest. As a reminder, the arguments for memory functions are AllocMem(size,type) and FreeMem(address,size).

    AllocMem(0x0,10000) attempted by `mungwalltest' (at 0x339590) from A:0x35C03A C:0x35677E SP:0x35CFC0
Tried to allocate 0 bytes of memory.

    FreeMem(0x0,16) attempted by `mungwalltest' (at 0x339590) from A:0x35C068 C:0x3567C4 SP:0x35CFB8
Tried to free memory with a NULL pointer.

    FreeMem(0x33BD10,0) attempted by `mungwalltest' (at 0x339590) from A:0x35C068 C:0x3567D4 SP:0x35CFB0
Tried to free 0 bytes of memory.

    Mis-aligned FreeMem(0x33BD14,16) attempted by `mungwalltest' (at 0x339590) from A:0x35C068 C:0x3567E2 SP:0x35CFA8
Deallocation address is incorrect because it is not aligned according to AllocMem()'s lowest memory chunk size.

    Mismatched FreeMem size 14!
    Original allocation: 16 bytes from A:0x35C03A C:0x3567A0 Task 0x339590
    Testing with original size.
Deallocation size does not match the allocation size.

    19 byte(s) before allocation at 0x33BD10, size 16 were hit!
    >$: BBBBBBBB BBBBBBBB BB536572 6765616E 74277320 50657070 65722000
Something wrote to the bytes which precede the allocation.

    8 byte(s) after allocation at 0x33BD10, size 16 were hit!
    >$: 75622042 616E6400 BBBBBBBB BBBBBBBB BBBBBBBB BBBBBBBB BBBBBBBBB
Something wrote to the bytes which follow the allocation.

@ENDNODE

@NODE I-47-4 "Using Enforcer and Mungwall Together"

Alone, Mungwall can catch a large variety of memory-related software problems. If Mungwall and Enforcer are used together, they can catch many more well-hidden bugs.

One bug that is hard to catch is when a program either mistakenly reads memory that does not belong to the program or reads its own memory without initializing the memory first. These bugs are normally hard to catch because the behavior of the errant program usually depends on how it reacts to the data that happens to be stored in that memory. This makes the behavior of the bug erratic. Sometimes the errant program may crash. Sometimes the errant program may write over another program's data, causing the other program to crash. At other times, no noticeable abnormality takes place.

Mungwall and Enforcer together can help find these bugs. Because Mungwall fills freed memory with the same odd 32-bit values, when an errant program mistakenly accesses memory, the behavior of the bug will be consistent while Mungwall is running. Also, the values Mungwall puts in memory are more likely to cause the errant program to access Enforcer-protected memory, triggering an Enforcer hit.

@ENDNODE

@NODE I-47-5 "A Sample Debugging Session"

The following is an example of debugging an Enforcer hit that occurred using a test program called ownertest.
This hit was generated on an A2500 with a 2.04 ROM image loaded using ZKick:

    Program Counter (approximate)= 201946    Fault address = 0
    User stack pointer = 566110              DOS process address = 38E888
    Data: 00282D90 00000000 000003ED 0038FD8C 00000001 00000001 000E203B 00000001
    Addr: 00225469 00000001 00282DE0 00448A3A 004487C0 004487CC 00001420 00002E28
    Stck: 00448A3A 00223BA2 00280810 000003ED 0038FD8C 00000001 00000001 000E203B
    READ-BYTE (---)(F)(-) SR=0010 SSW=0751
    Background CLI, "ownertest"

The Program Counter is at $201946. On a machine with the Kickstart in ROM, the ROM addresses range from $F80000 to $FFFFFF under 2.0 or greater and from $FC0000 to $FFFFFF under 1.3. On a softkicked A3000, the addresses are in the same range as real ROM addresses. On a softkicked A2500, $201946 is a ROM address (in this case, the ROM ranges from $200000 to $27FFFF).

The first thing to do is figure out in which ROM module the Enforcer hit occurred. The debugging tool owner, by Michael Sinz, will figure that out:

    1.Ram Disk:> owner 0x201946
    Address  - Owner
    --------   -----
    00201946 - in resident module: exec 37.132 (23.5.91)

Note that owner looks at the ROM addresses of the Amiga on which it is executing, so you must run owner on the machine that generated the Enforcer hit.

Next, use the debugging tool lvo to figure out what function entry in the Exec ROM module is closest to that ROM address. Like owner, lvo also looks at the local machine's ROM addresses, so you have to run lvo on the machine that generated the Enforcer hit. Note that lvo requires the FD files to be in an ``FD:'' assign directory. Pass the address and the module name (from owner) to lvo.

    1.Ram Disk:> lvo exec romaddress=0x201946
    Closest to $201946 without going over:
    exec.library LVO $feec -276 FindName() jumps to $20192a on this system

Hmmm. A lot of functions use FindName() on Exec lists, so, in this case, the Program Counter does not pinpoint the problem.
However, it does hint that FindName() probably received a bad string pointer. The ``READ-BYTE'' attribute of the Enforcer hit is an extra clue that the ownertest program has a problem with a string pointer, since the most common READ-BYTE actions are on strings.

Let's check the Enforcer stack dump and see if there are any other ROM addresses there which might have called FindName():

    Stck: 00448A3A 00223BA2 00280810 000003ED 0038FD8C 00000001 00000001 000E203B

The address closest to the top of the stack that looks like a ROM address on this particular system is 0x00223BA2. Let's see what ROM module contains this address:

    1.Ram Disk:> owner 0x223BA2
    Address  - Owner
    --------   -----
    00223BA2 - in resident module: graphics 37.35 (23.5.91)

Let's see what function in the Graphics library is closest to this ROM address:

    1.Ram Disk:> lvo graphics 0x223BA2
    Closest to $223ba2 without going over:
    graphics.library LVO $ffb8 -72 OpenFont() jumps to $223b84 on this system

It looks like the ownertest program has a problem with a font name that was not properly initialized.

@ENDNODE

@NODE I-47-6 "More Remote Debugging Tips"

Debugging with Enforcer and Mungwall is even more effective when an application sends other debugging information to the serial or parallel port. The linker library debug.lib contains a printf()-like function, kprintf(), to print information to the serial port. The linker library ddebug.lib contains a similar function called dprintf() that prints debugging information to the parallel port. The output from these functions intermixes with the output from Enforcer and Mungwall, making it easy to pinpoint which part of the code is causing Enforcer or Mungwall hits.

Functions like kprintf() and dprintf() are useful, but adding and removing them from programs can be tedious.
One easy way to deal with this problem is to include them only when a special label is defined:

    /********** debug macros ***********/
    #define MYDEBUG  1                 /* 1 = debugging on, 0 = debugging off */
    void kprintf(UBYTE *fmt,...);      /* serial output, link with debug.lib */
    void dprintf(UBYTE *fmt,...);      /* parallel output, link with ddebug.lib */
    #define DEBTIME 0                  /* optional Delay() after each message */
    #define bug printf                 /* printf, kprintf, or dprintf */
    #if MYDEBUG
    /* Expands to more than one statement, so use braces inside if/else. */
    #define D(x) (x); if(DEBTIME>0) Delay(DEBTIME);
    #else
    #define D(x) ;
    #endif /* MYDEBUG */
    /********** end of debug macros **********/

Set MYDEBUG to 1 to turn on debugging. Set ``bug'' to:

    ``printf'' to send debugging information to the default console,
    ``kprintf'' to send debugging information to the serial port (link with debug.lib), and
    ``dprintf'' to send debugging information to the parallel port (link with ddebug.lib).

When using this macro, make sure there are two closing parentheses before the semicolon at the end of each D(bug()) statement. Example macro usage:

    win = OpenWindow(&mynewwin);
    D(bug("Opened window at $%lx\n", win));

A different, low-level method of figuring out which instructions caused an Enforcer hit is to disassemble program memory where the hit occurred. First, match the disassembly with your own code. Assembly programmers can simply compare the disassembly to their source. Others can take the hex values of a sequence of position-independent 68000 instructions near the hit (i.e., no addresses except for offsets and branches) and search for this pattern in their object modules. If you find the pattern, do a mixed source and object disassembly of that object module and then look in the output for instructions matching those where the hit occurred.
For example, with SAS's OMD you could compile your code with the flag -d1, then do the following:

    1.Ram Disk:> OMD >ram:dump mymodule.o mymodule.c

@ENDNODE

@NODE I-47-7 "Who Should Use Enforcer and Mungwall"

If you are developing Amiga software, it is extremely important that you invest in an MMU, or at the very least make sure that your software is tested on machines with Enforcer and Mungwall (also test with Enforcer alone as Mungwall can hide certain types of bugs). If you are programming in assembly, you should also test with Scratch by Bill Hawes to catch improper usage of CPU registers.

Enforcer and Mungwall are not just for developers and QA departments. Anyone who uses software can help find bugs in it with Enforcer. During normal usage, these tools can catch hidden software problems. Many people at Commodore run Enforcer all of the time. As more and more people begin running these tools, they will become less tolerant of software that causes Enforcer and Mungwall hits.

At a small developer meeting at a recent Amiga trade show, CATS was disappointed to discover that, although the majority of the audience believed that they needed Enforcer, a relatively small percentage of them owned the equipment necessary to run it (i.e., an MMU).

If you don't have an MMU, get one. The investment in an A3000, 68030 card, or 68020+MMU card will quickly pay for itself. It significantly cuts down development time because it quickly catches bugs that are otherwise hard to track down.

@ENDNODE