Amiga Developer CD 1.2

Text File | 1990-01-25 | 20.2 KB | 506 lines

(c) Copyright 1989 Commodore-Amiga, Inc. All rights reserved. The information contained herein is subject to change without notice, and is provided "as is" without warranty of any kind, either express or implied. The entire risk as to the use of this information is assumed by the user.

Troubleshooting Your Software

by Carolyn Scheppner

Many Amiga programming errors have classic symptoms. This guide gives some tips to help you find and eliminate these problems in your software.

CLI Error Messages

This is caused by calling exit(n) with an invalid or missing return value. Assembler programmers using startup code should jump to the startup code's _exit with a valid return value on the stack. Programs without startup code should return with a valid value in D0. Valid return values are defined in libraries/dos.h and libraries/dos.i. Other values (-1 for instance) can cause CLI error messages such as "not an object module".

CLI Won't Close on RUN

A CLI can't close if a program has a lock on the CLI input or output stream ("*"). If your program is RUN >NIL: from a CLI, that CLI should be able to close unless your code or your compiler's startup code explicitly opens "*".

Crashes and Memory Corruption at Run Time

Memory corruption, address errors, and illegal instruction errors are generally caused by wild pointers: pointers which are uninitialized, incorrectly initialized, or which point to memory or resources that have already been freed or closed. You may be accidentally modifying or incrementing a pointer that is later used to free memory or close a resource. The pointer may be one that you use directly, or one that you use indirectly, such as a pointer in a structure which is passed to a system routine.

Amiga functions which open system resources or allocate memory typically return a pointer; if the open or allocate fails, zero is usually returned. So you must test the return value of a system call for success before using it as a pointer.
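The return-value rule above can be sketched in portable C. This is only an illustration under stated assumptions: sys_alloc() and sys_free() are hypothetical stand-ins for a system allocator that returns zero on failure (they are not Amiga functions), fail_after exists only so that a failure can be forced during testing, and the two return codes mirror the RETURN_OK and RETURN_FAIL values in libraries/dos.h.

```c
#include <stddef.h>
#include <stdlib.h>

#define RETURN_OK    0   /* same values as libraries/dos.h */
#define RETURN_FAIL 20

/* Hypothetical stand-ins for a system allocator that returns NULL (zero)
   on failure. fail_after forces a failure: -1 means never fail, 0 means
   the next call fails, N means N calls succeed first. */
static int fail_after = -1;
static int outstanding = 0;      /* allocations not yet freed */

static void *sys_alloc(size_t size)
{
    void *p;

    if (fail_after == 0)
        return NULL;
    if (fail_after > 0)
        fail_after--;
    p = malloc(size);
    if (p != NULL)
        outstanding++;
    return p;
}

static void sys_free(void *p)
{
    if (p != NULL) {             /* never free a pointer we did not get */
        free(p);
        outstanding--;
    }
}

/* Every pointer starts out NULL, is tested before use, and is released
   in the opposite order of allocation on every exit path. */
static int run_job(void)
{
    int rc = RETURN_FAIL;
    char *buf = NULL;
    char *work = NULL;

    buf = sys_alloc(256);
    if (buf == NULL)
        goto done;
    work = sys_alloc(1024);
    if (work == NULL)
        goto done;

    buf[0] = 'A';                /* safe: both pointers were tested */
    work[0] = buf[0];
    rc = RETURN_OK;

done:
    sys_free(work);
    sys_free(buf);
    return rc;
}
```

Because both pointers are initialized to NULL and sys_free() ignores NULL, a single cleanup path serves both the success case and every failure case.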
Test utilities such as MemWatch and MemMung can help catch the use of uninitialized pointers or freed memory. MemMung is a torture test which sets freed memory areas and location $0 to odd values (if your program is written correctly, this will have no effect; if not, your program is more likely to fail). MemWatch is a watchdog utility which reports modification of low memory (wild pointers often point to low memory, especially $0).

Memory corruption and crashes can also be caused by calling functions with the wrong arguments or missing arguments (for example SetAPen(3) or SetAPen(win,3), instead of SetAPen(rp,3) ).

Another possibility is that you might be overflowing your stack. The compiler's stack checking option may be able to catch this (with Lattice, -v disables stack checking). Cut stack usage by dynamically allocating large structures, buffers, and arrays if they are currently defined inside main() or your other functions.

If you are using short integers, be sure to explicitly type any long constants (e.g. 42L). For example, with short integers, the expression 1 << 17 may become zero.

If corruption is occurring during exit, use printf (or kprintf, etc.) with Delay(n) to slow down your cleanup and broadcast each step.

A bad pointer which causes a system crash will often be reported as a guru meditation 00000003 or 4. Numbers in the range 00000006 - B may also indicate a problem with pointers. These numbers correspond to the hardware-defined CPU exceptions of Motorola's 68000 family of processors. Generally these occur when the CPU tries to access a non-existent memory location or execute an illegal instruction. Other guru meditation numbers are Amiga-specific, but may be caused by wild pointers; for the meaning of these codes, refer to the include file exec/alerts.h.
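As a quick reference for the CPU-exception gurus described above, the following sketch maps guru numbers 2 through B to the 68000 exception vector of the same number. The vector assignments follow Motorola's 68000 documentation; the table and the function name cpu_trap_name() are made up for this illustration and are not part of any Amiga include file.

```c
#include <stddef.h>

/* 68000 exception vectors 2 through 11 ($B). Entries 0 and 1 hold the
   reset vectors and are never reported as a guru number. */
static const char *const cpu_traps[] = {
    NULL,
    NULL,
    "Bus error",             /* 00000002 */
    "Address error",         /* 00000003 */
    "Illegal instruction",   /* 00000004 */
    "Zero divide",           /* 00000005 */
    "CHK instruction",       /* 00000006 */
    "TRAPV instruction",     /* 00000007 */
    "Privilege violation",   /* 00000008 */
    "Trace",                 /* 00000009 */
    "Line 1010 emulator",    /* 0000000A */
    "Line 1111 emulator"     /* 0000000B */
};

/* Returns the exception name for a CPU-trap guru number, or NULL when
   the number is outside 2..B (for example an Amiga-specific alert). */
static const char *cpu_trap_name(unsigned long guru)
{
    if (guru < 2UL || guru > 11UL)
        return NULL;
    return cpu_traps[guru];
}
```

An address error (3) typically points to a word or longword access at an odd address, and an illegal instruction (4) to a jump through a bad pointer, which is why both are so often symptoms of wild pointers.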
Crashes - After Exit

If your program crashes after exiting only when it is started from the Workbench, then you are probably UnLock()ing one of the WBStartup message wa_Locks, or UnLock()ing the Lock returned from an initial CurrentDir() call. If you call CurrentDir() in your application, you should save the first lock it returns and then call CurrentDir() on that lock before you exit.

If you are crashing from both Workbench and CLI and you are only crashing after exit, then you may be freeing or closing something twice. Also, you may be freeing or closing something that you did not allocate or open.

A crash after your program exits can also be caused by leaving an outstanding device IO request or other wakeup request. If you send an IO request and then exit, Exec, upon completion of that IO request, will send a reply message to a port that no longer exists. You must abort and then WaitIO() on any pending IO requests before you free things and exit. See the autodocs for your device and for Exec AbortIO() and WaitIO().

Similar problems can be caused by deleting a subtask that is in a wait loop such as WaitTOF(). Only delete subtasks when you are sure they are in a safe state such as Wait(0L).

Crashes - Subtasks, Interrupts

If part of your code runs on a different stack or on the system stack, you must turn off compiler stack-checking options (Lattice uses the -v flag to disable stack checking). If part of your code is called directly by the system or by other tasks, you must use the large code / large data model, or use compiler functions or options to assure that the correct base registers are set up for your subtask or interrupt code.

Crashes - Window Related

Be careful not to call CloseWindow() during a while(msg=GetMsg(...)) loop on that window's port because the next GetMsg() will be on a freed pointer. Also, use ModifyIDCMP(NULL) with care, especially if you are using one port with multiple windows.
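The CloseWindow() warning above can be sketched in portable C. The Msg and Port types and the get_msg() and close_window() helpers are hypothetical stand-ins for a window's IDCMP port, GetMsg(), and CloseWindow(); they are not Amiga APIs. The reads_after_close counter exists only so the mistake would be visible in a test. The loop records that a close was requested and closes the window only after the port has been drained.

```c
#include <stddef.h>

enum { MSG_OTHER, MSG_CLOSE };

struct Msg  { int msg_class; };
struct Port {
    struct Msg *queue;        /* pending messages */
    int count;                /* number of messages in queue */
    int next;                 /* index of the next message to hand out */
    int closed;               /* set once the window has been closed */
    int reads_after_close;    /* counts reads on a closed (freed) port */
};

/* Stand-in for GetMsg(): returns the next message or NULL when empty. */
static struct Msg *get_msg(struct Port *p)
{
    if (p->closed)
        p->reads_after_close++;      /* this would be the crash */
    if (p->next >= p->count)
        return NULL;
    return &p->queue[p->next++];
}

/* Stand-in for CloseWindow(): the port is invalid after this call. */
static void close_window(struct Port *p)
{
    p->closed = 1;
}

/* Drain every pending message, remember a close request, and close the
   window only after the loop has finished. Returns the message count. */
static int handle_events(struct Port *p)
{
    struct Msg *m;
    int done = 0;
    int handled = 0;

    while ((m = get_msg(p)) != NULL) {
        if (m->msg_class == MSG_CLOSE)
            done = 1;                /* do NOT close here */
        handled++;                   /* a real loop would reply to m here */
    }
    if (done)
        close_window(p);
    return handled;
}
```

Calling close_window() inside the loop instead would make the next get_msg() read from a port that no longer exists, which is the failure the text describes.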
Be sure to ClearMenuStrip() any menus before closing a window, and do not free items such as dynamically allocated gadgets and menus while they are attached to a window.

Crashes - Workbench Only

If you are crashing near the first DOS call, either your stack is too small or your startup code does not GetMsg() the WBStartup message from the process message port. If your program crashes only when started from Workbench and your startup code opens no stdio window or NIL: file handles for Workbench programs, then make sure you are not writing anything to stdout (e.g. printf() ) when started from Workbench (argc==0). See also ``Crashes - After Exit'' above.

Disk Icon Won't Go Away

This occurs when a program leaves a lock on one or more of a disk's files or directories.

Fails Only On the 68020/30

In general this occurs whenever an application inadvertently contains a CPU dependency. The following programming practices will lead to programs which fail on the Motorola 68020/030 (but run OK on the 68000):

o Using the upper byte of addresses for flags.
o Doing signed math on addresses.
o Writing self-modifying code.
o Using the MOVE SR assembler instruction (use Exec GetCC() instead).
o Using software delay or timing loops.
o Making assumptions about the order in which asynchronous tasks will finish.

Special features of the 68020/30 processors can also cause problems for programs written on the 68000. Examples include an invalid cache entry due to DMA or other non-processor modification of data which has already been cached; a different exception stack frame; interrupt auto-vectors moved by VBR; and the 68020/30 CLR instruction, which does a single write access, unlike the 68000 CLR instruction, which does a separate read and write access (this might affect a read-triggered register in IO space - use MOVE instead).

Fails Only On the 68000

Again, a program which fails only on certain processors contains a CPU dependency.
The following programming practices can cause this problem:

o Software delay loops.
o Word or longword access of an odd address (illegal on the 68000).
o Assumptions about the order in which asynchronous tasks will finish.
o Using compiler flags which have generated inline 68881/68882 math coprocessor instructions or 68020/30 specific code.
o Using the CLR instruction on a hardware register (its behavior on the 68000 differs from that on the 68020/030; use MOVE instead).

Fails Only on Older ROMs or Older Workbench

This can be caused by calling functions or using structures which do not exist in the older versions of the operating system. Or you may be asking for a library version higher than you need. Ask for the lowest version which provides the functions your application requires (usually 33). You should not use the #define LIBRARY_VERSION from the include files when you open a library. Also make sure you check OpenLibrary() calls for success (a non-zero return value). If the library you request is not available, exit gracefully and informatively.

Fails Only on Newer ROMs or Newer Workbench

This should not happen with proper programming. Possible causes are:

o Running too close to your stack limits or the memory limits of a base machine (newer versions of the operating system may use slightly more stack in system calls, and usually use more free memory).
o Using system functions improperly.
o Not testing function return values.
o Using improperly initialized pointers.
o Assuming that a system variable (such as a Flags field) is B if it is not A.
o Failing to initialize formerly reserved structure fields to zero.
o Violating Amiga programming guidelines (for example: depending on or poking private system structures, jumping into ROM, depending on undocumented or unsupported behaviors).
o Failing to read the function autodocs.

Fails On CHIP-RAM-Only Machines

This is caused by specifically asking for or requiring MEMF_FAST memory.
If you don't need chip memory, ask for memory type 0L, or MEMF_CLEAR, or MEMF_PUBLIC|MEMF_CLEAR as applicable. If there is fast memory available, you will be given fast memory. If not, you will get chip memory.

Fails Only on Machines with FAST RAM

Data and buffers which will be accessed directly by the custom chips must be in chip memory. This includes bitplanes (use OpenScreen() or AllocRaster() ), audio samples, trackdisk buffers, and the graphic image data for sprites, pointers, bobs, images, gadgets, etc. Use compiler or linker flags to force chip memory loading of any initialized data that needs to be in chip memory. You could also dynamically allocate chip memory and copy the initialized data there.

Fails Only with Enhanced Chips

This is usually caused by writing or reading addresses past the end of register space on older custom chips, or writing a non-zero value to bits which are undefined in older chip registers, or failing to mask out undefined bits when interpreting the value read from a chip register.

Fireworks

A dazzling pyrotechnic video display is caused by trashing or freeing a copper list which is in use, or trashing the pointers to the copper list. If you aren't messing with copper lists, see ``Crashes and Memory Corruption''.

Graphics - Corrupted Images

The bit data for graphic images such as sprites, pointers, bobs, and gadgets must be in chip memory. Check your compiler manual for directives or flags which will place your graphic image data in chip memory. Alternatively, you could allocate chip memory and copy the graphic image there.

Hang - Single Program Only

Program hangs are generally caused by Wait()ing on the wrong signal bits, on the wrong port, on the wrong message, or on some other event that will never occur. They can also be caused by verify deadlocks. Be sure to turn off all Intuition VERIFY messages (such as MENUVERIFY) before calling AutoRequest() or doing disk access.
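One way to end up waiting on the wrong signal bits is to confuse a signal bit number with a signal mask. The sketch below is portable C written for illustration only: WINDOW_SIGBIT and TIMER_SIGBIT are made-up bit numbers of the kind a port or signal allocation would give you, and sig_mask() and dispatch() are hypothetical helpers, not Exec functions. The mask is built from a long constant so that the shift is still correct when the compiler uses 16-bit integers.

```c
/* Made-up signal bit numbers (0-31) used only for this example. */
#define WINDOW_SIGBIT 17
#define TIMER_SIGBIT  24

/* Convert a signal bit number into a signal mask. */
static unsigned long sig_mask(int sigbit)
{
    return 1UL << sigbit;        /* long constant: safe with short ints */
}

/* Given the mask of signals that woke the program, act on every one
   that is set. Returns the number of events acted on. */
static int dispatch(unsigned long received)
{
    int handled = 0;

    if (received & sig_mask(WINDOW_SIGBIT))
        handled++;               /* would drain the window port here */
    if (received & sig_mask(TIMER_SIGBIT))
        handled++;               /* would service the timer here */
    return handled;
}
```

Passing the bit number itself (17) where a mask is expected selects bits 0 and 4 instead of bit 17, so a program that waits that way is waiting for signals that were never intended.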
Hang - Whole System

This is generally caused by a Disable() without a corresponding Enable(). It can also be caused by memory corruption, especially corruption of low memory. See ``Crashes and Memory Corruption'' above.

Memory Loss

First, make sure that your program is actually causing the memory loss. Boot with a normal Workbench disk whose s:startup-sequence LoadWB command line has been changed to LoadWB -debug. It is important to boot with a standard Workbench because some third party applications such as background utilities, shells, and network handlers dynamically allocate and free memory.

Arrange all windows so that part of the Workbench backdrop window is accessible and so that no window rearrangement will be needed to run your program. Select flushlibs from the rightmost Workbench menu. Any disk-loaded fonts, libraries, devices, etc. that are not currently open will be flushed from memory. Wait a few seconds, then click on the Workbench backdrop. Write down the amount of free memory displayed in the Workbench title bar.

Now without rearranging any windows, run your program and use all of the program features. Exit your program, wait a few seconds, then click on the Workbench backdrop. Now select flushlibs, wait a few seconds and write down this final free amount. If this matches the first value you wrote down, then your program is fine and is not causing a memory loss.

If memory was actually lost and your program can be run from CLI or Workbench, then try the above procedure with both methods of starting your program. See ``Memory Loss - CLI Only'' and ``Memory Loss - Workbench Only'' as appropriate.

If you lose memory from both Workbench and CLI, then make sure all calls to functions which open/allocate/create/lock have a matching call to the corresponding close/free/delete/unlock function (there are a few system calls that do not require a corresponding free - check the autodocs).
Generally, the close/free/delete/unlock calls should be in the opposite order of the allocations.

If you are losing a small, fixed amount of memory, look for a structure of that size in the Structure Offsets listing in the Includes and Autodocs manual. For example, a loss of exactly 24 bytes is probably a Lock which has not been UnLock()ed.

If you are using ScrollRaster(), be aware that ScrollRaster() left or right in a SUPERBITMAP window with no TmpRas will currently lose memory (workaround - attach a TmpRas).

If you lose much more memory when started from Workbench than from the CLI, make sure your program is not using Exit(n). This would bypass startup code cleanups and prevent a Workbench-loaded program from being unloaded. Use exit(n) instead.

Memory Loss - CLI Only

Some third-party shells dynamically allocate history buffers, or cause other memory fluctuations. Also, if your program executes different code when started from CLI, check that code and its cleanup. And check your startup.asm if you wrote your own.

Memory Loss - Ctrl-C Exit Only

This occurs when you have Amiga-specific resources allocated and you have not disabled your compiler's automatic Ctrl-C handling (causing all of your program clean-ups to be skipped). Disable the compiler's Ctrl-C handling and handle Ctrl-C yourself.

Memory Loss - During Execution

A continuing memory loss during execution can be caused by failure to keep up with all of the IDCMP messages (such as MOUSEMOVE) that you request from Intuition. Intuition cannot reuse IDCMP message blocks until you call ReplyMsg() on them. If your window's allotted message blocks are all in use, new ones will be allocated and not freed until the window is closed.

Continuing memory losses can also be caused by a program loop containing an allocation/open type call without a corresponding free.

Memory Loss - Workbench Only

This is often caused by the failure of your code to unload after you exit.
Make sure that your code is being linked with a correct, standard startup module and do not use the Exit(n) function to exit your program. The Exit(n) function will bypass your startup code's cleanup, including its ReplyMsg() of the WorkbenchStartup message (this signals Workbench to unload your program from memory). You should exit via exit(n) where n is a valid DOS error code such as RETURN_OK (libraries/dos.h). You may also exit with a final closing brace "}" or with the return statement. Assembler programmers using startup code can JMP to _exit with a long return value on the stack or use the RTS instruction.

Menu Problems

A flickering menu is caused by leaving a pixel or more space between menu subitems in your menu structures. Crashing after browsing a menu (looking at a menu without selecting any items) is caused by not properly handling MENUNULL select messages. Multiple selection not working is caused by improper handling of the NextSelect field. See the Menus chapter from the Intuition manual for more details.

Out-of-Sync Response to Input

This is caused by failing to handle all received signals or messages after waking up from a Wait() or WaitPort() call. More than one event or message may have caused your program to be awakened. Check the signals returned by Wait() and act on every one that is set. At ports which may have more than one message (such as a window's IDCMP port) you must handle the messages in a while(msg=GetMsg(...)) loop.

Performance Loss in Other Processes

This is often caused by a program doing one of the following:

o Busy waiting or polling.
o Running at a higher priority.
o Doing lengthy Forbid()s, Disable()s, or interrupt handling.

Sound Samples Won't Play Correctly

The data for audio samples must be in chip memory. Check your compiler manual for directives or flags which will place your audio sample data in chip memory. Also, you can dynamically allocate chip memory and copy or load the audio sample there.
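The allocate-and-copy approach for sample data can be sketched as follows. This is portable C for illustration only: alloc_mem() and free_mem() are hypothetical stand-ins for Exec's AllocMem() and FreeMem() built on malloc() and free(), last_flags exists only so the request can be inspected, and the eight-byte waveform is invented. MEMF_CHIP uses the same bit as exec/memory.h.

```c
#include <stdlib.h>
#include <string.h>

#define MEMF_CHIP (1UL << 1)     /* same bit as in exec/memory.h */

static unsigned long last_flags; /* records the most recent request */

/* Stand-in for AllocMem(): returns NULL if the request fails. */
static void *alloc_mem(unsigned long size, unsigned long flags)
{
    last_flags = flags;
    return malloc(size);
}

/* Stand-in for FreeMem(): the caller must pass the original size. */
static void free_mem(void *p, unsigned long size)
{
    (void)size;
    free(p);
}

/* Initialized data like this may be loaded into fast memory, where the
   custom chips cannot reach it. The values are an invented waveform. */
static const signed char sample[8] = { 0, 90, 127, 90, 0, -90, -127, -90 };

/* Allocate chip memory and copy the sample into it. Returns NULL on
   failure; the caller frees the copy with free_mem() when done. */
static signed char *sample_to_chip(void)
{
    signed char *chip = alloc_mem(sizeof sample, MEMF_CHIP);

    if (chip != NULL)            /* test before using the pointer */
        memcpy(chip, sample, sizeof sample);
    return chip;
}
```

The same pattern applies to the other chip-accessed data named in this guide, such as sprite and gadget imagery and trackdisk buffers.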
Trackdisk Data Not Transferred

This may occur if your trackdisk buffers are not in chip memory.

Windows - Borders Flicker after Resize

Set the NOCAREREFRESH flag. Even SMART_REFRESH windows can generate refresh events if there is a sizing gadget, so if you don't have specific code to handle this, you must set the NOCAREREFRESH flag. If you do have refresh code, be sure to use the Begin/EndRefresh() calls. Failure to do one or the other will leave Intuition in an intermediate state and slow down operation for all windows on the screen.

GENERAL DEBUGGING TECHNIQUES

Isolate the problem by using printf() to find the section of code in which the problem occurs. If you cannot display messages on the screen, use kprintf() to send messages to the serial port or dprintf() for the parallel port (see Linker Library documentation). Check the initial values, allocation, use, and freeing of all pointers and structures used in the problem area. Also make sure that all of your system and internal function calls pass correct, initialized arguments and that all possible error returns are checked for and handled.

Use Debugging Tools

A variety of debugging tools are available to help locate faulty code. There are source level debuggers (such as Lattice's CodeProbe), crash interceptors (such as GOMF), memory watchdogs like MemWatch and WatchMem, and other helpful tools like MemMung, Avail, WBFrags, etc.

Test With Different Configurations

Test your program on a wide variety of systems and configurations. Programs with coding errors may appear to work properly on one configuration but may fail or cause fatal problems on another. Make sure that your code is tested on both the 68000 and the 68020/30, on machines with and without fast memory, and on machines with and without enhanced chips. Test all of your program functions on every machine.
Test All Error and Abort Code

A program with missing error checks or unsafe cleanup might work fine when all of the items it opens or allocates are available, but may fail fatally when an error or problem is encountered. Try your code with missing files, filenames with spaces, incorrect filenames, cancelled requesters, Ctrl-C, missing libraries or devices, low memory, missing hardware, etc.

Test all of your text input functions with international ASCII characters (such as the character produced by pressing ALT-F then A). Rawkey codes produce different keyboard characters on the various national keyboards (higher levels of keyboard input are automatically translated to the proper characters).

If your program will be distributed internationally, support and take advantage of the additional screen lines available on a PAL system. On A2000s with the enhanced Agnus chip, a PAL display can be selected via motherboard jumper J102. Note that a base PAL machine will have less memory free due to the larger display size.