(c) Copyright 1989 Commodore-Amiga, Inc. All rights reserved.
The information contained herein is subject to change without notice, and
is provided "as is" without warranty of any kind, either express or implied.
The entire risk as to the use of this information is assumed by the user.
Troubleshooting Your Software
by Carolyn Scheppner
Many Amiga programming errors have classic symptoms. This guide
gives some tips to help you find and eliminate these problems in
your software.
CLI Error Messages
This is caused by calling exit(n) with an invalid or missing
return value. Assembler programmers using startup code should
jump to the startup code's _exit with a valid return value on
the stack. Programs without startup code should return with a
valid value in D0. Valid return values are defined in
libraries/dos.h and libraries/dos.i. Other values (-1 for
instance) can cause CLI error messages such as "not an object
module".
CLI Won't Close on RUN
A CLI can't close if a program has a lock on the CLI input or
output stream ("*"). If your program is RUN >NIL: from a CLI,
that CLI should be able to close unless your code or your
compiler's startup code explicitly opens "*".
Crashes and Memory Corruption at Run Time
Memory corruption, address errors, and illegal instruction errors
are generally caused by wild pointers - pointers which are
uninitialized, incorrectly initialized, or point to memory
or resources which have already been freed or closed. You may be
accidentally modifying or incrementing a pointer later used to free
memory or close a resource. The pointer may be one that you use
directly, or indirectly, such as in a structure which is passed
to a system routine.
Amiga functions which open system resources or allocate memory
typically return a pointer; if the open or allocate fails, zero
is usually returned. So you must test the return value of a
system call for success before using it as a pointer.
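The habit of testing before use can be sketched in portable C. In
this sketch the standard malloc() stands in for an Amiga
allocation call that returns zero on failure; make_buffer() is a
hypothetical name used only for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a zero-filled buffer, or NULL if the allocation failed.
   malloc() is a stand-in for an Amiga call that returns zero
   when it cannot satisfy the request. */
char *make_buffer(size_t size)
{
    char *buf = malloc(size);

    if (buf == NULL)        /* test BEFORE using the pointer */
        return NULL;

    memset(buf, 0, size);   /* safe: buf is known to be valid here */
    return buf;
}
```

The caller must apply the same test to the value make_buffer()
returns, and free the buffer exactly once when finished.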
Test utilities such as MemWatch and MemMung can help catch the
use of uninitialized pointers or freed memory. MemMung is a
torture test which sets freed memory areas and location $0 to
odd values (if your program is written correctly, this will have
no effect; if not, your program is more likely to fail).
MemWatch is a watchdog utility which reports modification of low
memory (wild pointers often point to low memory, especially $0).
Memory corruption and crashes can also be caused by calling
functions with the wrong arguments or missing arguments (for
example SetAPen(3) or SetAPen(win,3), instead of SetAPen(rp,3)).
Another possibility is that you might be overflowing your
stack. The compiler's stack checking option may be able to catch
this (with Lattice, -v disables stack checking). Cut stack
usage by dynamically allocating large structures, buffers, and
arrays if they are currently defined inside main() or your other
functions. If you are using short integers be sure to explicitly
type any long constants (e.g. 42L). For example, with short
integers, the expression 1 << 17 may become zero. If corruption
is occurring during exit, use printf (or kprintf, etc.) with
Delay(n) to slow down your cleanup and broadcast each step.
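The short-integer pitfall can be modeled in portable C. This is
only a sketch: uint16_t is used to imitate arithmetic carried out
in a 16-bit int, and both function names are hypothetical. On a
real compiler with 16-bit ints the result of 1 << 17 is formally
undefined; zero is simply a common outcome.

```c
#include <stdint.h>

/* Models a shift carried out in a 16-bit type: only the low 16
   bits survive, so bit 17 is lost. */
uint16_t shift_in_16_bits(unsigned n)
{
    return (uint16_t)(1UL << n);
}

/* The constant is explicitly long (1L), so the shift is done in
   at least 32 bits and bit 17 is preserved. */
long shift_in_32_bits(unsigned n)
{
    return 1L << n;
}
```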
A bad pointer which causes a system crash will often be reported
as guru meditation 00000003 or 00000004. Numbers in the range
00000006 through 0000000B may also indicate a problem with
pointers. These numbers correspond to the hardware defined CPU
exceptions of Motorola's 68000 family of processors. Generally
these occur
when the CPU tries to access a non-existent memory location or
execute an illegal instruction. Other guru meditation numbers
are Amiga-specific, but may be caused by wild pointers; for the
meaning of these codes, refer to the include file exec/alerts.h.
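As a quick reference, the CPU exception numbers in this range can
be put in a small lookup table. The names follow Motorola's 68000
exception vector assignments; the table and the function
cpu_exception_name() are a hypothetical helper, not part of any
Amiga include file.

```c
#include <stddef.h>

/* 68000 exception vector numbers 2 through 11 (hex B). */
static const char *const cpu_exceptions[] = {
    NULL, NULL,
    "Bus error",              /* 2 */
    "Address error",          /* 3 */
    "Illegal instruction",    /* 4 */
    "Zero divide",            /* 5 */
    "CHK instruction",        /* 6 */
    "TRAPV instruction",      /* 7 */
    "Privilege violation",    /* 8 */
    "Trace",                  /* 9 */
    "Line 1010 emulator",     /* A */
    "Line 1111 emulator"      /* B */
};

/* Returns the exception name, or NULL if the number is not a CPU
   exception in this range (look it up in exec/alerts.h instead). */
const char *cpu_exception_name(unsigned long guru)
{
    if (guru < 2 || guru > 0xB)
        return NULL;
    return cpu_exceptions[guru];
}
```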
Crashes - After Exit
If your program crashes after exiting only when your program is
started from the Workbench, then you are probably UnLock()ing one
of the WBStartup message wa_Locks, or UnLock()ing the Lock
returned from an initial CurrentDir() call. If you call
CurrentDir() in your application, you should save the first lock
it returns and then call CurrentDir() on that lock before you
exit.
If you are crashing from both Workbench and CLI and you are only
crashing after exit, then you may be freeing or closing something
twice. Also, you may be freeing or closing something that you
did not allocate or open.
A crash after your program exits can also be caused by leaving an
outstanding device IO request or other wakeup request. If you
send an IO request and then exit, Exec, upon completion of that
IO request, will send a reply message to a port that no longer
exists. You must abort and then WaitIO() on any pending IO
requests before you free things and exit. See the autodocs for
your device and for Exec AbortIO() and WaitIO(). Similar
problems can be caused by deleting a subtask that is in a wait
loop such as WaitTOF(). Only delete subtasks when you are sure
they are in a safe state such as Wait(0L).
Crashes - Subtasks, Interrupts
If part of your code runs on a different stack or on the system
stack, you must turn off compiler stack-checking options
(Lattice uses the -v flag to disable stack checking). If part
of your code is called directly by the system or by other tasks,
you must use the large code / large data model, or use compiler
functions or options to assure that the correct base registers
are set up for your subtask or interrupt code.
Crashes - Window Related
Be careful not to call CloseWindow() during a
while(msg=GetMsg(...)) loop on that window's port because the
next GetMsg() will be on a freed pointer. Also, use
ModifyIDCMP(NULL) with care, especially if you are using one port
with multiple windows. Be sure to ClearMenuStrip() any menus
before closing a window, and do not free items such as
dynamically allocated gadgets and menus while they are attached
to a window.
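One way to avoid closing a window inside its own message loop is
to remember the close request and act on it only after the loop
has ended. The sketch below models this in portable C. Everything
in it is a hypothetical stand-in: struct port, get_msg() and
close_window() only imitate a window port, GetMsg() and
CloseWindow() so that the control flow can be compiled and tested
anywhere.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a window's port and its messages. */
enum msg_class { MSG_REFRESH, MSG_CLOSE };

struct port {
    const enum msg_class *queue;  /* pending messages */
    size_t count;                 /* number of pending messages */
    size_t next;                  /* index of next message to fetch */
    int open;                     /* nonzero while the window exists */
    int used_after_close;         /* set if read after closing */
};

/* Stand-in for GetMsg(): next message, or NULL when empty. */
static const enum msg_class *get_msg(struct port *p)
{
    if (!p->open)
        p->used_after_close = 1;  /* real system: a freed pointer */
    if (p->next >= p->count)
        return NULL;
    return &p->queue[p->next++];
}

/* Stand-in for CloseWindow(). */
static void close_window(struct port *p)
{
    p->open = 0;
}

/* Drains the port. A close request is only remembered inside the
   loop and acted on after the last get_msg() call. Returns the
   number of messages handled. */
int handle_messages(struct port *p)
{
    const enum msg_class *msg;
    int close_requested = 0;
    int handled = 0;

    while ((msg = get_msg(p)) != NULL) {
        if (*msg == MSG_CLOSE)
            close_requested = 1;
        handled++;
    }
    if (close_requested)
        close_window(p);
    return handled;
}
```

A real program would also ReplyMsg() each message inside the loop
and would normally leave its outer event loop once the close has
been requested.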
Crashes - Workbench Only
If you are crashing near the first DOS call, either your stack is
too small or your startup code does not GetMsg() the WBStartup
message from the process message port. If your program crashes
only when started from Workbench and your startup code opens no
stdio window or NIL: file handles for Workbench programs, then
make sure you are not writing anything to stdout (e.g. printf()
) when started from Workbench (argc==0). See also ``Crashes -
After Exit'' above.
Disk Icon Won't Go Away
This occurs when a program leaves a lock on one or more of a
disk's files or directories.
Fails Only On the 68020/30
In general this occurs whenever an application inadvertently
contains a CPU dependency. The following programming practices
will lead to programs which fail on the Motorola 68020/030 (but
run OK on the 68000):
o Using the upper byte of addresses for flags.
o Doing signed math on addresses.
o Writing self-modifying code.
o Using the MOVE SR assembler instruction (use Exec GetCC()
instead).
o Using software delay or timing loops.
o Making assumptions about the order in which asynchronous
tasks will finish.
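The first item in the list above can be illustrated with a little
arithmetic. The 68000 has 24 address lines, so the top byte of an
address never reaches the bus, while the 68020/30 use all 32
bits. The function names below are hypothetical and the code is
only a portable model of what the bus sees.

```c
#include <stdint.h>

/* Packs an 8-bit flag value into the top byte of an address. */
uint32_t tag_address(uint32_t addr, uint8_t flags)
{
    return ((uint32_t)flags << 24) | (addr & 0x00FFFFFFUL);
}

/* 68000: only the low 24 bits are driven onto the bus. */
uint32_t bus_address_68000(uint32_t addr)
{
    return addr & 0x00FFFFFFUL;
}

/* 68020/30: all 32 address bits are significant. */
uint32_t bus_address_68020(uint32_t addr)
{
    return addr;
}
```

A tagged address therefore still reaches the intended location on
the 68000 but refers to an entirely different location on the
68020/30.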
Special features of the 68020/30 processors can also cause
problems for programs written on the 68000. For example, an
invalid cache entry due to DMA or other non-processor
modification of data which has already been cached; a different
exception stack frame; interrupt auto-vectors moved by VBR; the
68020/30 CLR instruction which does a single write access unlike
the 68000 CLR instruction which does a separate read and write
access (this might affect a read-triggered register in IO space -
use MOVE instead).
Fails Only On the 68000
Again, a program which fails only on certain processors contains
a CPU dependency. The following programming practices can cause
this problem:
o Software delay loops.
o Word or longword access of an odd address (illegal on the
68000).
o Assumptions about the order in which asynchronous tasks will
finish.
o Using compiler flags which have generated inline 68881/68882
math coprocessor instructions or 68020/30 specific code.
o Using the CLR instruction on a hardware register (its
behavior on the 68000 differs from the 68020/030; use
MOVE instead).
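For the odd-address item in the list above, one possible
workaround is to assemble the value from single bytes, since byte
accesses are legal at any address. The helper read_word() below
is a hypothetical name; it assumes the data is stored high byte
first, as it is in 68000 memory.

```c
#include <stdint.h>

/* Reads a 16-bit value stored high byte first, one byte at a
   time, so it works at odd as well as even addresses. */
uint16_t read_word(const uint8_t *p)
{
    return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}
```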
Fails Only on Older ROMs or Older Workbench
This can be caused by calling functions or using structures which
do not exist in the older versions of the operating system. Or
you may be asking for a library version higher than you need.
Ask for the lowest version which provides the functions your
application requires (usually 33). You should not use the #define
LIBRARY_VERSION from the include files when you open a library.
Also make sure you check OpenLibrary() calls for success (a
non-zero return value). If the library you request is not
available, exit gracefully and informatively.
Fails Only on Newer ROMs or Newer Workbench
This should not happen with proper programming. Possible causes
are:
o Running too close to your stack limits or the memory limits
of a base machine (newer versions of the operating system
may use slightly more stack in system calls, and usually use
more free memory).
o Using system functions improperly.
o Not testing function return values.
o Using improperly initialized pointers.
o Assuming that a system variable (such as a Flags field) is B
if it is not A.
o Failing to initialize formerly reserved structure fields to
zero.
o Violating Amiga programming guidelines (for example:
depending on or poking private system structures, jumping
into ROM, depending on undocumented or unsupported
behaviors).
o Failing to read the function autodocs.
Fails On CHIP-RAM-Only Machines
This is caused by specifically asking for or requiring MEMF_FAST
memory. If you don't need chip memory, ask for memory type 0L,
or MEMF_CLEAR, or MEMF_PUBLIC|MEMF_CLEAR as applicable. If
there is fast memory available, you will be given fast memory.
If not, you will get chip memory.
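A small sketch of how the memory type might be built up so that
MEMF_FAST is never demanded. The flag values repeat those in
exec/memory.h so that the sketch compiles without the Amiga
includes; memory_type() and its parameters are hypothetical.

```c
/* Flag values as defined in exec/memory.h, repeated here so the
   sketch compiles without the Amiga includes. */
#ifndef MEMF_PUBLIC
#define MEMF_PUBLIC (1L << 0)
#define MEMF_CHIP   (1L << 1)
#define MEMF_FAST   (1L << 2)
#define MEMF_CLEAR  (1L << 16)
#endif

/* Builds the requirements argument for an allocation. */
unsigned long memory_type(int needs_chip, int shared, int cleared)
{
    unsigned long type = 0L;      /* 0L: any memory will do */

    if (needs_chip)
        type |= MEMF_CHIP;        /* only for custom chip data */
    if (shared)
        type |= MEMF_PUBLIC;
    if (cleared)
        type |= MEMF_CLEAR;
    return type;                  /* MEMF_FAST is never requested */
}
```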
Fails Only on Machines with FAST RAM
Data and buffers which will be accessed directly by the custom
chips must be in chip memory. This includes bitplanes (use
OpenScreen() or AllocRaster() ), audio samples, trackdisk
buffers, and the graphic image data for sprites, pointers, bobs,
images, gadgets, etc. Use compiler or linker flags to force chip
memory loading of any initialized data that needs to be in chip
memory. You could also dynamically allocate chip memory and copy
the initialized data there.
Fails Only with Enhanced Chips
This is usually caused by writing or reading addresses past the
end of register space on older custom chips, or writing a
non-zero value to bits which are undefined in older chip
registers, or failing to mask out undefined bits when
interpreting the value read from a chip register.
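Masking can be sketched as follows. The register layout here is
purely hypothetical (a 16-bit register in which only the low four
bits are documented); substitute the documented bits of the
register you are actually using.

```c
#include <stdint.h>

/* Hypothetical layout: only the low 4 bits are documented. */
#define EXAMPLE_DEFINED_BITS 0x000Fu

/* Discards undefined bits from a value read from the register. */
uint16_t interpret_register(uint16_t raw)
{
    return (uint16_t)(raw & EXAMPLE_DEFINED_BITS);
}

/* Guarantees zeros are written to all undefined bits. */
uint16_t value_to_write(uint16_t wanted)
{
    return (uint16_t)(wanted & EXAMPLE_DEFINED_BITS);
}
```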
Fireworks
A dazzling pyrotechnic video display is caused by trashing or
freeing a copper list which is in use, or trashing the pointers
to the copper list. If you aren't messing with copper lists, see
``Crashes and Memory Corruption''.
Graphics - Corrupted Images
The bit data for graphic images such as sprites, pointers, bobs,
and gadgets must be in chip memory. Check your compiler manual
for directives or flags which will place your graphic image data
in chip memory. Alternatively you could allocate chip memory and
copy the graphic image there.
Hang - Single Program Only
Program hangs are generally caused by Wait()ing on the wrong
signal bits, on the wrong port, on the wrong message, or on some
other event that will never occur. They can also be caused by
verify deadlocks. Be sure to turn off all Intuition VERIFY
messages (such as MENUVERIFY) before calling AutoRequest() or
doing disk access.
Hang - Whole System
This is generally caused by a Disable() without a corresponding
Enable(). It can also be caused by memory corruption, especially
corruption of low memory. See ``Crashes and Memory Corruption''
above.
Memory Loss
First, make sure that your program is actually causing the memory
loss. Boot with a normal Workbench disk whose
s:startup-sequence LoadWB command line has been changed to
LoadWB -debug. It is important to boot with a standard
Workbench because some third party applications such as
background utilities, shells, and network handlers dynamically
allocate and free memory. Arrange all windows so that part of
the Workbench backdrop window is accessible and so that no
window rearrangement will be needed to run your program. Select
flushlibs from the rightmost Workbench menu. Any disk-loaded
fonts, libraries, devices, etc. that are not currently open will
be flushed from memory. Wait a few seconds, then click on the
Workbench backdrop. Write down the amount of free memory
displayed in the Workbench title bar. Now without rearranging
any windows, run your program and use all of the program
features. Exit your program, wait a few seconds, then click on
the Workbench backdrop. Now select flushlibs, wait a few
seconds and write down this final free amount. If this matches
the first value you wrote down, then your program is fine and is
not causing a memory loss.
If memory was actually lost and your program can be run from CLI
or Workbench, then try the above procedure with both methods of
starting your program. See ``Memory Loss - CLI Only'' and
``Memory Loss - Workbench Only'' as appropriate.
If you lose memory from both Workbench and CLI, then make sure
all calls to functions which open/allocate/create/lock have a
matching call to the corresponding close/free/delete/unlock
function (there are a few system calls that do not require a
corresponding free - check the autodocs). Generally, the
close/free/delete/unlock calls should be in the opposite order of
the allocations.
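The matching and ordering of cleanup calls can be sketched in
portable C. Here malloc() and free() stand in for any Amiga
open/allocate and close/free pair, and run() is a hypothetical
name. Each resource is released only if it was obtained, and in
the opposite order of acquisition.

```c
#include <stdlib.h>

/* Acquires two resources, uses them, and releases them in the
   opposite order. Returns 1 on success, 0 on failure. */
int run(size_t size)
{
    int ok = 0;
    char *first = malloc(size);          /* acquired first */

    if (first != NULL) {
        char *second = malloc(size);     /* acquired second */

        if (second != NULL) {
            first[0] = second[0] = 0;    /* the real work goes here */
            ok = 1;
            free(second);                /* released first */
        }
        free(first);                     /* released last */
    }
    return ok;
}
```

Nesting the tests this way means that no cleanup call is ever
reached for a resource that failed to open.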
If you are losing a small, fixed amount of memory, look for a
structure of that size in the Structure Offsets listing in the
Includes and Autodocs manual. For example, a loss of exactly 24
bytes is probably a Lock which has not been UnLock()ed. If you
are using ScrollRaster(), be aware that ScrollRaster() left or
right in a SUPERBITMAP window with no TmpRas will currently lose
memory (workaround - attach a TmpRas). If you lose much more
memory when started from Workbench than from the CLI, make sure
your program is not using Exit(n). This would bypass startup
code cleanups and prevent a Workbench-loaded program from being
unloaded. Use exit(n) instead.
Memory Loss - CLI Only
Some third-party shells dynamically allocate history buffers, or
cause other memory fluctuations. Also, if your program executes
different code when started from CLI, check that code and its
cleanup. And check your startup.asm if you wrote your own.
Memory Loss - Ctrl-C Exit Only
This occurs when you have Amiga-specific resources allocated and
you have not disabled your compiler's automatic Ctrl-C handling
(causing all of your program clean-ups to be skipped). Disable
the compiler's Ctrl-C handling and handle Ctrl-C yourself.
Memory Loss - During Execution
A continuing memory loss during execution can be caused by
failure to keep up with all of the IDCMP messages (such as
MOUSEMOVE) that you request from Intuition. Intuition cannot
reuse IDCMP message blocks until you call ReplyMsg() on them.
If your window's allotted message blocks are all in use, new ones
will be allocated and not freed until the window is closed.
Continuing memory losses can also be caused by a program loop
containing an allocation/open type call without a corresponding
free.
Memory Loss - Workbench Only
This is often caused by the failure of your code to unload after
you exit. Make sure that your code is being linked with a
correct, standard startup module and do not use the Exit(n)
function to exit your program. The Exit(n) function will bypass
your startup code's cleanup, including its ReplyMsg() of the
WorkbenchStartup message (this signals Workbench to unload your
program from memory). You should exit via exit(n) where n is a
valid DOS error code such as RETURN_OK (libraries/dos.h). You
may also exit with a final closing brace "}" or with the return
statement. Assembler programmers using startup code can JMP to
_exit with a long return value on the stack or use the RTS
instruction.
Menu Problems
A flickering menu is caused by leaving a pixel or more space
between menu subitems in your menu structures. Crashing after
browsing a menu (looking at a menu without selecting any items) is
caused by not properly handling MENUNULL select messages.
Multiple selection not working is caused by improper handling of
the NextSelect field. See the Menus chapter from the
Intuition manual for more details.
Out-of-Sync Response to Input
This is caused by failing to handle all received signals or
messages after waking up from a Wait() or WaitPort() call. More
than one event or message may have caused your program to be
awakened. Check the signals returned by Wait() and act on every
one that is set. At ports which may have more than one message
(such as a window's IDCMP port) you must handle the messages in a
while(msg=GetMsg(...)) loop.
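Acting on every signal that is set might look like the sketch
below. The signal masks and the counters structure are
hypothetical; a real program would take its signal bits from its
message ports or from AllocSignal() and would pass in the mask
returned by Wait().

```c
/* Hypothetical signal masks for two event sources. */
#define SIG_WINDOW (1UL << 16)
#define SIG_TIMER  (1UL << 17)

struct counters {
    int window_events;
    int timer_events;
};

/* Handles every signal present in the mask. Separate if tests are
   used rather than else-if, because more than one bit may be set. */
void dispatch(unsigned long received, struct counters *c)
{
    if (received & SIG_WINDOW)
        c->window_events++;       /* drain the window port here */
    if (received & SIG_TIMER)
        c->timer_events++;        /* handle the timer reply here */
}
```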
Performance Loss in Other Processes
This is often caused by a program doing one of the following:
o Busy waiting or polling.
o Running at a higher priority.
o Doing lengthy Forbid()s, Disable()s, or interrupt handling.
Sound Samples Won't Play Correctly
The data for audio samples must be in chip memory. Check your
compiler manual for directives or flags which will place your
audio sample data in chip memory. Also, you can dynamically
allocate chip memory and copy or load the audio sample there.
Trackdisk Data Not Transferred
This may occur if your trackdisk buffers are not in chip memory.
Windows - Borders Flicker after Resize
Set the NOCAREREFRESH flag. Even SMART_REFRESH windows can
generate refresh events if there is a sizing gadget, so if you
don't have specific code to handle this, you must set the
NOCAREREFRESH flag. If you do have refresh code, be sure to use
the Begin/EndRefresh() calls. Failure to do one or the other
will leave Intuition in an intermediate state and slow down
operation for all windows on the screen.
GENERAL DEBUGGING TECHNIQUES
Isolate the problem by using printf() to find the section of code
in which the problem occurs. If you cannot display messages on
the screen, use kprintf() to send messages to the serial port or
dprintf() for the parallel port (see Linker Library
documentation). Check the initial values, allocation, use, and
freeing of all pointers and structures used in the problem area.
Also make sure that all of your system and internal function
calls pass correct initialized arguments and that all possible
error returns are checked for and handled.
Use Debugging Tools
A variety of debugging tools are available to help locate faulty
code. There are source level debuggers (such as Lattice's
CodePRobe), crash interceptors (such as GOMF), memory watchdogs
like MemWatch and WatchMem, and other helpful tools like MemMung,
Avail, WBFrags, etc.
Test With Different Configurations
Test your program on a wide variety of systems and
configurations. Programs with coding errors may appear to work
properly on one configuration but may fail or cause fatal
problems on another. Make sure that your code is tested on both
the 68000 and the 68020/30, on machines with and without fast
memory, and on machines with and without enhanced chips. Test
all of your program functions on every machine.
Test All Error and Abort Code
A program with missing error checks or unsafe cleanup might work
fine when all of the items it opens or allocates are available,
but may fail fatally when an error or problem is encountered.
Try your code with missing files, filenames with spaces,
incorrect filenames, cancelled requesters, Ctrl-C, missing
libraries or devices, low memory, missing hardware, etc.
Test all of your text input functions with international ASCII
characters (such as the character produced by pressing ALT-F then
A). Rawkey codes produce different keyboard characters on the
various national keyboards (higher levels of keyboard input are
automatically translated to the proper characters). If your
program will be distributed internationally, support and take
advantage of the additional screen lines available on a PAL
system. On A2000s with the enhanced Agnus chip, a PAL display
can be selected via motherboard jumper J102. Note that a base
PAL machine will have less memory free due to the larger display
size.