Day 20: Debugging

No matter how well you design your program and how carefully you write the code, sooner or later you will find a bug. Something doesn't work as intended, the program is slower than it should be, data is corrupted, or the program crashes. The difference between a poor programmer and a good one is not whether there are bugs in the code, but how quickly the bugs are found and squashed, and whether any survive into the shipping product.

Today you will learn

Examining Bugs, Errors, and Design Problems

There are many reasons why a program may not behave as intended. Not all these reasons are bugs; some are problems in the design, architecture, or logic of the program. Other reasons are misimplemented features. Some problems are, in fact, simply bugs: errors in syntax or the semantics of what you were trying to do.

Looking At the Cost of Bugs

Industry estimates are that the overwhelming cost of software development is in the testing, debugging, and maintenance stages--not in the design and development! More important, repeated studies have shown beyond any contention that the later in the development process a bug is detected, the more expensive it is to fix.

Bugs caught by Quality Assurance engineers are less expensive to fix than bugs caught by the customer. Bugs caught by the developer are less expensive to fix than those found by QA, and bugs found in the design phase are less expensive than those found during the coding phase.

Writing a Good Specification

Many bugs are really a disagreement or misunderstanding between the developer and the person providing the specification for the program. Informal specifications particularly are subject to this type of problem. The program does exactly what the developer intended; unfortunately, this is not what the client wants.

The best time to fix this bug is before you start your design. To the extent possible, get a complete specification. Most clients, and a surprising number of developers, have no idea how to create a good written specification.

The first and most important thing about a specification is that it is written. If you are designing a program for a graphical user interface (GUI), be sure to include screen shots.

Your specification should detail the entire user experience, including what the user will see and what the user can enter, as well as all the potential events and their effects. An event is any action to which the program must react, such as user input, mouse movements, modem detection, and so on.

The specification should detail any filtering or validation you must do on user input, and how the program is to react to normal and abnormal user actions. The specification should explain the exact hardware requirements (the user will need an 80386 or more powerful computer with at least 16MB of RAM, and so on) and how the program should respond when the required hardware is not present.

The specification should go on to detail how the program responds to errors, exceptions, and other predictable (and not so predictable) problems. The specification should, in short, provide details on how the program works from the user's perspective.

Notice that this specification says nothing about how you will architect or implement the program. I work with a graphic artist named David Rollert, and he once told me that his clients will often say, "Use more red here..." He asks them to tell him what they are trying to accomplish. "Do you want it to look prettier, more soothing, more noticeable..." he asks them. The idea is that his client should specify what effect they want and leave the implementation to him. After all, that is the skill for which they hired him.

Similarly, ask your clients to tell you what behavior and characteristics they want (high performance, quick response, robustness, and so on) and let them leave the implementation to you; you are the C++ expert! Don't misunderstand a user who says "I want you to keep an array of employee IDs" to mean that he really wants you to use an array; he just wants you to be able to save and access the IDs when they are needed. How you manage the magic is your business.

Writing the Design Document

After you have a reasonably frozen specification, it is time to write the design document. This document should specify how you intend to organize your program, what the principal classes are, and how the data will be preserved. The size of the design document (and all the other documentation mentioned here) should correspond to the overall size of the program. It makes no sense to create a 30-page design for a 100-line program; but it also makes no sense to launch a 50-programmer effort based on notes sketched on a scrap of paper.

Considering Reality and Schedules

The reality of real-world programming is that the specification almost never is completed before the code (let alone the design document) is written. Further, in our rush to meet deadlines, the design document often is skimped on, and even when fleshed out reasonably well, it quickly becomes obsolete. That said, to the extent that you can get everyone working from a shared set of assumptions, you will be able to avoid many of the pitfalls that cause significant delays in so much of the industry.

Using Debuggers

My dad used to tell me the story of why the leaky roof was never fixed. When it was raining, you couldn't fix the roof because it was wet, and the shingles wouldn't set correctly. And on sunny days, he would explain, no one was motivated to go up on the roof when they could sit in the backyard and enjoy the weather!

Most of us treat debuggers like a leaky roof. When our code is working properly, we have little motivation to sit down and learn how to use the debugger. When the code is broken, we have no time to learn new skills because we have to fix that bug! The truth, however, is that even rudimentary skill with a real debugger can save you hours and hours of effort.

Before launching into an explanation of using a world-class debugger, let me take a brief tangent into the alternative techniques used by many programmers.

Use the Source, Luke

The simplest and often the most effective way to find bugs is--surprise!--to read the code. You would be surprised how often a bug can be found on inspection, just by carefully re-reading what you have written. One technique that works very well for me, especially when I'm completely stuck, is to explain to someone else why what I'm seeing makes no sense and why there cannot possibly be a bug in this particular section of code.

Invariably, one of two things happens: I prove the bug really cannot be here, and I go figure out where it is, or I smack myself on the head and say, "Oh! Now that I said it out loud, of course it is right here!"

Use TRACE Macros

The most popular method for finding bugs among old C hackers probably was to put printf statements into the code, which printed out the current value of variables so that the programmer could examine the values stored at various memory locations. Of course, C++ programmers never would be so crude as to use printf. We use cout instead, but the advantages and disadvantages remain: You see what is going on, but your code becomes bloated and unreadable.

The solution to this code bloat is to create a TRACE macro. Typically, a TRACE macro takes a string as a parameter. In debug mode, TRACE passes the string to cout; in the release version, however, it does nothing.

Using TRACE throughout your code is a crude way to manage debugging, and your debugger provides a better alternative, but TRACE does have an important role to play nonetheless. Many programmers use TRACE macros to report on run-time failures, and to provide a running commentary on which areas of their code have been entered and exited.
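A minimal sketch of such a macro might look like the following. The DEBUG flag and the Divide() function are assumptions made for illustration, and the sketch uses the standard <iostream> header rather than the older iostream.h that appears in the listings later in this chapter.

```cpp
#include <iostream>

// Assumption: debug builds define DEBUG (for example, with -DDEBUG).
// Without it, TRACE expands to a statement that does nothing.
#ifdef DEBUG
#define TRACE(msg) (std::cout << __FILE__ << "(" << __LINE__ << "): " << (msg) << std::endl)
#else
#define TRACE(msg) ((void)0)
#endif

// Hypothetical function that reports its progress through TRACE.
int Divide(int numerator, int denominator)
{
    TRACE("entering Divide");
    if (denominator == 0)
    {
        TRACE("Divide: denominator is zero, returning 0");
        return 0;
    }
    TRACE("leaving Divide");
    return numerator / denominator;
}
```

Because the release version of the macro discards its argument entirely, the argument should never contain code that the program depends on.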

Use Logs

An effective adjunct to TRACE macros is to create a log file, and to write progress messages and other status messages to the log. You can create macros that write to the log only in debug mode, or only when certain flags are set. Often, reduced but vital information will be logged even in the release version of a product, enabling tech support to use the logs to diagnose problems encountered by customers.

If you are writing logging macros, you may want to consider creating a variety of numbered logs. This method will enable you to turn on and off logging by section or by level, as your needs change.

Another important consideration in creating logs is buffering. Most operating systems allow for buffered and unbuffered output, and it is important to consider this when creating your logs. If all your logging uses a write-through, unbuffered mechanism, your program will slow as it writes to the disk. If, on the other hand, you make all your logging buffered, you run the risk that you will lose vital information when your program crashes.

Using the Debugger

Nearly all modern development environments include one or more high-powered debuggers. The essential idea behind these tools is this: You run the debugger, which loads your source code, and then you run your program from within the debugger. This method enables you to see each instruction in your program as it executes, and to examine each variable as it changes during the life of your program.

All compilers will enable you to compile with or without symbols. Compiling with symbols tells the compiler to create the necessary mapping between your source code and the generated program; the debugger uses this to point to the line of source code that corresponds to the next action in the program.

This point is critical and often misunderstood by programmers: Your source code has been translated into object code and then into machine language by the time you run your program. Each machine-language instruction corresponds to one assembler instruction, and early debuggers showed your source code only in assembler.

Modern compilers and debuggers work together to enable you to step through each line of your C++ code as if each line were one machine instruction. The reality, however, is that each line represents many assembler instructions, and most debuggers will reveal this to you if you ask for mixed source and assembler.

Using Symbolic Debuggers

Full-screen symbolic debuggers make walking your code a delight. After you load your debugger, it reads through all your source code and shows the code in a window. You can step over function calls or direct the debugger to step into each function, line by line.

With most debuggers, you can switch between the source code and the output to see the results of each executed statement. You can examine the current state of each variable, look at complex data structures, examine the value of member data within classes, and look at the actual values in memory of various pointers and other memory locations.

You can switch to mixed source and step through each assembler instruction, and you usually can jump over instructions or manipulate the value of memory locations. You also can instruct your debugger to race through the code, stopping only when it reaches a particular location, when a memory variable has a certain value, or when it has iterated over a section of code a specified number of times.

Using Break Points

Break points are instructions to the debugger that when a particular line of code is ready to be executed, the program should stop. Break points enable you to run your program unimpeded until the line in question is reached. They also help you analyze the current condition of variables just before and after a critical line of code.

Using Watch Points

It is possible to tell the debugger to show you the value of a particular variable, or to break when a particular variable is written to or read. Watch points enable you to set these conditions, and at times to even modify the value of a variable while the program is running.

Examining Memory

At times, it is important to see the actual values held in memory. Modern debuggers can show values in the form of the actual variable; that is, strings can be shown as characters, longs can be shown as numbers rather than as 4 bytes, and so on. Sophisticated C++ debuggers even can show complete classes, providing the current value of all the member variables, including the this pointer.

Using a Call Stack

At any break point in your program, you can examine the call stack. The call stack will display the series of function calls leading up to the current function. This can answer that age-honored question of so many programmers, "How the devil did we get here?"

It is important, however, to note that the call stack gives you less information than you might otherwise suspect. Remember that when a function is called and then returns, it falls off the call stack. It therefore is possible for main() to call Func1(), and for Func1() to call Func2(). Func2() can return, and then Func1() can call Func3(). The call stack at that point will not include Func2(), even though it is true that Func2() is called before Func3() is called.

If you are trying to prove that some field is initialized in Func2() before Func3() is called, the call stack may not be able to help you. The right way to track this down is to put a break point in Func2().
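The following sketch imitates the behavior described above with a hand-rolled record of active functions. The StackEntry class and the two global variables are hypothetical helpers, not part of any debugger. main() is not tracked here, so the record starts at Func1().

```cpp
#include <cstddef>
#include <string>
#include <vector>

// A hand-rolled stand-in for the debugger's call stack (hypothetical).
std::vector<std::string> gCallStack;

// What the stand-in call stack looked like while Func3() was running.
std::string gSnapshot;

// Pushes a function name on construction and pops it on destruction,
// mirroring what happens to a real stack frame on call and return.
class StackEntry
{
public:
    explicit StackEntry(const std::string& name) { gCallStack.push_back(name); }
    ~StackEntry() { gCallStack.pop_back(); }
};

void Func2()
{
    StackEntry entry("Func2");   // popped again as soon as Func2() returns
}

void Func3()
{
    StackEntry entry("Func3");
    gSnapshot.clear();
    for (std::size_t i = 0; i < gCallStack.size(); ++i)
    {
        if (i > 0)
            gSnapshot += " -> ";
        gSnapshot += gCallStack[i];
    }
}

void Func1()
{
    StackEntry entry("Func1");
    Func2();   // runs to completion first
    Func3();   // the snapshot is taken in here
}
```

After a call to Func1(), gSnapshot holds "Func1 -> Func3". Func2() ran first, but it left no trace in the record, which is why a break point in Func2() is the better tool for that question.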

Turning Off Optimization

Modern compilers do a wonderful job of optimizing your code to be smaller and faster. They accomplish this by reorganizing your code at compile time. Although this is very desirable in your shipping code, it will undermine completely your efforts to step through and debug your code. When you create your debug versions, be sure to turn off all optimizations.

Zen Mind, Debugging Mind

It has been said in dozens of books and is no less true for being commonplace: Debugging is a state of mind. Moreover, the mind-set needed for debugging is quite different from that needed for programming.

Although leaps of intuition always are welcome in all creative fields, and no less so in programming, debugging is best accomplished as a methodical process of hypothesis, experimentation, and evaluation. The great dangers in debugging come from unexamined assumptions and sloppy evaluation. Often, a programmer will whiz right by the bug as he pursues a half-baked theory before all the evidence is evaluated.

Defining the Problem

The first step in successful debugging is to define the problem: What is the bug you are after? What are its manifestations? What might cause such a problem?

There are, I must confess, some shortcuts along the way. If your code crashes, for example, it is likely that you are writing into memory you don't own. This is a good indication that you may have your pointers wrong. Look for things like writing past the end of an array, deleting an already deleted pointer, using a deleted pointer, and so on.

If your bug appears suddenly when you change a different area of code, you might want to consider whether you have a pointer problem. Problems that come and go when memory changes often are pointer problems that only show up when different areas of memory are being corrupted. That is, the problem never really goes away, but it is masked when unimportant memory is corrupted, and it is manifest when critical memory is stepped on.

Locating, Isolating, and Destroying the Problem

The crucial trick in finding a bug is to isolate the problem, eliminate the extraneous and confounding variables, and zero in on the problem. If you are seeing a crash in a particular section of code, try commenting out much of the code until you find the responsible code.

Often, you can save time by conducting an impromptu binary search of your code. If you have narrowed the problem to a particular function, for example, comment out the bottom half of the function. If the problem goes away, uncomment the bottom and comment out the top. If you prove that the problem is in the bottom, leave the top commented and comment out half of the bottom half, and so on.

To the extent that you can eliminate areas of code, variables, and other factors, you can simplify the problem. A problem sufficiently simplified may offer up its own solution on inspection. Even if this doesn't happen, the fewer variables you are contending with, the likelier it is you will be able to quickly solve the problem.

Knowing What to Look For

After you narrow your problem to a specific area of code, what do you do next? The first thing to do is to open a watch window and examine the value of all your variables. Do the values make sense? Are your pointers pointing to reasonable areas of memory?

Does your bug occur each time or only after so many iterations? Examine your call stack; are you getting here along the path you expect? Open the assembler window; are you executing the code you think you are? Examine the registers; are you in the section of code and memory you think you are?

If data is getting unexpectedly trashed, examine the stack pointer. Are you overrunning the stack? Are you running with stack trace on? Most compilers will enable you to trace the stack and will issue a warning in your debugger if you run past the top of your stack.

Where are you storing your variables? In segmented memory (that is, in DOS-based applications), there are near and far memory, and you have much less near memory than far memory. Is it possible that you are running out of near memory?

Questioning Your Assumptions

One of the great benefits of interactive debugging is that you can question and reexamine your assumptions. "That variable must be greater than zero." Oh? A debugger can tell you whether you are right. "I initialized that pointer already." A debugger can show you the exact address stored in that pointer.

Looking At Logic Flaws

Often, a bug is not a mistake in syntax or an overwritten array, but instead a flaw in your algorithm. These flaws can be the hardest to find. You may be searching for a "mistake," for example, and thus not realize that the error is not a misplaced semicolon but instead a lapse in understanding.

Logic flaws can be found in complex sections of your code, and in very simple sections where unexpected starting or ending conditions can throw off your reasoning. These flaws often can be found by having someone else check your reasoning. Explain to another programmer what you are doing; even before she points out the error of your ways, you may realize it yourself.

Finding Bugs That Only Show Up in Release Code

Sometimes, your code will work perfectly in debug mode but will break in the release version. This can drive you crazy--just when you thought it was safe to go back into the water...

The first thing to look for when this happens is debug-mode macro side effects. As explained on day 19, for example, your ASSERT macro usually evaluates to nothing in the release version. If you write

    BigClassPtr * pBigClass = getBigClass();
    ASSERT(pBigClass);

the assertion will disappear from the release version, leaving behind absolutely no assembler instructions. This is just what you want and will have no bad effects on your program.

However, if you write

    BigClassPtr * pBigClass = 0;
    ASSERT ((pBigClass = getBigClass()) != NULL);

in debug mode, this code will have the same effect as the preceding code. In non-debug mode, however, the ASSERT macro will evaluate to nothing, and the assignment will not be made; this will leave the pointer, pBigClass, pointing to nothing and initialized to zero.
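The difference can be reproduced with a small experiment. The ASSERT macro below is a simplified stand-in (the macro from day 19 may differ), the DEBUG flag is an assumption, and BigClass is an empty placeholder class.

```cpp
#include <cstdlib>
#include <iostream>

// Simplified stand-in for ASSERT; active only when DEBUG is defined.
#ifdef DEBUG
#define ASSERT(x) \
    do { if (!(x)) { std::cerr << "ASSERT failed: " #x << std::endl; std::abort(); } } while (0)
#else
#define ASSERT(x) ((void)0)
#endif

class BigClass {};

BigClass* getBigClass()
{
    static BigClass theBigClass;
    return &theBigClass;
}

// Risky: the assignment lives inside the macro argument and vanishes
// along with the assertion when DEBUG is not defined.
BigClass* AcquireInsideAssert()
{
    BigClass* pBigClass = 0;
    ASSERT((pBigClass = getBigClass()) != 0);
    return pBigClass;
}

// Safe: do the work first, and then assert on the result.
BigClass* AcquireThenAssert()
{
    BigClass* pBigClass = getBigClass();
    ASSERT(pBigClass != 0);
    return pBigClass;
}
```

Compiled without DEBUG, AcquireInsideAssert() returns a null pointer and AcquireThenAssert() returns a valid one. Compiled with DEBUG, both return the same valid pointer, which is exactly why the bug hides in the debug version.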

Watching for Some Common Bugs

Some bugs are so common, even in the code of experienced programmers, that it is worthwhile to keep them in mind every time you are debugging. The next few sections present a partial list. Over time, you will create your own dirty dozen of repeated nasty errors.

Making Fence-Post Errors

No matter how long we code, we still make this bush-league error. Declare an array of 100 items, and then try to access myArray[100]. Or, as an example of a somewhat more subtle error, declare a buffer to hold a C-style, null-terminated string of 100 bytes, and then copy 100 bytes into it, leaving no room for the terminating null.

Switching back and forth between C++ string objects and C-style character arrays can generate fence-post errors, as can passing pointers and then using pointer arithmetic to access members of a predefined array.

Deleting Memory Twice, or Not at All

Memory management can be confusing and enervating. It is imperative that you keep careful track of "who" is responsible for owning a block of memory, and for restoring that memory when the last pointer is destroyed (and not before).

Remember to zero initialize all pointers that you aren't using, and to set pointers to zero after they are deleted. Of course, all of that does you no good if you don't test the pointer before using it.
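As a sketch of this habit, the hypothetical helpers below delete an object, zero the caller's pointer in the same step, and test a pointer before using it.

```cpp
// Hypothetical helper: deletes the object and zeroes the caller's pointer.
template <class T>
void SafeDelete(T*& pointer)
{
    delete pointer;   // deleting a null pointer is harmless
    pointer = 0;
}

// Tests the pointer before using it; returns fallback if it is null.
int ReadValue(const int* pointer, int fallback)
{
    return pointer ? *pointer : fallback;
}
```

Calling SafeDelete() twice on the same pointer is safe, because the second call sees a null pointer. Note that this protects only the one pointer variable that is passed in; any other pointer to the same block still dangles, which is why clear ownership matters.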

Wrapping around an Integer

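On a system with 16-bit integers, an unsigned int can hold any value from 0 through 65,535. When you declare an int variable, however, you actually declare a signed integer whose largest value is 32,767. On most such systems, any value past that limit will "wrap around" to a very large negative number, although the language does not guarantee what happens when a signed integer overflows. Wider integers have the same problem at larger limits. This effect can cause surprisingly subtle bugs.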
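The 16-bit limits can be observed on any system by using the fixed-width types from <cstdint>, as in this sketch. The function names are hypothetical. The signed result shown assumes the two's-complement representation used by virtually all current compilers.

```cpp
#include <cstdint>

// Unsigned arithmetic wraps around modulo 65,536 for a 16-bit type.
std::uint16_t AddOne(std::uint16_t value)
{
    return static_cast<std::uint16_t>(value + 1);
}

// Reinterprets a 16-bit unsigned value as a 16-bit signed value.
// Values above 32,767 come back negative on two's-complement systems.
std::int16_t AsSigned(std::uint16_t value)
{
    return static_cast<std::int16_t>(value);
}
```

Here AddOne(65535) returns 0, and AsSigned(32768) returns -32768, the "very large negative number" that appears when a count creeps one past 32,767.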

Returning a Reference to a Local Variable

References can simplify greatly the syntax of your function, but beware of returning a reference to an item on the stack. A variant on this is to return a reference to a deleted item. Remember, a reference can never refer to a null object.
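The sketch below contrasts the mistake with two safe alternatives. The function names are hypothetical, and the faulty version is left commented out because using its result would be undefined.

```cpp
#include <string>

// WRONG (left commented out): greeting is destroyed when the function
// returns, so the caller would receive a reference to a dead object.
//
// const std::string& MakeGreetingBad(const std::string& name)
// {
//     std::string greeting = "Hello, " + name;
//     return greeting;
// }

// Safe: return by value, so the caller receives its own object.
std::string MakeGreeting(const std::string& name)
{
    std::string greeting = "Hello, " + name;
    return greeting;
}

// Also safe: the returned reference names an object that the caller owns
// and that therefore outlives the call.
std::string& AppendExclamation(std::string& text)
{
    text += '!';
    return text;
}
```

Many compilers will warn about the commented-out version, but they cannot catch every variant, such as a reference to an object that is deleted later.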

Memory Checking

Although the debugger is certainly your first line of defense against programming bugs, there now are new and powerful methods that you can use. A whole class of special boundary- and memory-checking utilities exists, exemplified by NuMega Technologies' Bounds Checker utility.

Like a debugger, Bounds Checker is run first, and Bounds Checker then runs your program. As your code executes, Bounds Checker is looking for fence-post errors, memory overwrites, memory leaks, and so on. This can be a very effective way to find problems in your code and can save you many hours of painful debugging.

I have nothing to do with NuMega Technologies, except that I'll never write another DOS or Windows program without first checking it with Bounds Checker. It has saved me hundreds of hours of effort with its well-crafted tools.

A Word about Rest

If I had not experienced this more than once, and were it not for the fact that dozens of other programmers have reported the same thing, I wouldn't bother saying it here. You cannot debug a complex program when you are exhausted.

It happens to all of us: The deadline is imminent, it is 2 a.m., you have been at it for 16 hours straight, and the program will not work. You simply know that another hour's work will find the problem. Unfortunately, more often than not, what really happens is that the program begins to unravel as you make stupid and irreversible mistakes.

There is a tried and true technique for solving this problem, and it works surprisingly well. Save your file, get up, walk away, and get six hours of sleep. More often than not, you will come back and fix the problem in less than an hour.

This technique works for two reasons. First, you need to be rested. Second, you well may have been processing the problem for much of the time you were away. Learn the lesson from the experience of others: Sometimes a break and a good night's sleep is what is needed.

Looking At Some Debugging Examples

Finding example code to teach debugging techniques is particularly difficult. Trivial programs lend themselves to solutions on inspection, and complex problems mask the debugging techniques behind the obscurity of the program itself. Further, true debugging requires a full and visceral understanding of the program. Trying to debug someone else's code (or your own code after you have been away from it for a long time) is particularly difficult.

Nonetheless, the following listings provide a starting point for exploration. Listing 20.1 is the interface to a String class not unlike that used throughout this book. Listing 20.2 is the implementation, and listing 20.3 is a simple driver program.

Listing 20.1 Interface to String Class

    #include <iostream.h>
    #include <string.h>
    #define UINT unsigned int
    enum BOOL { FALSE, TRUE };

    class xOutOfBounds {};

    class String
    {
    public:

        // constructors
        String();
        String(const char *);
        String (const char *, UINT length);
        String (const String&);
        ~String();

        // helpers and manipulators
        UINT GetLength() const { return itsLen; }
        BOOL IsEmpty() const { return (BOOL) (itsLen == 0); }
        void Clear();                // set string to 0 length

        // accessors
        char operator[](UINT offset) const;
        char& operator[](UINT offset);
        const char * GetString() const { return itsCString; }

        // casting operators
        operator const char* () const { return itsCString; }
        operator char* () { return itsCString; }

        // append operators
        void operator+=(const String&);
        void operator+=(char);
        void operator+=(const char*);

        // comparison operators
        BOOL operator<(const String& rhs) const;
        BOOL operator>(const String& rhs) const;
        BOOL operator<=(const String& rhs) const;
        BOOL operator>=(const String& rhs) const;
        BOOL operator==(const String& rhs) const;
        BOOL operator!=(const String& rhs) const;

        // concatenation operators
        String operator+(const String&);
        String operator+(const char*);
        String operator+(char);

        // friend functions
        friend ostream& operator<< (ostream&, const String&);

    private:
        // returns 0 if same, a negative value if this is less than argument,
        // a positive value if this is greater than argument
        int StringCompare(const String&) const;  // used by Boolean operators

        char * itsCString;
        UINT itsLen;
    };

Listing 20.2 Implementation of String

    #include "string.hpp"

    // default constructor creates string of 0 bytes
    String::String()
    {
       itsCString = new char[1];
       itsCString[0] = '\0';
       itsLen=0;
    }

    String::String(const char *rhs)
    {
       itsLen = strlen(rhs);
       itsCString = new char[itsLen+1];
       strcpy(itsCString,rhs);
    }

    String::String (const char *rhs, UINT length)
    {
       itsLen = strlen(rhs);
       if (length < itsLen)
          itsLen = length;  // max size = length
       itsCString = new char[itsLen+1];
       memcpy(itsCString,rhs,itsLen);
       itsCString[itsLen] = '\0';
    }

    // copy constructor
    String::String (const String & rhs)
    {
       itsLen=rhs.GetLength();
       itsCString = new char[itsLen+1];
       memcpy(itsCString,rhs.GetString(),itsLen);
       itsCString[rhs.itsLen]='\0';
    }

    String::~String ()
    {
       Clear();
    }

    void String::Clear()
    {
       delete [] itsCString;
       itsLen = 0;
    }

    // non constant offset operator
    char & String::operator[](UINT offset)
    {
       if (offset > itsLen)
       {
          //throw xOutOfBounds();
          return itsCString[itsLen-1];
       }
       else
          return itsCString[offset];
    }

    // constant offset operator
    char String::operator[](UINT offset) const
    {
       if (offset > itsLen)
       {
          //throw xOutOfBounds();
          return itsCString[itsLen-1];
       }
       else
          return itsCString[offset];
    }

    // changes current string, returns nothing
    void String::operator+=(const String& rhs)
    {
       unsigned short rhsLen = rhs.GetLength();
       unsigned short totalLen = itsLen + rhsLen;
       char *temp = new char[totalLen+1];
       UINT i;   // declared outside the loops; the second loop continues from it
       for (i = 0; i<itsLen; i++)
          temp[i] = itsCString[i];
       for (UINT j = 0; j<rhsLen; j++, i++)
          temp[i] = rhs[j];
       temp[totalLen]='\0';
       *this = temp;
    }

    int String::StringCompare(const String& rhs) const
    {
       return strcmp(itsCString, rhs.GetString());
    }

    String String::operator+(const String& rhs)
    {
       char * newCString = new char[GetLength() + rhs.GetLength() + 1];
       strcpy(newCString,GetString());
       strcat(newCString,rhs.GetString());
       String newString(newCString);
       return newString;
    }

    String String::operator+(const char* rhs)
    {
       char * newCString = new char[GetLength() + strlen(rhs)+ 1];
       strcpy(newCString,GetString());
       strcat(newCString,rhs);
       String newString(newCString);
       return newString;
    }

    String String::operator+(char rhs)
    {
       int oldLen = GetLength();
       char * newCString = new char[oldLen + 2];
       strcpy(newCString,GetString());
       newCString[oldLen] = rhs;
       newCString[oldLen+1] = '\0';
       String newString(newCString);
       return newString;
    }

    BOOL String::operator==(const String& rhs) const
       { return (BOOL) (StringCompare(rhs) == 0); }
    BOOL String::operator!=(const String& rhs) const
       { return (BOOL) (StringCompare(rhs) != 0); }
    BOOL String::operator<(const String& rhs) const
       { return (BOOL) (StringCompare(rhs) < 0); }
    BOOL String::operator>(const String& rhs) const
       { return (BOOL) (StringCompare(rhs) > 0); }
    BOOL String::operator<=(const String& rhs) const
       { return (BOOL) (StringCompare(rhs) <= 0); }
    BOOL String::operator>=(const String& rhs) const
       { return (BOOL) (StringCompare(rhs) >= 0); }

    ostream& operator<< (ostream& ostr, const String& str)
    {
       ostr << str.itsCString;
       return ostr;
    }

Listing 20.3 Driver Program

    1:     #include <iostream.h>
    2:     #include "string.hpp"
    3:
    4:     int main()
    5:     {
    6:
    7:        String *CatName = new String("Fritz");
    8:        char buffer[100];
    9:        cout << "The cat name is " << *CatName;
    10:       cout << "\nEnter a new name for the cat: ";
    11:       cin.getline(buffer,100);
    12:       String *NewName = new String(buffer);
    13:       *CatName = *NewName;
    14:       delete NewName;
    15:
    16:       // String * MyName = new String("Jesse Liberty");
    17:
    18:       cout << "The cat name is " << *CatName;
    19:       return 0;
    20:    }
Output (line 16 commented out):
    The cat name is Fritz
    Enter a new name for the cat: Fido
    The cat name is Fido

Output (line 16 uncommented):
    The cat name is Fritz
    Enter a new name for the cat: Fido
    The cat name is Jesse Liberty
Analysis:

The interface to String appears to be okay. Inspection of the implementation reveals no obvious problems, and the driver program is straightforward. When I ran the program the first time, it appeared to work perfectly. I then uncommented line 16 in Listing 20.3, and suddenly the program stopped working.

As you learned earlier in this chapter, the first approach might be to review the program and to walk through how it works. If this reveals no obvious solution, you might fire up the debugger, or you might pause and think about the bug for a while, speculating on what might cause this behavior.

A first hypothesis might be that somehow CatName is being replaced by MyName, or that somehow MyName is trashing the pointer to NewName.

The debugger clarifies the problem virtually instantly. I opened a watch window and put in

    *CatName
    *NewName
    *MyName

This put watch statements on all three strings. I also put a break point in line 9. After this line, CatName pointed to Fritz as expected. After line 12, CatName continued to point to Fritz and NewName pointed to Fido (also as expected). After line 13, both pointed to Fido.

After line 14, however, both pointers were trashed! And, even more surprising, after line 16, all three pointers pointed to Jesse Liberty! What was going on?

The answer, of course, is that the String class shown does not define an assignment operator (operator=). The compiler therefore supplies one, and it performs a shallow, member-by-member copy. When line 13 executes, the internal character pointers (itsCString) of *CatName and *NewName are set to point to the same buffer in memory, and the buffer that held Fritz is leaked. When NewName then is deleted (properly), the memory that CatName's string still refers to is unfortunately deleted as well.

When the new string is allocated in line 16, the memory allocator happens to reuse the block just released by NewName. CatName's string still points to that area, and thus when CatName is printed, the contents of MyName are printed.

In the first run, when line 16 was commented out, nothing overwrote that area of memory, and so it was printed, but the pointer was a time bomb waiting to go off.

Note that even in the second run, the program didn't crash. If that area of memory were later used for something other than a string, however, the dangling pointer could well bring the system to its knees.

The simple capability to set a watch on these three variables, and to step through the program seeing where and when the data was corrupted, was enough to make the cause of the bug and its solution immediately evident.

Summary

Today you learned the importance of developing professional debugging techniques. You learned the fundamentals of using a source-level debugger, and explored some of the powerful capabilities that such a system brings to your efforts to produce error-free code.

Q&A

Q: What is the difference between the debugger in my integrated development environment and the stand-alone debugger that came with my compiler?

A: Typically, development environments come with a somewhat simplified debugger that works with the editor and compiler in an integrated fashion, and a stand-alone debugger that provides extra features and capabilities.

Q: What is the difference between hard-mode and soft-mode debugging?

A: This answer may depend in detail on the system on which you are developing. Typically, however, this distinction refers to the memory model in which the debugging is accomplished. In hard-mode debugging, no other programs can run when the debugger runs; it takes over the machine. In soft-mode debugging, the debugger behaves like any other application, and other programs continue to run alongside it.

Q: What makes a debugger symbolic?

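A: In the early days of programming in "high level" languages such as C, the debugger could not show the symbolic names of variables (myAge, for example) but only the actual memory address. It was up to the programmer to figure out which variable was which, based on map files provided by the linker. A symbolic debugger reads the symbol information that the compiler and linker emit, so it can display variables and functions by the names used in your source code.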

Workshop

The Workshop provides quiz questions to help you solidify your understanding of the material covered, and exercises to provide you with experience in using what you have learned. Try to answer the quiz and exercise questions before checking the answers in Appendix A, and make sure that you understand the answers before continuing to the next chapter.

Quiz

  1. What is the difference between a watch statement and a break point?

  2. What is the call stack?

  3. What is a TRACE macro?

  4. Why should you turn off optimization when debugging?

Exercises

  1. Identify the fence-post bugs in the following code. (Assume that there is a String class available that supports the operations asked of it.)
        char * CStrNewCopy(const char * src)
        {
            int size = strlen(src);
            char * retCStr = new char[size];
            strcpy(retCStr, src);
            return retCStr;
        }
    
        const int MaxNameLength = 30;
        const int NameCount = 10;
        String GlobalNameArray[NameCount];
    
        void FillGlobalNameArray()
        {
            cout << "Enter names\n";
            for (int i = 0; i<=NameCount; i++)
            {
                char buffer[MaxNameLength];
                cin.getline(buffer, MaxNameLength);
                GlobalNameArray[i] = buffer;
            }
        }
    

  2. Find the common error, by inspection. (Assume that Test really tests something meaningful, and that there is a point to this whole thing.)
        // Returns TRUE if test passes.
        BOOL Test(int theValue, const String & theString);
    
        void Foo()
        {
            cout << "Enter a value ";
    
            int x;
            cin >> x;
    
            cout << "Enter a String ";
            char buffer[30];
            cin.getline(buffer, 29);
            String * str = new String(buffer);
    
            if (Test(x, *str))
            {
                cout << "Passed\n";
            }
            else
            {
                cout << "Failed\n";
                if (x == 0)
                    return;
            }
            delete str;
        }
    

  3. What follows is a piece of code and the log output from running it. Explain why the log shows one more destructor than constructor. (EX2003.CPP)
        #include <iostream.h>
        #include <math.h>
    
        #define LOG(x) cout << x << "\n"
    
        class Complex
        {
        public:
            Complex() : myReal(0.0), myImaginary(0.0)
                 {LOG("Complex c-tor");}
            Complex(double real, double imaginary)
                : myReal(real), myImaginary(imaginary)
                 {LOG("Complex c-tor 2");}
            ~Complex() {LOG("Complex d-tor");}
    
            double Absolute() const
                {return sqrt(myReal * myReal
                            + myImaginary * myImaginary);}
    
            int CompareAbsolute(Complex rhs) const;
        private:
            double myReal, myImaginary;
        };
    
        int Complex::CompareAbsolute(Complex rhs) const
        {
            double myAbs = Absolute();
            double rhsAbs = rhs.Absolute();
    
            if (myAbs < rhsAbs)
                return -1;
            else if (myAbs == rhsAbs)
                return 0;
            else
                return 1;
        }
    
        Complex glComplex(5.0,2.2);
        void main()
        {
            LOG("Starting main");
            Complex c1(2.1,5.5);
    
            int x = c1.CompareAbsolute(glComplex);
    
            LOG("Ending main");
        }
    
    
        // Log output
        Complex c-tor 2
        Starting main
        Complex c-tor 2
        Complex d-tor
        Ending main
        Complex d-tor
        Complex d-tor
    

  4. Find, by inspection, the bug in the following code used for an imaginary compression/decompression tool. The implementation of CompressCStr is elsewhere and is assumed to work as advertised.
        // This will compress src into target. It will write no
        // more than len bytes.
        // Returns the number of bytes actually written, or -1 if
        // there was not enough space.
        int CompressCStr(char * target, unsigned len, char * src);
    
        int CompressCStrArray(char * target, unsigned len, char ** array)
        {
            unsigned LenUsed = 0;
            unsigned TotalLenUsed = 0;
    
            // array is an array of pointers
            // Compress until array has a null pointer
            while(*array != 0)
            {
                LenUsed =
                 CompressCStr(target, len - TotalLenUsed, *array);
    
                if (LenUsed < 0)
                    return -1;
    
                TotalLenUsed+=LenUsed;
                target+=LenUsed;
                array++;
            }
            return TotalLenUsed;
        }
    

  5. The following function crashes under some unusual circumstances. (You don't know what those circumstances are, exactly. You know only that they are rare.) You have set break points in the debugger, but you never have been able to witness it crash. The most common symptom is that you get a protection fault on the last line (not counting the LOG).

    Your first step in debugging this code is to add the LOGs. After filling up a file with logs, you find that in the error case it reaches the first LOG successfully and crashes before the second LOG. What is the bug?

        void PrintMessage(int MessageID)
        {
            char * msg;
            switch (MessageID)
            {
            case 1:
                msg = "The first message.";
                break;
            case 2:
                msg = "The second message.";
                break;
            case 3:
                msg = "The third message.";
                break;
            case 4:
                msg = "The fourth message.";
                break;
            case 5:
                msg = "The fifth message.";
                break;
            case 6:
                msg = "The sixth message.";
                break;
            }
    
            LOG("About to output message");
            cout << msg << "\n";
            LOG("Done outputting message");
        }
    
