Day 19: Writing Solid Code

All non-trivial programs have bugs at some point in the development cycle. The bigger the program, the more bugs, and many of those bugs actually "get out the door" and into final, released software. The job of the professional programmer is to make sure that the bugs are stomped before they are released to an unsuspecting customer.

As you will see on Day 20, the earlier in the development process you find and eliminate bugs, the less those bugs cost. Day 20 focuses on finding and removing bugs. Today focuses on the other aspects of building professional-quality code.

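Today you will learn how to write robust, extensible, and maintainable code, how to use ASSERT macros and class invariants to test your assumptions, and how to review designs, code, and documentation.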

Writing It Isn't the Hard Part

It is possible--in fact, it often is quite easy--to write a program that behaves well when the customer does exactly what you expect him to do. Solid, professional-quality code, however, can handle even the most bizarre and unexpected customer behavior without crashing and burning.

If you are building a space shuttle or an F-111 fighter, it is reasonable to expect the pilot to get just about everything right. We are willing to spend tens of thousands of dollars in training so that the pilot doesn't eject when he means to lower the landing gear; we are less willing, however, to invest that much in training and practice to use a spreadsheet or a word processor.

This isn't to say that those who work on fighter-jet aircraft and nuclear power stations can be sloppy in their work. Although a nuclear engineer can be expected to understand how his equipment works, avoiding preventable errors certainly is a priority, as the citizens of Pennsylvania can assure you.

The $1,500-a-day consultant does not distinguish herself from the $25-per-hour novice by writing more impressive code, by producing code more quickly, or by shaving a few milliseconds of performance off a routine. Although these are important skills, there always will be someone else who is quicker or craftier.

The true professional brings a superset of these skills, and leaves the customer with a program that can grow and provide headache-free service for longer and at less cost than the cute hack thrown together by a novice. The hallmarks of a good program are robustness, extensibility, and maintainability.

Robustness

Robustness is the capability of your program to run and keep on running in the face of low memory, novice users, new hardware, unexpected conditions, and so on. Robustness does not arise spontaneously; it is cooked into the program from the beginning by the programmer who pays attention to it, and who values and invests in building bulletproof code.

This year, the Quality Assurance team at AT&T Interchange reported a bug in a routine I had written. I tried to reproduce the bug, but no matter what I did, I couldn't make it happen, so I marked the bug as NR (Not Reproducible). Two days later, the bug report was back on my desk, and Celia Fitzgerald, Manager of Quality Assurance, was at my desk.

"Open your dialog," she said. "Now, choose search by Date." I did, and an entry field opened. "Now enter -1."

Before I could stop myself, I said, "But no one would do that." She just smiled. Of course, a user had done exactly that, attempting to search for articles written the day before.

I dutifully entered -1 into the date field, chose Search, and watched my well-crafted and much-loved dialog box crash and burn. There were no survivors.

Defensive programming is the art of asking yourself, "What happens if the user..." and filling in every blank you can think of. If you ask for a string of 10 characters, consider what happens if users enter 11. What happens if they enter 100? What if they don't enter anything at all? What if they enter numbers?
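As a minimal sketch of this habit, the routine below checks a day-of-month entry field before anything else touches the value. The name ParseDayOfMonth and the rule that a day is one or two digits between 1 and 31 are assumptions made for illustration; this is not the code from the dialog described above.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical validator: accepts only one or two digits forming 1..31.
// On failure it returns false and leaves 'day' untouched.
bool ParseDayOfMonth(const std::string& text, int& day)
{
    if (text.empty() || text.size() > 2)   // nothing entered, or too long
        return false;

    int value = 0;
    for (std::string::size_type i = 0; i < text.size(); ++i)
    {
        // Rejects signs, spaces, and letters, so "-1" never gets through.
        if (!std::isdigit(static_cast<unsigned char>(text[i])))
            return false;
        value = value * 10 + (text[i] - '0');
    }

    if (value < 1 || value > 31)           // digits, but out of range
        return false;

    day = value;
    return true;
}
```

The caller can then show an error message for a rejected entry instead of passing bad data deeper into the program.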

The truth is that there is a cadre of sick, twisted, small-minded users who will spend hours trying to crash your code by putting in bizarre and unexpected data. In our company, these disturbed people are called Quality Assurance Engineers, and they are well paid and much respected, if not much loved, by the developers. We know, in our heart of hearts, that it is better for them to find our bugs than for our customers to find the bugs, but that doesn't mean we're necessarily happy to see them standing at our desks with triumphant smiles on their faces. When they are happy, you can bet we are miserable.

Extensibility

A program is extensible if you can add new features and behaviors without breaking the program or rearchitecting fundamental components. I'm not absolutely sure that you will find extensible in any dictionary, but every programmer I've ever met uses the word to mean that the program was written with growth and evolution in mind.

As an aside, the architecture of a program is the underlying structure, the set of metaphors and ideas that lend the program coherence. The architecture tells you which classes exist and how they interact with one another. When you rearchitect a program, you redesign its fundamental structure.

Extensibility is not an accident. A program must be designed from the start with extensibility in mind if you are not going to find yourself fighting the code every time you want to add a feature.

Extensibility also often brings a cost. A highly extensible program often is typified by a great deal of indirection and generality. Instead of hard coding behavior and values, a highly extensible program often is written so that new values and behaviors can be added to resource files or patched in using common, and thus general, interfaces.

Extensibility often is the litmus test of a well-crafted, object-oriented program. A program with many small and distinct objects that interact along narrow and well-defined interfaces is far more extensible than a jumble of interacting and mutually dependent objects whose interdependencies barely can be understood.

Maintainability

The software industry is changing rapidly, and a great deal of new code is being written every day. The truth is, however, that the vast majority of programmers are engaged in the difficult and often unrewarding task of maintaining legacy code. Legacy code is code written some years ago, usually by someone else.

Companies invest heavily in legacy code, and management is eager to amortize the investment over many years of use. Thus, legacy code often is patched and extended and cobbled together to solve new problems in a rapidly changing environment.

Highly maintainable code is easily read and understood by new programmers, and is marked by good documentation, meaningful comments, and a straightforward, portable coding style.

Often, an expert programmer will forgo a clever and efficient statement in order to make her intentions clearer.

Using Asserts

Although there is much wisdom shared among programmers in the creation of maintainable and extensible code, the first order of business is to ensure that what you do have is rock solid, bulletproof, and stable. That means testing your code at every opportunity.

Testing does not just involve running the program, or having others run it. It means putting tests right into your program, so that your program constantly monitors its own progress and reports on unexpected values or situations.

Because such tests can be expensive in performance degradation, programmers often create macros to run the tests in debug mode; these macros "disappear" when the program is compiled for release.

The simplest of these macros is the ASSERT macro. In its base form, the ASSERT macro evaluates the expression passed in and takes no action if it evaluates TRUE (nonzero). If, on the other hand, the expression evaluates FALSE (0), the ASSERT macro takes action; the macro puts up an error message or throws an exception.

ASSERT macro definitions typically are surrounded by #ifdef DEBUG guards, which define ASSERT(x) to be nothing if DEBUG isn't defined. For those cases where you want the argument evaluated even in release mode, but you don't want the error action taken unless you are debugging, consider creating a second macro such as VERIFY.
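One possible shape for such a pair of macros is sketched below. It is an illustration under stated assumptions, not a definitive implementation: the helper assertFailed is a hypothetical name, the failure action chosen here is to throw std::logic_error, and DEBUG is defined inside the snippet only so that it stands alone (normally the build would supply it).

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Assumption: normally supplied by the build (for example, -DDEBUG).
#define DEBUG 1

// Hypothetical helper: the action taken when an assertion fails.
inline void assertFailed(const char* expr, const char* file, int line)
{
    std::ostringstream msg;
    msg << "ASSERT failed: " << expr << " at " << file << ":" << line;
    throw std::logic_error(msg.str());
}

#ifdef DEBUG
// Debug build: evaluate the expression and act if it is false.
#define ASSERT(x) do { if (!(x)) assertFailed(#x, __FILE__, __LINE__); } while (0)
#define VERIFY(x) ASSERT(x)
#else
// Release build: ASSERT vanishes; VERIFY still evaluates its argument.
#define ASSERT(x) ((void)0)
#define VERIFY(x) ((void)(x))
#endif

// Usage: the increment inside VERIFY happens in debug and release builds.
int countAfterChecks()
{
    int n = 0;
    VERIFY(++n == 1);
    ASSERT(n == 1);
    return n;
}
```

Wrapping the debug form in do { ... } while (0) lets the macro be used like an ordinary statement, including inside an if without braces.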

ASSERT macros have one additional, side benefit. They serve as self-documentation to the code. If the programmer is ASSERTing that a value is non-zero, that tells you a great deal about what the programmer thinks is happening at that point in the program. Listing 19.1 shows this self-documenting effect.

Listing 19.1 Using Asserts

    1:  Page::Page(char *buffer):
    2:     myKeys((Index*)myVars.mk)
    3:  {
    4:     assert(sizeof(myBlock) == PageSize);
    5:     assert(sizeof(myVars) == PageSize);
    6:     memcpy(&myBlock,buffer,PageSize);
    7:     SetLocked(FALSE);
    8:     myTime = time(NULL);
    9:  }

Output:

None.

Analysis:

This excerpt from the implementation of the Page class seen in previous chapters has no comments or other documentation. Nonetheless, you can tell from the assert statements on lines 4 and 5 that the programmer believes that at this step of the program PageSize must be equal to the size of the member variable myBlock and to the size of myVars. Note that line 6 takes advantage of this equality to copy PageSize bytes from buffer to myBlock.

Leave the Assert in There

Novice programmers often surround difficult areas of code with print statements or tests, and then strip them out when it is time to ship the code. The beauty of the assert statement is that there is no reason to remove it: it costs you nothing if DEBUG isn't defined, and it comments your code. Leave in the assert statement, even long after the code is fully working. It's free, and you never know when you will need it.

Beware of Side Effects

Remember, however, that the assert statement will disappear when the release version is created. Don't be trapped by the common mistake of using an assert with a (desirable) side effect that then will disappear in the final version of your code. Listing 19.2 illustrates this problem.

Listing 19.2 Asserts with Side Effects

    void SomeFunction(char * buff, char * buff2)
    {
       int x = strlen(buff);
       int y = strlen(buff2);
       assert(++y == x);  // oops: the increment vanishes in a release build
       otherFunc(y);
    }

Output:

None.

Analysis:

The programmer intended to assert that buff2 is one character shorter than buff. Unfortunately, y is incremented inside the assert, before it is passed to otherFunc. When the release version is created, the assert disappears and the increment is no longer performed, potentially breaking otherFunc.
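One way to avoid the trap in Listing 19.2 is to give the side effect its own statement and leave the assert with a pure test. The sketch below uses the standard assert from <cassert>; otherFunc is a hypothetical stand-in that simply returns its argument so the result can be checked.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the real otherFunc: returns what it was given.
std::size_t otherFunc(std::size_t length)
{
    return length;
}

std::size_t SomeFunction(const char* buff, const char* buff2)
{
    std::size_t x = std::strlen(buff);
    std::size_t y = std::strlen(buff2);

    ++y;             // the increment is real code, so it survives a release build
    assert(y == x);  // the assert only tests; removing it changes nothing

    return otherFunc(y);
}
```

With the increment outside the assert, debug and release builds pass the same value to otherFunc.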

Class Invariants

Most classes have data that must be in a determinable state in order for the class to be valid. An Employee class, for example, might always require a name and a social security number in order for it to be a valid Employee class.

These invariant attributes can and should be tested for each time you manipulate an object of the class. Typically, C++ programmers will create a method of the class called Invariants() that tests each of these criteria and returns a Boolean value TRUE if the class meets its requirements, and FALSE otherwise.

The programmer then can call Invariants() from within an assert statement: assert(Invariants());. If the class is in an unacceptable state and if DEBUG is defined, this will display a helpful error dialog box, flagging the problem for the developer.

Many C++ programmers are scrupulous in their use of Invariants(), bracketing virtually every member function with calls to this important test.
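A minimal sketch of the idea, using the Employee example mentioned above, might look like this. The member names and the rule that both strings must be non-empty are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <string>

class Employee
{
public:
    Employee(const std::string& name, const std::string& ssn)
        : myName(name), mySSN(ssn)
    {
        assert(Invariants());   // a new object must start out valid
    }

    void SetName(const std::string& name)
    {
        assert(Invariants());   // valid on the way in
        myName = name;
        assert(Invariants());   // and still valid on the way out
    }

    const std::string& GetName() const { return myName; }

    // True if the object meets the requirements of a valid Employee.
    bool Invariants() const
    {
        return !myName.empty() && !mySSN.empty();
    }

private:
    std::string myName;
    std::string mySSN;
};
```

Because Invariants() is const and has no side effects, it is safe to call from inside an assert.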

Test the Return Value from New

Older, pre-standard compilers had operator new return NULL when it could not allocate memory. In standard C++, the ordinary form of new throws a std::bad_alloc exception instead, and only the new(std::nothrow) form returns a null pointer. Either way, few C++ programmers handle the failure. It is rare to run out of memory, and it isn't always obvious what to do when an allocation fails, but letting your program crash or hang probably is not the desired answer.
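The sketch below shows two ways to check an allocation in standard C++. In standard C++ the ordinary form of new reports failure by throwing std::bad_alloc, and the new(std::nothrow) form reports it by returning a null pointer. The function names are hypothetical, and what a real program should do on failure depends on the program.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Returns a buffer of 'count' ints, or a null pointer if memory is exhausted.
// The caller must test the result and later release it with delete[].
int* MakeBuffer(std::size_t count)
{
    return new (std::nothrow) int[count];
}

// Returns true if the allocation succeeded, false if new threw std::bad_alloc.
bool CanAllocate(std::size_t count)
{
    try
    {
        int* p = new int[count];
        delete[] p;
        return true;
    }
    catch (const std::bad_alloc&)
    {
        return false;   // recover here instead of crashing
    }
}
```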

Make Destructors Virtual

If your class has one or more virtual functions, be sure to make the destructor virtual. If you create virtual functions, sooner or later you will subclass the object. At that point, it is possible to have a pointer to the base class, and to fill it with the address of a derived object. When you use the pointer, the "right thing" happens.

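If, however, you then delete the object through that pointer, and you have failed to create a virtual destructor, the behavior is formally undefined. In practice, typically only the base class destructor runs, so anything owned by the derived part is never released, and you will have created a memory leak.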

There is little additional overhead to creating a virtual destructor once you already have a v-table, and the 4 bytes it typically costs will be well worth it when you delete the object.
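The following sketch shows the virtual destructor doing its job. The Mammal and Cat classes and the counter are assumptions made for illustration; the counter exists only so the effect can be observed.

```cpp
#include <cassert>

class Mammal
{
public:
    virtual ~Mammal() {}                        // virtual: derived destructors run too
    virtual const char* Speak() const { return "..."; }
};

class Cat : public Mammal
{
public:
    explicit Cat(int& destroyed) : myDestroyed(destroyed) {}
    ~Cat() override { ++myDestroyed; }          // records that the Cat part was cleaned up
    const char* Speak() const override { return "Meow"; }

private:
    int& myDestroyed;
};

// Deletes a Cat through a pointer to Mammal and reports how many
// Cat destructors ran.
int DestroyThroughBase()
{
    int destroyed = 0;
    Mammal* pMammal = new Cat(destroyed);
    delete pMammal;                             // calls ~Cat, then ~Mammal
    return destroyed;
}
```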

Initialize Pointers to Zero

It is common to test a pointer to make sure that it is assigned a meaningful value before using it. It is imperative, however, that the pointer be initialized to zero at its creation, or the test will not work properly. When the pointer is deleted, it must be reset to zero for the same reason. Listing 19.3 illustrates the proper use of this technique.

Listing 19.3 Setting Pointers to Zero

    1:  CAT::CAT():
    2:      pMyOwner(0)  // initialize to zero
    3:  {
    4:      if (!pMyOwner)
    5:          pMyOwner = new Person;
    6:      // other code
    7:      delete pMyOwner;
    8:      pMyOwner = 0;
    9:      // other code
    10:     if (pMyOwner)
    11:         pMyOwner->BuyCatFood();
    12: }

Analysis:

Between lines 7 and 10 in this example, there could be dozens of lines of code in which pMyOwner may or may not be assigned to a new Person object. In line 10, the program tests the pointer to make sure that it is valid before using it. If you did not assign 0 to that pointer in line 8, the test would be meaningless.

Similarly, without the initialization shown in line 2, the test in line 4 would not work properly. It is imperative that pointers be assigned 0 when the pointer is not valid.

Use Const Wherever You Can

The compiler is your friend. It will enforce the contracts and rules that you establish for your data and functions, reminding you when you use them in ways you had not intended originally.

To get the most help from your compiler, however, you must tell it when you don't expect to modify the data. The compiler then can warn you when you write over or modify that data.

Const can be used to signal many different requirements. For example:

    const Person * const CAT::myFunction(const Person&) const;

This declaration should not look unfamiliar or excessive to you if you are using const a great deal. It says that myFunction is a member of the CAT class, that it takes a reference to a constant person and returns a constant pointer to a constant person, and that the function itself is constant.

A reference to a constant person means that the variable passed into this function cannot be changed by this function even though it is, for the sake of efficiency, being passed by reference.

A constant pointer to a constant person is much like a constant reference. The pointer cannot be reassigned to point to anyone else (like a reference cannot) and the pointer cannot be used to change the person object (like a constant reference cannot).

The fact that the function itself is const means that the this pointer points to a const CAT, and thus the CAT object itself cannot be modified by this function.
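To see the compiler enforce these promises, consider the sketch below. The bodies of Person and CAT are assumptions made for illustration; only the shape of myFunction comes from the declaration discussed above. Many compilers will also note that the const applied to the returned pointer itself has no effect on the caller, because the caller receives a copy of the pointer.

```cpp
#include <cassert>

class Person
{
public:
    explicit Person(int age) : myAge(age) {}
    int  GetAge() const { return myAge; }
    void SetAge(int age) { myAge = age; }

private:
    int myAge;
};

class CAT
{
public:
    explicit CAT(const Person& owner) : myOwner(owner) {}

    // Returns whichever person is older: the argument or the CAT's owner.
    const Person * const myFunction(const Person& other) const
    {
        // other.SetAge(5);    // would not compile: other is a reference to const
        // myOwner.SetAge(5);  // would not compile: this function is const
        return other.GetAge() > myOwner.GetAge() ? &other : &myOwner;
    }

private:
    Person myOwner;
};
```

A caller can read through the returned pointer, for example with GetAge(), but a call to SetAge() through it is rejected at compile time.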

Managing Object Creation

Any time you create a class with one or more pointers, be sure to write your own constructor that initializes the pointer. You also must write a destructor that deletes the memory the pointer owns.

Be sure to create a copy constructor and an assignment operator (operator=), both of which must perform "deep" copies of the pointed-to memory, allocating the necessary memory for the new object and, in the case of the assignment operator, freeing the memory the object previously held.
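A minimal sketch of a class that manages its own pointer follows. The CAT and Person classes here are simplified assumptions for illustration; the point is that each copy owns a separate Person.

```cpp
#include <cassert>

class Person
{
public:
    explicit Person(int age) : myAge(age) {}
    int  GetAge() const { return myAge; }
    void SetAge(int age) { myAge = age; }

private:
    int myAge;
};

class CAT
{
public:
    CAT() : pMyOwner(new Person(0)) {}

    // Copy constructor: allocate a new Person rather than sharing the pointer.
    CAT(const CAT& rhs) : pMyOwner(new Person(*rhs.pMyOwner)) {}

    // Assignment: copy first, then free the old memory.
    CAT& operator=(const CAT& rhs)
    {
        if (this != &rhs)                       // guard against self-assignment
        {
            Person* pCopy = new Person(*rhs.pMyOwner);
            delete pMyOwner;
            pMyOwner = pCopy;
        }
        return *this;
    }

    ~CAT()
    {
        delete pMyOwner;
        pMyOwner = 0;
    }

    int  GetOwnerAge() const { return pMyOwner->GetAge(); }
    void SetOwnerAge(int age) { pMyOwner->SetAge(age); }

private:
    Person* pMyOwner;
};
```

Allocating the copy before deleting the old object means that if new fails, the CAT being assigned to is left unchanged.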

Initialize in the Order You Declare Your Member Variables

The C++ compiler will ignore the initialization order you set in your constructor and will initialize your objects in the order in which they are declared in the interface to the class. You should make a point, however, of listing the initializations in the order in which you declare your member variables, so that your code accurately documents what really is happening.

This only matters when one member depends on the value of another, but it can make a great difference when you are debugging and trying to determine the current value of a given variable.
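The sketch below shows a case where the order matters. The Buffer class is a hypothetical example; myCapacity is computed from mySize, so mySize must be declared (and therefore initialized) first.

```cpp
#include <cassert>

class Buffer
{
public:
    // The initializer list is written in declaration order, so it
    // documents what the compiler actually does.
    explicit Buffer(int size)
        : mySize(size),
          myCapacity(mySize * 2)   // safe: mySize is already initialized
    {
    }

    int GetSize() const     { return mySize; }
    int GetCapacity() const { return myCapacity; }

private:
    int mySize;       // declared first, so initialized first
    int myCapacity;   // declared second, so initialized second
};
```

If the two declarations were swapped, myCapacity would be computed from an uninitialized mySize no matter how the initializer list was written.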

Never, Ever, Return a Reference to a Local Object. Ever.

Remember that references can never be NULL. If you declare an object in your function and then return a reference to that object, to what does it refer when the function has popped off the stack?

Similarly, avoid references to objects on the heap that might be deleted. Dereferenced pointers are dangerous candidates for references. If the pointer is deleted and the memory is freed, you will be left with an illegal reference to a nonexistent object.
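As a brief illustration, the commented-out function below shows the mistake, and the function after it shows one common alternative: returning the object by value. The function names are assumptions made for this sketch.

```cpp
#include <cassert>
#include <string>

// WRONG: 'greeting' is destroyed when the function returns, so the
// caller would be left holding a reference to an object that is gone.
//
// const std::string& MakeBadGreeting(const std::string& name)
// {
//     std::string greeting = "Hello, " + name;
//     return greeting;
// }

// Safer: return the local object by value. The caller receives its own object.
std::string MakeGreeting(const std::string& name)
{
    std::string greeting = "Hello, " + name;
    return greeting;
}
```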

Create a Consistent Style

It is important to adopt a consistent coding style, although in many ways it doesn't matter which style you adopt. A consistent style makes it easier to guess what you meant by a particular part of the code, and you avoid having to look up whether you spelled the function with an initial capital letter the last time you invoked it.

More important, consistency in your code makes it easier for another programmer to take over maintenance and development of your code. If you always declare your member variables to begin with my, your local variables with a lowercase letter, and your functions with an uppercase letter, then names such as myAge, ageOfBook, and AgeBook are easier to categorize.

The following guidelines are arbitrary; they are based on the guidelines used in projects I have worked on in the past, and they have worked well. You just as easily can make up your own guidelines, but these will get you started.

Braces

How to align braces can be the most controversial topic between C and C++ programmers. Here are the tips I suggest:

Long Lines

Keep lines to the width that can be displayed on a single screen. Code that is off to the right is easily overlooked, and scrolling horizontally is annoying. When a line is broken, indent the following lines. Try to break the line at a reasonable place, and try to leave the intervening operator at the end of the preceding line (as opposed to the beginning of the following line) so that it is clear that the line does not stand alone and that there is more coming.

In C++, functions tend to be far shorter than they were in C, but the old, sound advice still applies. Try to keep your functions short enough to print the entire function on one page.

Switch Statements

Indent switches to conserve horizontal space, as shown in the following code:

    switch(variable)
    {
    case ValueOne:
         ActionOne();
         break;
    case ValueTwo:
         ActionTwo();
         break;
    default:
         assert(!"bad Action");  // a bare string literal is always true; ! makes the assert fire
         break;
    }

Program Text

You can use several tips to create code that is easy to read. Code that is easy to read is easy to maintain. Keep these tips in mind:

Identifier Names

Here are some guidelines for working with identifiers:

Spelling and Capitalization of Names

Spelling and capitalization should not be overlooked when creating your own style. Some tips for these areas follow:

Comments

Comments can make it much easier to understand a program. Often, you will not work on a program for several days or even months, so you might forget what certain code does or why it has been included. Problems in understanding code also can occur when someone else reads your code. Comments that are applied in a consistent, well thought-out style can be well worth the effort. There are several tips to remember concerning comments:

Access

The way you access portions of your program also should be consistent. Some tips for accessing parts of your program follow:

Class Definitions

Try to keep the definitions of methods in the same order as the declarations. It makes things easier to find. Alternatively, consider alphabetizing the methods, both in the declaration and in the cpp file.

When defining a function, place the return type and all other modifiers on a previous line so that the class name and function name begin on the left margin. This makes it much easier to find functions.

Include Files

Try as hard as you can to keep from including files in header files. The ideal minimum is the header file for the class this one derives from. Other mandatory includes will be those for objects that are members of the class being declared. Classes that are merely pointed to or referenced need only a forward declaration.

Don't leave out an include file in a header just because you assume that whatever cpp file includes this one also will have the needed include.
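The sketch below imitates, in a single file, how a header can get by with a forward declaration. The CAT and Person classes are assumptions made for illustration; in a real project the first part would live in the header for CAT and the rest in a cpp file that includes the header for Person.

```cpp
#include <cassert>

// --- What would go in the CAT header ---

class Person;   // forward declaration: enough for pointers and references

class CAT
{
public:
    explicit CAT(Person* pOwner) : pMyOwner(pOwner) {}
    Person* GetOwner() const { return pMyOwner; }   // never looks inside Person

private:
    Person* pMyOwner;   // only a pointer, so Person's size and members are not needed
};

// --- What would go in a cpp file, after including the Person header ---

class Person
{
public:
    explicit Person(int age) : myAge(age) {}
    int GetAge() const { return myAge; }

private:
    int myAge;
};

// Code that uses Person's members needs the full definition.
int OwnerAge(const CAT& cat)
{
    return cat.GetOwner()->GetAge();
}
```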

Evolve Your Own Rules and Write Them Down

If you are working with other programmers, try to establish a set of style guidelines with which you can all work. This method will simplify greatly the job of exchanging code listings.

Reviewing Your Code

An essential component of writing good code is having your code, design, and documentation reviewed by others. A classic mistake in such reviews, however, is to mix design reviews and code reviews.

Design Reviews

A design review can be a very helpful experience, but it must be handled correctly. The first guideline is to review at the right time: late enough in the process to have thought through the major issues, but early enough that you can incorporate what you learn into the design.

A second essential for a successful review is that the design be summarized in a preliminary document circulated a few days before the review. The reviewers are obliged to read the document and to make notes; obviously, the document must be sufficiently complete that the reviewers are left with a correct impression of the overall design approach.

A danger in such review sessions is that the developer whose design is being reviewed can become very defensive. If your peers are reviewing your code, it is difficult to maintain your objectivity; if your boss is participating, it can be even more threatening.

One approach that has worked well is this: Don't try to solve any of the problems that arise during the review; just make a note and move on. The developer might want to talk about why she made one decision over another, but if a reviewer suggests an alternative approach, write it down and continue.

A delicate balance must be maintained between clarifying ideas on the one hand, and advocating and defending ideas on the other. The point of the review should be to clarify and articulate ideas, but evaluation of their merit should be left until after the meeting. If the developer is relatively senior, he might evaluate all the suggestions on his own, incorporating some and rejecting others based on his own judgment and experience. A more junior engineer might want to review the ideas with his manager, but the review meeting probably is not the right place.

Code Reviews

A code review is completely different from a design review. This time, code is circulated, and the reviewers assemble to discuss what they have found. The upfront investment by the reviewers is substantially greater; they must go through the code with a fine-tooth comb looking for syntax errors, memory leaks, logic flaws, and so on.

There is less room for judgment in a code review. The idea is not to question the overall approach, but to locate bugs and errors in the execution. The moderator, typically the developer whose code is being reviewed, must be careful to rule out of order comments on the design; the purpose must be a single-minded search for errors.

A good code review acts as a core sample, digging deep but with little breadth into the developer's product. The idea is not to find all the bugs in the program, but to uncover one or a few of each type of problem that might be represented in the rest of the program.

This is a good time to put emphasis on and get feedback about readability. Is the code well documented? Do the comments help? Are sections laid out in a clear and understandable fashion?

Knuth writes that good code can be read like a novel. Although C++ does not look like English to the uninitiated, the truth is that good code is easily understood.

Comments, of course, should be used to lay out the overall plot of the section of code and to clarify what might otherwise be a confusing section. The developer, however, should prefer that the code speak for itself whenever possible. As a simple example:

    int x = myObject.GetValue();   // get the age and assign it to the
                                   // minimum age variable

is far less clear or maintainable than

    int MinAge = Employee.GetAge();

The second statement stands on its own, and because it needs no comment, there is no chance of the comment becoming obsolete and thus misleading.

Documentation Reviews

Documentation is like the weather: everyone complains when it is bad, but no one does anything about it. The truth is that writing good documentation takes so much time and money that few organizations get it right. Consider having the documentation for a set of code reviewed by someone other than the developer, preferably by the person who will be responsible for maintaining the code if the developer is not available.

Planning for Change

Developers often write their code as if they always will be around to take care of it, to nurture it, and to clarify any confusion. The reality is that developers move on to other projects (or other jobs!) and someone else often is stuck trying to disentangle a particularly clever (read: obtuse) bit of code. The only antidote to this disease is to ensure that the code is maintainable while it is being developed, and especially at delivery.

Summary

Today you learned about writing robust, bulletproof, highly tested, extensible, maintainable code. You saw how to use ASSERT macros and class Invariants() methods to add reliability to your code and how to test your assumptions.

Q&A

Q: What is the good of an assert if it will not be in the release code?

A: An assert is put into the code to alert you to problems during debugging. The point of an assert is to flag a bug. In theory, your release code will not have bugs, and thus will not need the assert. Then again, in theory, theory and practice are the same, but in practice they aren't.

Q: What is the difference between an assert and an exception?

A: An assert flags a bug in your code. An exception flags a predictable problem that must be handled properly. It is not a bug to run out of memory, but it is a bug to crash when you do run out of memory.

Q: What is the difference between maintainable and extensible code?

A: Maintainable code can be fixed. Extensible code can be extended; that is, new functionality easily can be added.

Workshop

The Workshop provides quiz questions to help you solidify your understanding of the material covered, and exercises to provide you with experience in using what you have learned. Try to answer the quiz and exercise questions before checking the answers in Appendix A, and make sure that you understand the answers before continuing to the next chapter.

Quiz

  1. How do you ensure that the assert will not be in the release code?

  2. What are side effects, and how do you prevent them?

  3. What do you do if you want to assert if something isn't TRUE, but you need the evaluation done in the release code?

  4. Where do you put the Invariants() assertion?

  5. Why should destructors be virtual?

Exercises

  1. Write an ASSERT macro definition.

  2. Write a VERIFY macro definition.

  3. Write the class invariants for the String class shown in listings 3.1 and 3.2 in day 3.

  4. Rewrite the methods shown in exercise 3.2 to use the ASSERT macro you wrote in exercise 1 with the invariant methods you wrote in exercise 3.
