Day 19: Writing Solid Code

All non-trivial programs have bugs at some point in the development cycle. The bigger the program, the more bugs, and many of those bugs actually "get out the door" and into final, released software. The job of the professional programmer is to make sure that the bugs are stomped before they are released to an unsuspecting customer.

As you will see on Day 20, the earlier in the development process that you can find and eliminate bugs, the less those bugs cost. Day 20 focuses on finding and removing bugs. Today focuses on the other aspects of building professional-quality code.

Today you will learn

How to build robust code
How to build extensible code
How to build maintainable code

Writing It Isn't the Hard Part

It is possible--in fact, it often is quite easy--to write a program that behaves well when the customer does exactly what you expect him to do. Solid, professional-quality code, however, can handle even the most bizarre and unexpected customer behavior without crashing and burning.

If you are building a space shuttle or an F-111 fighter, it is reasonable to expect the pilot to get just about everything right. We are willing to spend tens of thousands of dollars in training so that the pilot doesn't eject when he means to lower the landing gear; we are less willing, however, to invest that much in training and practice to use a spreadsheet or a word processor.

This isn't to say that those who work on fighter-jet aircraft and nuclear power stations can be sloppy in their work. Although a nuclear engineer can be expected to understand how his equipment works, avoiding preventable errors certainly is a priority, as the citizens of Pennsylvania can assure you.

The $1,500-a-day consultant does not distinguish herself from the $25-per-hour novice by writing more impressive code, by producing code more quickly, or by shaving a few milliseconds of performance off a routine. Although these are important skills, there always will be someone else who is quicker or craftier.

The true professional brings a superset of these skills, and leaves the customer with a program that can grow and provide headache-free service for longer and at less cost than the cute hack thrown together by a novice. The hallmarks of a good program are robustness, extensibility, and maintainability.

Robustness

Robustness is the capability of your program to run and keep on running in the face of low memory, novice users, new hardware, unexpected conditions, and so on. Robustness does not arise spontaneously; it is cooked into the program from the beginning by the programmer who pays attention to it, and who values and invests in building bulletproof code.

This year, the Quality Assurance team at AT&T Interchange reported a bug in a routine I had written. I tried to reproduce the bug, but no matter what I did, I couldn't make it happen, so I marked the bug as NR (Not Reproducible). Two days later, the bug report was back on my desk, and Celia Fitzgerald, Manager of Quality Assurance, was at my desk.

"Open your dialog," she said. "Now, choose search by Date." I did, and an entry field opened. "Now enter -1."

Before I could stop myself, I said, "But no one would do that." She just smiled. Of course, a user had done exactly that, attempting to search for articles written the day before.

I dutifully entered -1 into the date field, chose Search, and watched my well-crafted and much-loved dialog box crash and burn. There were no survivors.

Defensive programming is the art of asking yourself, "What happens if the user..." and filling in every blank you can think of. If you ask for a string of 10 characters, consider what happens if users enter 11. What happens if they enter 100? What if they don't enter anything at all? What if they enter numbers?
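As a minimal sketch of that habit, the function below checks a text field before the rest of the program ever sees it. The name IsValidName and the rule it enforces (1 to 10 letters) are assumptions made for illustration; they are not taken from any real dialog.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical validator for an entry field that must hold 1 to 10 letters.
bool IsValidName(const std::string& input)
{
    // What if they enter nothing at all? What if they enter 11 or 100?
    if (input.empty() || input.size() > 10)
    {
        return false;
    }

    // What if they enter numbers or punctuation?
    for (std::string::size_type i = 0; i < input.size(); ++i)
    {
        if (!std::isalpha(static_cast<unsigned char>(input[i])))
        {
            return false;
        }
    }
    return true;
}
```

A caller would reject the input and ask again whenever IsValidName returns false, rather than passing the string along and hoping for the best.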

The truth is that there is a cadre of sick, twisted, small-minded users who will spend hours trying to crash your code by putting in bizarre and unexpected data. In our company, these disturbed people are called Quality Assurance Engineers, and they are well paid and much respected, if not much loved, by the developers. We know, in our heart of hearts, that it is better for them to find our bugs than for our customers to find the bugs, but that doesn't mean we're necessarily happy to see them standing at our desks with triumphant smiles on their faces. When they are happy, you can bet we are miserable.

Extensibility

A program is extensible if you can add new features and behaviors without breaking the program or rearchitecting fundamental components. I'm not absolutely sure that you will find extensible in any dictionary, but every programmer I've ever met uses the word to mean that the program was written with growth and evolution in mind.

As an aside, the architecture of a program is the underlying structure, the set of metaphors and ideas that lend the program coherence. The architecture tells you which classes exist and how they interact with one another. When you rearchitect a program, you redesign its fundamental structure.

Extensibility is not an accident. A program must be designed from the start with extensibility in mind if you are not going to find yourself fighting the code every time you want to add a feature.

Extensibility also often brings a cost. A highly extensible program often is typified by a great deal of indirection and generality. Instead of hard coding behavior and values, a highly extensible program often is written so that new values and behaviors can be added to resource files or patched in using common, and thus general, interfaces.

Extensibility often is the litmus test of a well-crafted, object-oriented program. A program with many small and distinct objects that interact along narrow and well-defined interfaces is far more extensible than a jumble of interacting and mutually dependent objects whose interdependencies barely can be understood.

Maintainability

The software industry is changing rapidly, and a great deal of new code is being written every day. The truth is, however, that the vast majority of programmers are engaged in the difficult and often unrewarding task of maintaining legacy code. Legacy code is code written some years ago, usually by someone else.

Companies invest heavily in legacy code, and management is eager to amortize the investment over many years of use. Thus, legacy code often is patched and extended and cobbled together to solve new problems in a rapidly changing environment.

Highly maintainable code is easily read and understood by new programmers, and is marked by good documentation, meaningful comments, and a straightforward, portable coding style.

Often, an expert programmer will forgo a clever and efficient statement in order to make her intentions clearer.

Using Asserts

Although there is much wisdom shared among programmers in the creation of maintainable and extensible code, the first order of business is to ensure that what you do have is rock solid, bulletproof, and stable. That means testing your code at every opportunity.

Testing does not just involve running the program, or having others run it. It means putting tests right into your program, so that your program constantly monitors its own progress and reports on unexpected values or situations.

Because such tests can be expensive in performance degradation, programmers often create macros to run the tests in debug mode; these macros "disappear" when the program is compiled for release.

The simplest of these macros is the ASSERT macro. In its base form, the ASSERT macro evaluates the expression passed in and takes no action if it evaluates TRUE (nonzero). If, on the other hand, the expression evaluates FALSE (zero), the ASSERT macro takes action: it puts up an error message or throws an exception.

ASSERT macro definitions typically are surrounded by #ifdef DEBUG guards, which define ASSERT(x) to be nothing if DEBUG isn't defined. For those cases where you want the argument evaluated even in release mode, but you don't want the error action taken unless you are debugging, consider creating a second macro such as VERIFY.
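One possible shape for such a pair of macros is sketched below. It assumes the build defines DEBUG for debug compiles, and it uses a hypothetical helper, ReportFailure, that throws an exception; a real project might display a dialog box or write to a log instead. CountCalls is likewise a made-up function that exists only to show which macro evaluates its argument.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: builds a message and throws so the failure is visible.
inline void ReportFailure(const char* expression, const char* file, int line)
{
    std::ostringstream message;
    message << "ASSERT failed: " << expression
            << " (" << file << ", line " << line << ")";
    throw std::logic_error(message.str());
}

#ifdef DEBUG
    // Debug build: evaluate the expression and complain if it is false.
    #define ASSERT(x) \
        do { if (!(x)) ReportFailure(#x, __FILE__, __LINE__); } while (0)
    #define VERIFY(x) ASSERT(x)
#else
    // Release build: ASSERT vanishes; VERIFY still evaluates its argument.
    #define ASSERT(x) ((void)0)
    #define VERIFY(x) ((void)(x))
#endif

// Hypothetical function with a side effect: it counts how often it is called.
inline int CountCalls()
{
    static int calls = 0;
    return ++calls;
}
```

With these definitions, VERIFY(CountCalls() > 0) calls CountCalls in both debug and release builds, whereas ASSERT(CountCalls() > 0) calls it only when DEBUG is defined.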

ASSERT macros have one additional side benefit: they document the code. If the programmer is ASSERTing that a value is non-zero, that tells you a great deal about what the programmer thinks is happening at that point in the program. Listing 19.1 shows this self-documenting effect.

Listing 19.1 Using Asserts

    1:  Page::Page(char *buffer):
    2:     myKeys((Index*)myVars.mk)
    3:  {
    4:     assert(sizeof(myBlock) == PageSize);
    5:     assert(sizeof(myVars) == PageSize);
    6:     memcpy(&myBlock,buffer,PageSize);
    7:     SetLocked(FALSE);
    8:     myTime = time(NULL);
    9:  }

Output:

None.

Analysis:

This excerpt from the implementation of the Page class seen in previous chapters has no comments or other documentation. Nonetheless, you can tell from the assert statements on lines 4 and 5 that the programmer believes that at this step of the program the PageSize must be equal to the size of the member variable myBlock and to the size of myVars. Note that line 6 takes advantage of this equality to copy PageSize bytes from buffer into myBlock.

Leave the Assert in There

Novice programmers often surround difficult areas of code with print statements or tests, and then strip them out when it is time to ship the code. The beauty of the assert statement is that there is no reason to remove it: it costs you nothing if debug isn't defined, and it comments your code. Leave in the assert statement, even long after the code is fully working. It's free, and you never know when you will need it.

Beware of Side Effects

Remember, however, that the assert statement will disappear when the release version is created. Don't be trapped by the common mistake of using an assert with a (desirable) side effect that then will disappear in the final version of your code. Listing 19.2 illustrates this problem.

Listing 19.2 Asserts with Side Effects

    void SomeFunction(char* buff, char* buff2)
    {
        int x = strlen(buff);
        int y = strlen(buff2);
        assert(++y == x);  // oops: the increment vanishes in release builds
        otherFunc(y);
    }

Output:

None.

Analysis:

The programmer intended to assert that buff2 is one character shorter than buff. Unfortunately, the increment of y happens inside the assert, and the incremented value is then passed to otherFunc. When the release version is created, the assert will become inoperative and the increment will not be performed, potentially breaking otherFunc.
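One way to avoid the trap is to do the real work in an ordinary statement and let the assert merely inspect the result. The sketch below follows the names in Listing 19.2; the body of otherFunc and the int return types are assumptions added so that the effect can be observed.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the otherFunc of Listing 19.2.
int otherFunc(int length)
{
    return length;
}

int SomeFunction(const char* buff, const char* buff2)
{
    int x = static_cast<int>(std::strlen(buff));
    int y = static_cast<int>(std::strlen(buff2));

    ++y;               // real work stays outside the assert
    assert(y == x);    // the assert only inspects; removing it changes nothing

    return otherFunc(y);
}
```

Because the increment no longer lives inside the assert, otherFunc receives the same value in debug and release builds.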

Class Invariants

Most classes have data that must be in a determinable state in order for the class to be valid. An Employee class, for example, might always require a name and a social security number in order for an object to be a valid Employee.

These invariant attributes can and should be tested for each time you manipulate an object of the class. Typically, C++ programmers will create a method of the class called Invariants() that tests each of these criteria and returns a Boolean value TRUE if the class meets its requirements, and FALSE otherwise.

The programmer then can call Invariants() from within an assert statement: assert(Invariants());. If the class is in an unacceptable state and if debug is defined, this will display a helpful error dialog box, flagging the problem for the developer.

Many C++ programmers are scrupulous in their use of Invariants(), bracketing virtually every member function with calls to this important test.
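A minimal sketch of the idea, using the Employee example from above. The member names and the rule that both strings must be non-empty are assumptions chosen for illustration; a real class would test whatever its own validity rules require.

```cpp
#include <cassert>
#include <string>

class Employee
{
public:
    Employee(const std::string& name, const std::string& socialSecurityNumber):
        myName(name),
        mySocialSecurityNumber(socialSecurityNumber)
    {
        assert(Invariants());
    }

    void SetName(const std::string& name)
    {
        assert(Invariants());   // valid on the way in
        myName = name;
        assert(Invariants());   // and still valid on the way out
    }

    const std::string& GetName() const
    {
        return myName;
    }

    // Returns true if the object is in a valid state.
    bool Invariants() const
    {
        return !myName.empty() && !mySocialSecurityNumber.empty();
    }

private:
    std::string myName;
    std::string mySocialSecurityNumber;
};
```

A call such as SetName("") would trip the second assert in a debug build, pointing straight at the member function that left the object in a bad state.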

Test the Return Value from New

Early C++ compilers had operator new return NULL when it could not allocate memory; in standard C++, plain new throws std::bad_alloc instead, and only the new (std::nothrow) form returns a null pointer. Either way, virtually no C++ programmers handle the failure. It is rare to run out of memory, and it isn't always obvious what to do when an allocation fails, but letting your program crash or hang probably is not the desired answer.
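The sketch below shows both ways of checking an allocation in standard C++: new (std::nothrow) returns a null pointer on failure, while plain new throws std::bad_alloc. The function names AllocateBuffer and TryAllocate are hypothetical, and what to do after a failure is left to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Returns a buffer of the requested size, or a null pointer on failure.
char* AllocateBuffer(std::size_t size)
{
    char* buffer = new (std::nothrow) char[size];
    if (!buffer)
    {
        // Recovery is application-specific: free a cache, warn the user, etc.
        return 0;
    }
    return buffer;
}

// The same test with plain new, which reports failure by throwing.
bool TryAllocate(std::size_t size)
{
    try
    {
        char* buffer = new char[size];
        delete [] buffer;
        return true;
    }
    catch (const std::bad_alloc&)
    {
        return false;
    }
}
```

Either form gives the program a chance to fail gracefully instead of dereferencing a pointer that was never filled in.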

Make Destructors Virtual

If your class has one or more virtual functions, be sure to make the destructor virtual. If you create virtual functions, sooner or later you will subclass the object. At that point, it is possible to have a pointer to the base class, and to fill it with the address of a derived object. When you use the pointer, the "right thing" happens.

If, however, you then delete the object through that pointer, and you have failed to create a virtual destructor, the behavior is undefined; in practice, usually only the base class destructor runs, and whatever the derived class allocated is leaked.

There is little additional overhead to creating a virtual destructor once you already have a v-table, and the 4 bytes it typically costs will be well worth it when you delete the object.
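The effect can be seen in a small sketch. The Animal and Cat classes here are hypothetical; the counter passed to Cat exists only to prove that the derived destructor runs when the object is deleted through a base class pointer.

```cpp
#include <cassert>

class Animal
{
public:
    virtual ~Animal()
    {
    }
    virtual const char* Speak() const
    {
        return "...";
    }
};

class Cat : public Animal
{
public:
    explicit Cat(int* destroyedCount):
        myDestroyedCount(destroyedCount)
    {
    }
    virtual ~Cat()
    {
        ++*myDestroyedCount;    // records that the derived destructor ran
    }
    virtual const char* Speak() const
    {
        return "Meow";
    }

private:
    int* myDestroyedCount;
};

// Deletes a derived object through a base class pointer.
int DestroyThroughBasePointer()
{
    int destroyed = 0;
    Animal* pAnimal = new Cat(&destroyed);
    delete pAnimal;    // runs ~Cat(), then ~Animal(), because ~Animal() is virtual
    assert(destroyed == 1);
    return destroyed;
}
```

If the virtual keyword were removed from ~Animal(), the delete would no longer be guaranteed to run ~Cat() at all.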

Initialize Pointers to Zero

It is common to test a pointer to make sure that it is assigned a meaningful value before using it. It is imperative, however, that the pointer be initialized to zero at its creation, or the test will not work properly. When the pointer is deleted, it must be reset to zero for the same reason. Listing 19.3 illustrates the proper use of this technique.

Listing 19.3 Setting Pointers to Zero

    1:  CAT::CAT():
    2:      pMyOwner(0)   // initialize to zero
    3:  {
    4:      if (!pMyOwner)
    5:          pMyOwner = new Person;
    6:      // other code
    7:      delete pMyOwner;
    8:      pMyOwner = 0;
    9:      // other code
    10:     if (pMyOwner)
    11:         pMyOwner->BuyCatFood();
    12: }

Analysis:

Between lines 7 and 10 in this example, there could be dozens of lines of code in which pMyOwner may or may not be assigned to a new Person object. In line 10, the program tests the pointer to make sure that it is valid before using it. If you did not assign 0 to that pointer in line 8, the test would be meaningless.

Similarly, without the initialization shown in line 2, the test in line 4 would not work properly. It is imperative that pointers be assigned 0 when the pointer is not valid.

Use Const Wherever You Can

The compiler is your friend. It will enforce the contracts and rules that you establish for your data and functions, reminding you when you use them in ways you had not intended originally.

To get the most help from your compiler, however, you must tell it when you don't expect to modify the data. The compiler then can warn you when you write over or modify that data.

Const can be used to signal many different requirements:

A pointer will not point to other data.
A pointer will not change the data to which it points.
A member function's this pointer is constant.
A member function will not change its object.

For example:

    const Person * const CAT::myFunction(const Person&) const;

This declaration should not look unfamiliar or excessive to you if you are using const a great deal. It says that myFunction is a member of the CAT class, that it takes a reference to a constant person and returns a constant pointer to a constant person, and that the function itself is constant.

A reference to a constant person means that the variable passed into this function cannot be changed by this function even though it is, for the sake of efficiency, being passed by reference.

A constant pointer to a constant person is much like a constant reference. The pointer cannot be reassigned to point to anyone else (like a reference cannot) and the pointer cannot be used to change the person object (like a constant reference cannot).

The fact that the function itself is const means that the this pointer of the CAT object is const, and thus the CAT object itself cannot be modified by this function.
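The individual uses of const can also be seen in isolation. In the sketch below, the Person class and the functions ReadAge and ConstPointerDemo are hypothetical; the commented-out lines are the ones the compiler would reject.

```cpp
#include <cassert>

class Person
{
public:
    explicit Person(int age):
        myAge(age)
    {
    }
    int GetAge() const      // const member function: will not change its object
    {
        return myAge;
    }
    void SetAge(int age)
    {
        myAge = age;
    }

private:
    int myAge;
};

// A reference to a constant Person: efficient to pass, impossible to modify.
int ReadAge(const Person& person)
{
    // person.SetAge(0);               // would not compile
    return person.GetAge();
}

int ConstPointerDemo()
{
    Person owner(30);
    Person visitor(40);

    const Person* pReadOnly = &owner;      // cannot change the Person through it
    // pReadOnly->SetAge(31);              // would not compile
    pReadOnly = &visitor;                  // but it may point elsewhere

    Person* const pFixed = &owner;         // cannot point elsewhere
    pFixed->SetAge(31);                    // but it may change the Person
    // pFixed = &visitor;                  // would not compile

    const Person* const pBoth = &owner;    // neither is allowed
    return pReadOnly->GetAge() + pBoth->GetAge();
}
```

Each commented-out line is a mistake the compiler catches for you, which is the whole point of spelling out your intentions with const.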

Managing Object Creation

Any time you create a class with one or more pointers, be sure to write your own constructor that initializes the pointer. You also must write a destructor that deletes the pointer.

Be sure to create a copy constructor and an assignment operator (operator=), both of which must perform "deep" copies of the pointed-to memory, allocating the necessary memory for the new object and, in the case of the assignment operator, freeing the memory the object previously held.
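A minimal sketch of a class that follows these rules is shown below. The Label class and its Duplicate helper are hypothetical, and the sketch assumes the text passed in is never a null pointer.

```cpp
#include <cassert>
#include <cstring>

class Label
{
public:
    explicit Label(const char* text):
        myText(Duplicate(text))
    {
    }
    Label(const Label& rhs):
        myText(Duplicate(rhs.myText))    // deep copy: a new buffer, not a shared one
    {
    }
    ~Label()
    {
        delete [] myText;
    }
    Label& operator=(const Label& rhs)
    {
        if (this != &rhs)                // guard against self-assignment
        {
            char* copy = Duplicate(rhs.myText);   // allocate the new memory first
            delete [] myText;                     // then free the old memory
            myText = copy;
        }
        return *this;
    }
    const char* GetText() const
    {
        return myText;
    }

private:
    // Assumes text is not null; returns a freshly allocated copy.
    static char* Duplicate(const char* text)
    {
        assert(text);
        char* copy = new char[std::strlen(text) + 1];
        std::strcpy(copy, text);
        return copy;
    }

    char* myText;
};
```

After a copy or an assignment, the two Label objects hold equal text in separate buffers, so destroying one cannot invalidate the other.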

Initialize in the Order You Declare Your Member Variables

The C++ compiler will ignore the initialization order you set in your constructor and will initialize your objects in the order in which they are declared in the interface to the class. You should make a point, however, of listing the initializations in the order in which you declare your member variables, so that your code accurately documents what really is happening.

This only matters when one member depends on the value of another, but it can make a great difference when you are debugging and trying to determine the current value of a given variable.
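The dependency is easy to see in a small sketch. The Rectangle class below is hypothetical; its myArea member is computed from the two members declared before it.

```cpp
#include <cassert>

class Rectangle
{
public:
    Rectangle(int width, int height):
        myWidth(width),                 // declared first, initialized first
        myHeight(height),               // declared second
        myArea(myWidth * myHeight)      // declared last, so both inputs are ready
    {
        assert(myArea == width * height);
    }
    int GetArea() const
    {
        return myArea;
    }

private:
    int myWidth;
    int myHeight;
    int myArea;    // if this were declared above myWidth, it would be computed
                   // from uninitialized values, whatever the constructor lists
};
```

Keeping the initializer list in declaration order means the code reads in the same order in which it actually runs.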

Never, Ever, Return a Reference to a Local Object. Ever.

Remember that references can never be NULL. If you declare an object in your function and then return a reference to that object, to what does it refer when the function has popped off the stack?

Similarly, avoid references to objects on the heap that might be deleted. Dereferenced pointers are dangerous candidates for references. If the pointer is deleted and the memory is freed, you will be left with an illegal reference to a nonexistent object.
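The usual cure is to return the object by value. In the sketch below, MakeGreeting is a hypothetical function; the dangerous version is left commented out because using its result would be undefined behavior.

```cpp
#include <cassert>
#include <string>

// Dangerous: greeting is destroyed when the function returns, so the
// caller would be left holding a reference to nothing.
//
// const std::string& MakeGreetingBad(const std::string& name)
// {
//     std::string greeting = "Hello, " + name;
//     return greeting;
// }

// Safe: the caller receives its own copy of the string.
// Assumes name is not empty.
std::string MakeGreeting(const std::string& name)
{
    assert(!name.empty());
    std::string greeting = "Hello, " + name;
    return greeting;
}
```

Many compilers warn about the commented-out form, but the warning is not guaranteed, so the habit is worth keeping.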

Create a Consistent Style

It is important to adopt a consistent coding style, although in many ways it doesn't matter which style you adopt. A consistent style makes it easier to guess what you meant by a particular part of the code, and you avoid having to look up whether you spelled the function with an initial capital letter the last time you invoked it.

More important, consistency in your code makes it easier for another programmer to take over maintenance and development of your code. If you always declare your member variables to begin with my, your local variables with a lowercase letter, and your functions with an uppercase letter, then names such as myAge, ageOfBook, and AgeBook are easier to categorize.

The following guidelines are arbitrary; they are based on the guidelines used in projects I have worked on in the past, and they have worked well. You just as easily can make up your own guidelines, but these will get you started.

Braces

How to align braces can be the most controversial topic between C and C++ programmers. Here are the tips I suggest:

Matching braces should be aligned vertically.
The outermost set of braces in a definition or declaration should be at the left margin. Statements within should be indented. All other sets of braces should be in line with their leading statements.

No code should appear on the same line as a brace, as shown in the following example:

    if (condition == true)
    {
        j = k;
        SomeFunction();
    }
    m++;

Long Lines

Keep lines to the width that can be displayed on a single screen. Code that is off to the right easily is overlooked, and scrolling horizontally is annoying. When a line is broken, indent the following lines. Try to break the line at a reasonable place, and try to leave the intervening operator at the end of the preceding line (as opposed to the beginning of the following line) so that it is clear that the line does not stand alone and that there is more coming.

In C++, functions tend to be far shorter than they were in C, but the old, sound advice still applies. Try to keep your functions short enough to print the entire function on one page.

Switch Statements

Indent switches to conserve horizontal space, as shown in the following code:

    switch(variable)
    {
    case ValueOne:
         ActionOne();
         break;
    case ValueTwo:
         ActionTwo();
         break;
    default:
         assert(!"bad Action");   // a string literal is never null, so negate it to make the assert fire
         break;
    }

Program Text

You can use several tips to create code that is easy to read. Code that is easy to read is easy to maintain. Keep these tips in mind:

Use white space to help readability.
Objects and arrays really are referring to one thing. Don't use spaces within object references (., ->, []).
Unary operators are associated with their operand, so don't put a space between them. Do put a space on the side away from the operand. Unary operators include !, ~, ++, --, -, * (for pointers), & (address-of), casts, and sizeof.
Binary operators should have spaces on both sides: +, -, *, /, %, >>, <<, <, >, ==, !=, &, |, &&, ||, ?:, =, +=, and so on.
Don't use lack of spaces to indicate precedence (4+ 3*2).
Put a space after commas and semicolons, not before.
A parenthesis should not have spaces on either side.
Keywords, such as if, should be set off by a space. For example, if (a == b).
The body of a comment should be set off from the // with a space.
Place the pointer or reference indicator next to the type name--not the variable name:
```
    char* foo;
    int& theInt;
```
rather than
```
    char *foo;
    int &theInt;
```
Do not declare more than one variable on the same line.

Identifier Names

Here are some guidelines for working with identifiers:

Identifier names should be long enough to be descriptive.
Avoid cryptic abbreviations.
Take the time and energy to spell things out.
Short names (i, p, x, and so on) should be used only where their brevity makes the code more readable and where the usage is so obvious that a descriptive name is not needed.
The length of a variable's name should be proportional to its scope.
Make sure that identifiers look and sound different from one another in order to minimize confusion.
Function (or method) names usually are verbs or verb-noun phrases, such as Search(), Reset(), FindParagraph(), or ShowCursor(). Variable names usually are abstract nouns, possibly with an additional noun--for example, count, state, windSpeed, or windowHeight. Boolean variables should be named appropriately (windowIconized or fileIsOpen, for example).

Spelling and Capitalization of Names

Spelling and capitalization should not be overlooked when creating your own style. Some tips for these areas follow:

For names created with #define, use all uppercase letters, with underscores to separate the logical words, such as SOURCE_FILE_TEMPLATE. Note, however, that these are rare in C++. Consider using constants and templates in most cases.
All other identifiers should use mixed case and no underscores. Function names, methods, classes, typedef, and struct names should begin with a capital letter. Elements like data members or locals should begin with a lowercase letter.
Enumerated constants should begin with a few lowercase letters as an abbreviation for the enum, as shown in the following code:
```
    enum TextStyle
    {
        tsPlain,
        tsBold,
        tsItalic,
        tsUnderscore
    };
```

Comments

Comments can make it much easier to understand a program. Often, you will not work on a program for several days or even months, so you might forget what certain code does or why it has been included. Problems in understanding code also can occur when someone else reads your code. Comments that are applied in a consistent, well thought-out style can be well worth the effort. There are several tips to remember concerning comments:

Wherever possible, use C++ // comments rather than the /* */ style. Reserve C-style comments for blocking out sections of code.
Higher level comments are infinitely more important than process details. Add value; do not merely restate the code. For example:
```
    n++;   // n is incremented by one
```
This comment isn't worth the time it takes to type it in. Concentrate on the semantics of functions and blocks of code. Say what a function does. Indicate side effects, types of parameters, and return values. Describe all assumptions that are made (or not made), such as "assumes n is non-negative" or "will return -1 if x is invalid." Within complex logic, use comments to indicate the conditions that exist at that point in the code.
Use complete English sentences with appropriate punctuation and capitalization. The extra typing is worth it. Don't be overly cryptic and don't abbreviate. What seems exceedingly clear to you as you write code will be amazingly obtuse in a few months.
Use blank lines freely to help the reader understand what is going on. Separate statements into logical groups.

Access

The way you access portions of your program also should be consistent. Some tips for accessing parts of your program follow:

Always use public:, private:, and protected: labels. Don't rely on the defaults.
List the public members first, followed by the protected members, followed by the private members. List the data members in a group after the methods.
Put the constructor(s) first in the appropriate section, followed by the destructor. List overloaded methods with the same name adjacent to each other. Group accessor functions together when possible.
Consider alphabetizing the method names within each group and alphabetizing the member variables. Be sure to alphabetize the file names in include statements.
Even though the use of the virtual keyword is optional when overriding, use it anyway. It helps to remind you that it is virtual, and also keeps the declaration consistent.
Reserve the keyword struct for those classes that contain only data and all of whose members should be public. An alternative is to avoid this term altogether.

Class Definitions

Try to keep the definitions of methods in the same order as the declarations. It makes things easier to find. Alternatively, consider alphabetizing the methods, both in the declaration and in the cpp file.

When defining a function, place the return type and all other modifiers on a previous line so that the class name and function name begin on the left margin. This makes it much easier to find functions.

Include Files

Try as hard as you can to keep from including files in header files. The ideal minimum is the header file for the class this one derives from. Other mandatory includes will be those for objects that are members of the class being declared. Classes that are merely pointed to or referenced need only forward declarations.

Don't leave out an include file in a header just because you assume that whatever cpp file includes this one also will have the needed include.
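A single-file sketch of the forward-declaration idea follows. The Kennel and Dog classes are hypothetical, and the comments mark which parts would live in a header and which in a cpp file in a real project.

```cpp
#include <cassert>

// --- What would live in the header for Kennel ---

class Dog;    // forward declaration: Kennel only holds a pointer to a Dog

class Kennel
{
public:
    Kennel():
        pMyDog(0)
    {
    }
    void SetDog(Dog* pDog)
    {
        pMyDog = pDog;
    }
    int GetDogAge() const;    // uses Dog's members, so it is defined in the cpp file

private:
    Dog* pMyDog;
};

// --- What would live in the cpp file, after including Dog's header ---

class Dog
{
public:
    explicit Dog(int age):
        myAge(age)
    {
    }
    int GetAge() const
    {
        return myAge;
    }

private:
    int myAge;
};

int Kennel::GetDogAge() const
{
    assert(pMyDog);
    return pMyDog->GetAge();
}
```

Because the header for Kennel never includes the header for Dog, a change to Dog does not force every file that uses Kennel to recompile.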

Evolve Your Own Rules and Write Them Down

If you are working with other programmers, try to establish a set of style guidelines with which you can all work. This method will simplify greatly the job of exchanging code listings.

Reviewing Your Code

An essential component of writing good code is having your code, design, and documentation reviewed by others. A classic mistake in such reviews, however, is to mix design reviews and code reviews.

Design Reviews

A design review can be a very helpful experience, but it must be handled correctly. The first guideline is to review at the right time: late enough in the process to have thought through the major issues, but early enough that you can incorporate what you learn into the design.

A second essential for a successful review is that the design be summarized in a preliminary document circulated a few days before the review. The reviewers are obliged to read the document and to make notes; obviously, the document must be sufficiently complete that the reviewers are left with a correct impression of the overall design approach.

A danger in such review sessions is that the developer whose design is being reviewed can become very defensive. If your peers are reviewing your work, it is difficult to maintain your objectivity; if your boss is participating, it can be even more threatening.

One approach that has worked well is this: Don't try to solve any of the problems that arise during the review; just make a note and move on. The developer might want to talk about why she made one decision over another, but if a reviewer suggests an alternative approach, write it down and continue.

A delicate balance must be maintained between clarifying ideas on the one hand, and advocating and defending ideas on the other. The point of the review should be to clarify and articulate ideas, but evaluation of their merit should be left until after the meeting. If the developer is relatively senior, he might evaluate all the suggestions on his own, incorporating some and rejecting others based on his own judgment and experience. A more junior engineer might want to review the ideas with his manager, but the review meeting probably is not the right place.

Code Reviews

A code review is completely different from a design review. This time, code is circulated, and the reviewers assemble to discuss what they have found. The upfront investment by the reviewers is substantially greater; they must walk the code with a fine-tooth comb looking for syntax errors, memory leaks, logic flaws, and so on.

There is less room for judgment in a code review. The idea is not to question the overall approach, but to locate bugs and errors in the execution. The moderator, typically the developer whose code is being reviewed, must be careful to rule out of order comments on the design; the purpose must be a single-minded search for errors.

A good code review acts as a core sample, digging deep but with little breadth into the developer's product. The idea is not to find all the bugs in the program, but to uncover one or a few of each type of problem that might be represented in the rest of the program.

This is a good time to put emphasis on and get feedback about readability. Is the code well documented? Do the comments help? Are sections laid out in a clear and understandable fashion?

Knuth writes that good code can be read like a novel. Although C++ does not look like English to the uninitiated, the truth is that good code is easily understood.

Comments, of course, should be used to lay out the overall plot of the section of code and to clarify what might otherwise be a confusing section. The developer, however, should prefer that the code speak for itself whenever possible. As a simple example:

    int x = myObject.GetValue();   // get the age and assign it to the minimum age variable

is far less clear or maintainable than

    int minAge = employee.GetAge();

The second statement stands on its own, and because it needs no comment, there is no chance of the comment becoming obsolete and thus misleading.

Documentation Reviews

Documentation is like the weather: everyone complains when it is bad, but no one does anything about it. The truth is that writing good documentation takes so much time and money that few organizations get it right. Consider having the documentation for a set of code reviewed by someone other than the developer, preferably by the person who will be responsible for maintaining the code if the developer is not available.

Planning for Change

Developers often write their code as though they always will be around to take care of it, to nurture it, and to clarify any confusion. The reality is that developers move on to other projects (or other jobs!) and someone else often is stuck trying to disentangle a particularly clever (read: obtuse) bit of code. The only antidote to this disease is to ensure that the code is maintainable while it is being developed, and especially at delivery.

Summary

Today you learned about writing robust, bulletproof, highly tested, extensible, maintainable code. You saw how to use ASSERT macros and class Invariants() methods to add reliability to your code and how to test your assumptions.

Q&A

Q: What is the good of an assert if it will not be in the release code?

A: An assert is put into the code to alert you to problems during debugging. The point of an assert is to flag a bug. In theory, your release code will not have bugs, and thus will not need the assert. Then again, in theory, theory and practice are the same, but in practice they aren't.

Q: What is the difference between an assert and an exception?

A: An assert flags a bug in your code. An exception flags a predictable problem that must be handled properly. It is not a bug to run out of memory, but it is a bug to crash when you do run out of memory.

Q: What is the difference between maintainable and extensible code?

A: Maintainable code can be fixed. Extensible code can be extended; that is, new functionality easily can be added.

Workshop

The Workshop provides quiz questions to help you solidify your understanding of the material covered, and exercises to provide you with experience in using what you have learned. Try to answer the quiz and exercise questions before checking the answers in Appendix A, and make sure that you understand the answers before continuing to the next chapter.

Quiz

How do you ensure that the assert will not be in the release code?
What are side effects, and how do you prevent them?
What do you do if you want to assert if something isn't TRUE, but you need the evaluation done in the release code?
Where do you put the Invariants() assertion?
Why should destructors be virtual?


Exercises

Write an ASSERT macro definition.
Write a VERIFY macro definition.
Write the class invariants for the String class shown in listings 3.1 and 3.2 in day 3.
Rewrite the methods shown in exercise 3.2 to use the ASSERT macro you wrote in exercise 1 with the invariant methods you wrote in exercise 3.
