Ada 95 Quality and Style Guide Chapter 3
- Put the name of the program unit in the header.
- Record portability issues in the header.
- Summarize complex algorithms in the header.
- Record reasons for significant or controversial implementation decisions.
- Record discarded implementation alternatives, along with the reason for discarding them.
- Record anticipated changes in the header, especially if some work has already been
done to the code to make the changes easy to accomplish.
example
------------------------------------------------------------------------
-- Autolayout
-- Implementation Notes:
--   - This package uses a heuristic algorithm to minimize the number
--     of arc crossings.  It does not always achieve the true minimum
--     number which could theoretically be reached.  However it does a
--     nearly perfect job in relatively little time.  For details about
--     the algorithm, see ...
-- Portability Issues:
--   - The native math package Math_Lib is used for computations of
--     coordinate positions.
--   - 32-bit integers are required.
--   - No operating system specific routines are called.
-- Anticipated Changes:
--   - Coordinate_Type below could be changed from integer to float
--     with little effort.  Care has been taken to not depend on the
--     specific characteristics of integer arithmetic.
------------------------------------------------------------------------
package body Autolayout is
   ...
   ---------------------------------------------------------------------
   -- Define
   -- Implementation Notes:
   --   - This routine stores a node in the general purpose Graph data
   --     structure, not the Fast_Graph structure because ...
   ---------------------------------------------------------------------
   procedure Define (New_Node : in Node) is
   begin
      ...
   end Define;
   ---------------------------------------------------------------------
   -- Layout
   -- Implementation Notes:
   --   - This routine copies the Graph data structure (optimized for
   --     fast random access) into the Fast_Graph data structure
   --     (optimized for fast sequential iteration), then performs the
   --     layout, and copies the data back to the Graph structure.  This
   --     technique was introduced as an optimization when the algorithm
   --     was found to be too slow, and it produced an order of
   --     magnitude improvement.
   ---------------------------------------------------------------------
   procedure Layout is
   begin
      ...
   end Layout;
   ---------------------------------------------------------------------
   -- Position_Of
   ---------------------------------------------------------------------
   function Position_Of (Current : in Node) return Position is
   begin
      ...
   end Position_Of;
   ...
end Autolayout;
rationale
The purpose of a header comment on the body of a program unit is to help the maintainer of the program unit to understand the implementation of the unit, including tradeoffs among different techniques. Be sure to document all decisions made during implementation to prevent the maintainer from making the same mistakes you made. One of the most valuable comments to a maintainer is a clear description of why a change being considered will not work.
The header is also a good place to record portability concerns. The maintainer may have to port the software to a different environment and will benefit from a list of nonportable features. Furthermore, the act of collecting and recording portability issues focuses attention on these issues and may result in more portable code from the start.
Summarize complex algorithms in the header if the code is difficult to read or understand without such a summary, but do not merely paraphrase the code. Such duplication is unnecessary and hard to maintain. Similarly, do not repeat the information from the header of the program unit specification.
notes
It is often the case that a program unit is self-explanatory so that it does not require a body header to explain how it is implemented or why. In such a case, omit the header entirely, as in the case with Position_Of above. Be sure, however, that the header you omit truly contains no information. For example, consider the difference between the two header sections:
-- Implementation Notes: None.
and:
-- NonPortable Features: None.
The first is a message from the author to the maintainer saying "I can't think of anything else to tell you," while the second may mean "I guarantee that this unit is entirely portable."
Objects can be grouped by purpose and commented as:
...
---------------------------------------------------------------------
-- Current position of the cursor in the currently selected text
-- buffer, and the most recent position explicitly marked by the
-- user.
-- Note:  It is necessary to maintain both current and desired
--        column positions because the cursor cannot always be
--        displayed in the desired position when moving between
--        lines of different lengths.
---------------------------------------------------------------------
Desired_Column : Column_Counter;
Current_Column : Column_Counter;
Current_Row    : Row_Counter;
Marked_Column  : Column_Counter;
Marked_Row     : Row_Counter;
The conditions under which an exception is raised should be commented:
---------------------------------------------------------------------
-- Exceptions
---------------------------------------------------------------------
Node_Already_Defined : exception;   -- Raised when an attempt is made
                                    --|   to define a node with an
                                    --|   identifier which already
                                    --|   defines a node.
Node_Not_Defined     : exception;   -- Raised when a reference is
                                    --|   made to a node which has
                                    --|   not been defined.
Here is a more complex example, involving multiple record and access types that are used to form a complex data structure:
---------------------------------------------------------------------
-- These data structures are used to store the graph during the
-- layout process. The overall organization is a sorted list of
-- "ranks," each containing a sorted list of nodes, each containing
-- a list of incoming arcs and a list of outgoing arcs.
-- The lists are doubly linked to support forward and backward
-- passes for sorting. Arc lists do not need to be doubly linked
-- because order of arcs is irrelevant.
-- The nodes and arcs are doubly linked to each other to support
-- efficient lookup of all arcs to/from a node, as well as efficient
-- lookup of the source/target node of an arc.
---------------------------------------------------------------------
type Arc;
type Arc_Pointer is access Arc;
type Node;
type Node_Pointer is access Node;
type Node is
   record
      ID       : Node_ID;  -- Unique node ID supplied by the user.
      Arc_In   : Arc_Pointer;
      Arc_Out  : Arc_Pointer;
      Next     : Node_Pointer;
      Previous : Node_Pointer;
   end record;
type Arc is
   record
      ID     : Arc_ID;     -- Unique arc ID supplied by the user.
      Source : Node_Pointer;
      Target : Node_Pointer;
      Next   : Arc_Pointer;
   end record;
type Rank;
type Rank_Pointer is access Rank;
type Rank is
   record
      Number     : Level_ID;  -- Computed ordinal number of the rank.
      First_Node : Node_Pointer;
      Last_Node  : Node_Pointer;
      Next       : Rank_Pointer;
      Previous   : Rank_Pointer;
   end record;
First_Rank : Rank_Pointer;
Last_Rank  : Rank_Pointer;
rationale
It is very useful to add comments explaining the purpose, structure, and semantics of the data structures. Many maintainers look at the data structures first when trying to understand the implementation of a unit. Understanding the data that can be stored, along with the relationships between the different data items and the flow of data through the unit, is an important first step in understanding the details of the unit.
In the first example above, the names Current_Column and Current_Row are relatively self-explanatory. The name Desired_Column is also well chosen, but it leaves the reader wondering what the relationship is between the current column and the desired column. The comment explains the reason for having both.
Another advantage of commenting on the data declarations is that the single set of comments on a declaration can replace multiple sets of comments that might otherwise be needed at various places in the code where the data is manipulated. In the first example above, the comment briefly expands on the meaning of "current" and "marked." It states that the "current" position is the location of the cursor, the "current" position is in the current buffer, and the "marked" position was marked by the user. This comment, along with the mnemonic names of the variables, greatly reduces the need for comments at individual statements throughout the code.
It is important to document the full meaning of exceptions and under what conditions they can be raised, as shown in the second example above, especially when the exceptions are declared in a package specification. The reader has no other way to find out the exact meaning of the exception (without reading the code in the package body).
Grouping all the exceptions together, as shown in the second example, can provide the reader with the effect of a "glossary" of special conditions. This is useful when many different subprograms in the package can raise the same exceptions. For a package in which each exception can be raised by only one subprogram, it may be better to group related subprograms and exceptions together.
When commenting exceptions, it is better to describe the exception's meaning in general terms than to list all the subprograms that can cause the exception to be raised; such a list is harder to maintain. When a new routine is added, it is likely that these lists will not be updated. Also, this information is already present in the comments describing the subprograms, where all exceptions that can be raised by the subprogram should be listed. Lists of exceptions by subprogram are more useful and easier to maintain than lists of subprograms by exception.
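As a minimal sketch of this advice, the following program describes an exception in general terms at its declaration and lists it in the header comment of the subprogram that can raise it. The package Graph, the type Node_ID, and the "Exceptions:" header section are assumptions made for illustration; only Node_Already_Defined and Define come from the examples above. The package is nested in a procedure only so that the sketch compiles as a single unit.

```ada
with Ada.Text_IO;

procedure Exception_Comment_Demo is

   package Graph is
      type Node_ID is range 1 .. 10;  -- hypothetical identifier type

      Node_Already_Defined : exception;  -- Raised when an attempt is made
                                         --|   to define a node with an
                                         --|   identifier which already
                                         --|   defines a node.

      ------------------------------------------------------------------
      -- Define
      -- Exceptions:
      --   - Node_Already_Defined
      ------------------------------------------------------------------
      procedure Define (ID : in Node_ID);
   end Graph;

   package body Graph is
      Defined : array (Node_ID) of Boolean := (others => False);

      procedure Define (ID : in Node_ID) is
      begin
         if Defined (ID) then
            raise Node_Already_Defined;
         end if;
         Defined (ID) := True;
      end Define;
   end Graph;

begin
   Graph.Define (3);
   Graph.Define (3);  -- second definition of the same identifier
exception
   when Graph.Node_Already_Defined =>
      Ada.Text_IO.Put_Line ("Node 3 was already defined.");
end Exception_Comment_Demo;
```

If another subprogram that raises Node_Already_Defined is added later, only that subprogram's header needs the new entry; the comment on the exception declaration stays unchanged.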
In the third example, the names of the record fields are short and mnemonic, but they are not completely self-explanatory. This is often the case with complex data structures involving access types. There is no way to choose the record and field names so that they completely explain the overall organization of the records and pointers into a nested set of sorted lists. The comments shown are useful in this case. Without them, the reader would not know which lists are sorted, which lists are doubly linked, or why. The comments express the intent of the author with respect to this complex data structure. The maintainer still has to read the code if he wants to be sure that the double links are all properly maintained. Keeping this in mind when reading the code makes it much easier for the maintainer to find a bug where one pointer is updated and the opposite one is not.
See Guideline 9.3.1 for the rationale for documenting the use of redispatching operations. (Redispatching means converting an argument of one primitive operation to a class-wide type and making a dispatching call to another primitive operation.) The rationale in Guideline 9.3.1 discusses whether such documentation should be in the specification or the body.
The following is an example of very poorly commented code:
...
-- Loop through all the strings in the array Strings, converting
-- them to integers by calling Convert_To_Integer on each one,
-- accumulating the sum of all the values in Sum, and counting them
-- in Count.  Then divide Sum by Count to get the average and store
-- it in Average. Also, record the maximum number in the global
-- variable Max_Number.
for I in Strings'Range loop
   -- Convert each string to an integer value by looping through
   -- the characters which are digits, until a nondigit is found,
   -- taking the ordinal value of each, subtracting the ordinal value
   -- of '0', and multiplying by 10 if another digit follows.  Store
   -- the result in Number.
   Number := Convert_To_Integer(Strings(I));
   -- Accumulate the sum of the numbers in Total.
   Sum := Sum + Number;
   -- Count the numbers.
   Count := Count + 1;
   -- Decide whether this number is more than the current maximum.
   if Number > Max_Number then
      -- Update the global variable Max_Number.
      Max_Number := Number;
   end if;
end loop;
-- Compute the average.
Average := Sum / Count;
The following is improved by not repeating things in the comments that are obvious from the code, not describing the details of what goes on inside Convert_To_Integer, deleting an erroneous comment (the one on the statement that accumulates the sum), and making the few remaining comments more visually distinct from the code.
Sum_Integers_Converted_From_Strings:
   for I in Strings'Range loop
      Number := Convert_To_Integer(Strings(I));
      Sum    := Sum + Number;
      Count  := Count + 1;

      -- The global Max_Number is computed here for efficiency.
      if Number > Max_Number then
         Max_Number := Number;
      end if;

   end loop Sum_Integers_Converted_From_Strings;

Average := Sum / Count;
rationale
The improvements shown in the example are not improvements merely by reducing the total number of comments; they are improvements by reducing the number of useless comments.
Comments that paraphrase or explain obvious aspects of the code have no value. They are a waste of effort for the author to write and the maintainer to update. Therefore, they often end up becoming incorrect. Such comments also clutter the code, hiding the few important comments.
Comments describing what goes on inside another unit violate the principle of information hiding. The details about Convert_To_Integer (deleted above) are irrelevant to the calling unit, and they are better left hidden in case the algorithm ever changes. Comments explaining what goes on elsewhere in the code are very difficult to maintain and almost always become incorrect at the first code modification.
The advantage of making comments visually distinct from the code is that it makes the code easier to scan, and the few important comments stand out better. Highlighting unusual or special code features indicates that they are intentional. This assists maintainers by focusing attention on code sections that are likely to cause problems during maintenance or when porting the program to another implementation.
Comments should be used to document code that is nonportable, implementation-dependent, environment-dependent, or tricky in any way. They notify the reader that something unusual was put there for a reason. A beneficial comment would be one explaining a workaround for a compiler bug. If you use a lower-level (not "ideal" in the software engineering sense) solution, comment on it. The comment should state why you used that particular construct. Also document any failed attempts, for example, an attempt to use a higher-level structure. This kind of comment is useful to maintainers for historical purposes, and it shows the reader that a significant amount of thought went into the choice of a construct.
Finally, comments should be used to explain what is not present in the code as well as what is present. If you make a conscious decision to not perform some action, like deallocating a data structure with which you appear to be finished, be sure to add a comment explaining why not. Otherwise, a maintainer may notice the apparent omission and "correct" it later, thus introducing an error.
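A minimal sketch of such a comment follows. The list structure and the undo scenario are hypothetical, chosen only to give the omitted deallocation a concrete reason; the names Node, Node_Pointer, Head, and Last_Removed are local to this sketch.

```ada
with Ada.Text_IO;

procedure Omission_Comment_Demo is
   type Node;
   type Node_Pointer is access Node;
   type Node is
      record
         Value : Integer;
         Next  : Node_Pointer;
      end record;

   Head         : Node_Pointer;
   Last_Removed : Node_Pointer;  -- Most recently removed node, kept
                                 --|   so that the removal can be undone.
begin
   Head := new Node'(Value => 2, Next => null);
   Head := new Node'(Value => 1, Next => Head);

   Last_Removed := Head;
   Head         := Head.Next;

   -- The removed node is intentionally NOT deallocated here.  It must
   -- stay allocated so that the removal can be undone; freeing it
   -- would leave Last_Removed dangling.

   -- Undo: relink the saved node at the front of the list.
   Last_Removed.Next := Head;
   Head              := Last_Removed;

   Ada.Text_IO.Put_Line ("First value:" & Integer'Image (Head.Value));
end Omission_Comment_Demo;
```

Without the comment, a maintainer could reasonably add a deallocation right after the removal, and the undo step would then dereference a dangling pointer.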
See also Guideline 9.3.1 for a discussion of what kind of documentation you should provide regarding tagged types and redispatching.
notes
Further improvements can be made on the above example by declaring the variables Count and Sum in a local block so that their scope is limited and their initializations occur near their usage, e.g., by naming the block Compute_Average or by moving the code into a function called Average_Of. The computation of Max_Number can also be separated from the computation of Average. However, those changes are the subject of other guidelines; this example is only intended to illustrate the proper use of comments.
if A_Found then
   ...
elsif B_Found then
   ...
else  -- A and B were both not found
   ...
   if Count = Max then
      ...
   end if;
   ...
end if;  -- A_Found

------------------------------------------------------------------------
package body Abstract_Strings is
   ...
   ---------------------------------------------------------------------
   procedure Concatenate (...) is
   begin
      ...
   end Concatenate;
   ---------------------------------------------------------------------
   ...
begin  -- Abstract_Strings
   ...
end Abstract_Strings;
------------------------------------------------------------------------
rationale
Marker comments emphasize the structure of code and make it easier to scan. They can be lines that separate sections of code or descriptive tags for a construct. They help the reader resolve questions about the current position in the code. This is more important for large units than for small ones. A short marker comment fits on the same line as the reserved word with which it is associated. Thus, it adds information without clutter.
The if, elsif, else, and end if of an if statement are often separated by long sequences of statements, sometimes involving other if statements. As shown in the first example, marker comments emphasize the association of the keywords of the same statement over a great visual distance. Marker comments are not necessary with the block statement and loop statement because the syntax of these statements allows them to be named with the name repeated at the end. Using these names is better than using marker comments because the compiler verifies that the names at the beginning and end match.
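The point about named loops and blocks can be sketched as follows. The data, the zero sentinel, and the names Sum_Scores and Report are assumptions made for illustration. If the name after end loop or end did not match the name at the start of the statement, the compiler would reject the unit, which a marker comment cannot guarantee.

```ada
with Ada.Text_IO;

procedure Named_Constructs_Demo is
   -- Hypothetical data, chosen only for illustration.
   type Score_List is array (Positive range <>) of Natural;
   Scores : constant Score_List := (70, 85, 0, 90);
   Sum    : Natural := 0;
begin
   Sum_Scores:
      for I in Scores'Range loop
         -- A zero is treated as an end-of-data sentinel (an assumption).
         exit Sum_Scores when Scores (I) = 0;
         Sum := Sum + Scores (I);
      end loop Sum_Scores;  -- name is checked by the compiler

   Report:
      declare
         Message : constant String := "Sum =" & Natural'Image (Sum);
      begin
         Ada.Text_IO.Put_Line (Message);
      end Report;           -- name is checked by the compiler
end Named_Constructs_Demo;
```

The loop name also makes the exit statement unambiguous if further loops are nested inside it later.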
The sequence of statements of a package body is often very far from the first line of the package. Many subprogram bodies, each containing many begin lines, may occur first. As shown in the second example, the marker comment emphasizes the association of the begin with the package.
notes
Repeating names and noting conditional expressions clutters the code if overdone. It is visual distance, especially page breaks, that makes marker comments beneficial.
subtype Card_Image is String (1 .. 80);
Input_Line : Card_Image := (others => ' ');
-- restricted integer type:
type    Day_Of_Leap_Year     is                  range 1 .. 366;
subtype Day_Of_Non_Leap_Year is Day_Of_Leap_Year range 1 .. 365;
By the following declaration, the programmer means, "I haven't the foggiest idea how many," but the actual base range will show up buried in the code or as a system parameter:
Employee_Count : Integer;
rationale
Eliminating meaningless values from the legal range improves the compiler's ability to detect errors when an object is set to an invalid value. This also improves program readability. In addition, it forces you to carefully think about each use of objects declared to be of the subtype.
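A minimal sketch of the error detection described here, reusing the Day_Of_Leap_Year declarations from the example above. The function To_Non_Leap_Day is a hypothetical helper introduced only to make the range check happen at run time.

```ada
with Ada.Text_IO;

procedure Range_Check_Demo is
   type    Day_Of_Leap_Year     is                  range 1 .. 366;
   subtype Day_Of_Non_Leap_Year is Day_Of_Leap_Year range 1 .. 365;

   -- Hypothetical helper: narrows a day number to the non-leap range.
   function To_Non_Leap_Day
         (Day : in Day_Of_Leap_Year) return Day_Of_Non_Leap_Year is
   begin
      return Day;  -- the range check against 1 .. 365 is made here
   end To_Non_Leap_Day;

   Today : Day_Of_Non_Leap_Year;
begin
   Today := To_Non_Leap_Day (365);
   Ada.Text_IO.Put_Line
     ("Accepted day" & Day_Of_Non_Leap_Year'Image (Today));

   Today := To_Non_Leap_Day (366);  -- raises Constraint_Error
   Ada.Text_IO.Put_Line ("Not reached");
exception
   when Constraint_Error =>
      Ada.Text_IO.Put_Line ("Day 366 rejected: outside 1 .. 365");
end Range_Check_Demo;
```

Had Today been declared as Integer, the value 366 would have been stored silently and the error would surface, if at all, much later.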
Different implementations provide different sets of values for most of the predefined types. A reader cannot determine the intended range from the predefined names. This situation is aggravated when the predefined names are overloaded.
The names of an object and its subtype can clarify their intended use and document low-level design decisions. The example above documents a design decision to restrict the software to devices whose physical parameters are derived from the characteristics of punch cards. This information is easy to find for any later changes, thus enhancing program maintainability.
You can rename a type by declaring a subtype without a constraint (Ada Reference Manual 1995, §8.5). You cannot overload a subtype name; overloading only applies to callable entities. Enumeration literals are treated as parameterless functions and so are included in this rule.
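A minimal sketch of renaming a type with an unconstrained subtype declaration. The name Day_Number is hypothetical. Because the subtype adds no constraint, both names denote the same type and the same range, so no conversion is needed between them.

```ada
with Ada.Text_IO;

procedure Subtype_Rename_Demo is
   type Day_Of_Leap_Year is range 1 .. 366;

   -- No constraint is given, so Day_Number (a hypothetical name) is
   -- simply another name for Day_Of_Leap_Year.
   subtype Day_Number is Day_Of_Leap_Year;

   First  : Day_Of_Leap_Year := 1;
   Second : Day_Number       := 366;
begin
   First := Second;  -- same type, so no type conversion is required
   Ada.Text_IO.Put_Line ("First =" & Day_Of_Leap_Year'Image (First));
end Subtype_Rename_Demo;
```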
Types can have highly constrained sets of values without eliminating useful values. Usage as described in Guideline 5.3.1 eliminates many flag variables and type conversions within executable statements. This renders the program more readable while allowing the compiler to enforce strong typing constraints.
notes
Subtype declarations do not define new types, only constraints for existing types.
Any deviation from this guideline detracts from the advantages of the strong typing facilities of the Ada language.
exceptions
There are cases where you do not have a particular dependence on any range of numeric values. Such situations occur, for example, with array indices (e.g., a list whose size is not fixed by any particular semantics). See Guideline 7.2.1 for a discussion of appropriate uses of predefined types.
Use:
type Color is (Blue, Red, Green, Yellow);
rather than:
Blue   : constant := 1;
Red    : constant := 2;
Green  : constant := 3;
Yellow : constant := 4;
and add the following if necessary:
for Color use (Blue => 1, Red => 2, Green => 3, Yellow => 4);
rationale
Enumerations are more robust than numeric codes; they leave less potential for errors resulting from incorrect interpretation and from additions to and deletions from the set of values during maintenance. Numeric codes are holdovers from languages that have no user-defined types.
In addition, Ada provides a number of attributes ('Pos, 'Val, 'Succ, 'Pred, 'Image, and 'Value) for enumeration types that, when used, are more reliable than user-written operations on encodings.
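The attributes listed above can be sketched on the Color type as follows. The procedure name and the variable Chosen are assumptions made for illustration.

```ada
with Ada.Text_IO;

procedure Enumeration_Attributes_Demo is
   type Color is (Blue, Red, Green, Yellow);
   Chosen : Color;
begin
   -- 'Value converts text to a literal; letter case is not significant.
   Chosen := Color'Value ("green");

   -- 'Image yields the name of the literal in upper case.
   Ada.Text_IO.Put_Line (Color'Image (Chosen));               -- GREEN
   Ada.Text_IO.Put_Line (Color'Image (Color'Succ (Chosen)));  -- YELLOW
   Ada.Text_IO.Put_Line (Color'Image (Color'Pred (Chosen)));  -- RED

   -- 'Pos and 'Val map between literals and position numbers, which
   -- start at 0 for the first literal.
   Ada.Text_IO.Put_Line (Integer'Image (Color'Pos (Chosen)));
   Ada.Text_IO.Put_Line (Color'Image (Color'Val (0)));        -- BLUE

   -- Iterating over the type requires no knowledge of any encoding.
   for C in Color loop
      Ada.Text_IO.Put_Line (Color'Image (C));
   end loop;
end Enumeration_Attributes_Demo;
```

With numeric constants, each of these operations would have to be written by hand and updated whenever a value is added or removed.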
A numeric code may at first seem appropriate to match external values. Instead, these situations call for a representation clause on the enumeration type. The representation clause documents the "encoding." If the program is properly structured to isolate and encapsulate hardware dependencies (see Guideline 7.1.5), the numeric code ends up in an interface package where it can be easily found and replaced if the requirements change.
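The effect of a representation clause on the Color type can be sketched as follows. The Size clause of 8 bits and the helper To_Code are assumptions for illustration. Unchecked_Conversion is used here only to inspect the stored encoding; it is not a recommended way to process enumeration values.

```ada
with Ada.Text_IO;
with Ada.Unchecked_Conversion;
with Interfaces;

procedure Color_Encoding_Demo is
   type Color is (Blue, Red, Green, Yellow);
   for Color use (Blue => 1, Red => 2, Green => 3, Yellow => 4);
   for Color'Size use 8;  -- assumed width of the external code

   -- Hypothetical helper used only to look at the stored encoding.
   function To_Code is
      new Ada.Unchecked_Conversion (Source => Color,
                                    Target => Interfaces.Unsigned_8);
begin
   for C in Color loop
      Ada.Text_IO.Put_Line
        (Color'Image (C)
         & ": position" & Integer'Image (Color'Pos (C))
         & ", code" & Interfaces.Unsigned_8'Image (To_Code (C)));
   end loop;
end Color_Encoding_Demo;
```

The position numbers reported by 'Pos still run from 0 to 3, while the stored codes are 1 to 4. The rest of the program works with the literals and never needs to mention the codes.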
In general, avoid using representation clauses for enumeration types. When there is no obvious ordering of the enumeration literals, an enumeration representation can create portability problems if the enumeration type must be reordered to accommodate a change in representation order on the new platform.