home *** CD-ROM | disk | FTP | other *** search
Text File | 1992-10-18 | 38.2 KB | 1,214 lines |
- .RP
- .TL
- Indian Hill C Style and Coding Standards
- .br
- as amended for U of T Zoology
- .UX
- .AU
- L.W. Cannon
- R.A. Elliott
- L.W. Kirchhoff
- J.H. Miller
- J.M. Milner
- R.W. Mitze
- E.P. Schan
- N.O. Whittington
- .AI
- Bell Labs
- .AU
- Henry Spencer
- .AI
- Zoology Computer Systems
- University of Toronto
- .AB
- This document is an annotated
- (by the last author) version of the original paper of the same title.
- It describes a set of coding standards and recommendations
- which are local standards for officially-supported
- .UX
- programs.
- The scope is coding style, not functional organization.
- .AE
- .\"--------------------
- .\" Footnote numbering
- .ds f \\u\\n+f\\d
- .nr f 0 1
- .ds F \\n+F.
- .nr F 0 1
- .\"--------------------
- .NH
- Introduction
- .PP
- This document is a result of a committee formed at Indian Hill to establish
- a common set of coding standards and recommendations for the Indian Hill
- community.
- The scope of this work is the coding style,
- not the functional organization of programs.
- The standards in this document are not specific to ESS programming
- only\*f.
- .FS
- .IP \*F
- In fact, they're pretty good general standards.
- ``To be clear is professional; not to be clear is unprofessional.'' \(em Sir
- Ernest Gowers.
- This document is presented unadulterated; U of T variations, comments,
- exceptions, etc. are presented in footnotes.
- .FE
- We have tried to combine previous work [1,6] on C style into a uniform
- set of standards that should be appropriate for any project using C\*f.
- .FS
- .IP \*F
- Of necessity, these standards cannot cover all situations.
- Experience and informed judgement count for much.
- Inexperienced programmers who encounter unusual situations should
- consult 1) code written by experienced C programmers following
- these rules, or 2) experienced C programmers.
- .FE
- .NH
- File Organization
- .PP
- A file consists of various sections that should be separated by
- several blank lines.
- Although there is no maximum length requirement for source files,
- files with more than about 1500 lines are cumbersome to deal with.
- The editor may not have enough temp space to edit the file,
- compilations will go slower,
- etc.
- Since most of us use 300 baud terminals,
- entire rows of asterisks, for example, should be discouraged\*f.
- .FS
- .IP \*F
- This is not a problem at U of T, or most other sensible places,
- but rows of asterisks are still annoying.
- .FE
- Also lines longer than 80 columns are not handled well by all terminals
- and should be avoided if possible\*f.
- .FS
- .IP \*F
- Excessively long lines which result from deep indenting are often
- a symptom of poorly-organized code.
- .FE
- .PP
- The suggested order of sections for a file is as follows:
- .IP 1.
- Any header file includes should be the first thing in the file.
- .IP 2.
- Immediately after the includes\*f should be a prologue that tells what
- .FS
- .IP \*F
- A common variation, in both Bell code and ours, is to reverse the
- order of sections 1 and 2.
- This is an acceptable practice.
- .FE
- is in that file.
- A description of the purpose of the objects in the files (whether
- they be functions, external data declarations or definitions, or
- something else) is more useful than a list of the object names.
- .IP 3.
- Any typedefs and defines that apply to the file as a whole are next.
- .IP 4.
- Next come the global (external) data declarations.
- If a set of defines applies to a particular piece of global data
- (such as a flags word), the defines should be immediately after
- the data declaration\*f.
- .FS
- .IP \*F
- Such defines should be indented to put the \fIdefine\fRs one level
- deeper than the first keyword of the declaration to which they apply.
- .FE
- .IP 5.
- The functions come last\*f.
- .FS
- .IP \*F
- They should be in some sort of meaningful order.
- Top-down is generally better than bottom-up, and a ``breadth-first''
- approach (functions on a similar level of abstraction together) is
- preferred over depth-first (functions defined as soon as possible
- after their calls).
- Considerable judgement is called for here.
- If defining large numbers of essentially-independent utility
- functions, consider alphabetical order.
- .FE
- .NH 2
- File Naming Conventions
- .PP
- .UX
- requires certain suffix conventions for names of files to be processed
- by the \fIcc\fR command [5]\*f.
- The following suffixes are required:
- .IP \(bu
- C source file names must end in \fI.c\fR
- .IP \(bu
- Assembler source file names must end in \fI.s\fR
- .PP
- In addition the following conventions are universally followed:
- .FS
- .IP \*F
- In addition to the suffix conventions given here,
- it is conventional to use `Makefile' (not `makefile') for the
- control file for \fImake\fR
- and
- `README' for a summary of the contents of a directory or directory
- tree.
- .FE
- .IP \(bu
- Relocatable object file names end in \fI.o\fR
- .IP \(bu
- Include header file names end in \fI.h\fR\ \*f or \fI.d\fR
- .FS
- .IP \*F
- Preferred.
- An alternate convention that may
- be preferable in multi-language environments
- is to use the same suffix as an ordinary source file
- but with two periods instead of one (e.g. ``foo..c'').
- .FE
- .IP \(bu
- Ldp\*f specification file names end in \fI.b\fR
- .FS
- .IP \*F
- No idea what this is.
- .FE
- .IP \(bu
- Yacc source file names end in \fI.y\fR
- .IP \(bu
- Lex source file names end in \fI.l\fR
- .NH
- Header Files
- .PP
- Header files are files that are included in other files prior to compilation
- by the C preprocessor.
- Some are defined at the system level like \fIstdio.h\fR which must be included
- by any program using the standard I/O library.
- Header files are also used to contain data declarations and defines
- that are needed by more than one program\*f.
- .FS
- .IP \*F
- Don't use absolute pathnames for header files.
- Use the \fI<name>\fR construction for getting them from a standard
- place, or define them relative to the current directory.
- The \fB\-I\fR option of the C compiler is the best way to handle
- extensive private libraries of header files; it permits reorganizing
- the directory structure without having to alter source files.
- .FE
- Header files should be functionally organized,
- i.e., declarations for separate subsystems should be in separate header files.
- Also, if a set of declarations is likely to change when code is
- ported from one machine to another, those declarations should be
- in a separate header file.
- .PP
- Header files should not be nested.
- Some objects like typedefs and initialized data definitions cannot be
- seen twice by the compiler in one compilation.
- On non-\c
- .UX
- systems this is also true of uninitialized declarations without the
- \fIextern\fR keyword\*f.
- .FS
- .IP \*F
- It should be noted that declaring variables in a header file
- is often a poor idea.
- Frequently it is a symptom of poor partitioning of code between files.
- .FE
- This can happen if include files are nested and will cause the
- compilation to fail.
- .NH
- External Declarations
- .PP
- External declarations should begin in column 1.
- Each declaration should be on a separate line.
- A comment describing the role of the object being declared should be
- included, with the exception that a list of defined constants do not
- need comments if the constant names are sufficient documentation.
- The comments should be tabbed so that they line up underneath each
- other\*f.
- .FS
- .IP \*F
- So should the constant names and their defined values.
- .FE
- Use the tab character (CTRL I if your terminal doesn't have a separate key)
- rather than blanks.
- For structure and union template declarations, each element should be alone
- on a line with a comment describing it.
- The opening brace (\ {\ ) should be on the same line as the structure
- tag, and the closing brace should be alone on a line in column 1, i.e.
- .DS L
- struct boat {
- int wllength; /* water line length in feet */
- int type; /* see below */
- long sarea; /* sail area in square feet */
- };
- /*
- * defines for boat.type\*f
- */
- #define KETCH 1
- #define YAWL 2
- #define SLOOP 3
- #define SQRIG 4
- #define MOTOR 5
- .DE
- .FS
- .IP \*F
- These defines are better put right after the declaration of \fItype\fR,
- within the \fIstruct\fR declaration, with enough tabs
- after # to indent \fIdefine\fR one level more than
- the structure member declarations.
- .FE
- If an external variable is initialized\*f the equal sign should not be omitted\*f.
- .FS
- .IP \*F
- Any variable whose initial value is important should be
- \fIexplicitly\fR initialized, or at the very least should be commented
- to indicate that C's default initialization to 0 is being relied on.
- .FE
- .FS
- .IP \*F
- The empty initializer, ``{}'', should never be used.
- Structure
- initializations should be fully parenthesized with braces.
- Constants used to initialize longs should be explicitly long.
- .FE
- .DS
- int x = 1;
- char *msg = "message";
- struct boat winner = {
- 40, /* water line length */
- YAWL,
- 600 /* sail area */
- };
- .DE
- \*f
- .FS
- .IP \*F
- In any file which is part of a larger whole rather than a self-contained
- program, maximum use should be made of the \fIstatic\fR keyword to make
- functions and variables local to single files.
- Variables in particular should be accessible from other files
- only when there is a clear
- need that cannot be filled in another way.
- Such usages should be commented to make it clear that another file's
- variables are being used; the comment should name the other file.
- .FE
- .NH
- Comments
- .PP
- Comments that describe data structures, algorithms, etc., should be
- in block comment form with the opening /* in column one, a * in column
- 2 before each line of comment text\*f, and the closing */ in columns 2-3.
- .FS
- .IP \*F
- Some automated
- program-analysis
- packages use a different character in this position as
- a marker for lines with specific items of information.
- In particular, a line with a `-' here in a comment preceding a function
- is sometimes assumed to be a one-line summary of the function's purpose.
- .FE
- .DS L
- /*
- * Here is a block comment.
- * The comment text should be tabbed over\*f
- * and the opening /* and closing star-slash
- * should be alone on a line.
- */
- .DE
- .FS
- .IP \*F
- A common practice in both Bell and local code is to
- use a space rather than a tab after the *.
- This is acceptable.
- .FE
- .PP
- Note that \fIgrep ^.\e*\fR will catch all block comments in the file.
- In some cases, block comments inside a function are appropriate, and
- they should be tabbed over to the same tab setting as the code that
- they describe.
- Short comments may appear on a single line indented over to the tab
- setting of the code that follows.
- .DS
- if (argc > 1) {
- /* Get input file from command line. */
- if (freopen(argv[1], "r", stdin) == NULL)
- error("can't open %s\en", argv[1]);
- }
- .DE
- .PP
- Very short comments may appear on the same line as the code they describe,
- but should be tabbed over far enough to separate them from the statements.
- If more than one short comment appears in a block of code they should all
- be tabbed to the same tab setting.
- .DS
- if (a == 2)
- return(TRUE); /* special case */
- else
- return(isprime(a)); /* works only for odd a */
- .DE
- .NH
- Function Declarations
- .PP
- Each function should be preceded by a block comment prologue that gives
- the name and a short description of what the function does\*f.
- .FS
- .IP \*F
- Discussion of non-trivial design decisions is also appropriate,
- but avoid duplicating information that is present in (and clear from)
- the code.
- It's too easy for such redundant information to get out of date.
- .FE
- If the function returns a value, the type of the value returned should
- be alone on a line in column 1 (do not default to \fIint\fR).
- If the function does not return a value then it should not be given
- a return type.
- If the value returned requires a long explanation, it should be given
- in the prologue; otherwise it can be on the same line as the return
- type, tabbed over.
- The function name and formal parameters should be alone on a line
- beginning in column 1.
- Each parameter should be declared (do not default to \fIint\fR),
- with a comment on a single line.
- The opening brace of the function body should also be alone on a line
- beginning in column 1.
- The function name, argument declaration list, and opening brace
- should be separated by a blank line\*f.
- .FS
- .IP \*F
- Neither Bell nor local code has ever included these separating blank
- lines, and it is not clear that they add anything useful.
- Leave them out.
- .FE
- All local declarations and code within the function body should be
- tabbed over at least one tab.
- .PP
- If the function uses any external variables, these should have their
- own declarations in the function body using the \fIextern\fR keyword.
- If the external variable is an array the array bounds must be repeated
- in the \fIextern\fR declaration.
- There should also be \fIextern\fR declarations for all functions called
- by a given function.
- This is particularly beneficial to someone picking up code written
- by another.
- If a function returns a value of type other than \fIint\fR,
- it is required by the compiler that such functions be declared before
- they are used.
- Having the \fIextern\fR delcaration in the calling function's
- declarations section avoids all such problems\*f.
- .FS
- .IP \*F
- These rules tend to produce a lot of clutter.
- Both Bell and local practice frequently omits \fIextern\fR declarations
- for \fIstatic\fR variables and functions.
- This is permitted.
- Omission of declarations for standard library routines
- is also permissible,
- although if they \fIare\fR declared it is better to declare them within
- the functions that use them rather than globally.
- .FE
- .PP
- In general each variable declaration should be on a separate line with
- a comment describing the role played by the variable in the function.
- If the variable is external or a parameter of type pointer which is
- changed by the function, that should be noted in the comment.
- All such comments for parameters and local variables should be
- tabbed so that they line up underneath each other.
- The declarations should be separated from the function's statements
- by a blank line.
- .PP
- A local variable should not be redeclared in nested blocks\*f.
- .FS
- .IP \*F
- In fact, avoid any local declarations that override declarations
- at higher levels.
- .FE
- Even though this is valid C, the potential confusion is enough
- that
- \fIlint\fR will complain about it when given the \fB\-h\fR option.
- .NH 2
- Examples
- .PP
- .DS L
- /*
- * skyblue()
- *
- * Determine if the sky is blue.
- */
-
- int /* TRUE or FALSE */
- skyblue()
-
- {
- extern int hour;
-
- if (hour < MORNING || hour > EVENING)
- return(FALSE); /* black */
- else
- return(TRUE); /* blue */
- }
- .DE
- .DS L
- /*
- * tail(nodep)
- *
- * Find the last element in the linked list
- * pointed to by nodep and return a pointer to it.
- */
-
- NODE * /* pointer to tail of list */
- tail(nodep)
-
- NODE *nodep; /* pointer to head of list */
-
- {
- register NODE *np; /* current pointer advances to NULL */
- register NODE *lp; /* last pointer follows np */
-
- np = lp = nodep;
- while ((np = np->next) != NULL)
- lp = np;
- return(lp);
- }
- .DE
- .NH
- Compound Statements
- .PP
- Compound statements are statements that contain lists of statements
- enclosed in braces.
- The enclosed list should be tabbed over one more than the tab position
- of the compound statement itself.
- The opening left brace should be at the end of the line beginning the
- compound statement and the closing right brace should be alone on a
- line, tabbed under the beginning of the compound statement.
- Note that the left brace beginning a function body is the only occurrence
- of a left brace which is alone on a line.
- .NH 2
- Examples
- .PP
- .DS
- if (expr) {
- statement;
- statement;
- }
-
- if (expr) {
- statement;
- statement;
- } else {
- statement;
- statement;
- }
- .DE
- Note that the right brace before the \fIelse\fR and the right brace
- before the \fIwhile\fR of a \fIdo-while\fR statement (below) are the
- only places where a right braces appears that is not alone on a line.
- .DS
- for (i = 0; i < MAX; i++) {
- statement;
- statement;
- }
-
- while (expr) {
- statement;
- statement;
- }
-
- do {
- statement;
- statement;
- } while (expr);
-
- switch (expr) {
- case ABC:
- case DEF:
- statement;
- break;
- case XYZ:
- statement;
- break;
- default:
- statement;
- break\*f;
- }
- .DE
- .FS
- .IP \*F
- This \fIbreak\fR is, strictly speaking, unnecessary, but it is required
- nonetheless because it prevents a fall-through error if another \fIcase\fR
- is added later after the last one.
- .FE
- Note that when multiple \fIcase\fR labels are used, they are placed
- on separate lines.
- The fall through feature of the C \fIswitch\fR statement should
- rarely if ever be used when code is executed before falling through
- to the next one.
- If this is done it must be commented for future maintenance.
- .DS
- if (strcmp(reply, "yes") == EQUAL) {
- statements for yes
- ...
- } else if (strcmp(reply, "no") == EQUAL) {
- statements for no
- ...
- } else if (strcmp(reply, "maybe") == EQUAL) {
- statements for maybe
- ...
- } else {
- statements for none of the above
- ...
- }
- .DE
- The last example is a generalized \fIswitch\fR statement and the
- tabbing reflects the switch between exactly one of several
- alternatives rather than a nesting of statements.
- .NH
- Expressions
- .NH 2
- Operators
- .PP
- The old versions of equal-ops =+, =\-, =*, etc. should not be used.
- The preferred use is +=, \-=, *=, etc.
- All binary operators except . and \-> should be separated from their
- operands by blanks\*f.
- .FS
- .IP \*F
- Some judgement is called for in the case of complex expressions,
- which may be clearer if the ``inner'' operators are not surrounded
- by spaces and the ``outer'' ones are.
- .FE
- In addition, keywords that are followed by expressions in parentheses
- should be separated from the left parenthesis by a blank\*f.
- .FS
- .IP \*F
- \fISizeof\fR is an exception, see the discussion of function calls.
- Less logically, so is \fIreturn\fR.
- .FE
- Blanks should also appear after commas in argument lists to help
- separate the arguments visually.
- On the other hand, macros with arguments and function calls should
- not have a blank between the name and the left parenthesis.
- In particular, the C preprocessor requires the left parenthesis
- to be immediately after the macro name or else the argument list
- will not be recognized.
- Unary operators should not be separated from their single operand.
- Since C has some unexpected precedence rules,
- all expressions involving mixed operators should be fully parenthesized.
- .PP
- \fIExamples\fR
- .DS
- a += c + d;
- a = (a + b) / (c * d);
- strp\->field = str.fl - ((x & MASK) >> DISP);
- while (*d++ = *s++)
- ; /* EMPTY BODY */
- .DE
- .NH 2
- Naming Conventions
- .PP
- Individual projects will no doubt have their own naming conventions.
- There are some general rules however.
- .IP \(bu
- An initial underscore should not be used for any user-created names\*f.
- .FS
- .IP \*F
- Trailing underscores should be avoided too.
- .FE
- .UX
- uses it for names that the user should not have to know (like
- the standard I/O library)\*f.
- .FS
- .IP \*F
- This convention is reserved for system purposes.
- If you must have your own private identifiers,
- begin them with a capital letter identifying the
- package to which they belong.
- .FE
- .IP \(bu
- Macro names, \fItypedef\fR names, and \fIdefine\fR names should be
- all in CAPS.
- .IP \(bu
- Variable names, structure tag names, and function names should be in
- lower case\*f.
- .FS
- .IP \*F
- It is best to avoid names that differ only in case, like \fIfoo\fR
- and \fIFOO\fR.
- The potential for confusion is considerable.
- .FE
- Some macros (such as \fIgetchar\fR and \fIputchar\fR) are in lower case
- since they may also exist as functions.
- Care is needed when interchanging macros and functions since functions
- pass their parameters by value whereas macros pass their arguments by
- name substitution\*f.
- .FS
- .IP \*F
- This difference also means that carefree use of macros requires care
- when they are defined.
- Remember that complex expressions can be used as parameters,
- and operator-precedence problems can arise unless all occurrences of
- parameters in the definition have parentheses around them.
- There is little that can be done about the problems caused by side
- effects in parameters
- except to avoid side effects in expressions (a good idea anyway).
- .FE
- .NH 2
- Constants
- .PP
- Numerical constants should not be coded directly\*f.
- .FS
- .IP \*F
- At the very least, any directly-coded numerical constant must have a
- comment explaining the derivation of the value.
- .FE
- The \fIdefine\fR feature of the C preprocessor should be used to
- assign a meaningful name.
- This will also make it easier to administer large programs since the
- constant value can be changed uniformly by changing only the \fIdefine\fR.
- The enumeration data type is the preferred way to handle situations where
- a variable takes on only a discrete set of values, since additional type
- checking is available through \fIlint\fR.
- .PP
- There are some cases where the constants 0 and 1 may appear as themselves
- instead of as defines.
- For example if a \fIfor\fR loop indexes through an array, then
- .DS
- for (i = 0; i < ARYBOUND; i++)
- .DE
- is reasonable while the code
- .DS
- fptr = fopen(filename, "r");
- if (fptr == 0)
- error("can't open %s\en", filename);
- .DE
- is not.
- In the last example the defined constant \fINULL\fR is available as
- part of the standard I/O library's header file \fIstdio.h\fR and
- must be used in place of the 0.
- .NH
- Portability
- .PP
- The advantages of portable code are well known.
- This section gives some guidelines for writing portable code,
- where the definition of portable is taken to mean that a source file
- contains portable code if it can be compiled and executed on different
- machines with the only source change being the inclusion of possibly
- different header files.
- The header files will contain defines and typedefs that may vary from
- machine to machine.
- Reference [1] contains useful information on both style and portability.
- Many of the recommendations in this document originated in [1].
- The following is a list of pitfalls to be avoided and recommendations
- to be considered when designing portable code:
- .IP \(bu
- First, one must recognize that some things are inherently non-portable.
- Examples are code to deal with particular hardware registers such as
- the program status word,
- and code that is designed to support a particular piece of hardware
- such as an assembler or I/O driver.
- Even in these cases there are many routines and data organizations
- that can be made machine independent.
- It is suggested that source file be organized so that the machine-independent
- code and the machine-dependent code are in separate files.
- Then if the program is to be moved to a new machine,
- it is a much easier task to determine what needs to be changed\*f.
- .FS
- .IP \*F
- If you \fI#ifdef\fR dependencies,
- make sure that if no machine is specified,
- the result is a syntax error, \fInot\fR a default machine!
- .FE
- It is also possible that code in the machine-independent files
- may have uses in other programs as well.
- .IP \(bu
- Pay attention to word sizes.
- The following sizes apply to basic types in C for the machines
- that will be used most at IH\*f:
- .FS
- .IP \*F
- The 3B is a Bell Labs machine.
- The VAX, not shown in the table, is similar to the 3B in these respects.
- The 68000 resembles either the pdp11 or the 3B, depending on the
- particular compiler.
- .FE
- .br
- .ne 2i
- .TS
- center;
- l c c c
- l r r r.
- type pdp11 3B IBM
- _
- char 8 8 8
- short 16 16 16
- int 16 32 32
- long 32 32 32
- .TE
- In general if the word size is important, \fIshort\fR or \fIlong\fR
- should be used to get 16 or 32 bit items on any of the above machines\*f.
- .FS
- .IP \*F
- Any unsigned type other than plain \fIunsigned int\fR should be
- \fItypedef\fRed, as such types are highly compiler-dependent.
- This is also true of long and short types other than \fIlong int\fR
- and \fIshort int\fR.
- Large programs should have a central header file which supplies
- \fItypedef\fRs for commonly-used width-sensitive types, to make
- it easier to change them and to aid in finding width-sensitive code.
- .FE
- If a simple loop counter is being used where either 16 or 32 bits will
- do, then use \fIint\fR, since it will get the most efficient (natural)
- unit for the current machine\*f.
- .FS
- .IP \*F
- Beware of making assumptions about the size of pointers.
- They are not always the same size as \fIint\fR.
- Nor are all pointers always the same size, or freely interconvertible.
- Pointer-to-character is a particular trouble spot on machines which
- do not address to the byte.
- .FE
- .IP \(bu
- Word size also affects shifts and masks.
- The code
- .DS
- x &= 0177770
- .DE
- will clear only the three rightmost bits of an \fIint\fR on a PDP11.
- On a 3B it will also clear the entire upper halfword.
- Use
- .DS
- x &= ~07
- .DE
- instead which works properly on all machines\*f.
- .FS
- .IP \*F
- The or operator (\ |\ ) does not have these problems,
- nor do bitfields (which, unfortunately, are not very portable due to
- defective compilers).
- .FE
- .IP \(bu
- Code that takes advantage of the two's complement representation of
- numbers on most machines should not be used.
- Optimizations that replace arithmetic operations with equivalent
- shifting operations are particularly suspect.
- You should weigh the time savings with the potential for obscure
- and difficult bugs when your code is moved, say, from a 3B to a 1A.
- .IP \(bu
- Watch out for signed characters.
- On the PDP-11, characters are sign extended when used in expressions,
- which is not the case on any other machine.
- In particular, \fIgetchar\fR is an integer-valued function (or macro)
- since the value of \fIEOF\fR for the standard I/O library is \-1,
- which is not possible for a character on the 3B or IBM\*f.
- .FS
- .IP \*F
- Actually, this is not quite the real reason why \fIgetchar\fR returns
- \fIint\fR, but the comment is valid: code which assumes either
- that characters are signed or that they are unsigned is unportable.
- It is best to completely avoid using \fIchar\fR to hold numbers.
- Manipulation of characters as if they were numbers is also
- often unportable.
- .FE
- .IP \(bu
- The PDP-11 is unique among processors on which C exists in that the
- bytes are numbered from right to left within a word.
- All other machines (3B, IBM, Interdata 8/32, Honeywell) number the
- bytes from left to right\*f.
- .FS
- .IP \*F
- Actually, there are some more right-to-left machines now, but
- the comments still apply.
- .FE
- Hence any code that depends on the left-right orientation of bits
- in a word deserves special scrutiny.
- Bit fields within structure members will only be portable so long as
- two separate fields are never concatenated and treated as a unit\*f.
- .FS
- .IP \*F
- The same applies to variables in general.
- Alignment considerations and loader peculiarities make it very rash
- to assume that two consecutively-declared variables are together
- in memory, or that a variable of one type is aligned appropriately
- to be used as another type.
- .FE
- [1,3]
- .IP \(bu
- Do not default the boolean test for non-zero, i.e.
- .DS
- if (f() != FAIL)
- .DE
- is better than
- .DS
- if (f())
- .DE
- even though \fIFAIL\fR may have the value 0 which is considered to mean
- false by C\*f.
- This will help you out later when somebody decides that a failure return
- should be \-1 instead of 0\ \*f.
- .FS
- .IP \*F
- A particularly notorious case is using \fIstrcmp\fR
- to test for string equality, where the result should \fInever\fR
- \fIever\fR be defaulted.
- The preferred approach is to define a macro \fISTREQ\fR:
- .DS
- #define STREQ(a, b) (strcmp((a), (b)) == 0)
- .DE
- .FE
- .FS
- .IP \*F
- An exception is commonly made for predicates, which are functions
- which meet the following restrictions:
- .IP \(bu
- Has no other purpose than to return true or false.
- .IP \(bu
- Returns 0 for false, 1 for true, nothing else.
- .IP \(bu
- Is named so that the meaning of (say) a `true' return
- is absolutely obvious.
- Call a predicate \fIisvalid\fR or \fIvalid\fR, not \fIcheckvalid\fR.
- .FE
- .IP \(bu
- Be suspicious of numeric values appearing in the code.
- Even simple values like 0 or 1 could be better expressed using
- defines like \fIFALSE\fR and \fITRUE\fR (see previous item)\*f.
- .FS
- .IP \*F
- Actually, \fIYES\fR and \fINO\fR often read better.
- .FE
- Any other constants appearing in a program would be better expressed
- as a defined constant.
- This makes it easier to change and also easier to read.
- .IP \(bu
- Become familiar with existing library functions and defines\*f.
- .FS
- .IP \*F
- But not \fItoo\fR familiar.
- The internal details of library facilities, as opposed to their
- external interfaces, are subject to change without warning.
- They are also often quite unportable.
- .FE
- You should not be writing your own string compare routine, or making
- your own defines for system structures\*f.
- .FS
- .IP \*F
- Or, especially, writing your own code to control terminals.
- Use the \fItermcap\fR package.
- .FE
- Not only does this waste your time, but it prevents your program
- from taking advantage of any microcode assists or other
- means of improving performance of system routines\*f.
- .FS
- .IP \*F
- It also makes your code less readable, because the reader has to
- figure out whether you're doing something special in that reimplemented
- stuff to justify its existence.
- Furthermore, it's a fruitful source of bugs.
- .FE
- .IP \(bu
- Use \fIlint\fR.
- It is a valuable tool for finding machine-dependent constructs as well
- as other inconsistencies or program bugs that pass the compiler\*f.
- .FS
- .IP \*F
- The use of \fIlint\fR on all programs is strongly recommended.
- It is difficult to eliminate complaints about functions whose return
- value is not used (in the current version of C, at least), but most
- other messages from \fIlint\fR really do indicate something wrong.
- The \-h, \-p, \-a, \-x, and \-c options are worth learning.
- All of them will complain about some legitimate things, but they will
- also pick up many botches.
- Note that \-p checks function-call type-consistency for only a subset of
- Unix library routines, so programs should be linted both with and without
- this option for best ``coverage''.
- .FE
- .NH
- Lint
- .PP
- \fILint\fR is a C program checker [2] that examines C source files to
- detect and report type incompatibilities, inconsistencies between
- function definitions and calls,
- potential program bugs, etc.
- It is expected that projects will require programs to use \fIlint\fR
- as part of the official acceptance procedure\*f.
- .FS
- .IP \*F
- Yes.
- .FE
- In addition, work is going on in department 5521 to modify \fIlint\fR
- so that it will check for adherence to the standards in this document.
- .PP
- It is still too early to say exactly which of the standards given here
- will be checked by \fIlint\fR.
- In some cases such as whether a comment is misleading or incorrect there
- is little hope of mechanical checking.
- In other cases such as checking that the opening brace of a function
- body is alone on a line in column 1, the test has already been added\*f.
- .FS
- .IP \*F
- Little of this is relevant at U of T.
- The version of \fIlint\fR that we have lacks these mods.
- .FE
- Future bulletins will be used to announce new additions to \fIlint\fR
- as they occur.
- .PP
- It should be noted that the best way to use \fIlint\fR is not as a barrier
- that must be overcome before official acceptance of a program, but
- rather as a tool to use whenever major changes or additions to the
- code have been made.
- \fILint\fR
- can find obscure bugs and insure portability before problems occur.
- .NH
- Special Considerations
- .PP
- This section contains some miscellaneous do's and don'ts.
- .IP \(bu
- Don't change syntax via macro substitution.
- It makes the program unintelligible to all but the perpetrator.
- .IP \(bu
- There is a time and a place for embedded assignment statements\*f.
- .FS
- .IP \*F
- The \fB++\fR and \fB\-\-\fR operators count as assignment statements.
- So, for many purposes, do functions with side effects.
- .FE
- In some constructs there is no better way to accomplish the results
- without making the code bulkier and less readable.
- The \fIwhile\fR loop in section 8.1 is one example of an appropriate
- place.
- Another is the common code segment:
- .DS
- while ((c = getchar()) != EOF) {
- process the character
- }
- .DE
- Using embedded assignment statements to improve run-time performance
- is also possible.
- However, one should consider the tradeoff between increased speed and
- decreased maintainability that results when embedded assignments are
- used in artificial places.
- For example, the code:
- .DS
- a = b + c;
- d = a + r;
- .DE
- should not be replaced by
- .DS
- d = (a = b + c) + r;
- .DE
- even though the latter may save one cycle.
- Note that in the long run the time difference between the two will
- decrease as the optimizer gains maturity, while the difference in
- ease of maintenance will increase as the human memory of what's
- going on in the latter piece of code begins to fade\*f.
- .FS
- .IP \*F
- Note also that side effects within expressions can result in code
- whose semantics are compiler-dependent, since C's order of evaluation
- is explicitly undefined in most places.
- Compilers do differ.
- .FE
- .IP \(bu
- There is also a time and place for the ternary \fI?\ :\fR operator
- and the binary comma operator.
- The logical expression operand before the \fI?\ :\fR should be
- parenthesized:
- .DS
- (x >= 0) ? x : \-x
- .DE
- Nested \fI?\ :\fR operators can be confusing and should be avoided
- if possible.
- There are some macros like \fIgetchar\fR where they can be useful.
- The comma operator can also be useful in \fIfor\fR statements to
- provide multiple initializations or incrementations.
- .IP \(bu
- Goto statements should be used sparingly as in any well-structured
- code\*f.
- .FS
- .IP \*F
- The \fIcontinue\fR statement is almost as bad.
- \fIBreak\fR is less troublesome.
- .FE
- The main place where they can be usefully employed is to break out
- of several levels of \fIswitch\fR, \fIfor\fR, and \fIwhile\fR
- nesting\*f, e.g.
- .FS
- .IP \*F
- The need to do such a thing may indicate
- that the inner constructs should be broken out into
- a separate function, with a success/failure return code.
- .FE
- .DS L
- for (...)
- for (...) {
- ...
- if (disaster)
- goto error;
- }
- \&...
- error:
- clean up the mess
- .DE
- When a \fIgoto\fR is necessary the accompanying label should be alone
- on a line and tabbed one tab position to the left of the associated
- code that follows.
- .IP \(bu
- This committee recommends that programmers not rely on automatic
- beautifiers for the following reasons.
- First, the main person who benefits from good program style is the
- programmer himself.
- This is especially true in the early design of handwritten algorithms
- or pseudo-code.
- Automatic beautifiers can only be applied to complete, syntactically
- correct programs and hence are not available when the need for
- attention to white space and indentation is greatest.
- It is also felt that programmers can do a better job of making clear
- the complete visual layout of a function or file, with the normal
- attention to detail of a careful programmer\*f.
- .FS
- .IP \*F
- In other words, some of the visual layout is dictated by intent
- rather than syntax.
- Beautifiers cannot read minds.
- .FE
- Sloppy programmers should learn to be careful programmers instead of
- relying on a beautifier to make their code readable.
- Finally, it is felt that since beautifiers are non-trivial programs
- that must parse the source,
- the burden of maintaining them in the face of the continuing evolution
- of C is not worth the benefits gained by such a program.
- .NH
- Project Dependent Standards
- .PP
- Individual projects may wish to establish additional standards beyond
- those given here.
- The following issues are some of those that should be adddressed by
- each project program administration group.
- .IP \(bu
- What additional naming conventions should be followed?
- In particular, systematic prefix conventions for functional grouping
- of global data and also for structure or union member names can be useful.
- .IP \(bu
- What kind of include file organization is appropriate for the
- project's particular data hierarchy?
- .IP \(bu
- What procedures should be established for reviewing \fIlint\fR
- complaints?
- A tolerance level needs to be established in concert with the \fIlint\fR
- options to prevent unimportant complaints from hiding complaints about
- real bugs or inconsistencies.
- .IP \(bu
- If a project establishes its own archive libraries, it should plan on
- supplying a lint library file [2] to the system administrators.
- This will allow \fIlint\fR to check for compatible use of library
- functions.
- .NH
- Conclusion
- .PP
- A set of standards has been presented for C programming style.
- One of the most important points is the proper use of white space
- and comments so that the structure of the program is evident from
- the layout of the code.
- Another good idea to keep in mind when writing code is that it is
- likely that you or someone else will be asked to modify it or make
- it run on a different machine sometime in the future.
- .PP
- As with any standard, it must be followed if it is to be useful.
- The Indian Hill version of \fIlint\fR will enforce those standards
- that are amenable to automatic checking.
- If you have trouble following any of these standards don't just ignore them.
- Programmers at Indian Hill should bring their problems to the
- Software Development System Group (Lee Kirchhoff, contact) in department
- 5522.
- Programmers outside Indian Hill should contact the Processor
- Application Group (Layne Cannon, contact) in department 5512\ \*f.
- .FS
- .IP \*F
- At U of T Zoology, it's Henry Spencer in 336B.
- .FE
- .bp
- .ce 1
- \fBReferences\fR
- .sp 2
- .IP [1]
- B.A. Tague, "C Language Portability", Sept 22, 1977.
- This document issued by department 8234 contains three memos by
- R.C. Haight, A.L. Glasser, and T.L. Lyon dealing with style and
- portability.
- .IP [2]
- S.C. Johnson, "Lint, a C Program Checker", Technical Memorandum,
- 77-1273-14, September 16, 1977.
- .IP [3]
- R.W. Mitze, "The 3B/PDP-11 Swabbing Problem", Memorandum for File,
- 1273-770907.01MF,
- September 14, 1977.
- .IP [4]
- R.A. Elliott and D.C. Pfeffer, "3B Processor Common Diagnostic
- Standards- Version 1",
- Memorandum for File, 5514-780330.01MF, March 30, 1978.
- .IP [5]
- R.W. Mitze,
- "An Overview of C Compilation of UNIX User Processes on the 3B",
- Memorandum for File, 5521-780329.02MF, March 29, 1978.
- .IP [6]
- B.W. Kernighan and D.M. Ritchie,
- \fIThe C Programming Language\fR,
- Prentice-Hall 1978.
- .bp
- .PP
- .DS L
- .ta 8 16 24 32 40 48 56 64 72 80
- /*
- * \fBThe C Style Summary Sheet\fR Block comment,
- * by Henry Spencer, U of T Zoology describes file.
- */
-
- #include <errno.h> Headers; don't nest.
-
- typedef int SEQNO; /* ... */ Global definitions.
- #define STREQ(a, b) (strcmp((a), (b)) == 0)
-
- static char *foo = NULL; /* ... */ Global declarations.
- struct bar { Static whenever poss.
- SEQNO alpha; /* ... */
- # define NOSEQNO 0
- int beta; /* ... */ Don't assume 16 bits.
- };
-
- /*
- * Many unnecessary braces, to show where. Functions.
- */
- static int /* what is returned */ Don't default int.
- bletch(a)
- int a; /* ... */ Don't default int.
- {
- int bar; /* ... */
- extern int errno; /* ..., changed here */
- extern char *index();
-
- if (foobar() != FAIL) { if (!isvalid()) {
- return(OK); errno = ERANGE;
- } } else {
- x = &y + z\->field;
- while (x == (y & MASK)) { }
- f += (x >= 0) ? x : \-x;
- } for (i = 0; i < BOUND; i++) {
- /* lint \-h[p]cax. */
- do { }
- /* Avoid nesting ?: */
- } while (index(a, b) != NULL); if (STREQ(x, "foo")) {
- x |= 07; /* 07 is... */
- switch (...) { } else if (STREQ(x, "bar")) {
- case ABC: x &= ~077; /* 077 is... */
- case DEF: } else if (STREQ(x, "ugh")) {
- printf("...", a, b); /* Avoid gotos */
- break; } else {
- case XYZ: /* and continues. */
- x = y; }
- /* FALLTHROUGH */
- default: while ((c = getc()) != EOF)
- /* Limit imbedded =s. */ ; /* NULLBODY */
- break;
- }
- }
- .DE
- ---------------
-
-
-