minnie.tuhs.org

home *** CD-ROM | disk | FTP | other *** search

/ minnie.tuhs.org / unixen.tar / unixen / PDP-11 / Trees / V6 / usr / doc / c / c1 < prev next >

Wrap

Text File | 1975-06-26 | 10.0 KB | 359 lines

.ta .5i 1i 1.5i 2i 2.5i 3i 3.5i 4i .ul 1. Introduction .et C is a computer language based on the earlier language B [1]. The languages and their compilers differ in two major ways: C introduces the notion of types, and defines appropriate extra syntax and semantics; also, C on the \s8PDP\s10-11 is a true compiler, producing machine code where B produced interpretive code. .pg Most of the software for the \s8UNIX\s10 time-sharing system [2] is written in C, as is the operating system itself. C is also available on the \s8HIS\s10 6070 computer at Murray Hill and and on the \s8IBM\s10 System/370 at Holmdel [3]. This paper is a manual only for the C language itself as implemented on the \s8PDP\s10-11. However, hints are given occasionally in the text of implementation-dependent features. .pg The \s8UNIX\s10 Programmer's Manual [4] describes the library routines available to C programs under \s8UNIX\s10, and also the procedures for compiling programs under that system. ``The \s8GCOS\s10 C Library'' by Lesk and Barres [5] describes routines available under that system as well as compilation procedures. Many of these routines, particularly the ones having to do with I/O, are also provided under \s8UNIX\s10. Finally, ``Programming in C\(mi A Tutorial,'' by B. W. Kernighan [6], is as useful as promised by its title and the author's previous introductions to allegedly impenetrable subjects. .ul 2. Lexical conventions .et There are six kinds of tokens: identifiers, keywords, constants, strings, expression operators, and other separators. In general blanks, tabs, newlines, and comments as described below are ignored except as they serve to separate tokens. At least one of these characters is required to separate otherwise adjacent identifiers, constants, and certain operator-pairs. .pg If the input stream has been parsed into tokens up to a given character, the next token is taken to include the longest string of characters which could possibly constitute a token. .ms 2.1 Comments .et The characters ^^\fG/\**\fR^^ introduce a comment, which terminates with the characters^^ \fG\**/\fR. .ms 2.2 Identifiers (Names) .et An identifier is a sequence of letters and digits; the first character must be alphabetic. The underscore ``\(ru'' counts as alphabetic. Upper and lower case letters are considered different. No more than the first eight characters are significant, and only the first seven for external identifiers. .ms 2.3 Keywords .et The following identifiers are reserved for use as keywords, and may not be used otherwise: .sp .7 .in .5i .ta 2i .nf .ne 10 .ft G int break char continue float if double else struct for auto do extern while register switch static case goto default return entry sizeof .sp .7 .ft R .fi .in 0 The .bd entry keyword is not currently implemented by any compiler but is reserved for future use. .ms 2.3 Constants .et There are several kinds of constants, as follows: .ms 2.3.1 Integer constants .et An integer constant is a sequence of digits. An integer is taken to be octal if it begins with \fG0\fR, decimal otherwise. The digits \fG8\fR and \fG9\fR have octal value 10 and 11 respectively. .ms 2.3.2 Character constants .et A character constant is 1 or 2 characters enclosed in single quotes ``\fG^\(aa^\fR''. Within a character constant a single quote must be preceded by a back-slash ``\\''. Certain non-graphic characters, and ``\\'' itself, may be escaped according to the following table: .sp .7 .ta .5i 1.25i .nf \s8BS\s10 \\b \s8NL\s10 \\n \s8CR\s10 \\r \s8HT\s10 \\t \fIddd\fR \\\fIddd\fR \\ \\\\ .fi .sp .7 The escape ``\\\fIddd\fR'' consists of the backslash followed by 1, 2, or 3 octal digits which are taken to specify the value of the desired character. A special case of this construction is ``\\0'' (not followed by a digit) which indicates a null character. .pg Character constants behave exactly like integers (not, in particular, like objects of character type). In conformity with the addressing structure of the \s8PDP\s10-11, a character constant of length 1 has the code for the given character in the low-order byte and 0 in the high-order byte; a character constant of length 2 has the code for the first character in the low byte and that for the second character in the high-order byte. Character constants with more than one character are inherently machine-dependent and should be avoided. .ms 2.3.3 Floating constants .et A floating constant consists of an integer part, a decimal point, a fraction part, an \fGe\fR, and an optionally signed integer exponent. The integer and fraction parts both consist of a sequence of digits. Either the integer part or the fraction part (not both) may be missing; either the decimal point or the \fGe\fR and the exponent (not both) may be missing. Every floating constant is taken to be double-precision. .ms 2.4 Strings .et A string is a sequence of characters surrounded by double quotes ``^\fG"\fR^''. A string has the type array-of-characters (see below) and refers to an area of storage initialized with the given characters. The compiler places a null byte (^\\0^) at the end of each string so that programs which scan the string can find its end. In a string, the character ``^\fG"\fR^'' must be preceded by a ``\\''^; in addition, the same escapes as described for character constants may be used. .ul 3. Syntax notation .et In the syntax notation used in this manual, syntactic categories are indicated by \fIitalic\fR type, and literal words and characters in \fGgothic.\fR Alternatives are listed on separate lines. An optional terminal or non-terminal symbol is indicated by the subscript ``opt,'' so that .dp { expression\*(op } .ed would indicate an optional expression in braces. .ul 4. What's in a Name? .et C bases the interpretation of an identifier upon two attributes of the identifier: its .ft I storage class .ft R and its .ft I type. .ft R The storage class determines the location and lifetime of the storage associated with an identifier; the type determines the meaning of the values found in the identifier's storage. .pg There are four declarable storage classes: automatic, static, external, and register. Automatic variables are local to each invocation of a function, and are discarded on return; static variables are local to a function, but retain their values independently of invocations of the function; external variables are independent of any function. Register variables are stored in the fast registers of the machine; like automatic variables they are local to each function and disappear on return. .pg C supports four fundamental types of objects: characters, integers, single-, and double-precision floating-point numbers. .sp .7 .in .5i Characters (declared, and hereinafter called, \fGchar\fR) are chosen from the \s8ASCII\s10 set; they occupy the right-most seven bits of an 8-bit byte. It is also possible to interpret \fGchar\fRs as signed, 2's complement 8-bit numbers. .sp .4 Integers (\fGint\fR) are represented in 16-bit 2's complement notation. .sp .4 Single precision floating point (\fGfloat\fR) quantities have magnitude in the range approximately 10\u\s7\(+-38\s10\d or 0; their precision is 24 bits or about seven decimal digits. .sp .4 Double-precision floating-point (\fGdouble\fR) quantities have the same range as \fGfloat\fRs and a precision of 56 bits or about 17 decimal digits. .sp .7 .in 0 .pg Besides the four fundamental types there is a conceptually infinite class of derived types constructed from the fundamental types in the following ways: .sp .7 .in .5i .ft I arrays .ft R of objects of most types; .sp .4 .ft I functions .ft R which return objects of a given type; .sp .4 .ft I pointers .ft R to objects of a given type; .sp .4 .ft I structures .ft R containing objects of various types. .sp .7 .in 0 In general these methods of constructing objects can be applied recursively. .ul 5. Objects and lvalues .et An object is a manipulatable region of storage; an lvalue is an expression referring to an object. An obvious example of an lvalue expression is an identifier. There are operators which yield lvalues: for example, if E is an expression of pointer type, then \**E is an lvalue expression referring to the object to which E points. The name ``lvalue'' comes from the assignment expression ``E1@=@E2'' in which the left operand E1 must be an lvalue expression. The discussion of each operator below indicates whether it expects lvalue operands and whether it yields an lvalue. .ul 6. Conversions .et A number of operators may, depending on their operands, cause conversion of the value of an operand from one type to another. This section explains the result to be expected from such conversions. .ms 6.1 Characters and integers .et A \fGchar\fR object may be used anywhere an \fGint\fR may be. In all cases the \fGchar\fR is converted to an \fGint\fR by propagating its sign through the upper 8 bits of the resultant integer. This is consistent with the two's complement representation used for both characters and integers. (However, the sign-propagation feature disappears in other implementations.) .ms 6.2 Float and double .et All floating arithmetic in C is carried out in double-precision; whenever a \fGfloat\fR appears in an expression it is lengthened to \fGdouble\fR by zero-padding its fraction. When a \fGdouble\fR must be converted to \fGfloat\fR, for example by an assignment, the \fGdouble\fR is rounded before truncation to \fGfloat\fR length. .ms 6.3 Float and double; integer and character .et All \fGint\fRs and \fGchar\fRs may be converted without loss of significance to \fGfloat\fR or \fGdouble\fR. Conversion of \fGfloat\fR or \fGdouble\fR to \fGint\fR or \fGchar\fR takes place with truncation towards 0. Erroneous results can be expected if the magnitude of the result exceeds 32,767 (for \fGint\fR) or 127 (for \fGchar\fR). .ms 6.4 Pointers and integers .et Integers and pointers may be added and compared; in such a case the \fGint\fR is converted as specified in the discussion of the addition operator. .pg Two pointers to objects of the same type may be subtracted; in this case the result is converted to an integer as specified in the discussion of the subtraction operator.