- Path: sparky!uunet!mcsun!uknet!acorn!ixi!clive
- From: clive@x.co.uk (Clive Feather)
- Newsgroups: comp.std.c
- Subject: Re: struct hack, and other out-of-array references
- Message-ID: <1992Sep15.201137.805@x.co.uk>
- Date: 15 Sep 92 20:11:37 GMT
- References: <1992Sep07.104932.20060@x.co.uk> <1992Sep8.124655.1498@Urmel.Informatik.RWTH-Aachen.DE> <1992Sep10.014137.16209@sq.sq.com>
- Organization: IXI Limited, Cambridge, UK
- Lines: 140
-
-
- [<> = Norman Diamond; o' = me (Clive Feather) previously;
- ^^ = Stephen R. van den Berg; ~~ = Mark Brader;
- || = Interpretation Ruling; ## = the standard; no indent = me now.]
-
- [Sorry for so much quoting, but I can't summarise the previous
- discussion any more than I have.]
-
- The question:
- =============
- In the code:
-
- o' struct fred { int i; char s [1]; } *f;
- o' char *ss;
- o'
- o' /* ... */
- o' f = malloc (sizeof *f + strlen (ss));
- o' if (f != NULL)
- o'     strcpy (f->s, ss);
-
- If strlen (ss) is greater than 0, is the access to f->s [1] allowed in a
- strictly conforming program, or does it invoke undefined behaviour
- (access outside the bounds of an array) ?
-
- The story so far:
- =================
- <> You can't go past the end of an array object. But if malloc() or some
- <> other variable has defined the end of the actual array object, then the
- <> + operator can get you that far, regardless of the declared type that
- <> some other array variable had before getting flattened to a pointer.
-
- o' But there is an interpretation that says that, given
- o' int a [5][5];
- o' the access "a [1][6]" is illegal, because it goes past the bounds of the
- o' array "a [1]". In other words, the declared type of the array does
- o' restrict what can happen to a pointer derived from it.
-
- [RFI 17 item 16]
-
- || For an array of arrays, the permitted pointer arithmetic in Standard
- || ##3.3.6 Semantics (page 48, lines 12-40) is to be understood by
- || interpreting the use of the word "object" as denoting the specific
- || object determined directly by the pointer's type and value, *not* other
- || objects related to that one by contiguity. For example, the following
- || code has undefined behaviour:
- || int a [4][5];
- || a [1][7] = 0; /* undefined */
- || Some conforming implementations may choose to diagnose an "array bounds
- || violation", while others may choose to interpret such attempted accesses
- || successfully with the "obvious" extended semantics.
-
- [Standard section 6.3.6 (ANSI 3.3.6)]
-
- ## If both the pointer operand and the result point to elements of the
- ## same array object, or one past the last element of the same array
- ## object, the evaluation shall not produce an overflow; otherwise, the
- ## behavior is undefined.
-
- In "a [1][7]", the pointer operand is "a [1]".
-
- ~~ Now, the important thing is that nowhere in the standard is there any
- ~~ license to treat the object a as an array of 20 ints. It is an array
- ~~ of 4 arrays of 5 ints each. If you write a[1][7], you are calling for
- ~~ the computation of a[1]+7. a[1] here decays to a pointer to int, which
- ~~ points to the first element of the array a[1]. a[1]+7 does not point
- ~~ to that array.
-
- o' This RFI applies to slices of arrays, but, in *my* opinion, it is
- o' extendable to this case:
- [The code at the top of this article]
-
- o' The pointer f->s points to an object with type "char [1]", and so, if
- o' strlen (ss) > 0, the access to (f->s)[1] required by strcpy is undefined
- o' according to this RFI, even though the array has already decayed to a
- o' pointer.
-
- ~~ Clive's opinion here is wrong. The difference
- ~~ between this and the other case is that the standard provides (in ANSI
- ~~ section 4.10.3, ISO 7.10.3) specific dispensation for the value returned by
- ~~ malloc() to:
-
- ## ... be assigned to a pointer to any type of object and then used to
- ## access such an object or an array of such objects in the space
- ## allocated ...
-
- ~~ The pointer f->s does point to an object with type char[1], but it is
- ~~ also a pointer into the space returned by malloc(), which may be treated
- ~~ as an array of chars. That is, when strcpy computes the equivalent of
- ~~ (f->s)[1], and therefore (f->s)+1, both the pointer operand of +, i.e.
- ~~ f->s, and the computed result *are* within the same array, namely the one
- ~~ returned by malloc(), and all is well.
-
- Now read on:
- ============
- I don't think Mark is correct here (obviously, or I wouldn't be
- writing :-). The intent of the section about malloc quoted is clearly to
- indicate that the space returned can be used for any purpose, and is not
- restricted to (say) char arrays only. If the pointer returned from
- malloc is cast to a pointer to a character type, then of course any part
- of the space can be accessed. However, once the pointer has been cast to
- a "struct fred *", then it can only be used to access the contents of a
- "struct fred".
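-
- A minimal sketch of the character-pointer route described above, which
- neither side of the argument disputes: the string is copied through a
- "char *" derived from the malloc result, at the offset of the member s.
- The helper names fred_new and fred_text are hypothetical, chosen only for
- illustration, and the sketch does not settle whether the same access
- through f->s is strictly conforming.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct fred { int i; char s [1]; };

/* Hypothetical helper: allocate a struct fred with room for a copy of ss,
   writing the string through a char pointer into the malloc'd space.
   Returns NULL on allocation failure; the caller frees the result. */
struct fred *fred_new (const char *ss)
{
    size_t off, need;
    char *raw;

    assert (ss != NULL);
    off = offsetof (struct fred, s);
    need = off + strlen (ss) + 1;          /* +1 for the terminating NUL */
    if (need < sizeof (struct fred))
        need = sizeof (struct fred);       /* never less than one full struct */

    raw = malloc (need);
    if (raw == NULL)
        return NULL;

    ((struct fred *) raw)->i = 0;
    strcpy (raw + off, ss);                /* char access to the allocated space */
    return (struct fred *) raw;
}

/* Hypothetical helper: read the string back the same way. */
const char *fred_text (const struct fred *f)
{
    return (const char *) f + offsetof (struct fred, s);
}
```

- The allocation size is computed from offsetof rather than from sizeof, so
- trailing padding in struct fred is not counted twice.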
-
- I have to say that the Standard seems very confused when talking about
- pointers to objects within objects. For example, the above ruling
- implies that a single pointer cannot walk through the array "a", because
- there is an "iron curtain" between a[0] and a[1], and so on. On the
- other hand, it is easy to show that a[0][4] and a[1][0] must be adjacent
- in memory, which implies that the pointer can walk through, in code
- like:
-
-     for (i = 0, p = &(a [0][0]); i < sizeof a / sizeof a [0][0]; i++)
-         printf ("%d\n", *p++);
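-
- The adjacency claim can be checked without dereferencing anything out of
- bounds. The sketch below assumes the 4-by-5 int array from the ruling; the
- function name rows_are_adjacent is hypothetical. It only forms and
- compares "one past the end" pointers, so it says nothing about whether
- walking a single pointer across the rows is permitted.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if every row of a 4x5 int array is immediately followed
   in memory by the next row, 0 otherwise. */
int rows_are_adjacent (void)
{
    int a [4][5];
    size_t r;

    /* Arrays have no padding between elements, so the whole object is
       exactly 20 ints long. */
    assert (sizeof a == 4 * 5 * sizeof (int));

    for (r = 0; r + 1 < 4; r++) {
        /* &a[r][4] + 1 points one past the last element of row r;
           it may be formed and compared, but not dereferenced. */
        if (&a [r][4] + 1 != &a [r + 1][0])
            return 0;
    }
    return 1;
}
```

- Because sizeof a leaves no room for gaps, the comparison should hold on
- any conforming implementation.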
-
- I think that the only way we are going to decide this question is by
- putting together a set of RFIs which cover all the relevant issues
- and submitting them.
-
- ~~ I would also argue that the Interpretation Ruling cited above would not
- ~~ apply if the type of the example array "a" had been char instead of int.
- ~~ The definitions of "object" and "byte" (in ANSI section 1.6, ISO 3.14 and
- ~~ 3.4) in effect require that it be possible to treat any object as an
- ~~ array of any character type.
-
- But even then "a" is an array of "char [5]", not an array of "char", so
- that doesn't apply. However, to avoid doubt, change my code to:
-
- struct fred { double d; int arr [1]; } *f;
- int i, n;
-
- /* ... */
- f = malloc (sizeof *f + (n - 1) * sizeof (int));
- if (f != NULL)
-     for (i = 0; i < n; i++)
-         f->arr [i] = 0;
- --
- Clive D.W. Feather | IXI Limited | If you lie to the compiler,
- clive@x.co.uk | 62-74 Burleigh St. | it will get its revenge.
- Phone: +44 223 462 131 | Cambridge CB1 1OJ | - Henry Spencer
- Fax: +44 223 462 132 | United Kingdom |
-