- Xref: sparky comp.lang.c:11825 comp.std.c:2402
- Newsgroups: comp.lang.c,comp.std.c
- Path: sparky!uunet!usc!sdd.hp.com!think.com!snorkelwacker.mit.edu!bloom-picayune.mit.edu!news
- From: scs@adam.mit.edu (Steve Summit)
- Subject: Re: Pointers to freed memory
- Message-ID: <1992Jul31.193143.26743@athena.mit.edu>
- Summary: yes, Virginia, they could be invalid
- Sender: news@athena.mit.edu (News system)
- Nntp-Posting-Host: adam.mit.edu
- Organization: none, at the moment
- References: <1992Jul30.164035.7349@taumet.com> <1992Jul31.035905.20683@oracle.us.oracle.com>
- Date: Fri, 31 Jul 1992 19:31:43 GMT
- Lines: 200
-
- In article <1992Jul31.035905.20683@oracle.us.oracle.com>, wkaufman@us.oracle.com (William Kaufman) writes:
- > Sorry to post asking for random machine specifics, but this one
- > intrigued me. If what Steve says is true, there's a whole lot of busted
- > code out there,...some of it with my name on it,...
-
- I was going to stay out of this, because Steve Clamage had
- already posted the correct answer, but in this case it may help
- to have several people saying the same thing.
-
- This question, even more than the related one about using funny
- pointers to simulate non-zero-based arrays, unfortunately demands
- a certain amount of faith. (I say "unfortunately" because ours
- ought to be an extremely rational, deterministic, unambiguous
- "science," disdainful of emotional belief systems.)
-
- The code
-
-     char *p, *anotherp;
-     p = malloc(10);
-     free(p);
-     anotherp = p;    /* p's value is indeterminate here */
-
- really, truly could fail on the last line, due to the
- manipulation (via assignment) of an invalid pointer value.
- This has been hashed out a number of times on comp.lang.c and
- comp.std.c; the conclusion has always been that the Standard
- does, strictly speaking, permit the above code to fail. (The
- question is almost worthy of the FAQ list, except that it is not
- quite frequent enough, and is way too obscure.) The clearest
- statement of this (admittedly surprising) result is in the
- Rationale, section 3.2.2.3, pp. 37-8:
-
-     Implicit in the Standard is the notion of invalid
-     pointers. In discussing pointers, the Standard typically
-     refers to "a pointer to an object" or "a pointer to a
-     function" or "a null pointer." A special case in address
-     arithmetic allows for a pointer to just past the end of
-     the array. Any other pointer is invalid.
-
-     An invalid pointer might be created in several ways. An
-     arbitrary value can be assigned (via a cast) to a pointer
-     variable. (This could even create a valid pointer,
-     depending on the value.) A pointer to an object becomes
-     invalid if the memory containing the object is
-     deallocated. Pointer arithmetic can produce pointers
-     outside the range of an array.
-
-     Regardless of how an invalid pointer is created, any use
-     of it yields undefined behavior. Even assignment,
-     comparison with a null pointer constant, or comparison
-     with itself, might on some systems result in an exception.
-
- Now, as I had occasion to remark yesterday, "[The] Rationale is
- not part of American National Standard X3.159-1989, but is
- included for information only." I confess that I can find
- precious little explicit language in the Standard proper to
- support the claim under consideration; indeed, the Rationale
- characterizes the notion of invalid pointers as "implicit."
- (In fact, other parts of the Standard might lead us to believe
- that invalid pointers cause problems only when dereferenced; for
- instance, section 3.3.6, p. 48, lines 24-27 state, with respect
- to additive pointer arithmetic, that
-
-     Unless both the pointer operand and the result point to
-     elements of the same array object, or the pointer operand
-     points one past the last element of an array object and
-     the result points to an element of the same array object,
-     the behavior is undefined if the result is used as an
-     operand of the unary * operator.
-
- , suggesting that we'd be okay as long as we didn't dereference.)
-
- If anyone can point to any explicit language elsewhere in the
- Standard which pertains to the subject at hand, I'm sure we'd all
- be glad to hear it. (Sections 3.1.2.5, 3.2.2.3, and 3.3.16.1 are
- all curiously silent on the allowable values of the pointers
- being discussed.) I do note (as did Steve Clamage) that section
- 4.10.3 states (with respect to malloc/free) that "The value of a
- pointer that refers to freed space is indeterminate," which says
- something.
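-
- One defensive habit that follows from that sentence can be
- sketched as below. This is only an illustration under stated
- assumptions: release() is a made-up name, not anything in the
- Standard library. It frees the block and overwrites the caller's
- variable with a null pointer in one step, so that nothing can
- later read an indeterminate value out of that variable.

```c
#include <stdlib.h>

/*
 * release() is a hypothetical helper, not a standard function.
 * It frees the block that *pp refers to and then stores a null
 * pointer in *pp.  It is written for char * only; a generic
 * version taking void ** would not be portable.
 */
void release(char **pp)
{
    free(*pp);      /* the value in *pp is indeterminate from here on... */
    *pp = NULL;     /* ...so overwrite it without ever reading it back */
}
```

- Usage would be `release(&p);' in place of `free(p);'. Note that
- this protects only the one variable passed in; any other copies
- of the pointer are just as indeterminate as before.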
-
- Now, let me hasten to add that it is generally agreed that very
- few implementations are likely to make use of the license the
- Standard gives them to mistreat programmers who mistreat invalid
- pointers. As Bill points out, "there's a whole lot of busted
- code out there." Academic considerations aside, compiler writers
- typically let programmers get away with various unquestionably
- bogus and illegal practices, so they're certainly likely to let
- them get away with something this obscure.
-
- Continuing with Bill's article,
- > In article <1992Jul30.164035.7349@taumet.com> steve@taumet.com (Steve Clamage) writes:
- >] The C Standard states that the value of the pointer after a call to free
- >] is 'indeterminate'. This means that just reading the value might
- >] produce a runtime fault.
- >
- > Well, OK, the pointer will be useless, but will it really be
- > dangerous? What he's got is essentially:
- >
- > struct something *p = (struct something *)rand();
- >
- > Could this assignment really cause a fault? Even if p is never
- > dereferenced? I know this is an improper question, but what machines is
- > this true for?
-
- The situation is certainly discomforting, especially for those of
- us who are conditioned to think about pointers as "addresses," on
- a flat-address machine, to boot. If, however, we confine our
- model of pointers to be "magic" objects which, well, point to
- things, we will be thinking more along the lines which the
- Standard does, and we will be more appreciative of the machine
- architectures which do not use flat addressing and which are more
- likely to make use of these peculiar freedoms with respect to
- pointer manipulation which the Standard gives them. (For this
- reason, I have lately been striving to avoid resorting to words
- like "address" when explaining the concept of C pointers.)
-
- Let me illustrate with a slightly different example. Suppose we
- have
-
-     union {
-         char *p;
-         int ia[sizeof(char *) / sizeof(int)];
-     } u;
-     char *anotherp;
-     int i;
-
-     for(i = 0; i < sizeof(char *) / sizeof(int); i++)
-         u.ia[i] = rand();
-
-     anotherp = u.p;    /* loads the random bits as a pointer value */
-
- This code uses a union to fill in the bits of a pointer with
- random values. (The array and loop allow for the possibility
- that sizeof(char *) is greater than sizeof(int), although it
- wouldn't work if pointers were somehow smaller than ints.)
- This code is quite similar to Bill's
-
-     struct something *p = (struct something *)rand();
-
- , and it could fail, for the same reason. We are all fairly
- surprised that it could fail (or we were, the first time we heard
- about it).
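-
- By way of contrast, here is a sketch of moving the bits of a
- pointer object around without ever loading them as a pointer
- value. It assumes (hedging here) that access to an object's bytes
- through a character type is always permitted, which is how
- memcpy() is specified to work. copy_ptr_bytes() is a hypothetical
- name chosen for this illustration.

```c
#include <string.h>

/*
 * copy_ptr_bytes() is a hypothetical helper.  It copies the bytes
 * of the pointer object *src into the pointer object *dst using
 * memcpy(), which treats them as plain characters rather than as
 * a pointer value.  Afterwards *dst holds the same bits as *src:
 * exactly as valid, or as invalid, as the original.
 */
void copy_ptr_bytes(char **dst, char *const *src)
{
    memcpy(dst, src, sizeof *dst);
}
```

- The copy is, of course, no more usable than the original; the
- sketch shows only that the copying itself need not go through a
- pointer-valued load.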
-
- However, let's change it just a little bit:
-
-     union {
-         double d;
-         int ia[sizeof(double) / sizeof(int)];
-     } u;
-     double anotherd;
-     int i;
-
-     for(i = 0; i < sizeof(double) / sizeof(int); i++)
-         u.ia[i] = rand();
-
-     anotherd = u.d;    /* loads the random bits as a double */
-
- Here we stuff random values into the bits of a double. I've
- worked on machines where simple assignment of garbage floating-
- point values resulted in a floating point exception (you may
- have, too -- it's a likely occurrence if the machine sets
- condition codes after moves as well as after tests and arithmetic
- operations), so it wouldn't surprise me at all if the above code
- were to fail. Floating-point values are special -- the bits have
- meaning, and some bit patterns are meaningless, so the machine is
- allowed to hiccup, burp, and/or vomit when a meaningless pattern
- is encountered. The situation is no different for pointers.
- (It's no coincidence that floating-point values and pointers are
- exactly the types which calloc() isn't guaranteed to initialize to 0.)
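-
- The calloc() remark suggests a portable alternative, sketched
- below under the assumption that an explicit loop is acceptable:
- allocate with malloc() and assign each element a true null pointer
- or 0.0, rather than relying on all-bits-zero. The names
- alloc_ptrs() and alloc_doubles() are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: n char pointers, each a genuine null pointer. */
char **alloc_ptrs(size_t n)
{
    char **a = malloc(n * sizeof *a);
    size_t i;

    if (a == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        a[i] = NULL;    /* a real null pointer, whatever its bit pattern */
    return a;
}

/* Hypothetical helper: n doubles, each a genuine floating-point zero. */
double *alloc_doubles(size_t n)
{
    double *a = malloc(n * sizeof *a);
    size_t i;

    if (a == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        a[i] = 0.0;     /* a real 0.0, whatever its bit pattern */
    return a;
}
```

- On most machines these do nothing that calloc() would not have
- done; the difference matters only where a null pointer or a
- floating-point zero is not represented as all-bits-zero.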
-
- Finally, getting back to the question of which specific machines
- might actually disallow the code fragments being discussed, the
- Intel 80[234]86 family is often suggested as a "modern" segmented
- architecture which treats pointers specially (i.e. not as
- arbitrary bit values) and which could, in its protected modes,
- conceivably generate faults for things like invalid segment or
- selector values (i.e. when the invalid values are loaded at all,
- not just when they're dereferenced). The fact that teeming
- millions of programmers are using these processors and writing
- code which accidentally or deliberately manipulates invalid
- pointers without getting any addressing faults can be used to
- argue that the concerns are only in the minds of ivory-tower
- academicians and crotchety old Multics users. In actuality (so
- I've been told; I've never used one) the 80386 definitely could
- generate faults for these cases if it wanted to. However, as is
- usually the case with hardware fault detection and protection
- issues, the processor's fault detection mechanisms operate at the
- operating system's request and with its assistance. Because of
- the widespread prevalence of code which makes questionable or
- invalid assumptions, OS'es for Intel chips tend to disable
- several of the protection features.
-
- I have crossposted this to comp.std.c in case readers there have
- additional insight into language in the Standard which supports
- the conclusions that freed pointers are invalid and that invalid
- pointers may not be manipulated, but *not* to throw open any
- question about the validity of these conclusions themselves.
- (If you're encountering this issue for the first time, and you're
- tempted to post asking "Golly, `free(p); q = p;' is illegal? I
- can't believe it", here's your answer: believe it.)
-
- Steve Summit
- scs@adam.mit.edu
-