ftp.pasteur.org/FAQ/


Text File | 2000-03-01 | 44.4 KB | 1,233 lines

Path: senator-bedfellow.mit.edu!bloom-beacon.mit.edu!news-out.cwix.com!newsfeed.cwix.com!newsfeed2.skycache.com!newsfeed.skycache.com!news.maxwell.syr.edu!newsfeed.novia.net.MISMATCH!novia!nntp3.cerf.net!nntp2.cerf.net!news.cerf.net!not-for-mail From: mpcline@nic.cerf.net (Marshall Cline) Newsgroups: comp.lang.c++,comp.answers,news.answers,alt.comp.lang.learn.c-c++ Subject: C++ FAQ (part 6 of 10) Followup-To: comp.lang.c++ Date: 29 Feb 2000 20:06:59 GMT Organization: ATT Cerfnet Lines: 1212 Approved: news-answers-request@mit.edu Distribution: world Expires: +1 month Message-ID: <89h8t3$bu0$1@news.cerf.net> Reply-To: cline@parashift.com (Marshall Cline) NNTP-Posting-Host: nic1.san.cerf.net X-Trace: news.cerf.net 951854819 12224 192.215.81.88 (29 Feb 2000 20:06:59 GMT) X-Complaints-To: abuse@cerf.net NNTP-Posting-Date: 29 Feb 2000 20:06:59 GMT Summary: Please read this before posting to comp.lang.c++ Xref: senator-bedfellow.mit.edu comp.lang.c++:453813 comp.answers:39853 news.answers:178212 alt.comp.lang.learn.c-c++:40791 Archive-name: C++-faq/part6 Posting-Frequency: monthly Last-modified: Feb 29, 2000 URL: http://marshall-cline.home.att.net/cpp-faq-lite/ AUTHOR: Marshall Cline / cline@parashift.com / 972-931-9470 COPYRIGHT: This posting is part of "C++ FAQ Lite." The entire "C++ FAQ Lite" document is Copyright(C)1991-2000 Marshall Cline, Ph.D., cline@parashift.com. All rights reserved. Copying is permitted only under designated situations. For details, see section [1]. NO WARRANTY: THIS WORK IS PROVIDED ON AN "AS IS" BASIS. THE AUTHOR PROVIDES NO WARRANTY WHATSOEVER, EITHER EXPRESS OR IMPLIED, REGARDING THE WORK, INCLUDING WARRANTIES WITH RESPECT TO ITS MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. C++-FAQ-Lite != C++-FAQ-Book: This document, C++ FAQ Lite, is not the same as the C++ FAQ Book. The book (C++ FAQs, Cline and Lomow, Addison-Wesley) is 500% larger than this document, and is available in bookstores. For details, see section [3]. 
============================================================================== SECTION [16]: Freestore management [16.1] Does delete p delete the pointer p, or the pointed-to-data *p? The pointed-to-data. The keyword should really be delete_the_thing_pointed_to_by. The same abuse of English occurs when freeing the memory pointed to by a pointer in C: free(p) really means free_the_stuff_pointed_to_by(p). ============================================================================== [16.2] Can I free() pointers allocated with new? Can I delete pointers allocated with malloc()? No! It is perfectly legal, moral, and wholesome to use malloc() and delete in the same program, or to use new and free() in the same program. But it is illegal, immoral, and despicable to call free() with a pointer allocated via new, or to call delete on a pointer allocated via malloc(). Beware! I occasionally get e-mail from people telling me that it works OK for them on machine X and compiler Y. That does not make it right! Sometimes people say, "But I'm just working with an array of char." Nonetheless do not mix malloc() and delete on the same pointer, or new and free() on the same pointer! If you allocated via p = new char[n], you must use delete[] p; you must not use free(p). Or if you allocated via p = malloc(n), you must use free(p); you must not use delete[] p or delete p! Mixing these up could cause a catastrophic failure at runtime if the code was ported to a new machine, a new compiler, or even a new version of the same compiler. You have been warned. ============================================================================== [16.3] Why should I use new instead of trustworthy old malloc()? Constructors/destructors, type safety, overridability. * Constructors/destructors: unlike malloc(sizeof(Fred)), new Fred() calls Fred's constructor. Similarly, delete p calls *p's destructor. * Type safety: malloc() returns a void* which isn't type safe. 
new Fred() returns a pointer of the right type (a Fred*). * Overridability: new is an operator that can be overridden by a class, while malloc() is not overridable on a per-class basis. ============================================================================== [16.4] Can I use realloc() on pointers allocated via new? No! When realloc() has to copy the allocation, it uses a bitwise copy operation, which will tear many C++ objects to shreds. C++ objects should be allowed to copy themselves. They use their own copy constructor or assignment operator. Besides all that, the heap that new uses may not be the same as the heap that malloc() and realloc() use! ============================================================================== [16.5] Do I need to check for NULL after p = new Fred()? No! (But if you have an old compiler, you may have to force the compiler to have this behavior[16.6]). It turns out to be a real pain to always write explicit NULL tests after every new allocation. Code like the following is very tedious: Fred* p = new Fred(); if (p == NULL) throw bad_alloc(); If your compiler doesn't support (or if you refuse to use) exceptions[17], your code might be even more tedious: Fred* p = new Fred(); if (p == NULL) { cerr << "Couldn't allocate memory for a Fred" << endl; abort(); } Take heart. In C++, if the runtime system cannot allocate sizeof(Fred) bytes of memory during p = new Fred(), a bad_alloc exception will be thrown. Unlike malloc(), new never returns NULL! Therefore you should simply write: Fred* p = new Fred(); // No need to check if p is NULL However, if your compiler is old, it may not yet support this. Find out by checking your compiler's documentation under "new". If you have an old compiler, you may have to force the compiler to have this behavior[16.6]. ============================================================================== [16.6] How can I convince my (older) compiler to automatically check new to see if it returns NULL? 
Eventually your compiler will.

If you have an old compiler that doesn't automagically perform the NULL
test[16.5], you can force the runtime system to do the test by installing a
"new handler" function.  Your "new handler" function can do anything you want,
such as print a message and abort() the program, delete some objects and
return (in which case operator new will retry the allocation), throw an
exception, etc.

Here's a sample "new handler" that prints a message and calls abort().  The
handler is installed using set_new_handler():

    #include <new.h>        // To get set_new_handler
    #include <stdlib.h>     // To get abort()
    #include <iostream.h>   // To get cerr

    void myNewHandler()
    {
      // This is your own handler.  It can do anything you want.
      cerr << "Attempt to allocate memory failed!" << endl;
      abort();
    }

    int main()
    {
      set_new_handler(myNewHandler);   // Install your "new handler"
      // ...
    }

After the set_new_handler() line is executed, operator new will call your
myNewHandler() if/when it runs out of memory.  This means that new will never
return NULL:

    Fred* p = new Fred();   // No need to check if p is NULL

Note: Please use this abort() approach as a last resort.  If your compiler
supports exception handling[17], please consider throwing an exception instead
of calling abort().

Note: If some global/static object's constructor uses new, it won't use the
myNewHandler() function since that constructor will get called before main()
begins.  Unfortunately there's no convenient way to guarantee that the
set_new_handler() will be called before the first use of new.  For example,
even if you put the set_new_handler() call in the constructor of a global
object, you still don't know if the module ("compilation unit") that contains
that global object will be elaborated first or last or somewhere in between.
Therefore you still don't have any guarantee that your call of
set_new_handler() will happen before any other global's constructor gets
invoked.
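The answer above mentions that a "new handler" may also delete some objects
and return (so that operator new retries the allocation), or throw an
exception.  Here is a minimal sketch of that variant, assuming a compiler that
provides the standard header <new> and std::set_new_handler.  The names
emergencyReserve, reserveHandler(), installReserveHandler() and
reserveIsAvailable() are hypothetical and are not part of the FAQ:

```cpp
#include <cassert>
#include <cstddef>
#include <new>

namespace {
    // Hypothetical block of memory that can be given back when memory runs low
    char* emergencyReserve = 0;
}

// A "new handler" that first releases the reserve and returns (operator new
// then retries the allocation).  Once the reserve is gone it throws
// std::bad_alloc, so a failing allocation cannot loop forever.
void reserveHandler()
{
    if (emergencyReserve != 0) {
        delete[] emergencyReserve;
        emergencyReserve = 0;
        return;
    }
    throw std::bad_alloc();
}

// Allocates the reserve and installs the handler.
// Returns the previously installed handler so the caller can restore it.
std::new_handler installReserveHandler(std::size_t reserveBytes)
{
    emergencyReserve = new char[reserveBytes];
    return std::set_new_handler(reserveHandler);
}

// Reports whether the reserve has not yet been released.
bool reserveIsAvailable()
{
    return emergencyReserve != 0;
}
```

The size of the reserve is a design choice: it only needs to be large enough
for the program to report the problem or shut down cleanly.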
============================================================================== [16.7] Do I need to check for NULL before delete p? No! The C++ language guarantees that delete p will do nothing if p is equal to NULL. Since you might get the test backwards, and since most testing methodologies force you to explicitly test every branch point, you should not put in the redundant if test. Wrong: if (p != NULL) delete p; Right: delete p; ============================================================================== [16.8] What are the two steps that happen when I say delete p? delete p is a two-step process: it calls the destructor, then releases the memory. The code generated for delete p looks something like this (assuming p is of type Fred*): // Original code: delete p; if (p != NULL) { p->~Fred(); operator delete(p); } The statement p->~Fred() calls the destructor for the Fred object pointed to by p. The statement operator delete(p) calls the memory deallocation primitive, void operator delete(void* p). This primitive is similar in spirit to free(void* p). (Note, however, that these two are not interchangeable; e.g., there is no guarantee that the two memory deallocation primitives even use the same heap!). ============================================================================== [16.9] In p = new Fred(), does the Fred memory "leak" if the Fred constructor throws an exception? No. If an exception occurs during the Fred constructor of p = new Fred(), the C++ language guarantees that the memory sizeof(Fred) bytes that were allocated will automagically be released back to the heap. Here are the details: new Fred() is a two-step process: 1. sizeof(Fred) bytes of memory are allocated using the primitive void* operator new(size_t nbytes). This primitive is similar in spirit to malloc(size_t nbytes). (Note, however, that these two are not interchangeable; e.g., there is no guarantee that the two memory allocation primitives even use the same heap!). 2. 
It constructs an object in that memory by calling the Fred constructor.  The
pointer returned from the first step is passed as the this parameter to the
constructor.  This step is wrapped in a try ... catch block to handle the case
when an exception is thrown during this step.

Thus the actual generated code looks something like:

    // Original code: Fred* p = new Fred();
    Fred* p = (Fred*) operator new(sizeof(Fred));
    try {
      new(p) Fred();       // Placement new[11.10]
    } catch (...) {
      operator delete(p);  // Deallocate the memory
      throw;               // Re-throw the exception
    }

The statement marked "Placement new[11.10]" calls the Fred constructor.  The
pointer p becomes the this pointer inside the constructor, Fred::Fred().

==============================================================================

[16.10] How do I allocate / unallocate an array of things?

Use p = new T[n] and delete[] p:

    Fred* p = new Fred[100];
    // ...
    delete[] p;

Any time you allocate an array of objects via new (usually with the [n] in the
new expression), you must use [] in the delete statement.  This syntax is
necessary because there is no syntactic difference between a pointer to a
thing and a pointer to an array of things (something we inherited from C).

==============================================================================

[16.11] What if I forget the [] when deleting an array allocated via new T[n]?

All life comes to a catastrophic end.

It is the programmer's --not the compiler's-- responsibility to get the
connection between new T[n] and delete[] p correct.  If you get it wrong,
neither a compile-time nor a run-time error message will be generated by the
compiler.  Heap corruption is a likely result.  Or worse.  Your program will
probably die.

==============================================================================

[16.12] Can I drop the [] when deleting an array of some built-in type (char,
        int, etc)?

No!
Sometimes programmers think that the [] in the delete[] p only exists so the
compiler will call the appropriate destructors for all elements in the array.
Because of this reasoning, they assume that an array of some built-in type
such as char or int can be deleted without the [].  E.g., they assume the
following is valid code:

    void userCode(int n)
    {
      char* p = new char[n];
      // ...
      delete p;     // <-- ERROR!  Should be delete[] p !
    }

But the above code is wrong, and it can cause a disaster at runtime.  In
particular, the code that's called for delete p is operator delete(void*), but
the code that's called for delete[] p is operator delete[](void*).  The
default behavior for the latter is to call the former, but users are allowed
to replace the latter with a different behavior (in which case they would
normally also replace the corresponding new code in operator new[](size_t)).
If they replaced the delete[] code so it wasn't compatible with the delete
code, and you called the wrong one (i.e., if you said delete p rather than
delete[] p), you could end up with a disaster at runtime.

==============================================================================

[16.13] After p = new Fred[n], how does the compiler know there are n objects
        to be destructed during delete[] p?

Short answer: Magic.

Long answer: The run-time system stores the number of objects, n, somewhere
where it can be retrieved if you only know the pointer, p.  There are two
popular techniques that do this.  Both these techniques are in use by
commercial grade compilers, both have tradeoffs, and neither is perfect.
These techniques are:
 * Over-allocate the array and put n just to the left of the first Fred
   object[33.6].
 * Use an associative array with p as the key and n as the value[33.7].

==============================================================================

[16.14] Is it legal (and moral) for a member function to say delete this?
As long as you're careful, it's OK for an object to commit suicide (delete this). Here's how I define "careful": 1. You must be absolutely 100% positive sure that this object was allocated via new (not by new[], nor by placement new[11.10], nor a local object on the stack, nor a global, nor a member of another object; but by plain ordinary new). 2. You must be absolutely 100% positive sure that your member function will be the last member function invoked on this object. 3. You must be absolutely 100% positive sure that the rest of your member function (after the delete this line) doesn't touch any piece of this object (including calling any other member functions or touching any data members). 4. You must be absolutely 100% positive sure that no one even touches the this pointer itself after the delete this line. In other words, you must not examine it, compare it with another pointer, compare it with NULL, print it, cast it, do anything with it. Naturally the usual caveats apply in cases where your this pointer is a pointer to a base class when you don't have a virtual destructor[20.4]. ============================================================================== [16.15] How do I allocate multidimensional arrays using new? There are many ways to do this, depending on how flexible you want the array sizing to be. 
On one extreme, if you know all the dimensions at compile-time, you can allocate multidimensional arrays statically (as in C): class Fred { /*...*/ }; void someFunction(Fred& fred); void manipulateArray() { const unsigned nrows = 10; // Num rows is a compile-time constant const unsigned ncols = 20; // Num columns is a compile-time constant Fred matrix[nrows][ncols]; for (unsigned i = 0; i < nrows; ++i) { for (unsigned j = 0; j < ncols; ++j) { // Here's the way you access the (i,j) element: someFunction( matrix[i][j] ); // You can safely "return" without any special delete code: if (today == "Tuesday" && moon.isFull()) return; // Quit early on Tuesdays when the moon is full } } // No explicit delete code at the end of the function either } More commonly, the size of the matrix isn't known until run-time but you know that it will be rectangular. In this case you need to use the heap ("freestore"), but at least you are able to allocate all the elements in one freestore chunk. void manipulateArray(unsigned nrows, unsigned ncols) { Fred* matrix = new Fred[nrows * ncols]; // Since we used a simple pointer above, we need to be VERY // careful to avoid skipping over the delete code. // That's why we catch all exceptions: try { // Here's how to access the (i,j) element: for (unsigned i = 0; i < nrows; ++i) { for (unsigned j = 0; j < ncols; ++j) { someFunction( matrix[i*ncols + j] ); } } // If you want to quit early on Tuesdays when the moon is full, // make sure to do the delete along ALL return paths: if (today == "Tuesday" && moon.isFull()) { delete[] matrix; return; } // ... } catch (...) { // Make sure to do the delete when an exception is thrown: delete[] matrix; throw; // Re-throw the current exception } // Make sure to do the delete at the end of the function too: delete[] matrix; } Finally at the other extreme, you may not even be guaranteed that the matrix is rectangular. 
For example, if each row could have a different length, you'll need to
allocate each row individually.  In the following function, ncols[i] is the
number of columns in row number i, where i varies between 0 and nrows-1
inclusive.

    void manipulateArray(unsigned nrows, unsigned ncols[])
    {
      Fred** matrix = new Fred*[nrows];
      for (unsigned i = 0; i < nrows; ++i)
        matrix[i] = new Fred[ ncols[i] ];

      // Since we used a simple pointer above, we need to be VERY
      // careful to avoid skipping over the delete code.
      // That's why we catch all exceptions:
      try {

        // Here's how to access the (i,j) element:
        for (unsigned i = 0; i < nrows; ++i) {
          for (unsigned j = 0; j < ncols[i]; ++j) {
            someFunction( matrix[i][j] );
          }
        }

        // If you want to quit early on Tuesdays when the moon is full,
        // make sure to do the delete along ALL return paths:
        if (today == "Tuesday" && moon.isFull()) {
          for (unsigned i = nrows; i > 0; --i)
            delete[] matrix[i-1];
          delete[] matrix;
          return;
        }

        // ...

      }
      catch (...) {
        // Make sure to do the delete when an exception is thrown:
        for (unsigned i = nrows; i > 0; --i)
          delete[] matrix[i-1];
        delete[] matrix;
        throw;    // Re-throw the current exception
      }

      // Make sure to do the delete at the end of the function too.
      // Note that deletion is the opposite order of allocation.
      // (i is declared here: the earlier i's went out of scope with their loops)
      for (unsigned i = nrows; i > 0; --i)
        delete[] matrix[i-1];
      delete[] matrix;
    }

Note the funny use of matrix[i-1] in the deletion process.  This prevents
wrap-around of the unsigned value when i goes one step below zero.

Finally, note that pointers and arrays are evil[21.5].  It is normally much
better to encapsulate your pointers in a class that has a safe and simple
interface.  The following FAQ[16.16] shows how to do this.

==============================================================================

[16.16] But the previous FAQ's code is SOOOO tricky and error prone! Isn't
        there a simpler way?

Yep.
The reason the code in the previous FAQ[16.15] was so tricky and error prone
was that it used pointers, and we know that pointers and arrays are
evil[21.5].  The solution is to encapsulate your pointers in a class that has
a safe and simple interface.  For example, we can define a Matrix class that
handles a rectangular matrix so our user code will be vastly simplified when
compared to the rectangular matrix code from the previous FAQ[16.15]:

    // The code for class Matrix is shown below...
    void someFunction(Fred& fred);

    void manipulateArray(unsigned nrows, unsigned ncols)
    {
      Matrix matrix(nrows, ncols);   // Construct a Matrix called matrix

      for (unsigned i = 0; i < nrows; ++i) {
        for (unsigned j = 0; j < ncols; ++j) {
          // Here's the way you access the (i,j) element:
          someFunction( matrix(i,j) );

          // You can safely "return" without any special delete code:
          if (today == "Tuesday" && moon.isFull())
            return;     // Quit early on Tuesdays when the moon is full
        }
      }

      // No explicit delete code at the end of the function either
    }

The main thing to notice is the lack of clean-up code.  For example, there
aren't any delete statements in the above code, yet there will be no memory
leaks, assuming only that the Matrix destructor does its job correctly.
Here's the Matrix code that makes the above possible:

    class Matrix {
    public:
      Matrix(unsigned nrows, unsigned ncols);
      // Throws a BadSize object if either size is zero
      class BadSize { };

      // Based on the Law Of The Big Three[25.9]:
     ~Matrix();
      Matrix(const Matrix& m);
      Matrix& operator= (const Matrix& m);

      // Access methods to get the (i,j) element:
      Fred&       operator() (unsigned i, unsigned j);
      const Fred& operator() (unsigned i, unsigned j) const;
      // These throw a BoundsViolation object if i or j is too big
      class BoundsViolation { };

    private:
      Fred* data_;
      unsigned nrows_, ncols_;
    };

    inline Fred& Matrix::operator() (unsigned row, unsigned col)
    {
      if (row >= nrows_ || col >= ncols_) throw BoundsViolation();
      return data_[row*ncols_ + col];
    }

    inline const Fred& Matrix::operator() (unsigned row, unsigned col) const
    {
      if (row >= nrows_ || col >= ncols_) throw BoundsViolation();
      return data_[row*ncols_ + col];
    }

    Matrix::Matrix(unsigned nrows, unsigned ncols)
      : data_  (0),
        nrows_ (nrows),
        ncols_ (ncols)
    {
      // Check the sizes BEFORE allocating: when a constructor throws, the
      // destructor is not run, so an array allocated earlier would leak
      if (nrows == 0 || ncols == 0)
        throw BadSize();
      data_ = new Fred[nrows * ncols];
    }

    Matrix::~Matrix()
    {
      delete[] data_;
    }

Note that the above Matrix class accomplishes two things: it moves some tricky
memory management code from the user code (e.g., main()) to the class, and it
reduces the overall bulk of the program (e.g., assuming Matrix is even mildly
reusable, moving complexity from the users of Matrix into Matrix itself is
equivalent to moving complexity from the many to the few).  And anyone who's
seen Star Trek 2 knows that the good of the many outweighs the good of the
few... or the one.

==============================================================================

[16.17] But the above Matrix class is specific to Fred! Isn't there a way to
        make it generic?
Yep; just use templates[31]: template<class T> // See section on templates[31] for more class Matrix { public: Matrix(unsigned nrows, unsigned ncols); // Throws a BadSize object if either size is zero class BadSize { }; // Based on the Law Of The Big Three[25.9]: ~Matrix(); Matrix(const Matrix<T>& m); Matrix<T>& operator= (const Matrix<T>& m); // Access methods to get the (i,j) element: T& operator() (unsigned i, unsigned j); const T& operator() (unsigned i, unsigned j) const; // These throw a BoundsViolation object if i or j is too big class BoundsViolation { }; private: T* data_; unsigned nrows_, ncols_; }; template<class T> inline T& Matrix<T>::operator() (unsigned row, unsigned col) { if (row >= nrows_ || col >= ncols_) throw BoundsViolation(); return data_[row*ncols_ + col]; } template<class T> inline const T& Matrix<T>::operator() (unsigned row, unsigned col) const { if (row >= nrows_ || col >= ncols_) throw BoundsViolation(); return data_[row*ncols_ + col]; } template<class T> inline Matrix<T>::Matrix(unsigned nrows, unsigned ncols) : data_ (new T[nrows * ncols]), nrows_ (nrows), ncols_ (ncols) { if (nrows == 0 || ncols == 0) throw BadSize(); } template<class T> inline Matrix<T>::~Matrix() { delete[] data_; } Here's one way to use this template[31]: #include "Fred.hpp" // To get the definition for class Fred void doSomethingWith(Fred& fred); void sample(unsigned nrows, unsigned ncols) { Matrix<Fred> matrix(nrows, ncols); // Construct a Matrix<Fred> called matrix for (unsigned i = 0; i < nrows; ++i) { for (unsigned j = 0; j < ncols; ++j) { doSomethingWith( matrix(i,j) ); } } } ============================================================================== [16.18] Does C++ have arrays whose length can be specified at run-time? Yes, in the sense that STL[32.1] has a vector template that provides this behavior. No, in the sense that built-in array types need to have their length specified at compile time. 
Yes, in the sense that even built-in array types can specify the first index bounds at run-time. E.g., comparing with the previous FAQ, if you only need the first array dimension to vary then you can just ask new for an array of arrays, rather than an array of pointers to arrays: const unsigned ncols = 100; // ncols = number of columns in the array class Fred { /*...*/ }; void manipulateArray(unsigned nrows) // nrows = number of rows in the array { Fred (*matrix)[ncols] = new Fred[nrows][ncols]; // ... delete[] matrix; } You can't do this if you need anything other than the first dimension of the array to change at run-time. But please, don't use arrays unless you have to. Arrays are evil[21.5]. Use some object of some class if you can. Use arrays only when you have to. ============================================================================== [16.19] How can I force objects of my class to always be created via new rather than as locals or global/static objects? Use the Named Constructor Idiom[10.8]. As usual with the Named Constructor Idiom, the constructors are all private: or protected:, and there are one or more public static create() methods (the so-called "named constructors"), one per constructor. In this case the create() methods allocate the objects via new. Since the constructors themselves are not public, there is no other way to create objects of the class. class Fred { public: // The create() methods are the "named constructors": static Fred* create() { return new Fred(); } static Fred* create(int i) { return new Fred(i); } static Fred* create(const Fred& fred) { return new Fred(fred); } // ... private: // The constructors themselves are private or protected: Fred(); Fred(int i); Fred(const Fred& fred); // ... }; Now the only way to create Fred objects is via Fred::create(): int main() { Fred* p = Fred::create(5); // ... delete p; } Make sure your constructors are in the protected: section if you expect Fred to have derived classes. 
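As a rough illustration of the advice about protected: constructors, here is
a minimal sketch of a base class and a derived class that both force creation
via new.  DerivedFred, value() and the int data member are hypothetical and
exist only to make the example observable.  Note that the copy constructors
are declared protected: as well; otherwise the implicitly generated public
copy constructor would let users create local copies:

```cpp
#include <cassert>

class Fred {
public:
    // The "named constructor": the only public way to make a Fred
    static Fred* create(int i) { return new Fred(i); }
    virtual ~Fred() { }
    virtual int value() const { return i_; }
protected:
    // protected: rather than private: so derived classes can call these
    Fred(int i) : i_(i) { }
    Fred(const Fred& f) : i_(f.i_) { }
private:
    int i_;
};

class DerivedFred : public Fred {   // Hypothetical derived class
public:
    static DerivedFred* create(int i) { return new DerivedFred(i); }
    virtual int value() const { return 2 * Fred::value(); }
protected:
    DerivedFred(int i) : Fred(i) { }   // Legal: Fred(int) is protected
    DerivedFred(const DerivedFred& d) : Fred(d) { }
};
```

With these declarations, "Fred x(5);" and "DerivedFred y(5);" at local or
global scope are rejected by the compiler, while the create() calls work.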
Note also that you can make another class Wilma a friend[14] of Fred if you want to allow a Wilma to have a member object of class Fred, but of course this is a softening of the original goal, namely to force Fred objects to be allocated via new. ============================================================================== [16.20] How do I do simple reference counting? If all you want is the ability to pass around a bunch of pointers to the same object, with the feature that the object will automagically get deleted when the last pointer to it disappears, you can use something like the following "smart pointer" class: // Fred.h class FredPtr; class Fred { public: Fred() : count_(0) /*...*/ { } // All ctors set count_ to 0 ! // ... private: friend FredPtr; // A friend class[14] unsigned count_; // count_ must be initialized to 0 by all constructors // count_ is the number of FredPtr objects that point at this }; class FredPtr { public: Fred* operator-> () { return p_; } Fred& operator* () { return *p_; } FredPtr(Fred* p) : p_(p) { ++p_->count_; } // p must not be NULL ~FredPtr() { if (--p_->count_ == 0) delete p_; } FredPtr(const FredPtr& p) : p_(p.p_) { ++p_->count_; } FredPtr& operator= (const FredPtr& p) { // DO NOT CHANGE THE ORDER OF THESE STATEMENTS! // (This order properly handles self-assignment[12.1]) ++p.p_->count_; if (--p_->count_ == 0) delete p_; p_ = p.p_; return *this; } private: Fred* p_; // p_ is never NULL }; Naturally you can use nested classes to rename FredPtr to Fred::Ptr. Note that you can soften the "never NULL" rule above with a little more checking in the constructor, copy constructor, assignment operator, and destructor. If you do that, you might as well put a p_ != NULL check into the "*" and "->" operators (at least as an assert()). I would recommend against an operator Fred*() method, since that would let people accidentally get at the Fred*. 
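To see the counting at work, here is a small self-contained sketch that
repeats a trimmed-down Fred and FredPtr and adds a usage function.  The
static member liveObjects and the function demoSharedOwnership() are
hypothetical instrumentation added for this illustration only; they are not
part of the scheme itself:

```cpp
#include <cassert>

class FredPtr;

class Fred {
public:
    Fred() : count_(0) { ++liveObjects; }
    ~Fred() { --liveObjects; }
    static int liveObjects;   // Hypothetical: number of Fred objects alive
private:
    friend class FredPtr;
    unsigned count_;          // Number of FredPtr objects that point at this
};

int Fred::liveObjects = 0;

class FredPtr {
public:
    FredPtr(Fred* p) : p_(p) { ++p_->count_; }   // p must not be NULL
    ~FredPtr() { if (--p_->count_ == 0) delete p_; }
    FredPtr(const FredPtr& p) : p_(p.p_) { ++p_->count_; }
    FredPtr& operator= (const FredPtr& p)
    {
        // Increment first so that self-assignment is handled properly
        ++p.p_->count_;
        if (--p_->count_ == 0) delete p_;
        p_ = p.p_;
        return *this;
    }
    Fred* operator-> () { return p_; }
    Fred& operator* ()  { return *p_; }
private:
    Fred* p_;                 // p_ is never NULL
};

// Returns the number of live Fred objects just before the last FredPtr
// in this function goes out of scope.
int demoSharedOwnership()
{
    FredPtr a(new Fred());
    {
        FredPtr b = a;            // a and b share the first Fred
        FredPtr c(new Fred());    // c owns a second Fred
        c = b;                    // The second Fred is deleted here
        FredPtr& alias = c;
        c = alias;                // Self-assignment must not delete anything
    }                             // b and c vanish; a still holds the first Fred
    return Fred::liveObjects;
}
```

No delete appears in demoSharedOwnership(); each Fred is deleted when the last
FredPtr pointing at it is destroyed or reassigned.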
One of the implicit constraints on FredPtr is that it must only point to Fred
objects which have been allocated via new.  If you want to be really safe, you
can enforce this constraint by making all of Fred's constructors private, and
for each constructor have a public (static) create() method which allocates
the Fred object via new and returns a FredPtr (not a Fred*).  That way the
only way anyone could create a Fred object would be to get a FredPtr
("Fred* p = new Fred()" would be replaced by "FredPtr p = Fred::create()").
Thus no one could accidentally subvert the reference counted mechanism.

For example, if Fred had a Fred::Fred() and a Fred::Fred(int i, int j), the
changes to class Fred would be as follows (the create() methods are defined
after class FredPtr because they return a FredPtr by value, which requires
the complete FredPtr type):

    class Fred {
    public:
      static FredPtr create();              // Defined below class FredPtr
      static FredPtr create(int i, int j);  // Defined below class FredPtr
      // ...
    private:
      Fred();
      Fred(int i, int j);
      // ...
    };

    class FredPtr { /* ... as shown above ... */ };

    inline FredPtr Fred::create()             { return new Fred(); }
    inline FredPtr Fred::create(int i, int j) { return new Fred(i,j); }

The end result is that you now have a way to use simple reference counting to
provide "pointer semantics" for a given object.  Users of your Fred class
explicitly use FredPtr objects, which act more or less like Fred* pointers.
The benefit is that users can make as many copies of their FredPtr "smart
pointer" objects as they like, and the pointed-to Fred object will
automagically get deleted when the last such FredPtr object vanishes.

If you'd rather give your users "reference semantics" rather than "pointer
semantics," you can use reference counting to provide "copy on write"[16.21].

==============================================================================

[16.21] How do I provide reference counting with copy-on-write semantics?

The previous FAQ[16.20] presented a simple reference counting scheme that
provided users with pointer semantics.  This FAQ describes an approach that
provides users with reference semantics.
The basic idea is to allow users to think they're copying your Fred objects, but in reality the underlying implementation doesn't actually do any copying unless and until some user actually tries to modify the underlying Fred object. Class Fred::Data houses all the data that would normally go into the Fred class. Fred::Data also has an extra data member, count_, to manage the reference counting. Class Fred ends up being a "smart reference" that (internally) points to a Fred::Data. class Fred { public: Fred(); // A default constructor[10.4] Fred(int i, int j); // A normal constructor Fred(const Fred& f); Fred& operator= (const Fred& f); ~Fred(); void sampleInspectorMethod() const; // No changes to this object void sampleMutatorMethod(); // Change this object // ... private: class Data { public: Data(); Data(int i, int j); Data(const Data& d); // Since only Fred can access a Fred::Data object, // you can make Fred::Data's data public if you want. // But if that makes you uncomfortable, make the data private // and make Fred a friend class[14] via friend Fred; // ... unsigned count_; // count_ is the number of Fred objects that point at this // count_ must be initialized to 1 by all constructors // (it starts as 1 since it is pointed to by the Fred object that created it) }; Data* data_; }; Fred::Data::Data() : count_(1) /*init other data*/ { } Fred::Data::Data(int i, int j) : count_(1) /*init other data*/ { } Fred::Data::Data(const Data& d) : count_(1) /*init other data*/ { } Fred::Fred() : data_(new Data()) { } Fred::Fred(int i, int j) : data_(new Data(i, j)) { } Fred::Fred(const Fred& f) : data_(f.data_) { ++ data_->count_; } Fred& Fred::operator= (const Fred& f) { // DO NOT CHANGE THE ORDER OF THESE STATEMENTS! 
// (This order properly handles self-assignment[12.1]) ++ f.data_->count_; if (--data_->count_ == 0) delete data_; data_ = f.data_; return *this; } Fred::~Fred() { if (--data_->count_ == 0) delete data_; } void Fred::sampleInspectorMethod() const { // This method promises ("const") not to change anything in *data_ // Other than that, any data access would simply use "data_->..." } void Fred::sampleMutatorMethod() { // This method might need to change things in *data_ // Thus it first checks if this is the only pointer to *data_ if (data_->count_ > 1) { Data* d = new Data(*data_); // Invoke Fred::Data's copy ctor -- data_->count_; data_ = d; } assert(data_->count_ == 1); // Now the method proceeds to access "data_->..." as normal } If it is fairly common to call Fred's default constructor[10.4], you can avoid all those new calls by sharing a common Fred::Data object for all Freds that are constructed via Fred::Fred(). To avoid static initialization order problems, this shared Fred::Data object is created "on first use" inside a function. Here are the changes that would be made to the above code (note that the shared Fred::Data object's destructor is never invoked; if that is a problem, either hope you don't have any static initialization order problems, or drop back to the approach described above): class Fred { public: // ... private: // ... static Data* defaultData(); }; Fred::Fred() : data_(defaultData()) { ++ data_->count_; } Fred::Data* Fred::defaultData() { static Data* p = NULL; if (p == NULL) { p = new Data(); ++ p->count_; // Make sure it never goes to zero } return p; } Note: You can also provide reference counting for a hierarchy of classes[16.22] if your Fred class would normally have been a base class. ============================================================================== [16.22] How do I provide reference counting with copy-on-write semantics for a hierarchy of classes? 
The previous FAQ[16.21] presented a reference counting scheme that provided
users with reference semantics, but did so for a single class rather than for
a hierarchy of classes. This FAQ extends the previous technique to allow for a
hierarchy of classes. The basic difference is that Fred::Data is now the root
of a hierarchy of classes, which will probably cause it to have some
virtual[20] functions. Note that class Fred itself will still not have any
virtual functions.

The Virtual Constructor Idiom[20.5] is used to make copies of the Fred::Data
objects. To select which derived class to create, the sample code below uses
the Named Constructor Idiom[10.8], but other techniques are possible (a switch
statement in the constructor, etc). The sample code assumes two derived
classes: Der1 and Der2. Methods in the derived classes are unaware of the
reference counting.

    class Fred {
    public:

      static Fred create1(String s, int i);
      static Fred create2(float x, float y);

      Fred(const Fred& f);
      Fred& operator= (const Fred& f);
     ~Fred();

      void sampleInspectorMethod() const;   // No changes to this object
      void sampleMutatorMethod();           // Change this object

      // ...

    private:

      class Data {
      public:
        Data() : count_(1) { }
        Data(const Data& d) : count_(1) { }              // Do NOT copy the 'count_' member!
        Data& operator= (const Data&) { return *this; }  // Do NOT copy the 'count_' member!
        virtual ~Data() { assert(count_ == 0); }         // A virtual destructor[20.4]
        virtual Data* clone() const = 0;                 // A virtual constructor[20.5]
        virtual void sampleInspectorMethod() const = 0;  // A pure virtual function[22.4]
        virtual void sampleMutatorMethod() = 0;
      private:
        unsigned count_;     // count_ doesn't need to be protected
        friend class Fred;   // Allow Fred to access count_
      };

      class Der1 : public Data {
      public:
        Der1(String s, int i);
        virtual void sampleInspectorMethod() const;
        virtual void sampleMutatorMethod();
        virtual Data* clone() const;
        // ...
      };

      class Der2 : public Data {
      public:
        Der2(float x, float y);
        virtual void sampleInspectorMethod() const;
        virtual void sampleMutatorMethod();
        virtual Data* clone() const;
        // ...
      };

      Fred(Data* data);
      // Creates a Fred smart-reference that owns *data
      // It is private to force users to use a createXXX() method
      // Requirement: data must not be NULL

      Data* data_;   // Invariant: data_ is never NULL
    };

    Fred::Fred(Data* data) : data_(data)  { assert(data != NULL); }

    Fred Fred::create1(String s, int i)   { return Fred(new Der1(s, i)); }
    Fred Fred::create2(float x, float y)  { return Fred(new Der2(x, y)); }

    Fred::Data* Fred::Der1::clone() const { return new Der1(*this); }
    Fred::Data* Fred::Der2::clone() const { return new Der2(*this); }

    Fred::Fred(const Fred& f)
      : data_(f.data_)
    {
      ++ data_->count_;
    }

    Fred& Fred::operator= (const Fred& f)
    {
      // DO NOT CHANGE THE ORDER OF THESE STATEMENTS!
      // (This order properly handles self-assignment[12.1])
      ++ f.data_->count_;
      if (--data_->count_ == 0) delete data_;
      data_ = f.data_;
      return *this;
    }

    Fred::~Fred()
    {
      if (--data_->count_ == 0) delete data_;
    }

    void Fred::sampleInspectorMethod() const
    {
      // This method promises ("const") not to change anything in *data_
      // Therefore we simply "pass the method through" to *data_:
      data_->sampleInspectorMethod();
    }

    void Fred::sampleMutatorMethod()
    {
      // This method might need to change things in *data_
      // Thus it first checks if this is the only pointer to *data_
      if (data_->count_ > 1) {
        Data* d = data_->clone();   // The Virtual Constructor Idiom[20.5]
        -- data_->count_;
        data_ = d;
      }
      assert(data_->count_ == 1);

      // Now we "pass the method through" to *data_:
      data_->sampleMutatorMethod();
    }

Naturally the constructors and sampleXXX methods for Fred::Der1 and Fred::Der2
will need to be implemented in whatever way is appropriate.

==============================================================================

SECTION [17]: Exceptions and error handling

[17.1] How can I handle a constructor that fails?
Throw an exception.

Constructors don't have a return type, so it's not possible to use error
codes. The best way to signal constructor failure is therefore to throw an
exception.

If you don't have or won't use exceptions, here's a work-around. If a
constructor fails, the constructor can put the object into a "zombie" state.
Do this by setting an internal status bit so the object acts sort of like it's
dead even though it is technically still alive. Then add a query ("inspector")
member function to check this "zombie" bit so users of your class can find out
if their object is truly alive, or if it's a zombie (i.e., a "living dead"
object). Also you'll probably want to have your other member functions check
this zombie bit, and, if the object isn't really alive, do a no-op (or perhaps
something more obnoxious such as abort()). This is really ugly, but it's the
best you can do if you can't (or don't want to) use exceptions.

==============================================================================

[17.2] How should I handle resources if my constructors may throw exceptions?

Every data member inside your object should clean up its own mess.

If a constructor throws an exception, the object's destructor is not run;
only the destructors of the data members that were already fully constructed
are run. If your object has already done something that needs to be undone
(such as allocating some memory, opening a file, or locking a semaphore), this
"stuff that needs to be undone" must be remembered by a data member inside the
object.

For example, rather than allocating memory into a raw Fred* data member, put
the allocated memory into a "smart pointer" member object, and the destructor
of this smart pointer will delete the Fred object when the smart pointer dies.
The standard class auto_ptr is an example of such a "smart pointer" class. You
can also write your own reference counting smart pointer[16.20]. You can also
use smart pointers to "point" to disk records or objects on other
machines[13.3].
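
The following is a minimal sketch of the point above, not a definitive
implementation. It assumes a C++11 (or later) compiler and uses
std::unique_ptr in place of the auto_ptr named in the text, since auto_ptr was
later removed from the standard library. The names Tracked, Widget, and
tryConstruct are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical resource that counts how many instances are currently alive.
struct Tracked {
    static int alive;
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Hypothetical class whose constructor may fail after acquiring a resource.
class Widget {
public:
    static bool destructorRan;

    explicit Widget(bool fail)
        : res_(new Tracked())          // resource is owned by a data member
    {
        assert(res_.get() != nullptr); // the member now holds the resource
        if (fail)
            throw std::runtime_error("Widget construction failed");
    }

    ~Widget() { destructorRan = true; } // NOT run if the constructor throws

private:
    std::unique_ptr<Tracked> res_;      // this member cleans up its own mess
};
bool Widget::destructorRan = false;

// Returns true if constructing a Widget threw an exception.
bool tryConstruct(bool fail)
{
    try {
        Widget w(fail);
        return false;
    } catch (const std::runtime_error&) {
        return true;
    }
}
```

When tryConstruct(true) is called, the constructor throws, ~Widget() is never
entered, and the Tracked object is still released because the smart pointer
member is destroyed during stack unwinding. Had res_ been a raw Tracked*, the
allocation would have leaked.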
==============================================================================

[17.3] How do I change the string-length of an array of char to prevent memory
       leaks even if/when someone throws an exception?

If what you really want to do is work with strings, don't use an array of char
in the first place, since arrays are evil[21.5]. Instead use an object of some
string-like class.

For example, suppose you want to get a copy of a string, fiddle with the copy,
then append another string to the end of the fiddled copy. The array-of-char
approach would look something like this:

    #include <cstring>   // strlen(), strcpy()
    using namespace std;

    void userCode(const char* s1, const char* s2)
    {
      // Get a copy of s1 into a new string called copy:
      char* copy = new char[strlen(s1) + 1];
      strcpy(copy, s1);

      // Now that we have a local pointer to freestore-allocated memory,
      // we need to use a try block to prevent memory leaks:
      try {

        // Now we fiddle with copy for a while...
        // ...

        // Later we want to append s2 onto the fiddled-with copy:
        // ... [Here's where people want to reallocate copy] ...
        char* copy2 = new char[strlen(copy) + strlen(s2) + 1];
        strcpy(copy2, copy);
        strcpy(copy2 + strlen(copy), s2);
        delete[] copy;
        copy = copy2;

        // Finally we fiddle with copy again...
        // ...

      } catch (...) {
        delete[] copy;   // Prevent memory leaks if we got an exception
        throw;           // Re-throw the current exception
      }

      delete[] copy;     // Prevent memory leaks if we did NOT get an exception
    }

Using char*s like this is tedious and error-prone. Why not just use an object
of some string class? Your compiler probably supplies a string-like class, and
it's probably just as fast, and certainly a lot simpler and safer, than the
char* code that you would have to write yourself.
For example, if you're using the string class from the standardization
committee[6.12], your code might look something like this:

    #include <string>    // Let the compiler see class string
    using namespace std;

    void userCode(const string& s1, const string& s2)
    {
      // Get a copy of s1 into a new string called copy:
      string copy = s1;   // NOTE: we do NOT need a try block!

      // Now we fiddle with copy for a while...
      // ...

      // Later we want to append s2 onto the fiddled-with copy:
      copy += s2;         // NOTE: we do NOT need to reallocate memory!

      // Finally we fiddle with copy again...
      // ...
    }                     // NOTE: we do NOT need to delete[] anything!

==============================================================================

-- 
Marshall Cline / 972-931-9470 / mailto:cline@parashift.com