The reference count is important because today's computers have a finite (and often severely limited) memory size; it counts how many different places there are that have a reference to an object. Such a place could be another object, or a global (or static) C variable, or a local variable in some C function. When an object's reference count becomes zero, the object is deallocated. If it contains references to other objects, their reference counts are decremented. Those other objects may be deallocated in turn, if this decrement makes their reference count become zero, and so on. (There's an obvious problem with objects that reference each other here; for now, the solution is "don't do that".)
Reference counts are always manipulated explicitly. The normal way is
to use the macro Py_INCREF(a) to increment an object's reference count
by one, and Py_DECREF(a) to decrement it by one. The decref macro is
considerably more complex than the incref one, since it must check
whether the reference count becomes zero and then cause the object's
deallocator to be called. The deallocator is a function pointer
contained in the object's type structure. The type-specific deallocator
takes care of decrementing the reference counts for other objects
contained in the object, and so on, if this is a compound object type
such as a list. There's no chance that the reference count can
overflow; at least as many bits are used to hold the reference count as
there are distinct memory locations in virtual memory (assuming
sizeof(long) >= sizeof(char *)). Thus, the reference count increment is
a simple operation.
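The mechanism described above can be sketched with a small stand-alone model. This is only an illustration, not the interpreter's actual code: the toy_ names, the single "item" slot and the toy_freed counter are all invented for this example. It shows an increment that is a bare ++, a decrement that checks for zero and calls the deallocator through the type structure, and a deallocator that releases the contained object in turn.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: every name below is hypothetical, not a Python definition. */
typedef struct toy_object toy_object;

typedef struct {
    const char *name;
    void (*dealloc)(toy_object *);  /* type-specific deallocator */
} toy_type;

struct toy_object {
    long refcnt;            /* the reference count */
    const toy_type *type;   /* the object's type structure */
    toy_object *item;       /* one contained object, or NULL */
};

int toy_freed = 0;          /* counts deallocations, for the demo only */

/* Incrementing is a simple operation. */
#define TOY_INCREF(op) ((op)->refcnt++)

/* Decrementing must check for zero and then call the deallocator. */
#define TOY_DECREF(op) \
    do { \
        toy_object *toy_tmp = (op); \
        if (--toy_tmp->refcnt == 0) \
            toy_tmp->type->dealloc(toy_tmp); \
    } while (0)

void toy_dealloc(toy_object *op)
{
    if (op->item != NULL)
        TOY_DECREF(op->item);   /* may deallocate the contained object too */
    toy_freed++;
    free(op);
}

const toy_type toy_container_type = { "container", toy_dealloc };

/* Create an object with reference count 1 that holds a reference to item. */
toy_object *toy_new(toy_object *item)
{
    toy_object *op = malloc(sizeof *op);
    assert(op != NULL);
    op->refcnt = 1;
    op->type = &toy_container_type;
    op->item = item;
    if (item != NULL)
        TOY_INCREF(item);       /* the container now owns a reference */
    return op;
}

/* Returns how many objects were deallocated in total. */
int toy_cascade_demo(void)
{
    toy_object *inner = toy_new(NULL);   /* inner: count 1 (ours) */
    toy_object *outer = toy_new(inner);  /* inner: count 2 */
    toy_freed = 0;
    TOY_DECREF(inner);          /* inner: count 1, kept alive by outer */
    assert(toy_freed == 0);
    TOY_DECREF(outer);          /* outer reaches 0, then inner reaches 0 */
    return toy_freed;
}
```

Calling toy_cascade_demo() releases the outer object last, and the single decrement takes both objects with it.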
It is not necessary to increment an object's reference count for every local variable that contains a pointer to an object. In theory, the object's reference count goes up by one when the variable is made to point to it and it goes down by one when the variable goes out of scope. However, these two cancel each other out, so at the end the reference count hasn't changed. The only real reason to use the reference count is to prevent the object from being deallocated as long as our variable is pointing to it. If we know that there is at least one other reference to the object that lives at least as long as our variable, there is no need to increment the reference count temporarily. An important situation where this arises is in objects that are passed as arguments to C functions in an extension module that are called from Python; the call mechanism guarantees to hold a reference to every argument for the duration of the call.
However, a common pitfall is to extract an object from a list and
hold on to it for a while without incrementing its reference count.
Some other operation might conceivably remove the object from the
list, decrementing its reference count and possibly deallocating it.
The real danger is that innocent-looking operations may invoke
arbitrary Python code which could do this; there is a code path which
allows control to flow back to the user from a Py_DECREF(), so
almost any operation is potentially dangerous.
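The pitfall, and the usual way around it, can be sketched with another stand-alone toy model. Nothing here is part of the Python API: toy_item, toy_list and the helper functions are hypothetical, and the call to toy_list_set() merely stands in for "some other operation" that removes the object from the list. The pointer taken from the list is protected by incrementing the count before that operation runs and decrementing it afterwards.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: every name below is hypothetical, not the Python API. */
typedef struct {
    long refcnt;
    int value;
} toy_item;

typedef struct {
    toy_item *slot;     /* the list owns one reference to its element */
} toy_list;

toy_item *toy_item_new(int value)
{
    toy_item *it = malloc(sizeof *it);
    assert(it != NULL);
    it->refcnt = 1;
    it->value = value;
    return it;
}

void toy_item_decref(toy_item *it)
{
    if (--it->refcnt == 0)
        free(it);
}

/* Store it in the list, taking over the caller's reference to it, and
   drop the list's reference to the element that was there before. */
void toy_list_set(toy_list *list, toy_item *it)
{
    toy_item *old = list->slot;
    list->slot = it;
    if (old != NULL)
        toy_item_decref(old);   /* may deallocate the old element */
}

int toy_safe_use(toy_list *list)
{
    toy_item *it = list->slot;  /* pointer taken from the list, not owned */
    int value;

    it->refcnt++;               /* protect it: we now own a reference */

    /* Stands in for an innocent-looking operation that changes the list.
       Without the increment above, 'it' would be freed here. */
    toy_list_set(list, toy_item_new(99));

    value = it->value;          /* still valid: our reference keeps it alive */
    toy_item_decref(it);        /* done with it; this frees the old element */
    return value;
}

int toy_pitfall_demo(void)
{
    toy_list list = { NULL };
    int value;

    toy_list_set(&list, toy_item_new(42));
    value = toy_safe_use(&list);
    toy_list_set(&list, NULL);  /* release the remaining element */
    return value;
}
```

Removing the increment and the matching decrement from toy_safe_use() would leave it reading freed memory, which is exactly the situation described above.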
A safe approach is to always use the generic operations (functions
whose name begins with PyObject_, PyNumber_, PySequence_ or
PyMapping_). These operations always increment the reference count of
the object they return. This leaves the caller with the responsibility
to call Py_DECREF() when they are done with the result; this soon
becomes second nature.
guido@CNRI.Reston.Va.US