Reference Count Details

The reference count behavior of functions in the Python/C API is best explained in terms of ownership of references. Note that we talk of owning references, never of owning objects; objects are always shared! When a function owns a reference, it has to dispose of it properly - either by passing ownership on (usually to its caller) or by calling Py_DECREF() or Py_XDECREF(). When a function passes ownership of a reference on to its caller, the caller is said to receive a new reference. When no ownership is transferred, the caller is said to borrow the reference. Nothing needs to be done for a borrowed reference.
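The ownership rules above can be sketched with a toy reference-counted object. Everything in this sketch (Obj, obj_new, obj_decref, make_value, peek_global, demo, and the live_objects counter) is hypothetical and only mimics the behavior described here; it is not the Python/C API. make_value hands ownership to its caller (a new reference), while peek_global returns a borrowed reference that the caller must not release.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a reference-counted object (hypothetical, not PyObject). */
typedef struct {
    int refcnt;
    long value;
} Obj;

int live_objects = 0;      /* objects allocated and not yet freed */
Obj *global_obj = NULL;    /* the "module" owns one reference to this */

/* Returns a new reference (count 1), or NULL if allocation fails. */
Obj *obj_new(long value)
{
    Obj *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->refcnt = 1;
    o->value = value;
    live_objects++;
    return o;
}

/* Gives up one owned reference; frees the object when none remain. */
void obj_decref(Obj *o)
{
    assert(o != NULL && o->refcnt > 0);
    if (--o->refcnt == 0) {
        free(o);
        live_objects--;
    }
}

/* Passes ownership on: the caller receives a new reference. */
Obj *make_value(long v)
{
    return obj_new(v);
}

/* Transfers no ownership: the caller only borrows the module's reference. */
Obj *peek_global(void)
{
    return global_obj;
}

/* Adds an owned and a borrowed value; returns -1 on failure. */
long demo(void)
{
    Obj *owned, *borrowed;
    long sum;

    owned = make_value(40);
    if (owned == NULL)
        return -1;
    borrowed = peek_global();
    if (borrowed == NULL) {
        obj_decref(owned);   /* still ours to release on the error path */
        return -1;
    }
    sum = owned->value + borrowed->value;
    obj_decref(owned);       /* we own this reference, so we dispose of it */
    /* nothing to do for 'borrowed' */
    return sum;
}
```

In this model, demo leaves the count of global_obj unchanged and leaks nothing, because it releases exactly the one reference it owns, on both the success path and the error path.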

Conversely, when a calling function passes in a reference to an object, there are two possibilities: the function steals a reference to the object, or it does not. Few functions steal references; the two notable exceptions are PyList_SetItem() and PyTuple_SetItem(), which steal a reference to the item (but not to the tuple or list into which the item is put!). These functions were designed to steal a reference because of a common idiom for populating a tuple or list with newly created objects; for example, the code to create the tuple (1, 2, "three") could look like this (forgetting about error handling for the moment; a better way to code this is shown below anyway):

PyObject *t;
t = PyTuple_New(3);
PyTuple_SetItem(t, 0, PyInt_FromLong(1L));
PyTuple_SetItem(t, 1, PyInt_FromLong(2L));
PyTuple_SetItem(t, 2, PyString_FromString("three"));

Incidentally, PyTuple_SetItem() is the only way to set tuple items; PyObject_SetItem() refuses to do this since tuples are an immutable data type. You should only use PyTuple_SetItem() for tuples that you are creating yourself.
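To make the idea of stealing concrete, here is a toy model of a one-slot container with two setters. All names (Obj, Slot, slot_set_steal, slot_set_keep, fill_steal, fill_keep, live_objects) are hypothetical and are not part of the Python/C API; the sketch only assumes the semantics described above, where a stealing setter takes over the caller's reference and a non-stealing setter acquires its own.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a reference-counted object (hypothetical, not PyObject). */
typedef struct {
    int refcnt;
    long value;
} Obj;

/* A one-slot container that owns one reference to whatever it holds. */
typedef struct {
    Obj *item;
} Slot;

int live_objects = 0;   /* objects allocated and not yet freed */

/* Returns a new reference (count 1), or NULL if allocation fails. */
Obj *obj_new(long value)
{
    Obj *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->refcnt = 1;
    o->value = value;
    live_objects++;
    return o;
}

/* Gives up one owned reference; frees the object when none remain. */
void obj_decref(Obj *o)
{
    assert(o != NULL && o->refcnt > 0);
    if (--o->refcnt == 0) {
        free(o);
        live_objects--;
    }
}

/* Steals the caller's reference to 'item'. */
void slot_set_steal(Slot *s, Obj *item)
{
    if (s->item != NULL)
        obj_decref(s->item);   /* release the previous occupant */
    s->item = item;            /* the caller's reference now belongs to the slot */
}

/* Does not steal: the slot acquires a reference of its own. */
void slot_set_keep(Slot *s, Obj *item)
{
    item->refcnt++;            /* increment first, in case item is the old occupant */
    if (s->item != NULL)
        obj_decref(s->item);
    s->item = item;
}

/* With a stealing setter, a new object is handed over and forgotten. */
int fill_steal(Slot *s, long v)
{
    Obj *x = obj_new(v);
    if (x == NULL)
        return -1;
    slot_set_steal(s, x);      /* no decref: ownership moved into the slot */
    return 0;
}

/* With a non-stealing setter, the caller must still release its reference. */
int fill_keep(Slot *s, long v)
{
    Obj *x = obj_new(v);
    if (x == NULL)
        return -1;
    slot_set_keep(s, x);
    obj_decref(x);             /* we still own our reference, so dispose of it */
    return 0;
}
```

Both helpers leave the stored object with exactly one reference, held by the slot; the stealing variant simply saves the caller the extra release step.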

Equivalent code for populating a list can be written using PyList_New() and PyList_SetItem(). Such code can also use PySequence_SetItem(); this illustrates the difference between the two:

PyObject *l, *x;
l = PyList_New(3);
x = PyInt_FromLong(1L);
PySequence_SetItem(l, 0, x); Py_DECREF(x); /* not stolen, so release our reference */
x = PyInt_FromLong(2L);
PySequence_SetItem(l, 1, x); Py_DECREF(x);
x = PyString_FromString("three");
PySequence_SetItem(l, 2, x); Py_DECREF(x);

You might find it strange that the "recommended" approach takes more code. However, in practice, you will rarely use these ways of creating and populating a tuple or list. There's a generic function, Py_BuildValue(), that can create most common objects from C values, directed by a "format string". For example, the above two blocks of code could be replaced by the following (which also takes care of the error checking!):

PyObject *t, *l;
t = Py_BuildValue("(iis)", 1, 2, "three");
l = Py_BuildValue("[iis]", 1, 2, "three");

It is much more common to use PyObject_SetItem() and friends with items whose references you are only borrowing, like arguments that were passed in to the function you are writing. In that case, their behavior regarding reference counts is much saner, since you don't have to increment a reference count so you can give a reference away ("have it be stolen"). For example, this function sets all items of a list (actually, any mutable sequence) to a given item:

int set_all(PyObject *target, PyObject *item)
{
    int i, n;
    n = PyObject_Length(target);
    if (n < 0)
        return -1;
    for (i = 0; i < n; i++) {
        if (PySequence_SetItem(target, i, item) < 0)
            return -1;
    }
    return 0;
}

The situation is slightly different for function return values. While passing a reference to most functions does not change your ownership responsibilities for that reference, many functions that return a reference to an object give you ownership of the reference. The reason is simple: in many cases, the returned object is created on the fly, and the reference you get is the only reference to the object! Therefore, the generic functions that return object references, like PyObject_GetItem() and PySequence_GetItem(), always return a new reference (i.e., the caller becomes the owner of the reference).

It is important to realize that whether you own a reference returned by a function depends only on which function you call - the plumage (i.e., the type of the object passed as an argument to the function) doesn't enter into it! Thus, if you extract an item from a list using PyList_GetItem(), you don't own the reference - but if you obtain the same item from the same list using PySequence_GetItem() (which happens to take exactly the same arguments), you do own a reference to the returned object.

Here is an example of how you could write a function that computes the sum of the items in a list of integers; once using PyList_GetItem(), once using PySequence_GetItem().

long sum_list(PyObject *list)
{
    int i, n;
    long total = 0;
    PyObject *item;
    n = PyList_Size(list);
    if (n < 0)
        return -1; /* Not a list */
    for (i = 0; i < n; i++) {
        item = PyList_GetItem(list, i); /* Can't fail */
        if (!PyInt_Check(item)) continue; /* Skip non-integers */
        total += PyInt_AsLong(item);
    }
    return total;
}

long sum_sequence(PyObject *sequence)
{
    int i, n;
    long total = 0;
    PyObject *item;
    n = PyObject_Size(sequence);
    if (n < 0)
        return -1; /* Has no length */
    for (i = 0; i < n; i++) {
        item = PySequence_GetItem(sequence, i);
        if (item == NULL)
            return -1; /* Not a sequence, or other failure */
        if (PyInt_Check(item))
            total += PyInt_AsLong(item);
        Py_DECREF(item); /* Discard reference ownership */
    }
    return total;
}

guido@CNRI.Reston.Va.US