PEP: 3123
Title: Making PyObject_HEAD conform to standard C
Version: $Revision$
Last-Modified: $Date$
Author: Martin von Löwis <martin@v.loewis.de>
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Apr-2007
Python-Version: 3.0
Post-History:

Abstract

Python currently relies on undefined C behavior in its use of
PyObject_HEAD. This PEP proposes changing that usage to conform to
standard C.

Rationale

Standard C requires that an object be accessed only through a pointer
of its own type; with a few exceptions, all other accesses are undefined
behavior. In particular, the following code has undefined behavior:

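    #include <Python.h>   /* PyObject, PyObject_HEAD; also pulls in <stdlib.h> */

    struct FooObject {
      PyObject_HEAD
      int data;
    };

    PyObject *foo(struct FooObject *f) {
      return (PyObject *)f;
    }

    int bar(void) {
      struct FooObject *f = malloc(sizeof(struct FooObject));
      PyObject *o = foo(f);   /* same storage, different pointer type */
      f->ob_refcnt = 0;       /* access as struct FooObject */
      o->ob_refcnt = 1;       /* access as PyObject */
      return f->ob_refcnt;    /* undefined: the compiler may assume 0 */
    }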

The problem here is that the storage is accessed both as if it were a
PyObject and as a struct FooObject.

Historically, compilers did not have any problems with this code.
However, modern compilers use that clause as an optimization
opportunity, finding that f->ob_refcnt and o->ob_refcnt cannot possibly
refer to the same memory, and that therefore the function should return
0, without having to fetch the value of ob_refcnt at all in the return
statement. For GCC, Python now uses -fno-strict-aliasing to work around
that problem; with other compilers, it may just see undefined behavior.
Even with GCC, using -fno-strict-aliasing may pessimize the generated
code unnecessarily.

Specification

Standard C has one specific exception to its aliasing rules precisely
designed to support the case of Python: a value of a struct type may
also be accessed through a pointer to the first field. E.g. if a struct
starts with an int, the struct * may also be cast to an int *, which
allows int values to be written into the first field.
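
The first-field exception can be sketched in isolation. The struct
counter type and the set_count_via_first_member function below are
hypothetical names chosen for illustration only; they are not part of
Python's API.

```c
#include <assert.h>

/* Hypothetical struct for illustration: its first member is an int. */
struct counter {
    int count;      /* first member: shares its address with the struct */
    double weight;
};

/* Writes through an int * obtained by casting the struct pointer.
   A pointer to a struct, suitably converted, points to its first
   member, so this access is well-defined. */
static int set_count_via_first_member(struct counter *c, int value)
{
    int *first = (int *)c;

    assert((void *)first == (void *)&c->count);
    *first = value;
    return c->count;    /* observes the value just written */
}
```

Calling set_count_via_first_member(&c, 7) on a struct counter c leaves
c.count equal to 7 and does not touch c.weight.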

For Python, PyObject_HEAD and PyObject_VAR_HEAD will be changed to not
list all fields anymore, but list a single field of type
PyObject/PyVarObject:

    typedef struct _object {
      _PyObject_HEAD_EXTRA
      Py_ssize_t ob_refcnt;
      struct _typeobject *ob_type;
    } PyObject;

    typedef struct {
      PyObject ob_base;
      Py_ssize_t ob_size;
    } PyVarObject;

    #define PyObject_HEAD        PyObject ob_base;
    #define PyObject_VAR_HEAD    PyVarObject ob_base;

Types defined as fixed-size structures will then include PyObject as
their first field; variable-sized objects will include PyVarObject. E.g.:

    typedef struct {
      PyObject ob_base;
      PyObject *start, *stop, *step;
    } PySliceObject;

    typedef struct {
      PyVarObject ob_base;
      PyObject **ob_item;
      Py_ssize_t allocated;
    } PyListObject;

The above definitions of PyObject_HEAD are normative, so extension
authors MAY either use the macro, or put the ob_base field explicitly
into their structs.
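
As a rough sketch of the two options, the following stand-alone code
compares a struct that uses a head macro with one that spells out the
ob_base field. DemoObject, DemoObject_HEAD, FooWithMacro, and
FooExplicit are stand-ins invented for this example (with ptrdiff_t in
place of Py_ssize_t and _PyObject_HEAD_EXTRA omitted); a real extension
would include Python.h and use PyObject and PyObject_HEAD instead.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the definitions in Python.h (assumption: simplified). */
struct demo_typeobject;

typedef struct demo_object {
    ptrdiff_t ob_refcnt;
    struct demo_typeobject *ob_type;
} DemoObject;

#define DemoObject_HEAD DemoObject ob_base;

/* Option 1: the macro supplies the base field. */
typedef struct {
    DemoObject_HEAD
    int data;
} FooWithMacro;

/* Option 2: the base field is written out explicitly. */
typedef struct {
    DemoObject ob_base;
    int data;
} FooExplicit;

/* Returns 1 if both options place the base object at offset 0 and
   otherwise share the same layout. */
static int layouts_match(void)
{
    return offsetof(FooWithMacro, ob_base) == 0
        && offsetof(FooExplicit, ob_base) == 0
        && offsetof(FooWithMacro, data) == offsetof(FooExplicit, data)
        && sizeof(FooWithMacro) == sizeof(FooExplicit);
}
```

Because the base object sits at offset 0 in both variants, a pointer to
either struct may be converted to a pointer to the base type.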

As a convention, the base field SHOULD be called ob_base. However, all
accesses to ob_refcnt and ob_type MUST cast the object pointer to
PyObject* (unless the pointer is already known to have that type), and
SHOULD use the respective accessor macros. To simplify access to
ob_type, ob_refcnt, and ob_size, macros:

    #define Py_TYPE(o)    (((PyObject*)(o))->ob_type)
    #define Py_REFCNT(o)  (((PyObject*)(o))->ob_refcnt)
    #define Py_SIZE(o)    (((PyVarObject*)(o))->ob_size)

are added. E.g. the code blocks:

    #define PyList_CheckExact(op) ((op)->ob_type == &PyList_Type)

    return func->ob_type->tp_name;

need to be changed to:

    #define PyList_CheckExact(op) (Py_TYPE(op) == &PyList_Type)

    return Py_TYPE(func)->tp_name;
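
To see why routing every access through the base type removes the
aliasing problem, the bar() function from the Rationale can be rewritten
against the new layout. This is a stand-alone sketch under stated
assumptions: DemoObject, Demo_TYPE, Demo_REFCNT, and Foo_Type are
stand-ins for PyObject, Py_TYPE, Py_REFCNT, and a real type object, with
ptrdiff_t used in place of Py_ssize_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-ins for the definitions in Python.h (assumption: simplified). */
struct demo_typeobject { const char *tp_name; };

typedef struct {
    ptrdiff_t ob_refcnt;
    struct demo_typeobject *ob_type;
} DemoObject;

#define Demo_TYPE(o)   (((DemoObject *)(o))->ob_type)
#define Demo_REFCNT(o) (((DemoObject *)(o))->ob_refcnt)

struct FooObject {
    DemoObject ob_base;   /* must be the first member */
    int data;
};

static struct demo_typeobject Foo_Type = { "demo.Foo" };

/* Reads the type name through the accessor macro. */
static const char *type_name_of(struct FooObject *f)
{
    return Demo_TYPE(f)->tp_name;
}

/* Both stores and the final load go through DemoObject, so the
   compiler must treat them as accesses to the same storage. */
static ptrdiff_t bar(void)
{
    struct FooObject *f = malloc(sizeof *f);
    DemoObject *o = (DemoObject *)f;
    ptrdiff_t result;

    if (f == NULL)
        return -1;                /* allocation failed */
    Demo_REFCNT(f) = 0;
    Demo_REFCNT(o) = 1;
    result = Demo_REFCNT(f);      /* sees the store made through o */
    free(f);
    return result;
}
```

In this version bar() returns 1 whenever the allocation succeeds,
regardless of whether strict aliasing optimizations are enabled.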

For initialization of type objects, the current sequence:

    PyObject_HEAD_INIT(NULL)
    0, /* ob_size */

becomes incorrect, and must be replaced with:

    PyVarObject_HEAD_INIT(NULL, 0)
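
The reason is that the head is now a nested struct, so its initializer
needs its own braces with the size inside them. The sketch below uses
stand-in macros (DemoObject_HEAD_INIT, DemoVarObject_HEAD_INIT) modeled
on that nesting; the actual expansions in Python.h are assumed to be
similar but may differ, for example to account for
_PyObject_HEAD_EXTRA.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the definitions in Python.h (assumption: simplified). */
struct demo_typeobject;

typedef struct {
    ptrdiff_t ob_refcnt;
    struct demo_typeobject *ob_type;
} DemoObject;

typedef struct {
    DemoObject ob_base;
    ptrdiff_t ob_size;
} DemoVarObject;

/* Each macro emits one braced initializer followed by a comma. */
#define DemoObject_HEAD_INIT(type)          { 1, (type) },
#define DemoVarObject_HEAD_INIT(type, size) { DemoObject_HEAD_INIT(type) (size) },

struct demo_typeobject {
    DemoVarObject ob_base;   /* type objects are variable-sized */
    const char *tp_name;
};

/* Expands to { { { 1, NULL }, 0 }, "demo.Demo", } */
static struct demo_typeobject Demo_Type = {
    DemoVarObject_HEAD_INIT(NULL, 0)
    "demo.Demo",             /* tp_name */
};
```

With the old spelling, the braced object initializer would fill the
whole ob_base member and the trailing 0 would land in tp_name instead of
ob_size.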

Compatibility with Python 2.6

To support modules that compile with both Python 2.6 and Python 3.0, the
Py_TYPE, Py_REFCNT, and Py_SIZE macros are added to Python 2.6. The
macros Py_INCREF and Py_DECREF will be changed to cast their argument to
PyObject *, so that module authors can also explicitly declare the
ob_base field in modules designed for Python 2.6.
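
The effect of the cast can be sketched with simplified stand-in macros.
Demo_INCREF and Demo_DECREF below are hypothetical and only adjust the
counter; the real Py_DECREF also deallocates the object when the count
reaches zero, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the definitions in Python.h (assumption: simplified). */
struct demo_typeobject;

typedef struct {
    ptrdiff_t ob_refcnt;
    struct demo_typeobject *ob_type;
} DemoObject;

/* The cast lets the macros accept a pointer to any struct whose first
   member is a DemoObject; without it, (op)->ob_refcnt would not
   compile for a struct that nests the head in ob_base. */
#define Demo_INCREF(op) (((DemoObject *)(op))->ob_refcnt++)
#define Demo_DECREF(op) (((DemoObject *)(op))->ob_refcnt--)

typedef struct {
    DemoObject ob_base;   /* explicitly declared base field */
    int data;
} FooObject;

/* Two increments and one decrement: a net change of +1. */
static ptrdiff_t incref_twice_decref_once(FooObject *f)
{
    Demo_INCREF(f);
    Demo_INCREF(f);
    Demo_DECREF(f);
    return f->ob_base.ob_refcnt;
}
```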

Copyright

This document has been placed in the public domain.


