PEP: 298
Title: The Locked Buffer Interface
Version: $Revision$
Last-Modified: $Date$
Author: Thomas Heller <theller@python.net>
Status: Withdrawn
Type: Standards Track
Content-Type: text/x-rst
Created: 26-Jul-2002
Python-Version: 2.3
Post-History: 30-Jul-2002, 01-Aug-2002

Abstract

This PEP proposes an extension to the buffer interface called the
'locked buffer interface'.

The locked buffer interface avoids the flaws of the 'old' buffer
interface[1] as defined in Python versions up to and including 2.2, and
has the following semantics:

-   The lifetime of the retrieved pointer is clearly defined and
    controlled by the client.
-   The buffer size is returned as a 'size_t' data type, which allows
    access to large buffers on platforms where
    sizeof(int) != sizeof(void *).

(Guido comments: This second sounds like a change we could also make to
the "old" buffer interface, if we introduce another flag bit that's not
part of the default flags.)

Specification

The locked buffer interface exposes new functions which return the size
of, and a pointer to, the internal memory block of any Python object
which chooses to implement this interface.

Retrieving a buffer from an object puts this object in a locked state
during which the buffer may not be freed, resized, or reallocated.

When the buffer is no longer needed, the client must unlock the object
again by calling the release function of the locked buffer interface.
If the object never resizes or reallocates the buffer during its
lifetime, this function may be NULL. Failing to call it (when it is not
NULL) is a programming error and may have unexpected results.

The locked buffer interface omits the memory segment model which is
present in the old buffer interface - only a single memory block can be
exposed.

The memory blocks can be accessed without holding the global interpreter
lock.
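The locked state described in this section can be illustrated with a
small standalone model. This is only a sketch in plain C, not CPython
code: the model_array type and the model_array_* functions are
hypothetical names invented for the illustration, and a simple counter
is assumed as the way to track the locked state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an object that owns a resizable memory block. */
typedef struct {
    char *data;
    size_t size;
    int lock_count;   /* number of buffers acquired and not yet released */
} model_array;

/* Allocate the block; returns 0 on success, -1 on allocation failure. */
int model_array_init(model_array *a, size_t size)
{
    a->data = malloc(size ? size : 1);
    if (a->data == NULL)
        return -1;
    a->size = size;
    a->lock_count = 0;
    return 0;
}

/* Acquire: hand out the pointer and enter the locked state. */
size_t model_array_acquire(model_array *a, void **ptr)
{
    a->lock_count++;
    *ptr = a->data;
    return a->size;
}

/* Release: leave the locked state once the client is done. */
void model_array_release(model_array *a)
{
    assert(a->lock_count > 0);
    a->lock_count--;
}

/* Resizing is refused while any acquired buffer is outstanding. */
int model_array_resize(model_array *a, size_t new_size)
{
    char *p;
    if (a->lock_count > 0)
        return -1;              /* locked: the block must not move */
    p = realloc(a->data, new_size ? new_size : 1);
    if (p == NULL)
        return -1;
    a->data = p;
    a->size = new_size;
    return 0;
}

void model_array_free(model_array *a)
{
    free(a->data);
    a->data = NULL;
    a->size = 0;
}
```

In this model a client that calls model_array_acquire() can rely on the
pointer until it calls model_array_release(); a resize attempted in
between fails instead of moving the block.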

Implementation

Define a new flag in Include/object.h:

    /* PyBufferProcs contains bf_acquirelockedreadbuffer,
       bf_acquirelockedwritebuffer, and bf_releaselockedbuffer */
    #define Py_TPFLAGS_HAVE_LOCKEDBUFFER (1L<<15)

This flag would be included in Py_TPFLAGS_DEFAULT:

    #define Py_TPFLAGS_DEFAULT  ( \
                        ....
                        Py_TPFLAGS_HAVE_LOCKEDBUFFER | \
                        ....
                        0)

Extend the PyBufferProcs structure by new fields in Include/object.h:

    typedef size_t (*acquirelockedreadbufferproc)(PyObject *,
                                                  const void **);
    typedef size_t (*acquirelockedwritebufferproc)(PyObject *,
                                                   void **);
    typedef void (*releaselockedbufferproc)(PyObject *);

    typedef struct {
        getreadbufferproc bf_getreadbuffer;
        getwritebufferproc bf_getwritebuffer;
        getsegcountproc bf_getsegcount;
        getcharbufferproc bf_getcharbuffer;
        /* locked buffer interface functions */
        acquirelockedreadbufferproc bf_acquirelockedreadbuffer;
        acquirelockedwritebufferproc bf_acquirelockedwritebuffer;
        releaselockedbufferproc bf_releaselockedbuffer;
    } PyBufferProcs;

The new fields are present if the Py_TPFLAGS_HAVE_LOCKEDBUFFER flag is
set in the object's type.

The Py_TPFLAGS_HAVE_LOCKEDBUFFER flag implies the
Py_TPFLAGS_HAVE_GETCHARBUFFER flag.

On success, the acquirelockedreadbufferproc and
acquirelockedwritebufferproc functions return the size in bytes of the
memory block and fill in the passed void * pointer. If these functions
fail - either because an error occurs or no memory block is exposed -
they must set the void * pointer to NULL and raise an exception. The
return value is undefined in these cases and should not be used.
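The return convention above means that callers judge success by the
pointer, not by the returned size. The following sketch models that
convention in plain C. It is not CPython code: model_obj,
model_error_set and the model_* functions are hypothetical, and a global
flag is assumed as a stand-in for a raised exception.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object; has_block says whether it exposes memory at all. */
typedef struct {
    const char *data;
    size_t size;
    int has_block;
} model_obj;

/* Stand-in for the interpreter's "an exception is set" state. */
int model_error_set = 0;

/* Shaped like acquirelockedreadbufferproc: returns the size in bytes and
   fills in *ptr on success; on failure sets *ptr to NULL and flags an
   error. */
size_t model_acquire_read(model_obj *o, const void **ptr)
{
    assert(ptr != NULL);
    if (!o->has_block) {
        *ptr = NULL;
        model_error_set = 1;
        return 0;   /* meaningless on failure; callers must ignore it */
    }
    *ptr = o->data;
    return o->size;
}

/* This model never locks, so releasing has nothing to undo. */
void model_release(model_obj *o)
{
    (void)o;
}

/* Caller side: returns the first byte of the block, or -1 if there is
   none. Failure is detected through the pointer only. */
int model_first_byte(model_obj *o)
{
    const void *p;
    int result;
    size_t n = model_acquire_read(o, &p);
    if (p == NULL)
        return -1;                  /* failed: n must not be inspected */
    result = (n > 0) ? ((const unsigned char *)p)[0] : -1;
    model_release(o);
    return result;
}
```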

If a call to one of these functions succeeds, the buffer must
eventually be released by a call to the releaselockedbufferproc,
supplying the original object as argument. The releaselockedbufferproc
cannot fail. For objects that actually maintain an internal lock count,
calling the releaselockedbufferproc more often than the buffer was
acquired, which would drive the lock count negative, is a fatal error.

Similar to the 'old' buffer interface, any of these functions may be set
to NULL, but it is strongly recommended to implement the
releaselockedbufferproc function (even if it does nothing) if either the
acquirelockedreadbufferproc or the acquirelockedwritebufferproc function
is implemented, to discourage extension writers from checking for a NULL
value and not calling it.
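A sketch of why a do-nothing release function is worth filling in: with
the slot always present, client code can release unconditionally. The
code below is a standalone model in plain C, not CPython code; the
model_procs table, model_obj and the fixed_* functions are hypothetical
names, and the release_calls counter exists only so that the example can
show that the call happened.

```c
#include <assert.h>
#include <stddef.h>

typedef struct model_obj model_obj;

/* Hypothetical slot table modelled on the proposed PyBufferProcs fields
   (only the read and release slots are represented). */
typedef struct {
    size_t (*acquire_read)(model_obj *, const void **);
    void (*release)(model_obj *);
} model_procs;

struct model_obj {
    const model_procs *procs;
    const char *data;
    size_t size;
    int release_calls;   /* bookkeeping for the demonstration only */
};

size_t fixed_acquire_read(model_obj *o, const void **ptr)
{
    assert(ptr != NULL);
    *ptr = o->data;
    return o->size;
}

/* Nothing to unlock for a block that never moves. A real do-nothing
   release would have an empty body; this one only counts its calls. */
void fixed_release(model_obj *o)
{
    o->release_calls++;
}

const model_procs fixed_procs = { fixed_acquire_read, fixed_release };

/* Client code: the release slot is called without a NULL check. */
size_t model_sum_bytes(model_obj *o)
{
    const void *p;
    size_t i, total = 0;
    size_t n = o->procs->acquire_read(o, &p);
    if (p == NULL)
        return 0;
    for (i = 0; i < n; i++)
        total += ((const unsigned char *)p)[i];
    o->procs->release(o);
    return total;
}
```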

These functions are not meant to be called directly; they are called
through convenience functions declared in Include/abstract.h:

    int PyObject_AcquireLockedReadBuffer(PyObject *obj,
                                        const void **buffer,
                                        size_t *buffer_len);

    int PyObject_AcquireLockedWriteBuffer(PyObject *obj,
                                          void **buffer,
                                          size_t *buffer_len);

    void PyObject_ReleaseLockedBuffer(PyObject *obj);

On success, the first two functions return 0, set buffer to the memory
location, and set buffer_len to the length of the memory block in bytes.
On failure, or if the locked buffer interface is not implemented by obj,
they return -1 and set an exception.

The third function returns nothing and cannot fail.
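How such convenience functions might dispatch to the slots can be
sketched with a standalone model. This is not the reference
implementation and not CPython code: every model_* name is hypothetical,
a global flag is assumed in place of a raised exception, and the choice
to silently skip a NULL release slot is an assumption made here so that
the release wrapper cannot fail.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_HAVE_LOCKEDBUFFER (1L << 15)   /* mirrors the proposed bit */

typedef struct model_obj model_obj;

typedef struct {
    size_t (*acquire_read)(model_obj *, const void **);
    void (*release)(model_obj *);
} model_procs;

typedef struct {
    long flags;
    const model_procs *procs;   /* may be NULL, like tp_as_buffer */
} model_type;

struct model_obj {
    const model_type *type;
    const char *data;
    size_t size;
};

int model_error_set = 0;   /* stand-in for "an exception is set" */

/* Shaped like PyObject_AcquireLockedReadBuffer: 0 on success, -1 on
   failure. The flag is checked before the new slots are touched. */
int model_acquire_locked_read(model_obj *o, const void **buffer,
                              size_t *buffer_len)
{
    const model_type *t = o->type;
    size_t n;

    assert(buffer != NULL && buffer_len != NULL);
    if (!(t->flags & MODEL_HAVE_LOCKEDBUFFER) || t->procs == NULL ||
        t->procs->acquire_read == NULL) {
        model_error_set = 1;    /* interface not implemented */
        return -1;
    }
    n = t->procs->acquire_read(o, buffer);
    if (*buffer == NULL)
        return -1;              /* the slot has already flagged the error */
    *buffer_len = n;
    return 0;
}

/* Shaped like PyObject_ReleaseLockedBuffer: returns nothing, cannot fail. */
void model_release_locked(model_obj *o)
{
    const model_type *t = o->type;
    if ((t->flags & MODEL_HAVE_LOCKEDBUFFER) && t->procs != NULL &&
        t->procs->release != NULL)
        t->procs->release(o);
}

size_t plain_acquire_read(model_obj *o, const void **ptr)
{
    *ptr = o->data;
    return o->size;
}

const model_procs plain_procs = { plain_acquire_read, NULL };
const model_type locked_type = { MODEL_HAVE_LOCKEDBUFFER, &plain_procs };
const model_type legacy_type = { 0, NULL };
```

An object of legacy_type, whose flags lack the bit, is rejected with -1
before any of the new slots would be read.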

Backward Compatibility

The size of the PyBufferProcs structure changes if this proposal is
implemented, but the type's tp_flags slot can be used to determine if
the additional fields are present.

Reference Implementation

An implementation has been uploaded to the SourceForge patch manager as
https://bugs.python.org/issue652857.

Additional Notes/Comments

Python strings, unicode strings, mmap objects, and array objects would
expose the locked buffer interface.

mmap and array objects would actually enter a locked state while the
buffer is active; this is not needed for strings and unicode objects.
Resizing locked array objects is not allowed and will raise an
exception. Whether closing a locked mmap object is an error or is only
deferred until the lock count reaches zero is an implementation
detail.
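One of the two options for mmap objects, the deferred close, can be
sketched as follows. This is a standalone model in plain C under stated
assumptions, not the mmap module's code: model_map and its functions are
hypothetical, and the deferred variant is picked here purely for
illustration, since the choice is left open above.

```c
#include <assert.h>

/* Hypothetical stand-in for an mmap-like object. */
typedef struct {
    int is_open;         /* 1 while the mapping is still valid */
    int close_pending;   /* a close was requested while locked */
    int lock_count;
} model_map;

void model_map_acquire(model_map *m)
{
    m->lock_count++;
}

/* Option modelled here: closing a locked object is deferred, not an
   error. */
void model_map_close(model_map *m)
{
    if (m->lock_count > 0)
        m->close_pending = 1;
    else
        m->is_open = 0;
}

/* The last release carries out a close that was deferred earlier. */
void model_map_release(model_map *m)
{
    assert(m->lock_count > 0);
    m->lock_count--;
    if (m->lock_count == 0 && m->close_pending) {
        m->close_pending = 0;
        m->is_open = 0;
    }
}
```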

Guido recommends:

  But I'm still very concerned that if most built-in types (e.g.
  strings, bytes) don't implement the release functionality, it's too
  easy for an extension to seem to work while forgetting to release the
  buffer.

  I recommend that at least some built-in types implement the
  acquire/release functionality with a counter, and assert that the
  counter is zero when the object is deleted -- if the assert fails,
  someone DECREF'ed their reference to the object without releasing it.
  (The rule should be that you must own a reference to the object while
  you've acquired the object.)

  For strings that might be impractical because the string object would
  have to grow 4 bytes to hold the counter; but the new bytes object
  (PEP 296) could easily implement the counter, and the array object too
  -- that way there will be plenty of opportunity to test proper use of
  the protocol.
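The counter check recommended above can be modelled in a few lines.
This is a hedged sketch in plain C, not CPython code: model_obj and the
model_* functions are hypothetical, and instead of asserting at
deallocation the sketch records the violation in a counter so that both
the correct and the incorrect sequence can be observed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object carrying a reference count and a lock counter. */
typedef struct {
    int refcount;
    int lock_count;
} model_obj;

/* Number of objects deallocated while a buffer was still acquired. */
int model_unreleased_at_dealloc = 0;

model_obj *model_new(void)
{
    model_obj *o = malloc(sizeof *o);
    if (o != NULL) {
        o->refcount = 1;
        o->lock_count = 0;
    }
    return o;
}

void model_acquire(model_obj *o)
{
    o->lock_count++;
}

void model_release(model_obj *o)
{
    assert(o->lock_count > 0);
    o->lock_count--;
}

void model_decref(model_obj *o)
{
    if (--o->refcount > 0)
        return;
    /* The recommendation is to assert that the counter is zero here;
       this sketch counts the violation instead of aborting. */
    if (o->lock_count != 0)
        model_unreleased_at_dealloc++;
    free(o);
}
```

A client that drops its only reference between model_acquire() and
model_release() is caught at deallocation time, which is the testing
opportunity the recommendation is after.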

Community Feedback

Greg Ewing doubts that the locked buffer interface is needed at all; he
thinks the normal buffer interface could be used if the pointer is
(re)fetched each time it is used. This seems dangerous, because even
innocent-looking calls to the Python API, such as Py_DECREF(), may
trigger execution of arbitrary Python code.

The first version of this proposal didn't have the release function, but
it turned out that this would have been too restrictive: mmap and array
objects wouldn't have been able to implement it, because mmap objects
can be closed anytime if not locked, and array objects could resize or
reallocate the buffer.

This PEP will probably be rejected because nobody except the author
needs it.

References

[1] The buffer interface
https://mail.python.org/pipermail/python-dev/2000-October/009974.html

Copyright

This document has been placed in the public domain.