Python Enhancement Proposals

PEP 741 – Python Configuration C API

Author:
Victor Stinner <vstinner at python.org>
Discussions-To:
Discourse thread
Status:
Draft
Type:
Standards Track
Created:
18-Jan-2024
Python-Version:
3.13
Post-History:
19-Jan-2024, 08-Feb-2024

Abstract

Add a C API to the limited C API to configure the Python initialization, and to get the current configuration. It can be used with the stable ABI.

Also add a sys.get_config(name) function to get the current configuration.

Complete the PEP 587 API by adding PyInitConfig_AddModule(), which can be used to add a built-in extension module; this feature was previously referred to as the “inittab”.

PEP 587 “Python Initialization Configuration” unified all the ways to configure the Python initialization. This PEP also unifies (almost fully) the configuration of the Python preinitialization and the Python initialization in a single API, even if the preinitialization is still required to decode strings from the locale encoding.

This new API replaces the deprecated and incomplete legacy API which is scheduled for removal between Python 3.13 and Python 3.15.

Rationale

PyConfig is not part of the limited C API

When the first versions of PEP 587 “Python Initialization Configuration” were discussed, there was a private field _config_version (int): the configuration version, used for ABI compatibility. It was decided that if an application embeds Python, it sticks to a Python version anyway, and so there is no need to bother with the ABI compatibility.

The final PyConfig API of PEP 587 is excluded from the limited C API since its main PyConfig structure is not versioned. Python cannot guarantee backward and forward ABI compatibility for it, so it is incompatible with the stable ABI.
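The ABI problem with an unversioned structure can be illustrated with a minimal sketch. The two structures below are hypothetical stand-ins (ConfigV1, ConfigV2 and their members are assumptions made for this illustration, not the real PyConfig layout): once a member is inserted, code compiled against the old layout reads the wrong offset in the new one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout compiled into an embedding application. */
typedef struct {
    int isolated;
    int dev_mode;
    int verbose;
} ConfigV1;

/* Hypothetical layout used by a newer shared library:
   a member was inserted before "verbose". */
typedef struct {
    int isolated;
    int dev_mode;
    int new_member;   /* added in the newer version */
    int verbose;
} ConfigV2;

/* Return 1 if code built against ConfigV1 could safely use ConfigV2
   (same size, "verbose" at the same offset), 0 otherwise. */
static int layout_is_abi_compatible(void)
{
    assert(offsetof(ConfigV1, isolated) == 0);
    return sizeof(ConfigV1) == sizeof(ConfigV2)
        && offsetof(ConfigV1, verbose) == offsetof(ConfigV2, verbose);
}
```

An opaque structure accessed only through functions avoids this class of problem, since the caller never depends on the size or on member offsets.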

Since PyConfig was added in Python 3.8, the limited C API and the stable ABI have become more popular. For example, Rust bindings such as the PyO3 project can target the limited C API to embed Python in Rust (but it’s not the default). In practice, PyO3 can use the non-limited C API for specific needs, but doing so forfeits the advantages of the stable ABI.

Limitations of the legacy API

The legacy API to configure the Python initialization is based on the legacy Py_Initialize() function. It is now mostly deprecated:

  • Functions setting the initialization configuration, such as Py_SetPath(): deprecated in Python 3.11 and removed in Python 3.13.
  • Global configuration variables, such as Py_VerboseFlag: deprecated in Python 3.12 and scheduled for removal in Python 3.14.
  • Functions getting the current configuration, such as Py_GetPath(): deprecated in Python 3.13 and scheduled for removal in Python 3.15.

The legacy API doesn’t support the “Python Configuration” and the “Isolated Configuration” of the PEP 587 PyConfig API. It only provides a “legacy configuration” of Py_Initialize() which is in between, and also uses the legacy global configuration variables (such as Py_VerboseFlag).

Some options can only be set by environment variables, such as home, which is set by the PYTHONHOME environment variable. The problem is that environment variables are inherited by child processes, which can be surprising and unwanted behavior.

Some configuration options, such as configure_locale, simply cannot be set.

Limitations of the limited C API

The limited C API is a subset of the legacy API. For example, global configuration variables, such as Py_VerboseFlag, are not part of the limited C API.

While some functions were removed from the limited C API version 3.13, they are still part of the stable ABI. For example, an application built with the limited C API version 3.12 can still run with the Python 3.13 stable ABI.

Get the current configuration

PEP 587 has no API to get the current configuration, only to configure the Python initialization.

For example, the global configuration variable Py_UnbufferedStdioFlag was deprecated in Python 3.12 and using PyConfig.buffered_stdio is recommended instead. This only works to configure Python: there is no public API to get PyConfig.buffered_stdio.

Users of the limited C API are asking for a public API to get the current configuration.

Security fix

To fix CVE-2020-10735, a denial-of-service when converting a very large string to an integer (in base 10), it was discussed to add a new PyConfig member to stable branches, which affects the ABI.

Gregory P. Smith proposed a different API using a text-based configuration file to not be limited by PyConfig members: FR: Allow private runtime config to enable extending without breaking the PyConfig ABI (August 2022).

In the end, it was decided to not add a new PyConfig member to stable branches, but only add a new PyConfig.int_max_str_digits member to the development branch (which became Python 3.12). A dedicated private global variable (unrelated to PyConfig) is used in stable branches.

Redundancy between PyPreConfig and PyConfig

The Python preinitialization uses the PyPreConfig structure and the Python initialization uses the PyConfig structure. Both structures have four duplicated members: dev_mode, parse_argv, isolated and use_environment.

The redundancy is caused by the fact that the two structures are separated, whereas some PyConfig members are needed by the preinitialization.

Applications embedding Python

Examples:

On Linux, FreeBSD and macOS, applications are usually either statically linked to a libpython, or dynamically load a libpython. The libpython shared library is versioned, for example: libpython3.12.so for Python 3.12 on Linux.

The vim project can target the stable ABI. Usually, the “system Python” version is used. It’s not currently possible to select which Python version to use. Users would like the ability to select a newer Python on demand.

On Linux, another approach to deploy an application embedding Python, such as GIMP, is to include Python in a Flatpak, AppImage or Snap “container”. In this case, the application brings its own copy of Python with the container.

Libraries embedding Python

Examples:

Utilities creating standalone applications

These utilities create standalone applications; they are not linked to libpython.

Usage of a stable ABI

Ronald Oussoren:

For tools like py2app/py2exe/pyinstaller, it is pretty inconvenient to have to rebuild the launcher executable that’s used to start the packaged application when there’s a bug fix release of Python.

Gregory P. Smith:

You can’t extend a struct and assume embedding people all rebuild. They don’t. Real world embedding uses exist that use an installed Python minor version as a shared library. Update that to use a different sized struct in a public API and someone is going to have a bad time. That’s why I consider the struct frozen at rc1 time, even when only for use in the embedding / writing their own launcher case.

Colton Murphy:

I am trying to embed the Python interpreter using a non C language. I have to stick with the limited API and private structures for configuration in headers files is a no-no. Basically, I need to be able to allocate and configure everything using only exportable functions and the heap… no private structure details.

(…)

I am strictly limited to what’s in the shared library (DLL). I don’t have headers, I can’t statically “recompile” every time a new version of python comes out. That’s unmaintainable for me.

Milian Wolff:

IIUC then there’s still no non-deprecated API in the limited C API to customize the initialization, right? Can you then please reopen this task to indicate that this? Or should I report a separate issue to track this? Thank you

David Hewitt of the PyO3 project:

I think making the configuration structure opaque and using an API to set/get configuration by name is a welcome simplification:
  • It’s a smaller API for language bindings like PyO3 to wrap and re-expose, and
  • It’s easier for people to support multiple Python versions to embed into their application; no need to conditionally compile structure field access, can just use normal error handling if configuration values are not available for a specific version at runtime.

Specification

Add C API functions and structure to configure the Python initialization:

  • PyInitConfig opaque structure.
  • PyInitConfig_CreatePython().
  • PyInitConfig_CreateIsolated().
  • PyInitConfig_Free(config).
  • PyInitConfig_GetInt(config, name, &value).
  • PyInitConfig_GetStr(config, name, &value).
  • PyInitConfig_GetWStr(config, name, &value).
  • PyInitConfig_GetStrList(config, name, &length, &items).
  • PyInitConfig_FreeStrList().
  • PyInitConfig_GetWStrList(config, name, &length, &items).
  • PyInitConfig_FreeWStrList().
  • PyInitConfig_SetInt(config, name, value).
  • PyInitConfig_SetStr(config, name, value).
  • PyInitConfig_SetStrLocale(config, name, value).
  • PyInitConfig_SetWStr(config, name, value).
  • PyInitConfig_SetStrList(config, name, length, items).
  • PyInitConfig_SetStrLocaleList(config, name, length, items).
  • PyInitConfig_SetWStrList(config, name, length, items).
  • PyInitConfig_AddModule(config, name, initfunc).
  • Py_PreInitializeFromInitConfig(config).
  • Py_InitializeFromInitConfig(config).
  • PyInitConfig_GetError(config, &err_msg).
  • Py_ExitWithInitConfig(config).

Add C API and Python functions to get the current configuration:

  • PyConfig_Get(name).
  • PyConfig_GetInt(name, &value).
  • PyConfig_Keys().
  • sys.get_config(name).

The C API uses null-terminated UTF-8 encoded strings to refer to a configuration option.

All C API functions are added to the limited C API version 3.13.

The PyInitConfig structure is implemented by combining the three structures of the PyConfig API and has an inittab member as well:

  • PyPreConfig preconfig
  • PyConfig config
  • PyStatus status
  • struct _inittab *inittab for PyInitConfig_AddModule()

The PyStatus status is no longer separated, but part of the unified PyInitConfig structure, which makes the API easier to use.
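The design of an opaque structure that carries its own status can be sketched with a toy model. Everything below (ToyConfig and the ToyConfig_* functions) is a hypothetical stand-in written for this illustration; it is not the proposed implementation and only mimics the calling convention described in this PEP: options are set by name, functions return 0 or -1, and the error message is stored in the configuration object.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an opaque configuration object. In a public header,
   only "typedef struct ToyConfig ToyConfig;" would be visible. */
typedef struct ToyConfig {
    int64_t dev_mode;
    int64_t verbose;
    char err_msg[128];   /* the status lives inside the object */
    int has_error;
} ToyConfig;

static ToyConfig *ToyConfig_Create(void)
{
    /* calloc() zero-fills: options start at 0 and no error is set */
    return calloc(1, sizeof(ToyConfig));
}

static void ToyConfig_Free(ToyConfig *config)
{
    free(config);
}

/* Map an option name to its storage, or NULL if the name is unknown. */
static int64_t *toy_find_int(ToyConfig *config, const char *name)
{
    if (strcmp(name, "dev_mode") == 0) return &config->dev_mode;
    if (strcmp(name, "verbose") == 0) return &config->verbose;
    return NULL;
}

/* Return 0 on success; record an error in config and return -1 on error. */
static int ToyConfig_SetInt(ToyConfig *config, const char *name, int64_t value)
{
    assert(config != NULL && name != NULL);
    int64_t *slot = toy_find_int(config, name);
    if (slot == NULL) {
        snprintf(config->err_msg, sizeof(config->err_msg),
                 "unknown option: %s", name);
        config->has_error = 1;
        return -1;
    }
    *slot = value;
    return 0;
}

/* Set *err_msg and return 1 if an error is recorded;
   set *err_msg to NULL and return 0 otherwise. */
static int ToyConfig_GetError(ToyConfig *config, const char **err_msg)
{
    *err_msg = config->has_error ? config->err_msg : NULL;
    return config->has_error;
}
```

Because the caller only ever holds a pointer and passes option names as strings, members can be added to the structure without changing the size or layout seen by already compiled applications.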

Configuration Options

Configuration option names:

  • "allocator" (integer).
  • "argv" (string list).
  • "base_exec_prefix" (string).
  • "base_executable" (string).
  • "base_prefix" (string).
  • "buffered_stdio" (integer).
  • "bytes_warning" (integer).
  • "check_hash_pycs_mode" (string).
  • "code_debug_ranges" (integer).
  • "coerce_c_locale" (integer).
  • "coerce_c_locale_warn" (integer).
  • "configure_c_stdio" (integer).
  • "configure_locale" (integer).
  • "cpu_count" (integer).
  • "dev_mode" (integer).
  • "dump_refs" (integer).
  • "dump_refs_file" (string).
  • "exec_prefix" (string).
  • "executable" (string).
  • "faulthandler" (integer).
  • "filesystem_encoding" (string).
  • "filesystem_errors" (string).
  • "hash_seed" (unsigned long).
  • "home" (string).
  • "import_time" (integer).
  • "inspect" (integer).
  • "install_signal_handlers" (integer).
  • "int_max_str_digits" (integer).
  • "interactive" (integer).
  • "isolated" (integer).
  • "legacy_windows_fs_encoding" (integer).
  • "legacy_windows_stdio" (integer): only on Windows.
  • "malloc_stats" (integer).
  • "module_search_paths" (string list).
  • "module_search_paths_set" (integer).
  • "optimization_level" (integer).
  • "orig_argv" (string list).
  • "parse_argv" (integer).
  • "parser_debug" (integer).
  • "pathconfig_warnings" (integer).
  • "perf_profiling" (integer).
  • "platlibdir" (string).
  • "prefix" (string).
  • "program_name" (string).
  • "pycache_prefix" (string).
  • "pythonpath_env" (string).
  • "quiet" (integer).
  • "run_command" (string).
  • "run_filename" (string).
  • "run_module" (string).
  • "run_presite" (string): only on a Python debug build.
  • "safe_path" (integer).
  • "show_ref_count" (integer).
  • "site_import" (integer).
  • "skip_source_first_line" (integer).
  • "stdio_encoding" (string).
  • "stdio_errors" (string).
  • "stdlib_dir" (string).
  • "sys_path_0" (string).
  • "tracemalloc" (integer).
  • "use_environment" (integer).
  • "use_frozen_modules" (integer).
  • "use_hash_seed" (integer).
  • "user_site_directory" (integer).
  • "utf8_mode" (integer).
  • "verbose" (integer).
  • "warn_default_encoding" (integer).
  • "warnoptions" (string list).
  • "write_bytecode" (integer).
  • "xoptions" (string list).

Configuration options are named after PyPreConfig and PyConfig structure members even if PyConfig_Get() can also get values from the sys module. See the PyPreConfig documentation and the PyConfig documentation.

Deprecating and removing configuration options is out of the scope of the PEP and should be discussed on a case by case basis.

Preconfiguration

Calling Py_PreInitializeFromInitConfig() preinitializes Python. For example, it sets the memory allocator, and can configure the LC_CTYPE locale and the standard C streams such as stdin and stdout.

The following option names can only be set during the Python preconfiguration:

  • "allocator",
  • "coerce_c_locale",
  • "coerce_c_locale_warn",
  • "configure_locale",
  • "legacy_windows_fs_encoding",
  • "utf8_mode".

Trying to set these options after Python preinitialization fails with an error.
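The rule above can be pictured with a small sketch. The helper names (toy_is_preconfig_only, toy_check_set_allowed) and the preinitialized flag are assumptions made for this illustration; only the option names come from the list above. A real implementation would also record an error message in the configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Options that can only be set during the Python preconfiguration
   (names taken from the list in this section). */
static const char *const toy_preconfig_only[] = {
    "allocator",
    "coerce_c_locale",
    "coerce_c_locale_warn",
    "configure_locale",
    "legacy_windows_fs_encoding",
    "utf8_mode",
};

/* Return 1 if name is a preconfiguration-only option, 0 otherwise. */
static int toy_is_preconfig_only(const char *name)
{
    assert(name != NULL);
    size_t n = sizeof(toy_preconfig_only) / sizeof(toy_preconfig_only[0]);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(toy_preconfig_only[i], name) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Return 0 if the option may still be set, -1 if it is too late.
   "preinitialized" is non-zero once the preinitialization has run. */
static int toy_check_set_allowed(const char *name, int preinitialized)
{
    if (preinitialized && toy_is_preconfig_only(name)) {
        return -1;
    }
    return 0;
}
```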

The PyInitConfig_SetStrLocale() and PyInitConfig_SetStrLocaleList() functions cannot be called before the Python preinitialization.

Create PyInitConfig

PyInitConfig structure:
Opaque structure to configure the Python preinitialization and the Python initialization.
PyInitConfig* PyInitConfig_CreatePython(void):
Create a new initialization configuration using default values of the Python Configuration.

It must be freed with PyInitConfig_Free().

Return NULL on memory allocation failure.

PyInitConfig* PyInitConfig_CreateIsolated(void):
Similar to PyInitConfig_CreatePython(), but use default values of the Isolated Configuration.
void PyInitConfig_Free(PyInitConfig *config):
Free memory of an initialization configuration.

Get PyInitConfig Options

int PyInitConfig_GetInt(PyInitConfig *config, const char *name, int64_t *value):
Get an integer configuration option.
  • Set *value, and return 0 on success.
  • Set an error in config and return -1 on error.
int PyInitConfig_GetStr(PyInitConfig *config, const char *name, char **value):
Get a string configuration option as a null-terminated UTF-8 encoded string.
  • Set *value, and return 0 on success.
  • Set an error in config and return -1 on error.

On success, the string must be released with free(value).

int PyInitConfig_GetWStr(PyInitConfig *config, const char *name, wchar_t **value):
Get a string configuration option as a null-terminated wide string.
  • Set *value and return 0 on success.
  • Set an error in config and return -1 on error.

On success, the string must be released with free(value).

int PyInitConfig_GetStrList(PyInitConfig *config, const char *name, size_t *length, char ***items):
Get a string list configuration option as an array of null-terminated UTF-8 encoded strings.
  • Set *length and *items, and return 0 on success.
  • Set an error in config and return -1 on error.

On success, the string list must be released with PyInitConfig_FreeStrList(length, items).

void PyInitConfig_FreeStrList(size_t length, char **items):
Free memory of a string list created by PyInitConfig_GetStrList().
int PyInitConfig_GetWStrList(PyInitConfig *config, const char *name, size_t *length, wchar_t ***items):
Get a string list configuration option as an array of null-terminated wide strings.
  • Set *length and *items, and return 0 on success.
  • Set an error in config and return -1 on error.

On success, the string list must be released with PyInitConfig_FreeWStrList(length, items).

void PyInitConfig_FreeWStrList(size_t length, wchar_t **items):
Free memory of a string list created by PyInitConfig_GetWStrList().

Set PyInitConfig Options

int PyInitConfig_SetInt(PyInitConfig *config, const char *name, int64_t value):
Set an integer configuration option.
  • Return 0 on success.
  • Set an error in config and return -1 on error.
int PyInitConfig_SetStr(PyInitConfig *config, const char *name, const char *value):
Set a string configuration option from a null-terminated UTF-8 encoded string. The string is copied.
  • Return 0 on success.
  • Set an error in config and return -1 on error.
int PyInitConfig_SetStrLocale(PyInitConfig *config, const char *name, const char *value):
Set a string configuration option from a null-terminated bytes string encoded in the locale encoding. The string is copied.

The bytes string is decoded by Py_DecodeLocale(). Py_PreInitializeFromInitConfig() must be called before calling this function.

  • Return 0 on success.
  • Set an error in config and return -1 on error.
int PyInitConfig_SetWStr(PyInitConfig *config, const char *name, const wchar_t *value):
Set a string configuration option from a null-terminated wide string. The string is copied.
  • Return 0 on success.
  • Set an error in config and return -1 on error.
int PyInitConfig_SetStrList(PyInitConfig *config, const char *name, size_t length, char * const *items):
Set a string list configuration option from an array of null-terminated UTF-8 encoded strings. The string list is copied.
  • Return 0 on success.
  • Set an error in config and return -1 on error.
int PyInitConfig_SetStrLocaleList(PyInitConfig *config, const char *name, size_t length, char * const *items):
Set a string list configuration option from an array of null-terminated bytes strings encoded in the locale encoding. The string list is copied.

The bytes strings are decoded by Py_DecodeLocale(). Py_PreInitializeFromInitConfig() must be called before calling this function.

  • Return 0 on success.
  • Set an error in config and return -1 on error.
int PyInitConfig_SetWStrList(PyInitConfig *config, const char *name, size_t length, wchar_t * const *items):
Set a string list configuration option from an array of null-terminated wide strings. The string list is copied.
  • Return 0 on success.
  • Set an error in config and return -1 on error.
int PyInitConfig_AddModule(PyInitConfig *config, const char *name, PyObject* (*initfunc)(void)):
Add a built-in extension module to the table of built-in modules.

The new module can be imported by the name name, and uses the function initfunc as the initialization function called on the first attempted import.

  • Return 0 on success.
  • Set an error in config and return -1 on error.

If Python is initialized multiple times, PyInitConfig_AddModule() must be called at each Python initialization.

Similar to the PyImport_AppendInittab() function.

Initialize Python

int Py_PreInitializeFromInitConfig(PyInitConfig *config):
Preinitialize Python from the initialization configuration.
  • Return 0 on success.
  • Set an error in config and return -1 on error.
int Py_InitializeFromInitConfig(PyInitConfig *config):
Initialize Python from the initialization configuration.
  • Return 0 on success.
  • Set an error in config and return -1 on error.

Error handling

int PyInitConfig_GetError(PyInitConfig* config, const char **err_msg):
Get the config error message.
  • Set *err_msg and return 1 if an error is set.
  • Set *err_msg to NULL and return 0 otherwise.

An error message is a UTF-8 encoded string.

The error message remains valid until another PyInitConfig function is called with config. The caller doesn’t have to free the error message.

void Py_ExitWithInitConfig(PyInitConfig *config):
Exit Python and free memory of an initialization configuration.

If an error message is set, display the error message.

The function does not return.

Get current configuration

PyObject* PyConfig_Get(const char *name):
Get the current value of a configuration option as an object.
  • Return a new reference on success.
  • Set an exception and return NULL on error.

The object type depends on the option.

The following options are read from the sys module:

  • "argv": sys.argv.
  • "base_exec_prefix": sys.base_exec_prefix.
  • "base_executable": sys._base_executable.
  • "base_prefix": sys.base_prefix.
  • "exec_prefix": sys.exec_prefix.
  • "executable": sys.executable.
  • "module_search_paths": sys.path.
  • "orig_argv": sys.orig_argv.
  • "platlibdir": sys.platlibdir.
  • "prefix": sys.prefix.
  • "pycache_prefix": sys.pycache_prefix.
  • "stdlib_dir": sys._stdlib_dir.
  • "warnoptions": sys.warnoptions.
  • "write_bytecode": not sys.dont_write_bytecode (opposite value).
  • "xoptions": sys._xoptions.

Other options are read from the internal PyPreConfig and PyConfig structures.

The function cannot be called before Python initialization nor after Python finalization.

int PyConfig_GetInt(const char *name, int *value):
Similar to PyConfig_Get(), but get the value as an integer.
  • Set *value and return 0 on success.
  • Set an exception and return -1 on error.
PyObject* PyConfig_Keys(void):
Get all configuration option names as a tuple.

Set an exception and return NULL on error.

sys.get_config()

Add sys.get_config(name: str) function which calls PyConfig_Get():

  • Return the configuration option value on success.
  • Raise an exception on error.

Examples

Initialize Python

Example setting some configuration options of different types to initialize Python.

void init_python(void)
{
    PyInitConfig *config = PyInitConfig_CreatePython();
    if (config == NULL) {
        printf("Init allocation error\n");
        return;
    }

    // Set an integer (dev mode)
    if (PyInitConfig_SetInt(config, "dev_mode", 1) < 0) {
        goto error;
    }

    // Set a list of wide strings (argv)
    wchar_t *argv[] = {L"my_program", L"-c", L"pass"};
    if (PyInitConfig_SetWStrList(config, "argv",
                                 Py_ARRAY_LENGTH(argv), argv) < 0) {
        goto error;
    }

    // Set a wide string (program name)
    if (PyInitConfig_SetWStr(config, "program_name", L"my_program") < 0) {
        goto error;
    }

    // Set a list of UTF-8 encoded strings (xoptions).
    char* xoptions[] = {"faulthandler"};
    if (PyInitConfig_SetStrList(config, "xoptions",
                                Py_ARRAY_LENGTH(xoptions), xoptions) < 0) {
        goto error;
    }

    // Initialize Python with the configuration
    if (Py_InitializeFromInitConfig(config) < 0) {
        goto error;
    }
    PyInitConfig_Free(config);
    return;

error:
    // Display the error message and exit the process
    // with a non-zero exit code
    Py_ExitWithInitConfig(config);
}

Get the verbose option

Example getting the configuration option verbose:

int get_verbose(void)
{
    int verbose;
    if (PyConfig_GetInt("verbose", &verbose) < 0) {
        // Silently ignore the error
        PyErr_Clear();
        return -1;
    }
    return verbose;
}

On error, the function silently ignores the error and returns -1.

Implementation

Backwards Compatibility

Changes are fully backward compatible. Only new APIs are added.

Existing APIs, such as the PyConfig C API (PEP 587), are left unchanged.

Rejected Ideas

Configuration as text

It was proposed to provide the configuration as text to make the API compatible with the stable ABI and to allow custom options.

Example:

# integer
bytes_warning = 2

# string
filesystem_encoding = "utf8"   # comment

# list of strings
argv = ['python', '-c', 'code']

The API would take the configuration as a string, not as a file. Example with a hypothetical PyInit_SetConfig() function:

void stable_abi_init_demo(int set_path)
{
    PyInit_SetConfig(
        "isolated = 1\n"
        "argv = ['python', '-c', 'code']\n"
        "filesystem_encoding = 'utf-8'\n"
    );
    if (set_path) {
        PyInit_SetConfig("pythonpath = '/my/path'");
    }
}

The example ignores error handling to make it easier to read.

The problem is that generating such configuration text requires adding quotes to strings and escaping quotes inside strings. Formatting an array of strings becomes non-trivial.

Providing an API to format a string or an array of strings is not really worth it, whereas Python can provide directly an API to set a configuration option where the value is passed directly as a string or an array of strings. It avoids giving special meaning to some characters, such as newline characters, which would have to be escaped.
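To give an idea of the burden this would put on callers, here is a minimal sketch of the kind of helper an application would need just to pass one string safely. The function name quote_config_string() and the escaping rules (backslash escapes for backslash, single quote and newline) are assumptions made for this illustration; the rejected text format never specified them.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of value wrapped in single quotes, with
   backslash, single quote and newline characters escaped.
   Return NULL on memory allocation failure.
   The caller must release the result with free(). */
static char *quote_config_string(const char *value)
{
    assert(value != NULL);
    size_t len = strlen(value);
    /* Worst case: every character needs two bytes,
       plus two quotes and the terminating null byte. */
    char *out = malloc(2 * len + 3);
    if (out == NULL) {
        return NULL;
    }
    char *p = out;
    *p++ = '\'';
    for (size_t i = 0; i < len; i++) {
        char c = value[i];
        if (c == '\n') {
            *p++ = '\\';
            *p++ = 'n';
        }
        else if (c == '\\' || c == '\'') {
            *p++ = '\\';
            *p++ = c;
        }
        else {
            *p++ = c;
        }
    }
    *p++ = '\'';
    *p = '\0';
    return out;
}
```

A list of strings would need the same treatment for each item plus brackets and separators, whereas a function taking the value directly as a string or an array of strings needs none of this.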

Refer to an option with an integer

Using strings to refer to a configuration option requires comparing strings which can be slower than comparing integers.

Use integers, similar to type “slots” such as Py_tp_doc, to refer to a configuration option. The const char *name parameter is replaced with int option.

Accepting custom options is more likely to cause conflicts when using integers, since it’s harder to maintain “namespaces” (ranges) for integer options. Using strings, a simple prefix with a colon separator can be used.

Using integers also requires maintaining a list of integer constants, which makes the C API and the Python API larger.

Python 3.13 only has around 62 configuration options, and so performance is not really a blocker issue. If better performance is needed later, a hash table can be used to get an option by its name.

If getting a configuration option is used in hot code, the value can be read once and cached. By the way, most configuration options cannot be changed at runtime.
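The two points above (a lookup by name is cheap for a small table, and hot code can cache the result) can be sketched as follows. The table, its values and the toy_* function names are hypothetical; only the option names are taken from this PEP.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option table; the values are made up for this sketch. */
typedef struct {
    const char *name;
    int value;
} ToyOption;

static const ToyOption toy_options[] = {
    {"bytes_warning", 0},
    {"dev_mode", 1},
    {"verbose", 2},
};

/* Linear search by name: at most one strcmp() call per table entry.
   Set *value and return 0 on success, return -1 if the name is unknown. */
static int toy_get_int(const char *name, int *value)
{
    assert(name != NULL && value != NULL);
    size_t n = sizeof(toy_options) / sizeof(toy_options[0]);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(toy_options[i].name, name) == 0) {
            *value = toy_options[i].value;
            return 0;
        }
    }
    return -1;
}

/* Hot path: look the option up once, then reuse the cached value.
   Return -1 if the option does not exist. */
static int toy_cached_verbose(void)
{
    static int cached = -1;
    static int is_cached = 0;
    if (!is_cached) {
        if (toy_get_int("verbose", &cached) < 0) {
            cached = -1;
        }
        is_cached = 1;
    }
    return cached;
}
```

Caching like this is only safe for options that do not change at runtime, which is the case for most of them.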

Discussions


Source: https://github.com/python/peps/blob/main/peps/pep-0741.rst

Last modified: 2024-02-09 00:00:51 GMT