PEP 720 – Cross-compiling Python packages

Author:
Filipe Laíns <lains at riseup.net>
PEP-Delegate:

Status:
Draft
Type:
Informational
Created:
01-Jul-2023
Python-Version:
3.12

Abstract

This PEP attempts to document the status of cross-compilation of downstream projects.

It should give an overview of the approaches currently used by distributors (Linux distros, WASM environment providers, etc.) to cross-compile downstream projects (3rd party extensions, etc.).

Motivation

We write this PEP to express the challenges in cross-compilation and act as a supporting document in future improvement proposals.

Analysis

Introduction

There are a couple of different approaches being used to tackle this, with different levels of interaction required from the user, but they all require a significant amount of effort. This is due to the lack of standardized cross-compilation infrastructure in the Python packaging ecosystem, which itself stems from the complexity of cross-builds, making it a huge undertaking.

Upstream support

Some major projects like CPython, setuptools, etc. provide some support to help with cross-compilation, but it’s unofficial and on a best-effort basis. For example, the sysconfig module allows overriding the data module name via the _PYTHON_SYSCONFIGDATA_NAME environment variable, something that is required for cross-builds, and setuptools accepts patches [1] to tweak/fix its logic to be compatible with popular “environment faking” workflows [2].
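As a rough illustration of the override mentioned above, the sketch below imitates how CPython is understood to pick its sysconfig data module name on POSIX builds. The helper sysconfigdata_name is hypothetical, the default name format is an assumption based on CPython's private implementation (which may change between versions), and the target module name is made up.

```python
import os
import sys


def sysconfigdata_name(environ=None):
    """Return the sysconfig data module name a POSIX CPython build would load.

    Hypothetical helper: it imitates CPython's private lookup and is not a
    public API.
    """
    environ = os.environ if environ is None else environ
    # Assumed default format: _sysconfigdata_{abiflags}_{platform}_{multiarch}
    abiflags = getattr(sys, "abiflags", "")
    multiarch = getattr(sys.implementation, "_multiarch", "")
    default = f"_sysconfigdata_{abiflags}_{sys.platform}_{multiarch}"
    # The environment variable, when set, takes precedence over the default.
    return environ.get("_PYTHON_SYSCONFIGDATA_NAME", default)


# A cross-build would point the variable at the target's data module.
# The module name below is made up for illustration.
target_env = {"_PYTHON_SYSCONFIGDATA_NAME": "_sysconfigdata__linux_aarch64-linux-gnu"}
print(sysconfigdata_name(target_env))
print(sysconfigdata_name({}))
```

In a real cross-build, the named module would also need to be importable, which usually means adding the target's library directory to the module search path.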

The lack of first-party support in upstream projects leads to cross-compilation being fragile and requiring a significant effort from users, but at the same time, the lack of standardization makes it harder for upstreams to improve support as there’s no clarity on how this feature should be provided.

Projects with decent cross-build support

A few modern Python package build-backends do offer decent cross-compilation support, namely scikit-build and meson-python. Both projects integrate mature external build-systems into Python packaging (CMake and Meson, respectively), so cross-build support is inherited from them.
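To give a feel for what inheriting cross-build support from Meson looks like in practice, here is a minimal sketch that renders a Meson cross file, the file Meson uses to describe the target toolchain and machine. The helper render_meson_cross_file is hypothetical, and the toolchain names and machine values are assumptions chosen for illustration.

```python
def render_meson_cross_file(binaries, host_machine):
    """Render a minimal Meson cross file from two plain dictionaries."""

    def section(name, entries):
        lines = [f"[{name}]"]
        # Meson expects string values to be quoted.
        lines += [f"{key} = '{value}'" for key, value in entries.items()]
        return "\n".join(lines)

    return (
        section("binaries", binaries)
        + "\n\n"
        + section("host_machine", host_machine)
        + "\n"
    )


# Example values only: an assumed aarch64 Linux target and GNU toolchain.
cross_file = render_meson_cross_file(
    binaries={"c": "aarch64-linux-gnu-gcc", "strip": "aarch64-linux-gnu-strip"},
    host_machine={
        "system": "linux",
        "cpu_family": "aarch64",
        "cpu": "aarch64",
        "endian": "little",
    },
)
print(cross_file)
```

With meson-python, a file like this is typically handed to Meson through the build front-end's config settings (for example, a setup-args entry carrying --cross-file); the exact spelling should be checked against the meson-python documentation for the version in use.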

Downstream approaches

Cross-compilation approaches fall on a spectrum that goes from requiring extensive user interaction, by design, to (ideally) almost none. They are usually based on one of two main strategies: using a cross-build environment, or faking the target environment.

Cross-build environment

This consists of running the Python interpreter normally and utilizing the cross-build support provided by the project’s build-system. However, as we saw above, upstream support is lacking, so this approach only works for a small-ish set of projects. When this fails, the usual strategy is to patch the build-system code to use the correct toolchain, system details, etc. [3].

Since this approach often requires package-specific patching, it requires a lot of user interaction.

Examples

python-for-android, kivy-ios, etc.

Faking the target environment

Aiming to drop the requirement for user input, a popular approach is trying to fake the target environment. It generally consists of monkeypatching the Python interpreter to get it to mimic the interpreter on the target system, which involves changing many of the sys module attributes, the sysconfig data, etc. Using this strategy, build-backends do not need to have any cross-build support, and should just work without any code changes.

Unfortunately, though, it isn’t possible to truly fake the target environment. There are many reasons for this, one of the main ones being that it breaks code that actually needs to introspect the running interpreter. As a result, monkeypatching Python to look like the target is very tricky: to achieve the least amount of breakage, we can only patch certain aspects of the interpreter. Consequently, build-backends may need some code changes, but these are generally much smaller than with the previous approach. This is an inherent limitation of the technique, meaning this strategy still requires some user interaction.

Nonetheless, this strategy still works out-of-the-box with significantly more projects than the approach above, and requires much less effort in these cases. It is successful in decreasing the amount of user interaction needed, even though it doesn’t succeed in being generic.
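The following is a minimal sketch of the monkeypatching idea and of its limitation, not the implementation of any particular tool. The context manager fake_target is hypothetical and the target values are arbitrary. It patches a couple of sys attributes and then shows that code inspecting the real interpreter, here via struct, still sees the build machine.

```python
import contextlib
import struct
import sys
from unittest import mock


@contextlib.contextmanager
def fake_target(**sys_attrs):
    """Temporarily replace selected ``sys`` attributes with target values."""
    with contextlib.ExitStack() as stack:
        for name, value in sys_attrs.items():
            # create=True lets us set attributes missing on the build machine.
            stack.enter_context(mock.patch.object(sys, name, value, create=True))
        yield


native_pointer_size = struct.calcsize("P")

# Arbitrary example values for a 32-bit target.
with fake_target(platform="emscripten", maxsize=2**31 - 1):
    faked_platform = sys.platform
    faked_maxsize = sys.maxsize
    # Code that inspects the real interpreter is not fooled by the patch.
    pointer_size_while_faked = struct.calcsize("P")

print(faked_platform, faked_maxsize)
print(pointer_size_while_faked == native_pointer_size)
```

Build code that only reads the patched attributes will behave as if it ran on the target, while anything that measures the running interpreter directly will still report build-machine values.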

Examples

crossenv, conda-forge, etc.

Environment introspection

As explained above, most build system code is written with the assumption that the target system is the same as where the build is occurring, so introspection is usually used to guide the build.

In this section, we try to document most of the ways this is accomplished. It should give a decent overview of the environment details that are required by build systems.

Snippet Description Variance
>>> importlib.machinery.EXTENSION_SUFFIXES
[
   '.cpython-311-x86_64-linux-gnu.so',
   '.abi3.so',
   '.so',
]
Extension (native module) suffixes supported by this interpreter. This is implementation-defined, but it usually differs based on the implementation, system architecture, build configuration, Python language version, and implementation version — if one exists.
>>> importlib.machinery.SOURCE_SUFFIXES
['.py']
Source (pure-Python) suffixes supported by this interpreter. This is implementation-defined, but it usually doesn’t differ (outside exotic implementations or systems).
>>> importlib.machinery.all_suffixes()
[
   '.py',
   '.pyc',
   '.cpython-311-x86_64-linux-gnu.so',
   '.abi3.so',
   '.so',
]
All module file suffixes supported by this interpreter. It should be the union of all importlib.machinery.*_SUFFIXES attributes. This is implementation-defined, but it usually differs based on the implementation, system architecture, build configuration, Python language version, and implementation version — if one exists. See the entries above for more information.
>>> sys.abiflags
''
ABI flags, as specified in PEP 3149. Differs based on the build configuration.
>>> sys.api_version
1013
C API version. Differs based on the Python installation.
>>> sys.base_prefix
'/usr'
Prefix of the installation-wide directories where platform independent files are installed. Differs based on the platform, and installation.
>>> sys.base_exec_prefix
'/usr'
Prefix of the installation-wide directories where platform dependent files are installed. Differs based on the platform, and installation.
>>> sys.byteorder
'little'
Native byte order. Differs based on the platform.
>>> sys.builtin_module_names
('_abc', '_ast', '_codecs', ...)
Names of all modules that are compiled into the Python interpreter. Differs based on the platform, system architecture, and build configuration.
>>> sys.exec_prefix
'/usr'
Prefix of the site-specific directories where platform dependent files are installed. Because it concerns the site-specific directories, in standard virtual environment implementations, it will be a virtual-environment-specific path. Differs based on the platform, installation, and environment.
>>> sys.executable
'/usr/bin/python'
Path of the Python interpreter being used. Differs based on the installation.
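>>> with open(sys.executable, 'rb') as f:
...   header = f.read(4)
...   if is_elf := (header == b'\x7fELF'):
...     # EI_CLASS byte: 1 means 32-bit, 2 means 64-bit
...     elf_class = f.read(1)[0]
...     # The ELF header is 52 bytes (32-bit) or 64 bytes (64-bit)
...     size = {1: 52, 2: 64}[elf_class]
...     elf_header = f.read(size - 5)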
Whether the Python interpreter is an ELF file, and the ELF header. This approach is sometimes used to identify the target architecture of the installation. Differs based on the installation.
>>> sys.float_info
sys.float_info(
   max=1.7976931348623157e+308,
   max_exp=1024,
   max_10_exp=308,
   min=2.2250738585072014e-308,
   min_exp=-1021,
   min_10_exp=-307,
   dig=15,
   mant_dig=53,
   epsilon=2.220446049250313e-16,
   radix=2,
   rounds=1,
)
Low level information about the float type, as defined by float.h. Differs based on the architecture, and platform.
>>> sys.getandroidapilevel()
21
Integer representing the Android API level. Differs based on the platform.
>>> sys.getwindowsversion()
sys.getwindowsversion(
   major=10,
   minor=0,
   build=19045,
   platform=2,
   service_pack='',
)
Windows version of the system. Differs based on the platform.
>>> sys.hexversion
0x30b03f0
Python version encoded as an integer. Differs based on the Python language version.
>>> sys.implementation
namespace(
   name='cpython',
   cache_tag='cpython-311',
   version=sys.version_info(
      major=3,
      minor=11,
      micro=3,
      releaselevel='final',
      serial=0,
   ),
   hexversion=0x30b03f0,
   _multiarch='x86_64-linux-gnu',
)
Interpreter implementation details. Differs based on the interpreter implementation, Python language version, and implementation version — if one exists. It may also include architecture-dependent information, so it may also differ based on the system architecture.
>>> sys.int_info
sys.int_info(
   bits_per_digit=30,
   sizeof_digit=4,
   default_max_str_digits=4300,
   str_digits_check_threshold=640,
)
Low level information about Python’s internal integer representation. Differs based on the architecture, platform, implementation, build, and runtime flags.
>>> sys.maxsize
0x7fffffffffffffff
Maximum value a variable of type Py_ssize_t can take. Differs based on the architecture, platform, and implementation.
>>> sys.maxunicode
0x10ffff
Value of the largest Unicode code point. Differs based on the implementation, and on Python versions older than 3.3, the build.
>>> sys.platform
'linux'
Platform identifier. Differs based on the platform.
>>> sys.prefix
'/usr'
Prefix of the site-specific directories where platform independent files are installed. Because it concerns the site-specific directories, in standard virtual environment implementations, it will be a virtual-environment-specific path. Differs based on the platform, installation, and environment.
>>> sys.platlibdir
'lib'
Platform-specific library directory. Differs based on the platform, and vendor.
>>> sys.version_info
sys.version_info(
   major=3,
   minor=11,
   micro=3,
   releaselevel='final',
   serial=0,
)
Python language version implemented by the interpreter. Differs if the target Python version is not the same [4].
>>> sys.thread_info
sys.thread_info(
   name='pthread',
   lock='semaphore',
   version='NPTL 2.37',
)
Information about the thread implementation. Differs based on the platform, and implementation.
>>> sys.winver
'3.8-32'
Version number used to form Windows registry keys. Differs based on the platform, and implementation.
>>> sysconfig.get_config_vars()
{ ... }
>>> sysconfig.get_config_var(...)
...
Python distribution configuration variables. It includes a set of variables [5] — like prefix, exec_prefix, etc. — based on the running context [6], and may include some extra variables based on the Python implementation and system.

In CPython and most other implementations that use the same build-system, the “extra” variables mentioned above are: on POSIX, all variables from the Makefile used to build the interpreter, and on Windows, usually only a small subset of those [7], such as EXT_SUFFIX, BINDIR, etc.

This is implementation-defined, but it usually differs between non-identical builds. Please refer to the sysconfig configuration variables table for an overview of the different configuration variables that are usually present.
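The snippets in the table above can be gathered into a single snapshot, which is roughly the information a cross-build tool would need to obtain for the target rather than for the build machine. This is only a sketch: the function introspection_snapshot and the selection of fields are assumptions made for illustration, not an agreed-upon format.

```python
import importlib.machinery
import json
import sys
import sysconfig


def introspection_snapshot():
    """Collect a few interpreter details that build systems commonly query."""
    return {
        "extension_suffixes": list(importlib.machinery.EXTENSION_SUFFIXES),
        # sys.abiflags is not available on every platform.
        "abiflags": getattr(sys, "abiflags", None),
        "byteorder": sys.byteorder,
        "platform": sys.platform,
        "implementation": sys.implementation.name,
        "version": list(sys.version_info[:3]),
        # Common idiom: Py_ssize_t is wider than 32 bits on 64-bit builds.
        "is_64bit": sys.maxsize > 2**32,
        # get_config_var() returns None for variables the build does not define.
        "ext_suffix": sysconfig.get_config_var("EXT_SUFFIX"),
        "soabi": sysconfig.get_config_var("SOABI"),
    }


print(json.dumps(introspection_snapshot(), indent=2))
```

Running this on the build machine describes the build machine only; when cross-compiling, every value in the snapshot may differ for the target.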

CPython (and similar)

sysconfig configuration variables
Name Example Value Description Variance
SOABI cpython-311-x86_64-linux-gnu ABI string — defined by PEP 3149. Differs based on the implementation, system architecture, Python language version, and implementation version — if one exists.
SHLIB_SUFFIX .so Shared library suffix. Differs based on the platform.
EXT_SUFFIX .cpython-311-x86_64-linux-gnu.so Interpreter-specific Python extension (native module) suffix, generally defined as .{SOABI}{SHLIB_SUFFIX}. Differs based on the implementation, system architecture, Python language version, and implementation version, if one exists.
LDLIBRARY libpython3.11.so Shared libpython library name — if available. If unavailable [8], the variable will be empty, if available, the library should be located in LIBDIR. Differs based on the implementation, system architecture, build configuration, Python language version, and implementation version — if one exists.
PY3LIBRARY libpython3.so Shared Python 3 only (major version bound only) [9] libpython library name — if available. If unavailable [8], the variable will be empty, if available, the library should be located in LIBDIR. Differs based on the implementation, system architecture, build configuration, Python language version, and implementation version — if one exists.
LIBRARY libpython3.11.a Static libpython library name — if available. If unavailable [8], the variable will be empty, if available, the library should be located in LIBDIR. Differs based on the implementation, system architecture, build configuration, Python language version, and implementation version — if one exists.
Py_DEBUG 0 Whether this is a debug build. Differs based on the build configuration.
WITH_PYMALLOC 1 Whether this build has pymalloc support. Differs based on the build configuration.
Py_TRACE_REFS 0 Whether reference tracing (debug build only) is enabled. Differs based on the build configuration.
Py_UNICODE_SIZE Size of the Py_UNICODE object, in bytes. This variable is only present in CPython versions older than 3.3, and was commonly used to detect if the build uses UCS2 or UCS4 for unicode objects — before PEP 393. Differs based on the build configuration.
Py_ENABLE_SHARED 1 Whether a shared libpython is available. Differs based on the build configuration.
PY_ENABLE_SHARED 1 Whether a shared libpython is available. Differs based on the build configuration.
CC gcc The C compiler used to build the Python distribution. Differs based on the build configuration.
CXX g++ The C++ compiler used to build the Python distribution. Differs based on the build configuration.
CFLAGS -DNDEBUG -g -fwrapv ... The C compiler flags used to build the Python distribution. Differs based on the build configuration.
py_version 3.11.3 Full form of the Python version. Differs based on the Python language version.
py_version_short 3.11 Custom form of the Python version, containing only the major and minor numbers. Differs based on the Python language version.
py_version_nodot 311 Custom form of the Python version, containing only the major and minor numbers, and no dots. Differs based on the Python language version.
prefix /usr Same as sys.prefix, please refer to the entry in table above. Differs based on the platform, installation, and environment.
base /usr Same as sys.prefix, please refer to the entry in table above. Differs based on the platform, installation, and environment.
exec_prefix /usr Same as sys.exec_prefix, please refer to the entry in table above. Differs based on the platform, installation, and environment.
platbase /usr Same as sys.exec_prefix, please refer to the entry in table above. Differs based on the platform, installation, and environment.
installed_base /usr Same as sys.base_prefix, please refer to the entry in table above. Differs based on the platform, and installation.
installed_platbase /usr Same as sys.base_exec_prefix, please refer to the entry in table above. Differs based on the platform, and installation.
platlibdir lib Same as sys.platlibdir, please refer to the entry in table above. Differs based on the platform, and vendor.
SIZEOF_* 4 Size of a certain C type (double, float, etc.). Differs based on the system architecture, and build details.
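Since several of the variables in the table may be empty or missing depending on the build and platform, code that reads them should tolerate both cases. The sketch below shows one way to do that for the libpython entries; the helper libpython_names is hypothetical, and the example mapping simply reuses the example values from the table.

```python
import sysconfig


def libpython_names(config_vars):
    """Return the libpython file names advertised by a config-variable mapping.

    Empty or missing values are treated as "not available".
    """
    names = {}
    for key in ("LDLIBRARY", "PY3LIBRARY", "LIBRARY"):
        value = config_vars.get(key)
        if value:
            names[key] = value
    return names


# Example values taken from the table above, with no static library.
example = {
    "LDLIBRARY": "libpython3.11.so",
    "PY3LIBRARY": "libpython3.so",
    "LIBRARY": "",
}
print(libpython_names(example))

# The same helper applied to the running interpreter's configuration.
print(libpython_names(sysconfig.get_config_vars()))
```

Taking a plain mapping as input, instead of calling sysconfig directly, makes it possible to feed the helper configuration data that describes the target.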

Relevant Information

There are some bits of information required by build systems — eg. platform particularities — scattered across many places, and it often is difficult to identify code with assumptions based on them. In this section, we try to document the most relevant cases.

When should extensions be linked against libpython?

Short answer
Link on Windows. Do not link on POSIX platforms, except on Android, Cygwin, and other Windows-based POSIX-like platforms.

When building extensions for dynamic loading, depending on the target platform, they may need to be linked against libpython.

On Windows, extensions need to link against libpython, because all symbols must be resolvable at link time. POSIX-like platforms based on Windows — like Cygwin, MinGW, or MSYS — will also require linking against libpython.

On most POSIX platforms, it is not necessary to link against libpython, as the symbols will already be available in the process, due to the interpreter (or, when embedding, the executable/library in question) already linking to libpython. Not linking an extension module against libpython allows it to be loaded by static Python builds, so it is desirable to avoid linking when possible (see GH-65735).

This might not be the case on all POSIX platforms, so make sure you check. One example is Android, where only the main executable and LD_PRELOAD entries are considered to be RTLD_GLOBAL (meaning dependencies are RTLD_LOCAL) [10], which causes the libpython symbols to be unavailable when loading the extension.
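The rules above can be condensed into a small decision helper. This is a sketch under stated assumptions: the function should_link_libpython is hypothetical, the platform strings follow sys.platform conventions, and the exact set of strings reported by Windows-based POSIX-like environments is an assumption that should be verified for the toolchain in use.

```python
def should_link_libpython(platform_name, is_android=False):
    """Return True if extensions for the given target should link libpython.

    ``platform_name`` follows ``sys.platform`` conventions and must describe
    the target, not the build machine.
    """
    # Windows and Windows-based POSIX-like environments need all symbols
    # to be resolvable at link time. The "msys" entry is an assumption.
    if platform_name in ("win32", "cygwin", "msys"):
        return True
    # Android treats dependencies as RTLD_LOCAL, hiding libpython's symbols
    # from extensions unless they link against it themselves.
    if is_android or platform_name == "android":
        return True
    # Other POSIX platforms: symbols come from the already-loaded interpreter.
    return False


for target in ("win32", "cygwin", "linux", "darwin"):
    print(target, should_link_libpython(target))
print("linux (Android)", should_link_libpython("linux", is_android=True))
```

The separate is_android flag exists because an Android interpreter may report a generic Linux platform string; as the text notes, other POSIX platforms may also need checking.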

What are prefix, exec_prefix, base_prefix, and base_exec_prefix?

These are sys attributes set in the Python initialization that describe the running environment. They refer to the prefix of directories where installation/environment files are installed, according to the table below.

Name              Target files                            Environment Scope
prefix            platform independent (eg. pure Python)  site-specific
exec_prefix       platform dependent (eg. native code)    site-specific
base_prefix       platform independent (eg. pure Python)  installation-wide
base_exec_prefix  platform dependent (eg. native code)    installation-wide

Because the site-specific prefixes will be different inside virtual environments, checking sys.prefix != sys.base_prefix is commonly used to check if we are in a virtual environment.
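The check described above can be written as a small function. The name in_virtual_environment is chosen here for illustration only.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a virtual environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the underlying installation.
    return sys.prefix != sys.base_prefix


print(in_virtual_environment())
```

Note that this reports on the interpreter running the build, so in a cross-build it says nothing about how the package will be installed on the target.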

Case studies

crossenv

Description:
Virtual Environments for Cross-Compiling Python Extension Modules.
URL:
https://github.com/benfogle/crossenv

crossenv is a tool to create a virtual environment with a monkeypatched Python installation that tries to emulate the target machine in certain scenarios. More about this approach can be found in the Faking the target environment section.

conda-forge

Description:
A community-led collection of recipes, build infrastructure and distributions for the conda package manager.
URL:
https://conda-forge.org/

XXX: Jaime will write a quick summary once the PEP draft is public.

XXX Uses a modified crossenv.

Yocto Project

Description:
The Yocto Project is an open source collaboration project that helps developers create custom Linux-based systems regardless of the hardware architecture.
URL:
https://www.yoctoproject.org/

XXX: Sent email to the mailing list.

TODO

Buildroot

Description:
Buildroot is a simple, efficient and easy-to-use tool to generate embedded Linux systems through cross-compilation.
URL:
https://buildroot.org/

TODO

Pyodide

Description:
Pyodide is a Python distribution for the browser and Node.js based on WebAssembly.
URL:
https://pyodide.org/en/stable/

XXX: Hood should review/expand this section.

Pyodide provides a Python distribution compiled to WebAssembly using the Emscripten toolchain.

It patches several aspects of the CPython installation and some external components. A custom package manager, micropip, which supports both pure-Python and wasm32/Emscripten wheels, is also provided as a part of the distribution. On top of this, a repo with a selected set of 3rd party packages is also provided and enabled by default.

Beeware

Description:
BeeWare allows you to write your app in Python and release it on multiple platforms.
URL:
https://beeware.org/

TODO

python-for-android

Description:
Turn your Python application into an Android APK.
URL:
https://github.com/kivy/python-for-android

resource https://github.com/Android-for-Python/Android-for-Python-Users

python-for-android is a tool to package Python apps on Android. It creates a Python distribution with your app and its dependencies.

Pure-Python dependencies are handled automatically and in a generic way, but native dependencies need recipes. A set of recipes for popular dependencies is provided, but users need to provide their own recipes for any other native dependencies.

kivy-ios

Description:
Toolchain for compiling Python / Kivy / other libraries for iOS.
URL:
https://github.com/kivy/kivy-ios

kivy-ios is a tool to package Python apps on iOS. It provides a toolchain to build a Python distribution with your app and its dependencies, as well as a CLI to create and manage Xcode projects that integrate with the toolchain.

It uses the same approach as python-for-android (also maintained by the Kivy project) for app dependencies — pure-Python dependencies are handled automatically, but native dependencies need recipes, and the project provides recipes for popular dependencies.

AidLearning

Description:
AI, Android, Linux, ARM: AI application development platform based on Android+Linux integrated ecology.
URL:
https://github.com/aidlearning/AidLearning-FrameWork

TODO

QPython

Description:
QPython is the Python engine for android.
URL:
https://github.com/qpython-android/qpython

TODO

pyqtdeploy

Description:
pyqtdeploy is a tool for deploying PyQt applications.
URL:
https://www.riverbankcomputing.com/software/pyqtdeploy/

contact https://www.riverbankcomputing.com/pipermail/pyqt/2023-May/thread.html contacted Phil, the maintainer

TODO

Chaquopy

Description:
Chaquopy provides everything you need to include Python components in an Android app.
URL:
https://chaquo.com/chaquopy/

TODO

EDK II

Description:
EDK II is a modern, feature-rich, cross-platform firmware development environment for the UEFI and PI specifications.
URL:
https://github.com/tianocore/edk2-libc/tree/master/AppPkg/Applications/Python

TODO

ActivePython

Description:
Commercial-grade, quality-assured Python distribution focusing on easy installation and cross-platform compatibility on Windows, Linux, Mac OS X, Solaris, HP-UX and AIX.
URL:
https://www.activestate.com/products/python/

TODO

Termux

Description:
Termux is an Android terminal emulator and Linux environment app that works directly with no rooting or setup required.
URL:
https://termux.dev/en/

TODO


Source: https://github.com/python/peps/blob/main/peps/pep-0720.rst

Last modified: 2023-09-09 17:39:29 GMT