Appendix: License Documentation in Python and Other Projects

Abstract

Licenses are documented in many different ways, both in existing practice and in official recommendations. This document contains the results of a comprehensive survey of license documentation in Python and other languages.

The key takeaways from the survey, which have guided the recommendations of PEP 639, are as follows:

  • Most package formats use a single License field.
  • Many modern package systems use some form of license expression to optionally combine more than one license identifier together. SPDX and SPDX-like syntaxes are the most popular in use.
  • SPDX license identifiers are becoming the de facto way to reference common licenses everywhere, whether or not a full license expression syntax is used.
  • Several package formats support documenting both a license expression and the paths of the corresponding files that contain the license text. Most Free and Open Source Software licenses require package authors to include their full text in a Distribution Package.

License Documentation in Python

Core Metadata

There are two overlapping Core Metadata fields to document a license: the license Classifier strings prefixed with License :: and the License field as free text.

The Core Metadata License field documentation is currently:

License
=======

.. versionadded:: 1.0

Text indicating the license covering the distribution where the license
is not a selection from the "License" Trove classifiers. See
:ref:`"Classifier" <metadata-classifier>` below.
This field may also be used to specify a
particular version of a license which is named via the ``Classifier``
field, or to indicate a variation or exception to such a license.

Examples::

    License: This software may only be obtained by sending the
            author a postcard, and then the user promises not
            to redistribute it.

    License: GPL version 3, excluding DRM provisions

Even though there are two fields, it is at times difficult to convey anything but simpler licensing. For instance, some classifiers lack precision (GPL without a version), and when multiple license classifiers are listed, it is not clear whether all of the licenses must apply or the user may choose between them. Furthermore, the list of available license classifiers is rather limited and out-of-date.
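
As a rough illustration of this ambiguity, the sketch below parses a hypothetical METADATA file with the standard library email parser and collects both the License field and the license classifiers. The package name and the combination of values are invented for the example; nothing in the metadata says whether the two classifiers are meant as "both apply" or "choose one", or which GPL version is intended.

```python
from email.parser import Parser

# Hypothetical Core Metadata for an imaginary package (illustration only).
SAMPLE_METADATA = """\
Metadata-Version: 2.1
Name: example-package
Version: 1.0
License: GPL version 3, excluding DRM provisions
Classifier: License :: OSI Approved :: GNU General Public License (GPL)
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
"""


def license_info(metadata_text):
    """Return the free-text License value and the license classifiers."""
    message = Parser().parsestr(metadata_text)
    classifiers = [
        value
        for value in message.get_all("Classifier", [])
        if value.startswith("License :: ")
    ]
    return message.get("License"), classifiers


license_text, license_classifiers = license_info(SAMPLE_METADATA)
print(license_text)
for classifier in license_classifiers:
    print(classifier)
```

A consumer of this metadata has to reconcile three partially overlapping statements by hand, which is the kind of guesswork a single license expression is meant to avoid.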

Setuptools and Wheel

Beyond a license code or qualifier, license text files are documented and included in a built package either implicitly or explicitly, and this is another possible source of confusion:

  • In the Setuptools and Wheel projects, license files are automatically added to the distribution (at their source location in a source distribution/sdist, and in the .dist-info directory of a built wheel) if they match one of a number of common license file name patterns (LICEN[CS]E*, COPYING*, NOTICE* and AUTHORS*). Alternatively, a package author can specify a list of license file paths to include in the built wheel under the license_files key in the [metadata] section of the project’s setup.cfg, or as an argument to the setuptools.setup() function. At present, following the Wheel project’s lead, Setuptools flattens the collected license files into the metadata directory, clobbering files with the same name, and dumps license files directly into the top-level .dist-info directory, but there is a desire to resolve both these issues, contingent on PEP 639 being accepted.
  • Both tools also support an older, singular license_file parameter that allows specifying only one license file to add to the distribution, which has been deprecated for some time but still sees some use.
  • Following the publication of an earlier draft of PEP 639, Setuptools added support for License-File in distribution metadata as described in this specification. This allows other tools consuming the resulting metadata to unambiguously locate the license file(s) for a given package.
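
The default file name patterns listed above can be approximated with the standard library fnmatch module. This is only a sketch, not the actual Setuptools or Wheel implementation: the function names are hypothetical, case-sensitive matching is an assumption, and the real tools match against files on disk. The second helper shows why flattening license files into one directory can clobber files that share a base name.

```python
from collections import defaultdict
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns as quoted in the text above.
DEFAULT_PATTERNS = ("LICEN[CS]E*", "COPYING*", "NOTICE*", "AUTHORS*")


def looks_like_license_file(filename, patterns=DEFAULT_PATTERNS):
    """Return True if the bare file name matches any of the patterns."""
    return any(fnmatchcase(filename, pattern) for pattern in patterns)


def flattening_collisions(paths):
    """Group paths by base name and keep only names used more than once."""
    by_name = defaultdict(list)
    for path in paths:
        by_name[PurePosixPath(path).name].append(path)
    return {name: found for name, found in by_name.items() if len(found) > 1}


print(looks_like_license_file("LICENSE.txt"))
print(flattening_collisions(["LICENSE", "vendor/foo/LICENSE", "NOTICE"]))
```

In the second call, both LICENSE files would end up under the same name once flattened, so only one of them could survive in the metadata directory.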

PyPA Packaging Guide and Sample Project

Both the PyPA beginner packaging tutorial and its more comprehensive packaging guide state that it is important that every package include a license file. They point to the LICENSE.txt in the official PyPA sample project as an example, which is explicitly listed under the license_files key in its setup.cfg, following existing practice formally specified by PEP 639.

Both the beginner packaging tutorial and the sample project only use classifiers to declare a package’s license, and do not include or mention the License field. The full packaging guide does mention this field, but states that authors should use the license classifiers instead, unless the project uses a non-standard license (which the guide discourages).

Python source code files

Note: Documenting licenses in source code is not in the scope of PEP 639.

Besides using comments and/or SPDX-License-Identifier conventions, the license is sometimes documented in Python code files using a “dunder” module-level constant, typically named __license__.

This convention, while perhaps somewhat antiquated, is recognized by the built-in help() function and the standard pydoc module. The dunder variable will show up in the help() DATA section for a module.
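
A minimal sketch of this behaviour, using an in-memory module created only for illustration (the module name and license value are made up), renders the same text that help() would display:

```python
import pydoc
import types

# Hypothetical module built in memory purely for demonstration.
demo = types.ModuleType("demo_module", "A demo module with license metadata.")
demo.__license__ = "MIT"

# render_doc() returns the help text; pydoc.plain() strips the overstrike
# characters that pydoc uses to mark section headings as bold.
help_text = pydoc.plain(pydoc.render_doc(demo))
print(help_text)
```

The rendered text should list __license__ under the DATA heading, alongside any other module-level constants.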

License Documentation in Other Projects

Linux distribution packages

Note: In most cases, the texts of the most common licenses are included globally in a shared documentation directory (e.g. /usr/share/doc).

  • Debian documents package licenses with machine readable copyright files. It defines its own license expression syntax and list of identifiers for common licenses, both of which are closely related to those of SPDX.
  • Fedora packages specify how to include License Texts and use a License field that must be filled with appropriate short license identifier(s) from an extensive list of “Good Licenses”. Fedora uses SPDX license expression syntax.
  • OpenSUSE packages use SPDX license expressions with SPDX license IDs and a list of additional license identifiers.
  • Gentoo ebuild uses a LICENSE variable. This field is specified in GLEP-0023 and in the Gentoo development manual. Gentoo also defines a list of allowed licenses and a license expression syntax, which is rather different from SPDX.
  • The FreeBSD package Makefile provides LICENSE and LICENSE_FILE fields with a list of custom license symbols. For non-standard licenses, FreeBSD recommends using LICENSE=UNKNOWN and adding LICENSE_NAME and LICENSE_TEXT fields, as well as sophisticated LICENSE_PERMS to qualify the license permissions and LICENSE_GROUPS to document a license grouping. The LICENSE_COMB field allows documenting more than one license and how they apply together, forming a custom license expression syntax. FreeBSD also recommends the use of SPDX-License-Identifier in source code files.
  • Arch Linux PKGBUILD defines its own license identifiers. The value 'unknown' can be used if the license is not defined.
  • OpenWRT ipk packages use the PKG_LICENSE and PKG_LICENSE_FILES variables and recommend the use of SPDX license identifiers.
  • NixOS uses SPDX identifiers and some extra license IDs in its license field.
  • GNU Guix (based on Nix) has a single License field, uses its own license symbols list and specifies how to use one license or a list of them.
  • Alpine Linux packages recommend using SPDX identifiers in the license field.

Language and application packages

  • In Java, Maven POM defines a licenses XML tag with a list of licenses, each with a name, URL, comments and “distribution” type. This is not mandatory, and the content of each field is not specified.
  • The JavaScript NPM package.json uses a single license field with an SPDX license expression, or the UNLICENSED ID if none is specified. A license file can be referenced as an alternative using SEE LICENSE IN <filename> in the single license field.
  • Rubygems gemspec specifies either a single license string or a list of license strings. The relationship between multiple licenses in a list is not specified. They recommend using SPDX license identifiers.
  • CPAN Perl modules use a single license field, which is either a single string or a list of strings. The relationship between the licenses in a list is not specified. There is a list of custom license identifiers plus these generic identifiers: open_source, restricted, unrestricted, unknown.
  • Rust Cargo specifies the use of an SPDX license expression (v2.1) in the license field. It also supports an alternative expression syntax using slash-separated SPDX license identifiers, and there is also a license_file field. The crates.io package registry requires that either license or license_file fields are set when uploading a package.
  • PHP composer.json uses a license field with an SPDX license ID or proprietary. The license field is either a single string resembling the SPDX license expression syntax, with and and or keywords, or a list of strings if there is a (disjunctive) choice of licenses.
  • NuGet packages previously used only a simple license URL, but now specify using an SPDX license expression and/or the path to a license file within the package. The NuGet.org repository states that they only accept license expressions that are “approved by the Open Source Initiative or the Free Software Foundation.”
  • Go language modules go.mod have no provision for any metadata beyond dependencies. Licensing information is left for code authors and other community package managers to document.
  • The Dart/Flutter spec recommends using a single LICENSE file that should contain all the license texts, each separated by a line with 80 hyphens.
  • The JavaScript Bower license field is either a single string or list of strings using either SPDX license identifiers, or a path/URL to a license file.
  • The Cocoapods podspec license field is either a single string, or a mapping with type, file and text keys. This is mandatory unless there is a LICENSE/LICENCE file provided.
  • Haskell Cabal accepts an SPDX license expression since version 2.2. The version of the SPDX license list used is a function of the Cabal version. The specification also provides a mapping between legacy (pre-SPDX) and SPDX license identifiers. Cabal also specifies a license-file(s) field that lists license files to be installed with the package.
  • Erlang/Elixir mix/hex package specifies a licenses field as a required list of license strings, and recommends using SPDX license identifiers.
  • D language dub packages define their own list of license identifiers and license expression syntax, similar to the SPDX standard.
  • The R Package DESCRIPTION defines its own sophisticated license expression syntax and list of license identifiers. R has a unique way of supporting specifiers for license versions (such as LGPL (>= 2.0, < 3)) in its license expression syntax.
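
To make one of the conventions above concrete, the following sketch splits a combined LICENSE file that follows the Dart/Flutter recommendation of separating license texts with a line of 80 hyphens. The function name and the sample texts are hypothetical, and tolerating surrounding whitespace on the separator line is an assumption of this example rather than something the Dart specification is known to require.

```python
SEPARATOR = "-" * 80


def split_license_texts(combined):
    """Split a combined LICENSE file into its individual license texts."""
    texts, current = [], []
    for line in combined.splitlines():
        if line.strip() == SEPARATOR:
            # A separator line closes the license text collected so far.
            texts.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    texts.append("\n".join(current).strip())
    # Drop empty chunks, e.g. from a leading or trailing separator.
    return [text for text in texts if text]


combined = "First license text\n" + SEPARATOR + "\nSecond license text\n"
print(split_license_texts(combined))
```

Note that this approach recovers the license texts but not their identifiers, which is one reason several of the formats above pair a license file path with a separate license expression.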

Other ecosystems

  • The SPDX-License-Identifier header is a simple convention to document the license inside a file.
  • The Free Software Foundation (FSF) promotes the use of SPDX license identifiers for clarity in the GPL and other versioned free software licenses.
  • The Free Software Foundation Europe (FSFE) REUSE project promotes using SPDX-License-Identifier.
  • The Linux kernel uses SPDX-License-Identifier and parts of the FSFE REUSE conventions to document its licenses.
  • U-Boot spearheaded using SPDX-License-Identifier in code and now follows the Linux approach.
  • The Apache Software Foundation projects use RDF DOAP with a single license field pointing to SPDX license identifiers.
  • The Eclipse Foundation promotes using SPDX-License-Identifier.
  • The ClearlyDefined project promotes using SPDX license identifiers and expressions to improve license clarity.
  • The Android Open Source Project uses MODULE_LICENSE_XXX empty tag files, where XXX is a license code such as BSD, APACHE, GPL, etc. It also uses a NOTICE file that contains license and notice texts.
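
As an illustration of the SPDX-License-Identifier convention mentioned in several entries above, the sketch below extracts the tagged license expressions from the text of a source file. It is a simplified, hypothetical helper: it does not validate the expression against the SPDX license list, and it only strips the two comment closers shown in the code.

```python
import re

# Matches the tag and captures the rest of the line as the expression.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(?P<expr>[^\r\n]+)")
# Trailing comment closers that may follow the expression on the same line.
COMMENT_CLOSER = re.compile(r"\s*(\*/|-->)\s*$")


def find_spdx_identifiers(source_text):
    """Return every license expression tagged in the given text."""
    found = []
    for match in SPDX_TAG.finditer(source_text):
        expression = COMMENT_CLOSER.sub("", match.group("expr").strip())
        found.append(expression)
    return found


python_header = "# SPDX-License-Identifier: MIT\nimport os\n"
c_header = "/* SPDX-License-Identifier: GPL-2.0-only OR MIT */\n"
print(find_spdx_identifiers(python_header))
print(find_spdx_identifiers(c_header))
```

Because the tag is plain text inside a comment, the same scan works across languages, which helps explain why the convention has spread to so many of the projects surveyed here.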