PEP: 522
Title: Allow BlockingIOError in security sensitive APIs
Version: $Revision$
Last-Modified: $Date$
Author: Alyssa Coghlan <ncoghlan@gmail.com>,
        Nathaniel J. Smith <njs@pobox.com>
Status: Rejected
Type: Standards Track
Content-Type: text/x-rst
Requires: 506
Created: 16-Jun-2016
Python-Version: 3.6
Resolution: https://mail.python.org/pipermail/security-sig/2016-August/000101.html

Abstract

A number of APIs in the standard library that return random values
nominally suitable for use in security sensitive operations currently
have an obscure operating system dependent failure mode that allows them
to return values that are not, in fact, suitable for such operations.

This is due to some operating system kernels (most notably the Linux
kernel) permitting reads from /dev/urandom before the system random
number generator is fully initialized, whereas most other operating
systems will implicitly block on such reads until the random number
generator is ready.

For the lower level os.urandom and random.SystemRandom APIs, this PEP
proposes changing such failures in Python 3.6 from the current silent,
hard to detect, and hard to debug, errors to easily detected and
debugged errors by raising BlockingIOError with a suitable error
message, allowing developers the opportunity to unambiguously specify
their preferred approach for handling the situation.

For the new high level secrets API, it proposes to block implicitly if
needed whenever a random number is generated by that module, as well as to
expose a new secrets.wait_for_system_rng() function to allow code
otherwise using the low level APIs to explicitly wait for the system
random number generator to be available.

This change will impact any operating system that offers the getrandom()
system call, regardless of whether the default behaviour of the
/dev/urandom device is to return potentially predictable results when
the system random number generator is not ready (e.g. Linux, NetBSD) or
to block (e.g. FreeBSD, Solaris, Illumos). Operating systems that
prevent execution of userspace code prior to the initialization of the
system random number generator, or do not offer the getrandom() syscall,
will be entirely unaffected by the proposed change (e.g. Windows, Mac OS
X, OpenBSD).

The new exception or the blocking behaviour in the secrets module would
potentially be encountered in the following situations:

-   Python code calling these APIs during Linux system initialization
-   Python code running on improperly initialized Linux systems (e.g.
    embedded hardware without adequate sources of entropy to seed the
    system random number generator, or Linux VMs that aren't configured
    to accept entropy from the VM host)

Relationship with other PEPs

This PEP depends on the Accepted PEP 506, which adds the secrets module.

This PEP competes with Victor Stinner's PEP 524, which proposes to make
os.urandom itself implicitly block when the system RNG is not ready.

PEP Rejection

For the reference implementation, Guido rejected this PEP in favour of
the unconditional implicit blocking proposal in PEP 524 (which brings
CPython's behaviour on Linux into line with its behaviour on other
operating systems).

This means any further discussion of appropriate default behaviour for
os.urandom() in system Python installations in Linux distributions
should take place on the respective distro mailing lists, rather than on
the upstream CPython mailing lists.

Changes independent of this PEP

CPython interpreter initialization and random module initialization have
already been updated to gracefully fall back to alternative seeding
options if the system random number generator is not ready.

This PEP does not compete with the proposal in PEP 524 to add an
os.getrandom() API to expose the getrandom syscall on platforms that
offer it. There is sufficient motive for adding that API in the os
module's role as a thin wrapper around potentially platform dependent
operating system features that it can be added regardless of what
happens to the default behaviour of os.urandom() on these systems.

Proposal

Changing os.urandom() on platforms with the getrandom() system call

This PEP proposes that in Python 3.6+, os.urandom() be updated to call
the getrandom() syscall in non-blocking mode if available, and to raise
BlockingIOError with the message "system random number generator is not
ready; see secrets.token_bytes()" if the kernel reports that the call
would block.

This behaviour will then propagate through to the existing
random.SystemRandom, which provides a relatively thin wrapper around
os.urandom() that matches the random.Random() API.

However, the new secrets module introduced by PEP 506 will be updated to
catch the new exception and implicitly wait for the system random number
generator if the exception is ever encountered.

In all cases, as soon as a call to one of these security sensitive APIs
succeeds, all future calls to these APIs in that process will succeed
without blocking (once the operating system random number generator is
ready after system boot, it remains ready).

On Linux and NetBSD, this will replace the previous behaviour of
returning potentially predictable results read from /dev/urandom.

On FreeBSD, Solaris, and Illumos, this will replace the previous
behaviour of implicitly blocking until the system random number
generator is ready. However, it is not clear if these operating systems
actually allow userspace code (and hence Python) to run before the
system random number generator is ready.

Note that in all cases, if calling the underlying getrandom() API
reports ENOSYS rather than returning a successful response or reporting
EAGAIN, CPython will continue to fall back to reading from /dev/urandom
directly.
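
As a rough illustration of the behaviour described in this section, the
following sketch approximates the proposed logic in pure Python rather
than C. The helper name urandom_or_raise is hypothetical, and the sketch
assumes the os.getrandom() wrapper and os.GRND_NONBLOCK flag are
available on platforms that offer the syscall; it is not the reference
implementation.

```python
import errno
import os

_NOT_READY = (
    "system random number generator is not ready; see secrets.token_bytes()"
)


def urandom_or_raise(nbytes):
    """Hypothetical pure-Python sketch of the proposed os.urandom() logic."""
    if not hasattr(os, "getrandom"):
        # No getrandom() wrapper on this platform: behaviour is unchanged.
        return os.urandom(nbytes)
    chunks = []
    remaining = nbytes
    while remaining > 0:
        try:
            chunk = os.getrandom(remaining, os.GRND_NONBLOCK)
        except BlockingIOError:
            # EAGAIN: the kernel would have blocked, so fail loudly instead.
            raise BlockingIOError(errno.EAGAIN, _NOT_READY) from None
        except OSError as exc:
            if exc.errno != errno.ENOSYS:
                raise
            # ENOSYS: the running kernel lacks the syscall, so defer to the
            # existing implementation, which reads /dev/urandom.
            return os.urandom(nbytes)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

The loop is there because a single getrandom() call may return fewer
bytes than requested for large requests.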

Adding secrets.wait_for_system_rng()

A new exception shouldn't be added without a straightforward
recommendation for how to resolve that error when encountered (however
rare encountering the new error is expected to be in practice). For
security sensitive code that actually does need to use the lower level
interfaces to the system random number generator (rather than the new
secrets module), and does receive live bug reports indicating this is a
real problem for the userbase of that particular application rather than
a theoretical one, this PEP's recommendation will be to add the
following snippet (directly or indirectly) to the __main__ module:

    import secrets
    secrets.wait_for_system_rng()

Or, if compatibility with versions prior to Python 3.6 is needed:

    try:
        import secrets
    except ImportError:
        pass
    else:
        secrets.wait_for_system_rng()

Within the secrets module itself, this will then be used in
token_bytes() to block implicitly if the new exception is encountered:

    def token_bytes(nbytes=None):
        if nbytes is None:
            nbytes = DEFAULT_ENTROPY
        try:
            result = os.urandom(nbytes)
        except BlockingIOError:
            wait_for_system_rng()
            result = os.urandom(nbytes)
        return result

Other parts of the module will then be updated to use token_bytes() as
their basic random number generation building block, rather than calling
os.urandom() directly.
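
A minimal sketch of what that layering might look like is shown below.
The wait_for_system_rng() defined here is a simplified stand-in for the
proposed function, the DEFAULT_ENTROPY value is an assumption made for
the sake of the example, and the two helpers are only meant to show
higher level functions delegating to token_bytes() rather than to
os.urandom().

```python
import base64
import binascii
import os

DEFAULT_ENTROPY = 32  # bytes; assumed default for this sketch


def wait_for_system_rng():
    """Simplified stand-in for the proposed API (no /dev/random fallback)."""
    if hasattr(os, "getrandom"):
        os.getrandom(1)  # blocks until the system RNG is ready


def token_bytes(nbytes=None):
    """Return random bytes, waiting for the system RNG if necessary."""
    if nbytes is None:
        nbytes = DEFAULT_ENTROPY
    try:
        return os.urandom(nbytes)
    except BlockingIOError:
        wait_for_system_rng()
        return os.urandom(nbytes)


def token_hex(nbytes=None):
    """Random hexadecimal text string, built on token_bytes()."""
    return binascii.hexlify(token_bytes(nbytes)).decode("ascii")


def token_urlsafe(nbytes=None):
    """Random URL-safe text string, built on token_bytes()."""
    raw = base64.urlsafe_b64encode(token_bytes(nbytes))
    return raw.rstrip(b"=").decode("ascii")
```

With this structure, the implicit wait lives in exactly one place, and
every other helper inherits it.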

Application frameworks covering use cases where access to the system
random number generator is almost certain to be needed (e.g. web
frameworks) may choose to incorporate a call to
secrets.wait_for_system_rng() implicitly into the commands that start
the application such that existing calls to os.urandom() will be
guaranteed to never raise the new exception when using those frameworks.
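
One way a framework might do this is sketched below. The names
ensure_system_rng_ready and run_application are hypothetical, and the
getattr() lookup is an assumption that lets the same code run on Python
versions where secrets.wait_for_system_rng() does not exist.

```python
import secrets


def ensure_system_rng_ready():
    """Wait for the system RNG if the proposed API is available.

    Returns True if secrets.wait_for_system_rng() was called, else False.
    """
    wait = getattr(secrets, "wait_for_system_rng", None)
    if wait is None:
        return False
    wait()
    return True


def run_application(main, *args, **kwargs):
    """Hypothetical framework entry point wrapping the application's main()."""
    ensure_system_rng_ready()
    return main(*args, **kwargs)
```

Application code would then be started as run_application(main), and any
later os.urandom() call in that process would find the system RNG ready.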

For cases where the error is encountered in an application that cannot
be modified directly, the following command can be used to wait for
the system random number generator to initialize before starting that
application:

    python3 -c "import secrets; secrets.wait_for_system_rng()"

For example, this snippet could be added to a shell script or a systemd
ExecStartPre hook (and may prove useful in reliably waiting for the
system random number generator to be ready, even if the subsequent
command is not itself an application running under Python 3.6).

Given the changes proposed to os.urandom() above, and the inclusion of
an os.getrandom() API on systems that support it, the suggested
implementation of this function would be:

    import os
    import time

    if hasattr(os, "getrandom"):
        # os.getrandom() always blocks waiting for the system RNG by default
        def wait_for_system_rng():
            """Block waiting for system random number generator to be ready"""
            os.getrandom(1)
            return
    else:
        # As far as we know, other platforms will never get BlockingIOError
        # below but the implementation makes pessimistic assumptions
        def wait_for_system_rng():
            """Block waiting for system random number generator to be ready"""
            # If the system RNG is already seeded, don't wait at all
            try:
                os.urandom(1)
                return
            except BlockingIOError:
                pass
            # Avoid the below busy loop if possible
            try:
                block_on_system_rng = open("/dev/random", "rb")
            except FileNotFoundError:
                pass
            else:
                with block_on_system_rng:
                    block_on_system_rng.read(1)
            # Busy loop until the system RNG is ready
            while True:
                try:
                    os.urandom(1)
                    break
                except BlockingIOError:
                    # Only check once per millisecond
                    time.sleep(0.001)

On systems where it is possible to wait for the system RNG to be ready,
this function will do so without a busy loop if os.getrandom() is
defined, os.urandom() itself implicitly blocks, or the /dev/random
device is available. If the system random number generator is ready,
this call is guaranteed to never block, even if the system's /dev/random
device uses a design that permits it to block intermittently during
normal system operation.

Limitations on scope

No changes are proposed for Windows or Mac OS X systems, as neither of
those platforms provides any mechanism to run Python code before the
operating system random number generator has been initialized. Mac OS X
goes so far as to kernel panic and abort the boot process if it can't
properly initialize the random number generator (although Apple's
restrictions on the supported hardware platforms make that exceedingly
unlikely in practice).

Similarly, no changes are proposed for other *nix systems that do not
offer the getrandom() syscall. On these systems, os.urandom() will
continue to block waiting for the system random number generator to be
initialized.

While other *nix systems that offer a non-blocking API (other than
getrandom()) for requesting random numbers suitable for use in security
sensitive applications could potentially receive a similar update to the
one proposed for getrandom() in this PEP, such changes are out of scope
for this particular proposal.

Python's behaviour on older versions of affected platforms that do not
offer the new getrandom() syscall will also remain unchanged.

Rationale

Ensuring the secrets module implicitly blocks when needed

This is done to help encourage the meme that arises for folks that want
the simplest possible answer to the right way to generate security
sensitive random numbers to be "Use the secrets module when available or
your application might crash unexpectedly", rather than the more
boilerplate heavy "Always call secrets.wait_for_system_rng() when
available or your application might crash unexpectedly".

It's also done due to the BDFL having a higher tolerance for APIs that
might block unexpectedly than he does for APIs that might throw an
unexpected exception[1].

Raising BlockingIOError in os.urandom() on Linux

For several years now, the security community's guidance has been to use
os.urandom() (or the random.SystemRandom() wrapper) when implementing
security sensitive operations in Python.

To help improve API discoverability and make it clearer that secrecy and
simulation are not the same problem (even though they both involve
random numbers), PEP 506 collected several of the one line recipes based
on the lower level os.urandom() API into a new secrets module.

However, this guidance has also come with a longstanding caveat:
developers writing security sensitive software at least for Linux, and
potentially for some *BSD systems, may need to wait until the
operating system's random number generator is ready before relying on it
for security sensitive operations. This generally only occurs if
os.urandom() is read very early in the system initialization process, or
on systems with few sources of available entropy (e.g. some kinds of
virtualized or embedded systems), but unfortunately the exact conditions
that trigger this are difficult to predict, and when it occurs
there is no direct way for userspace to tell it has happened without
querying operating system specific interfaces.

On *BSD systems (if the particular *BSD variant allows the problem to
occur at all) and potentially also Solaris and Illumos, encountering
this situation means os.urandom() will either block waiting for the
system random number generator to be ready (the associated symptom would
be for the affected script to pause unexpectedly on the first call to
os.urandom()) or else will behave the same way as it does on Linux.

On Linux, in Python versions up to and including Python 3.4, and in
Python 3.5 maintenance versions following Python 3.5.2, there's no clear
indicator to developers that their software may not be working as
expected when run early in the Linux boot process, or on hardware
without good sources of entropy to seed the operating system's random
number generator: due to the behaviour of the underlying /dev/urandom
device, os.urandom() on Linux returns a result either way, and it takes
extensive statistical analysis to show that a security vulnerability
exists.

By contrast, if BlockingIOError is raised in those situations, then
developers using Python 3.6+ can easily choose their desired behaviour:

1.  Wait for the system RNG at or before application startup (security
    sensitive)
2.  Switch to using the random module (non-security sensitive)
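
The two choices can be sketched roughly as follows. Both function names
are hypothetical, and the polling loop in the first one is only one
possible way of waiting (the proposed secrets.wait_for_system_rng() is
the intended mechanism). The except clause is only exercised under the
behaviour proposed in this PEP.

```python
import os
import random
import time


def secret_bytes(nbytes, poll_interval=0.001):
    """Option 1: security sensitive, so wait until the system RNG is ready."""
    while True:
        try:
            return os.urandom(nbytes)
        except BlockingIOError:
            # Proposed behaviour: the system RNG is not ready yet.
            time.sleep(poll_interval)


def simulation_bytes(nbytes):
    """Option 2: not security sensitive, so use the random module instead."""
    return bytes(random.getrandbits(8) for _ in range(nbytes))
```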

Making secrets.wait_for_system_rng() public

Earlier versions of this PEP proposed a number of recipes for wrapping
os.urandom() to make it suitable for use in security sensitive use
cases.

Discussion of the proposal on the security-sig mailing list prompted the
realization[2] that the core assumption driving the API design in this
PEP was that choosing between letting the exception cause the
application to fail, blocking waiting for the system RNG to be ready and
switching to using the random module instead of os.urandom is an
application and use-case specific decision that should take into account
application and use-case specific details.

There is no way for the interpreter runtime or support libraries to
determine whether a particular use case is security sensitive or not,
and while it's straightforward for an application developer to decide how
to handle an exception thrown by a particular API, they can't readily
work around an API blocking when they expected it to be non-blocking.

Accordingly, the PEP was updated to add secrets.wait_for_system_rng() as
an API for applications, scripts and frameworks to use to indicate that
they wanted to ensure the system RNG was available before continuing,
while library developers could continue to call os.urandom() without
worrying that it might unexpectedly start blocking waiting for the
system RNG to be available.

Backwards Compatibility Impact Assessment

Similar to PEP 476, this is a proposal to turn a previously silent
security failure into a noisy exception that requires the application
developer to make an explicit decision regarding the behaviour they
desire.

As no changes are proposed for operating systems that don't provide the
getrandom() syscall, os.urandom() retains its existing behaviour as a
nominally blocking API that is non-blocking in practice due to the
difficulty of scheduling Python code to run before the operating system
random number generator is ready. We believe it may be possible to
encounter problems akin to those described in this PEP on at least some
*BSD variants, but nobody has explicitly demonstrated that. On Mac OS X
and Windows, it appears to be straight up impossible to even try to run
a Python interpreter that early in the boot process.

On Linux and other platforms with similar /dev/urandom behaviour,
os.urandom() retains its status as a guaranteed non-blocking API.
However, the means of achieving that status changes in the specific case
of the operating system random number generator not being ready for use
in security sensitive operations: historically it would return
potentially predictable random data, with this PEP it would change to
raise BlockingIOError.

Developers of affected applications would then be required to make one
of the following changes to gain forward compatibility with Python 3.6,
based on the kind of application they're developing.

Unaffected Applications

The following kinds of applications would be entirely unaffected by the
change, regardless of whether or not they perform security sensitive
operations:

-   applications that don't support Linux
-   applications that are only run on desktops or conventional servers
-   applications that are only run after the system RNG is ready
    (including those where an application framework calls
    secrets.wait_for_system_rng() on their behalf)

Applications in this category simply won't encounter the new exception,
so it will be reasonable for developers to wait and see if they receive
Python 3.6 compatibility bugs related to the new runtime behaviour,
rather than attempting to pre-emptively determine whether or not they're
affected.

Affected security sensitive applications

Security sensitive applications would need to either change their system
configuration so the application is only started after the operating
system random number generator is ready for security sensitive
operations, change the application startup code to invoke
secrets.wait_for_system_rng(), or else switch to using the new
secrets.token_bytes() API.

As an example for components started via a systemd unit file, the
following snippet would delay activation until the system RNG was ready:

    ExecStartPre=python3 -c "import secrets; secrets.wait_for_system_rng()"

Alternatively, the following snippet will use secrets.token_bytes() if
available, and fall back to os.urandom() otherwise:

  

    try:
        from secrets import token_bytes as _get_random_bytes
    except ImportError:
        from os import urandom as _get_random_bytes

Affected non-security sensitive applications

Non-security sensitive applications should be updated to use the random
module rather than os.urandom:

    import random

    def pseudorandom_bytes(num_bytes):
        return random.getrandbits(num_bytes*8).to_bytes(num_bytes, "little")

Depending on the details of the application, the random module may offer
other APIs that can be used directly, rather than needing to emulate the
raw byte sequence produced by the os.urandom() API.
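
For instance, code that only needs jitter or sampling (where secrecy is
irrelevant) might call the random module's higher level functions
directly. The helper names and parameter defaults below are purely
illustrative.

```python
import random


def retry_delay(attempt, base=0.5, cap=30.0):
    """Exponential backoff with jitter; the delay need not be secret."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


def sample_rows(rows, k):
    """Pick k distinct rows for a spot check."""
    return random.sample(rows, k)
```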

Additional Background

Why propose this now?

The main reason is that the Python 3.5.0 release switched to using
the new Linux getrandom() syscall when available in order to avoid
consuming a file descriptor[3], and this had the side effect of making
the following operations block waiting for the system random number
generator to be ready:

-   os.urandom (and APIs that depend on it)
-   importing the random module
-   initializing the randomized hash algorithm used by some builtin
    types

While the first of those behaviours is arguably desirable (and
consistent with the existing behaviour of os.urandom on other operating
systems), the latter two behaviours are unnecessary and undesirable, and
the last one is now known to cause a system level deadlock when
attempting to run Python scripts during the Linux init process with
Python 3.5.0 or 3.5.1[4], while the second one can cause problems when
using virtual machines without robust entropy sources configured[5].

Since decoupling these behaviours in CPython will involve a number of
implementation changes more appropriate for a feature release than a
maintenance release, the relatively simple resolution applied in Python
3.5.2 was to revert all three of them to a behaviour similar to that of
previous Python versions: if the new Linux syscall indicates it will
block, then Python 3.5.2 will implicitly fall back on reading
/dev/urandom directly[6].

However, this bug report also resulted in a range of proposals to add
new APIs like os.getrandom()[7], os.urandom_block()[8],
os.pseudorandom() and os.cryptorandom()[9], or adding new optional
parameters to os.urandom() itself[10], and then attempting to educate
users on when they should call those APIs instead of just using a plain
os.urandom() call.

These proposals arguably represent overreactions, as the question of
reliably obtaining random numbers suitable for security sensitive work
on Linux is a relatively obscure problem of interest mainly to operating
system developers and embedded systems programmers, that may not justify
expanding the Python standard library's cross-platform APIs with new
Linux-specific concerns. This is especially so with the secrets module
already being added as the "use this and don't worry about the low level
details" option for developers writing security sensitive software that
for some reason can't rely on even higher level domain specific APIs
(like web frameworks) and also don't need to worry about Python versions
prior to Python 3.6.

That said, it's also the case that low cost ARM devices are becoming
increasingly prevalent, with a lot of them running Linux, and a lot of
folks writing Python applications that run on those devices. That
creates an opportunity to take an obscure security problem that
currently requires a lot of knowledge about Linux boot processes and
provably unpredictable random number generation to diagnose and resolve,
and instead turn it into a relatively mundane and
easy-to-find-in-an-internet-search runtime exception.

The cross-platform behaviour of os.urandom()

On operating systems other than Linux and NetBSD, os.urandom() may
already block waiting for the operating system's random number generator
to be ready. This will happen at most once in the lifetime of the
process, and the call is subsequently guaranteed to be non-blocking.

Linux and NetBSD are outliers in that, even when the operating system's
random number generator doesn't consider itself ready for use in
security sensitive operations, reading from the /dev/urandom device will
return random values based on the entropy it has available.

This behaviour is potentially problematic, so Linux 3.17 added a new
getrandom() syscall that (amongst other benefits) allows callers to
either block waiting for the random number generator to be ready, or
else request an error return if the random number generator is not
ready. Notably, the new API does not support the old behaviour of
returning data that is not suitable for security sensitive use cases.
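
As an illustration of the "error return" mode, the sketch below probes
whether the system random number generator is ready without blocking.
The function name system_rng_ready is hypothetical, and the sketch
assumes the os.getrandom() wrapper and os.GRND_NONBLOCK flag are
available on platforms that offer the syscall.

```python
import os


def system_rng_ready():
    """Return True or False if readiness can be determined, else None."""
    if not hasattr(os, "getrandom"):
        return None  # no getrandom() wrapper on this platform
    try:
        # Non-blocking mode: request an error instead of waiting.
        os.getrandom(1, os.GRND_NONBLOCK)
    except BlockingIOError:
        return False  # the kernel reports the call would block
    except OSError:
        return None  # e.g. the running kernel lacks the syscall
    return True
```

Calling os.getrandom(1) without the flag would instead block until the
generator is ready.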

Versions of Python up to and including Python 3.4 access the Linux
/dev/urandom device directly.

Python 3.5.0 and 3.5.1 (when built on a system that offered the new
syscall) called getrandom() in blocking mode in order to avoid the use
of a file descriptor to access /dev/urandom. While there were no
specific problems reported due to os.urandom() blocking in user code,
there were problems due to CPython implicitly invoking the blocking
behaviour during interpreter startup and when importing the random
module.

Rather than trying to decouple SipHash initialization from the
os.urandom() implementation, Python 3.5.2 switched to calling
getrandom() in non-blocking mode, and falling back to reading from
/dev/urandom if the syscall indicates it will block.

As a result of the above, os.urandom() in all Python versions up to and
including Python 3.5 propagates the behaviour of the underlying
/dev/urandom device to Python code.

Problems with the behaviour of /dev/urandom on Linux

The Python os module has largely co-evolved with Linux APIs, so having
os module functions closely follow the behaviour of their Linux
operating system level counterparts when running on Linux is typically
considered to be a desirable feature.

However, /dev/urandom represents a case where the current behaviour is
acknowledged to be problematic, but fixing it unilaterally at the kernel
level has been shown to prevent some Linux distributions from booting
(at least in part due to components like Python currently using it for
non-security-sensitive purposes early in the system initialization
process).

As an analogy, consider the following two functions:

    def generate_example_password():
        """Generates passwords solely for use in code examples"""
        return generate_unpredictable_password()

    def generate_actual_password():
        """Generates actual passwords for use in real applications"""
        return generate_unpredictable_password()

If you think of an operating system's random number generator as a
method for generating unpredictable, secret passwords, then you can
think of Linux's /dev/urandom as being implemented like:

    # Oversimplified artist's conception of the kernel code
    # implementing /dev/urandom
    def generate_unpredictable_password():
        if system_rng_is_ready:
            return use_system_rng_to_generate_password()
        else:
            # we can't make an unpredictable password; silently return a
            # potentially predictable one instead:
            return "p4ssw0rd"

In this scenario, the author of generate_example_password is fine - even
if "p4ssw0rd" shows up a bit more often than they expect, it's only used
in examples anyway. However, the author of generate_actual_password has
a problem - how do they prove that their calls to
generate_unpredictable_password never follow the path that returns a
predictable answer?

In real life it's slightly more complicated than this, because there
might be some level of system entropy available -- so the fallback might
be more like return random.choice(["p4ssword", "passw0rd", "p4ssw0rd"])
or something even more variable and hence only statistically predictable
with better odds than the author of generate_actual_password was
expecting. This doesn't really make things more provably secure, though;
mostly it just means that if you try to catch the problem in the obvious
way -- if returned_password == "p4ssw0rd": raise UhOh -- then it doesn't
work, because returned_password might instead be p4ssword or even
pa55word, or just an arbitrary 64 bit sequence selected from fewer than
2**64 possibilities. So this rough sketch does give the right general
idea of the consequences of the "more predictable than expected"
fallback behaviour, even though it's thoroughly unfair to the Linux
kernel team's efforts to mitigate the practical consequences of this
problem without resorting to breaking backwards compatibility.

This design is generally agreed to be a bad idea. As far as we can tell,
there are no use cases whatsoever in which this is the behavior you
actually want. It has led to the use of insecure ssh keys on real
systems, and many *nix-like systems (including at least Mac OS X,
OpenBSD, and FreeBSD) have modified their /dev/urandom implementations
so that they never return predictable outputs, either by making reads
block in this case, or by simply refusing to run any userspace programs
until the system RNG has been initialized. Unfortunately, Linux has so
far been unable to follow suit, because it's been empirically determined
that enabling the blocking behavior causes some currently extant
distributions to fail to boot.

Instead, the new getrandom() syscall was introduced, making it possible
for userspace applications to access the system random number generator
safely, without introducing hard to debug deadlock problems into the
system initialization processes of existing Linux distros.

Consequences of getrandom() availability for Python

Prior to the introduction of the getrandom() syscall, it simply wasn't
feasible to access the Linux system random number generator in a
provably safe way, so we were forced to settle for reading from
/dev/urandom as the best available option. However, with getrandom()
insisting on raising an error or blocking rather than returning
predictable data, as well as having other advantages, it is now the
recommended method for accessing the kernel RNG on Linux, with reading
/dev/urandom directly relegated to "legacy" status. This moves Linux
into the same category as other operating systems like Windows, which
doesn't provide a /dev/urandom device at all: the best available option
for implementing os.urandom() is no longer simply reading bytes from the
/dev/urandom device.

This means that what used to be somebody else's problem (the Linux
kernel development team's) is now Python's problem -- given a way to
detect that the system RNG is not initialized, we have to choose how to
handle this situation whenever we try to use the system RNG.

It could simply block, as was somewhat inadvertently implemented in
3.5.0, and as is proposed in Victor Stinner's competing PEP:

    # artist's impression of the CPython 3.5.0-3.5.1 behavior
    def generate_unpredictable_bytes_or_block(num_bytes):
        while not system_rng_is_ready:
            wait
        return unpredictable_bytes(num_bytes)

Or it could raise an error, as this PEP proposes (in some cases):

    # artist's impression of the behavior proposed in this PEP
    def generate_unpredictable_bytes_or_raise(num_bytes):
        if system_rng_is_ready:
            return unpredictable_bytes(num_bytes)
        else:
            raise BlockingIOError

Or it could explicitly emulate the /dev/urandom fallback behavior, as
was implemented in 3.5.2rc1 and is expected to remain for the rest of
the 3.5.x cycle:

    # artist's impression of the CPython 3.5.2rc1+ behavior
    def generate_unpredictable_bytes_or_maybe_not(num_bytes):
        if system_rng_is_ready:
            return unpredictable_bytes(num_bytes)
        else:
            return (b"p4ssw0rd" * (num_bytes // 8 + 1))[:num_bytes]

(And the same caveats apply to this sketch as applied to the
generate_unpredictable_password sketch of /dev/urandom above.)

There are five places where CPython and the standard library attempt to
use the operating system's random number generator, and thus five places
where this decision has to be made:

-   initializing the SipHash used to protect str.__hash__ and friends
    against DoS attacks (called unconditionally at startup)
-   initializing the random module (called when random is imported)
-   servicing user calls to the os.urandom public API
-   the higher level random.SystemRandom public API
-   the new secrets module public API added by PEP 506

Previously, these five places all used the same underlying code, and
thus made this decision in the same way.

This whole problem was first noticed because 3.5.0 switched that
underlying code to the generate_unpredictable_bytes_or_block behavior,
and it turns out that there are some rare cases where Linux boot scripts
attempted to run a Python program as part of system initialization, the
Python startup sequence blocked while trying to initialize SipHash, and
then this triggered a deadlock because the system stopped doing anything
-- including gathering new entropy -- until the Python script was
forcibly terminated by an external timer. This is particularly
unfortunate since the scripts in question never processed untrusted
input, so there was no need for SipHash to be initialized with provably
unpredictable random data in the first place. This motivated the change
in 3.5.2rc1 to emulate the old /dev/urandom behavior in all cases (by
calling getrandom() in non-blocking mode, and then falling back to
reading /dev/urandom if the syscall indicates that the /dev/urandom pool
is not yet fully initialized).

We don't know whether such problems may also exist in the
Fedora/RHEL/CentOS ecosystem, as the build systems for those
distributions use chroots on servers running an older operating system
kernel that doesn't offer the getrandom() syscall, which means CPython's
current build configuration compiles out the runtime check for that
syscall[11].

A similar problem was found due to the random module calling os.urandom
as a side-effect of import in order to seed the default global
random.Random() instance.

We have not received any specific complaints regarding direct calls to
os.urandom() or random.SystemRandom() blocking with 3.5.0 or 3.5.1 -- only
problem reports due to the implicit blocking on interpreter startup and
as a side-effect of importing the random module.

Independently of this PEP, the first two cases have already been updated
to never block, regardless of the behaviour of os.urandom().

Where PEP 524 proposes to make all 3 of the latter cases block
implicitly, this PEP proposes that approach only for the last case (the
secrets module), with os.urandom() and random.SystemRandom() instead
raising an exception when they detect that the underlying operating
system call would block.

References

For additional background details beyond those captured in this PEP and
Victor's competing PEP, also see Victor's prior collection of relevant
information and links at
https://haypo-notes.readthedocs.io/summary_python_random_issue.html

Copyright

This document has been placed into the public domain.

[1] Take a decision for os.urandom() in Python 3.6
(https://mail.python.org/pipermail/security-sig/2016-August/000084.htm)

[2] Application level vs library level design decisions
(https://mail.python.org/pipermail/security-sig/2016-June/000057.html)

[3] os.urandom() should use Linux 3.17 getrandom() syscall
(http://bugs.python.org/issue22181)

[4] Python 3.5 running on Linux kernel 3.17+ can block at startup or on
importing the random module on getrandom()
(http://bugs.python.org/issue26839)

[5] "import random" blocks on entropy collection on Linux with low
entropy (http://bugs.python.org/issue25420)

[6] os.urandom() doesn't block on Linux anymore
(https://hg.python.org/cpython/rev/9de508dc4837)

[7] Proposal to add os.getrandom()
(http://bugs.python.org/issue26839#msg267803)

[8] Add os.urandom_block() (http://bugs.python.org/issue27250)

[9] Add random.cryptorandom() and random.pseudorandom, deprecate
os.urandom() (http://bugs.python.org/issue27279)

[10] Always use getrandom() in os.random() on Linux and add block=False
parameter to os.urandom() (http://bugs.python.org/issue27266)

[11] Does the HAVE_GETRANDOM_SYSCALL config setting make sense?
(https://mail.python.org/pipermail/security-sig/2016-June/000060.html)