PEP: 475 Title: Retry system calls failing with EINTR Version:
$Revision$ Last-Modified: $Date$ Author: Charles-François Natali
<cf.natali@gmail.com>, Victor Stinner <vstinner@python.org>
BDFL-Delegate: Antoine Pitrou <solipsis@pitrou.net> Status: Final Type:
Standards Track Content-Type: text/x-rst Created: 29-Jul-2014
Python-Version: 3.5 Resolution:
https://mail.python.org/pipermail/python-dev/2015-February/138018.html

Abstract

System call wrappers provided in the standard library should be retried
automatically when they fail with EINTR, to relieve application code
from the burden of doing so.

By system calls, we mean the functions exposed by the standard C library
pertaining to I/O or handling of other system resources.

Rationale

Interrupted system calls

On POSIX systems, signals are common. Code calling system calls must be
prepared to handle them. Examples of signals:

-   The most common signal is SIGINT, the signal sent when CTRL+c is
    pressed. By default, Python raises a KeyboardInterrupt exception
    when this signal is received.
-   When running subprocesses, the SIGCHLD signal is sent when a child
    process exits.
-   Resizing the terminal sends the SIGWINCH signal to the applications
    running in the terminal.
-   Putting the application in background (ex: press CTRL-z and then
    type the bg command) sends the SIGCONT signal.

Writing a C signal handler is difficult: only "async-signal-safe"
functions can be called (for example, printf() and malloc() are not
async-signal safe), and there are issues with reentrancy. Therefore,
when a signal is received by a process during the execution of a system
call, the system call can fail with the EINTR error to give the program
an opportunity to handle the signal without the restriction on
signal-safe functions.

This behaviour is system-dependent: on certain systems, using the
SA_RESTART flag, some system calls are retried automatically instead of
failing with EINTR. Regardless, Python's signal.signal() function clears
the SA_RESTART flag when setting the signal handler: all system calls
will probably fail with EINTR in Python.

Since receiving a signal is a non-exceptional occurrence, robust POSIX
code must be prepared to handle EINTR (which, in most cases, means retry
in a loop in the hope that the call eventually succeeds). Without
special support from Python, this can make application code much more
verbose than it needs to be.

Status in Python 3.4

In Python 3.4, handling the InterruptedError exception (EINTR's
dedicated exception class) is duplicated at every call site on a
case-by-case basis. Only a few Python modules actually handle this
exception, and fixes usually took several years to cover a whole module.
Example of code retrying file.read() on InterruptedError:

    while True:
        try:
            data = file.read(size)
            break
        except InterruptedError:
            continue

List of Python modules in the standard library which handle
InterruptedError:

-   asyncio
-   asyncore
-   io, _pyio
-   multiprocessing
-   selectors
-   socket
-   socketserver
-   subprocess

Other programming languages like Perl, Java and Go retry system calls
failing with EINTR at a lower level, so that libraries and applications
needn't bother.

Use Case 1: Don't Bother With Signals

In most cases, you don't want to be interrupted by signals and you don't
expect to get InterruptedError exceptions. For example, do you really
want to write such complex code for a "Hello World" example?

    while True:
        try:
            print("Hello World")
            break
        except InterruptedError:
            continue

InterruptedError can happen in unexpected places. For example,
os.close() and FileIO.close() may raise InterruptedError: see the
article close() and EINTR.

The Python issues related to EINTR section below gives examples of bugs
caused by EINTR.

The expectation in this use case is that Python hides the
InterruptedError and retries system calls automatically.

Use Case 2: Be notified of signals as soon as possible

Sometimes yet, you expect some signals and you want to handle them as
soon as possible. For example, you may want to immediately quit a
program using the CTRL+c keyboard shortcut.

Besides, some signals are not interesting and should not disrupt the
application. There are two options to interrupt an application on only
some signals:

-   Set up a custom signal handler which raises an exception, such as
    KeyboardInterrupt for SIGINT.
-   Use a I/O multiplexing function like select() together with Python's
    signal wakeup file descriptor: see the function
    signal.set_wakeup_fd().

The expectation in this use case is for the Python signal handler to be
executed timely, and the system call to fail if the handler raised an
exception -- otherwise restart.

Proposal

This PEP proposes to handle EINTR and retries at the lowest level, i.e.
in the wrappers provided by the stdlib (as opposed to higher-level
libraries and applications).

Specifically, when a system call fails with EINTR, its Python wrapper
must call the given signal handler (using PyErr_CheckSignals()). If the
signal handler raises an exception, the Python wrapper bails out and
fails with the exception.

If the signal handler returns successfully, the Python wrapper retries
the system call automatically. If the system call involves a timeout
parameter, the timeout is recomputed.

Modified functions

Example of standard library functions that need to be modified to comply
with this PEP:

-   open(), os.open(), io.open()
-   functions of the faulthandler module
-   os functions:
    -   os.fchdir()
    -   os.fchmod()
    -   os.fchown()
    -   os.fdatasync()
    -   os.fstat()
    -   os.fstatvfs()
    -   os.fsync()
    -   os.ftruncate()
    -   os.mkfifo()
    -   os.mknod()
    -   os.posix_fadvise()
    -   os.posix_fallocate()
    -   os.pread()
    -   os.pwrite()
    -   os.read()
    -   os.readv()
    -   os.sendfile()
    -   os.wait3()
    -   os.wait4()
    -   os.wait()
    -   os.waitid()
    -   os.waitpid()
    -   os.write()
    -   os.writev()
    -   special cases: os.close() and os.dup2() now ignore EINTR error,
        the syscall is not retried
-   select.select(), select.poll.poll(), select.epoll.poll(),
    select.kqueue.control(), select.devpoll.poll()
-   socket.socket() methods:
    -   accept()
    -   connect() (except for non-blocking sockets)
    -   recv()
    -   recvfrom()
    -   recvmsg()
    -   send()
    -   sendall()
    -   sendmsg()
    -   sendto()
-   signal.sigtimedwait(), signal.sigwaitinfo()
-   time.sleep()

(Note: the selector module already retries on InterruptedError, but it
doesn't recompute the timeout yet)

os.close, close() methods and os.dup2() are a special case: they will
ignore EINTR instead of retrying. The reason is complex but involves
behaviour under Linux and the fact that the file descriptor may really
be closed even if EINTR is returned. See articles:

-   Returning EINTR from close()
-   (LKML) Re: [patch 7/7] uml: retry host close() on EINTR
-   close() and EINTR

The socket.socket.connect() method does not retry connect() for
non-blocking sockets if it is interrupted by a signal (fails with
EINTR). The connection runs asynchronously in background. The caller is
responsible to wait until the socket becomes writable (ex: using
select.select()) and then call
socket.socket.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) to check if
the connection succeeded (getsockopt() returns 0) or failed.

InterruptedError handling

Since interrupted system calls are automatically retried, the
InterruptedError exception should not occur anymore when calling those
system calls. Therefore, manual handling of InterruptedError as
described in Status in Python 3.4 can be removed, which will simplify
standard library code.

Backward compatibility

Applications relying on the fact that system calls are interrupted with
InterruptedError will hang. The authors of this PEP don't think that
such applications exist, since they would be exposed to other issues
such as race conditions (there is an opportunity for deadlock if the
signal comes before the system call). Besides, such code would be
non-portable.

In any case, those applications must be fixed to handle signals
differently, to have a reliable behaviour on all platforms and all
Python versions. A possible strategy is to set up a signal handler
raising a well-defined exception, or use a wakeup file descriptor.

For applications using event loops, signal.set_wakeup_fd() is the
recommended option to handle signals. Python's low-level signal handler
will write signal numbers into the file descriptor and the event loop
will be awaken to read them. The event loop can handle those signals
without the restriction of signal handlers (for example, the loop can be
woken up in any thread, not just the main thread).

Appendix

Wakeup file descriptor

Since Python 3.3, signal.set_wakeup_fd() writes the signal number into
the file descriptor, whereas it only wrote a null byte before. It
becomes possible to distinguish between signals using the wakeup file
descriptor.

Linux has a signalfd() system call which provides more information on
each signal. For example, it's possible to know the pid and uid who sent
the signal. This function is not exposed in Python yet (see issue
12304).

On Unix, the asyncio module uses the wakeup file descriptor to wake up
its event loop.

Multithreading

A C signal handler can be called from any thread, but Python signal
handlers will always be called in the main Python thread.

Python's C API provides the PyErr_SetInterrupt() function which calls
the SIGINT signal handler in order to interrupt the main Python thread.

Signals on Windows

Control events

Windows uses "control events":

-   CTRL_BREAK_EVENT: Break (SIGBREAK)
-   CTRL_CLOSE_EVENT: Close event
-   CTRL_C_EVENT: CTRL+C (SIGINT)
-   CTRL_LOGOFF_EVENT: Logoff
-   CTRL_SHUTDOWN_EVENT: Shutdown

The SetConsoleCtrlHandler() function can be used to install a control
handler.

The CTRL_C_EVENT and CTRL_BREAK_EVENT events can be sent to a process
using the GenerateConsoleCtrlEvent() function. This function is exposed
in Python as os.kill().

Signals

The following signals are supported on Windows:

-   SIGABRT
-   SIGBREAK (CTRL_BREAK_EVENT): signal only available on Windows
-   SIGFPE
-   SIGILL
-   SIGINT (CTRL_C_EVENT)
-   SIGSEGV
-   SIGTERM

SIGINT

The default Python signal handler for SIGINT sets a Windows event
object: sigint_event.

time.sleep() is implemented with WaitForSingleObjectEx(), it waits for
the sigint_event object using time.sleep() parameter as the timeout. So
the sleep can be interrupted by SIGINT.

_winapi.WaitForMultipleObjects() automatically adds sigint_event to the
list of watched handles, so it can also be interrupted.

PyOS_StdioReadline() also used sigint_event when fgets() failed to check
if Ctrl-C or Ctrl-Z was pressed.

Links

Misc

-   glibc manual: Primitives Interrupted by Signals
-   Bug #119097 for perl5: print returning EINTR in 5.14.

Python issues related to EINTR

The main issue is: handle EINTR in the stdlib.

Open issues:

-   Add a new signal.set_wakeup_socket() function
-   signal.set_wakeup_fd(fd): set the fd to non-blocking mode
-   Use a monotonic clock to compute timeouts
-   sys.stdout.write on OS X is not EINTR safe
-   platform.uname() not EINTR safe
-   asyncore does not handle EINTR in recv, send, connect, accept,
-   socket.create_connection() doesn't handle EINTR properly

Closed issues:

-   Interrupted system calls are not retried
-   Solaris: EINTR exception in select/socket calls in telnetlib
-   subprocess: Popen.communicate() doesn't handle EINTR in some cases
-   multiprocessing.util._eintr_retry doesn't recalculate timeouts
-   file readline, readlines & readall methods can lose data on EINTR
-   multiprocessing BaseManager serve_client() does not check EINTR on
    recv
-   selectors behaviour on EINTR undocumented
-   asyncio: limit EINTR occurrences with SA_RESTART
-   smtplib.py socket.create_connection() also doesn't handle EINTR
    properly
-   Faulty RESTART/EINTR handling in Parser/myreadline.c
-   test_httpservers intermittent failure, test_post and EINTR
-   os.spawnv(P_WAIT, ...) on Linux doesn't handle EINTR
-   asyncore fails when EINTR happens in pol
-   file.write and file.read don't handle EINTR
-   socket.readline() interface doesn't handle EINTR properly
-   subprocess is not EINTR-safe
-   SocketServer doesn't handle syscall interruption
-   subprocess deadlock when read() is interrupted
-   time.sleep(1): call PyErr_CheckSignals() if the sleep was
    interrupted
-   siginterrupt with flag=False is reset when signal received
-   need siginterrupt() on Linux - impossible to do timeouts
-   [Windows] Can not interrupt time.sleep()

Python issues related to signals

Open issues:

-   signal.default_int_handler should set signal number on the raised
    exception
-   expose signalfd(2) in the signal module
-   missing return in win32_kill?
-   Interrupts are lost during readline PyOS_InputHook processing
-   cannot catch KeyboardInterrupt when using curses getkey()
-   Deferred KeyboardInterrupt in interactive mode

Closed issues:

-   sys.interrupt_main()

Implementation

The implementation is tracked in issue 23285. It was committed on
February 07, 2015.

Copyright

This document has been placed in the public domain.



  Local Variables: mode: indented-text indent-tabs-mode: nil
  sentence-end-double-space: t fill-column: 70 coding: utf-8 End: