PEP 786 – Precision and modulo-precision flag format specifiers for integer fields
- Author:
- Jay Berry <email at jb2170.com>
- Sponsor:
- Alyssa Coghlan <ncoghlan at gmail.com>
- Discussions-To:
- Discourse thread
- Status:
- Draft
- Type:
- Standards Track
- Created:
- 04-Apr-2025
- Python-Version:
- 3.15
- Post-History:
- 14-Feb-2025, 09-Apr-2026
Abstract
This PEP proposes implementing the standard format specifiers . and z
of PEP 3101 for integer fields as “precision” and “modulo-precision”
respectively. Both are presented together in this PEP because the rejected
alternative implementations entail intertwined combinations of the two.
. (“precision”) shall format an integer to a specified minimum number of
digits, identical to the behavior of old-style % formatting. This shall be
implemented for all integer presentation types except 'c'.
z (“modulo-precision”) shall be permitted as an optional “modulo” flag
when formatting an integer with precision and one of the binary, octal, or
hexadecimal presentation types (bases that are powers of two). This first
reduces the integer into range(base ** precision) using the % operator.
The result is a predictable two’s complement style formatting with the exact
number of digits equal to the precision.
This PEP amends the clause of PEP 3101 which states “The precision is ignored for integer conversions”.
Rationale
When string formatting integers in binary, octal, and hexadecimal, one often
desires the resulting string to contain a guaranteed minimum number of digits.
For unsigned integers of known machine-width bounds (for example, 8-bit bytes)
this is often also the exact number of digits in the result. This has
previously been implemented in the old-style % formatting using the
. “precision” format specifier, closely related to that of the C
programming language.
>>> "0x%.2x" % 15
'0x0f' # two hex digits, ideal for displaying an unsigned byte
>>> "0o%.3o" % 18
'0o022' # three octal digits, ideal for displaying a umask or file permissions
When PEP 3101 new-style formatting was first introduced, used in
str.format and f-strings, the format specification was
simple enough that the behavior of “precision” could be trivially emulated with
the width format specifier. Precision therefore was left unimplemented and
forbidden for int fields. However, as new format specifiers have been
added over time, their interactions with width have noticeably diverged
its behavior from that of precision, and so the readmission of precision
as its own format specifier, ., is sufficiently warranted.
The width format specifier guarantees a minimum length of the entire
replacement field, not just the number of digits in a formatted integer.
For example, the wonderful # specifier that prepends the prefix of the
corresponding presentation type consumes from width:
>>> x = 12
>>> f"0x{x:02x}" # manually specifying '0x' prefix
'0x0c' # two hex digits :)
>>> f"{x:#02x}" # use '#' format specifier to output '0x' automatically
'0xc' # only one hex digit :(
>>> f"{x:#08b}"
'0b001100' # we wanted 8 bits, not 6 :(
One could attempt to argue that since the length of a prefix is known to always be 2, it can be accounted for manually by adding 2 to the desired number of digits. Consider however the following demonstrations of why this is a bad idea:
- By correcting the second example to f"{x:#04x}", at a glance this looks like it may produce four hex digits, but it only produces two. This is bad for readability. 4 is thus too much of a 'magic number', and trying to counter that by being overly explicit with f"{x:#0{2+2}x}" looks ridiculous.
- In the future it is possible that a type specifier may be added with a prefix not of length 2, meaning the programmer has to calculate the prefix length, rather than Python's internal string formatting code handling that automatically.
- Things get more complicated when using the sign format specifier: f"{x: #0{1+2+2}x}" is required to produce ' 0x0c'.
- Things get even more complicated when introducing a grouping_option, for example formatting an integer into k 'word' segments joined by _: with x = 3735928559 and k = 2, f"{x: #0{1+2+4*k+(k - 1)}_x}" is required to produce ' 0xdead_beef'. Surely this would be easier to write with precision, as f"{x: #_.8x}"?
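For concreteness, the last example runs on current Python today, where every length component of width must be tallied by hand (the variable names below are illustrative only):

```python
# Runnable today: emulating precision via width means hand-computing
# the total replacement-field length.
x = 3735928559  # 0xdeadbeef
k = 2           # number of 4-digit hex 'words'

# 1 sign column + 2 prefix chars + 4*k digits + (k - 1) underscores
width = 1 + 2 + 4 * k + (k - 1)

s = f"{x: #0{width}_x}"
print(repr(s))  # ' 0xdead_beef'
```

Under this PEP the same string would be written f"{x: #_.8x}", with no length arithmetic.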
It is clear at this point that the reduction of complexity that would be
provided by precision’s implementation for int fields would be beneficial
to any user. Nor does this proposal demand a new special-case behavior
exclusively for int fields: the precision token . is
already implemented as prescribed in PEP 3101 for str data to truncate
the field’s length, and for float data to ensure that there are a fixed
number of digits after the decimal point, e.g. f"{0.1+0.2: .4f}" producing
' 0.3000'. Thus no new tokens need adding to the format specification
because of this proposal, maintaining its modest size.
For the sake of completeness, and for lack of any reasonable objection, we
propose that precision shall also work in decimal, base 10. Explicitly, the integer
presentation types laid out in PEP 3101 that are permitted to implement
precision are 'b', 'd', 'o', 'x', 'X', 'n',
and '' (None). The only presentation type not permitted is
c (‘character’), whose purpose is to format an integer to a single Unicode
character, or an appropriate replacement for non-printable characters, for
which it does not make sense to implement precision. In the event that new
integer presentation types are added in the future, such as 'B' and 'O'
which mutatis-mutandis could provide the same behavior as 'X' (that is a
capitalized prefix and digits), their addition should appropriately consider
whether precision should be implemented or not. In the case of 'B' and 'O'
as described here it would be correct to implement precision. A ValueError
shall be raised when an attempt is made to use precision with an invalid
integer presentation type.
Precision For Negative Numbers
So far in this PEP we have cautiously avoided talking about the formatting of negative numbers with precision, which we shall now discuss.
Short Verdict
We desire two behaviors, which motivates the implementation of a flag z to
toggle on the latter’s behavior:
- For precision without the z flag, a negative integer x shall be formatted with a negative sign and the digits of -x's formatting. This is the same friendly behavior as old-style % formatting. For example, f"{-12:#.2x}" shall produce '-0x0c', equivalent to "%#.2x" % -12.
- For precision with the z flag, r = x % base ** n is first taken when formatting f"{x:z.{n}{base_char}}", and r is passed on to precision, the resulting string being equivalent to f"{r:.{n}{base_char}}". Because r is in range(base ** n), the number of digits will always be exactly n, resulting in a predictable two's complement style formatting, which is useful to the end user in environments that deal with machine-width oriented integers such as struct. For example, in formatting f"{-1:z#.2x}", -1 is reduced modulo 256 via 255 = -1 % 256, the resulting string being equivalent to f"{255:#.2x}", which is '0xff'.
The z flag shall only be implemented for presentation types corresponding to bases that are powers of two, specifically at present binary, octal, and hexadecimal. Whilst reduction of integers modulo powers of ten is computationally possible, a 'ten's complement' has no demand, and so the z flag is unimplemented for decimal presentation types. The z flag shall work for all integers, not just negatives.
The syntax choice of z is again out of respect for maintaining the modest size of the format specification. z was introduced to the format specification in PEP 682 as a flag for normalizing negative zero to positive zero for the float and Decimal types. It is currently unimplemented for the int type, and since integers never have a 'negative zero' situation it seems uncontroversial to repurpose z, again as a flag. If one squints hard enough, the z looks like a 2 for two's complement!
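Since neither behavior exists yet, both can be emulated on current Python for hexadecimal fields; phex and zhex below are hypothetical helpers written for this sketch, not part of the proposal:

```python
# Emulation of the two proposed behaviors for hexadecimal fields.

def phex(x: int, n: int) -> str:
    """Like the proposed f"{x:#.{n}x}": at least n digits, sign kept."""
    sign = "-" if x < 0 else ""
    return f"{sign}0x{abs(x):0{n}x}"

def zhex(x: int, n: int) -> str:
    """Like the proposed f"{x:z#.{n}x}": exactly n digits, modulo 16**n."""
    return f"0x{x % 16 ** n:0{n}x}"

print(phex(-12, 2))  # -0x0c
print(zhex(-1, 2))   # 0xff
print(zhex(255, 2))  # 0xff
```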
Long Introspection
We first present some observations about the binary representations of signed integers in two’s complement. This leads us to a couple of alternative formulations of formatting negative numbers.
Observe that one can always extend a signed number's binary representation by repeating the leading digit as a prefix:
45 (8-bit) 00101101
45 (9-bit) 000101101
-19 (8-bit) 11101101
-19 (9-bit) 111101101
For non-negative numbers this is obvious. For negative numbers this is because
the erstwhile leading column of an n-bit representation goes from having a
value of -2 ** (n-1), to +2 ** (n-1), with a new n+1th column of
value -2 ** n prefixed on, the overall sum unaffected.
This is what C’s printf does, working with powers of two as the numbers of digits:
printf("%#hhb\n", -19); // 0b11101101
printf("%#hho\n", -19); // 0355
printf("%#hhx\n", -19); // 0xed
printf("%#b\n", -19); // 0b11111111111111111111111111101101
printf("%#o\n", -19); // 037777777755
printf("%#x\n", -19); // 0xffffffed
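The same fixed-width results can be reproduced on current Python by reducing modulo 2**width before formatting (note that Python spells the octal prefix 0o where C prints a bare 0):

```python
# Two's complement rendering at a chosen bit width via explicit reduction.
x = -19
print(f"0b{x % 2**8:b}")   # 0b11101101, like C's %#hhb
print(f"0o{x % 2**8:o}")   # 0o355, where C prints 0355
print(f"0x{x % 2**8:x}")   # 0xed, like C's %#hhx
print(f"0x{x % 2**32:x}")  # 0xffffffed, like C's %#x on a 32-bit int
```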
Conversely it should be clear that one can losslessly truncate a signed number’s
binary representation to have only one leading 0 if it is non-negative, and
one leading 1 if it is negative:
45 (8-bit) 00101101
45 (7-bit) 0101101
-19 (8-bit) 11101101
-19 (7-bit) 1101101
If one were to truncate another digit off of these examples, then both would
end up as 101101, 45 being indistinguishable from -19 when using only 6 binary
digits because they are both the same modulo 2 ** 6 = 64. Therefore to
losslessly and unambiguously represent a signed integer x as a binary string
which is rendered to the end user, we have a de facto ‘minimal width’ representation
convention, using n digits, where n is the smallest integer such that
x is in range(-2 ** (n-1), 2 ** (n-1)).
For rendering octal and hexadecimal strings one has to extend the definition of
the ‘minimal width’ representation convention to be sufficiently unambiguous.
383’s minimal width binary string is 0101111111, and -129’s is 101111111,
a suffix of the former’s. A naive, incorrect, implementation of hexadecimal
string formatting would render both as '0x17f' by padding both binary
representations to 000101111111. The method was correct to desire a number
of binary digits (12) that is divisible by the number of bits in the base
(4 bits in base 16) so that the binary representation can be segmented up into
(hex) digits, but it was incorrect in padding; the method should have instead
extended as we have observed previously, 383 extended to 000101111111,
and -129 extended to 111101111111, whence 383 is rendered as '0x17f'
and -129 as '0xf7f'.
Thus the generalized definition of our ‘minimal width’ representation convention
is: for an integer x to be rendered in base base, produce n digits,
where n is the smallest integer such that x is in
range(-base ** n / 2, base ** n / 2).
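As a sketch, this definition can be computed directly; minimal_width is a hypothetical helper written here for illustration only:

```python
# Smallest n such that x is in range(-base**n / 2, base**n / 2),
# per the generalized 'minimal width' convention above.

def minimal_width(x: int, base: int) -> int:
    n = 1
    while not -(base ** n) // 2 <= x < base ** n // 2:
        n += 1
    return n

print(minimal_width(45, 2))     # 7 digits: 0101101
print(minimal_width(383, 16))   # 3 digits, rendered '0x17f'
print(minimal_width(-129, 16))  # 3 digits, rendered '0xf7f' (-129 % 16**3 == 0xf7f)
```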
This leads onto the rejected alternatives.
Rejected Alternatives
Behavior of z
The desired implementation of z, the two’s complement style formatting flag,
has split into two main camps of opinions, disagreeing over lossless vs lossy
presentation. The lossless camp believes that the formatted strings corresponding
to integers should all be distinct from each other, uniqueness preserved by the
minimal width representation convention; precision with z enabled should still
be only a minimum number of digits requested, as it is without z. The lossy
camp believes that precision with z enabled should first reduce the integer
using modular arithmetic, which then produces exactly the number of digits
requested, equivalent to left-truncating the minimal width representation string.
We endeavor to conclude in the following section that the former camp, lossless formatting, has no use cases, and is thus a rejected idea, whence this PEP proposes the latter, lossy, behavior.
Minimal Width Representation Convention
This idea was fiercely entertained only due to its lossless behavior; however, it is an obstacle to ergonomics in every candidate use case. These arguments about the aesthetics of string rendering are not irrational or about personal taste; rather, they are crucial to how information is communicated to the end user.
In a program in which signed-ness of integers is critical to communicate, any
implementation of z should not be used, as the average user will be expecting
to see a negative sign -. The alternative of using minimal width representation
convention requires one to be uncomfortably vigilant looking for leading digits
of numbers belonging to the upper half of the base’s range whenever a negative
number is present (1 for binary, 4-7 for octal, and 8-f for hex).
Any end user that is not aware of this de facto convention, and even those who
are but are not expecting it to be present in a program, would have a hard time:
The formatting of 128 and -128 using f"{x:z#.2x}" would produce '0x080'
and '0x80' respectively. It is the PEP author’s opinion that there is a 0%
chance that '0x80' is being read as negative 128 under normal conditions.
Furthermore the hideous rendering of positive 128 as '0x080' is useless for
a program that should produce a uniformly spaced hexdump of bytes, agnostic of
whether they are signed or unsigned; all bytes should be rendered in the form
'0xNN'. See the examples section on how modulo-precision
handles bytes in the correct sign-agnostic way.
Contrapositively therefore z’s purpose is to be used in environments where
signed-ness is not critical, and more likely than not where it is even
encouraged to treat the integers with respect to the modular arithmetic that
arises in two’s complement hardware of fixed register sizes. In the example above
128 and -128 are the same modulo 256, and the respectable rendering is '0x80'.
In general the purpose of z is to treat integers modulo base ** precision
as the same. So too 255 and -1 should both be rendered as '0xff', not
'0x0ff' and '0xff' respectively; the truncation is not a hindrance, but
the desired behavior. Formally we may say that the formatting should be a well
defined bijection between the equivalence classes of Z/(base ** precision)Z
and strings with precision digits.
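This bijection is easy to spot-check on current Python by performing the modulo reduction by hand for base 16, precision 2:

```python
# The 384 integers spanning the signed and unsigned byte ranges collapse,
# modulo 256, onto exactly 256 distinct two-digit strings, one per
# residue class.
strings = {f"{x % 256:02x}" for x in range(-128, 256)}
print(len(strings))                       # 256
print(all(len(s) == 2 for s in strings))  # True
```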
The remaining question, arising as a concern over the 'loss of information' in the effectively left-truncated strings, is: "is there no chance to communicate this truncation to the user?" We reject this question's premise that there is ever such a case of unintentional loss of information, by considering the two cases of hardware-aware integers and otherwise:
With respect to hardware-aware integers we have so far played around with examples
of integers in range(-128, 256), the union of the signed and unsigned ranges
for bytes. The virtues of formatting x and x - 256 as the same are clearly
established. In the contexts in which one expects to find z, any erroneous
integers corresponding to bytes lying outside that range are likely a programming error.
For example if a library sets a pixel brightness integer to be 257, and prints out
'0x01' instead of '0x101' via f"{x:z#.2x}", that’s not our problem or
doing; string formatting shouldn’t raise an exception, or even a SyntaxWarning
as an invalid escape sequence "\y" would, because ValueError: bytes must be in range(0, 256)
will be raised by bytes when trying to serialize that integer via bytes([257]);
let the appropriate ‘layer’ of code raise the exception, as that is more indicative
of a defect in the library, not our string formatting.
In the case of non-hardware aware integers, one would have to intentionally opt to
use z, in which modular arithmetic is the chosen desired effect. It is for
this reason also that we shall not raise a SyntaxWarning or ValueError
for integers lying outside of range(-base ** precision / 2, base ** precision).
Thus we have defended the lossy behavior of z implemented as modulo-precision,
and we have exhausted all reasonable use cases of lossless behavior.
A final compromise to consider and reject is implementing z not as a flag
contingent on ., but as a flag that can be combined with ..
Specifically: z without . would turn on two’s complement mode to render
the minimal width representation of the formatted integer, . without z
would implement precision as already explained, a minimum number of digits in the
magnitude and a sign if necessary, and z combined with . would turn on the
left-truncating modulo-precision. This labyrinth of combinations does not seem
useful to anyone, as we have already discredited the ergonomics of minimal width
representation convention, whence z would rarely be used on its own, and this
behavior of two options that individually render a minimum number of digits
combining together to render an exact number of digits seems counterintuitive.
Infinite Length Indication
Another, less popular, rejected alternative was for z to directly acknowledge
the infinite prefix of 0s or 1s that precede a non-negative or negative
number respectively. For example:
>>> f"{-1:z#.8b}"
'0b[...1]11111111'
>>> f"{300:z#.8b}"
'0b[...0]100101100'
This is effectively the minimal width representation convention with an ‘infinite’ prefix attached to it.
In the C programming language the machine-width dependent two’s complement
formatting of int data with precision exhibits excessive lengths of prefixes
that arise from negative numbers, even those with small magnitude:
printf("%#.2x\n", -19); // 0xffffffed
printf("%#.2llx\n", (long long unsigned int)-19); // 0xffffffffffffffed
This prefix could continue on indefinitely if it were not limited by a maximum machine-width!
Python’s int type is indeed not limited by a maximum machine-width. Thus to
avoid printing infinitely long two’s complement strings we could use a similar
approach to that of the builtin list’s string formatting for printing a list
that contains itself:
>>> l = []
>>> l.append(l)
>>> l
[[...]]
>>> y = -1
>>> f"{y:z#.8b}"
'0b[...1]11111111'
This may have been useful to educate beginners on how bitwise binary operations
work, for example showing how -1 & x is always trivially equal to x, or
how the binary representation of the negation of a number can be obtained by
adding one to its bitwise complement:
>>> x = 42
>>> f"{x:z#.8b}"
'0b[...0]00101010'
>>> f"{~x:z#.8b}"
'0b[...1]11010101'
>>> f"{x|~x:z#.8b}"
'0b[...1]11111111'
# x | ~x == -1
# x | ~x == x + ~x because of their disjoint bitwise representations
# thus x + ~x == -1
# thus -x == ~x + 1
>>> y = ~x + 1
>>> f"{y:z#.8b}"
'0b[...1]11010110'
>>> y == -x
True
Its use case is just too narrow, and modulo-precision outshines it.
General
- What about ones’s complement, or other binary representations?
Two’s complement is so dominant that no one really considers other representations. GCC only supports two’s complement.
- Could we do nothing?
Programmers continue to hobble on using the width format specifier with ad-hoc corrections to mimic precision. This is intolerable, and the rationale of this PEP makes conclusive arguments for the addition and implementation choices of precision.
Refusing to implement precision for integer fields using . reserves . for possible future uses. However, in the ~20 years since PEP 3101, no alternatives have been accepted, and any alternate use of . would take it further out of sync with both old-style % formatting and the C programming language.
Syntax
! instead of z. for precision with modulo-precision, mutually exclusive with ..

Pros:
- ! is graphically related to ., an extension if you will. Precision with the modulo-precision flag set is indeed an extension of precision.
- ! in the English language is often used for imperative, commanding sentences. So too modulo-precision commands the exact number of digits to which its input shall be formatted, whereas precision is the minimum number of digits. This is idiomatic.
- ! is only one symbol, as opposed to z.. This, coupled with ! being mutually exclusive with ., leaves the overall length of one's written code unaffected when switching on modulo-precision.
- Using a new ! symbol reserves z for other future uses, whatever those may be.

Cons:
- z. also conveys a sense of extension from ., a flag attached to ., and lexicographically flows left to right as 'modulo' (z) 'precision' (.).
- . and ! being mutually exclusive may give a beginner programmer analysis-paralysis over which to choose when looking at the format specification documentation.
- ! would be another addition to the format specification for a single purpose. It would not have any implementation for str, float, or any other type.
- There also already exists a ["!" conversion] "explicit conversion flag" in the format string syntax as laid out in PEP 3101. For example, in f"{s!r}" the !r calls repr on s. This would not syntactically clash with a ! format specifier, the format specifiers [":" format_spec] being separated by a well-defined preceding colon; however, users unfamiliar with the new modulo-precision mode may glance over format strings containing ! and expect different behavior.
Verdict:
- Whilst graphically attractive, ! would clutter the format specification for a single purpose that can be achieved by overloading the preexisting z flag.
Backwards Compatibility
To quote PEP 682:
The new formatting behavior is opt-in, so numerical formatting of existing programs will not be affected.
unless someone out there is specifically relying upon . raising a ValueError
for integers as it currently does, but to quote PEP 475:
The authors of this PEP don’t think that such applications exist
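The current behavior is easy to verify: combining precision with an integer presentation type raises ValueError today, and that rejection is the only behavior this PEP changes.

```python
# On current CPython, precision with an int field raises ValueError.
try:
    f"{255:.2x}"
    print("formatted")
except ValueError:
    print("ValueError raised")
```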
Examples And Teaching
Precision
Documentation and tutorials in the Python sphere of influence should encourage
the adoption of ., precision, as the default format specifier for formatting
int fields as opposed to width, when it is clear a minimum number of digits
is required, not a minimum length of the whole replacement field.
Since the concept of precision is common in other languages such as C, and was
already present in Python’s old-style % formatting, we don’t need to go too
overboard, but a decent few examples as below may demonstrate its uses.
>>> def hexdump(b: bytes) -> str:
... return " ".join(f"{c:#.2x}" for c in b)
>>> hexdump(b"GET /\r\n\r\n")
'0x47 0x45 0x54 0x20 0x2f 0x0d 0x0a 0x0d 0x0a'
# observe the CR and LF bytes padded to precision 2
# in this basic HTTP/0.9 request
>>> def unicode_dump(s: str) -> str:
... return " ".join(f"U+{ord(c):.4X}" for c in s)
>>> unicode_dump("USA 🦅")
'U+0055 U+0053 U+0041 U+0020 U+1F985'
# observe the last character's Unicode codepoint has 5 digits;
# precision is only the minimum number of digits
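Until precision lands, the same dumps can be written on current Python with a manual prefix and zero-padded width; the results match the outputs shown above:

```python
# Current-Python equivalents of the two dumps, using width instead of
# the proposed precision specifier.

def hexdump(b: bytes) -> str:
    return " ".join(f"0x{c:02x}" for c in b)

def unicode_dump(s: str) -> str:
    return " ".join(f"U+{ord(c):04X}" for c in s)

print(hexdump(b"GET /\r\n\r\n"))
# 0x47 0x45 0x54 0x20 0x2f 0x0d 0x0a 0x0d 0x0a
print(unicode_dump("USA 🦅"))
# U+0055 U+0053 U+0041 U+0020 U+1F985
```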
Modulo-Precision
The clear area for encouraging the use of modulo-precision is when dealing with
machine-width oriented integers such as those packed and unpacked by struct.
We give an example of the consistent predictable two’s complement formatting of
signed and unsigned integers.
>>> import struct
>>> my_struct = b"\xff"
>>> (t,) = struct.unpack('b', my_struct) # signed char
>>> print(t, f"{t:#.2x}", f"{t:z#.2x}")
-1 -0x01 0xff
>>> (t,) = struct.unpack('B', my_struct) # unsigned char
>>> print(t, f"{t:#.2x}", f"{t:z#.2x}")
255 0xff 0xff
# observe in both the signed and unsigned unpacking the modulo-precision flag 'z'
# produces a predictable two's complement formatting
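On current Python the same sign-agnostic rendering can be obtained by applying the modulo reduction by hand:

```python
# Emulating the proposed f"{t:z#.2x}" on current Python via explicit % 256.
import struct

my_struct = b"\xff"
(t,) = struct.unpack('b', my_struct)  # signed char
print(t, f"0x{t % 256:02x}")          # -1 0xff
(u,) = struct.unpack('B', my_struct)  # unsigned char
print(u, f"0x{u % 256:02x}")          # 255 0xff
```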
Reference Implementation
A pull request implementing this PEP is available on GitHub: python/cpython#146437.
Thanks
Thank you to:
- Raymond Hettinger, for the initial suggestion of the two’s complement behavior.
Copyright
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.
Source: https://github.com/python/peps/blob/main/peps/pep-0786.rst
Last modified: 2026-04-09 09:40:11 GMT