Differences between PyPy and CPython

This page documents the few differences and incompatibilities between the PyPy Python interpreter and CPython. Some of these differences are “by design”, since we think that there are cases in which the behaviour of CPython is buggy, and we do not want to copy bugs.

Differences that are not listed here should be considered bugs of PyPy.

Extension modules

List of extension modules that we support:

  • Supported as built-in modules (in pypy/module/):

    __builtin__ __pypy__ _ast _codecs _collections _continuation _ffi _hashlib _io _locale _lsprof _md5 _minimal_curses _multiprocessing _random _rawffi _sha _socket _sre _ssl _warnings _weakref _winreg array binascii bz2 cStringIO cmath cpyext crypt errno exceptions fcntl gc imp itertools marshal math mmap operator parser posix pyexpat select signal struct symbol sys termios thread time token unicodedata zipimport zlib

    When translated on Windows, a few Unix-only modules are skipped, and the following module is built instead:

    _winreg

  • Supported by being rewritten in pure Python (possibly using cffi): see the lib_pypy/ directory. Examples of modules that we support this way: ctypes, cPickle, cmath, dbm, datetime... Note that some modules are both in there and in the list above; by default, the built-in module is used (but can be disabled at translation time).

The extension modules (i.e. modules written in C, in the standard CPython) that are neither mentioned above nor in lib_pypy/ are not available in PyPy. (You may have a chance to use them anyway with cpyext.)
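Code that must run on both interpreters can probe for a module instead of assuming it exists. The following is a minimal sketch, not an official PyPy recipe: the helper name import_first_available is hypothetical, and detecting PyPy through sys.builtin_module_names relies only on __pypy__ appearing in the built-in module list above.

```python
import importlib
import sys

# '__pypy__' is listed among the built-in modules above, so its presence
# here is one way to tell whether the code is running on PyPy.
ON_PYPY = '__pypy__' in sys.builtin_module_names


def import_first_available(*names):
    """Return the first module in `names` that can be imported.

    Hypothetical helper for illustration only.
    """
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("none of %r could be imported" % (names,))


# Prefer cPickle where it exists, otherwise fall back to pickle.
pickle = import_first_available('cPickle', 'pickle')
data = pickle.loads(pickle.dumps({'answer': 42}))
```

The same pattern applies to any extension module that is missing from both lists: wrap the import and fall back to a pure-Python alternative if one exists.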

Subclasses of built-in types

Officially, CPython has no rule at all for when exactly overridden methods of subclasses of built-in types are implicitly called. As an approximation, these methods are never called by other built-in methods of the same object. For example, an overridden __getitem__() in a subclass of dict will not be called by e.g. the built-in get() method.
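A minimal sketch of this approximation, using a dict subclass like the one in the examples of this section (the variable names are illustrative only):

```python
class D(dict):
    def __getitem__(self, key):
        return "%r from D" % (key,)


d = D(foo="stored value")

# Subscripting goes through the overridden __getitem__ ...
via_subscript = d["foo"]

# ... but the built-in get() method of the same object does not call it.
via_get = d.get("foo")
```

Here via_subscript comes from the override, while via_get is the value actually stored in the dictionary.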

The above is true both in CPython and in PyPy. Differences can occur in whether a built-in function or method will call an overridden method of an object other than self. In PyPy, such methods are often called in cases where CPython would not call them. Two examples:

class D(dict):
    def __getitem__(self, key):
        return "%r from D" % (key,)

class A(object):
    pass

a = A()
a.__dict__ = D()
a.foo = "a's own foo"
print a.foo
# CPython => a's own foo
# PyPy => 'foo' from D

glob = D(foo="base item")
loc = {}
exec "print foo" in glob, loc
# CPython => base item
# PyPy => 'foo' from D

Mutating classes of objects which are already used as dictionary keys

Consider the following snippet of code:

class X(object):
    pass

def __evil_eq__(self, other):
    print 'hello world'
    return False

def evil(y):
    x = X()
    d = {x: 1}
    X.__eq__ = __evil_eq__
    d[y] # might trigger a call to __eq__?

In CPython, __evil_eq__ might be called, although there is no way to write a test which reliably calls it. It happens if y is not x and hash(y) == hash(x), where hash(x) is computed when x is inserted into the dictionary. If by chance the condition is satisfied, then __evil_eq__ is called.

PyPy uses a special strategy to optimize dictionaries whose keys are instances of user-defined classes which do not override the default __hash__, __eq__ and __cmp__: when using this strategy, __eq__ and __cmp__ are never called, but instead the lookup is done by identity, so in the case above it is guaranteed that __eq__ won’t be called.

Note that in all other cases (e.g., if you have a custom __hash__ and __eq__ in y) the behavior is exactly the same as CPython.
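For contrast, here is a sketch of the "all other cases" situation, in which the key class defines both __hash__ and __eq__. The class Key and its eq_calls counter are hypothetical and exist only for this illustration. The exact number of __eq__ calls is not guaranteed on either interpreter, so the sketch relies only on the lookup succeeding.

```python
class Key(object):
    """Key with a custom __hash__ and __eq__ (illustrative class)."""

    eq_calls = 0  # counts how often __eq__ runs; the exact value may vary

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        # Constant hash: every Key collides with every other Key.
        return 1

    def __eq__(self, other):
        Key.eq_calls += 1
        return isinstance(other, Key) and self.name == other.name


d = {Key("a"): 1}

# A different object that compares equal: the lookup cannot be resolved
# by identity, so __eq__ has to be consulted.
found = d[Key("a")]
```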

Ignored exceptions

In many corner cases, CPython can silently swallow exceptions. The precise list of when this occurs is rather long, even though most cases are very uncommon. The best-known places are custom rich comparison methods (like __eq__), dictionary lookup, and calls to some built-in functions like isinstance().

Unless this behavior is clearly present by design and documented as such (as e.g. for hasattr()), in most cases PyPy lets the exception propagate instead.

Object Identity of Primitive Values, is and id

Object identity of primitive values works by value equality, not by identity of the wrapper. This means that x + 1 is x + 1 is always true, for arbitrary integers x. The rule applies for the following types:

  • int
  • float
  • long
  • complex

This change requires some changes to id as well. id fulfills the following condition: x is y <=> id(x) == id(y). Therefore id of the above types will return a value that is computed from the argument, and can thus be larger than sys.maxint (i.e. it can be an arbitrary long).

Notably missing from the list above are str and unicode. If your code relies on comparing strings with is, then it might break in PyPy.

Note that for floats there “is” only one object per “bit pattern” of the float. So float('nan') is float('nan') is true on PyPy, but not on CPython because they are two objects; but 0.0 is -0.0 is always False, as the bit patterns are different. As usual, float('nan') == float('nan') is always False. When used in containers (as list items or in sets for example), the exact rule of equality used is “if x is y or x == y” (on both CPython and PyPy); as a consequence, because all nans are identical in PyPy, you cannot have several of them in a set, unlike in CPython. (Issue #1974)
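Code that should behave the same on both interpreters can avoid depending on the identity of primitive values. The sketch below is one way to do that, assuming nothing beyond the rules stated above; the helper is_nan is a hypothetical name.

```python
import math


def is_nan(value):
    """Detect NaN by value, without relying on `is` or on set membership."""
    return isinstance(value, float) and math.isnan(value)


x = 10 ** 6
a = x + 1
b = x + 1

# Comparing by value is portable; whether `a is b` holds is not.
values_equal = (a == b)

# Whatever `a is b` evaluates to, it agrees with comparing the ids.
id_matches_is = ((a is b) == (id(a) == id(b)))

# Strings are not covered by the rule above, so compare them with ==.
s1 = "".join(["py", "py"])
s2 = "pypy"
strings_equal = (s1 == s2)

# Count NaNs explicitly rather than collecting them in a set, since the
# number of NaNs a set can hold differs between the interpreters.
nan_count = sum(1 for v in [float("nan"), 1.0, float("nan")] if is_nan(v))
```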


Miscellaneous

  • Hash randomization (-R) is ignored in PyPy. In CPython before 3.4 it has little point.

  • You can’t store non-string keys in type objects. For example:

    class A(object):
        locals()[42] = 3

    won’t work.

  • sys.setrecursionlimit(n) sets the limit only approximately, by setting the usable stack space to n * 768 bytes. On Linux, depending on the compiler settings, the default of 768KB is enough for about 1400 calls.

  • since the implementation of dictionaries is different, the exact number of times that __hash__ and __eq__ are called is different. Since CPython does not give any specific guarantees either, don’t rely on it.

  • assignment to __class__ is limited to the cases where it works on CPython 2.5. On CPython 2.6 and 2.7 it works in a few more cases, which are not supported by PyPy so far. (If needed, it could be supported, but then it will likely work in many more cases on PyPy than on CPython 2.6/2.7.)

  • the __builtins__ name is always referencing the __builtin__ module, never a dictionary as it sometimes is in CPython. Assigning to __builtins__ has no effect.

  • directly calling the internal magic methods of a few built-in types with invalid arguments may have a slightly different result. For example, [].__add__(None) and (2).__add__(None) both return NotImplemented on PyPy; on CPython, only the latter does, and the former raises TypeError. (Of course, []+None and 2+None both raise TypeError everywhere.) This difference is an implementation detail that shows up because of internal C-level slots that PyPy does not have.

  • on CPython, [].__add__ is a method-wrapper, and list.__add__ is a slot wrapper. On PyPy these are normal bound or unbound method objects. This can occasionally confuse some tools that inspect built-in types. For example, the standard library inspect module has a function ismethod() that returns True on unbound method objects but False on method-wrappers or slot wrappers. On PyPy we can’t tell the difference, so ismethod([].__add__) == ismethod(list.__add__) == True.

  • in pure Python, if you write class A(object): def f(self): pass and have a subclass B which doesn’t override f(), then B.f(x) still checks that x is an instance of B. In CPython, types written in C use a different rule. If A is written in C, any instance of A will be accepted by B.f(x) (and actually, B.f is A.f in this case). Some code that could work on CPython but not on PyPy includes: datetime.datetime.strftime(datetime.date.today(), ...) (here, datetime.date is the superclass of datetime.datetime). Anyway, the proper fix is arguably to use a regular method call in the first place: datetime.date.today().strftime(...)

  • the __dict__ attribute of new-style classes returns a normal dict, as opposed to a dict proxy as in CPython. Mutating the dict will change the type and vice versa. For built-in types, a dictionary will be returned that cannot be changed (but still looks and behaves like a normal dictionary).

  • some functions and attributes of the gc module behave in a slightly different way: for example, gc.enable and gc.disable are supported, but instead of enabling and disabling the GC, they just enable and disable the execution of finalizers.

  • PyPy prints a random line from past #pypy IRC topics at startup in interactive mode. In a released version, this behaviour is suppressed, but setting the environment variable PYPY_IRC_TOPIC will bring it back. Note that downstream package providers have been known to totally disable this feature.

  • PyPy’s readline module was rewritten from scratch: it is not GNU’s readline. It should be mostly compatible, and it adds multiline support (see multiline_input()). On the other hand, parse_and_bind() calls are ignored (issue #2072).
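Several of the points in the list above can be handled with interpreter-neutral idioms. The following sketch collects a few of them. The helper names stack_bytes_for_limit and list_add_none are hypothetical. The figure of 768 bytes per unit comes from the sys.setrecursionlimit() item, and a limit of 1000 is assumed here because it matches the stated default of 768KB.

```python
import datetime

# Import the builtins module by name instead of going through __builtins__,
# which may be a module or a dictionary depending on the interpreter.
try:
    import builtins                     # Python 3 name
except ImportError:
    import __builtin__ as builtins      # Python 2 name, as in the module list


def stack_bytes_for_limit(n):
    """Approximate stack space PyPy makes usable for setrecursionlimit(n)."""
    return n * 768


def list_add_none():
    """Report what [].__add__(None) does on the running interpreter."""
    try:
        result = [].__add__(None)
    except TypeError:
        return "TypeError"              # behaviour described for CPython
    return "NotImplemented" if result is NotImplemented else "other"


# Described above as returning NotImplemented on both interpreters.
int_add_none = (2).__add__(None)

# A regular method call, instead of datetime.datetime.strftime(a_date, ...).
formatted = datetime.date(2015, 1, 2).strftime("%Y-%m-%d")

default_stack_bytes = stack_bytes_for_limit(1000)   # 768000 bytes
```

Code that only needs the ordinary operator behaviour should use [] + None or 2 + None directly, which raise TypeError everywhere, rather than calling __add__ by hand.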