PyPy is both:
- a reimplementation of Python in Python, and
- a framework for implementing interpreters and virtual machines for programming languages, especially dynamic languages.
PyPy tries to find new answers about ease of creation, flexibility, maintainability and speed trade-offs for language implementations. For further details see our goal and architecture document .
The mostly likely stumbling block for any given project is support for extension modules. PyPy supports a continually growing number of extension modules, but so far mostly only those found in the standard library.
The language features (including builtin types and functions) are very complete and well tested, so if your project does not use many extension modules there is a good chance that it will work with PyPy.
We list the differences we know about in cpython differences.
We have experimental support for CPython extension modules, so they run with minor changes. This has been a part of PyPy since the 1.4 release, but support is still in beta phase. CPython extension modules in PyPy are often much slower than in CPython due to the need to emulate refcounting. It is often faster to take out your CPython extension and replace it with a pure python version that the JIT can see.
We fully support ctypes-based extensions. But for best performance, we recommend that you use the cffi module to interface with C code.
For information on which third party extensions work (or do not work) with PyPy see the compatibility wiki.
PyPy is regularly and extensively tested on Linux machines and on Mac OS X and mostly works under Windows too (but is tested there less extensively). PyPy needs a CPython running on the target platform to bootstrap, as cross compilation is not really meant to work yet. At the moment you need CPython 2.5 - 2.7 for the translation process. PyPy’s JIT requires an x86 or x86_64 CPU. (There has also been good progress on getting the JIT working for ARMv7.)
PyPy currently aims to be fully compatible with Python 2.7. That means that it contains the standard library of Python 2.7 and that it supports 2.7 features (such as set comprehensions).
Yes, PyPy has a GIL. Removing the GIL is very hard. The first problem is that our garbage collectors are not re-entrant.
This really depends on your code. For pure Python algorithmic code, it is very fast. For more typical Python programs we generally are 3 times the speed of Cpython 2.6 . You might be interested in our benchmarking site and our jit documentation.
Note that the JIT has a very high warm-up cost, meaning that the programs are slow at the beginning. If you want to compare the timings with CPython, even relatively simple programs need to run at least one second, preferrably at least a few seconds. Large, complicated programs need even more time to warm-up the JIT.
No, we found no way of doing that. The JIT generates machine code containing a large number of constant addresses — constant at the time the machine code is written. The vast majority is probably not at all constants that you find in the executable, with a nice link name. E.g. the addresses of Python classes are used all the time, but Python classes don’t come statically from the executable; they are created anew every time you restart your program. This makes saving and reloading machine code completely impossible without some very advanced way of mapping addresses in the old (now-dead) process to addresses in the new process, including checking that all the previous assumptions about the (now-dead) object are still true about the new object.
Yes. The toolsuite that translates the PyPy interpreter is quite general and can be used to create optimized versions of interpreters for any language, not just Python. Of course, these interpreters can make use of the same features that PyPy brings to Python: translation to various languages, stackless features, garbage collection, implementation of various things like arbitrarily long integers, etc.
Certainly you can come to sprints! We always welcome newcomers and try to help them as much as possible to get started with the project. We provide tutorials and pair them with experienced PyPy developers. Newcomers should have some Python experience and read some of the PyPy documentation before coming to a sprint.
Coming to a sprint is usually the best way to get into PyPy development. If you get stuck or need advice, contact us. IRC is the most immediate way to get feedback (at least during some parts of the day; most PyPy developers are in Europe) and the mailing list is better for long discussions.
On Linux, if SELinux is enabled, you may get errors along the lines of “OSError: externmod.so: cannot restore segment prot after reloc: Permission denied.” This is caused by a slight abuse of the C compiler during configuration, and can be disabled by running the following command with root privileges:
# setenforce 0
This will disable SELinux’s protection and allow PyPy to configure correctly. Be sure to enable it again if you need it!
No, RPython is not a Python compiler.
In Python, it is mostly impossible to prove anything about the types that a program will manipulate by doing a static analysis. It should be clear if you are familiar with Python, but if in doubt see [BRETT].
If you want a fast Python program, please use the PyPy JIT instead.
|[BRETT]||Brett Cannon, Localized Type Inference of Atomic Types in Python, http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.90.3231|
RPython is a restricted subset of the Python language. It is used for implementing dynamic language interpreters within the PyPy toolchain. The restrictions ensure that type inference (and so, ultimately, translation to other languages) of RPython programs is possible.
The property of “being RPython” always applies to a full program, not to single functions or modules (the translation toolchain does a full program analysis). The translation toolchain follows all calls recursively and discovers what belongs to the program and what does not.
RPython program restrictions mostly limit the ability to mix types in arbitrary ways. RPython does not allow the binding of two different types in the same variable. In this respect (and in some others) it feels a bit like Java. Other features not allowed in RPython are the use of special methods (__xxx__) except __init__ and __del__, and the use of reflection capabilities (e.g. __dict__).
You cannot use most existing standard library modules from RPython. The exceptions are some functions in os, math and time that have native support.
To read more about the RPython limitations read the RPython description.
No. Zope’s RestrictedPython aims to provide a sandboxed execution environment for CPython. PyPy’s RPython is the implementation language for dynamic language interpreters. However, PyPy also provides a robust sandboxed Python Interpreter.
If you put “NOT_RPYTHON” into the docstring of a function and that function is found while trying to translate an RPython program, the translation process stops and reports this as an error. You can therefore mark functions as “NOT_RPYTHON” to make sure that they are never analyzed.
It’s not necessarily nonsense, but it’s not really The PyPy Way. It’s pretty hard, without some kind of type inference, to translate this Python:
a + b
into anything significantly more efficient than this Common Lisp:
(py:add a b)
And making type inference possible is what RPython is all about.
You could make #'py:add a generic function and see if a given CLOS implementation is fast enough to give a useful speed (but I think the coercion rules would probably drive you insane first). – mwh
No, and you shouldn’t try. First and foremost, RPython is a language designed for writing interpreters. It is a restricted subset of Python. If you program is not an interpreter but tries to do “real things”, like use any part of the standard Python library or any 3rd-party library, then it is not RPython to start with. You should only look at RPython if you try to write your own interpreter.
If your goal is to speed up Python code, then look at the regular PyPy, which is a full and compliant Python 2.7 interpreter (which happens to be written in RPython). Not only is it not necessary for you to rewrite your code in RPython, it might not give you any speed improvements even if you manage to.
Yes, it is possible with enough effort to compile small self-contained pieces of RPython code doing a few performance-sensitive things. But this case is not interesting for us. If you needed to rewrite the code in RPython, you could as well have rewritten it in C for example. The latter is a much more supported, much more documented language :-)
In theory yes. But we tried to use it 5 or 6 times already, as a translation backend or as a JIT backend — and failed each time.
In more details: using LLVM as a (static) translation backend is pointless nowadays because you can generate C code and compile it with clang. (Note that compiling PyPy with clang gives a result that is not faster than compiling it with gcc.) We might in theory get extra benefits from LLVM’s GC integration, but this requires more work on the LLVM side before it would be remotely useful. Anyway, it could be interfaced via a custom primitive in the C code.
On the other hand, using LLVM as our JIT backend looks interesting as well — but again we made an attempt, and it failed: LLVM has no way to patch the generated machine code.
So the position of the core PyPy developers is that if anyone wants to make an N+1’th attempt with LLVM, he is welcome, and he will receive a bit of help on the IRC channel, but he is left with the burden of proof that it works.
No, you have to rebuild the entire interpreter. This means two things:
In this context it is not that important to be able to translate RPython modules independently of translating the complete interpreter. (It could be done given enough efforts, but it’s a really serious undertaking. Consider it as quite unlikely for now.)
Because it’s fun.