The JITter Conundrum - Just in Time for Your Traffic Jam

https://flic.kr/p/cEjEDA

I really hate getting stuck in a traffic jam. When I encounter one, I'm usually trying to figure out some kind of alternative route that will shorten my trip. That's great, but what if the shortcut gets me lost? What if a bridge is out and I can't use the shortcut at all?

In interpreted languages, it just takes longer to get stuff done - I gave the example earlier where the Python source code

a = b + c

would result in a BINARY_ADD byte code which takes 78 machine instructions to do the add, while it's a single native ADD instruction in a compiled language like C or C++. How can we speed this up? Or, as a performance expert would say, how do I decrease path length while keeping CPI (cycles per instruction) in check?
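You can see this byte code for yourself with CPython's `dis` module. (A small aside: the opcode is named BINARY_ADD through CPython 3.10; from 3.11 on it appears as a generic BINARY_OP.)

```python
import dis

def add(b, c):
    a = b + c  # one line of Python source
    return a

# Print the byte code the interpreter actually executes for add().
# The single "+" turns into loads, a binary-add opcode, and a store.
dis.dis(add)
```

Each of those opcodes dispatches into the interpreter's main loop, which is where the extra machine instructions go.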

There is one common shortcut around this traffic jam of interpreted languages. If you are going to run a line of code repeatedly, say in a loop, why not translate it into the super-efficient machine code version?

The technique is to use a JIT, or Just-in-Time, compiler. The name comes from how it works: the interpreter watches for a section of code which is being called frequently and compiles it to native code just in time for it to be executed quickly. This technique is used very effectively in managed runtimes like the Java Virtual Machine and in interpreters such as HHVM.
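The "watch for hot code" idea can be sketched in a few lines of Python. This is a toy, not how a real JIT works: a real JIT emits native machine code, while here the "compiled" path is faked by caching a Python code object so the source isn't re-parsed; the threshold value is made up.

```python
HOT_THRESHOLD = 10  # made-up number; real JITs tune this carefully

class ToyJIT:
    """Count executions of each source line; once a line is 'hot',
    run a cached pre-compiled version instead of re-parsing it.
    (Caching a Python code object stands in for emitting native code.)"""

    def __init__(self):
        self.counts = {}
        self.compiled = {}

    def run(self, source, namespace):
        self.counts[source] = self.counts.get(source, 0) + 1
        if self.counts[source] > HOT_THRESHOLD:
            # Hot path: compile once, then reuse the cached code object.
            if source not in self.compiled:
                self.compiled[source] = compile(source, "<jit>", "exec")
            exec(self.compiled[source], namespace)
        else:
            # Cold path: interpret from scratch every time.
            exec(source, namespace)

jit = ToyJIT()
env = {"b": 2, "c": 3}
for _ in range(100):        # the loop is what makes this line hot
    jit.run("a = b + c", env)
print(env["a"])             # 5
```

The interesting part is the counter: the JIT only pays the compilation cost for code that has proven it will run often enough to repay it.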

It sounds easy, but it's definitely not. Here's why: if you compile just the hot code, you are effectively combining native code with interpreted code. The interpreter has a lot of internal state which it needs to maintain to run properly, so you need to do work to keep it in sync with the native code.

Let's take our ADD example. The most efficient version of the native ADD instruction works with operands in registers, rather than going out to memory. The program flow might allow you to keep your hot variables in registers. But if you switch back to interpreted code, you need to make sure that memory is set up properly: your interpreted byte code doesn't expect variables to be in registers, it expects them to be on a stack or somewhere in memory. This requires the JIT to fix everything up whenever it switches between interpreted and native code.
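Here is a crude way to picture that fix-up, sketched in plain Python. Everything here is hypothetical illustration: a dict stands in for the interpreter's memory-based frame, local variables stand in for registers, and `generic_add` stands in for the interpreter's general byte-code handler.

```python
def generic_add(frame):
    # Stand-in for the interpreter's general BINARY_ADD handling,
    # which always reads its operands from the frame (i.e., memory).
    return frame["b"] + frame["c"]

def jitted_add(frame):
    # Fast path: load operands into local variables, the analogue of
    # keeping hot values in machine registers.
    b = frame["b"]
    c = frame["c"]
    if type(b) is int and type(c) is int:
        result = b + c            # the single-ADD-instruction case
    else:
        # Bail out to the interpreter. Before handing control back,
        # the "register" state must be written back to the frame so
        # memory looks exactly the way the byte code expects.
        frame["b"], frame["c"] = b, c
        result = generic_add(frame)
    frame["a"] = result
    return result

print(jitted_add({"b": 2, "c": 3}))       # 5   (stays on the fast path)
print(jitted_add({"b": 1.5, "c": 2.5}))   # 4.0 (bails out to generic_add)
```

Real JITs call this bail-out step deoptimization, and getting it right for every possible exit point is a large part of the engineering effort.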

Such energetic fixing up of interpreter state is one of the top challenges in creating a JIT for Python. There are some JIT implementations of Python, notably PyPy. PyPy is a really fine project, and I have talked with some of its project leaders. According to those conversations, their main challenge with wide adoption of PyPy is that add-on modules to Python often expect interpreter state to be consistent with the way that CPython, the default interpreter, behaves.

This compatibility issue turns out to be central. Because Python is so popular, there is an extremely rich collection of add-on modules contributed by the community. It's very common for someone writing Python to make use of libraries of existing code rather than write something new. But in general, these modules do not work with the Python JIT interpreters.
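One concrete example of the CPython-specific behavior that extension modules lean on is reference counting. `sys.getrefcount` exposes CPython's internal per-object reference count; C extensions manipulate these counts directly via Py_INCREF/Py_DECREF, and a JIT runtime that manages memory differently (as PyPy's garbage collector does) has to emulate all of that to stay compatible.

```python
import sys

x = []
# On CPython this reports the object's internal reference count
# (including the temporary reference created by the call itself).
# An extension module written against the CPython C API assumes these
# counts exist and are exact; that assumption is baked into the
# interpreter state a JIT would have to preserve.
print(sys.getrefcount(x))
```

On an implementation without CPython-style reference counting, this number is at best an emulation, which hints at how deep the compatibility work goes.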

Although this is an issue with Python, it hasn't been as much of a problem in the PHP space. In particular, the HHVM project has gotten some very impressive speedups by implementing PHP in a compatible way but using a JIT. But with Python, it's a challenge.

So one of our hoped-for techniques for getting faster performance is not available, at least if we want a lot of existing Python code to speed up.

For more complete information about compiler optimizations, see our Optimization Notice.