PHP is, by most accounts, the most popular language in use today for implementing server code behind web pages. Something like 80% of web sites are implemented using it. The extremely popular site Facebook has implemented its own PHP interpreter, called HHVM, and is developing it as an open source project. Because of HHVM's performance and its open source methodology, it is also being adopted by sites other than Facebook.
We began looking at HHVM as an open source contributor and saw some good opportunities to enhance its performance on our server processors, in particular on our current server generation. The product name is the Intel Xeon Processor E5 v3 family, though it's often referred to by its codename, "Haswell."
I remember years ago participating in meetings between key operating system architects and the chief architect of Haswell. We were discussing the kinds of features we could add to Haswell that would make their OS rock. One really cool thing about my job is seeing the final outcome of those discussions in actual products.
One Haswell feature our engineers took advantage of in HHVM was the second version of Advanced Vector Extensions, or AVX2. These are 256-bit-wide vector instructions which implement SIMD (single instruction, multiple data): one instruction operates on a whole vector of data at once. Just by compiling HHVM with AVX2 enabled, we were able to see a 5% increase in WordPress throughput. This is actually a really big thing.
I need to pause and explain why this is such a big thing. We usually try to focus our effort on what I would call a "real" customer workload. This is some kind of very large PHP web application which represents what actual web sites are doing. It has also been scripted to run the same operations every time. If you can run such a workload multiple times and get the same result, it can show you exactly what impact your software changes will have.
This is distinguished from "micro" benchmarks, which implement a simple algorithm. In the world of performance engineering, it's sometimes very easy to see a 2X or 3X or even 10X speedup on a micro benchmark such as a Mandelbrot set calculation. But customers with a large web application like Wikipedia or WordPress won't see the same kind of results.
In this case, we use versions of WordPress, Drupal and MediaWiki, scripted to run multiple streams of simultaneous work. As I write this, I see that last night's run of the latest HHVM running WordPress on Haswell had only a 0.07% performance deviation, which is really quite good. This means if we make a source code change which results in a 1% performance boost, it won't get lost in the noise of benchmark variability.
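To make the noise argument concrete, here is a minimal sketch of how one might check whether a measured gain stands clear of run-to-run variability. It is not the tooling we use. It assumes "performance deviation" means the sample standard deviation as a percentage of the mean, and the function names, the throughput numbers, and the threshold factor k are all hypothetical.

```python
from statistics import mean, stdev


def relative_deviation(samples):
    """Sample standard deviation expressed as a percentage of the mean."""
    return 100.0 * stdev(samples) / mean(samples)


def gain_vs_noise(baseline, candidate, k=3.0):
    """Return (detectable, gain_pct).

    gain_pct is the change in mean throughput relative to the baseline.
    The gain counts as detectable when it exceeds k times the baseline's
    relative deviation. k = 3.0 is an arbitrary, illustrative threshold.
    """
    base_mean = mean(baseline)
    gain_pct = 100.0 * (mean(candidate) - base_mean) / base_mean
    return gain_pct > k * relative_deviation(baseline), gain_pct


# Hypothetical throughput numbers (requests per second) from repeated runs.
baseline_runs = [1000.0, 1000.7, 999.3, 1000.4, 999.6]
candidate_runs = [1010.0, 1010.5, 1009.5, 1010.2, 1009.8]

detectable, gain = gain_vs_noise(baseline_runs, candidate_runs)
print(f"baseline deviation: {relative_deviation(baseline_runs):.3f}%")
print(f"measured gain: {gain:.2f}% (detectable: {detectable})")
```

With a deviation well under a tenth of a percent, a 1% gain sits many multiples above the noise floor. On a benchmark that varies by a percent or more from run to run, the same gain could not be told apart from chance.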
So simply enabling AVX2 in the compilation of HHVM resulted in a 5% boost in WordPress performance on Haswell. We also tuned the assembly language for memset() and memcpy(). All told, we got a 9.8% performance boost on Haswell running WordPress on HHVM.
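As a rough bit of arithmetic on those two figures, the sketch below estimates what the memset() and memcpy() tuning contributed on its own. It assumes the two gains compound multiplicatively, which is my assumption for illustration and not something we measured separately. The function name is hypothetical.

```python
def implied_second_gain(first_gain_pct, total_gain_pct):
    """Gain of a second optimization, in percent, assuming gains multiply.

    If total = (1 + first) * (1 + second), then
    second = (1 + total) / (1 + first) - 1.
    """
    first = 1.0 + first_gain_pct / 100.0
    total = 1.0 + total_gain_pct / 100.0
    return 100.0 * (total / first - 1.0)


# 5% from enabling AVX2, 9.8% overall, both figures from the text above.
tuning_gain = implied_second_gain(5.0, 9.8)
print(f"implied gain from memset()/memcpy() tuning: {tuning_gain:.2f}%")
```

Under that assumption the tuning accounts for a little over 4.5% by itself, slightly less than the 4.8% you would get by simple subtraction.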
There are other improvements we are making as well. You can see our contributions in the HHVM project sources. All told, I'm hoping that those who choose HHVM for their PHP solution will see a very healthy benefit from using Intel's latest processors.