Understanding Intel® Optimizations in the Android* Runtime Compiler

Introduction

Application developers generally write part of an application in the Java* language. This Java code is then transformed into an Android-specific bytecode called Dex bytecode. Across Android’s major releases, there have been several ways to go from the bytecode format to the binary code that actually runs on the processor. For example, in the Jelly Bean* version, the runtime, called the Dalvik* VM, included a just-in-time (JIT) compiler. From KitKat* on, a new Android runtime (ART) introduced an ahead-of-time (AOT) compiler.

This article describes two optimizations that Intel implemented in the ART compiler. Both provide a means to simplify loops that can be evaluated at compile time. They are only part of Intel’s commitment to provide the best user experience for the Android OS and are presented here to show how much synthetic benchmarks can gain from them.

Optimizations

Since 2008, there have been three variations of Google’s Android compiler. The Dalvik JIT compiler, available since Froyo, was replaced by the Quick AOT compiler in the Lollipop version of Android. This succession represented a major change in the paradigm for transforming the Dex bytecode, which is Android’s version of the Java bytecode, into binary code. As a result of this change, most of the middle-end optimization logic from the Dalvik VM was copied over to Quick, and, although AOT compilation could give the compiler more time for complex optimizations, Quick provided infrastructure and optimizations similar to those of the Dalvik VM. In late 2014, the AOSP added a new compiler named Optimizing. This latest compiler is a full rewrite of the Quick compiler and, over time, seems to be gaining more and more infrastructure to enable advanced optimizations that were not possible in previous compiler versions.

At the same time, Intel has been adding its own infrastructure and optimizations to the compiler to provide the best user experience for Intel processor-based devices. Before presenting the optimizations that Intel implemented in the Android compiler, the next section shows two small optimizations that are classic in production compilers and help provide a better understanding of the two Intel optimizations.
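
As a reminder of what the two classic optimizations named in the next heading do, the following Java sketch shows constant folding and store sinking as before/after pairs at source level. It is only an illustration: the class, field, and method names are hypothetical, and a real compiler applies these transformations to its intermediate representation rather than to Java source.

```java
// Hypothetical illustration; none of these names come from the ART sources.
final class ClassicOptsSketch {
    // Stands in for a memory location such as an object field.
    static int field;

    // Constant folding, before: every operand is known at compile time.
    static int secondsPerDayBefore() {
        int secondsPerMinute = 60;
        int minutesPerHour = 60;
        int hoursPerDay = 24;
        return secondsPerMinute * minutesPerHour * hoursPerDay;
    }

    // Constant folding, after: the expression is replaced by its value.
    static int secondsPerDayAfter() {
        return 86400;
    }

    // Store sinking, before: the field is written on every iteration.
    static void storeBefore(int n) {
        for (int i = 0; i < n; i++) {
            field = i * 2;
        }
    }

    // Store sinking, after: the value lives in a local inside the loop and is
    // stored once after it. This sketch assumes that no other thread observes
    // the intermediate values and that nothing in the loop can throw.
    static void storeAfter(int n) {
        int last = field;
        for (int i = 0; i < n; i++) {
            last = i * 2;
        }
        field = last;
    }

    public static void main(String[] args) {
        System.out.println(secondsPerDayBefore() == secondsPerDayAfter());
        storeBefore(5);
        int expected = field;
        field = 0;
        storeAfter(5);
        System.out.println(expected == field);
    }
}
```

Both pairs compute the same result; the "after" versions simply do less work at run time.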

Constant Folding and Store Sinking Optimizations

Constant Calculation Sinking is an important optimization. Considered in isolation, it is not immediately clear that it applies to real-world cases, but the game example shows how candidate loops are recognized. Integrating the optimization provides a means to optimize loops that have been transformed by other passes, such as method inlining.
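
To make the idea concrete, here is a minimal Java sketch of the kind of loop Constant Calculation Sinking could target. It assumes that the optimization applies when an accumulator starts from a constant, is updated only by constants, and the loop runs a known number of times. The class and method names are hypothetical and are not taken from the ART sources or from any benchmark.

```java
// Hypothetical illustration; not taken from the ART sources or any benchmark.
final class ConstantSinkingSketch {
    // Hypothetical helper: once inlined, the loop body just adds a constant.
    static int bonusPerLevel() {
        return 5;
    }

    // Before: the accumulator starts from a constant, changes by a constant on
    // every iteration, and the trip count (1000) is known at compile time.
    static int totalBonusBefore() {
        int bonus = 0;
        for (int level = 0; level < 1000; level++) {
            bonus += bonusPerLevel();
        }
        return bonus;
    }

    // After: the final value (1000 * 5) is computed by the compiler and
    // assigned once after the loop. The loop is then empty and can be removed.
    static int totalBonusAfter() {
        int bonus = 5000;
        return bonus;
    }

    public static void main(String[] args) {
        System.out.println(totalBonusBefore() == totalBonusAfter());
    }
}
```

Note that the loop in `totalBonusBefore` only becomes a candidate after `bonusPerLevel()` is inlined, which is why the optimization is most useful when combined with inlining.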
 
For the Constant Calculation Sinking optimization, a few subtests of the synthetic benchmark CF-Bench improved their scores. In the compiler domain, speedups of an order of magnitude, such as the one obtained for this benchmark, are rare. The gain on the MIPS subtest score is large enough to overshadow any future optimization applicable to the other subtests.
 

Trivial Loop Evaluator

The Trivial Loop Evaluator is a powerful optimization when every loop input is available to the compiler. For this optimization to apply in realistic code, other optimizations such as inlining must occur as well. This holds for both optimizations discussed in this article: developers generally don’t write loops that qualify exactly as written.
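
The following Java sketch illustrates the situation described above, under the assumption that the evaluator can execute a loop at compile time once all of its inputs are constants. The class and method names are hypothetical. The loop in `power` is not a candidate on its own because its inputs are parameters, but it becomes one when a call with constant arguments is inlined.

```java
// Hypothetical illustration; none of these names come from the ART sources.
final class TrivialLoopSketch {
    // On its own, this loop depends on unknown arguments.
    static int power(int base, int exponent) {
        int result = 1;
        for (int i = 0; i < exponent; i++) {
            result *= base;
        }
        return result;
    }

    // Before: after inlining power(3, 4), the start value, the bound, and the
    // multiplier are all constants, so every loop input is known.
    static int cubeSideBefore() {
        return power(3, 4);
    }

    // After: the compiler runs the loop at compile time and keeps the result.
    static int cubeSideAfter() {
        return 81;
    }

    public static void main(String[] args) {
        System.out.println(cubeSideBefore() == cubeSideAfter());
    }
}
```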

To learn more about those optimizations, read the article: Understanding Intel Optimizations in the Android* Runtime Compiler

For more complete information about compiler optimizations, see our Optimization Notice.