Case Study: Porting Stream to Android*

Objective

This document demonstrates how to port the Stream benchmark app to an x86 platform by creating an Android* application that uses a native shared library. See http://www.streambench.org for the latest app details.

This article serves as a guide for a more advanced use case: porting an app to Android* using the NDK, with the Stream benchmark as the example. Stream has been around for quite some time and has set a standard for demonstrating "real world" memory bandwidth, as opposed to "theoretical bandwidth" metrics that are more academic than representative of what is typically seen in practice.

Stream provides the developer with an option for multithreading. If the app is built single-threaded (i.e., "Tuned"), a simple NDK compile succeeds. By default, however, the app uses OpenMP* as its method for multithreading. As of October 2011, Android* doesn't support OpenMP* at link time, so the developer has to port the application to use POSIX* threads (pthreads) instead. The NDK is then used to compile the app as a native library, since it provides the infrastructure for using pthreads.

Prerequisites

This document assumes that the Android* SDK is properly set up for use with the Eclipse* IDE and that the latest NDK is installed and configured. The developer uses the NDK to compile native C/C++ code into a native library that the wrapper Android* application can then use. NDK compilation and linking are demonstrated in this document for Stream.

Refer to other Intel® Developer Zone documents that describe how to procure and install Android* SDK and NDK.

Porting Steps: The Android.mk File

Create a new project folder for Stream. This project will be used with the Android* NDK (NDK r6b was used for this exercise). Create a "jni" folder within it and place a template Android.mk file there (or create one from scratch). Here are the key changes:

...
LOCAL_MODULE := libstream
...
LOCAL_SRC_FILES := \
    stream.c


By convention, LOCAL_MODULE has a name starting with "lib", and "stream.c" is assumed to be the main application source file.

It is now time to enable OpenMP* for the app in the NDK build. This entails both compile time and link time enabling. Make the following changes in Android.mk as well:

LOCAL_LDLIBS := -ldl -llog -lgomp
LOCAL_CFLAGS := -fopenmp 


Attempt to build the project with this command:

ndk-build APP_ABI=x86

You will notice that the above operation results in a build error, because the NDK toolchain cannot resolve the -lgomp flag. OpenMP* linking isn't enabled for Android* at this time. Luckily, we can port the app to use POSIX* threads (pthreads) instead.

The build time and link time flags need to be changed as follows:



Figure 3.1: POSIX* Thread Flags

Porting Steps: Rewriting Stream to use POSIX* Threads

Note: It isn't within the scope of this document to fully unravel the details of the pthread-enabled code. Only a high-level overview is given, with more focus on the NDK side.

As a robust starting point, it is a good idea to add a build flag for pthread support in Stream rather than removing the OpenMP* infrastructure. Here, the flag is assumed to be named _PTHREADS. Then, wherever an OpenMP* code block of the form #pragma omp parallel { … } appears, the semantically equivalent pthread implementation could look as follows:

    //<NDK porting>
    //#pragma omp parallel for
    for (j = 0; j < THREAD_OFFSET; j++)
    {
        a[j + (THREAD_OFFSET * thread_ID)] = 1.0;
        b[j + (THREAD_OFFSET * thread_ID)] = 2.0;
        c[j + (THREAD_OFFSET * thread_ID)] = 0.0;
    }

Figure 4.1: Parallel Loop in POSIX*

In this case, THREAD_OFFSET is defined as N / MAX_THREADS, where N is the array problem size in the Stream source code. THREAD_OFFSET therefore lets multiple threads work on different regions of an array concurrently. thread_ID is simply the ID of the thread executing this code; the IDs are stored in an array after thread creation.
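The partitioning described above can be sketched as a small self-contained example. This is only an illustration, not the actual ported Stream code: the values of N and MAX_THREADS, and the function names init_slice and init_arrays_parallel, are assumptions made for the sketch, and N is assumed to be evenly divisible by MAX_THREADS.

```c
#include <pthread.h>
#include <stddef.h>

#define N             1000000                 /* array problem size (assumed value) */
#define MAX_THREADS   4                       /* assumed thread count; divides N evenly */
#define THREAD_OFFSET ((size_t)N / MAX_THREADS) /* elements handled by each thread */

static double a[N], b[N], c[N];

/* Worker: initialize only the slice of each array owned by this thread. */
static void *init_slice(void *arg)
{
    size_t thread_ID = *(const size_t *)arg;
    size_t base = THREAD_OFFSET * thread_ID;

    for (size_t j = 0; j < THREAD_OFFSET; j++) {
        a[base + j] = 1.0;
        b[base + j] = 2.0;
        c[base + j] = 0.0;
    }
    return NULL;
}

/* Create the workers, then wait for every created worker to finish.
 * Returns 0 on success or the pthread_create error code. */
int init_arrays_parallel(void)
{
    pthread_t threads[MAX_THREADS];
    size_t    ids[MAX_THREADS];   /* thread IDs kept in an array */
    int       created = 0;
    int       rc = 0;

    for (int t = 0; t < MAX_THREADS; t++) {
        ids[t] = (size_t)t;
        rc = pthread_create(&threads[t], NULL, init_slice, &ids[t]);
        if (rc != 0)
            break;                /* stop creating, but still join the rest */
        created++;
    }
    for (int t = 0; t < created; t++)
        pthread_join(threads[t], NULL);

    return rc;
}
```

Because each thread writes only to its own slice, no locking is needed inside the loop; the pthread_join calls are what guarantee the arrays are fully initialized before the caller continues.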

It may be appropriate in some cases to use mutex locks and unlocks. In this port, mutex locks and unlocks were used to build a custom synchronization barrier for the threads, since all threads needed to sync together after a parallel code section to mimic the parallel section of the OpenMP* implementation. Note: the implementation of a barrier is NOT trivial due to all of the timing nuances of the threads. In fact, the developer is often left to implement one, because barriers are an optional feature in POSIX* and a platform may not provide them.
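One common way to build such a barrier is sketched below. It is not the implementation used in this port, which is not shown here; it pairs a mutex with a condition variable and a generation counter, and the simple_barrier_* names are hypothetical. The generation counter is what makes the barrier safe to reuse: a thread that wakes up spuriously, or that races ahead into the next barrier, cannot be confused with the previous round.

```c
#include <pthread.h>

/* A reusable barrier built from a mutex and a condition variable. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  all_arrived;
    unsigned        expected;    /* number of threads that must arrive */
    unsigned        waiting;     /* threads that have arrived this round */
    unsigned long   generation;  /* incremented each time the barrier opens */
} simple_barrier_t;

/* Returns 0 on success, -1 for a zero count, or a pthread error code. */
int simple_barrier_init(simple_barrier_t *bar, unsigned expected)
{
    int rc;

    if (expected == 0)
        return -1;
    bar->expected = expected;
    bar->waiting = 0;
    bar->generation = 0;

    rc = pthread_mutex_init(&bar->lock, NULL);
    if (rc != 0)
        return rc;
    rc = pthread_cond_init(&bar->all_arrived, NULL);
    if (rc != 0)
        pthread_mutex_destroy(&bar->lock);
    return rc;
}

/* Block until `expected` threads have called this function. */
void simple_barrier_wait(simple_barrier_t *bar)
{
    pthread_mutex_lock(&bar->lock);

    unsigned long my_generation = bar->generation;

    if (++bar->waiting == bar->expected) {
        /* Last thread to arrive: reset the count and release everyone. */
        bar->waiting = 0;
        bar->generation++;
        pthread_cond_broadcast(&bar->all_arrived);
    } else {
        /* Loop guards against spurious wakeups. */
        while (my_generation == bar->generation)
            pthread_cond_wait(&bar->all_arrived, &bar->lock);
    }

    pthread_mutex_unlock(&bar->lock);
}

void simple_barrier_destroy(simple_barrier_t *bar)
{
    pthread_cond_destroy(&bar->all_arrived);
    pthread_mutex_destroy(&bar->lock);
}
```

Each worker would call simple_barrier_wait() at the end of a parallel section, with the barrier initialized once to the number of participating threads.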

Finally, one thread was designated as the "master" thread. This thread was responsible for creating the pthreads, handling any non-parallel computation, and for interpreting / displaying the final benchmark results.
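As an illustration of the result-interpretation step, the sketch below shows how the master thread might turn a measured time into a bandwidth figure once all workers have finished. The helper names are hypothetical. The byte counting follows the usual Stream convention, in which Copy and Scale touch two arrays per element, and Add and Triad touch three.

```c
#include <stddef.h>
#include <stdio.h>

/* Convert one kernel's elapsed time into MB/s (1 MB = 10^6 bytes).
 * arrays_touched: 2 for Copy and Scale, 3 for Add and Triad.
 * n:              number of elements per array.
 * seconds:        measured wall-clock time for the kernel. */
double stream_bandwidth_mbps(size_t arrays_touched, size_t n, double seconds)
{
    if (seconds <= 0.0)
        return 0.0;   /* avoid dividing by a zero or negative time */

    double bytes = (double)arrays_touched * (double)sizeof(double) * (double)n;
    return bytes / seconds / 1.0e6;
}

/* Print one result line; intended to be called by the master thread only. */
void report_kernel(const char *label, size_t arrays_touched,
                   size_t n, double seconds)
{
    printf("%-8s %12.1f MB/s  (%.6f s)\n",
           label, stream_bandwidth_mbps(arrays_touched, n, seconds), seconds);
}
```

For example, report_kernel("Copy:", 2, N, elapsed) would print the Copy bandwidth for an array of N doubles, where elapsed is a time the master thread measured around the parallel section.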

The developer can move on to the next section after the following:
- The POSIX* thread implementation has been completed
- The developer has verified the implementation
- The aforementioned NDK compilation is successful

After Compiling the Native Stream Code

Now the typical process of calling the native (compiled) code from the wrapper (Java*-based) Android* package can be used. In this case, the use of JNI was no more difficult than a basic "Hello World" example.

Stream's entry method is simply modified so that it has a typical JNI signature, as follows:



Figure 5.1: Modified Stream Entry Method

This header assumes "NativeCaler.java" was added to an Eclipse* Android* project, where the source file is used to load the Stream library via a System.loadLibrary() or System.load() call before invoking the native method. Of course, the developer may choose different names. Note also that in this simple example none of the method parameters are used, but the developer can choose otherwise depending on the application.
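For readers unfamiliar with JNI naming, a native entry point generally takes the shape sketched below. This is not the original Stream header: the package com.example.stream, the method name runStream, and the helper stream_main are all hypothetical, and only the class name NativeCaler comes from the text above. The stand-in typedefs exist solely so the sketch compiles on a host without the NDK; a real build would always use <jni.h>.

```c
#include <stddef.h>

/* Use the real JNI header when available (as in an NDK build). */
#if defined(__has_include)
#  if __has_include(<jni.h>)
#    include <jni.h>
#    define HAVE_JNI_H 1
#  endif
#endif

#ifndef HAVE_JNI_H
/* Host-only stand-ins (assumption for illustration, NOT the real JNI types). */
typedef const void *JNIEnv;
typedef void       *jobject;
typedef int         jint;
#  define JNIEXPORT
#  define JNICALL
#endif

/* Placeholder for the ported benchmark body (hypothetical name). */
static int stream_main(void)
{
    return 0;   /* 0 = benchmark completed */
}

/* JNI entry point. The symbol name encodes the Java package, class, and
 * method: Java_<package>_<class>_<method>, with dots replaced by
 * underscores. Neither parameter is used in this simple case. */
JNIEXPORT jint JNICALL
Java_com_example_stream_NativeCaler_runStream(JNIEnv *env, jobject thiz)
{
    (void)env;
    (void)thiz;
    return (jint)stream_main();
}
```

On the Java side, under the same assumptions, NativeCaler would declare a matching instance method such as `public native int runStream();` and load the library with `System.loadLibrary("stream")` so that libstream.so is found.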

Summary

This article provided a high-level overview of how to ensure that the Stream application builds and links with multithreading support when it is used as part of an Android* package. It discussed the process of porting the code to use POSIX* threads rather than OpenMP* when building and linking with the Android* NDK.
