Intel Intrinsics Performance Degrade on Android Framework

I am developing an Android x86 based framework for the Intel Atom processor. I have implemented the entire framework, but I am having problems with the SIMD implementation of my code. When I run the basic C code, the performance is reasonable and about the same on the emulator and on the hardware. However, when I enable the intrinsics path, there is no gain at all; there is actually a negligible loss in performance. When I run the same code on an Intel i7 processor, I see a gain of approximately 200%. I do take into account the difference in clock frequency and number of cores between a PC and a tablet, but there should still be some gain when I enable the SIMD code on the Android framework.
Possible causes I have considered so far:
1) Local C flags (can anyone suggest suitable C flags for the Intel Atom processor?).
2) Is it advisable to use a .so file instead of the source code in the framework?
3) A suitable NDK for Intel Atom; I am using 4.8.
4) Whether the optimization level should be set to O2 or O3.
If there are any other factors that may hinder performance, please let me know.
Thank you in advance.


If you can post some code snippets, it would be easier to see what is happening.

  1. -mfpmath=sse -Ofast -flto -mtune=atom
  2. I don't understand the question here. Can you explain what you mean?
  3. GCC 4.8 should be fine. Newer GCC versions usually provide better performance.
  4. Most of the time O3 gives better performance, but that is not guaranteed, and the binary is usually larger. If you have the time, test both O2 and O3 on your particular application and decide based on the results. The options proposed in 1) include -Ofast, which includes -O3.
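
To confirm that the flags from 1) actually reach the compiler for the file that holds the intrinsics, something like the following could help. It is only a sketch: the function names are made up for illustration, and it relies on the feature macros that GCC and Clang predefine (`__SSE2__`, `__SSE3__`, `__SSSE3__`, `__SSE4_1__`, `__OPTIMIZE__`), so it reports what the compiler was allowed to use, not what the CPU supports.

```c
#include <assert.h>
#include <stdio.h>

/* Bit flags for the x86 SIMD levels enabled at compile time (hypothetical names). */
enum { HAS_SSE2 = 1, HAS_SSE3 = 2, HAS_SSSE3 = 4, HAS_SSE41 = 8 };

/* Collects the SIMD feature macros predefined by GCC/Clang for this build. */
unsigned compiled_simd_mask(void)
{
    unsigned mask = 0;
#ifdef __SSE2__
    mask |= HAS_SSE2;
#endif
#ifdef __SSE3__
    mask |= HAS_SSE3;
#endif
#ifdef __SSSE3__
    mask |= HAS_SSSE3;
#endif
#ifdef __SSE4_1__
    mask |= HAS_SSE41;
#endif
    return mask;
}

/* Returns 1 when this file was compiled with optimization turned on. */
int compiled_with_optimization(void)
{
#ifdef __OPTIMIZE__
    return 1;
#else
    return 0;
#endif
}

/* Writes a one-line build summary into buf; returns snprintf's result. */
int describe_build(char *buf, size_t len)
{
    unsigned m = compiled_simd_mask();
    assert(buf != NULL && len > 0);
    return snprintf(buf, len,
                    "optimized=%d sse2=%d sse3=%d ssse3=%d sse4.1=%d",
                    compiled_with_optimization(),
                    (m & HAS_SSE2) != 0, (m & HAS_SSE3) != 0,
                    (m & HAS_SSSE3) != 0, (m & HAS_SSE41) != 0);
}
```

On Android the resulting string would typically be sent to logcat with `__android_log_print` from `<android/log.h>`. If it shows `optimized=0`, or the SIMD level you expect is missing, the flags are probably not being applied to that module.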

Another possible reason for the performance is that some Atom processors have an in-order execution pipeline, and therefore the order of instructions is extremely important for getting the best performance. If this is the issue, the -mtune=atom parameter should fix it.

Getting profiling data would be the best way to understand what is going on.
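
Before setting up a full profiler, a rough timing comparison of one kernel in both forms can already show whether the intrinsics path is faster on the device. The sketch below is an illustration, not your code: `add_scalar`, `add_simd` and `time_kernel` are hypothetical names, the kernel is a trivial float add, and `clock()` measures CPU time rather than wall-clock time. Keep in mind that at -O3 or -Ofast GCC may auto-vectorize the plain C loop, which can shrink the difference between the two versions.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Plain C reference: out[i] = a[i] + b[i]. */
void add_scalar(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

/* Intrinsics version; only the scalar tail loop runs if SSE is not enabled. */
void add_simd(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
#if defined(__SSE__)
    /* Process four floats per iteration with unaligned loads and stores. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    /* Remaining elements (or all of them without SSE). */
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}

typedef void (*add_fn)(const float *, const float *, float *, size_t);

/* Returns the CPU seconds spent on `reps` calls of fn. */
double time_kernel(add_fn fn, const float *a, const float *b,
                   float *out, size_t n, int reps)
{
    clock_t start, stop;
    assert(fn != NULL && reps > 0);
    start = clock();
    for (int r = 0; r < reps; ++r)
        fn(a, b, out, n);
    stop = clock();
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling `time_kernel(add_scalar, ...)` and `time_kernel(add_simd, ...)` with the same buffers, a buffer size close to your real workload, and enough repetitions to run for a second or so gives two numbers to compare on the tablet and on the PC.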


Hi Alexander,
Thanks for your reply. I am making changes according to your suggestions; I will check and let you know. My question in point 2 is this: I have 25 source files and 30 include files in my solution. Would it be better to build them into a single .so (shared library) or .a (static library) file for any performance gain? At present, I am using all the source and include files directly in the jni folder. Also, I am currently using Android NDK r9d and previously used Android NDK r9c. Do you think that using a newer NDK version could degrade performance or affect the code in any way?

Also, I would like to know whether I should use the Intel C++ Compiler. I am using Eclipse to generate the APK for my solution.
