This chapter examines mobile GPUs and how to best optimize for GPUs included with Intel® Atom™ processors.
2. GPU Evolution
It’s important to understand how mobile GPUs are evolving. Originally, mobile devices used software rasterization to generate the final image on the screen. However, the CPU was overtaxed performing rasterization on top of all the other tasks involved in a modern OS. In Android, the CPU performs numerous background tasks, handles I/O, and runs applications. Combine this with the many graphical animations and effects Android uses to convey intuitive UI response, and the need for a GPU is apparent.
The Intel Atom processor is designed to maximize performance and battery life on a mobile device. Just as the Intel Atom processor’s CPU is specially designed for mobile devices, so is its GPU. A lot of the initial GPU design came from OpenGL* ES. The next chapter explores OpenGL ES and how the API allows developers to write graphical applications on Android.
OpenGL was designed to be run on a desktop. It supports numerous features that are not necessary on a mobile device, such as scientifically accurate operations and high-bit precision. To support full OpenGL on a device would require a lot of extra hardware. That extra hardware takes precious space and power. Enter OpenGL ES. OpenGL ES was designed for mobile devices and removes a lot of the OpenGL features not needed for mobile devices.
OpenGL ES 1.1 only supports fixed function pipelines. As such, GPU designers targeting OpenGL ES 1.1 can create very simple special-purpose GPUs. As mobile devices evolved, this limitation held back modern graphical techniques. OpenGL ES was extended to support a programmable pipeline (that is, shaders) with OpenGL ES 2.0. This evolution enabled the much more complex visuals found in modern 3D games. However, it also required GPUs to become much more complex.
OpenGL ES 2.0 is the predominant graphics API. But the story doesn’t end here. OpenGL ES 3.0 is on the horizon and with it comes the possibility of even more complex and visually impressive techniques.
3. Two Major Mobile GPU Designs
There are currently two major designs being used by mobile GPUs: deferred and immediate. Deferred mode GPUs wait until all commands for an individual frame are submitted before processing the work. An immediate mode GPU starts working on commands as soon as they are ready. Table 1 shows major GPU vendors and their GPU design type.
|GPU||Design Type|
|Intel® HD Graphics||Immediate|
|Imagination Technologies PowerVR*||Deferred|
Table 1: Major GPU Vendors and Their GPU Design Types. (Please Note: Intel HD Graphics is part of Intel® Core™ processors. Current Intel® Atom™ processors use PowerVR-based GPUs.)
Source: Intel Corporation, 2012
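The difference between the two designs can be illustrated with a toy model. The sketch below is not how any real GPU or driver is implemented; the class names `SubmissionModel`, `ImmediateGpu`, and `DeferredGpu` are hypothetical, and commands are represented as plain strings. It only shows when work begins in each design.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only (hypothetical names): real GPUs and drivers are far more involved. */
final class SubmissionModel {

    /** Immediate mode: each command is processed as soon as it is submitted. */
    static final class ImmediateGpu {
        final List<String> processed = new ArrayList<>();

        void submit(String command) {
            processed.add(command); // work starts right away
        }

        void endFrame() {
            // Nothing left to do: every command was processed on submission.
        }
    }

    /** Deferred mode: commands are collected until the whole frame is known. */
    static final class DeferredGpu {
        final List<String> pending = new ArrayList<>();
        final List<String> processed = new ArrayList<>();

        void submit(String command) {
            pending.add(command); // only recorded; no rendering work yet
        }

        void endFrame() {
            // The complete frame is available here, so a real deferred GPU
            // could organize the work (for example, into tiles) before
            // processing it. This model simply processes it in order.
            processed.addAll(pending);
            pending.clear();
        }
    }
}
```

In this model, a command submitted to `DeferredGpu` does not appear in `processed` until `endFrame()` is called, while `ImmediateGpu` records it at once.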
3.1. Advantages of Deferred Mode GPUs
A deferred mode GPU has several major advantages. First, data sent to the GPU can be better organized. This can significantly decrease memory bandwidth. Memory access is a major consumer of power, so limiting memory bandwidth results in significant power savings.
Since all the data needed to render the frame is known, the work can easily be divided into smaller chunks. PowerVR is actually referred to as a tile-based deferred renderer. This is because the GPU collects all the commands/data to generate a frame and then divides the work into small tiles. A tile is just a square collection of pixels. This collection of pixels is designed to fit into very high-speed caches. If you want to learn more about all the advantages of PowerVR’s design, please refer to their documentation: http://www.imgtec.com/powervr/insider/docs/POWERVR%20Series5%20Graphics.SGX%20architecture%20guide%20for%20developers.1.0.8.External.pdf.
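The tiling described above comes down to simple arithmetic. The following sketch assumes a 16-pixel square tile purely for illustration; the actual tile size depends on the hardware and should be taken from the vendor documentation. The class `TileMath` and its methods are hypothetical helpers, not part of any API.

```java
/** Hypothetical helper. The 16-pixel tile size is an assumption for illustration. */
final class TileMath {
    static final int TILE_SIZE = 16; // assumed tile edge length in pixels

    /** Number of tiles needed to cover the given pixel extent along one axis (rounds up). */
    static int tilesAlong(int pixels) {
        return (pixels + TILE_SIZE - 1) / TILE_SIZE;
    }

    /** Total number of tiles covering a width x height render target. */
    static int tileCount(int width, int height) {
        return tilesAlong(width) * tilesAlong(height);
    }

    /**
     * Number of tiles overlapped by a bounding box with inclusive,
     * non-negative pixel corners (minX, minY) and (maxX, maxY). A primitive
     * with this bounding box would be assigned to at most this many tiles.
     */
    static int tilesTouched(int minX, int minY, int maxX, int maxY) {
        int cols = maxX / TILE_SIZE - minX / TILE_SIZE + 1;
        int rows = maxY / TILE_SIZE - minY / TILE_SIZE + 1;
        return cols * rows;
    }
}
```

Under this assumption, an 800x480 screen is covered by `TileMath.tileCount(800, 480)`, which is 50 x 30 = 1500 tiles, and a small primitive that straddles a tile corner, such as one with the bounding box (15, 15) to (16, 16), touches 4 tiles.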
Deferred rendering does have several limitations. The internal memory and cache on the GPU can only be so big. If a frame has too much data to be rendered, the work needs to be divided among multiple rendering passes. This redundancy results in a lot of overhead and wasted operations. A lot of the optimization tricks for deferred mode GPUs involve avoiding these “glass jaws.”
3.2. A Note About Defining “Deferred”
The term deferred is heavily overloaded in computer graphics. There are many terms like deferred mode GPUs, deferred renderers, deferred shading, deferred lighting, and deferred rasterization. To make it even more challenging, the definitions are not consistent and must be taken in context. Don’t confuse a deferred mode GPU with deferred rendering techniques. In general, deferred rendering techniques refer to accumulating scene data into a G-buffer and applying lighting/shading in screen space.
3.3. Advantages of Immediate Mode GPUs
Immediate mode GPUs have been the predominant desktop design for decades. A lot of rendering techniques, tricks, and optimizations have been designed around immediate mode rendering. As such, immediate mode GPUs have become very complex and capable.
Since an immediate mode GPU starts processing commands as soon as they are ready, simple tasks can be completed more quickly and efficiently on immediate mode GPUs. Furthermore, they don’t run into as many “glass jaws” based on the amount of data passed to the GPU.
However, the years of design for immediate mode GPUs targeted desktops with massive power supplies. This has resulted in designs that maximized performance at the cost of power. This is why deferred mode renderers have dominated the mobile market. But research and development in immediate mode GPUs have been driving down power utilization rapidly; this can be seen in Intel HD Graphics included in Intel Core processors.
4. Optimizing for Intel GPUs
As shown in Table 2, Intel Atom processors designed for Android use PowerVR GPUs.
|Intel Atom Processor Series||GPU Core|
|Z24XX||PowerVR SGX 540|
|Z2580||PowerVR SGX 544MP2|
Table 2: Intel Atom Processors Designed for Android Use and Their GPUs. (More details about individual processors can be found at ark.intel.com.)
Source: Intel Corporation, 2012
It’s important to refer to Imagination Technologies’ documentation on PowerVR. General optimization tips and tricks provided by Imagination Technologies are just as important on Intel platforms as other platforms with PowerVR.
To get a good understanding of the PowerVR hardware, please review Imagination Technologies’ architecture guide: http://www.imgtec.com/powervr/insider/docs/POWERVR%20Series5%20Graphics.SGX%20architecture%20guide%20for%20developers.1.0.8.External.pdf.
To get a good understanding of how to optimize for PowerVR hardware, please review Imagination Technologies’ developer recommendations: http://www.imgtec.com/powervr/insider/docs/POWERVR%20SGX.OpenGL%20ES%202.0%20Application%20Development%20Recommendations.1.8f.External.pdf. Pay close attention to the “golden rules.”
It’s important to use good texture compression on mobile devices. Proper texture compression will decrease download sizes, improve visual quality, increase performance, and decrease impact on memory bandwidth. However, this is one of the biggest challenges on Android. Since Android supports a wide range of hardware, there isn’t one texture format that runs well on all devices. OpenGL ES only requires that the hardware supports ETC texture compression. Sadly, this format doesn’t support an alpha channel. So developers are forced to support multiple texture formats. For PowerVR, developers should use PVRTC to compress all textures.
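One way to handle multiple texture formats is to inspect the extension string reported by the driver and pick a format at load time. The sketch below is a minimal illustration under stated assumptions: the class `TextureFormatChooser`, its `Format` enum, and the fallback order are hypothetical choices, not a recommended policy. The two extension names are the standard identifiers for PVRTC and ETC1 support. On Android, the extension string would typically come from `GLES20.glGetString(GLES20.GL_EXTENSIONS)`, called on the rendering thread with a current context; here it is passed in as a parameter so the logic stays self-contained.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper that picks a texture format from a GL extension string. */
final class TextureFormatChooser {
    enum Format { PVRTC, ETC1, UNCOMPRESSED }

    /**
     * @param glExtensions space-separated extension list, for example the
     *                     value returned for GL_EXTENSIONS (may be null)
     * @param needsAlpha   whether the texture requires an alpha channel
     */
    static Format choose(String glExtensions, boolean needsAlpha) {
        String list = (glExtensions == null) ? "" : glExtensions.trim();
        Set<String> ext = new HashSet<>(Arrays.asList(list.split("\\s+")));

        // Prefer PVRTC when the driver reports it (as PowerVR hardware does).
        if (ext.contains("GL_IMG_texture_compression_pvrtc")) {
            return Format.PVRTC;
        }
        // ETC1 has no alpha channel, so only use it for opaque textures.
        if (!needsAlpha && ext.contains("GL_OES_compressed_ETC1_RGB8_texture")) {
            return Format.ETC1;
        }
        // Assumed fallback for this sketch: load the texture uncompressed.
        return Format.UNCOMPRESSED;
    }
}
```

An application would call `TextureFormatChooser.choose(extensions, true)` for textures with transparency and load the matching asset variant. Matching whole extension names, rather than searching for a substring, avoids false positives when one extension name is a prefix of another.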
This chapter provided a brief overview of mobile GPUs and how to best optimize for GPUs included with Intel Atom processors. The next chapter will take a closer look at the API used to drive the GPU, OpenGL ES.