The netbook market is growing steadily and creating new opportunities for game developers on this mobile platform. According to a 2009 report from the NPD Group,¹ nearly 40 million netbooks have shipped so far and around 139 million are projected to be shipped by 2013. Another report from ABI Research stated that approximately 7 percent of online users are also netbook owners, and this number is likely to increase over time.
Whether you are developing a new game, or have an existing game that you want to port to the netbook platform, the tips and tools described in this article for optimizing the CPU and GPU on Intel® Atom™ processor-based netbooks will help increase your chances for success.
The best way to show you how to optimize your game for netbooks is to describe what we did to create our Fireflies demo. It’s a great example of the easy optimizations and quick performance gains you can achieve when developing games for this fast-growing market.
Optimizing for the Intel® Atom™ Processor
Let’s first take a look at the Intel Atom processor and see how it can help maximize the performance of the games you’re developing for the netbook platform.
The Intel Atom processor is an in-order processor, unlike the out-of-order processors that Intel ships for the high-end desktop and notebook markets. As instructions enter an out-of-order processor, the processor can re-order them, if necessary, to cover up latency and extract instruction-level parallelism. If an instruction stalls because it needs to fetch data from memory, the processor can automatically fill that slot with other instructions that are ready, making the stall less severe.
With the Intel Atom processor’s in-order instruction scheduler, re-ordering the instruction stream at the processor level is not possible. However, you can use the Intel® C++ Compiler to minimize the processor’s sensitivity to dependency stalls and help achieve the maximum frames per second (fps) for your application.
The Intel C++ Compiler includes optimizations that specifically target the Intel Atom processor. The processor trades out-of-order execution for longer battery life, and the compiler compensates by going through the code and re-organizing the instructions to remove latencies and stalls. In our testing of the Fireflies demo, we set a few flags on the Intel C++ Compiler, recompiled the code, and without much effort achieved a 1.2x to 1.3x speed-up.
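To make the idea of a dependency stall concrete, here is a minimal hand-written sketch. It is not taken from the Fireflies demo, and it is not a description of what the Intel C++ Compiler emits; the function names are hypothetical. The first loop carries one long dependency chain, while the second splits the work across four independent accumulators so that an in-order core has other adds to issue while one is still in flight.

```cpp
#include <cassert>
#include <cstddef>

// Single accumulator: every add depends on the result of the previous one,
// so an in-order core has nothing else to issue while it waits.
float SumSerial(const float* values, std::size_t count) {
    assert(values != nullptr || count == 0);
    float total = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

// Four independent accumulators: consecutive adds do not depend on each
// other, which shortens the dependency chain by a factor of four.
float SumInterleaved(const float* values, std::size_t count) {
    assert(values != nullptr || count == 0);
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    std::size_t i = 0;
    for (; i + 4 <= count; i += 4) {
        s0 += values[i];
        s1 += values[i + 1];
        s2 += values[i + 2];
        s3 += values[i + 3];
    }
    float total = (s0 + s1) + (s2 + s3);
    for (; i < count; ++i) {
        total += values[i];  // leftover elements when count is not a multiple of 4
    }
    return total;
}
```

Because the second version adds the values in a different order, its result can differ from the first in the last bits of floating-point rounding. That is also why compilers generally leave floating-point sums in source order unless told they may reassociate them. Measure both versions on the target netbook before committing to either.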
You can also improve performance by taking advantage of Streaming SIMD Extensions 3 (SSE3), a standardized instruction set that the Intel Atom processor supports. Because SSE3 is so widely supported, code that uses it will run on just about any current x86 processor. The Intel Atom processor also supports Supplemental SSE3 (SSSE3). These extensions are especially interesting for game developers because their horizontal operations, such as the horizontal add introduced with SSE3, simplify dot products and other common floating-point vector math. (A dedicated dot product instruction did not arrive until SSE4.1, which the Intel Atom processors discussed here do not support.) On our Fireflies demo, we implemented SSE using the XNA* Math Library that comes with DirectX*, resulting in about a 1.1x to 1.3x speed-up from the hand-coded SSE alone.
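As an illustration of hand-coded SSE, the sketch below computes a four-component dot product with raw compiler intrinsics. It is not the XNA Math code used in the demo; the function name `Dot4` is hypothetical, and the example deliberately sticks to baseline SSE intrinsics from `<xmmintrin.h>` so that it builds without extra compiler flags. On a compiler configured for SSE3, the shuffle-and-add steps could be replaced with `_mm_hadd_ps`.

```cpp
#include <cassert>
#include <xmmintrin.h>  // baseline SSE intrinsics

// Dot product of two 4-float vectors. Unaligned loads are used so callers
// do not need 16-byte-aligned arrays.
float Dot4(const float* a, const float* b) {
    assert(a != nullptr && b != nullptr);
    const __m128 va = _mm_loadu_ps(a);
    const __m128 vb = _mm_loadu_ps(b);
    const __m128 prod = _mm_mul_ps(va, vb);  // (p0, p1, p2, p3)

    // Swap neighbours: (p1, p0, p3, p2), then add to get pairwise sums.
    __m128 shuf = _mm_shuffle_ps(prod, prod, _MM_SHUFFLE(2, 3, 0, 1));
    __m128 sums = _mm_add_ps(prod, shuf);    // (p0+p1, p0+p1, p2+p3, p2+p3)

    // Move the upper pair sum into the low lane and add it to the lower one.
    shuf = _mm_movehl_ps(shuf, sums);        // low lane = p2+p3
    sums = _mm_add_ss(sums, shuf);           // low lane = p0+p1+p2+p3
    return _mm_cvtss_f32(sums);
}
```

Calling `Dot4` on the arrays `{1, 2, 3, 4}` and `{5, 6, 7, 8}` returns 70. In real game code the bigger win usually comes from processing many vectors per call, so the loads and the final horizontal step are amortized over more work.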
Hyper-threading is also very important. As we mentioned, on an in-order processor it is more likely for the processing and execution stream to stall because of latencies, such as having to go to main memory or fetch something from a device. Hyper-threading alleviates these types of bottlenecks by letting two threads run on a single core, allowing the core to actively swap between threads as they’re needed. If one thread gets stalled, another thread can step up and use the extra available execution units.
¹ Source: A Closer Look at Netbooks, NPD Group (June 2009).
Optimizing for Netbook Chipsets
Three Intel® Graphics Media Accelerator (Intel® GMA) chipsets—Intel GMA 945, Intel GMA 3150, and Intel GMA 500—play an important part in optimizing performance for games that support Intel Atom processor-based netbooks. With the new N400 series Intel Atom processors, there’s even more integration of the memory controller and the graphics controller. And all this technology is wrapped up in a processor that is about as small as a grain of rice (Figure 1).
The Intel GMA 945 and the Intel GMA 3150 are very similar, with the same characteristics and roughly the same throughput. All three of the chipsets use DirectX 9.0c hardware acceleration, so if you are using the DirectX API, remember that you are limited to DirectX 9.0 at this time. One important distinction to remember is that the CPU performs vertex processing on the Intel GMA 945 and the Intel GMA 3150 chipsets; no hardware support for vertex processing is provided on these two chipsets.
In contrast to the Intel GMA 945 and Intel GMA 3150, the Intel GMA 500 does have hardware support for vertex processing. The Intel GMA 500 is also a DirectX 9.0c-compatible chipset, so considerations for getting the best performance are about the same. A good strategy is to validate and test on the Intel GMA 945 or Intel GMA 3150 part, but make sure you also validate and test on a GMA 500.
You can see a screenshot of the Fireflies demo in Figure 2. During the testing of our demo, we found it very important to run the application in full-screen mode. Taking up the full screen minimizes the amount of context switching in Microsoft Windows* 7, which definitely improves performance, and it allows the chipset to focus fully on your workload.
Balancing the work on the CPU and the GPU is important, and using the Intel® Graphics Performance Analyzers (Intel® GPA) helped us do this. When running a heavy graphics workload, you’ll want to take hyper-threading into account. We wanted to support multiple threads, so we used Intel® Threading Building Blocks to add threading into our demo quickly.
We didn’t see much performance increase from threading when running our heavy graphics and heavy artificial intelligence (AI) calculations together. That’s because the Intel GMA 945 and Intel GMA 3150 chipsets rely on the CPU for vertex processing, and the Intel Atom processor also handles the graphics driver and extra work on the front end of the graphics pipeline. A lot of work is occurring beyond the AI calculations or game simulation, and it keeps utilization of the processor core very high.
Threading can be advantageous when loading assets or in other scenes where the graphics workload is not as high, but this requires carefully checking the threading and performance. Threading can incur performance penalties if you throw a lot of threads at the calculations while performing a heavy graphics workload, because it takes time to swap context from one thread to the next.
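One way to apply this to asset loading is to hand the whole load to a single background task and let the main thread keep drawing a loading screen. The sketch below uses standard C++ `std::async` as a stand-in rather than Intel Threading Building Blocks, which is what the demo used. The `Asset` type, `LoadAsset`, and `BeginLoading` are hypothetical names, and the loader body is a placeholder for real file I/O.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical asset record; a real one would hold decoded texture or mesh data.
struct Asset {
    std::string name;
    std::size_t byteCount;
};

// Placeholder loader: stands in for reading and decompressing a file.
Asset LoadAsset(const std::string& name) {
    assert(!name.empty());
    return Asset{name, name.size() * 1024};
}

// Starts ONE background task that loads every asset in order. Using a single
// worker avoids oversubscribing a core that exposes only two hardware threads.
std::future<std::vector<Asset>> BeginLoading(std::vector<std::string> names) {
    return std::async(std::launch::async, [names = std::move(names)]() {
        std::vector<Asset> loaded;
        loaded.reserve(names.size());
        for (const std::string& name : names) {
            loaded.push_back(LoadAsset(name));
        }
        return loaded;
    });
}
```

The caller keeps the returned future, renders its loading screen, and calls `get()` on the future once it needs the assets. Whether this helps on a given netbook depends on how busy the core already is with driver and vertex work, so profile before and after.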
Tips and Tools
When we used the Intel GMA 945 and Intel GMA 3150 to optimize and get the best fps out of our Fireflies demo, we found the following tips helpful:
- Compress your textures
- Minimize multiple passes
- Minimize post processing
- Decrease the amount of data pushed to the chipset
- Use standard tricks such as index buffers and triangle strips to bring down the vertex count that hits the chipset
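The index buffer tip can be sketched as follows. This is an illustrative helper, not code from the demo: `Vertex`, `IndexedMesh`, and `BuildIndexedMesh` are hypothetical names, and vertices are treated as shared only when their positions match exactly. A real mesh would also compare normals, texture coordinates, and any other attributes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

struct Vertex {
    float x, y, z;
};

struct IndexedMesh {
    std::vector<Vertex> vertices;        // unique vertices only
    std::vector<std::uint16_t> indices;  // three entries per triangle
};

// Converts a plain triangle list (three vertices per triangle, with shared
// corners repeated) into unique vertices plus 16-bit indices.
IndexedMesh BuildIndexedMesh(const std::vector<Vertex>& triangleList) {
    assert(triangleList.size() % 3 == 0);
    IndexedMesh mesh;
    std::map<std::tuple<float, float, float>, std::uint16_t> seen;
    for (const Vertex& v : triangleList) {
        const auto key = std::make_tuple(v.x, v.y, v.z);
        auto found = seen.find(key);
        if (found == seen.end()) {
            assert(mesh.vertices.size() < 65536);  // must fit a 16-bit index
            const auto index = static_cast<std::uint16_t>(mesh.vertices.size());
            found = seen.emplace(key, index).first;
            mesh.vertices.push_back(v);
        }
        mesh.indices.push_back(found->second);
    }
    return mesh;
}
```

For a quad built from two triangles, the six input vertices collapse to four unique ones plus six small indices. On chipsets where the CPU transforms every vertex, each vertex removed this way is work the processor no longer has to do.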
When developing for the netbook platform, the following tips can help address the smaller screen size:
- Position elements on the screen relative to where they belong. This is especially important when positioning the heads-up display. Keep on the screen only those items that are necessary for gameplay. It is also a good idea to scale the assets based on resolution or screen size. You are most likely already familiar with some of these concepts, but when targeting netbooks these considerations are more important than when you have higher resolutions and bigger screens.
- Use expressive icons. Instead of using an icon that has both a symbol and text, consider dropping the text. Keep the icon if it’s expressive enough for the user to know what it is used for and maybe add the text to the tool-tip for that button.
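The positioning and scaling advice above might look like this in code. It is a minimal sketch under stated assumptions: the art is assumed to be authored for a 768-pixel-high screen (the constant `kReferenceHeight` is hypothetical), elements are anchored to a screen corner, and all names are illustrative rather than part of any Intel or DirectX API.

```cpp
#include <cassert>

struct Rect {
    int x, y, width, height;
};

enum class Anchor { TopLeft, TopRight, BottomLeft, BottomRight };

// Assumed screen height, in pixels, that the HUD art was authored for.
const int kReferenceHeight = 768;

// Places a HUD element relative to a screen corner and scales its size and
// margin by the ratio of the actual screen height to the reference height.
Rect PlaceHudElement(Anchor anchor, int authoredWidth, int authoredHeight,
                     int authoredMargin, int screenWidth, int screenHeight) {
    assert(screenWidth > 0 && screenHeight > 0);
    const int width = authoredWidth * screenHeight / kReferenceHeight;
    const int height = authoredHeight * screenHeight / kReferenceHeight;
    const int margin = authoredMargin * screenHeight / kReferenceHeight;

    const bool right = (anchor == Anchor::TopRight || anchor == Anchor::BottomRight);
    const bool bottom = (anchor == Anchor::BottomLeft || anchor == Anchor::BottomRight);

    Rect rect;
    rect.width = width;
    rect.height = height;
    rect.x = right ? screenWidth - width - margin : margin;
    rect.y = bottom ? screenHeight - height - margin : margin;
    return rect;
}
```

On a 1024x600 netbook screen, a 128x64 element with a 16-pixel margin anchored to the bottom right shrinks to 100x50 and lands at (912, 538). Because the position is derived from the corner rather than hard-coded, the same call works unchanged at larger resolutions.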
It is very important for your game to pay attention to the netbook’s battery life. Because netbooks are mainly mobile devices, most of them have a longer battery life than the average laptop. Nevertheless, games typically drain the battery faster than other types of applications. When your game is running on a netbook, it needs to be aware of whether the netbook is plugged in, whether the lid has been closed, and whether the battery is running low.
The Intel® Laptop Gaming Technology Development Kit (TDK) can help deliver this type of functionality to your game by monitoring the netbook’s battery life, power source, and network connectivity and responding to these platform state changes. With just a few lines of code, your game can respond as needed.
If you need finer-grained control of power events and connectivity, you can use the Windows* power management functions to learn whether the device is going into suspend mode or hibernating, or if the battery has gone beyond a certain threshold.
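How a game reacts to these platform state changes is a design decision, and the sketch below shows one possible policy. It does not use the Intel Laptop Gaming TDK interfaces, which are not reproduced here. `PowerState`, `GameAction`, and both thresholds are hypothetical; on Windows, the AC and battery fields could be filled from the `ACLineStatus` and `BatteryLifePercent` members returned by `GetSystemPowerStatus`.

```cpp
#include <cassert>

// Hypothetical snapshot of the platform state the game cares about.
struct PowerState {
    bool onAcPower;
    int batteryPercent;  // 0 to 100
    bool lidClosed;
};

enum class GameAction { RunNormally, ReduceQuality, SaveAndPause };

// Assumed thresholds; tune them for your own game.
const int kLowBatteryPercent = 20;
const int kCriticalBatteryPercent = 5;

// Maps the current platform state to what the game should do next.
GameAction ChooseAction(const PowerState& state) {
    assert(state.batteryPercent >= 0 && state.batteryPercent <= 100);
    if (state.lidClosed) {
        return GameAction::SaveAndPause;   // the player has walked away
    }
    if (state.onAcPower) {
        return GameAction::RunNormally;    // battery level does not matter
    }
    if (state.batteryPercent <= kCriticalBatteryPercent) {
        return GameAction::SaveAndPause;   // protect the player's progress
    }
    if (state.batteryPercent <= kLowBatteryPercent) {
        return GameAction::ReduceQuality;  // e.g. cap the frame rate
    }
    return GameAction::RunNormally;
}
```

Keeping the policy in a pure function like this separates it from whichever API delivers the state, so the same logic can be driven by the TDK, by the Windows power management functions, or by a unit test.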
The Intel Laptop Gaming TDK also provides functionality for creating a Personal Area Network around your netbook. This enables ad-hoc gaming, allowing others to join your game without creating a full network infrastructure. The Windows Peer Collaboration API provides functionality for finding other people on the network, although it requires users to be connected to the same subnet or local area network.
Intel designed the Intel GPA tool set specifically for games. It comes with a System Analyzer (Figure 3) that gives you a high-level view of what is going on while the game is running and lets you perform various “what-if” experiments to diagnose performance bottlenecks. Frame Analyzer allows you to dig deeper into a frame to analyze specific rendering problems, such as whether you are spending too much time in specific pixel shaders. In the newest release, Intel GPA also added Platform View, so you can instrument the code, figure out the structure of the workload, and determine the execution behavior for your game.
The latest version of Frame Analyzer works with GMA 945 and GMA 3150; GMA 500 is not supported at this time.
Another tool that Intel provides is the Intel® Parallel Studio, used to find performance hotspots and help you further optimize your application.
We used many of these tools as we were working on the Fireflies demo, which made it much easier to optimize performance on netbooks. Don’t re-invent the wheel! When you are on a tight deadline, working hard to make your application as fun as possible, you shouldn’t have to worry about writing multiple lines of code to perform standard operations. Intel has devoted considerable resources to free you up to do what you do best—make great games. You have enough to worry about designing levels, tweaking the AI, and creating challenges for your users. By using the tips and tools mentioned here, you can cut your development time significantly, save yourself a lot of frustration, and spend your time more effectively.
ABOUT THE AUTHORS
Orion R. Granatir is a senior engineer with Intel’s Visual Computing Software Division. Prior to joining Intel in 2007, Orion worked on several PlayStation* 3 titles as a senior programmer with Insomniac Games. His most recent published titles are Resistance: Fall of Man*, and Ratchet and Clank Future: Tools of Destruction*.
Omar A. Rodriguez joined Intel in 2007 as a Software Engineer in the Visual Computing Software Division. Omar works on game demos that show off multi-core and other Intel® technologies. Omar graduated from Arizona State University with a B.S. in Computer Science. Omar is not the lead guitarist for the Mars Volta.
Dissecting the Typical Netbook
Netbooks are lightweight, portable PCs targeted for casual, on-the-go use. With an average of around 10 hours of battery life between charges, they have more than enough running time for the average gamer. Netbooks have smaller keyboards, and with a typical size of 10 inches at a resolution of 1024x600, their screens are smaller than what PC game developers are used to. Still, they offer a full Web experience, supporting Flash* and other plug-ins, and they handle high-definition video. Netbooks typically do not come with an optical drive, so they’re primed for digital distribution only; users won’t be installing your game from a disc.
Netbooks come with Wi-Fi* and Bluetooth* enabled, and in the future WiMAX* will be available as well. You can assume that a Webcam and a microphone will be mounted on the netbook’s lid. Both are great devices for chats, taunts, and the other types of communication that go on in networked gaming.
Sign up today for Intel® Visual Adrenaline magazine: http://va.softwaredispatch.intel.com/