Case Study: How Intel® GPA Measurements Alerted Me to Greatly Improve the FPS of my Windows* 8 Store App: The DispatcherTimer

Last year, I wrote a blog post about creating your own simple collision detection code, which I implemented for a children's math game I created. You can find that post here:

http://software.intel.com/en-us/blogs/2012/07/13/give-metro-ui-elements-space-writing-your-own-collision-detection-handler-in-c

I recently started working on my game again to improve it further. Just for kicks, I asked myself, “Hmm, I wonder how many frames per second my game renders.” Frames per second (FPS) is a general guideline for the “smoothness” of graphics. 60 FPS is usually considered smooth, which is why FPS is often capped at 60: anything higher usually isn't perceived as better. I was curious because tile motion in my game sometimes seemed to jitter a bit, so I suspected that the FPS was below 60.

I downloaded the Intel GPA* tool because I knew it has a nifty FPS HUD overlay. The tool also tells you whether your app is CPU bound, GPU bound, or (hopefully) balanced between CPU and GPU. I grabbed the tool from here:

http://software.intel.com/en-us/vcsource/tools/intel-gpa

 

Feel free to read up on the tool if you aren’t familiar with it. I installed it, then used its “analyze application” menu to analyze my already installed game, which was listed under the Windows Store apps on my system. I used the default Ctrl-F1 toggle to bring up four nice graphs in the HUD. Here’s a screenshot of what I observed:

 

Figure 1: Snapshot of my Game Running with Intel GPA Overlay (Game screenshot taken from Visual Studio 2012 Simulator, FPS Overlay taken from Intel GPA Tool)

While I observed my game, the FPS always seemed to be in the 30s, which explained the slight choppiness I was seeing. Further, although Figure 1 is only one snapshot, over time I found that the CPU usage would sometimes peak in the 40s while the GPU usage remained very low. This told me that my app was CPU bound. I confirmed this with some of Intel GPA’s nifty toggles, such as turning off certain textures: these changes had no effect on FPS.

Keep in mind that because my graphics aren’t too complex, I wasn’t using managed DirectX* for graphics, but simply moving children of a Canvas defined in XAML.

I considered whether my collision handling code was blocking the UI thread and thus causing the low frame rate, so I bypassed my collision detection algorithm, but saw no FPS improvement. Then, to take my algorithm out of the picture entirely, I created a simple demo Windows 8 app. All it does is draw a black background and move a white circle back and forth along a line:

Figure 2: A Simple Circle in Motion

Once again, the FPS didn’t improve! I was a bit surprised, since I had hoped that my algorithm or my use of pre-made images in a Canvas had something to do with the low FPS. Well, it didn’t. The sample app did, however, really help narrow the issue down to the following two questions:

a)      Are the Canvas.SetTop / Canvas.SetLeft calls really slow?

b)      I’m using a DispatcherTimer where on every tick, I draw…is that the problem?

 This led me to use the Stopwatch class.  Its usage is pretty simple and could look like the following:

private static System.Diagnostics.Stopwatch game_clock;

// Initialize the game clock and start it
game_clock = new System.Diagnostics.Stopwatch();
game_clock.Start();

// Sample the elapsed milliseconds thus far
double elapsed_milliseconds = game_clock.Elapsed.TotalMilliseconds;

Figure 3: Sample Stopwatch usage***

 

At this point I simply measured the number of milliseconds it took to call Canvas.SetTop once. Sure enough, the call was very fast: it completed within one ms. With (a) ruled out, it became apparent that my DispatcherTimer usage was the culprit.

The DispatcherTimer class has some great uses. In fact, I use it to refresh the game clock for the math game. From a power consumption standpoint, if you don't need more than 30 FPS rendering, it will save you power compared to other methods such as threading. When you are only counting seconds for a game clock, you have a big timing window to play with, and this timer API suffices. Check out this post, which explains this point in more detail:

http://software.intel.com/en-us/articles/writing-energy-efficient-windows-store-applications-for-mobile-devices-impact-of-graphical

In my game specifically, I saw sluggishness and thus wanted to increase the FPS. Using the Stopwatch as described above, I took a lot of measurements of the time between successive DispatcherTimer ticks. I found a problem: no matter what interval I set on the timer, in my case the ticks arrived, on average, 28 ms apart!

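Do you see the problem? If you want 60 FPS, then a frame should be drawn every 1000 / 60 ≈ 16.7 ms. At 28 ms per tick, the best I could get was 1000 / 28 ≈ 36 FPS, which matches the FPS in the 30s that Intel GPA reported. Clearly, 60 FPS couldn’t be reached with the timer, as it proved to be the bottleneck in my case.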
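The kind of measurement described above could be sketched roughly as follows. This is not the code from my game: TickIntervalMeter and its members are hypothetical names made up for illustration, and only Stopwatch comes from the .NET libraries. The idea is to record a timestamp on every tick, average the gaps between timestamps, and convert that average into the FPS it implies.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper (not from the original game): accumulates the time
// between successive ticks and reports the average interval and implied FPS.
public sealed class TickIntervalMeter
{
    private readonly Stopwatch clock = Stopwatch.StartNew();
    private double lastTickMs = -1;   // -1 means "no tick seen yet"
    private double totalIntervalMs;
    private int intervalCount;

    // Call once per tick, e.g. from the DispatcherTimer.Tick handler.
    public void RecordTick()
    {
        RecordTick(clock.Elapsed.TotalMilliseconds);
    }

    // Overload taking an explicit timestamp, handy for checking the math by hand.
    public void RecordTick(double nowMs)
    {
        if (lastTickMs >= 0)
        {
            totalIntervalMs += nowMs - lastTickMs;
            intervalCount++;
        }
        lastTickMs = nowMs;
    }

    // Mean gap between ticks, or 0 until at least two ticks have been seen.
    public double AverageIntervalMs
    {
        get { return intervalCount == 0 ? 0.0 : totalIntervalMs / intervalCount; }
    }

    // FPS implied by the average interval: 1000 ms divided by ms per frame.
    public double ImpliedFps
    {
        get { return intervalCount == 0 ? 0.0 : 1000.0 / AverageIntervalMs; }
    }
}
```

In a real app you would call RecordTick() with no arguments from the timer's Tick handler and read the two properties after a few seconds. Feeding it timestamps of 0, 28, and 56 ms by hand gives an average interval of 28 ms and an implied rate of roughly 35.7 FPS, well short of the 16.7 ms budget that 60 FPS requires.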

At this point I had to make a design decision. I didn’t want to jump into DirectX for such simple rendering, so I resorted to plan B: redesign the game loop. After some reading, I saw a recommendation to use the following:

CompositionTarget.Rendering += <your drawing event handler name>

Figure 4: An Alternative API***

 This fixed my problem!  When applying the change to my math game, I found that the FPS would stay consistently either at 60 or just under it.  This was an incredible improvement for me.

By the way, this thread is a great reference for explaining the dispatcher timer’s resolution limitation:

http://social.msdn.microsoft.com/Forums/en-US/silverlightcontrols/thread/5c3e6d25-eb76-419f-97c7-77d7f008de0b/

For my resolution (bad pun, ha) to the FPS problem, read up on the event handler here; this page explains the CompositionTarget.Rendering event:

http://msdn.microsoft.com/en-us/library/system.windows.media.compositiontarget.rendering.aspx

What we are doing is this: whenever a frame is about to be rendered, my code for updating the Canvas position of the UI elements is called. There’s no need for a dispatcher timer in this case, since it can only add delays. This post from Microsoft explains in more detail how to render per frame:
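A minimal sketch of such a per-frame update, modeled on the circle demo, might look like the following. BouncingPosition, the element name "circle", and the numbers used are all assumptions for illustration, not code from my game. One design choice worth noting: the position advances by the time elapsed since the previous frame rather than by a fixed step per call, so the circle moves at the same speed whether the app renders at 30 or 60 FPS. The hookup is shown as comments because it needs a XAML page to run. To my understanding, the Windows Store version of the Rendering event passes plain object arguments, so treat that handler signature as an assumption too.

```csharp
using System;

// Hypothetical model of the demo's circle: it moves along a line between
// minX and maxX at a fixed speed and reverses direction at either end.
public sealed class BouncingPosition
{
    private readonly double minX;
    private readonly double maxX;
    private double speedPxPerSec;     // sign encodes the direction of travel

    public double X { get; private set; }

    public BouncingPosition(double minX, double maxX, double speedPxPerSec)
    {
        this.minX = minX;
        this.maxX = maxX;
        this.speedPxPerSec = speedPxPerSec;
        X = minX;
    }

    // Advance by the real time elapsed since the previous frame, so the circle
    // covers the same distance per second regardless of the frame rate.
    public void Advance(double elapsedSeconds)
    {
        X += speedPxPerSec * elapsedSeconds;
        if (X > maxX) { X = maxX; speedPxPerSec = -speedPxPerSec; }
        else if (X < minX) { X = minX; speedPxPerSec = -speedPxPerSec; }
    }
}

// Hookup inside a XAML page (assumes an Ellipse named "circle" on a Canvas):
//
//   private readonly Stopwatch frameClock = Stopwatch.StartNew();
//   private readonly BouncingPosition pos = new BouncingPosition(0, 500, 200);
//   private double lastSeconds;
//
//   // In the page constructor, after InitializeComponent():
//   CompositionTarget.Rendering += OnRendering;
//
//   private void OnRendering(object sender, object e)
//   {
//       double now = frameClock.Elapsed.TotalSeconds;
//       pos.Advance(now - lastSeconds);   // time since the previous frame
//       lastSeconds = now;
//       Canvas.SetLeft(circle, pos.X);    // move the element on the Canvas
//   }
```

Remember to unsubscribe with CompositionTarget.Rendering -= OnRendering when the page no longer needs to animate, since the handler otherwise keeps running on every frame.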

http://msdn.microsoft.com/en-us/library/ms748838.aspx 

Thanks for reading my blog!  I hope this post was helpful.

***This sample source code is released under the Microsoft Limited Public License (MS-LPL) and is released under the Intel OBL Sample Source Code License (MS-LPL Compatible)

 

 

For more detailed information about compiler-based optimizations, see our Optimization Notice.