Getting more out of QA with Code Coverage
By Matthew Jack on October 15, 2010 at 4:39 pm
In my first ISN post I give an introduction to how you can make your QA process much more effective with a novel approach adapted from code coverage analysis. The post is based on my article in Game Programming Gems 8, and I will also be talking to Arti Gupta about the approach on ISN TV's Visualize This! series next week.
Code coverage is a metric commonly used to assess the quality of unit tests. However, the same concept can be reapplied to provide real-time feedback during conventional QA testing, making the process much more rigorous and providing valuable metadata about your code.
I originally developed the approach during my time at Crytek, as a tool to help test a major refactor of our AI system. The change redesigned a fundamental aspect of the system, touching a fair fraction of the code and most of the AI features. It would be made in a branch over a period of quite a few weeks and QA resources would be available, but with games in active development and upcoming milestones, I felt I needed a new measure of confidence if I was ever to be able to merge it with our main branch.
There are some great testing methodologies out there that I would heartily recommend, but applying them at this point in time and for this change was not practical (though if you'd like to ask why I didn't use approach X that worked for you, I'd love to hear from you!). What I needed was an approach that would make the most of the main resource I did have, an experienced QA team, without turning their workflow on its head.
What can “Code Coverage for QA” do for me?
- Precisely focus your QA team on your code changes
- Make 100% certain that they succeeded in testing those changes
- Tell you more about how and where code is used in your games
- Stay easy for QA to use
How does it work?
Conventional code coverage uses instrumentation to systematically insert hooks throughout your code, so that you can track whether it has been run. This is usually used to investigate how good a set of unit tests is, by examining how many functions, branches, or lines are exercised. That kind of information on your QA testing could be helpful, but it turns out that there are many practical advantages to dropping automatic instrumentation and instead using explicit markers in your code. Among those is performance, but explicit markers also make the resulting data much smaller, easier to understand and easier to track as your code changes.
Here's a simple example. I use CCMARKER macros to form the explicit, named markers - for their implementation, check out the open-source mcover code.
void Entity::Attack()
{
    CCMARKER(Entity_Attack);
    if (HaveGrenades() && m_timeSinceLastGrenade > 10.0f)
    {
        CCMARKER(Entity_Attack_Grenade);
        IGrenade *pGrenade = NULL;
        switch(m_grenadeType)
        {
        case SMOKE_GRENADE:
            CCMARKER(Entity_Attack_Smoke);
            pGrenade = new SmokeGrenade();
            break;
        case CONCUSSION_GRENADE:
            CCMARKER(Entity_Attack_Concussion);
            pGrenade = new ConcussionGrenade();
            break;
        }
        FireGrenade(pGrenade);
    }
    else
    {
        CCMARKER(Entity_Attack_Normal);
        FireGuns();
    }
}
You can place the markers anywhere that is useful to you – and just where they are useful to you.
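To give a feel for what an explicit marker could expand to, here is a minimal sketch. It is not the mcover implementation: the `cc::Registry` and `cc::Marker` classes and the `ExampleAttack` function are hypothetical names invented for illustration, and only the `CCMARKER(name)` spelling comes from the example above. It also assumes single-threaded use.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical sketch only: this is NOT the mcover implementation.
namespace cc {

// Collects the names of all markers hit so far in this session.
class Registry {
public:
    static Registry& Get() {
        static Registry instance;
        return instance;
    }
    void RecordHit(const char* name) { m_hit.insert(name); }
    bool WasHit(const std::string& name) const { return m_hit.count(name) != 0; }
    std::size_t HitCount() const { return m_hit.size(); }

private:
    std::set<std::string> m_hit;
};

// One Marker object lives at each CCMARKER site.
class Marker {
public:
    explicit Marker(const char* name) : m_name(name), m_hit(false) {
        assert(name != nullptr);
    }
    // Only the first hit touches the registry; later hits are a cheap flag check.
    void Touch() {
        if (!m_hit) {
            m_hit = true;
            Registry::Get().RecordHit(m_name);
        }
    }

private:
    const char* m_name;
    bool m_hit;
};

}  // namespace cc

// Stringises the marker name and creates one function-local static per site.
#define CCMARKER(name)                       \
    do {                                     \
        static cc::Marker s_ccMarker(#name); \
        s_ccMarker.Touch();                  \
    } while (0)

// Hypothetical usage mirroring the shape of the Entity::Attack() example.
void ExampleAttack(bool haveGrenades) {
    CCMARKER(Example_Attack);
    if (haveGrenades) {
        CCMARKER(Example_Attack_Grenade);
    } else {
        CCMARKER(Example_Attack_Normal);
    }
}
```

One consequence of this particular design is that a marker only becomes known the first time its line runs, so the full list of expected markers would have to come from somewhere else, for example a list generated by scanning the source.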
What do QA see?
Key to the whole approach is a display overlaid on the screen during gameplay, which gives testers real-time feedback on the code coverage markers they are hitting. It can include an indicator every time a new marker is hit and a progress bar to show how many are still to go. Crucially, when only a small number are left, the names of the markers can be displayed, giving hints or something solid to reference in finding that last 5%.
QA find this easy enough to relate to: it's essentially a mini-game. The screenshot is from an integration of code coverage with the open-source FPS Alien Arena, a fast, fun and free shooter based on the Quake II source code.
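The logic behind such an overlay can be sketched roughly as follows. This only computes what the display would show (a hit count for the progress bar, plus the names of the remaining markers once few enough are left); the rendering itself is left out. `OverlayState`, `BuildOverlay`, `FormatOverlay` and the `nameThreshold` parameter are all hypothetical names, and the sketch assumes the full set of expected markers is known up front.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// What the overlay needs to draw (hypothetical structure).
struct OverlayState {
    std::size_t hitCount;                     // expected markers hit so far
    std::size_t totalCount;                   // all expected markers
    std::vector<std::string> remainingNames;  // filled only when few remain
};

// Compares the expected markers against those hit. Names of the missing
// markers are revealed only once nameThreshold or fewer are left.
OverlayState BuildOverlay(const std::set<std::string>& expected,
                          const std::set<std::string>& hit,
                          std::size_t nameThreshold) {
    std::vector<std::string> remaining;
    std::set_difference(expected.begin(), expected.end(),
                        hit.begin(), hit.end(),
                        std::back_inserter(remaining));

    OverlayState state;
    state.totalCount = expected.size();
    state.hitCount = expected.size() - remaining.size();
    if (remaining.size() <= nameThreshold) {
        state.remainingNames = remaining;
    }
    return state;
}

// Text form of the overlay, standing in for a progress bar and hint list.
std::string FormatOverlay(const OverlayState& state) {
    assert(state.hitCount <= state.totalCount);
    std::ostringstream out;
    out << state.hitCount << "/" << state.totalCount << " markers hit";
    for (std::size_t i = 0; i < state.remainingNames.size(); ++i) {
        out << (i == 0 ? "; remaining: " : ", ") << state.remainingNames[i];
    }
    return out.str();
}
```

Holding the names back until only a few markers remain keeps the display uncluttered for most of a session and turns the final stretch into a targeted hunt.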
What results do I get?
In my implementation at Crytek I simply dumped a list of the markers hit to a file. I built up the number of markers as my refactor progressed and after each testing session I did some processing of the results with a script – merging the results from multiple testers and multiple runs - and then checked them into Perforce. I then had a ready reference as to which points had been hit – or not hit - and I was able to split this into results for each level.
This meant I could tell that my latest changes had definitely been tested – which is the crucial point – but with that process came a whole load of other useful results. I was able to track with a simple diff which code had been run before and was not being run now, which could be a bug. I was able to see which features were used in which levels, which can be very useful if you haven't worked with the features yourself. You can also make sure that code really is used before you start refactoring it!
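The merge and diff steps described above amount to simple set operations. The original processing was done with a script; the sketch below uses C++ instead, purely to stay consistent with the other examples. It assumes a dump format of one marker name per line, which is a guess rather than the format actually used, and `ParseMarkerDump`, `MergeRuns` and `NoLongerHit` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <sstream>
#include <string>
#include <vector>

typedef std::set<std::string> MarkerSet;

// Parses a dump with one marker name per line (assumed format).
// Blank lines are skipped and a trailing '\r' is tolerated.
MarkerSet ParseMarkerDump(const std::string& text) {
    MarkerSet markers;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line[line.size() - 1] == '\r') {
            line.erase(line.size() - 1);
        }
        if (!line.empty()) {
            markers.insert(line);
        }
    }
    return markers;
}

// Union of the markers hit across several testers or several runs.
MarkerSet MergeRuns(const std::vector<MarkerSet>& runs) {
    MarkerSet merged;
    for (const MarkerSet& run : runs) {
        merged.insert(run.begin(), run.end());
    }
    return merged;
}

// Markers that were hit previously but are absent from the current results.
// Each of these is either code that is no longer reached or a possible bug.
MarkerSet NoLongerHit(const MarkerSet& previous, const MarkerSet& current) {
    MarkerSet missing;
    std::set_difference(previous.begin(), previous.end(),
                        current.begin(), current.end(),
                        std::inserter(missing, missing.end()));
    assert(missing.size() <= previous.size());
    return missing;
}
```

Because the sets are sorted, writing a merged set back out one name per line gives a stable file that diffs cleanly in version control.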
Where can I go with this?
One of the key developments of this approach is to collect the results over the network into a central database. Putting this in place makes result collection automatic and feedback to the programmers real-time, and it allows powerful queries with rich methods of presentation, such as a web interface.
In recent months I have been consulting on AI for Xaviant LLC in Atlanta, who have implemented this network collection mechanism, and I'll include some discussion of their implementation in a future post.
Another advantage of this approach occurred to me while finishing a recent project with Microsoft. They made use of a centralised off-site testing team and, while communication with the team was good, I could not simply walk over and discuss my features. Records of exactly when a particular feature or codepath had been tested, or real-time feedback that the feature was being tested right now and by whom, would have been a great aid to communication.
A few more possible expansions to think about:
- Using the same infrastructure with automatic functional tests
- Categorising and prioritising markers
- Creating custom levels designed to test all the features in one run
Actually there's a long list where those came from, and I've seen some really great spinoffs that I'm looking forward to people demonstrating.
Matthew Jack works as an AI consultant and freelance developer at Moon Collider Ltd.
For an in-depth treatment and technical details you can read the original article “Code Coverage: Informing the QA Process” in Game Programming Gems 8 or watch a discussion with Alex Champandard in a masterclass video available at AiGameDev.com. You can also check out mcover, the C++ source for an example implementation.
Categories: Game Development
Tags: code coverage, games, QA, testing
Comments (3)
October 18, 2010 11:22 AM PDT
Doug Binks (Intel)
Great blog Matthew, really good to see that the Code Coverage work has continued since the article in GPG8. Having seen the work in action, and played with the code from GPG8 a little, I can genuinely say it's a great addition to the tool set for both developers and testers. Some testers I know outside of the games industry have expressed interest as well for ad-hoc testing of business software, though I think in this context it might be useful to have the output displayed in a secondary tool window.
October 27, 2010 12:32 PM PDT
Matthew Jack
Alex - I think there's a lot of analysis that could be done, which is why I think it's important to get the results into a central database where you can run SQL queries against them, and then merge results for each level from multiple testers and multiple runs, so that we see a complete picture. Then you want a way for developers to survey the changes - for instance via a web-based diff interface - and as you say you can consider email alerts. I'm personally not a big fan of thresholds on this kind of thing, which is part of the philosophy behind this - 5 or 10 missed checkpoints may not be meaningful, since it really depends what they were! I would encourage flagging some points as "unmissable" and sending an email if any report finishes without them. Examples would be core code like pathfinding or hidepoint searches. That can be an excellent early warning that something important is broken - and crucially, the accompanying "fingerprint" of checkpoints hit or missed can help you immediately diagnose the problem. Similarly, you might flag a checkpoint for a codepath for an immediate alert if it is ever hit - perhaps where a rare bug is occurring - so you can get over there and see it in action. Those "unmissable" checkpoints could perhaps be found automatically by looking at historical data. For more extensive automated analyses, I think they would be best paired up with automated tests - but I'll give some thought to this and perhaps it could be a good subject for a future post.
Alex
I wonder if it would be useful to send out emails when the number of missed or unexpected code coverage points changes by more than some threshold.