By William Van Winkle
“Cats are incredibly effective hunters and are wiping out our native birds.”
- Gareth Morgan
In 2012, Gareth Morgan became mildly famous in New Zealand for drawing attention to the plight of native flightless birds and how domestic animals, particularly housecats, are decimating the kiwi population. Who could have guessed that the ensuing feline firestorm would inspire an award-winning advance in gaming and perceptual computing?
Backing up a few years, Chris Allen was an experienced Boston programmer. His wife Rebecca had business savvy and design experience from the print world. They started a small coding house called Infrared5 to develop applications for client companies. The Allens steadily grew Infrared5 and cultivated a keen appreciation for Google’s 20-percent time philosophy. Applying this philosophy, an Infrared5 employee got to twiddling with creating an Apple iPhone*-based control system for remote control helicopters. This eventually became Brass Monkey*, an SDK that allows Google Android* and Apple iOS* devices to serve as input controllers for Unity-based, Flash, HTML5 and native games.
In early 2013, Intel invited Infrared5 to compete in its Ultimate Coder Challenge: Going Perceptual, a groundbreaking contest that provided participants with an Ultrabook™ device, Creative Interactive Gesture Camera development kits, a still-evolving perceptual computing SDK, and all of the support possible for letting their imaginations run rampant. With its focus on sensor-based input, Brass Monkey technology seemed a natural complement to perceptual computing—but how to meld the two?
In finding an answer, Infrared5 devised a wholly new dual-input system that blends Wi-Fi* handheld devices with perceptual computing, producing a proof of concept that may change gaming forever.
Kiwi Katapult Revenge*: Form and Function
“Early on, I had the idea of using a head tracking mechanic and combining it with the phone as a game controller,” said Chris Allen. “In my head, it was this post-apocalyptic driving, shooting game, something like Mad Max* in 3D. But Rebecca and [art director] Aaron Artessa had been wanting to do a paper cutout concept for a long time. Then we heard an NPR newscast about New Zealand and how the kiwi birds were getting killed by domesticated cats. We thought it would be fun to do a little political play and have you be a bird named Karl Kiwi able to fly around, firing lasers from your eyes, breathing fire, and taking revenge on the cats.”
Those accustomed to WASD keyboard controls or traditional console gamepads may find Infrared5's gameplay a bit daunting at first. Brass Monkey uses accelerometer data from the Wi-Fi connected phone to control flying movement. Screen taps on the phone, typically with thumbs, control firing.
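As a rough illustration of how accelerometer data can drive flying movement, the sketch below converts a phone's gravity reading into roll and pitch steering values. It is not Brass Monkey code: the function name, the axis conventions, the dead zone, and the maximum tilt angle are all assumptions made for the example.

```python
import math


def tilt_to_steering(ax, ay, az, dead_zone_deg=5.0, max_tilt_deg=45.0):
    """Map an accelerometer reading (in g) to (roll, pitch) steering in [-1, 1].

    Assumed axes (hypothetical): x points to the phone's right, y toward its
    top edge, z out of the screen. A phone lying flat reads roughly (0, 0, 1).
    """
    # Recover tilt angles from the direction of gravity.
    roll_deg = math.degrees(math.atan2(ax, az))
    pitch_deg = math.degrees(math.atan2(ay, math.hypot(ax, az)))

    def normalize(angle_deg):
        # Ignore small tilts so a slightly unsteady hand does not steer.
        if abs(angle_deg) < dead_zone_deg:
            return 0.0
        # Scale the remaining range to 0..1 and clamp at full deflection.
        scaled = (abs(angle_deg) - dead_zone_deg) / (max_tilt_deg - dead_zone_deg)
        return math.copysign(min(scaled, 1.0), angle_deg)

    return normalize(roll_deg), normalize(pitch_deg)
```

The dead zone is a design choice rather than a requirement: without it, sensor noise and small hand movements would translate into constant drift in the bird's flight path.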
Figure 1. Infrared5’s Brass Monkey software allows iOS* and Android* phone devices to serve as controllers along with Creative’s gesture camera.
Face tracking using the gesture camera governs aiming, and there is also voice input. Karl can shoot flames from his mouth when the player shouts “aaahhhh!” or “fiiiirrree!”
That may seem like a lot for someone to juggle, but feedback from early players indicates that the gameplay was surprisingly natural after a little coaching and getting properly positioned in front of the camera. (Rebecca Allen noted that an in-game tutorial and calibration will drop the learning time to only a couple of minutes.) Head turns give the natural ability to peer around objects. The whole experience is remarkably intuitive. Still, over weeks of refining the interface and mechanics, the six-person development team found itself making several major changes.
Figure 2. Your flying kiwi isn’t the only one equipped with laser beam eyes. Note the fairly simple UI and graphics palette for faster processing.
“One of the problems we faced with the perceptual computing was with face tracking,” said Rebecca Allen. “You had to identify that people were actually controlling the view of the world with their face. We ended up doing a rearview mirror where you could actually see yourself and how you’re moving. Your face actually changes the perception, with the bird’s head moving as well. That also gave us the ability to see what’s behind the bird, because you’d be getting just slaughtered by cats from behind and not realize what was going on and who was shooting at you.”
Challenges Addressed During Development
Not surprisingly, Chris Allen had no experience with computer vision when he and Infrared5 started Intel’s contest. He admits having read a book on the subject a couple of years prior, but with no hands-on expertise, the team faced a steep learning curve in the opening weeks.
Infrared5 designers were particular about the kind of lighting and atmosphere they wanted in Kiwi Katapult Revenge. However, the impressive visualization and multiple input streams placed a significant processing load on the little Lenovo IdeaPad* Yoga convertible Ultrabook device Intel provided to Infrared5 for the contest. To help keep the user experience fluid and fun, Team Kiwi took several resource-saving steps, including the following:
- Dropped the face tracking frame rate. Since people in the camera’s field of view were unlikely to move much across several frames, Infrared5 found it could perform analysis less often and save on processing load.
- Optimized the process threading, leveraging the Ultrabook device’s quad-core CPU to offload certain tasks to available cores and load-balance more effectively.
- Pared down the color and visual assets. This change saved graphical resources and helped reduce the data load hammering the GPU core, while having little effect on the player experience.
- Filtered out any faces beyond a distance of one meter (3.28 feet). The camera’s depth sensor made this possible, and by eliminating so many extra faces in crowded environments, the face processing load dropped.
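Two of the steps above, analyzing fewer frames and discarding distant faces, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Infrared5's code: the `Face` record, the `ThrottledFaceTracker` class, the detector callback, and the every-third-frame interval are all hypothetical. Only the one-meter cutoff comes from the article.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Face:
    x: int            # face center in image coordinates (pixels)
    y: int
    depth_m: float    # distance from the camera, from the depth sensor


MAX_FACE_DISTANCE_M = 1.0   # cutoff mentioned in the article
DETECT_EVERY_N_FRAMES = 3   # hypothetical interval, tune per device


def nearby_faces(faces: List[Face], max_distance_m: float = MAX_FACE_DISTANCE_M) -> List[Face]:
    """Keep only faces with a valid depth reading within the cutoff."""
    return [f for f in faces if 0.0 < f.depth_m <= max_distance_m]


class ThrottledFaceTracker:
    """Runs an expensive detector on every Nth frame and reuses the result otherwise."""

    def __init__(self, detect: Callable[[Any], List[Face]],
                 every_n: int = DETECT_EVERY_N_FRAMES) -> None:
        self._detect = detect          # caller-supplied detector (assumed)
        self._every_n = every_n
        self._frame_index = 0
        self._last: List[Face] = []

    def update(self, frame: Any) -> List[Face]:
        # Only pay for detection on frames 0, N, 2N, ...
        if self._frame_index % self._every_n == 0:
            self._last = nearby_faces(self._detect(frame))
        self._frame_index += 1
        return self._last
```

In a game loop, `update` would be called once per camera frame. The aiming code always gets a face list, but the detector runs only a third of the time with the interval assumed here.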
Figure 3. Currently, perceptual computing requires the addition of a USB gesture camera, but future Ultrabook™ device generations will likely integrate stereoscopic cameras directly.
Given Infrared5’s experience in working with Unity, it seemed sensible to use the Unity SDK for Kiwi Katapult Revenge and write the code in C#. Team members knew the SDK included a head tracking mechanism and so expected to “get face tracking for free.” However, the results simply weren’t responsive enough to feel realistic; the timing of aiming and shooting was too far off. The team burned up two weeks figuring this out. Finally, they decided to take depth data from the SDK and combine it with a C library called OpenCV. Because programmers couldn’t get enough low-level access, they switched to developing entirely in C++ and used a DLL to communicate with Unity, a popular game development environment.
To resolve the head tracking responsiveness issue, Infrared5 devised a matrix map algorithm based on the camera’s position that stretched the optics so that closer objects appeared bigger. Because there was very little code publicly available for doing this, the programmer had to read everything available on the subject, including academic papers and two books on OpenCV, and then write the routine for Unity in C# from scratch. The team ran into issues with the C# port of OpenCV in Unity and finally ended up rewriting it in C++. Infrared5 plans to make this new code open source to help foster the perceptual gaming community.
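The article does not spell out the routine itself. For readers unfamiliar with head-coupled views, one widely used technique is an off-axis (asymmetric) projection, in which the view frustum is recomputed from the tracked head position so the screen behaves like a window into the scene. The sketch below shows that general technique only and should not be read as Infrared5's algorithm. The function name and the coordinate frame are assumptions: the screen is centered at the origin in the z = 0 plane, and the eye sits at positive z, with all lengths in the same unit.

```python
def off_axis_frustum(eye_x, eye_y, eye_z, half_width, half_height, near):
    """Return (left, right, bottom, top) frustum bounds at the near plane.

    The screen spans [-half_width, half_width] x [-half_height, half_height]
    in the z = 0 plane; the eye is at (eye_x, eye_y, eye_z) with eye_z > 0.
    """
    if eye_z <= 0:
        raise ValueError("eye must be in front of the screen (eye_z > 0)")

    # Similar triangles: project the screen edges, as seen from the eye,
    # onto a plane `near` units in front of the eye.
    scale = near / eye_z
    left = (-half_width - eye_x) * scale
    right = (half_width - eye_x) * scale
    bottom = (-half_height - eye_y) * scale
    top = (half_height - eye_y) * scale
    return left, right, bottom, top
```

The four bounds can be passed to any glFrustum-style projection builder. With the head centered, the frustum is symmetric; as the head moves sideways it skews, and as the head moves closer it widens, which is what lets a player peer around objects by leaning.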
Despite warnings to the contrary from Intel specialists, Infrared5 went into the Ultimate Coder Challenge: Going Perceptual thinking that they could conquer gaze tracking. At least within the contest’s seven weeks, they were left disappointed.
“We were reaching for robust feature tracking to detect if the mouth was open or closed, the orientation of the head, and the gaze direction right from the pupils,” said Infrared5 on its blog. “All three of the above share the same quality that makes them difficult: in order for specific feature tracking to work with the robustness of a controller in real time, you need to be confident that you are locked onto each feature as the user moves around in front of the camera. We have learned that finding track-able points and tracking them from frame to frame does not enable you to lock onto the targeted feature points that you would need to do something like gaze tracking. As the user moves around, the points slide around. Active Appearance Model may help us overcome this later.”
Like all of the other contestants, Infrared5 worked with the Intel® Perceptual Computing SDK while it was still in beta, which meant that programmers encountered the inevitable gaps and bumps. This is to be expected with any new technology, and Infrared5 took the tool in the manner in which it was intended. As the company wrote in its synopsis post for the third challenge week, “They [Intel] are trying out a lot of things at once and not getting lost in the specifics of just a few features. This allows developers to tell them what they want to do with it without spending tremendous effort on features that wouldn’t even be used. The lack of decent head, gaze, and eye tracking is what’s inspired us on to eventually release our tracking code as open source. Our hope is that future developers can leverage our work on these features.” Infrared5 would like to continue working with Intel to advance the SDK, possibly with its code merged into the Intel Perceptual Computing SDK.
A fair bit has been written about how the Ultimate Coder Challenge: Going Perceptual contestants cooperated with one another, lending encouragement and sharing tools. Less has been noted about the same sort of relationship existing between Intel and the contest participants. Intel worked hand-in-hand with the contestants, helping them through their issues and observing their needs and priorities. Participants emerged knowledgeable and skilled with perceptual computing, traits that can in turn be immediately applied to new products ahead of their competition.
Lessons Learned, Advice Given
Infrared5 tied with Lee Bamber for winning the Best Blog category in the Ultimate Coder Challenge: Going Perceptual. As seen in the few examples cited here, the Infrared5 crew went to great lengths in documenting their progress and sharing their wisdom with the broader community. Naturally, some things never made it into the blog, and Infrared5 wants to make sure that readers know of several key points as they progress into the world of perceptual computing.
First, people are not used to controlling software with their head. While some elements of the Kiwi Katapult Revenge mechanic are quite natural, many users found that head control and the dual-input paradigm require a two-minute tutorial—a tutorial the team wasn’t able to create during the contest. Originally, Infrared5 tied the up-and-down movement to head control, but this resulted in players instinctively performing squats while trying to fly, which wasn’t quite the desired experience (although it could be in an exercise app!). They removed the feature and found alternatives.
Figure 4. Infrared5 showcased Kiwi Katapult Revenge at the annual Game Developers Conference in San Francisco. Enthusiastic response was the key in helping to fine-tune face tracking and accelerometer inputs.
“Don’t get caught up in your expectations,” advised Chris Allen. “Say you expect to get full-on eye tracking working. That could’ve totally stopped us. Working within constraints is a really important thing to always do, even though you maybe aren’t hitting everything you want. Sometimes through those limitations you can actually discover something, which is like more of a breakthrough.”
Infrared5 is also fond of conducting a sort of conceptual triage at the beginning stages of a project. Try to identify the biggest risk elements within the project and then devise tests to see if those elements become problematic or not. As carpenters say: measure twice, cut once. The Kiwi Katapult Revenge team did this with image processing, checking first to make sure that the gesture camera could connect to Unity, and then writing the code to connect the two. Take on successive chunks, and prioritize by risk.
Also consider the target form factor early in the planning. For example, tablets lie flat on a table. Kiwi Katapult Revenge cannot operate on a tablet with an external gesture camera because there is no way to mount the gesture camera to the device and have it point at a user. The Lenovo IdeaPad Yoga convertible Ultrabook device, in contrast, has several form factor possibilities and can mount the camera. With a tablet, they might not have even attempted bringing in their Brass Monkey tools.
Finally, Chris urges developers to collaborate with their peers, much as they did with other Intel contestants. By sharing code and ideas, all teams emerged more enriched. In the process perceptual computing not only grew in its capabilities but it also may have nudged that much closer toward having industry-standard commands and code sets. Had the contestants remained isolated, this development seems much less likely.
Perhaps not surprisingly given the newness of the perceptual computing field, Infrared5 didn’t make much use of outside resources during the creation of Kiwi Katapult Revenge. They did make use of the Unity sample code Intel provided. Intel technicians also provided constructive feedback to the Infrared5 designer, helping to massage and smooth the app over inevitable rough spots. Infrared5 engineers consulted books on OpenCV and made use of multiple open source libraries. Again, collaboration with other Ultimate Coder Challenge: Going Perceptual teams was invaluable.
Infrared5 is working on adding more achievements, more enemies, and more non-player characters to Kiwi Katapult Revenge. It compliments Intel “for not building out every feature” in its SDK because now the company has real-world feedback from developers and early adopters to help optimize the toolset for the features that matter most. This can only help accelerate perceptual computing’s progress and ensure a better software experience for everyone.