Case Study: Mystic Blocks Brings the Magic of Gesture Recognition and Facial Analysis to Desktop Gaming
By Erin Wright
Developer Matty Hoban of Liverpool, England, is always looking for innovative ways to integrate his love of mathematics, physics, and software development with next-generation technology. So he welcomed the opportunity to participate in Phase 1 of the Intel® Perceptual Computing Challenge.
The Intel Perceptual Computing Challenge invited coders to push the boundaries of the Intel® Perceptual Computing software development kit (SDK) and Creative* Interactive Gesture Camera, which together offer significant advancements in human–computer interaction, including:
- Speech recognition
- Close-range depth tracking (gesture recognition)
- Facial analysis
- Augmented reality
Hoban is currently finishing his computing degree at the Open University. He is also the founder of Elefant Games, which develops tools for game developers in addition to the Bomb Breaker app for Windows* desktops and touch screens.
After entering the Challenge, Hoban looked to the Perceptual Computing SDK and Creative Interactive Gesture Camera for inspiration. He explains, “I wanted to get a feel for them. I felt it wasn’t enough to take an existing idea and try to make it work with a perceptual camera. Whatever I made, I knew that it had to work best with this camera over all possible control methods.”
Testing the Gesture Camera and Perceptual Computing SDK
Hoban began by testing the capabilities of the Creative Interactive Gesture Camera: “The first thing I did, as anyone would do, was try the sample that comes with it. This lets you see that the camera is working, and it gives back real-time variables of angles for your head and the position of your hands.”
Hoban then ran sample code through the Perceptual Computing SDK. He says, “Capturing hand and head movements is simple. There are multiple ways of utilizing the SDK: You can use callbacks, or you can query the SDK directly to get the data you need.”
Prototyping with Basic Shapes
After getting familiar with the Gesture Camera and the SDK, Hoban began manipulating basic shapes using the gesture-recognition abilities of the close-range, depth-tracking usage mode. He says, “Once I looked at the samples and saw that the real-world values for your hands returned well, I started to get an idea for the game.”
He developed a method of creating block-based geometric shapes using three two-dimensional matrices populated with ones and zeroes. Each matrix represents the front, bottom, or side of an individual shape. This method eliminated the need for three-dimensional (3D) software and expedited the process of generating shapes within the game. Figure 1 shows examples of the shape matrices.
Figure 1. Constructing shapes with matrices
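The article does not include Hoban's code, but the matrix method can be sketched as follows: a voxel is solid only where all three projections (front, side, bottom) contain a one. The grid size, names, and axis conventions below are assumptions for illustration, not the game's actual implementation.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Sketch (not Hoban's actual code): derive a block-based 3D shape from
// three 2D matrices populated with ones and zeroes, as described above.
constexpr int N = 3; // grid size per axis (assumed)

using View = std::array<std::array<int, N>, N>;

// front[y][x]: view along +z; side[y][z]: view along +x;
// bottom[z][x]: view along +y.
std::vector<std::array<int, 3>> buildShape(const View& front,
                                           const View& side,
                                           const View& bottom) {
    std::vector<std::array<int, 3>> voxels;
    for (int x = 0; x < N; ++x)
        for (int y = 0; y < N; ++y)
            for (int z = 0; z < N; ++z)
                // A cube exists only where all three views mark the cell.
                if (front[y][x] && side[y][z] && bottom[z][x])
                    voxels.push_back({x, y, z});
    return voxels;
}
```

The appeal of this approach, as the article notes, is that the game can generate 3D keys from compact 2D data without any 3D modeling software.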
With the Gesture Camera and shape matrices in place, Hoban added facial analysis to track head position in relation to the visual perspective on the screen—and Mystic Blocks was born.
Developing Mystic Blocks
Mystic Blocks is a magician-themed puzzle game that requires players to use hand motions to turn keys to fit approaching locks, as shown in Figure 2. The keys are a variety of 3D shapes generated by the matrix method described above.
Figure 2. Mystic Blocks level 1
“I’ve compared Mystic Blocks to the Hole in the Wall game show, where contestants need to position themselves correctly to fit an approaching hole,” explains Hoban. “Mystic Blocks does the same but with 3D geometry that uses rotation to allow the same shape to fit through many different-shaped holes.”
Players begin by turning the keys with one hand, but as the game progresses, they have to coordinate both hands to move the keys in a 3D space. In addition to mastering hand coordination, players must repeat each sequence from memory on the second try as the locks approach with hidden keyholes. If players want a better view of the approaching locks, they can shift the game’s perspective by moving their heads from side to side. To see Mystic Blocks in action, check out the demonstration video at http://www.youtube.com/watch?v=XUqhcI_4nWo.
Close-range Depth Tracking (Gesture Recognition)
Mystic Blocks combines two usage modes from the Perceptual Computing SDK: close-range depth tracking and facial analysis. Close-range depth tracking recognizes and tracks hand positions and gestures such as those used in Mystic Blocks.
Opportunities and Challenges of Close-range Depth Tracking
Hoban found creative solutions for two challenges of close-range depth tracking: detection boundaries and data filtering.
Mystic Blocks gives players text instructions to hold their hands in front of the camera. Although players are free to determine their own hand positions, Hoban’s usability tests revealed that hand motions are detected most accurately when players hold their palms toward the camera, with fingers slightly bent as if about to turn a knob, as demonstrated in Figure 3.
Figure 3. Mystic Blocks hand position for gesture recognition
Additional usability tests showed that players initially hold their hands too high above the camera. Therefore, a challenge for developers is creating user interfaces that encourage players to keep their hands within the detection boundaries.
Currently, Mystic Blocks meets this challenge with graphics that change from red to green as the camera recognizes players’ hands, as shown in Figure 4.
Figure 4. Mystic Blocks hand recognition alert graphics
“I’d like to add a visual mechanism to let the user know when his or her hand strays out of range as well as some demonstrations of the control system,” notes Hoban. “I think that as the technology progresses, we’ll see standard gestures being used for common situations, and this will make it easier for users to know instinctively what to do.”
Yet, even without these standardized movements, Hoban’s adult testers quickly adapted to the parameters of the gesture-based control system. The only notable control issue arose when a seven-year-old tester had difficulty turning the keys; however, Hoban believes that he can make the game more child-friendly by altering variables to allow for a wider variety of hand rotations. He says, “I have noticed improvements in the Perceptual Computing SDK since I developed Mystic Blocks with the beta version, so I am confident that the controls can now be improved significantly.”
During user testing, Hoban noticed that the hand-recognition function would occasionally become jumpy. He reduced this issue and improved the players’ ability to rotate the keys by filtering the incoming data. Specifically, the game logic ignores values that stray too far out of the established averages.
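The article describes the filter only in outline; a minimal sketch of one way to reject jumpy samples against a running average follows. The class name, thresholds, and smoothing weight are assumptions, not values from the game.

```cpp
#include <cassert>
#include <cmath>

// Sketch (assumed design, not Hoban's code): smooth a jumpy hand-position
// stream by ignoring samples that stray too far from the running average.
class JitterFilter {
public:
    // maxJump: largest plausible per-sample deviation (tuning assumption).
    // alpha:   smoothing weight applied to accepted samples.
    JitterFilter(float maxJump, float alpha)
        : maxJump_(maxJump), alpha_(alpha) {}

    float update(float sample) {
        if (!primed_) {            // the first sample seeds the average
            avg_ = sample;
            primed_ = true;
        } else if (std::fabs(sample - avg_) <= maxJump_) {
            // Blend accepted samples into the running average.
            avg_ += alpha_ * (sample - avg_);
        }
        // Outliers are dropped; the last good average is reused.
        return avg_;
    }

private:
    float maxJump_, alpha_;
    float avg_ = 0.0f;
    bool primed_ = false;
};
```

A filter of this shape trades a little responsiveness for stability, which suits slow, deliberate key-turning gestures better than fast twitch input.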
In the future, Hoban would like to leverage the flexibility of the Perceptual Computing SDK to fine-tune the filters even further. For instance, he wants to enhance the game’s ability to distinguish between left and right hands and increase gesture recognition performance in bright, outdoor light.
Facial Analysis
The Perceptual Computing SDK facial analysis usage mode can track head movements like those Mystic Blocks players use to adjust their visual perspectives. Hoban says, “The head tracking was simple to add. Without it, I would need to offset the view by a fixed distance, because the player’s view is directly behind the shape, which can block the oncoming keyhole.”
Mystic Blocks’ head tracking is primarily focused on side-to-side head movements, although up and down movements can also affect the onscreen view to a lesser extent. This lets players find their most comfortable position and adds to their immersion in the game. “If you’re looking directly towards the camera, you’ll have the standard view of the game,” explains Hoban. “But if you want to look around the corner or look to the side of the blocks to see what’s coming, you just make a slight head movement. The camera recognizes these movements and the way you see the game changes.”
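In code, such a mapping can be as simple as scaling and clamping the tracked head position, with a weaker gain on the vertical axis. The function name, gains, and clamp range below are illustrative assumptions, not values taken from the game.

```cpp
#include <algorithm>
#include <cassert>

// Sketch (assumed names/values): map tracked head position to a camera
// offset. Side-to-side movement shifts the view fully; vertical movement
// contributes to a lesser extent, as the article describes.
struct ViewOffset { float x, y; };

ViewOffset headToView(float headX, float headY) {
    const float kSideGain = 1.0f;  // full side-to-side influence (assumed)
    const float kVertGain = 0.3f;  // weaker up/down influence (assumed)
    const float kMaxShift = 2.0f;  // clamp so the view stays on the play area
    ViewOffset v;
    v.x = std::clamp(headX * kSideGain, -kMaxShift, kMaxShift);
    v.y = std::clamp(headY * kVertGain, -kMaxShift, kMaxShift);
    return v;
}
```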
The Creative Interactive Gesture Camera provides Hoban with a sampling rate of 30 fps. The Mystic Blocks application, which runs at 60 fps, can process gesture recognition and head tracking input as it becomes available. Hoban states, “The Gesture Camera is responsive, and I am quite impressed with how quickly it picks up the inputs and processes the images.”
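Because the game loop runs at twice the camera's sampling rate, each frame can simply use the newest sample available rather than blocking for fresh input. A minimal, assumed sketch of that decoupling (not the shipped code) might keep the latest sample behind a lock:

```cpp
#include <cassert>
#include <mutex>

struct HandSample { float x, y, z; };

// Sketch (assumption): the camera callback publishes ~30 samples/sec while
// the 60 fps game loop reads whatever sample is newest, reusing the
// previous one on frames where no new data has arrived.
class LatestSample {
public:
    void publish(const HandSample& s) { // camera thread, ~30 times/sec
        std::lock_guard<std::mutex> lock(m_);
        latest_ = s;
    }
    HandSample read() const {           // game loop, ~60 times/sec
        std::lock_guard<std::mutex> lock(m_);
        return latest_;                 // may repeat the previous sample
    }
private:
    mutable std::mutex m_;
    HandSample latest_{};
};
```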
Third-party Technology Integration
Mystic Blocks incorporates The Game Creators’ (TGC) App Game Kit Tier 2 C++ library for rendering and the NVIDIA PhysX* SDK for collision detection and physics. Hoban also used several third-party development tools, including Microsoft Visual Studio* 2010, TGC 3D World Studio, and Adobe Photoshop*.
These third-party resources integrated seamlessly with the Intel® Perceptual Computing technology. Hoban reports, “You just launch the camera and fire up Visual Studio. Then, you can call the library from the SDK and work with some example code. This will give you immediate results and information from the camera.”
Figure 5 outlines the basic architecture behind Mystic Blocks in relation to the Gesture Camera.
Figure 5. Mystic Blocks architecture diagram
The Ultrabook™ Experience
Mystic Blocks was developed and tested on an Ultrabook™ device with an Intel® Core™ i7-3667U CPU, 4 GB of RAM, a 64-bit operating system, and limited touch support with five touch points. Hoban comments, “There were no problems with power or graphics. It handled the camera and the game, and I never came up against any issues with the Ultrabook.”
The Future of Perceptual Computing
Hoban believes that perceptual computing technologies will be welcomed by gamers and nongamers alike: “I don’t see it taking over traditional keyboards, but it will fit comfortably alongside established controls within most apps—probably supporting the most common scenarios, such as turning the page or going to the next screen with a flick of your hand. Devices will also be able to recognize your face, conveniently knowing your settings.”
According to Hoban, gesture recognition is a perfect control system for motion-based games like Mystic Blocks; however, game developers will need to strike a balance between perceptual computing and traditional keyboard control methods in complex games with numerous options. “If you take your hands away from the camera to use the keyboard, you might lose focus on what you’re doing,” he comments. Instead, he advises developers to enrich complex games with gesture recognition for specific actions, such as casting spells or using a weapon.
Facial analysis and voice recognition offer additional opportunities to expand and personalize gaming control systems. For example, Hoban predicts that facial analysis will be used to automatically log in multiple players at once and begin play exactly where that group of players left off, while voice recognition will be used alongside keyboards and gesture recognition to perform common tasks, such as increasing power, without interrupting game play.
“I would like to add voice recognition to Mystic Blocks so that you could say ‘faster’ or ‘slower’ to speed up or slow down the game, because right now you can’t press a button without losing hand focus in the camera,” notes Hoban.
And the Winner Is...
Matty Hoban’s groundbreaking work with Mystic Blocks earned him a grand prize award in the Intel Perceptual Computing Challenge Phase 1. He is currently exploring opportunities to develop Mystic Blocks into a full-scale desktop game, while not ruling out the possibility of releasing the game on Apple iOS* and Google Android* devices. “Mystic Blocks is really suited to the camera and gesture inputs,” he says. “It will transfer to other devices, but if I develop it further, it will primarily be for perceptual computing on the PC.”
In the meantime, Hoban has entered the Intel Perceptual Computing Challenge Phase 2 with a new concept for a top-down racing game that will allow players to steer vehicles with one hand while accelerating and braking with the other hand.
Matty Hoban’s puzzle game Mystic Blocks won a grand prize in the Intel Perceptual Computing Challenge Phase 1. Mystic Blocks gives players the unique opportunity to move shapes in a 3D space using only hand gestures. Players also have the ability to control the game’s visual perspective by moving their heads from side to side. During development, Hoban created his own innovative method of filtering data through the Perceptual Computing SDK and Creative Interactive Gesture Camera. He also gained valuable insight into the process of helping players adapt to gesture recognition and facial analysis.
For More Information
- “The Intel® Perceptual Computing SDK 2013: Now with Background Subtraction and Expanded Language Support” at http://software.intel.com/en-us/vcsource/tools/perceptual-computing-sdk
- Intel® Perceptual Computing Challenge at https://perceptualchallenge.intel.com
About the Author
Erin Wright, M.A.S., is an independent technology and business writer in Chicago, Illinois.
Intel, the Intel logo, Ultrabook, and Core are trademarks of Intel Corporation in the U.S. and/or other countries.
Copyright © 2013 Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.