Ultimate Coder Challenge Week Two Summary: Infrared5 Integrates Brass Monkey Controls Plus Updates on Head/Eye Tracking Approach

Week Two brought with it some interesting challenges. We’ve finished our first sprint, and many of the features for Kiwi Catapult Revenge are now complete. We were pleased to see that the tasks we set for ourselves weren’t too big a bite to take, and our growing knowledge of the capabilities of the perceptual computing camera and the Perceptual Computing SDK is making us comfortable that we can achieve our goal of eye/gaze tracking.

Game Play

We’ve got laser and fireball shooting, plus a basic AI for the cat enemies, in place. Brass Monkey support is now in the game too; you can fly around the environment by tilting your phone and shoot using its touch screen. You should definitely try it out by grabbing the Brass Monkey app for Android or iOS and loading up the latest demo of the game.

We did some early playtesting with attendees at the MassDigi Game Challenge at Microsoft’s NERD Center this weekend. People found the controls to be intuitive and the early version of the game fun. The main complaint was how hard it was to shoot the cats. Luckily, we believe that eye tracking, a well-designed target reticle, and better hit detection will solve most of these issues.

Lenovo Ultrabook as a Gaming Device

We’ve found that playing the game with Brass Monkey controls, with the Yoga Ultrabook flipped so the keyboard is tucked away, works really well. The convertible Ultrabook’s form factor, weight, and graphics capabilities make for an excellent gaming system, and flipping it around like this gives the feeling of a dedicated arcade machine. That effect is going to be even more pronounced once we incorporate the head tracking features.

Face Tracking

Our endeavors into face tracking this week haven’t been without their challenges. Since we’re dealing with C# ports of C++ libraries (for both OpenCV and the Intel Perceptual Computing SDK), we ran into quite a few issues with the documentation not matching the classes and with Unity not parsing the DLLs correctly. By Friday we had decided to ditch the Unity port of the Intel Perceptual Computing SDK altogether. The nail in the coffin was Unity incorrectly throwing a compile error on one of OpenCV’s core data types because a private variable was declared in both the serialized parent and the inheriting class (Unity essentially thought a local variable was being declared twice). We are now going to write what we need in C++ against OpenCV and the depth data directly, and compile it into a DLL that gets loaded into Unity with just the methods we need exposed there. We’ve also decided to publish what we come up with as an open source library once the competition is complete. We are taking inspiration for this from Open Kinect: a robust community like that will really help stoke the fire for other developers to create applications using Intel’s Perceptual Computing technology.
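
For readers curious what that integration looks like, here is a minimal sketch of how a native plugin of this sort is typically wired into Unity via P/Invoke. The DLL name (KiwiFaceTracker) and the function names are hypothetical placeholders for illustration, not our actual API:

using System.Runtime.InteropServices;
using UnityEngine;

// Hypothetical example of calling into a native OpenCV-based tracking DLL from Unity.
// The C++ side would export matching extern "C" functions and be built as
// KiwiFaceTracker.dll, dropped into the Unity project's Plugins folder.
public class NativeFaceTracker : MonoBehaviour
{
    // these entry points are illustrative placeholders, not our final API
    [DllImport("KiwiFaceTracker")]
    private static extern bool kiwi_init_tracker();

    [DllImport("KiwiFaceTracker")]
    private static extern bool kiwi_get_head_pose(out float yaw, out float pitch, out float roll);

    [DllImport("KiwiFaceTracker")]
    private static extern void kiwi_shutdown_tracker();

    void Start()
    {
        if (!kiwi_init_tracker())
        {
            Debug.LogWarning("Native tracker failed to initialize");
        }
    }

    void Update()
    {
        float yaw, pitch, roll;
        if (kiwi_get_head_pose(out yaw, out pitch, out roll))
        {
            // drive the camera (or anything else) from the native head pose
            transform.localRotation = Quaternion.Euler(pitch, yaw, roll);
        }
    }

    void OnDestroy()
    {
        kiwi_shutdown_tracker();
    }
}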

Speaking of Open Kinect, that library got us thinking that one huge limitation of the Intel SDK is that it only runs on Windows. It would be very beneficial to have a library that bridges the gap to the other operating systems as well. According to Bob Duffy at Intel, “Mac and Linux support are desired, but not yet official.” We will be doing a ton of refinement on the head tracking and eye tracking features and making it easier for Unity developers to work with the Perceptual Computing camera through our library. Perhaps what we end up publishing can be incorporated directly into Intel’s library, and we will all have better tools moving forward.

There were lots of people on Intel’s forum trying to figure out the UV mappings in Unity, and it doesn’t seem that anyone has cracked it yet. Our version does work, but we’ve found it has a number of limitations and would need smoothing to be truly accurate.

For those of you having trouble mapping the depth sensor data to the RGB data in Unity, we offer this code (in C#) for now:

using UnityEngine;
using System;
using System.Runtime.InteropServices;
using OpenCvSharp;

// simple container for a remapped depth sample
public class DepthData
{
    public float rawDepth = 0.0f;
}

public class TexturePlayback : MonoBehaviour
{
    private const float HIGH_PASS = 1000.0f; //825.0f;
    private const float LOW_PASS = 0.0f; //625.0f;

    private Texture2D rgbTexture;
    private PXCUPipeline pp;
    private int[] depthMapSize = new int[2] { 0, 0 };
    private int[] RGBMapSize = new int[2] { 0, 0 };
    private int[] uvMapSize = new int[2] { 0, 0 };
    private PXCUPipeline.Mode mode = PXCUPipeline.Mode.DEPTH_QVGA | PXCUPipeline.Mode.COLOR_VGA;

    void Start()
    {
        pp = new PXCUPipeline();
        if (!pp.Init(mode))
        {
            print("Unable to initialize the PXCUPipeline");
            return;
        }

        pp.QueryRGBSize(RGBMapSize);
        if (RGBMapSize[0] > 0)
        {
            print("rgb map size: width = " + RGBMapSize[0] + ", height = " + RGBMapSize[1]);
            rgbTexture = new Texture2D(RGBMapSize[0], RGBMapSize[1], TextureFormat.ARGB32, false);
            // use the rgb texture as the rendered texture
            renderer.material.mainTexture = rgbTexture;
        }

        pp.QueryDepthMapSize(depthMapSize);
        if (depthMapSize[0] > 0)
        {
            print("depth map size: width = " + depthMapSize[0] + ", height = " + depthMapSize[1]);
        }

        pp.QueryUVMapSize(uvMapSize);
        if (uvMapSize[0] > 0)
        {
            print("uv map size: width = " + uvMapSize[0] + ", height = " + uvMapSize[1]);
        }
    }

    void OnDisable()
    {
        pp.Close();
    }

    void Update()
    {
        if (!pp.AcquireFrame(false))
        {
            return;
        }

        bool textureUpdated = false;

        if (pp.QueryRGB(rgbTexture))
        {
            textureUpdated = true;
        }

        // only attempt the following if we have a depth map with uvs
        if (depthMapSize[0] > 0 && uvMapSize[0] > 0)
        {
            short[] depthStorage = new short[depthMapSize[0] * depthMapSize[1]];
            //IplImage mask = Cv.CreateImage(new CvSize(depthMapSize[0], depthMapSize[1]), BitDepth.U8, 3);
            // the uv map holds an x AND a y coordinate per depth pixel, hence the * 2
            float[] uvStorage = new float[uvMapSize[0] * uvMapSize[1] * 2];
            if (pp.QueryDepthMap(depthStorage) && pp.QueryUVMap(uvStorage))
            {
                //float range = HIGH_PASS - LOW_PASS;
                DepthData[] depthData = new DepthData[depthMapSize[0] * depthMapSize[1]];
                for (int r = 0; r < depthData.Length; r++)
                {
                    depthData[r] = new DepthData();
                }

                // create a depth data map that has been corrected to match the RGB x,y positions
                for (int y = 0; y < depthMapSize[1]; y++)
                {
                    for (int x = 0; x < depthMapSize[0]; x++)
                    {
                        int currentIndex = y * depthMapSize[0] + x;
                        //float rawDepthData = depthStorage[currentIndex];
                        //float depthColor = 0.0f;
                        // find the new x and y for the 640x480 color map using the uvs and scale back to 320x240
                        int xx = (int)(uvStorage[currentIndex * 2 + 0] * depthMapSize[0]);
                        int yy = (int)(uvStorage[currentIndex * 2 + 1] * depthMapSize[1]);
                        // make sure we don't go out of range
                        if (xx >= 0 && xx < depthMapSize[0] && yy >= 0 && yy < depthMapSize[1])
                        {
                            int newIndex = yy * depthMapSize[0] + xx;
                            //if (rawDepthData < HIGH_PASS && rawDepthData > LOW_PASS)
                            //    depthColor = (HIGH_PASS - rawDepthData) / range;
                            //depthData[newIndex] = new Color(depthColor, depthColor, depthColor, 1.0f);
                            depthData[newIndex].rawDepth = depthStorage[currentIndex];
                            // TODO smooth the depth data
                        }
                    }
                }

                // set the mask on the rgb data
                Color[] rgbColors = rgbTexture.GetPixels();
                for (int y = 0; y < rgbTexture.height; y++)
                {
                    for (int x = 0; x < rgbTexture.width; x++)
                    {
                        int currentIndex = y * rgbTexture.width + x;
                        // look back into the depth data for each pixel (the depth map is half the color resolution)
                        int depthIndex = (y / 2) * depthMapSize[0] + (x / 2);
                        // change the color based on the results from the high pass filter
                        if (depthData[depthIndex].rawDepth > HIGH_PASS)
                        {
                            rgbColors[currentIndex] = new Color(1, 1, 1, 1);
                        }
                    }
                }

                // set the new pixels on the texture
                rgbTexture.SetPixels(rgbColors);

                // mark the texture as updated
                textureUpdated = true;
            }
        }

        // if we have had an update, apply it to the material
        // we want Texture2D.Apply to be called as few times as possible since it is very expensive
        if (textureUpdated)
        {
            rgbTexture.Apply();
        }

        pp.ReleaseFrame();
    }

    void SetColor(Texture2D texture, Color color, bool applyInstantly = false)
    {
        Color[] rgbColors = texture.GetPixels();
        for (int y = 0; y < texture.height; y++)
        {
            for (int x = 0; x < texture.width; x++)
            {
                rgbColors[y * texture.width + x] = color;
            }
        }
        texture.SetPixels(rgbColors);
        if (applyInstantly)
        {
            texture.Apply();
        }
    }
}
Here is a quick video displaying the above code:


The moment when light struck our brains came when allocating the uvStorage array in the code above: multiplying its size by 2. Even though the UV map resolution is the same as the depth data, you need to double it to account for the x AND y coordinates needed for mapping each pixel. We lost some hours on that (why return a specific UV map size from QueryUVMapSize that then needs to be doubled? Are you angry with us?).
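
One refinement the mapping code still punts on (the TODO in the remapping loop) is smoothing the remapped depth. A minimal sketch of the kind of pass we have in mind, a helper that could slot into the TexturePlayback script above, might look like the following; the window size and the treatment of zero as "no sample" are assumptions on our part, not measured behavior:

// Hypothetical smoothing pass for the remapped depth map.
// Assumes a rawDepth of 0 means "no sample landed here" after the UV remap.
float[] SmoothDepth(DepthData[] depthData, int width, int height, int radius = 1)
{
    float[] smoothed = new float[width * height];
    for (int y = 0; y < height; y++)
    {
        for (int x = 0; x < width; x++)
        {
            float sum = 0.0f;
            int count = 0;
            // average the non-empty samples in a small window around this pixel
            for (int dy = -radius; dy <= radius; dy++)
            {
                for (int dx = -radius; dx <= radius; dx++)
                {
                    int nx = x + dx;
                    int ny = y + dy;
                    if (nx < 0 || nx >= width || ny < 0 || ny >= height)
                        continue;
                    float d = depthData[ny * width + nx].rawDepth;
                    if (d > 0.0f)
                    {
                        sum += d;
                        count++;
                    }
                }
            }
            smoothed[y * width + x] = count > 0 ? sum / count : 0.0f;
        }
    }
    return smoothed;
}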

Detailed field of view specs for the RGB + depth camera are not provided by Intel (just one figure for diagonal FOV). Open Kinect’s API and spec pages give you diagonal, vertical, and horizontal FOV, which is nice to have for estimating the size of objects in a video stream (rather than doing a full camera calibration via OpenCV). This is something Intel should provide out of the box. In the future you will be able to use our code (for this make and model of camera).
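
To show why those extra numbers matter, here is the back-of-the-envelope estimate a horizontal FOV enables. The FOV constant below is a placeholder we made up for illustration, not a published spec for this camera:

using System;

// Rough object-size estimate from a depth reading and the camera's horizontal FOV.
// HORIZONTAL_FOV_DEGREES is an assumed value; Intel only publishes a diagonal FOV,
// so a real implementation would need to measure or derive the horizontal figure.
public static class FovSizeEstimate
{
    private const float HORIZONTAL_FOV_DEGREES = 70.0f; // assumed, not a published spec
    private const int IMAGE_WIDTH_PIXELS = 640;

    // width in millimeters of an object that spans pixelWidth pixels at depthMm distance
    public static float EstimateWidthMm(int pixelWidth, float depthMm)
    {
        float halfFovRadians = HORIZONTAL_FOV_DEGREES * 0.5f * (float)Math.PI / 180.0f;
        // total scene width visible at this depth
        float sceneWidthMm = 2.0f * depthMm * (float)Math.Tan(halfFovRadians);
        // the object's share of the image maps to its share of the scene width
        return sceneWidthMm * pixelWidth / IMAGE_WIDTH_PIXELS;
    }
}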

Eye Tracking

The Eigen Eyes demo seems too slow, so we’ve decided instead to use a Haar cascade of eye images to grab the eye region, then another algorithm to detect exact pupil movement. We had fun this week playing with all of the Haar cascades that ship with OpenCV (eyes, mouth, frontal face, profile face, and more); the set is pretty comprehensive, and we’re very pleased.
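
As a rough sketch of that first stage, eye-region detection with one of the stock cascades might look like this. Note that this uses OpenCvSharp’s newer Mat-based CascadeClassifier wrapper rather than the IplImage-style API in the mapping code above, and the scale factor and minimum-neighbors values are just typical starting points, not tuned numbers:

using OpenCvSharp;

// Hypothetical first stage of the eye pipeline: find candidate eye regions with a
// stock Haar cascade, then hand each region off to a separate pupil-detection step.
public class EyeRegionDetector
{
    private readonly CascadeClassifier eyeCascade;

    public EyeRegionDetector(string cascadePath)
    {
        // e.g. the haarcascade_eye.xml file that ships with OpenCV
        eyeCascade = new CascadeClassifier(cascadePath);
    }

    public Rect[] DetectEyes(Mat frameBgr)
    {
        using (Mat gray = new Mat())
        {
            // cascades expect a grayscale, contrast-normalized image
            Cv2.CvtColor(frameBgr, gray, ColorConversionCodes.BGR2GRAY);
            Cv2.EqualizeHist(gray, gray);
            return eyeCascade.DetectMultiScale(gray, 1.1, 3);
        }
    }
}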

Since this is a pretty complex system and difficult to describe with words alone, we’ve put together an activity diagram of our new approach.

Some Final Thoughts on Perceptual Computing

While working with the Intel Perceptual Computing SDK and the Intel Perceptual Computing (IPC) camera, we’ve been getting inspired by all the possibilities this technology enables. We mentioned last week how cool it would be to incorporate Google Glass into the mix, but there are some other devices that could offer an interesting future for perceptual computing too. If you haven’t checked out the Leap and the Myo, do yourself a favor and do so now.

Leap Motion
The Leap takes an interesting approach as a perceptual computing device. It does not include an RGB camera, nor is it mounted like a typical webcam. Instead, it sits on your desk and detects hand movements and gestures from the bottom up. This of course means that eye tracking and head tracking aren’t possible, but we like their setup for hand gestures. The Leap combined with the IPC camera could make for some very interesting applications. Imagine using the IPC camera for face tracking, as we are doing with Kiwi Catapult Revenge, and using the Leap sensor to focus on hand gestures.

Another approach to perceptual computing, and something very, very different, is the Myo. This device fits on your arm as a band and detects muscle movements that can be boiled down to exact finger movements, wrist rotations, and more. The coolest thing about this detection method is that it doesn’t rely on line of sight to detect gestures. Myo combined with Brass Monkey controls would open up all kinds of possibilities. Imagine what could be done with gun shooting or sword fighting games: you would have the phone in your hand to trigger the touch screen and send gyroscope data, plus the muscle events from the Myo. Combined, this would allow for very finely tuned detection, and you would not need a camera involved at all. Of course, adding an IPC camera would only make the possibilities that much greater. Head and gaze tracking, plus hand movements outside the camera’s line of sight, plus the data from the phone via Brass Monkey: the possibilities make our heads spin. Click to watch a Myo video.

What would you do with these next generation Perceptual Computing gadgets? What ways do you see them working in conjunction with Intel’s offering? We would love to hear your ideas, and of course, as always we welcome your feedback on our progress in the competition.
