Ultimate Coder 2, Week 7 :: Wrapping Up and Sucking Up to Nicole

Decisions, decisions...

As this is the final post, we really need to drill down on several things and explain how they affect the final product. That's likely to make for an atypically long and potentially boring post of pain and suffering...sorry Judges. But you should perhaps thank Nicole, who clearly drove us to this. :)

Let's start with our particular design/programming problem:

Some of the other Ultimate Coders and judges have suggested that taking an existing product was a particularly difficult task because it meant we had to fit the perceptual inputs to a pre-existing control schema. While that was true in technical practice, as a team we thrive on that kind of thing and may have had an easier time adjusting to that set of tasks than expected. The problem was not a simple one, so we used the first week of the competition to determine what changes we wanted in a perfect Sci-Fi world, and from there started moving backward to what was actually practical and needed. In other words, we aimed really, really high at first, knowing full well we'd never get there. But in that frame of mind everything was on the table. At the end of our first-week team mind-meld we decided on a path forward:

Don't force perceptual computing on our current schema.
Instead develop a full input class to process the perceptual input and make a new schema specifically built around the unique possibilities of the hardware.

Out of this mantra we wound up creating a simple event- and variable-driven input class that can be accessed at any time and in any script. The new input class needed to be verbose, but it also needed a minimal processor footprint because of the already hungry code behind the retina-level graphics and animations. This limit drove us to build a library that got cleaner, smaller, and faster as we went along, all in the pursuit of a higher frame rate. Iteration one of our tracking, for instance, was small to begin with and only shrank as the competition went along.
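For readers who want to picture what "event- and variable-driven" means, here is a minimal sketch in Python. It is not our actual game code: the class name `PerceptualInput`, the state keys, and the methods `on` and `update` are all made up for illustration. The idea it shows is that any script can poll the latest values, while callbacks fire only when a value actually changes.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List


class PerceptualInput:
    """Hypothetical input hub; every name here is illustrative, not from the SDK."""

    def __init__(self) -> None:
        # Variable side: the latest values, which any script can read at any time.
        self.state: Dict[str, object] = {"hand_pos": (0.5, 0.5), "hand_open": True}
        # Event side: callbacks registered per state key.
        self._listeners: DefaultDict[str, List[Callable[[object], None]]] = defaultdict(list)

    def on(self, key: str, callback: Callable[[object], None]) -> None:
        """Register a callback that runs when the value stored under `key` changes."""
        self._listeners[key].append(callback)

    def update(self, key: str, value: object) -> None:
        """Store a new value and notify listeners, but only on a real change."""
        if self.state.get(key) == value:
            return  # unchanged value: skip the callbacks to keep per-frame cost low
        self.state[key] = value
        for callback in self._listeners[key]:
            callback(value)


# Usage: one script listens for the hand closing, another could just poll the state.
inputs = PerceptualInput()
fired: List[object] = []
inputs.on("hand_open", fired.append)
inputs.update("hand_open", False)  # value changed, so the callback fires
inputs.update("hand_open", False)  # same value again, so nothing fires
```

Skipping the callbacks when nothing changed is one plausible way to keep the footprint small; the real class may have made different trade-offs.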

Driving Factors:

Speed and responsiveness became organizing catalysts that drove several decisions. It turns out the camera is very computationally heavy when using more than one video stream. Audio processing was also painfully expensive but necessary for a viable product. Saying “Start” needed to work.

However, the pressure to keep it small was not the roadblock it might have been. Instead, the limits forced us to reduce ideas to the simplest possible form. Two hands became one. Twenty voice commands became eight. Eye tracking became head kinda-tracking. But serendipitously the ongoing simplification consistently made the actual experience better - not worse.

After that the initial mind-meld perceptual synthesizing started. The process was slow and painful, often raw exploration of the SDK and just trying to determine the hardware's capabilities. Jumping in like that seemed daunting but allowed us to challenge preconceptions at the same time.

Perhaps the first thing to evolve in this way was the set of hand gestures. Looking through the SDK we explored a wide variety of options on what gestures to use. At first a "thumbs up" seemed the quickest to recognize and respond to, so we implemented that. But at GDC we saw that recognition started to fail in a complicated environment. What's more, thumbs up required hands and arms to align correctly, whereas the "peace" gesture was recognizable from many angles - so we've changed that. Similarly, as stated earlier, we started by making a control schema based on two hands. At first we used thumbs up and closed hand as our trigger method. While that was supposed to be the final idea, the gesture recognition delay was too long to give the user a usable response time, so we switched to open hand and closed hand, which significantly reduced the response time during user testing. After the initial implementation of two-handed control there were two main problems. The first and most annoying was that the target reticule shook like leaves in a light breeze. The second was that the player's arms got so tired that it was not a feasible end solution. Buffering tracking data solved the shaking problem, and changing the active field of view provided for a gaming posture that allowed the arms to rest at reasonable periods.
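To give a feel for how buffering tracking data calms a shaky reticule, here is a rough Python sketch of one common approach: a moving average over the last few tracked positions. This is an assumption about the general technique, not a copy of our code. The class name `ReticuleSmoother` and the default window of 5 frames are invented for the example.

```python
from collections import deque
from typing import Deque, Tuple


class ReticuleSmoother:
    """Hypothetical moving-average buffer for tracked (x, y) positions."""

    def __init__(self, window: int = 5) -> None:
        # The window size is an assumed tuning knob: bigger means
        # steadier but laggier. Old samples drop off automatically.
        self._samples: Deque[Tuple[float, float]] = deque(maxlen=window)

    def push(self, x: float, y: float) -> Tuple[float, float]:
        """Add the newest raw position and return the smoothed one."""
        self._samples.append((x, y))
        count = len(self._samples)
        avg_x = sum(sample[0] for sample in self._samples) / count
        avg_y = sum(sample[1] for sample in self._samples) / count
        return (avg_x, avg_y)


# Usage: feed in jittery raw positions, draw the reticule at the returned point.
smoother = ReticuleSmoother(window=3)
smoother.push(0.0, 0.0)
smoother.push(3.0, 3.0)
smoothed = smoother.push(6.0, 0.0)
```

The trade-off is the usual one: every extra frame in the buffer removes a bit more shake and adds a bit more lag.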

On to head tracking and coding pain:

Starting the process to alleviate the second problem, TAS (Tired Arm Syndrome), we removed the need to use two hands and moved camera movement to simple head tracking...
simple head tracking...
LOL!
Based on our initial exposure to the SDK we thought head tracking was so simple anyone could do it. But, um, that ended up being false. The camera loved to lose the player's head constantly, which meant tracking was completely wonky unless you lived in a box...which is exactly how we were testing it...and fast forward to 1 day before leaving for GDC.

With one day to go both the main GUI developer and Perceptual developer were in agreement:
The "perceptualness" of Stargate Gunship was terrible.
Seriously, it almost wasn't worth presenting. The reticule was still terrible, the responsiveness was laughable and no one knew what was going on while playing. The race was on to make SGG at GDC not lamesauce. The breakthrough came when Gavin realized the urgent need for some positive, on-screen feedback. Within 8 hours we added screen representations for where your head and hands were as seen by the camera. That simple GUI element allowed a player to quickly orient his/her hand and head and see immediate feedback on their motion. It was a kind of natural calibration step that was a gigantic step forward. We also added the thumbs up/down gesture (but eventually found that to be problematic) to set an initial location for your head and hands.

This iterative change unlocked a whole new breakthrough: the possibility of a dynamic sampling area because, you know, not everyone will be exactly the same or hold their arms in the same place. We found that there was no need to sample the entire camera field of view. Instead, we find the player's hand and then ignore all pixels beyond about a quarter of the available resolution surrounding our target. That saved us a LOT of processor horsepower by trimming the sample area by 75%. We also added a dampening array to buffer reticule movement over a few frames, which does add a delay (1/6 s) but makes the jitter minimal and promotes usability.
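Here is a small Python sketch of the dynamic sampling area idea. It assumes the window is a rectangle half the frame wide and half the frame tall (a quarter of the pixels), centered on the hand and clamped to the frame edges. The function name `sampling_window` and that exact window shape are assumptions for illustration; only the "quarter of the resolution" figure comes from our actual work.

```python
from typing import Tuple


def sampling_window(hand_x: int, hand_y: int,
                    frame_w: int, frame_h: int) -> Tuple[int, int, int, int]:
    """Return (left, top, width, height) of the region to sample around the hand.

    Assumed shape: half the frame in each dimension, so the region covers
    a quarter of the pixels and the other 75% can be ignored.
    """
    win_w, win_h = frame_w // 2, frame_h // 2
    # Center the window on the hand, then clamp so it stays inside the frame.
    left = min(max(hand_x - win_w // 2, 0), frame_w - win_w)
    top = min(max(hand_y - win_h // 2, 0), frame_h - win_h)
    return (left, top, win_w, win_h)


# Usage: a hand in the middle of a 320x240 depth frame.
centered = sampling_window(160, 120, 320, 240)
# A hand near the bottom-left corner: the window slides to stay in frame.
cornered = sampling_window(10, 230, 320, 240)
```

Because the window follows the hand from frame to frame, players who hold their arms in different places still get the full tracking quality inside a much smaller region.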

8 Hours == Playable demo we can agree doesn’t suck. Here we come GDC!

GDC:

A thriving trade show floor is proof that internal QA is always useless, aka the white box of testing is dust free and perfect...and completely unrealistic. After just two hours of booth time at GDC we found head tracking was fidgety (that is being nice), the thumbs up sign was the most difficult thing for the hundred or so people trying the game, and the symbols for what the camera is doing were confusing and esoteric. Yet it was not all bad: our hand tracking for causing explosive fun was “perfect” (Thanks random GDC dude). So despite a less-than-perfect demonstration, the real-world exposure was probably the most useful and powerful version of Occam's Razor to bring Stargate Gunship out of an academic experiment to a genuinely marketable product.

What's Next?

So, with GDC behind us, what can we do in one final week of polish? We plan to perfect the gesture feedback controls and re-work head tracking based on what we saw and learned. We put off worrying about head tracking but now is the time to get it right and implement gestures and voice commands that are easier to compute. The new gesture we used was the peace sign (Gosh duh, two fingers will always be better than a single thumb). The perceptual GUI will be enhanced to use a face icon to show face location and a hand that can show the peace sign, the open hand and the closed hand, which actually shows the user what is going on. (Head to forehead bonk, duh use visual symbols not targeting reticules) That leaves us this next week to re-imagine head tracking to allow useful results outside the little white room.

In summary, regardless of how the judges rate the apps, we had a fantastic experience here. It was a genuinely challenging puzzle with real-world stakes and it took a lot of time and energy to tackle it. That said, perhaps the best, and unlooked-for, result was how the work we did to re-imagine the UI for perceptual made us completely rethink how we did the UI with touch as well. Next to the perceptual version of SGG we were testing a new touch version of the game on iPad and we found a far, far better way to play the game that would not have happened without this contest.

So Intel - thanks for drawing the best out of us, and look for SGG on AppUp soon!
