A couple of months ago, I was lucky enough to be asked if I would like to participate in an upcoming Intel® challenge, known at the time as Ultimate Coder 2. Having followed the original Ultimate Coder competition, I was highly chuffed to even be considered. I had to submit a proposal for an application that would work on a convertible Ultrabook™ and would make use of something called Perceptual Computing. Fortunately for me, I'd been inspired a couple of days earlier to write an application and describe how it was developed on CodeProject - my regular hangout for writing about things that interest me and that haven't really been covered much by others. This seemed to me to be too good an opportunity to resist; I did some research on what Perceptual Computing was, and decided I'd write the application to incorporate the features that I thought would be a good match. As a bonus, I'd get to write about this and, as I like giving code away, I'd publish the source to the actual Perceptual Computing side as a CodeProject article at the end of the competition.
Okay, at this point, you're probably wondering what the application is going to be. It's a photo editing application, called Huda, and I bet your eyes just glazed over at that point because there are a bazillion and one photo editing applications out there and I said this was going to be original. Right now you're probably wondering if I've ever heard of Photoshop® or Paint Shop Pro®, and you possibly think I've lost my mind. Bear with me though, I did say it would be different and I do like to try and surprise people.
A slight sidebar here. I apologise in advance if my assault on the conventions of the English language becomes too much for you. Over the years I've developed a chatty style of writing and I will slip backwards and forwards between the first and third person as needed to illustrate a point - when I get into the meat of the code that you need to write to use the frameworks, I will use the third person.
So what's so different about Huda? Why do I think it's worth writing articles about? In traditional photo editing applications, when you change the image and save it, the original image is lost (I call this a destructive edit) - Fireworks did offer something of this ability, but only if you worked in .png format. In Huda, the original image isn't lost because the edits aren't applied to it - instead, the edits are effectively kept as a series of commands that can be replayed against the image, which gives the user the chance to come back to a photo months after they last edited it and do things like insert filters between others, or possibly rearrange and delete filters. The bottom line is, whatever edit you want to apply, you can still open your original image. Huda will, however, provide the ability to export the edited image so that everyone can appreciate your editing genius.
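To make the "series of commands replayed against the image" idea a bit more concrete, here's a minimal sketch. It's in Python purely for brevity (Huda itself is WPF), and everything in it is an assumption for illustration: the names `Edit`, `EditStack`, `invert` and `brighten` are hypothetical, and the "image" is just a toy grid of greyscale values rather than a real bitmap. This is not Huda's code, just the general shape of a non-destructive edit list.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Toy stand-in for a bitmap: rows of greyscale pixel values (0-255).
Image = List[List[int]]


@dataclass(frozen=True)
class Edit:
    """One replayable command (hypothetical name, for illustration only)."""
    name: str
    apply: Callable[[Image], Image]


def invert(img: Image) -> Image:
    return [[255 - p for p in row] for row in img]


def brighten(img: Image) -> Image:
    return [[min(255, p + 40) for p in row] for row in img]


@dataclass
class EditStack:
    """Holds the untouched original plus an ordered list of edits."""
    original: Image
    edits: List[Edit] = field(default_factory=list)

    def add(self, edit: Edit, index: Optional[int] = None) -> None:
        # Append by default, or insert between existing edits.
        if index is None:
            self.edits.append(edit)
        else:
            self.edits.insert(index, edit)

    def remove(self, index: int) -> None:
        del self.edits[index]

    def render(self) -> Image:
        # Work on a copy so the original pixels are never modified.
        img = [row[:] for row in self.original]
        for edit in self.edits:
            img = edit.apply(img)
        return img


original = [[0, 100], [200, 255]]
stack = EditStack(original)
stack.add(Edit("invert", invert))
stack.add(Edit("brighten", brighten))
edited = stack.render()                      # replay both edits on a copy
stack.add(Edit("brighten", brighten), index=0)  # slot a filter in at the front
re_edited = stack.render()
```

Because `render` always starts from a copy of the original, exporting is just a matter of saving whatever `render` returns, and the source photo stays exactly as it was.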
At this stage, you should be able to see where I'm going with Huda (BTW it's pronounced Hooda), but what really excited me was the ability to offer alternative editing capabilities to users. This, to me, has the potential to really open up the whole photo editing experience for people, and to make it accessible beyond the traditional mouse/keyboard/digitizer inputs. After all, we now have the type of hardware available to us that we used to laugh at in Hollywood films, so let's develop the types of applications that we used to laugh at. In fact, I've turned to Hollywood for ideas because users have been exposed to these ideas already and this should help to make it a less daunting experience for users.
Why is this learning curve so important? Well, to answer this, we really need to understand what I think Perceptual Computing will bring to Huda. You might be thinking that Perceptual Computing is a buzz phrase, or marketing gibberish, but I really believe that it is the next big thing for users. We have seen the first wave of devices that can do this with the Wii and then the Xbox/Kinect combination, and people really responded to this, but these stopped short of what we can achieve with the next generation of devices and technologies. I'll talk about some of the features that I will be fitting into Huda over the next few weeks and we should see why I'm so excited about the potential and, more importantly, what I think the challenges will be.
Touch computing. Touch is an important feature that people are used to already, and while this isn't being touted in the Perceptual Computing SDK, I do feel that it will play a vital part in the experience for the user. As an example, when the user wants to crop an image, they'll just touch the screen where they want to crop to - more on this in a minute because this ties into another of the features we'll use. Now this is all well and good but we can do more, perhaps we can drag those edits round that we were talking about to reorder them. But wait, didn't we say we want our application to be more Hollywoody? Well, how about we drag the whole interface around? Why do we have to be constrained for it to look like a traditional desktop application? Let's throw away the rulebook here and have some fun.
Gestures. Well, touch is one level of input, but gestures take us to a whole new level. Whatever you can do with touch, you really should be able to do with gesture, so Huda will mimic touch with gestures, but that's not enough. Touch is 2D, and gestures are 3D, so we really should be able to use that to our advantage. As an example of what I'll be doing with this - you'll reach towards the screen to zoom in, and pull back to zoom out. The big challenge with gestures will be to provide visual cues and prompts to help the user, and to cope with the fact that gestures are a bit less accurate. Gestures are the area that really excites me - I really want to get that whole Minority Report feel and have the user drag the interface through the air. Add some cool glow effects to represent the fingertips and you're well on the way to creating a compelling user experience.
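As a rough illustration of the reach-to-zoom idea, here's a small Python sketch that turns a hand's distance from the camera into a zoom factor and smooths the result to take the edge off jittery readings. Every name and number here is an assumption: `zoom_from_depth`, `SmoothedValue`, the 0.2 m to 0.6 m working range and the 1x to 4x zoom limits are all made up for the example, and none of it comes from the Perceptual Computing SDK.

```python
def zoom_from_depth(depth_m: float,
                    near_m: float = 0.2,
                    far_m: float = 0.6,
                    min_zoom: float = 1.0,
                    max_zoom: float = 4.0) -> float:
    """Map hand distance (metres) to zoom: closer hand, bigger zoom."""
    # Clamp so readings outside the working range don't overshoot.
    clamped = max(near_m, min(far_m, depth_m))
    # t is 0.0 at the far edge of the range and 1.0 at the near edge.
    t = (far_m - clamped) / (far_m - near_m)
    return min_zoom + t * (max_zoom - min_zoom)


class SmoothedValue:
    """Exponential smoothing to damp noisy gesture samples."""

    def __init__(self, alpha: float = 0.3) -> None:
        self.alpha = alpha      # higher alpha follows the hand more quickly
        self.value = None       # no sample seen yet

    def update(self, sample: float) -> float:
        if self.value is None:
            self.value = sample
        else:
            self.value += self.alpha * (sample - self.value)
        return self.value


smoother = SmoothedValue(alpha=0.5)
for depth in (0.6, 0.5, 0.3, 0.2):   # hand moving towards the screen
    zoom = smoother.update(zoom_from_depth(depth))
```

The smoothing trades a little responsiveness for stability, which seems a fair bargain when the input is a hand waving in mid-air rather than a finger on glass.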
Voice. Voice systems aren't new. They've been around for quite a while now, but their potential has remained largely unrealised. Who can forget Scotty, in Star Trek, picking up a mouse and talking to it? Well, voice recognition should play a large part in any Perceptual system. In the crop example, I talked about using touch, or gestures, to mark the cropping points; well, at this point your hands are otherwise occupied, so how do you actually perform the crop? With a Perceptual system, you merely need to say "Crop" and the image will be cropped to the crop points. In Huda, we'll have the ability to add a photographic filter merely by issuing a command like "Add Sepia". In playing round with the voice code, I have found that while it's incredibly easy to use this, the trick is to really make the commands intuitive and memorable. There are two ways an application can use voice; either dictation or command mode. Huda is making use of command mode because that's a good fit. Interestingly enough, my accent causes problems with the recognition code, so I'll have to make sure that it can cope with different accents. If I'd been speaking with a posh accent, I might have missed this.
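Here's a minimal sketch of what handling command mode might look like once the recogniser has handed back a phrase as text. It's Python for brevity and it deliberately doesn't touch any real speech API: `parse_command`, the `KNOWN_FILTERS` set and the action names are all hypothetical, chosen only to mirror the "Crop" and "Add Sepia" examples above.

```python
from typing import Optional, Tuple

# Hypothetical filter vocabulary; a real application would define its own.
KNOWN_FILTERS = {"sepia", "blur", "greyscale"}

Command = Tuple[str, Optional[str]]


def parse_command(utterance: str) -> Optional[Command]:
    """Turn recognised text into (action, argument), or None if unknown."""
    words = utterance.lower().split()
    if words == ["crop"]:
        return ("crop", None)
    if len(words) == 2 and words[0] == "add" and words[1] in KNOWN_FILTERS:
        return ("add_filter", words[1])
    # Anything else is ignored rather than guessed at.
    return None


crop = parse_command("Crop")
sepia = parse_command("Add Sepia")
unknown = parse_command("make it look nicer")
```

Keeping the vocabulary small and rejecting everything else is one way to keep the commands memorable, and it gives the recogniser fewer phrases to confuse with one another.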
A feature that I'll be evaluating for usefulness is the use of facial recognition. An idea that's bubbling around in my mind is having facial recognition provide things like different UI configurations and personalising the most recently used photos depending on who's using the application. The UI will be fluid, in any case, because it's going to cope with running as a standard desktop, and then work in tablet mode - one of the many features that makes Ultrabooks™ cool.
So, how much of Huda is currently built? Well, in order to keep a level playing field, I only started writing Huda on the Friday at the start of the competition. Intel were kind enough to supply a Lenovo® Yoga 13 and a Gesture Camera to play around with, and I've spent the last couple of weeks getting up to speed with the Perceptual SDK. Huda is being written in WPF because this is a framework that I'm very comfortable in and I believe that there's still a place for desktop applications, plus it's really easy to develop different types of user interfaces, which is going to be really important for the application. My aim here is to show you how much you can accomplish in a short space of time, and to provide you with the same functionality at the end as I have available. This, after all, is what I like doing best. I want you to learn from my code and experiences, and really push this forward to the next level. Huda isn't the end point. Huda is the starting point for something much, much bigger.
Final thoughts. Gesture applications shouldn't be hard to use, and the experience of using them should be easily discoverable. I want the application to let the user know what's right, and to be intuitive enough to use without having to watch a 20-minute getting started video. It should be familiar and new at the same time. Hopefully, by the end of the challenge, we'll be in a much better position to create compelling Perceptual applications, and I'd like to thank Intel® for giving me the chance to try and help people with this journey. And to repay that favour, I'm making the offer that you will get access to all the perceptual library code I write.