Tom Cruise in “Minority Report”, or Tony Stark in “Iron Man” – that’s what most people picture when you bring up the phrase “perceptual computing”. Imagine controlling your computer with your voice or a wave of your hand, rather than a mouse, a keyboard, or even a touchscreen. Perceptual computing adds natural human interactions with machines – facial recognition, voice commands, gesture swiping, and so on – to those tried-and-true control mechanisms. At its core, perceptual computing is responsive computing tailored to each user’s unique needs. There’s a lot going on in this space right now, and in this article we’re going to highlight just a few of the most exciting innovations built on perceptual computing.
The SixthSense device, conceived and designed through a series of collaborative experiments at the MIT Media Lab, hangs around your neck and projects computer-generated content onto a wall or any other surface. Users can interact with that content directly, dialing a telephone number projected onto a palm, moving images around on a wall with their hands, or emailing what they see to someone else:
“The SixthSense prototype is comprised of a pocket projector, a mirror and a camera. The hardware components are coupled in a pendant like mobile wearable device. Both the projector and the camera are connected to the mobile computing device in the user’s pocket. The projector projects visual information enabling surfaces, walls and physical objects around us to be used as interfaces; while the camera recognizes and tracks user's hand gestures and physical objects using computer-vision based techniques. The software program processes the video stream data captured by the camera and tracks the locations of the colored markers (visual tracking fiducials) at the tip of the user’s fingers using simple computer-vision techniques. The movements and arrangements of these fiducials are interpreted into gestures that act as interaction instructions for the projected application interfaces. The maximum number of tracked fingers is only constrained by the number of unique fiducials, thus SixthSense also supports multi-touch and multi-user interaction.” – SixthSense, MIT Media Lab
Watch the video of this amazingly simple device in action to get an idea of what it’s capable of:
The SixthSense device is basically a wearable computer that interacts intuitively with the world around you, layering digital information onto the physical world. The entire device is simply a webcam, a battery-powered 3M projector with a mirror, and an Internet-connected mobile phone. Of course, the real magic is in the software, which recognizes gestures and the surrounding environment and instantly layers contextual information on top of whatever it is you’re doing.
Inventor Pranav Mistry wore caps from Magic Markers on his fingers so that the camera could tell which finger was which and recognize hand gestures, aided by software that Mistry created himself. When he uses his fingers to frame and snap a photo (no camera needed), the images are instantly sent to his mobile phone; he can later pull those images up with a gesture and start editing them on any available surface.
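The colored-cap trick lends itself to a very small amount of logic once the markers have been located in the video frame. The sketch below is purely illustrative – the color-to-finger mapping, the threshold, and the function names are assumptions, not Mistry’s actual implementation – but it shows how labeled fingertips could be turned into the “photo frame” gesture:

```python
# Illustrative sketch of SixthSense-style fiducial logic. Assumes an
# upstream vision step has already found each colored marker cap and
# reports (color, x, y); the color-to-finger mapping is hypothetical.

FINGER_BY_COLOR = {
    "red": ("right", "index"),
    "yellow": ("right", "thumb"),
    "green": ("left", "index"),
    "blue": ("left", "thumb"),
}

def identify_fingers(detections):
    """Map raw marker detections to labeled fingertip positions.

    detections: list of (color, x, y) tuples from the camera tracker.
    Returns a dict {(hand, finger): (x, y)}.
    """
    fingers = {}
    for color, x, y in detections:
        if color in FINGER_BY_COLOR:
            fingers[FINGER_BY_COLOR[color]] = (x, y)
    return fingers

def is_framing_gesture(fingers, min_span=100):
    """Detect the 'photo frame' pose: both thumb/index pairs present,
    with the four tips spanning a large enough rectangle (in pixels)."""
    required = {("right", "index"), ("right", "thumb"),
                ("left", "index"), ("left", "thumb")}
    if not required <= fingers.keys():
        return False
    xs = [p[0] for p in fingers.values()]
    ys = [p[1] for p in fingers.values()]
    return (max(xs) - min(xs)) >= min_span and (max(ys) - min(ys)) >= min_span
```

When the framing pose is held, the application would grab the current camera frame as the “photo” – the snap itself is just a frame capture.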
The device can also project a word cloud of sorts that tells you about someone you’re meeting for the first time: their URL, the company they work for, the position they hold, their social media profiles, etc. In another demo, Mistry picks up a boarding pass and projects the current status of his flight on top of it. Want to know what time it is? Just draw a circle on your arm and a watch will pop up. How about email? Draw an “@” symbol in the air. Need to call someone? Project a phone pad onto your palm and dial a number – no phone required. One of the most intriguing demos was reading a newspaper while a video giving more information about the story is projected directly onto the page; this is truly interactive technology:
"We're trying to make it possible to have access to relevant information in a more seamless way," says Dr. Pattie Maes, who heads the Fluid Interfaces Group at MIT.
She says that while today's mobile computing devices can be useful, they are "deaf and blind," meaning that we have to stop what we're doing and tell those devices what information we need or want. “We have a vision of a computing system that understands, at least to some extent, where the user is, what the user is doing, and who the user is interacting with," says Dr. Maes. "SixthSense can then proactively make information available to that user based on the situation." – BBC, “SixthSense blurs digital and the real”
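Stroke gestures like the drawn-circle “watch” demo are often recognized with simple geometric heuristics rather than heavy machine learning. The sketch below is an illustrative guess, not the SixthSense code, and the tolerances are assumptions: a fingertip path counts as a circle when it closes on itself and stays roughly equidistant from its centroid.

```python
import math

# Hypothetical circle-gesture detector: 'stroke' is a list of (x, y)
# fingertip samples. Thresholds are illustrative assumptions.

def is_circle(stroke, closure_tol=0.3, roundness_tol=0.25):
    if len(stroke) < 8:
        return False
    # Centroid of the sampled path.
    cx = sum(x for x, _ in stroke) / len(stroke)
    cy = sum(y for _, y in stroke) / len(stroke)
    radii = [math.hypot(x - cx, y - cy) for x, y in stroke]
    mean_r = sum(radii) / len(radii)
    if mean_r == 0:
        return False
    # The stroke must nearly close on itself...
    closed = math.hypot(stroke[-1][0] - stroke[0][0],
                        stroke[-1][1] - stroke[0][1]) <= closure_tol * mean_r
    # ...and every sample must sit near the mean radius.
    round_enough = max(abs(r - mean_r) for r in radii) <= roundness_tol * mean_r
    return closed and round_enough
```

A recognized circle on the wrist would then trigger the watch overlay; the “@” gesture would use an analogous shape test.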
Ultimate Coder projects
In the Ultimate Coder: Going Perceptual Challenge, seven developer teams are competing over the span of seven weeks to build the ultimate app prototype using the latest Ultrabook convertible hardware along with the Intel Perceptual Computing camera. The seven projects have turned out to be wildly different:
- Sixense is creating Puppet Theater, an interactive virtual puppet experience that users can control using gestures and voice input
- Lee Bamber is creating a new breed of webcam software “which solves the problem of bandwidth lag while teleconferencing with multiple users over the internet. At the same time, I will build features into the app to allow hands-free sharing and manipulation of images, voice and facial synthesis to protect your privacy and my favourite; the ability to transport yourself into a virtual 3D scene such as a conference room or Viking mead hall.”
- The Code-Monkeys are revamping a game that was previously touch-based, writing full perceptual computing plugins for Unity 4 and Windows
- Simian Squared are creating a virtual pottery studio: “We will be using the PerC hardware to allow the user to interact directly with their digital clay - sculpting, moulding and painting it physically with their hands as well as using the Ultrabook's unique form factor and it's touch screen to access the tools.”
- Eskil is creating an entirely new kind of interface: “The project under taken during the challenge will be to enhance a simple to use platform library with features such as multitouch, tilt sensors, head-tracking and stereoscopics…. I will develop a identified interface toolkit with sliders, buttons, menus and so on designed specifically for software development where the application may run on as diverse hardware setups as: tablets, TVs, PCs, Laptops, convertibles, tables, head-mounted displays or large scale multi-user walls and include features such as wands, head tracking, multitouch and stereoscopics.”
- Brass Monkey/Infrared5 are creating a new game that combines play experiences like those on the Nintendo Wii and Microsoft Kinect: “To that end, we plan on using the perceptual computing SDK for head tracking, allowing the player to essentially peer into the tablet/laptop screen like a window. When the player moves his/her head the view of the 3D world will change depending on the angle of the player’s vision.”
- Peter O’Hanlon is creating a photo editing application that “makes use of the Ultrabook and Perceptual Computing to provide a wide variety of inputs and manipulations – bringing the type of manipulation you see in films like Minority Report to real world applications.”
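The “peer into the screen like a window” effect that Brass Monkey/Infrared5 describe boils down to sliding the virtual camera in step with the viewer’s head, so the 3D scene appears fixed behind the glass. This is a minimal sketch of that mapping, assuming the tracker reports a normalized head position plus distance; the function name and the simple linear model are illustrative, not their implementation:

```python
# Hypothetical head-tracked "window" parallax: convert a tracked head
# position into a virtual-camera translation in scene units.

def camera_offset(head_x, head_y, head_z, screen_w, screen_h):
    """head_x, head_y: normalized head position in [0, 1] from the
    tracker; head_z: head distance from the screen in metres;
    screen_w, screen_h: physical screen size in metres.

    Moving the head right slides the camera right, so the user sees
    more of the scene's left side through the 'window'.
    """
    # Center the normalized coordinates around the screen midpoint.
    dx = (head_x - 0.5) * screen_w
    dy = (head_y - 0.5) * screen_h
    # Nearer heads produce a stronger parallax shift (clamped to
    # avoid dividing by a near-zero distance reading).
    scale = 1.0 / max(head_z, 0.1)
    return dx * scale, dy * scale
```

A full implementation would feed this offset into an off-axis projection matrix each frame, but the core idea is just this head-to-camera mapping.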
All developers in this Challenge are using the PerC SDK and camera. To fully support developers and perceptual computing development for Ultrabooks and PCs, Intel released the Intel Perceptual Computing Software Development Kit in October 2012 and, alongside it, the Creative Interactive Gesture Camera. The SDK lets developers integrate innovative facial and voice recognition, gesture controls, and augmented reality features into next-generation Ultrabooks and PCs. It is absolutely free and can be found at intel.com/software/perceptual.
Apps built with the Intel Perceptual Computing SDK can take advantage of the Ultrabook’s many sensors, including touch, accelerometer, GPS, and NFC. With this SDK, savvy developers can also create programs that include facial analysis, voice recognition, and 2D/3D object tracking – advanced, intuitive features that enhance user-to-machine collaboration. For example, instead of building a standard online text translator, developers could use this SDK to build a real-time, voice-activated translation service. Another idea might be letting users track on-screen items through a series of simple, intuitive gestures, powered by gesture and facial recognition.
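The real-time translation idea reduces to a small pipeline: recognize an utterance, translate it, and hand the result to the UI. The glue can stay independent of any particular SDK by injecting the recognizer and translator as callables; everything below is hypothetical scaffolding, not the Intel SDK’s actual API.

```python
# Hypothetical wiring for a voice-activated translator. The 'recognize'
# and 'translate' callables stand in for whatever speech and translation
# engines a real app would plug in (e.g. an SDK's voice module).

def make_live_translator(recognize, translate, on_result):
    """Return a handler that runs recognizer -> translator -> UI
    callback for each incoming chunk of audio."""
    def handle_audio(audio_chunk):
        text = recognize(audio_chunk)
        if text:  # skip silence or failed recognition
            on_result(translate(text))
    return handle_audio
```

In use, the handler would be registered as the camera/microphone callback; swapping engines then touches none of the pipeline code.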
Intel Perceptual Computing Challenge
Separate but similar to the Ultimate Coder Challenge: Going Perceptual is the Intel Perceptual Computing Challenge:
“The Intel® Perceptual Computing Challenge is a contest to create innovative applications using natural human interfaces such as gestures, voice, and facial tracking. Use the Intel® Perceptual Computing SDK and the Creative* Interactive Gesture Camera Kit to fast track your development of application prototypes. Compete against other developers, use your imagination, and show us your vision for the future of computing! A total of USD $1 Million in prizes are in store for the winners of this Challenge, which will be done in two phases. Phase 1 of the contest has a total of USD $185K in cash prizes and is now open for entries. Phase 2 of the Challenge follows in March 2013 and will have more than USD $800K in prizes.”
A few of the entries so far include a prototype for perceptual table tennis using hand tracking and voice recognition, a game where you control spacecraft with hand movements and fire missiles with a thumbs-up signal, a board game with finger-driven input, a 3D action game, virtual reality car racing, augmented reality chess, and many more. You can see quite a few of the entries here.
Digital and organic
Perceptual computing is making it possible for our digital worlds to interact with our physical, organic worlds in meaningful ways. Many of the projects outlined in this article are stepping across boundaries that just a few years ago would have been impossible to imagine. What do you see as the future of perceptual computing? Leave your thoughts in the comments.