Contest Winners Combine Augmented Reality with an Encyclopedia with ARPedia*

By Garret Romaine

The interfaces of tomorrow are already in labs and on test screens somewhere, waiting to turn into fully developed samples and demos. In fact, look no further than the winner of the Creative User Experience category in the Intel® Perceptual Computing Phase 2 Challenge, announced at CES 2014. Zhongqian Su and a group of fellow graduate students used the Intel® Perceptual Computing SDK and the Creative Interactive Gesture Camera Kit to combine augmented reality (AR) and a common encyclopedia into ARPedia*—Augmented Reality meets Wikipedia*. ARPedia is a new kind of knowledge base that users can unlock with gestures instead of keystrokes.

Six team members from Beijing University of Technology developed the application over two months, using a variety of tools. The team used Maya* 3D to create 3D models, relied on Unity* 3D to render 3D scenes and develop the application logic, then used the Intel Perceptual Computing SDK Unity 3D plug-in (included in the SDK) to pull it all together. Their demo combines 3D models and animated videos to create a new way of interacting with the virtual world. Using body movements, hand gestures, voice, and touch, the application encourages digital exploration in an unknown world, and the possibilities for future work are exciting.

All About Dinosaurs


Think of ARPedia as a game for making and experiencing stories, with AR visual effects. While users enjoy a seamless interactive experience, a lot of technology went into creating even the simplest interactions. In a PC game, the mouse and keyboard, or a touch screen, are the usual means of interfacing with the application, but ARPedia uses none of them. In an AR application, a natural user interface is very important. ARPedia users control the action with bare-hand gestures and face movement, thanks to the Creative Senz3D* camera. Gestures such as grasping, waving, pointing, raising, and pressing help advance the game. These gestures make the player the real controller of the game and the virtual world of dinosaurs.


Figure 1: ARPedia* is a combination of augmented reality and a wiki-based encyclopedia, using gestures to navigate the interface.

Team leader Zhongqian Su had used a tiny Tyrannosaurus rex character in his previous work creating educational applications, so he made that well-known dinosaur the star of his ARPedia app. Players use hand motions to reach out and pick up the tiny dinosaur image, then place it at various points on the screen. Depending on where they put the dinosaur, players can get information about the creature’s diet, habits, and other characteristics.

Figure 2: Users interact with a tiny Tyrannosaurus rex to learn about fossils, paleontology, and geology.
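
One simple way to tie a drop position to a topic is a table of screen regions. The C# below is a hypothetical sketch of that idea, not the team's implementation: the `DropZone` and `DropZoneLookup` types, the normalized 0-to-1 coordinates, and the region layout are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical screen region in normalized coordinates (0..1), tied to a topic.
public sealed class DropZone
{
    public string Topic { get; }
    public float XMin { get; }
    public float YMin { get; }
    public float XMax { get; }
    public float YMax { get; }

    public DropZone(string topic, float xMin, float yMin, float xMax, float yMax)
    {
        Topic = topic;
        XMin = xMin;
        YMin = yMin;
        XMax = xMax;
        YMax = yMax;
    }

    // True when the point lies inside the region, edges included.
    public bool Contains(float x, float y)
    {
        return x >= XMin && x <= XMax && y >= YMin && y <= YMax;
    }
}

public static class DropZoneLookup
{
    // Returns the topic of the first zone containing the point, or null if no zone does.
    public static string TopicAt(IReadOnlyList<DropZone> zones, float x, float y)
    {
        foreach (DropZone zone in zones)
        {
            if (zone.Contains(x, y))
                return zone.Topic;
        }
        return null;
    }
}
```

With a "diet" zone on the left half of the screen and a "habits" zone on the right, dropping the dinosaur at (0.25, 0.5) would select the diet entry, and a drop outside every zone would select nothing.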

According to team member Liang Zhang, the team had previously coded an AR application for the education market using this dinosaur’s 3D model. Although they had the basics of an application in place, they had to do a lot of work to be eligible for the contest. For example, their existing code targeted an in-house 3D camera, so they needed to rewrite it (see Figure 3) to interface with the newer Creative Interactive Gesture Camera Kit. That also meant coming up to speed quickly on the Intel Perceptual Computing SDK.


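// Returns true when the hand counts as open: data[0] plus at least
// two distinct tracked nodes among data[1]..data[5].
bool isHandOpen(PXCMGesture.GeoNode[] data)
	{
		// Start at 1 so that data[0] is always counted.
		int n = 1;
		for(int i=1;i<6;i++)
		{
			// Skip nodes whose body label is LABEL_ANY (treated here as untracked).
			if(data[i].body==PXCMGesture.GeoNode.Label.LABEL_ANY)
				continue;
			// Check whether this node coincides with an earlier tracked node.
			bool got = false;
			for(int j=0;j<i;j++)
			{
				if(data[j].body==PXCMGesture.GeoNode.Label.LABEL_ANY)
					continue;
				Vector3 dif = new Vector3();
				dif.x = data[j].positionWorld.x-data[i].positionWorld.x;
				dif.y = data[j].positionWorld.y-data[i].positionWorld.y;
				dif.z = data[j].positionWorld.z-data[i].positionWorld.z;
				if(dif.magnitude<1e-5)
					got = true;
			}
			// Ignore duplicates so that only distinct nodes are counted.
			if(got)
				continue;
			n++;
		}
		return (n>2);
	}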


Figure 3: The ARPedia* team rewrote their camera code to accommodate the Creative Interactive Gesture Camera.

Fortunately, Zhang said, his company was keen on investing time and energy into learning new technologies. “We have been doing a lot of applications already,” he said. “We keep track of the new hardware and software improvements that we can use in our business. Before this contest, we used Microsoft Kinect* for its natural body interactions. When we found the camera, we were quite excited and wanted to try it. We thought this contest could give us a chance to prove our technical skills as well, so why not?”

Smart Choices Up Front


Because of the contest’s compressed time frame, the team had to come up to speed quickly on new technology. Zhang spent two weeks learning the Intel Perceptual Computing SDK, and then the team designed as many different interaction techniques as Zhang could think of.

At the same time, a scriptwriter began writing stories and possible scenarios the team could code. They met and discussed the options, with Zhang pointing out strengths and weaknesses based on his knowledge of the SDK. He knew enough about the technical details to make informed decisions, so the team felt comfortable selecting what he described as “…the best story and the most interesting and applicable interactions.”

Zhang said that one of the most important early decisions they made was to keep the player fully involved in the game. For example, in the egg-hatching sequence early on, the player has a godlike role while creating the earth, making it rain, casting sunlight, and so on. There are many gestures required as the player sets up and learns.

In another sequence, the player has to catch the dinosaur. Zhang set up the system so that a piece of meat falls on the player's hand, and the dinosaur comes to pick up the meat (Figure 4). That action keeps the player interacting with the dinosaur and builds involvement. “We want to always keep the player immersed and consistent with the virtual world,” he said.

Figure 4: Feeding the baby dinosaur keeps the user engaged and builds involvement.

However, keeping players immersed will require more effort going forward. The demo includes so many new hand gestures that users struggled. “When I talked with the people who were playing the game in Intel's booth at CES,” Zhang said, “I found they couldn't figure out how to play the game by themselves, because there are many levels with different gestures for each level. We learned that it wasn't as intuitive as we had thought, and that the design must be more intuitive when we add new interactive methods. We will definitely keep that in mind in our next project.”

The ARPedia team introduced two key gestures in their entry. One is “two hands open” and the other is “one hand open, fingers outstretched.” The two-hands-open gesture, which they use to start the application, was a straightforward coding effort. But coding the second gesture took more work.

Figure 5: The team struggled to make sure the camera didn't detect the wrist as a palm point.

“The original gesture of ‘hand-open’ was not very precise. Sometimes the wrist was detected as a palm point,” Zhang explained. “Then the fist was detected as one finger, and the system thought that meant openness, which was wrong. So we designed a new hand-open gesture that is recognized when at least two fingers are stretching out.” They then added text hints on the screen to guide the user through the additions (Figure 5).
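
The redesigned rule can be sketched without the SDK. The following C# is a minimal, self-contained illustration and not the team's code: `HandNode`, `OpenHandSketch`, and the slot layout (index 0 for the palm, 1 through 5 for fingertips) are assumptions made for the example, and `System.Numerics.Vector3` stands in for Unity's vector type. It treats a hand as open only when at least two distinct fingertips are tracked, and shows how a two-hands-open start check could build on that.

```csharp
using System;
using System.Numerics;

// Hypothetical stand-in for one tracked hand node; this is not an SDK type.
public readonly struct HandNode
{
    public readonly bool Tracked;      // false when the tracker reported nothing for this slot
    public readonly Vector3 Position;  // world-space position

    public HandNode(bool tracked, Vector3 position)
    {
        Tracked = tracked;
        Position = position;
    }
}

public static class OpenHandSketch
{
    // Nodes closer together than this are treated as the same point.
    private const float DuplicateEpsilon = 1e-5f;

    // Assumed layout: nodes[0] is the palm, nodes[1..] are fingertips.
    // Counts tracked fingertips that do not coincide with the palm
    // or with an earlier tracked fingertip.
    public static int CountExtendedFingers(HandNode[] nodes)
    {
        int count = 0;
        for (int i = 1; i < nodes.Length; i++)
        {
            if (!nodes[i].Tracked)
                continue;

            bool duplicate = false;
            for (int j = 0; j < i && !duplicate; j++)
            {
                if (nodes[j].Tracked &&
                    Vector3.Distance(nodes[i].Position, nodes[j].Position) < DuplicateEpsilon)
                {
                    duplicate = true;
                }
            }

            if (!duplicate)
                count++;
        }
        return count;
    }

    // A single "finger" (for example, a fist misread as one fingertip) is not enough.
    public static bool IsHandOpen(HandNode[] nodes)
    {
        return CountExtendedFingers(nodes) >= 2;
    }

    // Start gesture: both hands open at the same time.
    public static bool IsStartGesture(HandNode[] left, HandNode[] right)
    {
        return IsHandOpen(left) && IsHandOpen(right);
    }
}
```

Under these assumptions, a fist that the tracker reports as a single fingertip yields a count of one and is rejected, which is the false positive Zhang described.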

The Intel® Perceptual Computing SDK


The ARPedia team used the Intel Perceptual Computing SDK 2013 and especially appreciated the ease of camera calibration, application debugging, and support for speech recognition, facial analysis, close-range depth tracking, and AR. It allows multiple perceptual computing applications to share input devices and contains a privacy notification to tell users when the RGB and depth cameras are turned on. The SDK was designed to easily add more usage modes, add new input hardware, support new game engines and customized algorithms, and support new programming languages.

The utilities include C/C++ components such as PXCUPipeline (C) and UtilPipeline (C++). These components are mainly used to set up and manage the pipeline sessions. The frameworks and session ports include ports for Unity 3D, Processing, other frameworks and game engines, and ports for programming languages such as C# and Java*. The SDK interfaces include core framework APIs, I/O classes, and algorithms. The perceptual computing applications interact with the SDK through these three main functional blocks.

“The Intel [Perceptual Computing] SDK was quite helpful,” Zhang said. “We didn’t encounter any problems when we were developing this application. We were able to become productive in a very short amount of time.”

Intel® RealSense™ Technology


Developers around the world are learning more about Intel® RealSense™ technology. Announced at CES 2014, Intel RealSense technology is the new name and brand for what was formerly called Intel® Perceptual Computing technology. The intuitive new user interface has features such as gesture and voice, which Intel brought to the market in 2013. With Intel RealSense technology, users will have new, additional features, including scanning, modifying, printing, and sharing in 3D, plus major advances in AR interfaces. These new features will yield games and applications where users can naturally manipulate and play with scanned 3D objects using advanced hand- and finger-sensing technology.

Zhang has now seen directly what other developers are doing with AR technology. At CES 2014, he viewed several demos from around the world. While each demo was unique and sought to achieve different objectives, he was encouraged by the rapidly evolving 3D camera technology. “It is such a big deal to have hand-gesture detection within the SDK. People still can use the camera in a different way, but the basics are there for them. I suggest to developers that they do their homework with this technology and find capabilities to fully develop their ideas.”

With advanced hand-and-finger tracking, developers can put their users in situations where they can control devices with heightened precision, from the simplest commands to intricate 3D manipulations. Coupled with natural-language voice technology and accurate facial recognition, devices will get to know their users on a new level.

Depth sensing makes gaming feel more immersive, and accurate hand-and-finger tracking brings exceptional precision to any virtual adventure. Using AR technology and finger sensing, developers will be able to blend the real world with the virtual world.

Zhang believes the coming Intel RealSense 3D camera will be uniquely suited to application scenarios he is familiar with. “From what I have heard, it’s going to be even better—more accurate, more features, more intuitive. We are looking forward to it. There will be 3D face tracking and other great features, too. It’s the first 3D camera for a laptop that can serve as a motion-sensing device,” he said, “but it’s different than a Kinect. It can cover as much area as an in-house 3D camera, too. I think the new Intel camera will be a better device for manufacturers to integrate into laptops and tablets. That’s very important as a micro user-interface device for portability as well. We will definitely develop a lot of programs in the future with this camera.”

Maya 3D


The ARPedia team used Maya 3D animation software to continually tweak their small, realistic model of the well-known Tyrannosaurus rex dinosaur. By building the right model—with realistic movements and fine, detailed colors—the basics were in place for the rest of the application.

Maya is the gold standard for creating 3D computer animation, modeling, simulation, rendering, and more. It’s a highly extensible production platform, supports next-generation display technology, boasts accelerated modeling workflows, and handles complex data. Although the team was relatively new to 3D software, they had some experience with Maya and were able to update and integrate their existing graphics easily. Zhang said the team spent extra time on the graphics. “We spent almost a month on designing and modifying the graphics to make everything look better and to improve the interaction method as well,” he said.

Unity 3D


The team chose the Unity engine as the foundation for their application. Unity is a powerful rendering engine used to create interactive 3D and 2D content. Known as both an application builder and a game development tool, the Unity toolset is intuitive and easy to use, and it reliably supports multi-platform development. It’s an ideal solution for beginners and veterans alike looking to develop simulations, casual and serious games, and applications for web, mobile, or console.

Zhang said the Unity decision was an easy one. “We developed all our AR applications using Unity, including this one,” he said. “We know it and trust it to do the things we need.” He was able to import meshes as proprietary 3D application files from Maya quickly and easily, saving time and energy.

Information Today, Games Tomorrow


ARPedia has many interesting angles for future work. For starters, the team sees opportunities for games and other applications, using their work from the Intel Perceptual Computing Challenge as a foundation. “We’ve talked a lot with some interested parties,” Zhang said. “They want us to draw up this demo into a full, complete version as well. Hopefully, we can find a place in the market. We will add many more dinosaurs to the game and introduce a full knowledge of these dinosaurs to gain more interest. They are in an interesting environment, and we’ll design more interesting interactions around it.”

“We also plan to design a pet game where users can breed and raise their own virtual dinosaur. They’ll have their own specific collections, and they can show off with each other. We will make it a network game as well. We plan to do a lot more scenes for a new version.”

The team was very surprised to win, as they were not familiar with the work of other development teams around the world. “We didn’t know other people’s work. We work on our own things, and we don’t get much opportunity to see what others are doing,” Zhang said. Now they know where they fit, and they’re ready for more. “The contest gave us motivation to prove ourselves and a chance to compare and communicate with other developers. We are very thankful to Intel for this opportunity. We now know more about the primary technologies around the world, and we have more confidence in developing augmented reality applications in the future.”

Resources


Intel® Developer Zone
Intel® Perceptual Computing Challenge
Intel® RealSense™ Technology
Intel® Perceptual Computing SDK
Check out the compatibility guide in the perceptual computing documentation to ensure that your existing applications can take advantage of the Intel® RealSense™ 3D Camera.
The Intel® Perceptual Computing SDK 2013 R7 release notes.
Maya* software overview
Unity*
