Archived - Intel® RealSense™ Depth Enabled Photography

The Intel® RealSense™ SDK has been discontinued. No ongoing support or updates will be available.

By Sean Golden

Photography is in the midst of an exciting revolution. Intel, the corporation that has enabled and powered the digital age, has developed Intel® RealSense™ Depth Enabled Photography (DEP), a transformative approach to enabling artistic photography through new optical and processing technology. The result is a user experience that merges a virtual world with the real world in powerful ways. In short, it’s a new and compelling way to create, present, and experience visual art.

What Is Depth Enabled Photography (DEP)?

DEP extends existing digital camera technology into three dimensions by capturing depth information as part of the image. Storing a depth value for each pixel at capture time opens up new editing and presentation options, such as refocusing on a portion of the image after capture or applying a filter to an object in the foreground while leaving the background untouched. These images can be captured with cameras that use Intel RealSense technology, which are becoming available even on mobile devices. The image and its additional metadata are stored in a new file format, eXtensible Device Metadata (XDM).

Figure: A reference image and its corresponding depth map.

eXtensible Device Metadata (XDM)

The core of DEP is the XDM depth enabled file format, to which Intel has contributed along with other leading technology companies, with the goal of creating an open standard for the DEP ecosystem. The XDM specification, version 1.0, is a standard for storing device-related metadata in common image containers such as JPEG while maintaining compatibility with existing image viewers. XDM files expand on the image information captured and stored by adding a depth value. One way to understand the format is to start with a typical RGB image, which stores three color values for every pixel, and then add one more attribute to each pixel: its distance from the camera. This depth information is stored as a separate “depth map” or “point cloud” within the file container, and it enables new experiences and interactivity with captured photographs.

XDM files can also contain data from multiple cameras. At a minimum, a single camera with depth information is required. The first camera is associated with the container image (usually in JPEG format), while other cameras can optionally provide additional images, point clouds, or depth maps that are processed in relation to the first image. For example, a file can include additional full-resolution images taken from slightly different perspectives.
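
The idea of three color values plus a distance per pixel can be sketched in a few lines of Python. This is only an illustration of the concept, not the XDM file layout: the DepthImage class, its field names, and the use of meters are assumptions made for the example, and it assumes the depth map has the same resolution as the color image, which a real file need not satisfy.

```python
from dataclasses import dataclass
from typing import List, Tuple

RGB = Tuple[int, int, int]


@dataclass
class DepthImage:
    """Hypothetical in-memory view of a depth enabled photo (not the XDM layout)."""
    width: int
    height: int
    color: List[RGB]        # row-major, one (r, g, b) triple per pixel
    depth_m: List[float]    # row-major, distance from the camera in meters

    def pixel(self, x: int, y: int) -> Tuple[RGB, float]:
        """Return the color and the depth stored for pixel (x, y)."""
        i = y * self.width + x
        return self.color[i], self.depth_m[i]


# A 2x1 image: a red pixel 0.8 m away and a blue pixel 3.5 m away.
img = DepthImage(width=2, height=1,
                 color=[(255, 0, 0), (0, 0, 255)],
                 depth_m=[0.8, 3.5])
color, distance = img.pixel(1, 0)
```

Keeping the depth values in their own list mirrors the article's description of a separate depth map stored alongside the ordinary image, so a viewer that only understands the color data can ignore the rest.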

In addition to the new depth attribute for each pixel, XDM files can carry additional metadata, such as the camera’s orientation, location, or even manufacturer-specific sensor specifications. Intel is working with other technology companies to establish XDM as a worldwide standard so that the files are compatible across a wide range of platforms, from smartphones to desktop computers. A standardized image format also allows software developers to create applications that can recognize and support files created on different hardware platforms.

Figure: XDM metadata in a JPEG container.

Depth Enabled Photography in Action

Depth information can be used for a wide variety of functions that extend our idea of photography. Intel provides the Intel® RealSense™ SDK, which supports several depth-based use cases for image-manipulation applications. Intel also encourages developers to create entirely new use cases that exploit or extend the depth enabled data format. Core use cases that are expected to become commonplace include:

  • Artistic Filters/Background manipulation. By separating the image data into layers based on depth, image editing software can apply a wide variety of filters (masks) in real time. Users can apply filters to individual layers or apply them to the whole image while excluding specific layers. A simple example is keeping the color information for the foreground layer while converting the background layer to grayscale. Because the depth information allows elements to be separated in real time, it is even possible to substitute backgrounds, as demonstrated in a commercial featuring Jim Parsons that shows a virtual green screen in real time.
  • Depth-of-Field Change Effect. Depth of field (DoF) is a photographic concept traditionally controlled by the aperture and focal length of a camera lens. In simple terms, DoF is the distance between the nearest and farthest objects in the scene that appear sharp in the final photograph. Photographers use DoF to accentuate the elements in the photograph they want the viewer to notice. With traditional cameras, even modern digital SLRs, users must take multiple photographs to capture the same scene with different DoFs. Depth enabled photography allows the photographer to change the DoF after the photograph has been taken. For example, a single photograph of a large gathering at a family reunion can be taken once and shared with as many people as desired. As long as everyone in the original photo is in focus, each viewer can make any person the focal point simply by touching his or her face in the photo. A video featuring Jim Parsons from “The Big Bang Theory” demonstrates this dynamic change of depth of field.
  • Motion Effects. Creative uses of depth data make it possible to give a two-dimensional (2D) image the illusion of motion. XDM image files can be used to create motion effects like parallax and dolly zoom. Parallax is the slight difference in apparent position between foreground and background objects in a scene as the viewer’s line of sight changes. Simulating that difference by shifting the foreground and background according to the depth information in an XDM file creates a powerful illusion of depth from a single image. Dolly zoom is a more cinematic effect created by enlarging or shrinking either the foreground or the background in relation to the other. For example, enlarging the background while keeping the subject in the foreground the same size creates an illusion that the subject is either moving backwards or shrinking. In contrast, enlarging the subject in the foreground while keeping the background the same size creates an illusion that the subject is moving forward or growing rapidly.
  • Editing. Identifying layers by depth allows users to make a wide range of static or dynamic edits to files. Users can easily insert objects into an image between the foreground and background based on the depth information captured in the XDM file. Objects can be removed and replaced with new objects without complex and time-consuming pixel-by-pixel edits. Once placed, the new elements can be moved around or resized without having to redefine boundaries or spend more time on further detailed pixel editing.
  • In-Photo Measurements. Determining the size of objects can be a difficult task. Most of us have had the experience of needing to know the size of a box or a piece of furniture. Having the 3D data in an image allows users to get quick estimates of sizes of objects without needing a measuring tape. A photo of a room can allow the user to access measurements from their tablet while shopping for curtains or a new sofa.
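
Two of the use cases above, the grayscale-background filter and in-photo measurement, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the Intel RealSense SDK API: the function names, the single depth threshold used to split foreground from background, and the pinhole-camera model for measurement are all choices made for the example. The measurement also assumes the object faces the camera at a roughly constant depth and that the focal length in pixels is known, for instance from the camera's sensor specifications.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def gray_background(color: List[RGB], depth_m: List[float],
                    threshold_m: float) -> List[RGB]:
    """Keep near pixels in color; convert pixels beyond threshold_m to gray."""
    out: List[RGB] = []
    for (r, g, b), d in zip(color, depth_m):
        if d > threshold_m:
            # Rec. 601 luma weights, a common RGB-to-gray conversion.
            y = round(0.299 * r + 0.587 * g + 0.114 * b)
            out.append((y, y, y))
        else:
            out.append((r, g, b))
    return out


def object_width_m(width_px: float, depth_m: float, focal_px: float) -> float:
    """Pinhole-camera estimate: real width = pixel width * depth / focal length."""
    return width_px * depth_m / focal_px


pixels = [(255, 0, 0), (0, 0, 255)]   # red subject, blue backdrop
depths = [0.8, 3.5]                   # meters from the camera
filtered = gray_background(pixels, depths, threshold_m=2.0)

# An object spanning 400 pixels at 2.0 m, seen with an 800-pixel focal length.
sofa_width = object_width_m(width_px=400, depth_m=2.0, focal_px=800)
```

A single threshold is the simplest possible layer split. An editing application would more likely let the user pick a depth range by touching the subject, but the principle of deciding per pixel from the stored distance is the same.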

What Does It All Mean?

DEP is transforming the fundamental concepts of image capture, processing, and viewing. It extends our idea of what an image is to three dimensions and provides dynamic interactivity at all stages of the image’s life cycle. In many respects, the XDM file enhancement is a new graphic medium, as different from traditional digital images as digital images are from chemical film-based images. Just as digital imaging created new opportunities for artists to create, manipulate, and display their art, Intel RealSense provides people with new ways to express themselves in a dynamic, interactive environment.

Other aspects of Intel RealSense technology are more applicable to science and technology. Intel RealSense cameras provide highly accurate, real-time 3D sensing of the environment, allowing drones or robots to maneuver through complex terrain more effectively. This has obvious potential in the ongoing development of self-driving cars and trucks, as well as for greatly improving the efficiency of remotely operated vehicles such as the Mars rovers.

Where Does Intel® RealSense™ Go from Here?

Intel RealSense technology is opening the door into a new form of graphic expression. As with most transformative technologies, there’s no way to predict every way it will change how we interact with images.

Many manufacturers have already begun to include Intel RealSense cameras in their products. New applications are being built with the Intel RealSense SDK as you read this. Clearly, cameras on mobile devices will not be the same once DEP becomes mainstream.

About the Author

Sean Golden is an author and technical writer. His most recent corporate job was as program director at a global financial services company. His background includes managing initiatives in financial product development, online media, Internet and traditional publishing, and data center consolidations for Fortune 500 companies. Prior to that, he managed application and enterprise software development for 15 years and served as publisher for a suite of personal computing monthly publications. He has a B.S. degree in physics.