Introduction to Embree 2.1 - Part 1

Published: 01/25/2014   Last Updated: 01/24/2014

This is part of a series of blogs on Embree, a collection of high-performance ray tracing kernels. Embree has been open source since version 1.0. Version 2.0 was released during SIGGRAPH 2013, and Embree 2.1 was published on GitHub just before Christmas 2013. The official web site has an overview of Embree with links to the source code and example data sets. I'm not going to duplicate what has already been said there and in the README.txt, which include detailed instructions on how to get the source code, build it, and run Embree with the example scenes. This series of blog articles will focus on using Embree from a developer's perspective. I have been using Embree on Linux, but the information presented here should also be relevant if you use other environments (Windows/Mac).

Embree 2.0 was a dramatic change from version 1.0: it added support for Intel Xeon Phi and used Intel ISPC as the primary SIMD programming language for packet ray tracing. Embree 2.1 is also a substantial change from 2.0: it introduced a new kernel API that supports user extensions on primitive types, instancing, and more flexible ways to structure the scene. Since version 1.0, Embree had always bundled the kernels with an example path tracer. This also changed in version 2.1, where the kernels were released separately from the reference path tracing renderer. That is a lot of information to absorb, but I will try to cover it in the subsequent blogs.

In the first couple of blogs I will discuss the reference implementation of the Embree renderer. The motivation is to let someone unfamiliar with Embree quickly exercise various aspects of the renderer and the underlying kernels. After all, without a renderer, one can do little with just the kernels. Another important reason is that a highly optimized kernel requires a highly optimized renderer to reach its full potential. The Embree renderer is a good example of how to drive these kernels efficiently.

The Embree Render Devices

After you check out the embree and embree-renderer source code from GitHub, follow the README.txt instructions in both projects to build them. Once they are built and set up so that the renderer can find the kernel libraries, you should be able to run this successfully:

renderer -c ../models/cornell_box.ecs

You should see a render window pop up, similar to this:

[Figure: Embree Cornell box render window]

The entry point to the renderer is in


This is an excellent place to learn how to drive the Embree renderer at a high level. Most code in Embree is under the namespace "embree". At the beginning of renderer.cpp, a list of global states and their default values is defined, followed by a number of functions that parse the command line options to initialize these global states. It's best to read the *.ecs files (which are plain text) alongside the command line parsers in renderer.cpp; you will then see how the options are passed on to the renderer.

One of the key concepts of the renderer API is the render Device interface. A derived class of Device implements the Embree renderer API, which is specified in /devices/device/device.h. The benefit of having this API is that the renderer does not have to worry about where the computation actually occurs (Xeon or Xeon Phi, local or remote). There are currently four devices implemented in the renderer:

  • COIDevice in /device/device_coi
  • ISPCDevice in /device/device_ispc
  • NetworkDevice in /device/device_network
  • SingleRayDevice in /device/device_singleray

[Figure: Embree render device types]
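To make the idea of a device interface more concrete, here is a minimal sketch of the pattern in C++. It is not the API declared in device.h: the names RenderDevice, SketchSingleRayDevice, SketchPacketDevice, createDevice, and renderWith are all hypothetical and exist only for illustration. The point is simply that application code talks to an abstract interface, while a factory decides which implementation does the work.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a render device interface (not Embree's API).
class RenderDevice {
public:
    virtual ~RenderDevice() = default;
    // Short name of the backend, for illustration only.
    virtual std::string name() const = 0;
    // Pretend to render a frame and describe how the work was done.
    virtual std::string renderFrame(int width, int height) = 0;
};

// Hypothetical backend that processes one ray at a time.
class SketchSingleRayDevice : public RenderDevice {
public:
    std::string name() const override { return "singleray"; }
    std::string renderFrame(int width, int height) override {
        assert(width > 0 && height > 0);
        return "traced " + std::to_string(width * height) +
               " pixels one ray at a time";
    }
};

// Hypothetical backend that processes rays in packets.
class SketchPacketDevice : public RenderDevice {
public:
    std::string name() const override { return "packet"; }
    std::string renderFrame(int width, int height) override {
        assert(width > 0 && height > 0);
        return "traced " + std::to_string(width * height) +
               " pixels in ray packets";
    }
};

// Factory: the caller picks a backend by name and gets back the interface.
std::unique_ptr<RenderDevice> createDevice(const std::string& type) {
    if (type == "singleray")
        return std::unique_ptr<RenderDevice>(new SketchSingleRayDevice());
    if (type == "packet")
        return std::unique_ptr<RenderDevice>(new SketchPacketDevice());
    throw std::invalid_argument("unknown device type: " + type);
}

// Application code depends only on the abstract interface.
std::string renderWith(RenderDevice& device) {
    return device.name() + ": " + device.renderFrame(4, 2);
}
```

With this structure, swapping createDevice("singleray") for createDevice("packet") changes where and how the frame is computed without touching renderWith, which is the benefit the device abstraction is meant to provide.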

The ISPCDevice is a complete implementation of a path tracer with lights and materials. It is written in ISPC, which can be compiled to target Intel Xeon and Xeon Phi. It can also be configured at compile time to use SSE, AVX, or the Xeon Phi 512-bit SIMD instructions. The ISPCDevice utilizes the packet and hybrid ray tracing kernels, where SIMD is used to operate on multiple rays during traversal tests. In contrast, although the SingleRayDevice is also a complete implementation of the path tracer, it operates on single rays. Its renderer features are very similar to those of the ISPCDevice. This renderer is mostly refactored from Embree 1.0.
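The difference between single-ray and packet processing can be illustrated with a small C++ sketch. This is not Embree kernel code and it uses no SIMD intrinsics or ISPC. It only shows the data layout idea: a packet stores each ray component contiguously (structure of arrays), so the same arithmetic runs across all lanes in a simple loop that a compiler may vectorize. The names Ray, RayPacket, kPacketSize, hitsSphere, hitsSpherePacket, and pack are hypothetical, and the test against a sphere centered at the origin merely stands in for a real traversal test.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Assumed packet width: four floats, e.g. one SSE register.
constexpr std::size_t kPacketSize = 4;

// One ray: origin (ox, oy, oz) and direction (dx, dy, dz).
struct Ray { float ox, oy, oz, dx, dy, dz; };

// Structure-of-arrays packet: lane i of each array belongs to ray i.
struct RayPacket {
    std::array<float, kPacketSize> ox, oy, oz, dx, dy, dz;
};

// Single-ray test: does the ray hit a sphere centered at the origin?
bool hitsSphere(const Ray& r, float radius) {
    const float a = r.dx * r.dx + r.dy * r.dy + r.dz * r.dz;
    assert(a > 0.0f);  // the direction must be non-zero
    const float b = 2.0f * (r.ox * r.dx + r.oy * r.dy + r.oz * r.dz);
    const float c = r.ox * r.ox + r.oy * r.oy + r.oz * r.oz - radius * radius;
    const float disc = b * b - 4.0f * a * c;
    if (disc < 0.0f) return false;  // the supporting line misses the sphere
    // Hit if the far intersection is not behind the ray origin.
    return (-b + std::sqrt(disc)) >= 0.0f;
}

// Packet test: the same arithmetic applied to every lane of the packet.
std::array<bool, kPacketSize> hitsSpherePacket(const RayPacket& p, float radius) {
    std::array<bool, kPacketSize> hit{};
    for (std::size_t i = 0; i < kPacketSize; ++i) {
        const float a = p.dx[i] * p.dx[i] + p.dy[i] * p.dy[i] + p.dz[i] * p.dz[i];
        const float b = 2.0f * (p.ox[i] * p.dx[i] + p.oy[i] * p.dy[i] + p.oz[i] * p.dz[i]);
        const float c = p.ox[i] * p.ox[i] + p.oy[i] * p.oy[i] + p.oz[i] * p.oz[i]
                        - radius * radius;
        const float disc = b * b - 4.0f * a * c;
        // Clamp so sqrt is defined in every lane, then mask out the misses.
        const float root = std::sqrt(disc > 0.0f ? disc : 0.0f);
        hit[i] = (disc >= 0.0f) && ((-b + root) >= 0.0f);
    }
    return hit;
}

// Convert an array of rays into the packet layout.
RayPacket pack(const std::array<Ray, kPacketSize>& rays) {
    RayPacket p{};
    for (std::size_t i = 0; i < kPacketSize; ++i) {
        p.ox[i] = rays[i].ox; p.oy[i] = rays[i].oy; p.oz[i] = rays[i].oz;
        p.dx[i] = rays[i].dx; p.dy[i] = rays[i].dy; p.dz[i] = rays[i].dz;
    }
    return p;
}
```

Both functions give the same answer for each ray. The packet version just amortizes the work across lanes, which pays off when neighboring rays follow similar paths through the scene.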

The NetworkDevice is a communication device that abstracts the actual compute device (a processor on a remote machine). It handles data transfer between the local and remote machines through a network socket. A renderer_server binary is built from this device and should be run on the remote machine to listen for incoming render requests. Behind the scenes, the NetworkDevice uses the other devices (ISPCDevice, SingleRayDevice) to do the actual rendering. The COIDevice is similar to the NetworkDevice in that it abstracts communication over the Xeon Phi Common Offload Infrastructure (COI) API. This device allows the renderer to pass data to the Xeon Phi for rendering. Behind the scenes, it spawns an ISPCDevice process on the Xeon Phi.
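The forwarding idea behind a communication device can be sketched as follows. This is only an illustration under simplifying assumptions: there are no sockets and no COI calls, the "wire format" is a made-up text string, and the names FrameDevice, LocalFrameDevice, RenderServer, and ForwardingDevice are hypothetical rather than taken from the renderer source. A client-side device implements the same interface as a local one, encodes each call as a request, and a server-side object decodes the request and runs it on a real backend.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical minimal device interface (not Embree's API).
class FrameDevice {
public:
    virtual ~FrameDevice() = default;
    virtual std::string renderFrame(int width, int height) = 0;
};

// Backend that does the "real" work on the machine it runs on.
class LocalFrameDevice : public FrameDevice {
public:
    std::string renderFrame(int width, int height) override {
        return "frame " + std::to_string(width) + "x" +
               std::to_string(height) + " rendered locally";
    }
};

// Server side: decodes a text request and runs it on a real backend.
class RenderServer {
public:
    explicit RenderServer(std::unique_ptr<FrameDevice> backend)
        : backend_(std::move(backend)) {
        assert(backend_ != nullptr);
    }
    std::string handle(const std::string& request) {
        std::istringstream in(request);
        std::string command;
        int width = 0;
        int height = 0;
        if (!(in >> command >> width >> height) || command != "render")
            return "error: bad request";
        return backend_->renderFrame(width, height);
    }
private:
    std::unique_ptr<FrameDevice> backend_;
};

// Client side: looks like any other device, but forwards every call.
class ForwardingDevice : public FrameDevice {
public:
    explicit ForwardingDevice(RenderServer& server) : server_(server) {}
    std::string renderFrame(int width, int height) override {
        std::ostringstream out;
        out << "render " << width << ' ' << height;
        // A real implementation would send this over a socket and wait
        // for the reply; here the server is called directly.
        return server_.handle(out.str());
    }
private:
    RenderServer& server_;
};
```

Because ForwardingDevice and LocalFrameDevice share one interface, the calling code cannot tell whether the frame was produced in the same process or on the other end of a connection.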

That concludes this quick overview of Embree 2.1 and its renderer devices. In the next blog, we will look at some of them in more detail and see what you can do with the renderer.
