Developer Guide

Annotating Unified Shared Memory Pointers

When you use unified shared memory (USM) to perform a host allocation or a device allocation, Intel® recommends that you annotate the raw USM pointer inside the kernel with a host_ptr or device_ptr object before accessing it. The host_ptr and device_ptr objects are instances of the multi_ptr class in SYCL, which provides constructors for address-space-qualified pointers.
Using host_ptr and device_ptr objects allows the compiler to perform better alias analysis, which typically leads to better throughput and smaller silicon area for your design. Host or device annotated pointers also allow the compiler to infer simpler RTL, because load-store units (LSUs) that access the USM pointers need to be connected only to the host memory or only to the device memory, respectively. Without the annotations, the compiler must connect the LSUs to both memories because the location of the pointer is unknown at compile time.
For example, when using the malloc_device function to define a pointer Ptr, construct a device_ptr object from Ptr inside the kernel, and access the device_ptr object instead of accessing Ptr directly:
    T* Ptr = malloc_device<T>(1024, Queue);
    ...
    cgh.single_task<class DeviceAnnotation>([=]() {
      Ptr[0] = 42; // load-store unit connected to both device and host memories
      device_ptr<T> DevicePtr(Ptr);
      DevicePtr[1] = 43; // load-store unit connected only to the device memory
    });
Similarly, when using the malloc_host function to define a pointer Ptr, construct a host_ptr object from Ptr inside the kernel, and access the host_ptr object instead of accessing Ptr directly:
    T* Ptr = malloc_host<T>(1024, Queue);
    ...
    cgh.single_task<class HostAnnotation>([=]() {
      Ptr[0] = 42; // load-store unit connected to both device and host memories
      host_ptr<T> HostPtr(Ptr);
      HostPtr[1] = 43; // load-store unit connected only to the host memory
    });
Use annotations consistently and match them with the type of the runtime allocation used. Mismatches and inconsistencies when using the address space annotations are considered undefined behavior and may lead to incorrect results or hardware hangs.
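The reasoning about LSU connections can be modeled with a small host-only sketch in standard C++ that needs no SYCL headers. It is only an illustration of why a type-level annotation gives a compiler more information than a raw pointer. It is not how the Intel compiler is implemented, and the names Space, TaggedPtr, HostPtr, DevicePtr, memories_reachable, and demo are hypothetical stand-ins, not SYCL or Intel APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical model (not SYCL): the memory an annotated pointer refers to.
enum class Space { Host, Device };

// Hypothetical stand-in for an address-space-annotated pointer such as
// host_ptr or device_ptr. Constructing it from a raw pointer is an
// unchecked promise by the programmer, which is why a mismatched
// annotation cannot be detected and leads to undefined behavior.
template <typename T, Space S>
struct TaggedPtr {
    T* raw;
    explicit TaggedPtr(T* p) : raw(p) {}
    T& operator[](std::size_t i) const { return raw[i]; }
};

template <typename T> using HostPtr = TaggedPtr<T, Space::Host>;
template <typename T> using DevicePtr = TaggedPtr<T, Space::Device>;

// For a raw pointer, the static type says nothing about the location,
// so an access through it must be able to reach both memories.
template <typename T>
std::string memories_reachable(T*) { return "host+device"; }

// For an annotated pointer, the location is part of the type,
// so an access through it needs to reach only one memory.
template <typename T, Space S>
std::string memories_reachable(TaggedPtr<T, S>) {
    return S == Space::Host ? "host" : "device";
}

// Usage sketch: the same buffer accessed through a raw pointer and
// through an annotated view, mirroring the kernel bodies above.
inline int demo() {
    int buffer[4] = {0, 0, 0, 0};
    int* raw = buffer;
    DevicePtr<int> dev(raw);   // annotate once, then use the annotated view
    assert(&dev[0] == raw);    // both views refer to the same storage
    raw[0] = 42;               // modeled as reaching host and device memory
    dev[1] = 43;               // modeled as reaching device memory only
    return buffer[0] + buffer[1];
}
```

In this model, wrapping the same address in HostPtr instead of DevicePtr compiles without complaint, which mirrors the point that a mismatch between the annotation and the actual allocation is the programmer's responsibility to avoid.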
