Intel® DPC++ Compatibility Tool
Transform Your CUDA Applications into Standards-Based Data Parallel C++ Code
Efficient Code Migration
- The Intel® DPC++ Compatibility Tool assists in migrating your existing CUDA code to Data Parallel C++ (DPC++) code
- DPC++ is based on ISO C++ and incorporates standard SYCL* and community extensions to simplify data parallel programming
How It Works
- The tool ports both CUDA language kernels and library API calls
- Typically, 80%-90% of CUDA code automatically migrates to DPC++ code
- Inline comments help you finish writing and tuning your DPC++ code (an example of these comments follows this list)
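When the tool cannot migrate a construct exactly, it leaves a numbered DPCT diagnostic comment next to the affected line. The snippet below is an illustrative sketch of that style, not actual tool output: the function copy_back and its parameters are placeholders, and the exact diagnostic IDs and wording depend on the tool version and the code being migrated.

// Illustrative sketch only (not tool output): a migrated call annotated in the
// style of the tool's DPCT diagnostics. Real output uses numbered IDs such as
// DPCT1003, and the exact wording varies by tool version.
#include <CL/sycl.hpp>
#include <cstddef>

int copy_back(float *dst, const float *d_src, size_t count, sycl::queue &q) {
    /*
    DPCT1003: Migrated API does not return an error code. (*, 0) is inserted.
    You may need to rewrite this code.
    */
    int err = (q.memcpy(dst, d_src, count * sizeof(float)).wait(), 0);
    return err;
}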
DPC++ Compatibility Tool Guide
Intel® oneAPI DPC++/C++ Compiler
What You Need
- The Intel DPC++ Compatibility Tool is included in the Intel® oneAPI Base Toolkit
- It integrates into familiar IDEs, including Eclipse* and Microsoft Visual Studio*
Code Migration: Before & After
Source CUDA Code
The Intel® DPC++ Compatibility Tool migrates software programs implemented with current and previous versions of CUDA. For details, see the release notes.
#include <cuda.h>
#include <stdio.h>
const int vector_size = 256;
__global__ void SimpleAddKernel(float *A, int offset)
{
    A[threadIdx.x] = threadIdx.x + offset;
}

int main()
{
    float *d_A;
    int offset = 10000;

    cudaMalloc( &d_A, vector_size * sizeof( float ) );
    SimpleAddKernel<<<1, vector_size>>>(d_A, offset);

    float result[vector_size] = { };
    cudaMemcpy(result, d_A, vector_size*sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree( d_A );

    for (int i = 0; i < vector_size; ++i) {
        if (i % 8 == 0) printf( "\n" );
        printf( "%.1f ", result[i] );
    }
    return 0;
}
Migrated DPC++ Code
The resulting code is typical of what you can expect after migration. In most cases, code edits and optimizations will be required to complete the migration; one such hand tuning is sketched after the listing.
#include <CL/sycl.hpp>
#include <dpct/dpct.hpp>
#include <stdio.h>
const int vector_size = 256;
void SimpleAddKernel(float *A, int offset, sycl::nd_item<3> item_ct1)
{
    A[item_ct1.get_local_id(2)] = item_ct1.get_local_id(2) + offset;
}

int main()
{
    dpct::device_ext &dev_ct1 = dpct::get_current_device();
    sycl::queue &q_ct1 = dev_ct1.default_queue();

    float *d_A;
    int offset = 10000;

    d_A = sycl::malloc_device<float>(vector_size, q_ct1);

    q_ct1.submit([&](sycl::handler &cgh) {
        cgh.parallel_for(sycl::nd_range(sycl::range(1, 1, vector_size),
                                        sycl::range(1, 1, vector_size)),
                         [=](sycl::nd_item<3> item_ct1) {
                             SimpleAddKernel(d_A, offset, item_ct1);
                         });
    });

    float result[vector_size] = { };
    q_ct1.memcpy(result, d_A, vector_size * sizeof(float)).wait();
    sycl::free(d_A, q_ct1);

    for (int i = 0; i < vector_size; ++i) {
        if (i % 8 == 0) printf( "\n" );
        printf( "%.1f ", result[i] );
    }
    return 0;
}
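One common hand tuning is collapsing the generated three-dimensional nd_range and the dpct helper queue into a plain SYCL queue with a one-dimensional launch. The following is a minimal sketch of such a rewrite, assuming the Intel oneAPI DPC++/C++ Compiler (or another SYCL 2020 compiler); it is not output of the Compatibility Tool.

// Hand-tuned sketch (not tool output): the same example written directly
// against a plain SYCL queue with a one-dimensional launch.
#include <CL/sycl.hpp>
#include <stdio.h>

const int vector_size = 256;

int main()
{
    sycl::queue q;                                   // default device selection
    float *d_A = sycl::malloc_device<float>(vector_size, q);
    int offset = 10000;

    // One work-item per element; i plays the role of threadIdx.x in the CUDA code.
    q.parallel_for(sycl::range<1>(vector_size), [=](sycl::id<1> idx) {
        size_t i = idx[0];
        d_A[i] = i + offset;
    }).wait();

    float result[vector_size] = { };
    q.memcpy(result, d_A, vector_size * sizeof(float)).wait();
    sycl::free(d_A, q);

    for (int i = 0; i < vector_size; ++i) {
        if (i % 8 == 0) printf( "\n" );
        printf( "%.1f ", result[i] );
    }
    return 0;
}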
Get Started
Download
Install and configure the Intel DPC++ Compatibility Tool, which is part of the Intel oneAPI Base Toolkit.
Learn More
Access additional samples, tutorials, and training resources.

Get the Intel® DPC++ Compatibility Tool as part of the Intel® oneAPI Base Toolkit
This foundational set of tools and libraries includes:
- Familiar tools and languages
- Advanced analysis and debugging tools
- Intel® DPC++ Compatibility Tool for CUDA code migration
Specifications
Operating systems for development:
- Linux*
- Windows*
Software tool requirements:
- CUDA header files
- Eclipse* (optional)
- Microsoft Visual Studio* (optional)
For details, see the system requirements.