Essentials of Data Parallel C++
Learn the fundamentals of this language designed for data parallel and heterogeneous computing through hands-on practice in this guided learning path.
Overview
Data Parallel C++ (DPC++) provides a consistent programming language across CPU, GPU, FPGA, and AI accelerators in a heterogeneous framework where each architecture can be programmed and used either in isolation or together.
The language and API extensions in DPC++ enable different development use cases, including development of new offload acceleration or heterogeneous compute applications, conversion of existing C or C++ code to SYCL and DPC++, and migrating from other accelerator languages or frameworks.
Use this learning path to get hands-on practice with the essentials of DPC++ using Jupyter* Notebooks on Intel® DevCloud.
Objectives
Who is this for?
Developers who want to learn the basics of DPC++ for heterogeneous computing (CPU, GPU, FPGA, and AI accelerators).
What will I be able to do?
Practice the essential concepts and features of DPC++ with live sample code on Intel DevCloud.
Start Learning DPC++
Get hands-on practice with code samples in Jupyter Notebooks running live on Intel DevCloud.
To get started:
- Sign in to Intel DevCloud, select One Click Log In for JupyterLab, and then select Launch Server (if needed).
- Open the oneAPI_Essentials folder, and then select 00_Introduction_to_Jupyter to open the folder.
- Select Introduction_to_Jupyter.ipynb.
- If you already have an Intel DevCloud account, it may be necessary to update oneAPI_Essentials. To do this, scroll to the bottom of Introduction_to_Jupyter.ipynb and execute the last code cell.
- Refresh your browser.
Modules
Introduction to JupyterLab and Notebooks
Use Jupyter Notebooks to modify and run code as part of learning exercises.
To begin, open Introduction_to_Jupyter.ipynb.
Introduction to DPC++
- Articulate how oneAPI can help to solve the challenges of programming in a heterogeneous world.
- Use oneAPI solutions to enable your workflows.
- Understand the DPC++ language and programming model.
- Become familiar with using Jupyter Notebooks for training throughout the course.
DPC++ Program Structure
- Describe the fundamental SYCL classes.
- Use device selection to offload kernel workloads.
- Decide when to use basic parallel kernels and ND-range kernels.
- Create a host accessor.
- Build a sample DPC++ application through hands-on lab exercises.
DPC++ Unified Shared Memory
- Use new DPC++ features like Unified Shared Memory (USM) to simplify programming.
- Understand implicit and explicit ways of moving memory using USM.
- Solve data dependency between kernel tasks in an optimal way.
DPC++ Sub-Groups
- Understand the advantages of using sub-groups in DPC++.
- Take advantage of sub-group collectives in ND-range kernel implementations.
- Use sub-group shuffle operations to avoid explicit memory operations.
Demonstration of Intel® Advisor
- See how Offload Advisor¹ identifies and ranks parallelization opportunities for offload.
- Run Offload Advisor using command line syntax.
- Use performance models and analyze generated reports.
¹Offload Advisor is a feature of Intel Advisor installed as part of the Intel oneAPI Base Toolkit.
Intel® VTune™ Profiler on Intel® DevCloud
- Profile a DPC++ application using Intel® VTune™ Profiler on Intel® DevCloud.
- Understand the basics of command line options in Intel VTune Profiler to collect data and generate reports.
Introduction to oneDPL: The oneAPI DPC++ Library
- Simplify DPC++ programming using Intel® oneAPI DPC++ Library (oneDPL).
- Use DPC++ Library algorithms for heterogeneous computing.
- Implement oneDPL algorithms using buffers and unified shared memory.
Product and Performance Information
Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.