Get Started Guide

  • 2021.1
  • 12/04/2020
  • Public Content

Get Started with the Intel® oneAPI Data Analytics Library

Intel® oneAPI Data Analytics Library (oneDAL) is a library that helps speed up big data analysis by providing highly optimized algorithmic building blocks for all stages of data analytics (preprocessing, transformation, analysis, modeling, validation, and decision making) in batch, online, and distributed processing modes of computation. The current version of oneDAL provides Data Parallel C++ (DPC++) API extensions to the traditional C++ interface.
For general information about oneDAL, visit the GitHub* repository
or browse
. The complete list of features and documentation are available at official

Before You Begin

The latest
is located in the
directory where
is the directory in which
Intel® oneAPI Base Toolkit was installed. The current version of oneDAL with DPC++ is available for Linux* and Windows* 64-bit operating systems. The prebuilt oneDAL libraries can be found in the
To learn about the system requirements and the dependencies needed to build examples with DPC++ API extensions, refer to the System Requirements page.

End-to-end Example

Below you can find a typical usage workflow for a oneDAL algorithm on GPU. The example is provided for the Principal Component Analysis (PCA) algorithm.
The following steps depict how to:
  • Read data from a CSV file
  • Run the training and inference operations for PCA
  • Access intermediate results obtained at the training stage
  1. Include the following header that makes all oneDAL declarations available:
    #include "oneapi/dal.hpp"
    /* Standard library headers required by this example */
    #include <cassert>
    #include <iostream>
  2. Create a SYCL* queue with the desired device selector. In this case, the GPU selector is used:
    const auto queue = sycl::queue{ sycl::gpu_selector{} };
  3. Since all oneDAL declarations are in the oneapi::dal namespace, import all declarations from the oneapi namespace to use dal instead of oneapi::dal for brevity:
    using namespace oneapi;
  4. Use CSV data source to read the data from the CSV file into a table:
    const auto data = dal::read<dal::table>(queue, dal::csv::data_source{"data.csv"});
  5. Create a PCA descriptor, configure its parameters, and run the training algorithm on the data loaded from CSV:
    const auto pca_desc = dal::pca::descriptor<float>{}
        .set_component_count(3)
        .set_deterministic(true);
    const dal::pca::train_result train_res = dal::train(queue, pca_desc, data);
  6. Print the eigenvectors:
    const dal::table eigenvectors = train_res.get_eigenvectors();
    const auto acc = dal::row_accessor<const float>{eigenvectors};
    for (std::int64_t i = 0; i < eigenvectors.get_row_count(); i++) {
        /* Get i-th row from the table, the eigenvector stores pointer to USM */
        const dal::array<float> eigenvector = acc.pull(queue, {i, i + 1});
        assert(eigenvector.get_count() == eigenvectors.get_column_count());
        std::cout << i << "-th eigenvector: ";
        for (std::int64_t j = 0; j < eigenvector.get_count(); j++) {
            std::cout << eigenvector[j] << " ";
        }
        std::cout << std::endl;
    }
  7. Use the trained model for inference to reduce dimensionality of the data:
    /* Pass the trained model to the inference call */
    const dal::pca::model model = train_res.get_model();
    const dal::table data_transformed = dal::infer(queue, pca_desc, model, data).get_transformed_data();
    assert(data_transformed.get_column_count() == 3);
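
To make the data flow of the steps above easier to picture, here is a minimal sketch in standard C++ only. It does not use the oneDAL API: dense_table, parse_csv, and project are hypothetical names introduced for illustration. The sketch assumes a dense row-major table of floats and assumes that each row of the components table holds one eigenvector, as in the printing loop of step 6. It ignores any centering or scaling that a real PCA implementation may apply, so treat it as a conceptual illustration of reading CSV data and reducing dimensionality, not as a description of what the library does internally.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper type (not part of oneDAL): a dense row-major table.
struct dense_table {
    std::size_t row_count = 0;
    std::size_t column_count = 0;
    std::vector<float> values; // row_count * column_count elements
};

// Parse comma-separated numeric text into a dense_table.
// Throws std::invalid_argument if rows have different lengths.
dense_table parse_csv(const std::string& text) {
    dense_table table;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        if (line.empty()) {
            continue; // skip blank lines
        }
        std::istringstream cells(line);
        std::string cell;
        std::size_t columns = 0;
        while (std::getline(cells, cell, ',')) {
            table.values.push_back(std::stof(cell));
            ++columns;
        }
        if (table.row_count == 0) {
            table.column_count = columns; // first row fixes the width
        } else if (columns != table.column_count) {
            throw std::invalid_argument("rows have different lengths");
        }
        ++table.row_count;
    }
    return table;
}

// Project every row of `data` onto every row of `components`.
// The result has data.row_count rows and components.row_count columns.
dense_table project(const dense_table& data, const dense_table& components) {
    if (data.column_count != components.column_count) {
        throw std::invalid_argument("feature counts do not match");
    }
    dense_table out;
    out.row_count = data.row_count;
    out.column_count = components.row_count;
    out.values.assign(out.row_count * out.column_count, 0.0f);
    for (std::size_t i = 0; i < data.row_count; ++i) {
        for (std::size_t k = 0; k < components.row_count; ++k) {
            float dot = 0.0f;
            for (std::size_t j = 0; j < data.column_count; ++j) {
                dot += data.values[i * data.column_count + j] *
                       components.values[k * components.column_count + j];
            }
            out.values[i * out.column_count + k] = dot;
        }
    }
    return out;
}
```

For example, calling parse_csv on a two-row, three-column text and passing the result to project together with a two-row components table yields a table with two rows and two columns, which mirrors the column-count check in step 7.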

Build and Run Examples

This section describes how to set up dependencies between components that are required to build and run oneDAL examples. These components are Intel® oneAPI Data Analytics Library, Intel® oneAPI DPC++/C++ Compiler, and Intel® oneAPI Threading Building Blocks. As each of them is a part of the Intel® oneAPI Base Toolkit
, you can set up the dependencies between all components of the toolkit by running
Perform the following steps to build and run examples demonstrating the basic usage scenarios of oneDAL with DPC++. Go to
and then set up an environment as shown in the example below:
All content starting with a # below is considered a NOTE and should be removed before using the code.
  1. Set up the oneDAL environment:
    • On Linux*:
      # Run script to set up CPATH, LIBRARY_PATH and LD_LIBRARY_PATH for oneDAL
      source ./env/
    • On Windows*:
      # Run script to set up PATH, LIB and INCLUDE for oneDAL
      env/vars.bat
  2. Set up the compiler environment for Intel® oneAPI DPC++/C++ Compiler. See Get Started with Intel® oneAPI DPC++/C++ Compiler for details. For the multi-threaded version of oneDAL, set up the environment for Intel® oneAPI Threading Building Blocks.
  3. Build and run DPC++ examples:
    You need to have write permissions to the
    folder to build examples, and execute permissions to run them. Otherwise, you need to copy
    folders to a directory with the right permissions. These two folders must be kept at the same directory level relative to each other.
    The multi-threaded version of oneDAL is used to compile examples.
    • On Linux*:
      # Navigate to DPC++ examples directory and build examples
      cd ./examples/oneapi/dpc
      # Compile and run Correlation example using Intel® oneAPI DPC++/C++ Compiler
      make example=cor_dense_batch
      # Compile all DPC++ examples
      make mode=build
    • On Windows*:
      # Navigate to DPC++ examples directory and build examples
      cd examples/oneapi/dpc
      # Compile and run Correlation example using Intel® oneAPI DPC++/C++ Compiler
      nmake libintel64 example=cor_dense_batch
      # Compile all DPC++ examples
      nmake libintel64 mode=build
    To see all available parameters of the build procedure, type
    on Linux* or
    on Windows*.
  4. The resulting example binaries and log files are written to the
You should run DPC++ examples from
folder, not from
folder. Most examples require data to be stored in
folder and to have a relative link to it started from

Find More

Refer to the Developer Guide for detailed information about implemented algorithms.
Check system requirements before you install Intel® oneAPI Data Analytics Library.
Refer to release notes for Intel® oneAPI Data Analytics Library to learn about new updates in the latest release.
Learn how to use oneDAL with daal4py, a Python* API.
Learn about requirements for implementations of oneAPI Data Analytics Library.
Intel® Data Analytics Acceleration Library (Intel® DAAL) is now Intel® oneAPI Data Analytics Library (oneDAL). Documentation for older versions of Intel® DAAL is available for download only. For a list of available documentation downloads by product version, see these pages:

Notices and Disclaimers

Intel technologies may require enabled hardware, software or service activation.
No product or component can be absolutely secure.
Your costs and results may vary.
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.
