Get Started Guide

  • 2021.3
  • 06/28/2021
  • Public Content

Get Started with the Intel® oneAPI Data Analytics Library
The Intel® oneAPI Data Analytics Library (oneDAL) is a library that helps speed up big data analysis by providing highly optimized algorithmic building blocks for all stages of data analytics (preprocessing, transformation, analysis, modeling, validation, and decision making) in batch, online, and distributed processing modes of computation. The current version of oneDAL provides Data Parallel C++ (DPC++) API extensions to the traditional C++ interface.
For general information about oneDAL, visit the oneDAL GitHub* repository or browse the oneDAL documentation. The complete list of features and documentation is available on the official oneDAL website.

Before You Begin

The latest oneDAL is located in the <install-dir>/dal/latest directory, where <install-dir> is the directory in which Intel® oneAPI Base Toolkit was installed. The current version of oneDAL with DPC++ is available for Linux* and Windows* 64-bit operating systems. The prebuilt oneDAL libraries can be found in the <install-dir>/dal/<version>/redist directory.
To learn about the system requirements and the dependencies needed to build examples with DPC++ API extensions, refer to the System Requirements page.

End-to-end Example

Below you can find a typical usage workflow for a oneDAL algorithm on GPU. The example uses the Principal Component Analysis (PCA) algorithm; a complete program assembled from these snippets is shown after the steps.
The following steps show how to:
  • Read data from a CSV file
  • Run the training and inference operations for PCA
  • Access intermediate results obtained at the training stage
  1. Include the following header that makes all oneDAL declarations available:
    #include "oneapi/dal.hpp"

    /* Standard library headers required by this example */
    #include <cassert>
    #include <iostream>
  2. Create a SYCL* queue with the desired device selector. In this case, the GPU selector is used:
    const auto queue = sycl::queue{ sycl::gpu_selector{} };
  3. Since all oneDAL declarations are in the oneapi::dal namespace, import all declarations from the oneapi namespace to use dal instead of oneapi::dal for brevity:
    using namespace oneapi;
  4. Use the CSV data source to read the data from the CSV file into a table:
    const auto data = dal::read<dal::table>(queue, dal::csv::data_source{"data.csv"});
  5. Create a PCA descriptor, configure its parameters, and run the training algorithm on the data loaded from CSV:
    const auto pca_desc = dal::pca::descriptor<float>{}
                              .set_component_count(3)
                              .set_deterministic(true);

    const dal::pca::train_result train_res = dal::train(queue, pca_desc, data);
  6. Print the eigenvectors:
    const dal::table eigenvectors = train_res.get_eigenvectors();
    const auto acc = dal::row_accessor<const float>{eigenvectors};

    for (std::int64_t i = 0; i < eigenvectors.get_row_count(); i++) {
        /* Get the i-th row from the table; the eigenvector stores a pointer to USM */
        const dal::array<float> eigenvector = acc.pull(queue, {i, i + 1});
        assert(eigenvector.get_count() == eigenvectors.get_column_count());

        std::cout << i << "-th eigenvector: ";
        for (std::int64_t j = 0; j < eigenvector.get_count(); j++) {
            std::cout << eigenvector[j] << " ";
        }
        std::cout << std::endl;
    }
  7. Use the trained model for inference to reduce dimensionality of the data:
    const dal::pca::model model = train_res.get_model();

    const dal::table data_transformed =
        dal::infer(queue, pca_desc, model, data).get_transformed_data();
    assert(data_transformed.get_column_count() == 3);
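For reference, here is a minimal sketch that assembles the snippets above into a single program. It only adds a main() wrapper around the steps and assumes the same data.csv input file and component count of 3 used there; error handling and CPU fallback are omitted, and depending on your compiler version you may also need to include the SYCL* header (for example, <CL/sycl.hpp>) explicitly.

    #include "oneapi/dal.hpp"

    /* Standard library headers required by this example */
    #include <cassert>
    #include <iostream>

    using namespace oneapi;

    int main() {
        /* Create a SYCL* queue on a GPU device */
        const auto queue = sycl::queue{ sycl::gpu_selector{} };

        /* Read the data from a CSV file into a table */
        const auto data = dal::read<dal::table>(queue, dal::csv::data_source{"data.csv"});

        /* Configure the PCA descriptor and train the model */
        const auto pca_desc = dal::pca::descriptor<float>{}
                                  .set_component_count(3)
                                  .set_deterministic(true);
        const dal::pca::train_result train_res = dal::train(queue, pca_desc, data);

        /* Print the eigenvectors obtained at the training stage */
        const dal::table eigenvectors = train_res.get_eigenvectors();
        const auto acc = dal::row_accessor<const float>{eigenvectors};
        for (std::int64_t i = 0; i < eigenvectors.get_row_count(); i++) {
            const dal::array<float> eigenvector = acc.pull(queue, {i, i + 1});
            assert(eigenvector.get_count() == eigenvectors.get_column_count());
            std::cout << i << "-th eigenvector: ";
            for (std::int64_t j = 0; j < eigenvector.get_count(); j++) {
                std::cout << eigenvector[j] << " ";
            }
            std::cout << std::endl;
        }

        /* Use the trained model to reduce dimensionality of the data */
        const dal::pca::model model = train_res.get_model();
        const dal::table data_transformed =
            dal::infer(queue, pca_desc, model, data).get_transformed_data();
        assert(data_transformed.get_column_count() == 3);

        return 0;
    }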

Build and Run Examples

This section describes how to set up dependencies between components that are required to build and run oneDAL examples. These components are Intel® oneAPI Data Analytics Library, Intel® oneAPI DPC++/C++ Compiler, and Intel® oneAPI Threading Building Blocks. As each of them is part of the Intel® oneAPI Base Toolkit, you can set up the dependencies between all components of the toolkit by running <install-dir>/setvars.sh on Linux* or <install-dir>/setvars.bat on Windows*.
Perform the following steps to build and run examples demonstrating the basic usage scenarios of oneDAL with DPC++. Go to <install-dir>/dal/<version> and then set up an environment as shown in the example below:
All content starting with a # below is considered a NOTE and should be removed before using the code.
  1. Set up the oneDAL environment:
    • On Linux*:
      # Run script to set up CPATH, LIBRARY_PATH, and LD_LIBRARY_PATH for oneDAL
      source ./env/vars.sh
    • On Windows*:
      # Run script to set up PATH, LIB, and INCLUDE for oneDAL
      env/vars.bat
  2. Set up the compiler environment for Intel® oneAPI DPC++/C++ Compiler. See Get Started with Intel® oneAPI DPC++/C++ Compiler for details. For the multi-threaded version of oneDAL, set up the environment for Intel® oneAPI Threading Building Blocks.
  3. Build and run DPC++ examples:
    You need write permissions to the examples folder to build examples and execute permissions to run them. Otherwise, copy the examples/oneapi/dpc and examples/oneapi/data folders to a directory with the right permissions. These two folders must be kept at the same directory level relative to each other.
    The multi-threaded version of oneDAL is used to compile the examples.
    • On Linux*:
      # Navigate to the DPC++ examples directory
      cd ./examples/oneapi/dpc
      # Compile and run the SVM two-class classification (Thunder method) example using Intel® oneAPI DPC++/C++ Compiler
      make so example=svm_two_class_thunder_dense_batch
      # Compile all DPC++ examples
      make so mode=build
    • On Windows*:
      # Navigate to the DPC++ examples directory
      cd examples/oneapi/dpc
      # Compile and run the SVM two-class classification (Thunder method) example using Intel® oneAPI DPC++/C++ Compiler
      nmake dll example=svm_two_class_thunder_dense_batch
      # Compile all DPC++ examples
      nmake dll mode=build
    To see all available parameters of the build procedure, type make on Linux* or nmake on Windows*.
  4. The resulting example binaries and log files are written to the _results directory.
You should run DPC++ examples from the examples/oneapi/dpc folder, not from the _results folder. Most examples require the data to be stored in the examples/oneapi/data folder and access it through a relative path that starts from the examples/oneapi/dpc folder.

Find More

  • Refer to the oneDAL Developer Guide and Reference for detailed information about implemented algorithms.
  • Check the system requirements before you install Intel® oneAPI Data Analytics Library.
  • Refer to the release notes for Intel® oneAPI Data Analytics Library to learn about new updates in the latest release.
  • Learn how to use oneDAL with daal4py, a Python* API.
  • Learn about requirements for implementations of the oneAPI Data Analytics Library.
  • Intel® Data Analytics Acceleration Library (Intel® DAAL) is now Intel® oneAPI Data Analytics Library (oneDAL). Documentation for older versions of Intel® DAAL is available for download only; see the documentation download pages for a list of available downloads by product version.

Notices and Disclaimers

Intel technologies may require enabled hardware, software or service activation.
No product or component can be absolutely secure.
Your costs and results may vary.
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.

Product and Performance Information

1 Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.