Get Started with the Intel® oneAPI Data Analytics Library
The Intel® oneAPI Data Analytics Library (oneDAL) is a library that helps speed up big data analysis by providing highly optimized algorithmic building blocks for all stages of data analytics (preprocessing, transformation, analysis, modeling, validation, and decision making) in batch, online, and distributed processing modes of computation. The current version of oneDAL provides Data Parallel C++ (DPC++) API extensions to the traditional C++ interface.
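To make the three processing modes concrete, here is a minimal sketch in plain C++ that computes a mean in batch, online, and distributed style. It does not use oneDAL; the names `batch_mean`, `partial_mean`, `update_partial`, `merge_partials`, and `finalize_mean` are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Batch mode: the whole data set is available at once.
double batch_mean(const std::vector<double>& data) {
    assert(!data.empty());
    double sum = 0.0;
    for (double x : data) sum += x;
    return sum / static_cast<double>(data.size());
}

// Online mode: data arrives block by block, and only a small
// partial result is kept between blocks.
struct partial_mean {
    double sum = 0.0;
    std::size_t count = 0;
};

partial_mean update_partial(partial_mean p, const std::vector<double>& block) {
    for (double x : block) {
        p.sum += x;
        ++p.count;
    }
    return p;
}

// Distributed mode: partial results computed independently
// (for example, on different nodes) are merged into one.
partial_mean merge_partials(const partial_mean& a, const partial_mean& b) {
    partial_mean r;
    r.sum = a.sum + b.sum;
    r.count = a.count + b.count;
    return r;
}

// Turn a partial result into the final mean.
double finalize_mean(const partial_mean& p) {
    assert(p.count > 0);
    return p.sum / static_cast<double>(p.count);
}
```

All three paths produce the same value for the same data. They differ only in how much of the data must be in memory, or on one machine, at a time.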
For general information about oneDAL, visit the oneDAL GitHub* repository or browse the oneDAL documentation. The complete list of features and the full documentation are available on the official oneDAL website.
Before You Begin
The latest oneDAL is located in the <install-dir>/dal/latest directory, where <install-dir> is the directory in which the Intel® oneAPI Base Toolkit was installed. The current version of oneDAL with DPC++ is available for Linux* and Windows* 64-bit operating systems. The prebuilt oneDAL libraries can be found in the <install-dir>/dal/<version>/redist directory.
To learn about the system requirements and the dependencies needed to build examples with DPC++ API extensions, refer to the System Requirements page.
End-to-end Example
Below is a typical usage workflow for a oneDAL algorithm on GPU. The example uses the Principal Component Analysis (PCA) algorithm.
The following steps depict how to:
- Read data from a CSV file
- Run the training and inference operations for PCA
- Access intermediate results obtained at the training stage
- Include the following header that makes all oneDAL declarations available:

      #include "oneapi/dal.hpp"

      /* Standard library headers required by this example */
      #include <cassert>
      #include <iostream>
- Create a SYCL* queue with the desired device selector. In this case, the GPU selector is used:

      const auto queue = sycl::queue{ sycl::gpu_selector{} };
- Since all oneDAL declarations are in the oneapi::dal namespace, import all declarations from the oneapi namespace to use dal instead of oneapi::dal for brevity:

      using namespace oneapi;
- Use the CSV data source to read the data from the CSV file into a table:

      const auto data = dal::read<dal::table>(queue, dal::csv::data_source{"data.csv"});
- Create a PCA descriptor, configure its parameters, and run the training algorithm on the data loaded from CSV:

      const auto pca_desc = dal::pca::descriptor<float>{}
          .set_component_count(3)
          .set_deterministic(true);

      const dal::pca::train_result train_res = dal::train(queue, pca_desc, data);
- Print the eigenvectors:

      const dal::table eigenvectors = train_res.get_eigenvectors();
      const auto acc = dal::row_accessor<const float>{eigenvectors};
      for (std::int64_t i = 0; i < eigenvectors.get_row_count(); i++) {
          /* Get i-th row from the table, the eigenvector stores pointer to USM */
          const dal::array<float> eigenvector = acc.pull(queue, {i, i + 1});
          assert(eigenvector.get_count() == eigenvectors.get_column_count());

          std::cout << i << "-th eigenvector: ";
          for (std::int64_t j = 0; j < eigenvector.get_count(); j++) {
              std::cout << eigenvector[j] << " ";
          }
          std::cout << std::endl;
      }
- Use the trained model for inference to reduce the dimensionality of the data:

      const dal::pca::model model = train_res.get_model();
      const dal::table data_transformed = dal::infer(queue, pca_desc, model, data).get_transformed_data();
      assert(data_transformed.get_column_count() == 3);
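For readers who want a feel for what the training and inference steps compute, the following is a minimal sketch of PCA in plain C++ without oneDAL. It finds only the first principal component, using power iteration on the covariance matrix. This is an illustrative technique and not a description of how oneDAL implements PCA. The names `covariance`, `first_component`, and `project` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using matrix = std::vector<std::vector<double>>;

// Sample covariance matrix of row-major data
// (rows are observations, columns are features).
matrix covariance(const matrix& data) {
    assert(data.size() > 1);
    const std::size_t n = data.size();
    const std::size_t d = data[0].size();
    std::vector<double> mean(d, 0.0);
    for (const auto& row : data)
        for (std::size_t j = 0; j < d; ++j) mean[j] += row[j] / n;
    matrix cov(d, std::vector<double>(d, 0.0));
    for (const auto& row : data)
        for (std::size_t a = 0; a < d; ++a)
            for (std::size_t b = 0; b < d; ++b)
                cov[a][b] += (row[a] - mean[a]) * (row[b] - mean[b]) / (n - 1);
    return cov;
}

// Leading eigenvector of a symmetric matrix by power iteration.
// Limitation of this sketch: the all-ones start vector must not be
// orthogonal to the leading eigenvector.
std::vector<double> first_component(const matrix& cov, int iterations = 200) {
    const std::size_t d = cov.size();
    std::vector<double> v(d, 1.0);
    for (int it = 0; it < iterations; ++it) {
        std::vector<double> w(d, 0.0);
        for (std::size_t a = 0; a < d; ++a)
            for (std::size_t b = 0; b < d; ++b) w[a] += cov[a][b] * v[b];
        double norm = 0.0;
        for (double x : w) norm += x * x;
        norm = std::sqrt(norm);
        assert(norm > 0.0);
        for (std::size_t a = 0; a < d; ++a) v[a] = w[a] / norm;
    }
    return v;
}

// Coordinate of one observation along a component. The row is
// assumed to be centered already; this sketch does not subtract the mean.
double project(const std::vector<double>& row, const std::vector<double>& component) {
    assert(row.size() == component.size());
    double s = 0.0;
    for (std::size_t j = 0; j < row.size(); ++j) s += row[j] * component[j];
    return s;
}
```

In this sketch, `first_component(covariance(data))` plays the role of one row of the eigenvectors table, and `project` corresponds to producing one value of the transformed data. A library implementation would compute several components at once and handle the numerical edge cases this sketch ignores.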
Build and Run Examples
This section describes how to set up dependencies between the components that are required to build and run oneDAL examples. These components are Intel® oneAPI Data Analytics Library, Intel® oneAPI DPC++/C++ Compiler, and Intel® oneAPI Threading Building Blocks. As each of them is a part of the Intel® oneAPI Base Toolkit, you can set up the dependencies between all components of the toolkit by running <install-dir>/setvars.sh on Linux* or <install-dir>/setvars.bat on Windows*.
Perform the following steps to build and run examples demonstrating the basic usage scenarios of oneDAL with DPC++. Go to <install-dir>/dal/<version> and then set up an environment as shown in the example below:
All content starting with a # below is considered a NOTE and should be removed before using the code.
- Set up the oneDAL environment:
  - On Linux*:

        # Run script to set up CPATH, LIBRARY_PATH and LD_LIBRARY_PATH for oneDAL
        source ./env/vars.sh

  - On Windows*:

        # Run script to set up PATH, LIB and INCLUDE for oneDAL
        env/vars.bat
- Set up the compiler environment for Intel® oneAPI DPC++/C++ Compiler. See Get Started with Intel® oneAPI DPC++/C++ Compiler for details. For the multi-threaded version of oneDAL, set up the environment for Intel® oneAPI Threading Building Blocks.
- Build and run DPC++ examples:

  You need write permissions to the examples folder to build examples, and execute permissions to run them. Otherwise, copy the examples/oneapi/dpc and examples/oneapi/data folders to a directory where you have the right permissions. These two folders must be kept at the same directory level relative to each other.

  The multi-threaded version of oneDAL is used to compile examples.
  - On Linux*:

        # Navigate to DPC++ examples directory and build examples
        cd ./examples/oneapi/dpc
        # Compile and run Correlation example using Intel® oneAPI DPC++/C++ Compiler
        make example=cor_dense_batch
        # Compile all DPC++ examples
        make mode=build
  - On Windows*:

        # Navigate to DPC++ examples directory and build examples
        cd examples/oneapi/dpc
        # Compile and run Correlation example using Intel® oneAPI DPC++/C++ Compiler
        nmake libintel64 example=cor_dense_batch
        # Compile all DPC++ examples
        nmake libintel64 mode=build
  To see all available parameters of the build procedure, type make on Linux* or nmake on Windows*.
- The resulting example binaries and log files are written to the _results directory.
Run DPC++ examples from the examples/oneapi/dpc folder, not from the _results folder. Most examples require data to be stored in the examples/oneapi/data folder and reach it through a relative path that starts from the examples/oneapi/dpc folder.
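The reason the working directory matters can be sketched with `std::filesystem` (C++17). The example below assumes, for illustration only, that an example opens its input through a relative path of the form `../data/<file>` and that `_results` sits inside `examples/oneapi/dpc`. The helper `resolve_from` and the file name `sample.csv` are hypothetical.

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Resolve a relative path against a working directory, purely lexically
// (no file system access), to show where the lookup would end up.
fs::path resolve_from(const fs::path& working_dir, const fs::path& relative) {
    assert(relative.is_relative());
    return (working_dir / relative).lexically_normal();
}
```

Under these assumptions, `resolve_from("examples/oneapi/dpc", "../data/sample.csv")` yields `examples/oneapi/data/sample.csv`, which is where the data lives. The same relative path resolved from `examples/oneapi/dpc/_results` yields `examples/oneapi/dpc/data/sample.csv`, which does not exist in the layout described above.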
Find More
- Refer to the oneDAL Developer Guide for detailed information about implemented algorithms.
- Check system requirements before you install Intel® oneAPI Data Analytics Library.
- Refer to release notes for Intel® oneAPI Data Analytics Library to learn about new updates in the latest release.
- Learn how to use oneDAL with daal4py, a Python* API.
- Learn about requirements for implementations of oneAPI Data Analytics Library.
Intel® Data Analytics Acceleration Library (Intel® DAAL) is now Intel® oneAPI Data Analytics Library (oneDAL). Documentation for older versions of Intel® DAAL is available for download only. For a list of available documentation downloads by product version, see these pages:
Notices and Disclaimers
Intel technologies may require enabled hardware, software or service activation.
No product or component can be absolutely secure.
Your costs and results may vary.
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.