Intel® Data Analytics Acceleration Library (Intel® DAAL): how to add a user-defined algorithm

By Andrey Nikolaev, Published: 10/20/2016, Last Updated: 10/20/2016


Intel® DAAL is a performance library that provides building blocks for all stages of data analysis, from data acquisition to machine learning and model-based prediction. The algorithmic component of the library provides an extensive set of features for analysis and for supervised and unsupervised learning, optimized for Intel® architectures. The analysis stage is supported by the library via Principal Component Analysis, Variance-Covariance matrix, Cosine distance matrix, Outlier detection, Optimization solvers, etc. To describe the hidden structure of unlabeled data, you can use clustering algorithms such as K-means or Expectation-Maximization for the Gaussian mixture model, which are also available in the library. The search for a model that describes your labeled data as accurately as possible can rely on classification algorithms such as Support Vector Machine (SVM) or regression algorithms, e.g., Ridge Regression. Recent extensions of the library provided in the Intel® DAAL 2017 release include building blocks for neural-network-based computations, including deep learning. The full list and description of the algorithms is available here.

In addition to the rich set of the library’s algorithmic features, quickly emerging analytical applications can use a combination of standard (unchanged) and customized machine learning algorithms. For example, to use the Intel® DAAL SVM in your application, you may need a kernel function that is not yet supported in the library, or even a customized kernel. Does this stop you from using the library in your application? No, it does not.

The Intel® DAAL distribution offers the following options to customize the algorithmic features of the library for the needs of your application.

Open source Intel® DAAL

Open source Intel® DAAL is licensed under the Apache License 2.0. If your application requires some modification of the computation flow in an algorithm available in the library (e.g., support of an extra algorithmic parameter that improves the accuracy of your computations), download the source code. In the repository you will find all necessary instructions and pre-compiled dependencies to build the library in your environment. If you think your modification may be broadly useful, you are welcome to submit a pull request on GitHub and contact the Intel® DAAL team to discuss opportunities to include it in the library.

Enabling your algorithm in Intel® DAAL API

The interfaces of the library are designed to meet multiple requirements, and extensibility to easily support a new algorithm is one of them. Let’s discuss how to integrate a new C++ algorithm in the library and use it with other Intel® DAAL data structures such as Numeric Tables. For simplicity we assume that we want to enable the batch version of the algorithm for computation of the vector of means for the dense dataset (this link discusses the computation modes supported by the library).

Our goal is an example that calls the new algorithm in Intel® DAAL style and relies on the library data structures such as SharedPtr and NumericTable:

#include "daal.h"
#include "user_defined_algorithm.h"

using namespace daal;

int main()
{
    /* x, p, and n are assumed to be defined by the application:
       x points to n * p values of type double stored as n rows and p columns */
    services::SharedPtr<data_management::HomogenNumericTable<double> >
        dataPtr(new data_management::HomogenNumericTable<double>(x, p, n));
    /* Create user-defined algorithm to compute mean using the default method */
    new_daal_algorithm::Batch<> algorithm;
    /* Set input objects for the algorithm */
    algorithm.input.set(new_daal_algorithm::data, dataPtr);
    /* Compute mean */
    algorithm.compute();
    /* Get computed mean result */
    data_management::HomogenNumericTable<double>* r = /* ... */;
    return 0;
}

Below we discuss the conventions and components the new algorithm should support so that the application can be written in the above style. We refer to the respective lines of the user_defined_algorithm.h header file, which contains the new algorithm to be integrated with Intel® DAAL.

Algorithm namespace

Each C++ algorithm of the library is introduced in its own namespace. For the new algorithm we introduce the new_daal_algorithm namespace.

Algorithm method

Each C++ algorithm of the library is associated with the computation method. In the namespace of the algorithm, we introduce respective enumeration enum Method and populate it with one identifier (line 36). This value is used by the library at the stage of validation of the objects associated with the algorithm.

Each C++ algorithm is represented with a set of classes that describe its inputs, results, parameters, and computational drivers. Those classes are declared in the namespace of the algorithm.

Algorithm parameter

Introduce the structure Parameter to represent the parameters of your algorithm and derive it from the base library class daal::algorithms::Parameter (lines 42-51):

struct Parameter : public daal::algorithms::Parameter {};

Define the fields and constructors of the structure and add the method for validation of the parameters required by the conventions of the library.
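
As a rough illustration of this convention, the following library-independent sketch pairs a parameter structure with a validation method. ParameterBase, the field blockSize, its default value, and the choice to report errors by throwing an exception are all assumptions made for the example; they are stand-ins, not the Intel® DAAL classes or the library's error handling.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

namespace sketch   // hypothetical stand-ins, not Intel DAAL classes
{
// Stand-in for a parameter base class that exposes a validation hook.
struct ParameterBase
{
    virtual ~ParameterBase() {}
    virtual void check() const {}   // no constraints by default
};

// Parameters of the sketch algorithm with a single hypothetical field.
struct Parameter : public ParameterBase
{
    std::size_t blockSize;   // hypothetical: number of rows processed per block

    explicit Parameter(std::size_t blockSize_ = 256) : blockSize(blockSize_) {}

    // Reject values the algorithm cannot work with.
    void check() const override
    {
        if (blockSize == 0)
            throw std::invalid_argument("blockSize must be positive");
    }
};
}   // namespace sketch
```

In this sketch, calling check() on a default-constructed Parameter succeeds, while Parameter(0).check() throws std::invalid_argument.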

Algorithm input

The input is represented with two components:

  • The enumeration enum InputId, which defines the identifiers of the input objects as non-negative integers (line 54).
  • The class Input (lines 60-81), which introduces the input objects of the algorithm, such as the dataset represented as an Intel® DAAL Numeric Table. The class Input of the new algorithm is derived from the respective base class available in the library:
class Input : public daal::algorithms::Input{};

This base class takes care of the actual storage of the input objects and of access to them. Internally, the base class stores the input objects as a collection of shared pointers to the SerializationIface class: Collection<SharedPtr<SerializationIface>>.

Define the get()/set() methods for accessing and setting the input dataset in the class, as well as the method check() for validation of the input objects required by the conventions of the library.
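
The storage scheme described above can be sketched without the library, assuming std::shared_ptr and std::vector in place of SharedPtr and Collection. Serializable, DenseTable, and ArgumentBase are hypothetical stand-ins for the library types, and reporting validation failures with exceptions is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

namespace sketch   // hypothetical stand-ins, not Intel DAAL classes
{
// Stand-in for a serializable object such as a numeric table.
struct Serializable
{
    virtual ~Serializable() {}
};

// Minimal dense table: nRows x nCols values in row-major order.
struct DenseTable : public Serializable
{
    std::size_t nRows, nCols;
    std::vector<double> values;
    DenseTable(std::size_t r, std::size_t c) : nRows(r), nCols(c), values(r * c, 0.0) {}
};

// Base class: owns the slots and gives indexed access to them.
class ArgumentBase
{
public:
    explicit ArgumentBase(std::size_t nSlots) : slots_(nSlots) {}
    virtual ~ArgumentBase() {}

protected:
    std::shared_ptr<Serializable> getSlot(std::size_t i) const
    {
        assert(i < slots_.size());
        return slots_[i];
    }
    void setSlot(std::size_t i, const std::shared_ptr<Serializable>& p)
    {
        assert(i < slots_.size());
        slots_[i] = p;
    }

private:
    std::vector<std::shared_ptr<Serializable> > slots_;
};

// Identifiers of the input objects; the sketch has a single input.
enum InputId { data = 0 };

class Input : public ArgumentBase
{
public:
    Input() : ArgumentBase(1) {}

    // Typed access on top of the untyped storage of the base class.
    std::shared_ptr<DenseTable> get(InputId id) const
    {
        return std::dynamic_pointer_cast<DenseTable>(getSlot(id));
    }
    void set(InputId id, const std::shared_ptr<DenseTable>& table) { setSlot(id, table); }

    // Validate the input before any computation starts.
    void check() const
    {
        std::shared_ptr<DenseTable> t = get(data);
        if (t == nullptr) throw std::invalid_argument("input data is not set");
        if (t->nRows == 0 || t->nCols == 0) throw std::invalid_argument("input data is empty");
    }
};
}   // namespace sketch
```

The derived class only adds typed get()/set() accessors and check(); ownership of the objects stays in the base class, which mirrors the division of responsibilities described above.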

Algorithm result

Like the Input class, Result introduces the result objects of the algorithm with two components:

  • The enumeration enum ResultId, which defines the identifiers of the result objects as non-negative integers (line 84).
  • The class Result (lines 90-128), which introduces the result objects of the algorithm. In this example there is one result, the vector of means. The class Result of the new algorithm is derived from the respective base class available in the library:
class Result : public daal::algorithms::Result {};

Like the base class for the input objects, this base class takes care of the storage of the result objects.

Define the get()/set() methods for accessing and setting the result in the class, as well as the method check() for validation of the result objects required by the conventions of the library.

Additionally, the conventions of the library require you to define a template method allocate() (line 114) that is responsible for allocating memory for the result objects of the algorithm. Note that the allocation method has a template parameter that allows you to choose the data type used to represent the results of your algorithm.
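
The idea of a templated allocation method can be illustrated with a small sketch that does not depend on the library. TableBase, Table<T>, the identifier mean, and the simplified allocate(nFeatures) signature are assumptions made for the example; the signature required by Intel® DAAL is the one found in the header discussed above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

namespace sketch   // hypothetical stand-ins, not Intel DAAL classes
{
// Type-erased table interface so that results of any element type fit one slot.
struct TableBase
{
    virtual ~TableBase() {}
    virtual std::size_t size() const = 0;
};

// One-dimensional table holding values of type T.
template <typename T>
struct Table : public TableBase
{
    std::vector<T> values;
    explicit Table(std::size_t n) : values(n, T()) {}
    std::size_t size() const override { return values.size(); }
};

// Identifiers of the result objects; the sketch has a single result.
enum ResultId { mean = 0 };

class Result
{
public:
    Result() : slots_(1) {}

    // Allocate storage for nFeatures means; T selects the element type of the result.
    template <typename T>
    void allocate(std::size_t nFeatures)
    {
        assert(nFeatures > 0);
        slots_[mean] = std::make_shared<Table<T> >(nFeatures);
    }

    std::shared_ptr<TableBase> get(ResultId id) const { return slots_.at(id); }
    void set(ResultId id, const std::shared_ptr<TableBase>& t) { slots_.at(id) = t; }

private:
    std::vector<std::shared_ptr<TableBase> > slots_;
};
}   // namespace sketch
```

With this sketch, result.allocate<float>(4) creates a table of four single-precision values, while result.allocate<double>(4) would store the same result in double precision.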


Now we are ready to discuss the remaining classes of the algorithm, which represent the driver of the computations. Each C++ algorithm in Intel® DAAL is represented with a pair of related classes, Batch and BatchContainer. The Batch class is a thin interface class that serves as a “depository” for the input objects and parameters of the algorithm, as well as for its results. It also provides the service functionality shown below. You do not need to define the compute() method in Batch: it is defined in one of the base classes of the Intel® DAAL algorithm hierarchy that manages the computation flow. The real computations are performed by the compute() method of the related BatchContainer class, which is called by the compute() method of the interface class Batch. BatchContainer serves as storage for CPU-specific implementations of the algorithm; it takes a request from the Batch class to run the computations and returns the result or reports an error.

Algorithm Batch class

The class derives from the base class available in the library. For simplicity we assume that the new algorithm will run the analysis computations. So, we derive it from daal::algorithms::Analysis<batch> class as shown below (line 145):

template<typename algorithmFPType = double, Method method = defaultDense> class Batch : public daal::algorithms::Analysis<batch>{};

The class is a template that has two parameters:

  • Parameter algorithmFPType specifies which precision, single or double, will be used in the intermediate computations of the algorithm. This parameter allows you to run computations in the precision different from the precision used to store the input dataset.
  • Parameter method specifies the method of the computations (we introduced just one method, named defaultDense).

The class has two public members, one for Input and the other for Parameter (lines 148 and 149), and a private member for Result (line 204). Conventions of the library require two constructors, a default constructor and a copy constructor used for cloning (lines 151 and 156), as well as methods for accessing the results and the index of the method of the algorithm (lines 165 and 168). Also, have a look at the protected section of the class. It contains a few methods, including allocateResult() (line 175), required by the conventions of the library.
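
The division of work between the interface class and its container can be sketched in a few lines of library-independent C++. Everything in the sketch is a hypothetical stand-in: AlgorithmBase and ContainerIface play the role of the library base classes, the container is nested inside Batch for brevity, the input is a plain std::vector, and the order of steps in compute() (validate, allocate, delegate) is an assumption rather than a description of the library internals.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

namespace sketch   // hypothetical stand-ins, not Intel DAAL classes
{
enum Method { defaultDense = 0 };

// Interface of the object that performs the actual computation.
class ContainerIface
{
public:
    virtual ~ContainerIface() {}
    virtual void compute() = 0;
};

// Base class that owns the computation flow, so Batch does not define compute().
class AlgorithmBase
{
public:
    virtual ~AlgorithmBase() {}
    void compute()
    {
        checkInput();            // validate the arguments
        allocateResult();        // make room for the result
        container_->compute();   // delegate the real work to the container
    }

protected:
    virtual void checkInput() const = 0;
    virtual void allocateResult() = 0;
    std::unique_ptr<ContainerIface> container_;
};

template <typename FPType = double, Method method = defaultDense>
class Batch : public AlgorithmBase
{
public:
    std::vector<FPType> input;   // stands in for the Input object

    Batch() { container_.reset(new Container(*this)); }

    const std::shared_ptr<FPType>& getResult() const { return result_; }
    int getMethod() const { return static_cast<int>(method); }

protected:
    void checkInput() const override
    {
        if (input.empty()) throw std::invalid_argument("input is empty");
    }
    void allocateResult() override { result_ = std::make_shared<FPType>(0); }

private:
    // Plays the role of the container: computes the mean of the input values.
    class Container : public ContainerIface
    {
    public:
        explicit Container(Batch& owner) : owner_(owner) {}
        void compute() override
        {
            assert(owner_.result_ != nullptr);   // allocated by the base-class flow
            FPType sum = 0;
            for (std::size_t i = 0; i < owner_.input.size(); ++i) sum += owner_.input[i];
            *owner_.result_ = sum / static_cast<FPType>(owner_.input.size());
        }

    private:
        Batch& owner_;
    };

    std::shared_ptr<FPType> result_;
};
}   // namespace sketch
```

A caller fills input, calls compute() on the Batch object, and reads the value through getResult(); the container is never called directly.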

Algorithm BatchContainer class

We will declare the container of the algorithm that does the real work. This class derives from the base class of the library (line 134) with the default constructor:

class BatchContainer : public daal::algorithms::AnalysisContainerIface<batch>{};

Now we are ready to implement the algorithm for computation of the mean. To do this, we will declare the method compute() and implement it in the same file (line 208).
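
The numerical core of such a compute() method can be sketched independently of the library. The function columnMeans and its parameter names are hypothetical, and the row-major layout (n observations in rows, p features in columns) is an assumption of the sketch. The separate algorithmFPType parameter illustrates how intermediate sums can be kept in a precision different from the one used to store the data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch   // hypothetical helper, not part of Intel DAAL
{
// Computes the mean of every column of a dense, row-major table with
// nObservations rows and nFeatures columns. Sums are accumulated in
// algorithmFPType; the means are returned in the storage type DataType.
template <typename algorithmFPType, typename DataType>
std::vector<DataType> columnMeans(const std::vector<DataType>& data,
                                  std::size_t nObservations,
                                  std::size_t nFeatures)
{
    assert(nObservations > 0 && nFeatures > 0);
    assert(data.size() == nObservations * nFeatures);

    std::vector<algorithmFPType> sums(nFeatures, algorithmFPType(0));
    for (std::size_t i = 0; i < nObservations; ++i)
        for (std::size_t j = 0; j < nFeatures; ++j)
            sums[j] += static_cast<algorithmFPType>(data[i * nFeatures + j]);

    std::vector<DataType> means(nFeatures);
    for (std::size_t j = 0; j < nFeatures; ++j)
        means[j] = static_cast<DataType>(sums[j] / static_cast<algorithmFPType>(nObservations));
    return means;
}
}   // namespace sketch
```

For example, sketch::columnMeans<double>(x, n, p) with a std::vector<float> x accumulates in double precision and returns single-precision means.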


To build the example that uses the new algorithm together with Intel® DAAL, use your usual compilation line and additionally specify the path to the header file with the algorithm just created.


Representing the new algorithm in the Intel® DAAL API style helps you transparently combine it with other components of the library, both algorithmic and data management, and keeps a consistent Intel® DAAL programming style. The list of conventions defined by the library for integration of a new algorithm is small and easy to follow. You can always consult the library documentation available here, as well as look at the interface files and source code of the algorithms to clarify implementation details. You are always welcome to contact the Intel® DAAL team via the forum if you have any questions on customizing the library for your specific requirements.

This paper is one in a series of papers that take a deep dive into the features, architecture, and algorithms of the library. Please let us know if you are interested in learning additional details about a specific feature or component of Intel® DAAL.
