A Walk-Through of Online Processing Using Intel® DAAL

By Zhang Zhang, Published: 07/23/2015, Last Updated: 07/23/2015

Intel® Data Analytics Acceleration Library (Intel® DAAL) is a new, highly optimized library targeting data mining, statistical analysis, and machine learning applications. It provides advanced building blocks supporting all data analysis stages. Intel DAAL supports three processing modes: batch processing, online processing, and distributed processing.

Online processing, a.k.a. streaming, is applicable when data is processed in blocks. This can be helpful if the entire dataset is too big to fit in memory all at once, or if the data is only available piecemeal.


Some Intel DAAL algorithms enable processing of data sets in blocks. In the online processing mode, the compute() and finalizeCompute() methods of a particular algorithm class are used. This computation mode assumes that the data arrives in blocks i = 1, 2, 3, …, nblocks. Call the compute() method each time new input becomes available. When the last block of data arrives, call the finalizeCompute() method to produce final results. If the input data arrives in an asynchronous mode, you can use the getStatus() method for a given data source to check whether a new block of data is available for load.
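The compute()/finalizeCompute() split can be illustrated without the library. The sketch below is an assumption-laden stand-in, not a DAAL class: OnlineMean is a hypothetical name, and it only mirrors the calling convention described above by keeping a partial result (running sum and count) that compute() updates per block and finalizeCompute() turns into the final result.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for an online algorithm; not part of Intel DAAL.
class OnlineMean {
public:
    // Fold one block of observations into the partial result.
    void compute(const std::vector<double>& block) {
        for (double x : block) {
            sum_ += x;
        }
        count_ += block.size();
    }

    // Convert the accumulated partial result into the final result.
    void finalizeCompute() {
        if (count_ == 0) {
            throw std::runtime_error("no data was processed");
        }
        mean_ = sum_ / static_cast<double>(count_);
        finalized_ = true;
    }

    // The final result is only valid after finalizeCompute().
    double getResult() const {
        if (!finalized_) {
            throw std::logic_error("finalizeCompute() has not been called");
        }
        return mean_;
    }

private:
    double sum_ = 0.0;
    std::size_t count_ = 0;
    double mean_ = 0.0;
    bool finalized_ = false;
};
```

Calling compute() once per block and finalizeCompute() once at the end yields the same mean as a single pass over the whole data set, while only the running sum and count have to be kept between blocks.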

The following diagram illustrates the computation schema for online processing:
[Figure: Online Processing Workflow, Step 1]


While different data blocks may have different numbers of observations (n_i), they must have the same number of features (p).
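The shape constraint can be sketched as follows. This is an illustration only, not DAAL code: Block and OnlineColumnSums are hypothetical names. The accumulator accepts blocks with any number of rows (observations) but rejects a block whose rows do not all have the p columns (features) established by the first observation.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Rows are observations, columns are features.
using Block = std::vector<std::vector<double>>;

// Hypothetical illustration of the block shape rule; not a DAAL class.
class OnlineColumnSums {
public:
    void compute(const Block& block) {
        if (block.empty()) {
            return;  // nothing to add
        }
        // The first observation ever seen fixes p.
        const std::size_t p = widthKnown_ ? sums_.size() : block.front().size();
        // Validate the whole block before touching the partial result.
        for (const std::vector<double>& row : block) {
            if (row.size() != p) {
                throw std::invalid_argument("block has a different number of features");
            }
        }
        if (!widthKnown_) {
            sums_.assign(p, 0.0);
            widthKnown_ = true;
        }
        for (const std::vector<double>& row : block) {
            for (std::size_t j = 0; j < p; ++j) {
                sums_[j] += row[j];
            }
        }
        rowsSeen_ += block.size();
    }

    std::size_t rowsSeen() const { return rowsSeen_; }
    const std::vector<double>& sums() const { return sums_; }

private:
    std::vector<double> sums_;   // partial result: one entry per feature
    std::size_t rowsSeen_ = 0;   // total number of observations so far
    bool widthKnown_ = false;
};
```

The partial result has a size that depends only on p, which is why the number of observations per block is free to vary while the number of features is not.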

[Figures: Online Processing Workflow, Steps 3 and 4]


Online processing typically involves a loop. Each iteration of the loop fetches one data block and computes the partial result for the current block. Then, outside the loop, partial results are combined to produce the overall results for all data blocks processed. The code snippet below illustrates the situation:

/* Create algorithm to compute SVD decomposition in online mode */
svd::Online<> algorithm;
Status loadStatus;
while((loadStatus = dataSource.loadDataBlock(nRowsInBlock)) == success)
{
    /* Set the current data block as input to the algorithm */
    algorithm.input.set( svd::data, dataSource.getNumericTable() );
    /* Compute partial SVD results for the current block */
    algorithm.compute();
}

/* Finalize the computations and retrieve SVD results */
algorithm.finalizeCompute();
SharedPtr<svd::Result> res = algorithm.getResult();

/* Access results */
printNumericTable(res->get(svd::singularValues), "Singular values:");
printNumericTable(res->get(svd::rightSingularMatrix), "Right orthogonal matrix V:");
printNumericTable(res->get(svd::leftSingularMatrix), "Left orthogonal matrix U:", 10);

