A Walk-Through of Online Processing Using Intel® DAAL

Intel® Data Analytics Acceleration Library (Intel® DAAL) is a new, highly optimized library for data mining, statistical analysis, and machine learning applications. It provides advanced building blocks that support all stages of data analysis. Intel DAAL supports three processing modes: batch processing, online processing, and distributed processing.

Online processing, also known as streaming, applies when data is processed in blocks. It is helpful when the entire dataset is too big to fit in memory at once, or when the data is only available piecemeal.
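
To make the idea of block-wise processing concrete, here is a minimal sketch in plain C++ that does not use Intel DAAL at all. The helpers readBlock() and sumInBlocks() are hypothetical names introduced only for this illustration. They show how a stream can be consumed one block at a time so that only the current block is held in memory.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>  // std::istringstream, handy for trying the helpers out
#include <vector>

// Hypothetical helper (not part of Intel DAAL): reads up to maxValues numbers
// from a stream and returns them as one block. An empty block means the
// stream has no more data.
std::vector<double> readBlock(std::istream& in, std::size_t maxValues)
{
    std::vector<double> block;
    double value;
    // Check the size first so that no value is consumed once the block is full.
    while (block.size() < maxValues && in >> value)
    {
        block.push_back(value);
    }
    return block;
}

// Hypothetical helper: sums a stream block by block, so only one block is
// held in memory at any time.
double sumInBlocks(std::istream& in, std::size_t blockSize)
{
    double total = 0.0;
    for (std::vector<double> block = readBlock(in, blockSize); !block.empty();
         block = readBlock(in, blockSize))
    {
        for (double v : block)
        {
            total += v;
        }
    }
    return total;
}
```

The last block may be shorter than the requested size. The sketch simply processes whatever was read and stops when a read returns nothing.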

Introduction

Some Intel DAAL algorithms can process data sets in blocks. In the online processing mode, the compute() and finalizeCompute() methods of a particular algorithm class are used. This computation mode assumes that the data arrives in blocks i = 1, 2, 3, …, nblocks. Call the compute() method each time a new block of input becomes available. After the last block has been processed, call the finalizeCompute() method to produce the final results. If the input data arrives asynchronously, you can use the getStatus() method of a given data source to check whether a new block of data is available for loading.
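
The calling pattern described above can be sketched with a toy class in plain C++. OnlineMoments is a hypothetical class written for this illustration, not an Intel DAAL algorithm. It only mirrors the division of labor: compute() folds each block into a partial result, and finalizeCompute() turns that partial result into the final mean and sample variance.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class (not the Intel DAAL API): it only mirrors the
// compute() / finalizeCompute() calling pattern.
class OnlineMoments
{
public:
    // Folds one block into the partial result (count, sum, sum of squares).
    void compute(const std::vector<double>& block)
    {
        for (double x : block)
        {
            ++count_;
            sum_ += x;
            sumSquares_ += x * x;
        }
    }

    // Turns the accumulated partial result into the final mean and variance.
    void finalizeCompute()
    {
        assert(count_ > 1 && "need at least two observations");
        const double n = static_cast<double>(count_);
        mean_ = sum_ / n;
        variance_ = (sumSquares_ - sum_ * sum_ / n) / (n - 1.0);
    }

    double mean() const { return mean_; }
    double variance() const { return variance_; }  // sample variance

private:
    std::size_t count_ = 0;
    double sum_ = 0.0;
    double sumSquares_ = 0.0;
    double mean_ = 0.0;
    double variance_ = 0.0;
};
```

Calling compute() with {1, 2} and then {3, 4}, followed by finalizeCompute(), gives the same mean and variance as processing {1, 2, 3, 4} in one pass. The sum-of-squares formula is chosen here for brevity. It can lose precision on large or badly scaled data, so a production implementation would likely use a more stable update.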

The following diagram illustrates the computation schema for online processing:
[Figure: Online Processing Workflow, Step 1]


NOTE

While different data blocks may have different numbers of observations (n_i), they must all have the same number of features (p).
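
As an illustration of this constraint, the following plain C++ sketch accepts blocks with any number of rows but rejects a block whose rows do not have exactly p columns. Row, Block, and OnlineColumnMeans are hypothetical names used only here. Intel DAAL represents data with its own numeric table classes and performs its own checks.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical types for illustration only.
using Row = std::vector<double>;
using Block = std::vector<Row>;

class OnlineColumnMeans
{
public:
    // p is the number of features, fixed for the lifetime of the object.
    explicit OnlineColumnMeans(std::size_t p) : sums_(p, 0.0) {}

    // Accepts a block with any number of rows (n_i), but every row must
    // have exactly p columns.
    void compute(const Block& block)
    {
        // Validate the whole block first so a bad block changes nothing.
        for (const Row& row : block)
        {
            if (row.size() != sums_.size())
            {
                throw std::invalid_argument("block has the wrong number of features");
            }
        }
        for (const Row& row : block)
        {
            for (std::size_t j = 0; j < row.size(); ++j)
            {
                sums_[j] += row[j];
            }
        }
        rows_ += block.size();
    }

    // Combines the accumulated sums into the per-feature means.
    std::vector<double> finalizeCompute() const
    {
        if (rows_ == 0)
        {
            throw std::logic_error("no observations were processed");
        }
        std::vector<double> means(sums_);
        for (double& m : means)
        {
            m /= static_cast<double>(rows_);
        }
        return means;
    }

private:
    std::vector<double> sums_;
    std::size_t rows_ = 0;
};
```

A first block with three rows and a second block with one row are both accepted as long as each row has p values. A row with a different length causes compute() to throw before any partial result is updated.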


[Figure: Online Processing Workflow, Step 3]

[Figure: Online Processing Workflow, Step 4]

Example

Online processing typically involves a loop. Each iteration of the loop fetches one data block and computes the partial result for the current block. Then, outside the loop, partial results are combined to produce the overall results for all data blocks processed. The code snippet below illustrates the situation:

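/* Create an algorithm to compute the SVD in online mode */
svd::Online<> algorithm;

/* loadDataBlock() returns the number of rows it actually loaded,
   so the loop ends once the data source cannot supply a full block */
while (dataSource.loadDataBlock(nRowsInBlock) == nRowsInBlock)
{
    /* Pass the current block to the algorithm */
    algorithm.input.set(svd::data, dataSource.getNumericTable());
    /* Update the partial SVD result with the current block */
    algorithm.compute();
}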

/* Finalize the computations and retrieve SVD results */
algorithm.finalizeCompute();
SharedPtr<svd::Result> res = algorithm.getResult();

/* Access results */
printNumericTable(res->get(svd::singularValues), "Singular values:");
printNumericTable(res->get(svd::rightSingularMatrix), "Right orthogonal matrix V:");
printNumericTable(res->get(svd::leftSingularMatrix), "Left orthogonal matrix U:", 10);

 

For more complete information about compiler optimizations, see the Optimization Notice.