What is Intel® DAAL?

Published: 07/23/2015   Last updated: 07/23/2015

The Intel® Data Analytics Acceleration Library (Intel® DAAL) is a library of building blocks optimized for Intel® Architecture that cover all data analytics stages: data acquisition from a data source, preprocessing, transformation, data mining, modeling, validation, and decision making. To achieve the best performance on a range of Intel® processors, Intel® DAAL uses optimized algorithms from the Intel® Math Kernel Library and Intel® Integrated Performance Primitives.

Intel® DAAL supports the concept of end-to-end analytics, in which some of the data analytics stages are performed on edge devices (close to where the data is generated and where it is finally consumed). Specifically, the Intel® DAAL Application Programming Interfaces (APIs) are agnostic about the cross-device communication technology and can therefore be used within different end-to-end analytics frameworks.

Intel® DAAL is a library of commonly used building blocks for accelerating data analytics applications. It supports a variety of usage scenarios, ranging from analytics on an IA-based mobile device or sensor to large-scale distributed Big Data analysis on high-performance clusters.

The library targets software developers who care about the performance and power efficiency of their data analytics software as well as their overall productivity. With Intel® DAAL, they do not need to spend days or months implementing and optimizing commonly used algorithmic building blocks for data analysis.

Intel® DAAL is designed to be accessible to a broad range of data analytics application developers. Its API supports the C++ and Java* languages, so software developers can integrate Intel® DAAL with their C++ and Java applications and platforms and get native code performance even in managed code environments.

Unlike other libraries targeting the machine learning and data mining domains, Intel® DAAL optimizes the entire workflow: data acquisition from SQL* and NoSQL data sources, data transformation, data analysis, training, and prediction.

What problems does this library solve?

Intel® DAAL can be used in knowledge discovery and data mining, predictive analysis, machine learning, statistical analysis, AI, pattern recognition, neurocomputing, and many other problems that involve huge amounts of data and require faster analysis and decision making.

In which application domains can this library be used?

Nowadays almost every application generates a significant amount of data: text, images, video, audio, sensor data, customer behavior, financial data, and so on. Examples include predicting a customer's shopping behavior and using it to push ads or to promote the products the user is most likely to buy on the online portal the user visits; analyzing patient data to find a better medicine in drug discovery; reducing power consumption by analyzing data from various sensor inputs; and predicting the probability that a consumer will repay a bank loan. Intel® DAAL can be used in any domain where large amounts of data are generated, prepared, and analyzed.

What algorithms are available?

The Algorithms component of the Intel® Data Analytics Acceleration Library (Intel® DAAL) consists of classes that implement algorithms for data analysis (data mining) and data modeling (training and prediction). Intel® DAAL provides a variety of algorithms that are used in various stages of data analytics.

  • Data mining and analysis algorithms:
    • Computing correlation distance and cosine distance
    • PCA (Correlation, SVD)
    • Matrix decomposition (SVD, QR, Cholesky)
    • Computing statistical moments
    • Computing variance-covariance and correlation matrices
    • Computing quantiles
    • Univariate and multivariate outlier detection
    • Association rule mining
    • Linear and RBF kernel functions
  • Algorithms for supervised and unsupervised machine learning:
    • Linear regression
    • Naïve Bayes classifier
    • AdaBoost, LogitBoost, and BrownBoost classifiers
    • SVM classifier
    • K-Means clustering
    • Expectation Maximization (EM) for Gaussian Mixture Models (GMM)
    • Validation metrics for classifiers, including confusion matrix, accuracy, precision, recall, and F-score
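
To make the classifier validation metrics in the list concrete, here is a minimal sketch in plain Python of how accuracy, precision, recall, and F-score follow from a binary confusion matrix. It does not use the Intel® DAAL API; the function names `confusion_counts` and `classifier_metrics` and the sample labels are assumptions made up for this illustration.

```python
# Illustrative only: plain Python, not the Intel DAAL API.
# Function names and sample data are hypothetical.

def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification result."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classifier_metrics(y_true, y_pred, positive=1):
    """Derive accuracy, precision, recall, and F-score (F1) from the counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    fscore = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "fscore": fscore,
    }


if __name__ == "__main__":
    truth = [1, 1, 1, 0, 0, 0, 1, 0]
    predicted = [1, 0, 1, 0, 1, 0, 1, 0]
    print(confusion_counts(truth, predicted))
    print(classifier_metrics(truth, predicted))
```

The F-score here is the balanced F1 variant; a library implementation may expose a weighting parameter or multi-class averaging that this sketch leaves out.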

Batch/Streaming/Distributed processing

The Intel® DAAL algorithms support the following computation modes:

  • Batch processing
  • Online processing
  • Distributed processing

You select the computation mode when you initialize the algorithm.

Batch Processing: All Intel® DAAL algorithms support at least the batch processing computation mode. In batch processing mode, only the compute() method of a particular algorithm class is used.

Online Processing: Some Intel® DAAL algorithms can process data sets in blocks. In online processing mode, the compute() and finalizeCompute() methods of a particular algorithm class are used. This computation mode assumes that the data arrives in blocks i = 1, 2, 3, …, nblocks. Call the compute() method each time a new block of input becomes available. When the last block of data has arrived, call the finalizeCompute() method to produce the final results. If the input data arrives asynchronously, you can use the getStatus() method of a given data source to check whether a new block of data is available to load.
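
The block-by-block pattern described above can be sketched in plain Python. This is not the Intel® DAAL API: the class `OnlineMoments` and its methods `compute` and `finalize_compute` are hypothetical stand-ins that only mirror the roles of compute() and finalizeCompute(), and the choice of mean and population variance as the computed statistics is an assumption made for illustration.

```python
# Illustrative only: a plain-Python stand-in for the online processing
# pattern. OnlineMoments is a hypothetical class, not part of Intel DAAL.

class OnlineMoments:
    """Accumulates a partial result per block, finalizes on request."""

    def __init__(self):
        self.count = 0
        self.total = 0.0
        self.total_sq = 0.0

    def compute(self, block):
        """Update the partial result with one newly arrived block."""
        for x in block:
            self.count += 1
            self.total += x
            self.total_sq += x * x

    def finalize_compute(self):
        """Turn the accumulated partial result into mean and variance."""
        if self.count == 0:
            raise ValueError("no data has been processed")
        mean = self.total / self.count
        # Population variance via E[x^2] - (E[x])^2; simple, but not the
        # most numerically stable formula for large or offset data.
        variance = self.total_sq / self.count - mean * mean
        return mean, variance


if __name__ == "__main__":
    algorithm = OnlineMoments()
    for block in ([1.0, 2.0], [3.0, 4.0], [5.0]):  # blocks i = 1..nblocks
        algorithm.compute(block)
    print(algorithm.finalize_compute())
```

The point of the design is that each block can be discarded after its compute call: only the small partial result (count and two sums) is kept in memory until the final step.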

Distributed Processing: Some Intel® DAAL algorithms can process data sets that are distributed across several devices. In distributed processing mode, the compute() and finalizeCompute() methods of a particular algorithm class are used. This computation mode assumes that the data set is split into nblocks blocks across computation nodes.
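
As a rough illustration of the distributed mode, the following plain-Python sketch has each node compute a partial result on its local block and a single finalizing step merge those partial results. It does not call Intel® DAAL and does not model any communication layer; `local_compute`, `master_finalize`, and the sample data are hypothetical names and values chosen for this example.

```python
# Illustrative only: plain Python, not the Intel DAAL API. The function
# names are hypothetical, and node-to-node transport is not modeled.

def local_compute(block):
    """Runs on one node: reduce the local block to a small partial result."""
    count = len(block)
    total = sum(block)
    total_sq = sum(x * x for x in block)
    return count, total, total_sq


def master_finalize(partial_results):
    """Runs once: merge all partial results into mean and variance."""
    count = sum(p[0] for p in partial_results)
    total = sum(p[1] for p in partial_results)
    total_sq = sum(p[2] for p in partial_results)
    if count == 0:
        raise ValueError("no data was processed on any node")
    mean = total / count
    variance = total_sq / count - mean * mean  # population variance
    return mean, variance


if __name__ == "__main__":
    # The data set split into nblocks = 3 blocks, one per node.
    node_blocks = [[2.0, 4.0], [4.0, 4.0, 5.0], [5.0, 7.0, 9.0]]
    partials = [local_compute(block) for block in node_blocks]
    print(master_finalize(partials))
```

Only the partial results travel between nodes, never the raw blocks, which is why an API of this shape can stay agnostic about the transport used to move them.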
