## Optimized for Your Hardware

For maximum calculation speed, each function is highly tuned to the instruction set, vector width, core count, and memory architecture of each target processor.

Benchmark Source: Intel Corporation.

Configuration: 2x Intel® Xeon® E5-2660 CPU @ 2.60GHz, 128 GB, Intel® DAAL 2018.

- Alternating Least Squares: Users=1M, Products=1M, Ratings=10M, Factors=100, Iterations=1; MLlib time=165.9 sec, DAAL time=40.5 sec, Gain=4.1x
- Correlation: N=1M, P=2000, Size=37 GB; MLlib time=169.2 sec, DAAL time=12.9 sec, Gain=13.1x
- PCA: n=10M, p=1000, Partitions=360, Size=75 GB; MLlib time=246.6 sec, DAAL (seq) time=17.4 sec, Gain=14.2x
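The gain figures appear to be the ratio of MLlib time to DAAL time. The short Python check below recomputes them from the timings quoted above; the dictionary layout and the `gain` helper are illustrative names, not part of any benchmark tooling.

```python
# Timings in seconds (MLlib, DAAL), copied from the benchmark configuration.
timings = {
    "Alternating Least Squares": (165.9, 40.5),
    "Correlation": (169.2, 12.9),
    "PCA": (246.6, 17.4),
}


def gain(mllib_sec, daal_sec):
    """Speedup factor, assumed here to be MLlib time divided by DAAL time."""
    return mllib_sec / daal_sec


# Round to one decimal place, matching the precision of the quoted gains.
gains = {name: round(gain(m, d), 1) for name, (m, d) in timings.items()}
for name, value in gains.items():
    print(f"{name}: {value}x")
```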

See below for further notes and disclaimers.^{1}

Intel DAAL is tuned for a broad range of Intel® processors, including Intel Atom®, Intel® Core™, Intel® Xeon®, and Intel® Xeon Phi™ processors. It targets platforms from IoT gateways to back-end servers, because applications may benefit from splitting analytics processing across several platforms.

## Optimized for Developer Productivity

Provides advanced Python*, C++, and Java* data analytics functions spanning all processing stages, pre-optimized and ready to use to reduce software development time.

- Get fast throughput with easy connections to popular analytics platforms (Hadoop* and Spark*) and data sources (SQL, non-SQL, files, in-memory)
- Batch, streaming (online) and distributed compute models are all supported to cover a range of application data set sizes and performance requirements

## Algorithms

#### Data Analysis: Characterization, Summarization, and Transformation

**Low Order Moments**

Computes the basic dataset characteristics such as sums, means, second order raw moments, variances, standard deviations, etc.
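As a rough illustration of what these characteristics are, here is a minimal plain-Python sketch. It is not the Intel DAAL API; the function name `low_order_moments` and the choice of the sample (n - 1) variance are assumptions made for this example.

```python
import math


def low_order_moments(values):
    """Return basic summary statistics for a list of at least two numbers."""
    n = len(values)
    total = sum(values)
    mean = total / n
    # Second order raw moment: the mean of the squared values.
    second_raw_moment = sum(x * x for x in values) / n
    # Sample variance (divides by n - 1); this normalization is an assumption.
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return {
        "sum": total,
        "mean": mean,
        "second_raw_moment": second_raw_moment,
        "variance": variance,
        "standard_deviation": math.sqrt(variance),
    }


moments = low_order_moments([2, 4, 4, 4, 5, 5, 7, 9])
print(moments)
```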

**Quantile**

Computes quantiles that summarize the distribution of data across equal-sized groups as defined by quantile orders.

**Correlation and Variance-Covariance Matrices**

Quantifies the pairwise statistical relationships between feature vectors.

**Cosine Distance Matrix**

Measures pairwise similarity between feature vectors using cosine distances.
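A minimal sketch of the idea in plain Python, assuming the common definition of cosine distance as one minus the cosine similarity. The function names are illustrative and do not come from Intel DAAL.

```python
import math


def cosine_distance(u, v):
    """1 - cos(angle between u and v); both vectors must be non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def cosine_distance_matrix(rows):
    """Pairwise distances between all feature vectors (one vector per row)."""
    return [[cosine_distance(u, v) for v in rows] for u in rows]


vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
distances = cosine_distance_matrix(vectors)
for row in distances:
    print([round(d, 4) for d in row])
```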

**Correlation Distance Matrix**

Measures pairwise similarity between feature vectors using correlation distances.

**Cholesky Decomposition**

Decomposes a symmetric positive-definite matrix into a product of a lower triangular matrix and its transpose. This decomposition is a basic operation used in solving linear systems, non-linear optimization, Kalman filtering, etc.
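The decomposition can be sketched with the textbook Cholesky-Banachiewicz recurrence. This is an unoptimized plain-Python illustration, not the library's implementation, and `cholesky` is a name chosen for this example.

```python
import math


def cholesky(a):
    """Return lower triangular L with L * L^T == a.

    Assumes `a` is a symmetric positive-definite matrix given as a list of rows.
    """
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Dot product of the already computed parts of rows i and j.
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(a[i][i] - partial)
            else:
                lower[i][j] = (a[i][j] - partial) / lower[j][j]
    return lower


matrix = [[4.0, 12.0, -16.0], [12.0, 37.0, -43.0], [-16.0, -43.0, 98.0]]
factor = cholesky(matrix)
for row in factor:
    print(row)
```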

**QR Decomposition**

Decomposes a general matrix into a product of an orthogonal matrix and an upper triangular matrix. This decomposition is used in solving linear inverse and least squares problems. It is also a fundamental operation in finding eigenvalues and eigenvectors.

**Singular Value Decomposition (SVD)**

Decomposes a matrix into a product of a matrix of left singular vectors, a diagonal matrix of singular values, and the transpose of a matrix of right singular vectors. It is the basis of principal component analysis (PCA), solving linear inverse problems, and data fitting.

**Principal Component Analysis (PCA)**

Reduces the dimensionality of data by transforming input feature vectors into a new set of principal components that are orthogonal to each other.

**K-Means**

Partitions a dataset into clusters of similar data points. Each cluster is represented by a centroid, which is the mean of all data points in the cluster.
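One common way to compute such a partition is Lloyd's algorithm, which alternates between assigning points to the nearest centroid and recomputing each centroid as the mean of its members. The sketch below is a plain-Python illustration under simplifying assumptions (centroids seeded with the first `k` points, a fixed iteration count); it does not reflect how Intel DAAL initializes or iterates.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=10):
    """Return (centroids, clusters) after a fixed number of Lloyd iterations."""
    # Assumption: seed the centroids with the first k points.
    centroids = [list(p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for c, members in enumerate(clusters):
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, clusters


data = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
centroids, clusters = kmeans(data, k=2)
print(centroids)
print(clusters)
```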

**Expectation-Maximization**

Finds maximum-likelihood estimates of model parameters. It is used for the Gaussian Mixture Model as a clustering method. It can also be used in non-linear dimensionality reduction, missing value problems, etc.

**Outlier Detection**

Identifies observations that are abnormally distant from other observations. An entire feature vector (multivariate) or a single feature value (univariate) can be considered in determining if the corresponding observation is an outlier.
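For the univariate case, one simple criterion is to flag values whose distance from the mean exceeds a chosen number of standard deviations. The sketch below assumes that z-score rule and a caller-supplied threshold; both are illustrative choices, not a description of the library's exact test.

```python
import statistics


def univariate_outliers(values, threshold=3.0):
    """Flag each value whose absolute z-score exceeds `threshold`."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # sample standard deviation
    return [abs(x - mean) / stdev > threshold for x in values]


readings = [10, 11, 9, 10, 12, 10, 11, 50]
# A lower threshold is used here because the sample is very small.
flags = univariate_outliers(readings, threshold=2.0)
print(flags)
```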

**Association Rules**

Discovers a relationship between variables with a certain level of confidence.
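The "level of confidence" of a rule is conventionally defined from itemset support. The following plain-Python sketch computes both measures for a toy set of transactions; the data and function names are made up for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    wanted = set(itemset)
    hits = sum(1 for t in transactions if wanted <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
print(rule_support, rule_confidence)
```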

**Linear and Radial Basis Function Kernel Functions**

Map data onto a higher-dimensional space.
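In practice a kernel function returns the inner product of two vectors in that space without constructing the mapping explicitly. A minimal sketch of the two kernels, assuming the standard textbook forms (a plain dot product, and a Gaussian with width `sigma`); the parameterization in any particular library may differ.

```python
import math


def linear_kernel(x, y):
    """Plain dot product of x and y."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


print(linear_kernel([1, 2], [3, 4]))
print(rbf_kernel([1.0, 0.0], [0.0, 0.0]))
```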

**Quality Metrics**

Compute a set of numeric values to characterize quantitative properties of the results returned by analytical algorithms. These metrics include a confusion matrix, accuracy, precision, recall, F-score, etc.
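For a binary classifier, these metrics all derive from the four cells of the confusion matrix. The sketch below shows the standard definitions in plain Python; `binary_metrics` is a hypothetical helper written for this example.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Confusion-matrix counts and derived metrics for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    # Guard against empty denominators by reporting 0.0 in that case.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return {
        "confusion_matrix": [[tp, fn], [fp, tn]],
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


metrics = binary_metrics([1, 0, 1, 1, 0, 1, 0, 0], [1, 0, 0, 1, 0, 1, 1, 0])
print(metrics)
```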

#### Machine Learning: Regression, Classification, and More

**Neural Networks for Deep Learning**

A programming paradigm that enables a computer to learn from observational data.

**Linear and Ridge Regressions**

Models the relationship between a dependent variable and one or more explanatory variables by fitting linear equations to observed data.
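For a single explanatory variable, both fits have a closed form. The sketch below is an illustration under stated assumptions: one feature, an unpenalized intercept, and a ridge penalty added to the centered sum of squares. The name `fit_line` and its parameters are hypothetical.

```python
def fit_line(xs, ys, ridge_penalty=0.0):
    """Fit y = slope * x + intercept; ridge_penalty=0 gives ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The penalty shrinks the slope toward zero; the intercept is not penalized.
    slope = sxy / (sxx + ridge_penalty)
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
ols = fit_line(xs, ys)
ridge = fit_line(xs, ys, ridge_penalty=5.0)
print(ols, ridge)
```

With these points the least-squares fit recovers the exact line, while the ridge fit returns a flatter slope.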

**Naïve Bayes Classifier**

Splits observations into distinct classes by assigning labels. Naïve Bayes is a probabilistic classifier that assumes independence between features. Often used in text classification and medical diagnosis, it works well even when there is some level of dependence between features.

**Boosting**

Builds a strong classifier from an ensemble of weighted weak classifiers, by iteratively re-weighting according to the accuracy measured for the weak classifiers. A decision stump is provided as a weak classifier. Available boosting algorithms include AdaBoost (a binary classifier), BrownBoost (a binary classifier), and LogitBoost (a multi-class classifier).

**SVM**

Support Vector Machine is a popular binary classifier. It computes a hyperplane that separates observed feature vectors into two classes.

**Multiclass Classifier**

Builds a multi-class classifier using a binary classifier such as SVM.

**Alternating Least Squares (ALS)**

A collaborative filtering method that predicts the preferences of a user based on preference information collected from many users.

**Decision Trees**

A method commonly used in data mining that builds a model to predict the value of a target variable from several input variables. The model is a tree that goes from observations about an item (represented in the branches) to conclusions about the item's target value (represented in the leaves).

**Decision Forests**

An ensemble learning method for classification, regression, and other tasks. It operates by constructing a multitude of decision trees at training time and outputting the mode of the classes (classification) or the mean prediction (regression) of the individual trees.

**k-Nearest Neighbors (k-NN)**

A type of instance-based learning, or lazy learning, where the function is only approximated locally and all computation is deferred until classification.
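The "lazy" aspect is easy to see in code: there is no training step, and all the work happens when a query arrives. A minimal brute-force sketch in plain Python, assuming Euclidean distance and a simple majority vote; `knn_classify` is a name invented for this example.

```python
from collections import Counter


def knn_classify(training, query, k=3):
    """Label `query` by majority vote among its k nearest training points.

    `training` is a list of (feature_vector, label) pairs.
    """
    by_distance = sorted(
        training,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


training = [
    ((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
    ((8, 8), "b"), ((9, 8), "b"), ((8, 9), "b"),
]
print(knn_classify(training, (2, 2)))
print(knn_classify(training, (7, 7)))
```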