Developer Guide



For more information about algorithms implemented in the Intel® Data Analytics Acceleration Library, refer to the following publications:
Adams, Robert A., and John JF Fournier.
Sobolev spaces
. Vol. 140. Elsevier, 2003.
Rakesh Agrawal, Ramakrishnan Srikant.
Fast Algorithms for Mining Association Rules
. Proceedings of the 20th VLDB Conference Santiago, Chile, 1994.
Arthur, D., Vassilvitskii, S.
k-means++: The Advantages of Careful Seeding
. Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms. Society for Industrial and Applied Mathematics Philadelphia, PA, USA, 2007, pp. 1027-1035. Available from
B. Bahmani, B. Moseley, A. Vattani, R. Kumar, S. Vassilvitskii.
Scalable K-means++
. Proceedings of the VLDB Endowment, 2012. Available from
Ben-Gal I.
Outlier detection
. In: Maimon O. and Rokach L. (Eds.) Data Mining and Knowledge Discovery Handbook: A Complete Guide for Practitioners and Researchers, Kluwer Academic Publishers, 2005, ISBN 0-387-24435-2.
C. Biernacki, G. Celeux, and G. Govaert.
Choosing Starting Values for the EM Algorithm for Getting the Highest Likelihood in Multivariate Gaussian Mixture Models
. Computational Statistics & Data Analysis, 41, 561-575, 2003.
Nedret Billor, Ali S. Hadi, and Paul F. Velleman.
BACON: blocked adaptive computationally efficient outlier nominators
. Computational Statistics & Data Analysis, 34, 279-298, 2000.
Christopher M. Bishop.
Pattern Recognition and Machine Learning
, p. 198. Springer Science+Business Media, LLC, ISBN-10: 0-387-31073-8, 2006.
B. E. Boser, I. Guyon, and V. Vapnik.
A training algorithm for optimal margin classifiers
. Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pp: 144–152, ACM Press, 1992.
Leo Breiman.
Random Forests
. Machine Learning, Volume 45 Issue 1, pp. 5-32, 2001.
Leo Breiman, Jerome H. Friedman, Richard A. Olshen, Charles J. Stone.
Classification and Regression Trees
. Chapman & Hall, 1984.
Bro, R.; Acar, E.; Kolda, T.
Resolving the sign ambiguity in the singular value decomposition
. SANDIA Report, SAND2007-6422, Unlimited Release, October, 2007.
R. H. Byrd, S. L. Hansen, Jorge Nocedal, Y. Singer.
A Stochastic Quasi-Newton Method for Large-Scale Optimization
, 2015. arXiv:1401.7020v2 [math.OC]. Available from
T. Chen, C. Guestrin.
XGBoost: A Scalable Tree Boosting System
, KDD '16 Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Djork-Arné Clevert, Thomas Unterthiner, Sepp Hochreiter.
Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)
, 2016. arXiv:1511.07289 [cs.LG]. Available from
Defazio, Aaron, Francis Bach, and Simon Lacoste-Julien.
SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives.
Advances in neural information processing systems. 2014.
A.P.Dempster, N.M. Laird, and D.B. Rubin.
Maximum-likelihood from incomplete data via the EM algorithm
. J. Royal Statist. Soc. Ser. B., 39, 1977.
John Duchi, Elad Hazan, and Yoram Singer.
Adaptive subgradient methods for online learning and stochastic optimization
. The Journal of Machine Learning Research, 12:2121–2159, 2011.
Martin Ester, Hans-Peter Kriegel, Jörg Sander, and Xiaowei Xu.
A density-based algorithm for discovering clusters in large spatial databases with noise
. In Proceedings of the 2nd ACM International Conference on Knowledge Discovery and Data Mining (KDD). 226-231, 1996.
Rong-En Fan, Pai-Hsuen Chen, Chih-Jen Lin.
Working Set Selection Using Second Order Information for Training Support Vector Machines
. Journal of Machine Learning Research 6 (2005), pp: 1889–1918.
Rudolf Fleischer, Jinhui Xu.
Algorithmic Aspects in Information and Management
. 4th International conference, AAIM 2008, Shanghai, China, June 23-25, 2008. Proceedings, Springer.
Yoav Freund, Robert E. Schapire.
A Short Introduction to Boosting
. Journal of Japanese Society for Artificial Intelligence (14(5)), 771-780, 1999.
Yoav Freund.
An adaptive version of the boost by majority algorithm
. Machine Learning (43), pp. 293-318, 2001.
Friedman, Jerome H., Trevor J. Hastie and Robert Tibshirani.
Additive Logistic Regression: a Statistical View of Boosting
. 1998.
Jerome Friedman, Trevor Hastie, and Robert Tibshirani.
Additive Logistic regression: a statistical view of boosting
. The Annals of Statistics, 28(2), pp: 337-407, 2000.
Friedman, Jerome, Trevor Hastie, and Rob Tibshirani.
Regularization paths for generalized linear models via coordinate descent
. Journal of statistical software 33.1 (2010): 1.
Jerome Friedman, Trevor Hastie, Robert Tibshirani. 2017.
The Elements of Statistical Learning: Data Mining, Inference, and Prediction.
Xavier Glorot and Yoshua Bengio.
Understanding the difficulty of training deep feedforward neural networks
. International conference on artificial intelligence and statistics, 2010.
Karol Gregor, Yann LeCun.
Emergence of Complex-Like Cells in a Temporal Product Network with Local Receptive Fields
. ArXiv:1006.0448v1 [cs.NE] 2 Jun 2010.
Trevor Hastie, Robert Tibshirani, Jerome Friedman.
The Elements of Statistical Learning: Data Mining, Inference, and Prediction
. Second Edition (Springer Series in Statistics), Springer, 2009. Corr. 7th printing 2013 edition (December 23, 2011).
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun.
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
, arXiv:1502.01852v1 [cs.CV] 6 Feb 2015, available from
Arthur E. Hoerl and Robert W. Kennard.
Ridge Regression: Biased Estimation for Nonorthogonal Problems
. Technometrics, Vol. 12, No. 1 (Feb., 1970), pp. 55-67.
Chih-Wei Hsu and Chih-Jen Lin.
A Comparison of Methods for Multiclass Support Vector Machines
. IEEE Transactions on Neural Networks, Vol. 13, No. 2, pp: 415-425, 2002.
Yifan Hu, Yehuda Koren, Chris Volinsky.
Collaborative Filtering for Implicit Feedback Datasets
. ICDM'08. Eighth IEEE International Conference, 2008.
Wayne Iba, Pat Langley.
Induction of One-Level Decision Trees
. Proceedings of Ninth International Conference on Machine Learning, pp: 233-240, 1992.
Sergey Ioffe, Christian Szegedy.
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
, arXiv:1502.03167v3 [cs.LG] 2 Mar 2015, available from
Gareth James, Daniela Witten, Trevor Hastie, and Rob Tibshirani.
An Introduction to Statistical Learning with Applications in R
. Springer Series in Statistics, Springer, 2013 (Corrected at 6th printing 2015).
K. Jarrett, K. Kavukcuoglu, M. A. Ranzato, and Y. LeCun.
What is the best multi-stage architecture for object recognition?
International Conference on Computer Vision, pp. 2146-2153, IEEE, 2009.
Thorsten Joachims.
Making Large-Scale SVM Learning Practical
. Advances in Kernel Methods - Support Vector Learning, B. Schölkopf, C. Burges, and A. Smola (ed.), pp: 169 – 184, MIT Press Cambridge, MA, USA 1999.
Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton.
ImageNet Classification with Deep Convolutional Neural Networks
. Available from
Yann LeCun, Yoshua Bengio, Geoffrey E. Hinton.
Deep Learning
. Nature (521), pp. 436-444, 2015.
Li, Shengren, and Nina Amenta. "Brute-force k-nearest neighbors search on the GPU." In International Conference on Similarity Search and Applications, pp. 259-270. Springer, Cham, 2015.
Stuart P Lloyd.
Least squares quantization in PCM
. IEEE Transactions on Information Theory 1982, 28 (2), pp: 129–137.
Maitra, R.
Initializing Optimization Partitioning Algorithms
. ACM/IEEE Transactions on Computational Biology and Bioinformatics 2009, 6 (1), pp: 144-157.
Matsumoto, M., Nishimura, T.
Mersenne Twister: A 623-Dimensionally Equidistributed Uniform Pseudo-Random Number Generator
. ACM Transactions on Modeling and Computer Simulation, Vol. 8, No. 1, pp. 3-30, January 1998.
Matsumoto, M., Nishimura, T.
Dynamic Creation of Pseudorandom Number Generators
. Monte Carlo and Quasi-Monte Carlo Methods 1998, Ed. Niederreiter, H. and Spanier, J., Springer 2000, pp. 56-69, available from
Matthew D. Zeiler, Rob Fergus.
Stochastic Pooling for Regularization of Deep Convolutional Neural Networks
. 2013. Available from
Tom M. Mitchell.
Machine Learning
. McGraw-Hill Education, 1997.
Mu Li, Tong Zhang, Yuqiang Chen, Alexander J. Smola.
Efficient Mini-batch Training for Stochastic Optimization
, 2014. Available from
Md. Mostofa Ali Patwary, Nadathur Rajagopalan Satish, Narayanan Sundaram, Jialin Liu, Peter Sadowski, Evan Racah, Suren Byna, Craig Tull, Wahid Bhimji, Prabhat, Pradeep Dubey.
PANDA: Extreme Scale Parallel K-Nearest Neighbor on Distributed Architectures
, 2016. Available from
J. R. Quinlan.
Induction of Decision Trees
. Machine Learning, Volume 1 Issue 1, pp. 81-106, 1986.
J. R. Quinlan.
Simplifying decision trees
. International journal of Man-Machine Studies, Volume 27 Issue 3, pp. 221-234, 1987.
Jason D. M. Rennie, Lawrence Shih, Jaime Teevan, David R. Karger.
Tackling the Poor Assumptions of Naïve Bayes Text Classifiers
. Proceedings of the Twentieth International Conference on Machine Learning (ICML-2003), Washington DC, 2003.
David E. Rumelhart, Geoffrey E. Hinton, Ronald J. Williams.
Learning representations by back-propagating errors
. Nature (323), pp. 533-536, 1986.
Marina Sokolova, Guy Lapalme.
A systematic analysis of performance measures for classification tasks
. Information Processing and Management 45 (2009), pp. 427–437. Available from
Christian Szegedy, Alexander Toshev, Dumitru Erhan.
Scalable Object Detection Using Deep Neural Networks
. Advances in Neural Information Processing Systems, 2013.
Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, (First Edition) Addison-Wesley Longman Publishing Co., Inc. Boston, MA, USA, 2005, ISBN: 032132136. Available from:
Verma, Deepika, Namita Kakkar, and Neha Mehan. "Comparison of brute-force and KD tree algorithm." International Journal of Advanced Research in Computer and Communication Engineering 3, no. 1 (2014): 5291-5294.
Wen, Zeyi, Jiashuai Shi, Qinbin Li, Bingsheng He, and Jian Chen. ThunderSVM: A fast SVM library on GPUs and CPUs. The Journal of Machine Learning Research, 19, 1-5 (2018).
D.H.D. West.
Updating Mean and Variance Estimates: An improved method
. Communications of ACM, 22(9), pp: 532-535, 1979.
Ting-Fan Wu, Chih-Jen Lin, Ruby C. Weng.
Probability Estimates for Multi-class Classification by Pairwise Coupling
. Journal of Machine Learning Research 5, pp: 975-1005, 2004.
Zhu, Ji, Hui Zou, Saharon Rosset and Trevor J. Hastie.
Multi-class AdaBoost
. 2005.
