The Intel® Data Analytics Acceleration Library (Intel® DAAL) helps speed big data analytics by providing highly optimized algorithmic building blocks for all data analysis stages (pre-processing, transformation, analysis, modeling, validation, and decision making) in offline, streaming, and distributed analytics usages. It’s designed for use with popular data platforms including Hadoop*, Spark*, R, and Matlab* for highly efficient data access.
Intel DAAL is available for Linux*, OS X* and Windows*.
Like the Intel® Math Kernel Library (Intel® MKL), Intel DAAL is a highly optimized library of computationally intensive routines supporting Intel architecture, including Intel® Xeon® processors, Intel® Core™ processors, Intel® Atom™ processors, and Intel® Xeon Phi™ processors.
Indeed, data scientists have been using Intel MKL to help with big data problems for some time. Some algorithms in Intel DAAL, such as matrix decomposition and low-order moments, have been in Intel MKL for years. However, most of Intel MKL was designed for the case where all the data to operate on fits in memory at once. Intel DAAL can handle situations where the data is too big to fit in memory all at once, which can be referred to as having an ‘out of core’ algorithm: Intel DAAL allows data to be supplied in chunks rather than all at once. Intel DAAL is designed for use with popular data platforms including Hadoop, Spark, R, and Matlab for highly efficient data access, and it has data management built in so that applications can access data directly from various kinds of sources, including files, in-memory buffers, SQL databases, and HDFS.
Intel® DAAL supports three processing modes: batch (offline), online (streaming), and distributed.
Intel® DAAL provides a rich set of algorithms, ranging from the most basic descriptive statistics for datasets to more advanced data mining and machine learning algorithms.
The Intel DAAL team loves feedback, and encourages it! Feedback from the beta this year means Intel will be adding more customer-requested algorithms in the upcoming months. The initial release of Intel DAAL, available as of August 25, 2015, includes the following algorithms:
Intel DAAL includes C++ and Java interfaces. In order to maximize performance, all the compute kernels in Intel DAAL are implemented in C++. Java is supported via wrappers around the high-performance C++ implementation; the Java interface interacts with the C++ kernels through the JNI (Java Native Interface). Users do not need to write any JNI code themselves, as it’s included with Intel DAAL.
The performance advantages of Intel® DAAL can be substantial. A comparison of the Principal Component Analysis in Intel DAAL vs. Spark + MLlib is shown here:
The 4X – 7X result is based on this very specific benchmark. Of course, your results may vary. For instance, consider this quote from a customer:
- Ilya Ganelin, Senior Data Engineer, Capital One Data Innovation Lab
I doubt there is a "typical" speed-up to expect - so I highly recommend trying Intel DAAL with your own Big Data needs. It's free to try!
You can download an evaluation copy of Intel DAAL today.
There is a series of webinars, starting in September 2015, covering many topics related to Intel Parallel Studio XE 2016. On September 29, 2015 (9am-10am Pacific Time) there is one entitled "Faster Big Data Analytics Using New Intel® Data Analytics Acceleration Library." The webinars can be attended live, with interactive question-and-answer time, and will also be available for replay afterward.
If you want to look at the mechanics of using Intel DAAL, take a look at the Intel DAAL Code Samples posted to the Intel DAAL Forum on the Intel website. They show some integration examples, starting with basic usage in C++ and Java, and include code examples for both Apache Spark (interacting with Spark* RDDs (Resilient Distributed Datasets)) and Apache Hadoop (using DAAL functions with Hadoop MapReduce, including interacting with HDFS). The Intel DAAL Code Samples include three code samples:
Intel MKL optimizes many routines critical for machine learning and deep learning. Intel Fellow Pradeep Dubey gave an excellent talk at Intel’s Developer Forum in San Francisco (August 18-20, 2015), which he summarizes in his blog “Pushing Machine Learning to a New Level with Intel® Xeon® and Intel® Xeon Phi™ Processors.” His presentation “Technology Insight: Data Analytics and Machine Learning” covers this topic. As Pradeep notes, even though results today offer record-breaking performance, future releases of both Intel MKL and Intel DAAL will feature additional improvements for CNN/DNN.
Another presentation you may want to examine is “Accelerating Machine Learning with Intel® Tools and Libraries” created by Fred Magnotta, Zhang Zhang, and Vikram Saletore of Intel with Ilya Ganelin, Sr. Data Engineer, Capital One.
Intel DAAL is available for Linux, OS X, and Windows. An evaluation copy of Intel DAAL can be obtained by requesting an evaluation copy of Intel® Parallel Studio XE 2016. It is available for purchase worldwide as a stand-alone library, or as part of Intel® Parallel Studio XE 2016.
Intel DAAL is also available via the Community Licensing for Intel Performance Libraries. Under this option, the library is free for anyone who registers, with no royalties and no restrictions on company or project size. The community licensing program offers the current versions of Intel DAAL without access to the Online Service Center, which offers exclusive 1-on-1 support via an interactive and secure web site where you can submit questions or problems and monitor previously submitted issues. (Online Service Center access requires registration after purchase of the software, or a special qualification offered to students, educators, academic researchers, and open source contributors.) Of course, anyone can ask questions on the public Intel DAAL Forum.
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804