Intel® AI Analytics Toolkit
Achieve End-to-End Performance for AI Workloads
Accelerate Data Science & AI Pipelines
The Intel® AI Analytics Toolkit gives data scientists, AI developers, and researchers familiar Python* tools and frameworks to accelerate end-to-end data science and analytics pipelines on Intel® architectures. The components are built on oneAPI libraries for low-level compute optimizations, maximizing performance from preprocessing through machine learning.
Using this toolkit, you can:
- Deliver high-performance deep learning (DL) training on Intel® XPUs and integrate fast inference into your AI development workflow with Intel-optimized DL frameworks: TensorFlow* and PyTorch*, pretrained models, and low-precision tools.
- Achieve drop-in acceleration for data analytics and machine learning workflows with compute-intensive Python* packages: Modin*, NumPy, Numba, scikit-learn*, and XGBoost* optimized for Intel.
- Gain direct access to Intel analytics and AI optimizations to ensure that your software works together seamlessly.
Develop in the Cloud
Get what you need to build, test, and optimize your oneAPI projects for free. With an Intel® DevCloud account, you get 120 days of access to the latest Intel® hardware—CPUs, GPUs, FPGAs—and Intel oneAPI tools and frameworks. No software downloads. No configuration steps. No installations.
Download the Toolkit
In the News
An Open Road to Swift DataFrame Scaling
This podcast looks at the challenges of data preprocessing, especially time-consuming data-wrangling tasks. It discusses how Intel and OmniSci are collaborating to provide integrated solutions that improve dataframe scaling.
Machine Learning Performance Results for Deep Learning Training on a CPU
Reflecting the broad range of AI workloads, Intel submitted results for MLPerf Training v0.7 in June 2020 for three training topologies: MiniGo, DLRM, and ResNet-50 v1.5. In each case, the results demonstrated that Intel continues to raise the bar for training on general-purpose CPUs.
Optimize XGBoost Training Performance
Compare the training performance of XGBoost 1.1 on a CPU with that of third-party GPUs. Learn more about the optimizations introduced to this popular gradient-boosted trees algorithm.
Intel and Facebook Accelerate PyTorch Performance
Harnessing the new bfloat16 capability in Intel® Deep Learning Boost, the team substantially improved PyTorch performance across multiple training workloads on 3rd generation Intel® Xeon® Scalable processors.
Treatise of Medical Image Processing: COVID-19 Recognition
Read about a new proposal that uses an AI-based analytics system to detect COVID-19 from chest X-rays and CT radiographs.
Features
Optimized Deep Learning
- Leverage popular, Intel-optimized frameworks—including TensorFlow and PyTorch—to use the full power of Intel® architecture and yield high performance for training and inference.
- Expedite development by using the open source pretrained machine learning models that are optimized by Intel for best performance.
- Take advantage of automatic, accuracy-driven tuning strategies, along with additional objectives such as performance, model size, or memory footprint, using low-precision tools.
Data Analytics & Machine Learning Acceleration
- Increase machine learning model accuracy and performance with algorithms in scikit-learn and XGBoost, optimized for Intel® architectures.
- Scale out efficiently to clusters and perform distributed machine learning by using daal4py, a Python interface to Intel® oneAPI Data Analytics Library (oneDAL).
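The "drop-in" model can be sketched as follows. This is a minimal illustration, not the toolkit's documentation: it assumes the scikit-learn-intelex package (the pip-installable form of these optimizations) is present, and falls back to stock scikit-learn when it is not; the estimator code itself is unchanged either way.

```python
import numpy as np

try:
    # If the Intel extension is installed, patching swaps in oneDAL-backed
    # implementations of supported estimators; the scikit-learn API is unchanged.
    from sklearnex import patch_sklearn
    patch_sklearn()
except ImportError:
    pass  # stock scikit-learn is used when the extension is absent

from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 8))

# Identical call either way -- the acceleration is a drop-in replacement.
model = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
print(model.cluster_centers_.shape)  # (4, 8)
```

daal4py exposes the same oneDAL kernels through its own API and, with an MPI runtime, adds the distributed (multi-node) mode mentioned above; that mode is not shown here.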
High-Performance Python*
- Take advantage of the most popular and fastest-growing programming language for AI and data analytics, with underlying instruction sets optimized for Intel architectures.
- Process larger scientific data sets more quickly using drop-in performance enhancements to existing Python code.
- Achieve highly efficient multithreading, vectorization, and memory management, and scale scientific computations efficiently across a cluster.
Simplified Scaling across Multi-node DataFrames
- Seamlessly scale Pandas workflows across multiple cores and nodes with only a one-line code change using the Intel® Distribution of Modin*, an extremely lightweight parallel DataFrame library.
- Accelerate data analytics with high-performance backends, such as OmniSci.
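The one-line change is the import itself. The sketch below hedges it with a fallback so it also runs where Modin is not installed; everything after the import is ordinary pandas-style code.

```python
# The advertised one-line change: swap the pandas import for Modin's.
# Guarded here so the snippet also runs without Modin installed.
try:
    import modin.pandas as pd  # parallel, multi-core DataFrame engine
except ImportError:
    import pandas as pd  # identical API; stock single-core pandas

df = pd.DataFrame({"group": ["a", "b", "a", "b"], "value": [1, 2, 3, 4]})
totals = df.groupby("group")["value"].sum()
print(totals.to_dict())  # {'a': 4, 'b': 6}
```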
Benchmarks
These benchmarks illustrate the performance capabilities of the Intel® AI Analytics Toolkit.
What’s Included
Intel® Optimization for TensorFlow*
In collaboration with Google, Intel has optimized TensorFlow directly for Intel® architecture using the primitives of the oneAPI Deep Neural Network Library (oneDNN) to maximize performance. This package provides the latest TensorFlow binary version compiled with CPU-enabled settings (--config=mkl).
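For context, recent stock TensorFlow builds also ship oneDNN kernels behind the TF_ENABLE_ONEDNN_OPTS environment variable (in the Intel-optimized build they are on by default). A minimal sketch, guarded so it runs whether or not TensorFlow is installed:

```python
import os

# Opt into oneDNN-backed kernels; must be set before TensorFlow is imported.
# (Assumption: a TensorFlow version that reads this flag, i.e. 2.5 or later.)
os.environ.setdefault("TF_ENABLE_ONEDNN_OPTS", "1")

try:
    import tensorflow as tf
    # Ops like this matmul are dispatched to oneDNN primitives on supported CPUs.
    a = tf.random.normal((256, 256))
    b = tf.random.normal((256, 256))
    c = tf.linalg.matmul(a, b)
    print(c.shape)  # (256, 256)
except ImportError:
    print("TensorFlow is not installed; the flag is set for a future import.")
```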
Intel® Optimization for PyTorch*
In collaboration with Facebook, Intel has combined this popular deep learning framework with many Intel optimizations to provide superior performance on Intel architecture. This package provides the binary version of the latest PyTorch release for CPUs, and adds Intel extensions and bindings with the oneAPI Collective Communications Library (oneCCL) for efficient distributed training.
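A minimal sketch of the extension workflow for inference. It assumes the intel-extension-for-pytorch package; both imports are guarded so the snippet degrades to stock PyTorch, or to a no-op, when the packages are absent. The distributed oneCCL path is not shown.

```python
try:
    import torch
    model = torch.nn.Linear(16, 4).eval()
    try:
        # The Intel extension applies CPU-side kernel and layout optimizations;
        # ipex.optimize() returns a tuned copy of the model for inference.
        import intel_extension_for_pytorch as ipex
        model = ipex.optimize(model)
    except ImportError:
        pass  # stock PyTorch path when the extension is absent
    with torch.no_grad():
        out = model(torch.randn(2, 16))
    print(tuple(out.shape))  # (2, 4)
except ImportError:
    out = None  # PyTorch is not installed in this environment
```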
Model Zoo for Intel® Architecture
Access pretrained models, sample scripts, best practices, and step-by-step tutorials for many popular open-source machine learning models optimized by Intel to run on Intel® Xeon® Scalable processors.
Intel® Low Precision Optimization Tool
Get a unified, low-precision inference interface across multiple Intel-optimized deep learning frameworks with this open-source Python library.
Intel® Distribution of Modin* (Beta)
Scale data preprocessing across multiple nodes using this intelligent, distributed DataFrame library with an API identical to pandas. The library integrates with OmniSci in the backend for accelerated analytics. This component is available only via the Anaconda distribution of the toolkit. To download and install, refer to the Installation Guide.
Intel® Distribution for Python*
Achieve fast math-intensive workload performance without code changes for data science and machine learning problems.
Numerical and Scientific
- NumPy, SciPy: These popular libraries are accelerated with Intel® oneAPI Math Kernel Library (oneMKL) and provide drop-in performance enhancement to the vast ecosystem of statistics, mathematical optimizations, and many other data-centric computations.
- Numba*: This just-in-time compiler for decorated Python code uses the latest SIMD features and multicore execution to fully utilize modern CPUs. It is accelerated with Intel® oneAPI Threading Building Blocks (oneTBB).
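Both accelerations are transparent at the source level, as this sketch shows: the NumPy calls are unchanged (oneMKL does the work underneath in the Intel Distribution for Python), and Numba is a decorator away. The Numba import is guarded with a no-op fallback so the snippet also runs without it.

```python
import numpy as np

# Identical NumPy code; in the Intel Distribution for Python, dense linear
# algebra like this eigensolve is dispatched to oneMKL underneath.
a = np.random.default_rng(1).standard_normal((200, 200))
w = np.linalg.eigvalsh(a @ a.T)  # symmetric eigenvalues, a classic MKL path
print(w.shape)  # (200,)

try:
    from numba import njit
except ImportError:
    def njit(f):  # no-op fallback: plain Python execution without Numba
        return f

@njit
def ssum(x):
    total = 0.0
    for v in x:  # compiled to SIMD machine code when Numba is present
        total += v
    return total

print(ssum(np.ones(10)))  # 10.0
```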
Machine Learning
- XGBoost: This well-known machine learning package for gradient-boosted decision trees now includes seamless, drop-in acceleration for Intel architectures that significantly speeds up model training and prediction.
- scikit-learn*: This popular machine learning Python package is now prebuilt and accelerated with Intel® oneAPI Data Analytics Library (oneDAL), oneMKL, and oneTBB.
- daal4py: A Python interface to oneDAL, this package combines a scikit-learn-like, one-line API with automatic scaling over multiple compute nodes.
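As an illustration of the XGBoost side of this list: the histogram tree method is the standard XGBoost code path that CPU optimizations of this kind target, and it is selected with an ordinary parameter. A minimal sketch on synthetic data, guarded so it runs even where xgboost is not installed:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # easy, linearly separable labels

try:
    import xgboost as xgb
    # tree_method="hist" selects the histogram-based algorithm -- the same
    # training code path that the Intel CPU optimizations accelerate.
    clf = xgb.XGBClassifier(tree_method="hist", n_estimators=20)
    clf.fit(X, y)
    acc = float((clf.predict(X) == y).mean())
except ImportError:
    acc = None  # xgboost is not installed in this environment

print(acc)
```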
Complete List of Packages for the Intel® Distribution for Python*
Documentation & Code Samples
Documentation
- Installation Guides: Intel Installer | Anaconda | Docker
- Package Managers: YUM | APT | Zypper
- Get Started Guides: Linux* | Containers
- Release Notes
- Maximize TensorFlow* Performance on CPUs: Considerations and Recommendations for Inference Workloads
- Intel Distribution for Python
- oneAPI GPU Optimization Guide
Code Samples
- Get Started: TensorFlow | PyTorch | Modin | XGBoost
- End-to-End Machine Learning for Census Workload
- INT8 Quantized Inference Performance
- TensorFlow Performance Analysis
- Multi-node Training with PyTorch
- Auto Mixed Precision with Intel Distribution for PyTorch
- Distributed Linear Regression Training and Prediction
- Distributed K-Means Training and Prediction
Training
- AI Analytics Part 1: Optimize End-to-End Data Science and Machine Learning Acceleration
- AI Analytics Part 2: Enhance Deep Learning Workloads on 3rd Gen Intel® Xeon® Scalable Processors
- AI Analytics Part 3: Walk through the Steps to Optimize End-to-End Machine Learning Workflows
- Intel® Optimization for TensorFlow*: Tips & Tricks for AI & HPC Convergence
- Achieve High-Performance Scaling for End-to-End Machine Learning and Data Analytics Workflows
Specifications
Processors:
- Intel® Xeon® processors
- Intel® Xeon® Scalable processors
- Intel® Core™ processors
Language:
- Python
Operating systems:
- Linux*
Development environments:
- Compatible with compilers from Intel and others that follow established language standards
- Linux: Eclipse* IDE
Distributed environments:
- MPI (MPICH-based, OpenMPI)
Support varies by tool. For details, see the system requirements.
Get Help
Your success is our success. Access these support resources when you need assistance.
- Intel AI Analytics Toolkit Support Forum
- Deep Learning Frameworks Support Forum
- Machine Learning and Data Analytics Support Forum
For additional help, see our general oneAPI Support.
Product and Performance Information
Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.