Intel® Advisor
Design Code for Efficient Vectorization, Threading, Memory Usage, and Accelerator Offloading
Design Software for Today's and Future Hardware
Intel® Advisor is a design and analysis tool for achieving high application performance through efficient threading, vectorization, memory use, and GPU offloading on current and future Intel® hardware.
The tool supports C, C++, Fortran, Data Parallel C++ (DPC++), OpenMP*, and Python*. It has the following features:
- Offload Advisor: Get your code ready for efficient GPU offload even before you have the hardware.
- Automated Roofline Analysis: See performance headroom against hardware limitations and get insights for an effective optimization roadmap.
- Vectorization Advisor: Enable more vector parallelism and get guidance to improve its efficiency.
- Threading Advisor: Model, tune, and test threading design options.
- Flow Graph Analyzer: Create, visualize, and analyze task and dependency computation graphs for heterogeneous algorithms.
Develop in the Cloud
Get what you need to build, test, and optimize your oneAPI projects for free. With an Intel® DevCloud account, you get 120 days of access to the latest Intel® hardware—CPUs, GPUs, FPGAs—and Intel oneAPI tools and frameworks. No software downloads. No configuration steps. No installations.
Download the Toolkit
Intel Advisor is included as part of the Intel® oneAPI Base Toolkit.
Features
Efficiently Offload Your Code to GPUs
Use Offload Advisor to understand whether your code would benefit from GPU porting; an illustrative offload candidate appears below.
- Identify offload opportunities where it pays off the most.
- Quantify the potential performance speedup from GPU offloading.
- Locate bottlenecks and estimate the potential performance gain from fixing each.
- Estimate data-transfer costs and get guidance on how to optimize data transfer.
Cookbook: Identify Code Regions to Offload to GPU and Visualize GPU Usage
Cookbook: Model C++ Application Performance on a Target GPU
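The loops Offload Advisor typically flags as profitable candidates are compute-intensive regions that reuse their data many times, so the arithmetic work outweighs the cost of moving data to the device. The sketch below is illustrative only; the function name, grid layout, and sizes are hypothetical.

```cpp
// Illustrative offload candidate (hypothetical names and sizes): a stencil
// sweep that revisits the same grid many times, so compute grows faster than
// the data that would have to be transferred to the GPU.
#include <vector>

void relax(std::vector<float>& grid, int n, int iterations) {
    for (int it = 0; it < iterations; ++it) {            // heavy reuse of the same data
        for (int i = 1; i < n - 1; ++i) {
            for (int j = 1; j < n - 1; ++j) {
                grid[i * n + j] = 0.25f * (grid[(i - 1) * n + j] + grid[(i + 1) * n + j] +
                                           grid[i * n + j - 1] + grid[i * n + j + 1]);
            }
        }
    }
}
```

A region like this tends to rank high in the projected speedup report because the data-transfer cost is amortized over many passes, whereas a single sweep over a large array often does not pay off.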
Optimize for Memory and Compute
Automated Roofline Analysis provides an intuitive visual representation of application performance against hardware-imposed limitations, such as memory bandwidth and compute capacity.
With automated roofline analysis, you can:
- See performance headroom against hardware limitations
- Get insights into an effective optimization roadmap
- Identify high-impact optimization opportunities
- Detect and prioritize bottlenecks by potential performance gain and understand their likely causes, such as whether a loop is memory bound or compute bound
- Pinpoint exact memory bottlenecks (L1, L2, L3, or DRAM)
- Visualize optimization progress
Get Started with the Roofline Feature
CPU Roofline Analysis User Guide
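To make the model concrete, the sketch below shows the arithmetic a roofline chart is built from: a loop's arithmetic intensity (FLOPs per byte moved) places it on the horizontal axis, and its attainable performance is bounded by the lower of the compute roof and the memory roof. The peak and bandwidth figures are hypothetical placeholders, not measured values.

```cpp
// Minimal sketch of the arithmetic behind a roofline chart. Advisor measures
// FLOPs and bytes moved for each loop; their ratio (arithmetic intensity)
// places the loop on the horizontal axis, and attainable performance is the
// lower of the compute roof and bandwidth * intensity. The roof values here
// are hypothetical placeholders, not measured numbers.
#include <cstdio>

int main() {
    const double peak_gflops    = 500.0;  // hypothetical compute roof (GFLOP/s)
    const double dram_bandwidth = 50.0;   // hypothetical DRAM roof (GB/s)

    // STREAM-triad-like loop a[i] = b[i] + s * c[i]:
    // 2 FLOPs and 24 bytes (three doubles) per iteration.
    const double intensity  = 2.0 / 24.0;                       // FLOP/byte
    const double attainable = intensity * dram_bandwidth < peak_gflops
                                  ? intensity * dram_bandwidth  // memory bound
                                  : peak_gflops;                // compute bound

    std::printf("arithmetic intensity: %.3f FLOP/byte\n", intensity);
    std::printf("attainable performance: %.1f GFLOP/s\n", attainable);
    return 0;
}
```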
Optimize Vectorization for Better Performance
Vectorization applies Single Instruction Multiple Data (SIMD) instructions to multiple data objects in parallel within a single CPU core. This can greatly increase performance by reducing loop overhead and making better use of the multiple math units in each core. A small example follows the list below.
- Find loops that will benefit from better vectorization.
- Identify where it is safe to force compiler vectorization.
- Pinpoint memory-access issues that may cause slowdowns.
- Get actionable, code-centric guidance to improve vectorization efficiency.
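As a minimal illustration, the loop below is the kind of unit-stride, dependence-free code Vectorization Advisor confirms is safe to vectorize; the `#pragma omp simd` hint (OpenMP 4.0 or later) is one way to act on that guidance. The function and variable names are illustrative.

```cpp
// Sketch of a loop Vectorization Advisor can analyze: unit-stride accesses and
// no loop-carried dependence, so groups of iterations can be processed by one
// SIMD instruction. The "#pragma omp simd" hint asserts independence so the
// compiler may vectorize even when it cannot prove safety on its own.
#include <cstddef>
#include <vector>

void saxpy(std::vector<float>& y, const std::vector<float>& x, float a) {
    #pragma omp simd
    for (std::size_t i = 0; i < y.size(); ++i) {
        y[i] += a * x[i];   // independent iterations, contiguous memory access
    }
}
```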
Model, Tune, and Test Multiple Threading Designs
Threading Advisor helps you quickly prototype multiple threading options, project scaling on larger systems, optimize faster, and implement with confidence.
- Identify issues and fix them before implementing parallelism.
- Add threading to C, C++, C#, and Fortran code.
- Prototype the performance impact of different threading designs and project scaling on systems with larger core counts, without disrupting development or implementation.
- Find and eliminate data-sharing issues during design (when they're less expensive to fix).
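A minimal sketch of the annotation-based workflow Threading Advisor supports: macros from advisor-annotate.h mark a candidate parallel site and its tasks while the program still runs serially, so the Suitability and Dependencies analyses can project scaling and flag data-sharing problems before any real threading is added. The work function here is a hypothetical placeholder, and the Advisor include directory is assumed to be on the compiler's include path.

```cpp
// Minimal sketch of Threading Advisor's annotation workflow. The macros mark a
// candidate parallel site and its tasks; the program still runs serially, and
// the Suitability / Dependencies analyses model the proposed design.
#include <advisor-annotate.h>
#include <vector>

static std::vector<double> data(10000, 1.0);

void process_row(int i) {                    // hypothetical per-iteration work
    data[i] = data[i] * 2.0 + 1.0;
}

void process_all_rows(int rows) {
    ANNOTATE_SITE_BEGIN(row_site);           // candidate parallel region
    for (int i = 0; i < rows; ++i) {
        ANNOTATE_ITERATION_TASK(row_task);   // each iteration is a candidate task
        process_row(i);
    }
    ANNOTATE_SITE_END();                     // end of the candidate region
}
```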
Create, Visualize, and Analyze Task and Dependency Computation Graphs
Flow Graph Analyzer (FGA) is a rapid visual prototyping environment for applications that can be expressed as flow graphs using Intel® Threading Building Blocks (Intel® TBB); a minimal flow graph example follows the list below.
- Construct, validate, and model application design and performance before generating Intel TBB code.
- Get insight into nested or top-level data-parallel algorithm efficiency.
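For reference, here is a minimal oneTBB flow graph of the kind FGA can display and trace. The node names and the work they do are illustrative only; older TBB releases use the tbb/flow_graph.h header.

```cpp
// A minimal oneTBB flow graph: two function nodes connected by an edge.
#include <oneapi/tbb/flow_graph.h>
#include <iostream>

int main() {
    using namespace oneapi::tbb::flow;

    graph g;
    function_node<int, int> square(g, unlimited, [](int v) { return v * v; });
    function_node<int, int> report(g, serial, [](int v) {
        std::cout << v << '\n';
        return v;
    });

    make_edge(square, report);   // dependency edge shown in FGA's graph view

    for (int i = 0; i < 4; ++i)
        square.try_put(i);       // feed inputs into the graph
    g.wait_for_all();            // wait for all node bodies to finish
    return 0;
}
```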
In addition to Intel TBB, FGA helps you to:
- Visualize and interact with DPC++ asynchronous task graphs
- Get insights into DPC++ task scheduling inefficiencies
- Speed up algorithm design and express data-parallel constructs efficiently
- Visualize and analyze OpenMP task dependence graphs for performance bottlenecks
Overview of Flow Graphs Using Intel® Threading Building Blocks (Intel® TBB)
Optimize Driving Performance with the Flow Graph Analyzer
Unravel OpenMP and Intel TBB Task Graphs with Flow Graph Analyzer
What's New in 2021?
Offload Advisor
Get your code ready for efficient GPU offload even before you have the hardware. Identify offload opportunities, quantify potential speedup, locate bottlenecks, estimate data transfer costs, and get guidance on how to optimize.
Explore the Intel Advisor Cookbook
Automated Roofline Analysis for GPUs
Visualize actual performance of GPU kernels against hardware-imposed performance limitations and get recommendations for effective memory vs. compute optimization.
Memory-Level Roofline Analysis
Pinpoint exact memory hierarchy bottlenecks (L1, L2, L3, or DRAM).
Flow Graph Analyzer Support for DPC++
Visualize asynchronous task graphs, diagnose performance issues, and get recommendations to fix them.
Intuitive User Interface
New interface workflows and toolbars incorporate Roofline Analysis for GPUs and Offload Advisor.
Intel® Iris® Xe MAX Graphics Support
Roofline analysis and Offload Advisor now support Intel's first discrete GPU.
For a complete and up-to-date list, see the release notes.
Get Started
Download
Intel Advisor is a part of the Intel® oneAPI Base Toolkit.
Try It Out
Follow the Get Started Guide and use an introductory code sample to see how Intel Advisor works.
Learn High-Performance Code Design
Browse the cookbooks for recipes to help you design and optimize high-performing code for modern computer architectures.
Documentation & Code Samples
Get Started
Code Samples
Learn how to access oneAPI code samples in a tool command line or IDE.
Training
Articles
Optimize the Performance of oneAPI Applications
A Unified User Experience with New Intuitive GUI
Gain Performance Insights Using the Python API for Intel® Advisor
How to Use Intel Advisor for Collecting and Analyzing Data on Cray* Systems
Use Intel® Advisor and Intel® VTune™ Profiler with Message Passing Interface (MPI)
Webinars and Videos
Offload Your Code from CPU to GPU… and Optimize It
Roofline: Should I Optimize for Compute, Memory, or Both?
Roofline Part 2: Fast Insights to Optimized Vectorization and Memory
OpenMP and TBB Task Graphs: Unraveling the Spaghetti with Flow Graph Analyzer
Transform Serial Code to Parallel Code Using Threading and Vectorization: Part 1 (54 min) | Part 2 (55 min)
Specifications
Processors:
- Intel® processors
GPU:
- Intel® Processor Graphics Gen9 and above
- Xe architecture
Languages:
- DPC++
- C and C++
- Fortran
- Python (for mixed native and Python code)
Operating systems:
- Windows*
- Linux*
- macOS* (viewer only)
Compilers:
- Compilers from Intel
- Microsoft* compilers
- GNU Compiler Collection (GCC)*
- Other compilers that follow the same standards
For additional details, see the system requirements.
Intel® VTune™ Profiler
Use this separate performance tool to optimize application performance, system performance, and system configuration for HPC, cloud, IoT, media, storage, and more.
- CPU, GPU, and FPGA: Tune the entire application's performance, not just the accelerated portion.
- Multilingual: Profile DPC++, C, C++, C#, Fortran, OpenCL™, Python, Google Go* programming language, Java*, Assembly, or any combination.
- System or application: Get coarse-grained system data for an extended period or detailed results mapped to source code.
- Power: Optimize performance while avoiding power- and thermal-related throttling.
Get Help
Your success is our success. Access these support resources when you need assistance.
Product and Performance Information
Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.