Intel® Advisor 2021.2
- New Source view for the Offload Modeling and GPU Roofline Insights perspectives. The Offload Modeling and GPU Roofline Insights reports now include a full-screen Source view with syntax highlighting in a separate tab. Use it to explore application source code and related metrics. For the GPU Roofline Insights perspective, the Source view also includes the Assembler view, which you can display side by side with the source. To switch to the Source view, double-click a kernel in the main report.
- New Details pane with in-depth GPU kernel analytics for the GPU Roofline Insights perspective. The GPU Roofline Regions report now includes a new Details pane, which provides in-depth execution metrics for a single kernel, such as execution time on the GPU, work size and SIMD width, a single-kernel Roofline highlighting the distance to the nearest roof (performance limit), a floating-point and integer operation summary, memory and cache bandwidth, EU occupancy, and an instruction mix summary.
- Offload Modeling:
- Data transfer estimation with data reuse on GPU. The Offload Modeling perspective introduces a new data reuse analysis, which provides more accurate estimates of data transfer costs. Data reuse analysis detects groups of regions that can reuse the same memory objects on the GPU. It also shows which kernels can benefit from data reuse and how it affects application performance. Data reuse can decrease the data transfer tax because when two or more kernels use the same memory object, it needs to be transferred only once.
- Command line use cases for each Intel Advisor perspective. Several new topics explain how to run each Intel Advisor perspective from the command line. Use these topics to understand which steps to run for each perspective, the recommended options to consider at each step, and the available ways to view the results.
- Guidance on how to check whether you need to run the Dependencies analysis for the Offload Modeling perspective. Information about loop-carried dependencies can be critical in deciding whether a loop is profitable to run on a GPU. Intel Advisor can obtain this information from different sources, including the Dependencies analysis. Because this analysis adds high overhead to your application, it is optional in the Offload Modeling workflow. A new topic, Check How Assumed Dependencies Affect Modeling, shows a recommended strategy for deciding whether you need to run the Dependencies analysis.
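The Offload Modeling steps described above can also be driven from the command line. The sketch below is illustrative only: the project directory, the application binary name, and the exact option set are assumptions, so check `advisor --help collect` in your installed version before relying on them.

```shell
# Hedged sketch of an Offload Modeling run via the advisor CLI.
# PROJECT_DIR and APP are placeholders (assumptions), not fixed names.
PROJECT_DIR=./advi_results
APP=./myapp            # placeholder for your application binary

if command -v advisor >/dev/null 2>&1; then
  # Step 1: Survey analysis collects baseline timing data.
  advisor --collect=survey --project-dir="$PROJECT_DIR" -- "$APP"
  # Step 2: Trip counts and FLOP, with data transfer simulation.
  advisor --collect=tripcounts --flop --enable-data-transfer-analysis \
          --project-dir="$PROJECT_DIR" -- "$APP"
  # Optional and high-overhead: Dependencies analysis (see the guidance topic).
  # advisor --collect=dependencies --project-dir="$PROJECT_DIR" -- "$APP"
  # Step 3: Project performance onto the modeled GPU target.
  advisor --collect=projection --project-dir="$PROJECT_DIR"
else
  echo "advisor not found in PATH; source advisor-vars.sh first"
fi
```

Skipping the Dependencies step keeps overhead low; Intel Advisor then models with assumed dependencies, which is exactly the case the new guidance topic helps you evaluate.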
Intel® Advisor 2021.1
- Data Parallel C++ (DPC++):
- Implemented support for Data Parallel C++ (DPC++) code performance profiling on CPU and GPU targets.
- Implemented support for oneAPI Level Zero specification for DPC++ applications.
- Introduced a new and improved Intel Advisor user interface (UI) that includes:
To switch back to the previous UI, set the environment variable ADVISOR_EXPERIMENTAL=advixe_gui.
- New look and feel for multiple tabs and panes, for example, the Workflow pane and toolbars
- Offload Modeling and GPU Roofline workflows integrated into the GUI
- New notion of a perspective, which is a complete analysis workflow that you can customize to manage the accuracy/overhead trade-off. Each perspective collects performance data but processes and presents it differently, so that you can look at it from different points of view depending on your goal. Intel Advisor includes the Offload Modeling, GPU Roofline Insights, Vectorization and Code Insights, CPU / Memory Roofline Insights, and Threading perspectives.
- Renamed executables and environment scripts:
See the Command Line Interface for details and sample command lines. The previous command line interface and executables are still supported for backward compatibility.
- advixe-cl is renamed to advisor.
- advixe-gui is renamed to advisor-gui.
- advixe-python is renamed to advisor-python.
- advixe-vars.[c]sh and advixe-vars.bat are renamed to advisor-vars.[c]sh and advisor-vars.bat, respectively.
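The renaming in practice might look like the sketch below. The installation path is an assumption (a typical oneAPI layout); adjust it to your environment.

```shell
# Illustrative only: /opt/intel/oneapi is an assumed install root.
ONEAPI_ROOT=${ONEAPI_ROOT:-/opt/intel/oneapi}
VARS="$ONEAPI_ROOT/advisor/latest/advisor-vars.sh"

# New-style environment script (formerly advixe-vars.sh), if present:
[ -f "$VARS" ] && . "$VARS"

# New executable name; the old advixe-cl name remains available
# for backward compatibility.
if command -v advisor >/dev/null 2>&1; then
  advisor --version          # formerly: advixe-cl --version
else
  echo "advisor not on PATH; check your oneAPI installation"
fi
```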
- Offload Modeling:
- Introduced the Offload Modeling perspective (previously known as Offload Advisor), which you can use to prepare your code for efficient GPU offload even before you have the target hardware. Identify the parts of your code that can be efficiently offloaded to a target device, estimate the potential speedup, and locate bottlenecks.
- Introduced data transfer analysis as an addition to the Offload Modeling perspective. The analysis reports data transfer costs estimated for offloading to a target device, the estimated amount of memory your application uses per memory level, and hints for data transfer optimizations.
- Introduced strategies to manage kernel invocation taxes (or kernel launch taxes) when modeling performance: do not hide invocation taxes, hide all invocation taxes except the first one, or hide a portion of the invocation taxes. For more information, see Manage Invocation Taxes.
- Added support for modeling application performance for the Intel® Iris® Xe MAX graphics.
- Introduced the Memory-Level Roofline feature (previously known as Integrated Roofline, a tech preview feature). Memory-Level Roofline collects metrics for all memory levels and allows you to identify memory bottlenecks at different cache levels (L1, L2, L3, or DRAM).
- Added a limiting memory level roof to the Roofline guidance and recommendations, which improves recommendation accuracy.
- Added single-kernel Roofline guidance for all memory levels, with dots for multiple levels of the memory subsystem and limiting-roof highlighting, to the Code Analytics pane.
- Introduced the GPU Roofline Insights perspective. GPU Roofline visualizes the actual performance of GPU kernels against hardware-imposed performance limits. Use it to identify the main factor limiting your application performance and to get recommendations for effective memory vs. compute optimization. The GPU Roofline report supports float and integer data types and reports metrics for all memory levels.
- Added support for profiling GPU workloads that run on Intel® Iris® Xe MAX graphics and building a GPU Roofline for them.
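Collecting a GPU Roofline from the command line might look like the sketch below. The project directory and binary name are placeholders, and the option spelling should be verified against `advisor --help collect` for your installed version.

```shell
# Hedged sketch: GPU Roofline collection via the advisor CLI.
# PROJECT_DIR and APP are assumptions, not required names.
PROJECT_DIR=./advi_gpu_roofline
APP=./my_dpcpp_app     # placeholder for your DPC++/GPU binary

if command -v advisor >/dev/null 2>&1; then
  # The roofline shortcut runs Survey plus Trip Counts/FLOP in one pass;
  # --profile-gpu directs profiling at kernels running on the GPU.
  advisor --collect=roofline --profile-gpu \
          --project-dir="$PROJECT_DIR" -- "$APP"
  # Then open the result in the GUI: advisor-gui "$PROJECT_DIR"
else
  echo "advisor not found in PATH"
fi
```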
- Flow Graph Analyzer:
- Added rules to the Static Rule-check engine to detect issues such as unnecessary copies during buffer creation, host pointer accessor usage inside a loop, and multiple builds/compilations of the same kernel when it is invoked multiple times.
- Introduced a PDF version of the Intel Advisor User Guide. Click Download as PDF at the top of this page to use the PDF version.
- Introduced a new user guide structure that focuses on the new UI and reflects the usage flow to improve usability.