Learn how to perform advanced tuning for specific microarchitectures.
This Python* tuning demonstration uses covariance implementations built into NumPy and the Intel® Data Analytics Acceleration Library. It includes code snippets.
Get step-by-step instructions for collecting performance data for MPI and hybrid MPI-plus-threads codes in a Linux* environment. You can profile all ranks or just a subset.
See a demonstration of the Application Performance Snapshot. It offers fast ways to discover untapped performance and make the best use of your computer hardware. (20:27 min)
This video discusses the needs, advantages, and common tools and techniques for profiling Python applications. It includes a demo and code sample. (47:28 min)
On multisocket non-uniform memory access (NUMA) systems, get the best performance by controlling where memory objects are placed in the memory subsystem. (58:39 min)
Are you working with a hybrid program that just isn't performing? Find out how to give it a jolt with Intel's performance analysis tools. (43:49 min)
Where should I start to add parallelism? How scalable is my application? What sort of speed-up can I expect? This webinar answers these questions and more. (57:41 min)
Explore profiling a memory-bound linear-regression application using the General Exploration and Memory Access analyses.
Use the General Exploration analysis to check an application based on the Data Plane Development Kit (DPDK) for potential misconfiguration problems on a multisocket system.
Detect and fix frequent parallel bottlenecks of OpenMP* programs, such as imbalance on barriers and scheduling overhead.
Configure a Docker* container for Intel VTune Amplifier analysis to identify hotspots in a Java* application running in the isolated container.
Use Intel VTune Amplifier for .NET Core dynamic-code profiling to locate performance hotspots in managed code and optimize application turnaround.