• 2019
  • 09/13/2018

Intel® VTune™ Amplifier 2019
  • New Hotspots analysis, combining the former Basic Hotspots and Advanced Hotspots analysis configurations, that provides a quick understanding of the application performance hotspots and further analysis steps. By default, the Hotspots analysis uses the user-mode sampling collection mode, but you can enable the lower-overhead hardware event-based sampling mode, which requires the sampling driver to be installed.
  • New Threading analysis combining and replacing the former Concurrency and Locks and Waits analysis types
  • New Intel VTune Amplifier Platform Profiler tool that provides low-overhead, system-wide analysis and insights into overall system configuration performance and behavior. Use the tool to:
    • Identify bottlenecks by monitoring over- or under-utilized subsystems and buses (CPU, storage, memory, PCIe, and network interfaces) and platform-level imbalances
    • Understand a system topology using diagrams annotated with performance data
    • Capture average-case and transient behaviors for data-center applications
  • Microarchitecture analysis improvements:
    • Microarchitecture Exploration (formerly known as General Exploration) analysis configuration split to provide either a lightweight summary analysis or full detailed analysis with all levels of PMU metrics
    • Microarchitecture Exploration analysis view extended with the hardware metric representation that helps easily identify bottlenecks in the hardware usage and benefit from quick insights
  • HPC workload profiling improvements:
    • CPU Utilization metric refined to differentiate the utilization on logical vs. physical cores, which is particularly important for HPC applications running on processors in the Intel® Xeon® processor family
    • Intel® Omni-Path Architecture Interconnect Bandwidth and Packet rate metrics added to HPC Performance Characterization analysis to identify performance bottlenecks caused by interconnect limits
    • HPC Performance Characterization analysis enriched with a thread affinity report that helps analyze CPU utilization or memory access issues of multithreaded and hybrid MPI and OpenMP* applications
  • GPU Compute/Media Hotspots analysis (formerly known as GPU Hotspots) on Linux updated to use Intel Metric Discovery API library for GPU metric collection, which involves support for kernel 4.14 and higher
  • Input and Output analysis on Linux* extended to profile DPDK and SPDK IO API. Use this data to correlate CPU activity with the network data plane utilization, visualize PCIe bandwidth utilization per NIC, estimate UPI bandwidth on multi-socket systems, and identify bottlenecks.
  • Containerization support improvements:
  • Managed runtime analysis improvements:
    • Extended JIT profiling for server-side applications running on the LLVM* or HHVM* PHP servers to support the event-based sampling analysis in the attach mode
    • Extended Java* code analysis with support for OpenJDK* 9 and Oracle* Java SE Development Kit 9
    • Improved source code analysis for .NET* Core applications running on Linux and Windows systems
  • Analysis on embedded platforms and accelerators:
    • New CPU/FPGA Interaction analysis (PREVIEW) to assess the balance between the CPU and FPGA on systems with a discrete Intel® Arria® 10 FPGA running OpenCL™ applications
    • New GPU Rendering analysis (PREVIEW) for CPU/GPU utilization of your code running on the Xen* virtualization platform installed on a remote embedded target
    • Support for the sampling command-line analysis on remote QNX* embedded systems via ethernet connection
      A PREVIEW FEATURE may or may not appear in a future production release. It is available for your use in the hopes that you will provide feedback on its usefulness and help determine its future. Data collected with a preview feature is not guaranteed to be backward compatible with future releases. Please send your feedback to parallel.studio.support@intel.com or to intelsystemstudio@intel.com.
  • KVM guest OS profiling extended to profile both KVM kernel and user space from the host system, which is helpful for a full-scale performance analysis of host and guest systems
  • Application Performance Snapshot improvements:
    • Added uncore-based metrics for DRAM/MCDRAM memory analysis, which helps identify whether your application is bandwidth bound
    • Added the ability to pause/resume collection with the itt API. An option was added to exclude application execution from collection from the start to the first collection resume occurrence.
    • Enabled selection of which data types are collected to reduce overhead. The choices include MPI tracing, OpenMP tracing, hardware counter based collection, or a combination of the three.
    • Exposed the CPU Utilization metric by physical cores on processors that support proper hardware events.
    • Significantly reduced MPI tracing overhead when there are a large number of ranks.
    • Enriched MPI statistics generated by the utility with information about communicators used in the application and with the ability to group and filter collective operations by communicator.
    • Improved integration with Intel® Trace Analyzer and Collector by adding the ability to generate profiling configuration files with the
    • Intel® Omni-Path Architecture Interconnect Bandwidth and Packet rate metrics added to explore MPI communication bottlenecks
    • Added an HTML-based rank-to-rank communication diagram to better visualize MPI application communication patterns
  • Quality and usability improvements:
    • Optimized product graphical interface with a simplified analysis configuration workflow providing you with pre-selected target and collection options available in the same view
    • Hardware event-based analysis supported for targets running in the Hyper-V* environment on Windows* 10 Fall Creators Update (RedStone3)
    • Default finalization mode set to Fast to minimize post-processing overhead if the number of collected samples exceeds the threshold
    • The Data of Interest type of metric used for hotspot navigation in the Source view is replaced with explicit metric selection in the grid, applied with the Use for Hotspot Navigation context menu command
    • CPU Frequency metric provided for the event-based analysis types (using the sampling driver) is improved to display more reliable data based on the P-State collection. The CPU Frequency metric is not provided for the user-mode sampling and tracing analyses and for analyses using the Perf* collector.
    • A list of supported output formats for the command line reports extended to support XML and HTML options
  • Support for new operating systems:
    • SUSE* Linux* Enterprise Server (SLES) 15
    • Red Hat* Enterprise Linux* 7.5
    • Ubuntu* 18.04
    • Fedora* 28
    • Microsoft Windows* 10 RS4
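
The logical vs. physical core distinction in the CPU Utilization metric mentioned above can be illustrated with simple arithmetic. The following Python sketch is not the formula VTune Amplifier uses internally; the function name `cpu_utilization` and the sample numbers are hypothetical, and it assumes a system with two logical cores per physical core where each thread runs on its own physical core.

```python
def cpu_utilization(cpu_time_s: float, elapsed_s: float, core_count: int) -> float:
    """Return total CPU time as a fraction of the capacity of `core_count` cores.

    Simplified illustration only (an assumption, not the product's formula).
    The result is capped at 1.0.
    """
    if elapsed_s <= 0 or core_count <= 0:
        raise ValueError("elapsed_s and core_count must be positive")
    return min(cpu_time_s / (elapsed_s * core_count), 1.0)


# Hypothetical run: 16 threads, each busy for the whole 10-second run,
# on a machine with 16 physical cores exposing 32 logical cores.
cpu_time_s = 16 * 10.0
logical = cpu_utilization(cpu_time_s, 10.0, core_count=32)
physical = cpu_utilization(cpu_time_s, 10.0, core_count=16)

print(f"logical: {logical:.0%}, physical: {physical:.0%}")
```

In this hypothetical case the same run looks half idle when measured against logical cores but fully busy when measured against physical cores, which is why the two views can lead to different tuning decisions.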

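The Hotspots sampling modes and the XML and HTML report formats listed above can also be driven from scripts. The sketch below only builds `amplxe-cl` argument lists and does not run anything. The helper names, the result directory `r001hs`, and the target `./myapp` are hypothetical, and the option spellings (`-knob sampling-mode=hw`, `-format`, `-report-output`) are given as recalled from the command-line reference, so confirm them with `amplxe-cl -help` for your installation.

```python
import shlex

# Report formats named in these notes plus the pre-existing text and CSV (assumed list).
REPORT_FORMATS = {"text", "csv", "xml", "html"}


def collect_command(app, result_dir, hardware_sampling=False):
    """Build a Hotspots collection command line as a list of arguments."""
    # "sw" = user-mode sampling (default), "hw" = hardware event-based sampling.
    mode = "hw" if hardware_sampling else "sw"
    return [
        "amplxe-cl", "-collect", "hotspots",
        "-knob", f"sampling-mode={mode}",
        "-result-dir", result_dir,
        "--", *app,
    ]


def report_command(result_dir, fmt, output_path):
    """Build a Hotspots report command line for the given output format."""
    if fmt not in REPORT_FORMATS:
        raise ValueError(f"unsupported report format: {fmt}")
    return [
        "amplxe-cl", "-report", "hotspots",
        "-result-dir", result_dir,
        "-format", fmt,
        "-report-output", output_path,
    ]


# Print the command lines instead of executing them.
print(shlex.join(collect_command(["./myapp"], "r001hs", hardware_sampling=True)))
print(shlex.join(report_command("r001hs", "html", "hotspots.html")))
```

Keeping the arguments as lists makes them safe to pass to `subprocess.run` later without shell quoting issues.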