User Guide

SPDK IO Data View

Use Intel VTune Profiler's Input and Output analysis to profile the SPDK IO API, analyze PCIe traffic, and identify IO performance issues that may be caused by ineffective accesses to remote sockets, under-utilized throughput of an SPDK device, and other factors.
Prerequisites: For successful analysis, make sure SPDK is built with the --with-vtune option.
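Before collecting data, it can help to confirm that the SPDK build was actually configured with VTune support. The sketch below assumes that the SPDK configure step records its choices in a generated makefile fragment (commonly mk/config.mk) with a line such as CONFIG_VTUNE?=y. Both the file name and the key are assumptions to verify against your SPDK version, and vtune_enabled is a hypothetical helper, not part of SPDK or VTune Profiler.

```python
import re


def vtune_enabled(config_text):
    """Return True if the build config text sets CONFIG_VTUNE to 'y'.

    Assumption: the generated SPDK build config uses make-style
    assignments such as 'CONFIG_VTUNE?=y' or 'CONFIG_VTUNE=y'.
    """
    for line in config_text.splitlines():
        # Match CONFIG_VTUNE exactly, not CONFIG_VTUNE_DIR.
        match = re.match(r"\s*CONFIG_VTUNE\s*\??=\s*(\S+)", line)
        if match:
            return match.group(1) == "y"
    return False


if __name__ == "__main__":
    # Hypothetical path; adjust to the location of your SPDK checkout.
    try:
        with open("mk/config.mk") as config_file:
            print("VTune support enabled:", vtune_enabled(config_file.read()))
    except FileNotFoundError:
        print("mk/config.mk not found; run this from the SPDK source tree")
```

If the check reports that VTune support is disabled, reconfigure and rebuild SPDK with the --with-vtune option before profiling.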
VTune Profiler helps you optimize the following SPDK usage models:
  • Application services:
    • SPDK vhost-scsi to provide optimized block storage to VMs
    • SPDK NVMe to optimize access to the locally attached storage
  • Disaggregated storage:
    • NVM Express* over Fabrics
    • iSCSI targets
For SPDK analysis, consider the following workflow:

Identify Low SPDK Throughput Utilization

Start your analysis with the Summary window, which displays overall SPDK performance statistics per executed operation type. Expand an operation block to identify potential IO performance imbalance among SSDs.
Explore the SPDK Throughput histogram to understand how long your workload has been under-utilizing SPDK throughput per device.
Then, switch to the Bottom-up window and filter the Timeline view by the Low SPDK Throughput Utilization metric to see the correlation among throughput under-utilization, SPDK IO API calls, and the PCIe traffic breakdown per physical device.
Locate an area of recession (Low SPDK Throughput markers with a high duration) on the timeline and zoom in to see performance changes for IO communications (for example, drops for SPDK operations). Right-click and select the Filter In by Selection menu option.
When the Bottom-up view is filtered in, you can apply the Function grouping to the grid and identify the functions executed in the selected time frame. Double-click the function with the highest CPU time value to open the source view and analyze the code.
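To make the idea behind the throughput histogram concrete, the following sketch accumulates the time one device spends in each throughput range and totals the time spent below a utilization threshold. It is an illustration only, not how VTune Profiler computes its metrics: the sample format, the function names, and the 50% threshold are all assumptions chosen for the example.

```python
def throughput_histogram(samples, bin_width):
    """Accumulate elapsed time per throughput bin.

    samples: iterable of (duration_seconds, throughput) pairs for one device.
    Returns a dict mapping each bin's lower bound to the time spent in it.
    """
    histogram = {}
    for duration, throughput in samples:
        bin_start = int(throughput // bin_width) * bin_width
        histogram[bin_start] = histogram.get(bin_start, 0.0) + duration
    return histogram


def low_utilization_time(samples, max_throughput, threshold=0.5):
    """Total time spent below threshold * max_throughput (assumed cutoff)."""
    limit = threshold * max_throughput
    return sum(duration for duration, throughput in samples
               if throughput < limit)


if __name__ == "__main__":
    # Hypothetical samples: (seconds, MB/s) observed on one SPDK device.
    samples = [(1.0, 100), (2.0, 900), (0.5, 300)]
    print(throughput_histogram(samples, bin_width=500))
    print(low_utilization_time(samples, max_throughput=1000))
```

A device whose low-utilization time dominates the run is a candidate for the timeline filtering described above.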

Identify IO Misconfiguration Issues on Multi-Socket Systems

Use the Platform window to analyze whether your SPDK workload is configured properly for a multi-socket system. To do this, switch to the Package/Physical Core/Logical Core grouping on the legend pane to track IO performance per package.
An IO flow is ineffective when an SPDK device and the core consuming or producing its data belong to different packages. In that case, you see high UPI Bandwidth values, which signal heavy utilization of the interconnect.
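Outside the profiler, you can cross-check device and core placement on Linux through sysfs. The sketch below assumes that each package corresponds to one NUMA node (this may not hold with sub-NUMA clustering), that the device's node is exposed at /sys/bus/pci/devices/<BDF>/numa_node, and that each node's cores are listed in /sys/devices/system/node/node<N>/cpulist. The helper names and the PCI address in the usage section are hypothetical.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a Linux cpulist string such as '0-3,8-11' into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def is_local(core, device_node, node_cpulists):
    """Return True if the core belongs to the device's NUMA node.

    node_cpulists maps a NUMA node id to its cpulist string.
    """
    return core in parse_cpulist(node_cpulists.get(device_node, ""))


def read_device_node(bdf):
    """Read the NUMA node of a PCI device; -1 means the kernel reports none."""
    return int(Path(f"/sys/bus/pci/devices/{bdf}/numa_node").read_text())


if __name__ == "__main__":
    bdf = "0000:5e:00.0"  # hypothetical PCI address of an NVMe SSD
    core = 4              # hypothetical core running the SPDK reactor
    try:
        node = read_device_node(bdf)
        cpulist = Path(f"/sys/devices/system/node/node{node}/cpulist").read_text()
        print("local" if is_local(core, node, {node: cpulist}) else "remote")
    except FileNotFoundError:
        print("sysfs entry not found; check the address and NUMA support")
```

If the check reports a remote core, consider pinning the SPDK application to cores on the same package as the device.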

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.