OS Thread Migration
- Application: a test OpenMP* application. The application is used as a demo and is not available for download.
- Performance analysis tools: Intel® VTune™ Profiler version 2018 or newer - Hotspots analysis
- All the Cookbook recipes are scalable and can be applied to Intel VTune Amplifier 2018 and higher. Slight version-specific configuration changes are possible.
- Intel® VTune™ Amplifier has been renamed to Intel® VTune™ Profiler starting with its version for Intel® oneAPI Base Toolkit (Beta). You can still use a standalone version of the VTune Profiler, or its versions integrated into Intel Parallel Studio XE or Intel System Studio.
- Operating system: Linux*, Ubuntu* 16.04 64-bit
- CPU: Intel® Core™ i7-6700K processor
Run Advanced Hotspots Analysis
Identify Thread Migration
To identify thread migration using the GUI, select the
Expand the core nodes to see the number of software threads on each core. In general, the total number of software threads should be less than or equal to the total number of hardware threads supported by the CPU, and the threads should be distributed evenly across the cores. If any core in your result shows more software threads than expected, thread migration is occurring in your application. In the example above, 12 OpenMP* worker threads execute on core_8 instead of the expected 2 (this is an Intel® Xeon® processor supporting Intel® Hyper-Threading Technology, so each core provides 2 hardware threads). This indicates thread migration.
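The per-core check described above can be sketched in a few lines of Python. This is only an illustration of the counting logic, not something VTune provides: the `samples` list, the thread names, and the `hw_threads_per_core` parameter are all hypothetical, and the default of 2 assumes Hyper-Threading is enabled.

```python
from collections import defaultdict


def threads_per_core(samples):
    """Map each core to the set of software threads observed on it."""
    per_core = defaultdict(set)
    for thread, core in samples:
        per_core[core].add(thread)
    return dict(per_core)


def oversubscribed_cores(samples, hw_threads_per_core=2):
    """Return {core: thread_count} for cores hosting more threads than expected."""
    return {
        core: len(threads)
        for core, threads in threads_per_core(samples).items()
        if len(threads) > hw_threads_per_core
    }


# Hypothetical observations: (software thread, core it was sampled on).
samples = [
    ("omp_0", "core_0"), ("omp_1", "core_0"),
    ("omp_2", "core_8"), ("omp_3", "core_8"), ("omp_4", "core_8"),
    ("omp_2", "core_8"),  # repeated samples of the same thread count once
]

flagged = oversubscribed_cores(samples)
print(flagged)  # {'core_8': 3}
```

A core that appears in `flagged` has hosted more distinct software threads than it has hardware threads, which is the same signal as an unexpectedly long thread list under a core node in the grid.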
Select the Thread/H/W Context grouping to analyze thread migration in the Timeline pane.
Expand the thread nodes to see the number of CPUs where this thread was executed and analyze thread execution over time. In the example above, OpenMP thread #0 was executing on cpu_23 and then migrated to cpu_47.
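Reading a thread's timeline amounts to looking for points where the logical CPU changes. The sketch below shows that idea on a hand-written timeline; the timestamps are invented for illustration, and only the cpu_23 to cpu_47 move mirrors the example described above.

```python
def migration_events(timeline):
    """Return (timestamp, from_cpu, to_cpu) for each CPU change.

    `timeline` is a time-ordered list of (timestamp, cpu) samples
    for a single thread.
    """
    events = []
    for (_, prev_cpu), (ts, cpu) in zip(timeline, timeline[1:]):
        if cpu != prev_cpu:
            events.append((ts, prev_cpu, cpu))
    return events


# Hypothetical samples for one thread: (timestamp in seconds, logical CPU).
timeline = [
    (0.0, "cpu_23"),
    (0.1, "cpu_23"),
    (0.2, "cpu_47"),
    (0.3, "cpu_47"),
]

events = migration_events(timeline)
print(events)  # [(0.2, 'cpu_23', 'cpu_47')]
```

An empty result means the thread stayed on one logical CPU for the whole sampled interval.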
amplxe-cl -group-by thread,cpuid -report hotspots -r /temp/test/omp -s "H/W Context" -q | less
Thread                          H/W Context  CPU Time:Self
------------------------------  -----------  -------------
OMP Worker Thread #5 (0x3d86)   cpu_0        0.004
matmul-intel64 (0x3d52)         cpu_1        0.013
OMP Worker Thread #15 (0x3d90)  cpu_10       2.418
matmul-intel64 (0x3d52)         cpu_10       2.023
OMP Worker Thread #8 (0x3d89)   cpu_10       0.687
OMP Worker Thread #13 (0x3d8e)  cpu_10       0.097
OMP Worker Thread #6 (0x3d87)   cpu_10       0.065
OMP Worker Thread #4 (0x3d85)   cpu_10       0.059
OMP Worker Thread #1 (0x3d82)   cpu_10       0.048
OMP Worker Thread #9 (0x3d8a)   cpu_10       0.034
OMP Worker Thread #11 (0x3d8c)  cpu_10       0.009
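For longer reports it can help to post-process the text rather than read it by eye. The following Python sketch parses rows of the form shown in this report (thread name with a hexadecimal ID, H/W context, CPU time) and summarizes them both ways: which threads ran on more than one CPU, and how many threads shared each CPU. The regular expression and the helper names are assumptions based only on the excerpt above, so other VTune versions or report options may need a different pattern.

```python
import re
from collections import defaultdict

# Assumed row layout: "<thread name> (0x<tid>)  cpu_<n>  <seconds>"
ROW = re.compile(
    r"^\s*(?P<thread>.+?\(0x[0-9a-fA-F]+\))\s+"
    r"(?P<cpu>cpu_\d+)\s+(?P<time>\d+(?:\.\d+)?)\s*$"
)

REPORT = """\
Thread                          H/W Context  CPU Time:Self
------------------------------  -----------  -------------
OMP Worker Thread #5 (0x3d86)   cpu_0        0.004
matmul-intel64 (0x3d52)         cpu_1        0.013
OMP Worker Thread #15 (0x3d90)  cpu_10       2.418
matmul-intel64 (0x3d52)         cpu_10       2.023
OMP Worker Thread #8 (0x3d89)   cpu_10       0.687
OMP Worker Thread #13 (0x3d8e)  cpu_10       0.097
OMP Worker Thread #6 (0x3d87)   cpu_10       0.065
OMP Worker Thread #4 (0x3d85)   cpu_10       0.059
OMP Worker Thread #1 (0x3d82)   cpu_10       0.048
OMP Worker Thread #9 (0x3d8a)   cpu_10       0.034
OMP Worker Thread #11 (0x3d8c)  cpu_10       0.009
"""


def parse_report(text):
    """Return (thread, cpu, cpu_time) tuples; header and rule lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append((match["thread"], match["cpu"], float(match["time"])))
    return rows


def migrated_threads(rows):
    """Return {thread: sorted CPUs} for threads seen on more than one CPU."""
    cpus = defaultdict(set)
    for thread, cpu, _ in rows:
        cpus[thread].add(cpu)
    return {t: sorted(c) for t, c in cpus.items() if len(c) > 1}


def thread_count_per_cpu(rows):
    """Return {cpu: number of distinct threads that ran on it}."""
    threads = defaultdict(set)
    for thread, cpu, _ in rows:
        threads[cpu].add(thread)
    return {cpu: len(t) for cpu, t in threads.items()}


rows = parse_report(REPORT)
print(migrated_threads(rows))      # threads that appear under several CPUs
print(thread_count_per_cpu(rows))  # how crowded each logical CPU is
```

To apply this to a real result, the output of the report command could be piped in on standard input and read with `sys.stdin.read()` in place of the embedded `REPORT` string.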