Isolate Application Performance Issues on Hyper-Threading Technology-Enabled Systems


Challenge

Identify the source of performance degradations or low performance gains of applications running on systems that support Hyper-Threading Technology. Once applications have been tuned for the Pentium® 4 processor, they can be tuned for processors that support Hyper-Threading Technology as a separate process. In some cases, however, the tuning process may not yield acceptable increases in performance.


Solution

Verify that the issue is related to Hyper-Threading Technology, and then root-cause it by means of the VTune™ Performance Analyzer. This analysis follows a standard, five-step methodology:

  1. If performance is not as expected on processors with Hyper-Threading Technology, review the Intel® Pentium® 4 and Intel® Xeon® Processor Optimization Manual and the white papers on Hyper-Threading Technology [http://shareit.intel.com/cd/ids/developer/asmo-na/eng/technologies/threading/hyperthreading/index.htm] that are available on the Intel® Developer Services Web site. Use these resources to identify known Hyper-Threading Technology optimization opportunities and coding pitfalls that may still be present in the application.
  2. If performance is still not as expected, determine whether the problem is specific to Hyper-Threading Technology-enabled processors. Gather performance results from the following types of systems:

    • a single-processor system with a uni-processor kernel
    • a single-processor system with a multi-processor kernel
    • a single-processor system with Hyper-Threading Technology enabled and a multi-processor kernel
    • a dual-processor system with a multi-processor kernel.

    Compare these results to verify that the performance degradation is not a multi-processor issue: the dual Pentium 4 processor system should perform as expected and exceed a single Pentium 4 processor system without Hyper-Threading Technology enabled. If it does not, or if the performance gain is very low, the tuning effort should follow the standard SMP tuning methodology.
  3. Next, verify that a single Pentium 4 processor with a multi-processor kernel degrades by less than 5% versus a single Pentium 4 processor with a uni-processor kernel. Note that single-threaded (or effectively single-threaded) applications may actually degrade because of multi-processor kernel overhead that uni-processor kernels do not incur.
  4. Finally, verify that performance on Hyper-Threading Technology-enabled processors degrades versus a single Pentium 4 processor with a uni-processor kernel.
  5. Assuming reasonable SMP performance but degraded performance on Hyper-Threading Technology-enabled processors, the next step is to root-cause the performance degradation using the VTune Performance Analyzer.
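
The comparisons in steps 2 through 4 can be sketched as a small decision routine. This is only an illustration, not part of the Intel methodology itself: the names `Results` and `classify` are hypothetical, throughput is assumed to be "higher is better", the dual-processor result is assumed to be compared against the single-processor multi-processor-kernel result, and the 10% minimum SMP gain is an assumed stand-in for "very low". Only the 5% kernel-overhead limit comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class Results:
    """Throughput (higher is better) measured on each test configuration."""
    up_kernel: float       # single processor, uni-processor kernel
    mp_kernel: float       # single processor, multi-processor kernel
    ht_mp_kernel: float    # single processor, HT enabled, multi-processor kernel
    dual_mp_kernel: float  # dual processor, multi-processor kernel


def classify(r, min_smp_gain=0.10, max_mp_overhead=0.05):
    """Return which kind of issue the four measurements point to.

    min_smp_gain is an assumed threshold for a "very low" dual-processor
    gain; max_mp_overhead is the 5% limit quoted in step 3.
    """
    # Step 2: the dual-processor system must clearly beat the single one.
    smp_gain = r.dual_mp_kernel / r.mp_kernel - 1.0
    if smp_gain < min_smp_gain:
        return "smp"  # follow the standard SMP tuning methodology

    # Step 3: the multi-processor kernel alone should cost less than 5%.
    mp_overhead = 1.0 - r.mp_kernel / r.up_kernel
    if mp_overhead >= max_mp_overhead:
        return "mp-kernel-overhead"

    # Step 4: does enabling Hyper-Threading Technology degrade performance?
    if r.ht_mp_kernel < r.up_kernel:
        return "hyper-threading"  # continue to step 5 with the analyzer

    return "none"


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    print(classify(Results(100.0, 98.0, 90.0, 180.0)))
```

A result of "hyper-threading" corresponds to the situation step 5 describes: reasonable SMP performance but degraded performance with Hyper-Threading Technology enabled.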

 

Use the VTune Performance Analyzer tuning assistant feature, sometimes referred to as Automatic Hotspot Analysis, for Hyper-Threading Technology-enabled processors. Its tuning methodology and data-collection support indicate which events are significant to collect initially and what event-ratio values are reasonable to expect. In addition to the data collected for Hyper-Threading Technology-enabled processors, collect the same data on single Pentium 4 processor systems without Hyper-Threading Technology enabled and on dual Pentium 4 processor systems.

Comparing the time in clock ticks between systems can narrow the scope of where processor time is being spent. The remaining task is to use the other recommended processor events to understand what causes the difference in clock ticks between the various platforms.
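
As a rough illustration of the clock-tick comparison, the sketch below ranks functions by how many more clock ticks they consume on the Hyper-Threading Technology-enabled system than on the baseline. It assumes the per-function clock-tick totals have already been copied out of the analyzer into plain dictionaries; the function name `rank_clocktick_growth`, the sample function names, and the numbers are all hypothetical, and no VTune export format is implied.

```python
def rank_clocktick_growth(baseline, ht):
    """Rank functions by extra clock ticks on the HT-enabled system.

    baseline, ht: dicts mapping function name -> clock ticks sampled on
    each system. Returns (name, extra_ticks, ratio) tuples, largest
    increase first. Functions with no baseline ticks are skipped.
    """
    rows = []
    for name, base_ticks in baseline.items():
        if base_ticks <= 0:
            continue  # cannot form a meaningful ratio
        ht_ticks = ht.get(name, 0)
        rows.append((name, ht_ticks - base_ticks, ht_ticks / base_ticks))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    baseline = {"parse": 1000, "render": 4000, "io_wait": 500}
    ht = {"parse": 1100, "render": 6000, "io_wait": 450}
    for name, extra, ratio in rank_clocktick_growth(baseline, ht):
        print(f"{name:10s} {extra:+6d} ticks  x{ratio:.2f}")
```

The functions at the top of the ranking are the natural place to start looking at the other recommended processor events.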


Source

Threading Methodology: Principles and Practices

 

