analysis

How to Analyze the Performance of Parallel Codes 101: A Case Study with Open|SpeedShop

Performance analysis is an essential step in the development of HPC codes, and it will become even more important as machines and applications grow more complex. Many tools exist to help with this analysis, but users are too often left on their own when it comes to interpreting the results. In this tutorial we provide a practical road map for the performance analysis of HPC codes, with step-by-step advice on how to approach the optimization of a code and how to investigate observed performance bottlenecks in detail.

Structure the Design Phase of Threaded Application Development Cycle


Challenge

Develop a methodology for the design phase of the development cycle. Regions identified by the analysis phase are examined during the design phase to determine changes that must be made to accommodate a threading paradigm.

  • analysis
  • Multi-thread apps for Multi-Core
  • How to thread?
  • Parallel Computing

Apply Data Decomposition to Create Threaded Code


Challenge

Implement data decomposition on a serial function in order to produce a threaded version. The threaded version creates threads, each performing individual pieces of a computationally intensive operation.
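The challenge above can be sketched in a few lines. This is a minimal illustration, not the article's own code: the function names (serial_sum_of_squares, split_into_chunks, threaded_sum_of_squares) and the sum-of-squares workload are hypothetical stand-ins for "a computationally intensive operation". The sketch shows only the structure of data decomposition; in CPython the global interpreter lock limits the speedup that threads give to pure-Python arithmetic.

```python
from concurrent.futures import ThreadPoolExecutor


def serial_sum_of_squares(values):
    """Serial baseline: one loop over the whole input."""
    return sum(v * v for v in values)


def split_into_chunks(values, num_chunks):
    """Split values into num_chunks contiguous slices of near-equal size."""
    chunk_size, remainder = divmod(len(values), num_chunks)
    chunks = []
    start = 0
    for i in range(num_chunks):
        # The first `remainder` chunks take one extra element each.
        end = start + chunk_size + (1 if i < remainder else 0)
        chunks.append(values[start:end])
        start = end
    return chunks


def threaded_sum_of_squares(values, num_threads=4):
    """Data decomposition: every thread runs the same operation on its own slice."""
    chunks = split_into_chunks(values, num_threads)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Each worker computes a partial result for one chunk.
        partial_sums = list(pool.map(serial_sum_of_squares, chunks))
    # Combine the partial results into the final answer.
    return sum(partial_sums)
```

The serial function is reused unchanged as the per-thread work item, which keeps the threaded version easy to check against the serial one: both should return the same value for the same input.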

Choose the Right Threading Model (Task-Parallel or Data-Parallel Threading)


Challenge

Choose task-level or data-parallel threading for various parts of an application. Choosing the right threading method minimizes the amount of time spent modifying, debugging, and tuning threaded code.


Solution

Describe your application (or an individual operation in that application) in terms of one of two models, task-parallel or data-parallel, whichever fits the particular job better.
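To make the distinction between the two models concrete, here is a small sketch under stated assumptions: the helper names (compute_mean, find_extremes, task_parallel, data_parallel) are hypothetical and chosen only for illustration. In the task-parallel version, different functions run concurrently on the same data; in the data-parallel version, the same function runs concurrently on different slices of the data.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_mean(samples):
    """One independent task: the arithmetic mean."""
    return sum(samples) / len(samples)


def find_extremes(samples):
    """Another independent task: the smallest and largest value."""
    return min(samples), max(samples)


def task_parallel(samples):
    """Task-level threading: different functions, same data."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        mean_future = pool.submit(compute_mean, samples)
        extremes_future = pool.submit(find_extremes, samples)
        return mean_future.result(), extremes_future.result()


def data_parallel(samples, num_threads=2):
    """Data-parallel threading: same function, different slices.

    Assumes a non-empty list of samples.
    """
    step = -(-len(samples) // num_threads)  # ceiling division
    slices = [samples[i:i + step] for i in range(0, len(samples), step)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(find_extremes, slices))
```

As a rough rule of thumb, an operation made of several unrelated steps tends to fit the task-level shape, while one large loop over uniform data tends to fit the data-parallel shape.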
