1. Optimize MPI communications
|
- Prepared for the application analysis.
- Used the Event Timeline, Function Profile, Message Profile and Imbalance Diagram to detect serialization that slows down the application.
- Removed serialization by replacing the problem-causing function.
- Compared the original trace file with the trace file of the revised application.
- Analyzed the improved communications in the Event Timeline.
|
- Ungroup MPI functions to identify which functions slow down the application.
- Use the Function Profile and Message Profile charts to see how much time is spent in MPI.
- Generate the idealized trace and compare it with the original trace to gain insight into how your application behaves under ideal circumstances and to isolate problematic interactions.
- In real-world cases, you may need to formulate a hypothesis about how the program should behave and check it using the most suitable chart.
|
2. Improve intra-process performance
|
- Built the target and launched the Basic Hotspots data collection using the interoperability features of the tools.
- Analyzed function calls and CPU time spent in each program unit of your application and identified the function that took the most CPU time.
- Found a possible way to resolve the issue and optimize the source code.
|
- Start analyzing the performance of your application from the Summary window to explore the performance metrics for the whole application.
- Then, move to the Bottom-up window to analyze the performance per function. Focus on the hotspots, that is, the functions that took the most CPU time. By default, they are located at the top of the table.
- Double-click the hotspot function in the Bottom-up pane or Call Stack pane to open its source code.
|
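The hotspot workflow in step 2 (rank functions by CPU time, then inspect the one at the top) can be illustrated with a minimal sketch. It uses Python's standard `cProfile` and `pstats` modules as a stand-in for the profiler used in the tutorial, so it is only an analogy and not that tool's interface. The functions `cheap_step`, `hot_kernel`, `workload`, and `rank_by_self_time` are hypothetical names introduced for this example.

```python
import cProfile
import pstats


def cheap_step(n):
    """Light work: a single pass over n integers."""
    return sum(range(n))


def hot_kernel(n):
    """Deliberately expensive loop so that it dominates the profile."""
    total = 0
    for i in range(n):
        total += (i * i) % 7
    return total


def workload():
    """Hypothetical application: one cheap call and one expensive call."""
    cheap_step(1_000)
    hot_kernel(200_000)


def rank_by_self_time(func):
    """Profile func and return (function_name, self_time) pairs, most expensive first."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    stats = pstats.Stats(profiler)
    # stats.stats maps (file, line, name) -> (cc, nc, tottime, cumtime, callers);
    # tottime (index 2) is time spent in the function itself, excluding callees.
    rows = [(key[2], value[2]) for key, value in stats.stats.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


ranking = rank_by_self_time(workload)
hotspot_name, hotspot_seconds = ranking[0]
print(f"hotspot: {hotspot_name} ({hotspot_seconds:.4f} s self time)")
```

Sorting by self time places the most expensive function first, which mirrors how a bottom-up view lists hotspots at the top of the table. The absolute timings depend on the machine, so only the ordering is meaningful here.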