Summary

You have completed the Analyzing Application with Intel® Trace Analyzer and Collector and Intel® VTune™ Amplifier tutorial. The following summarizes the key points to remember when using these tools to analyze and tune your application.

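For each step, the Tutorial Recap lists what you did in the tutorial, and the Key Tutorial Take-aways list what to remember when you analyze your own application.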

1. Optimize MPI communications

Tutorial Recap

  • Prepared for the application analysis.
  • Used the Event Timeline, Function Profile, Message Profile, and Imbalance Diagram to detect serialization that slows down the application.
  • Removed serialization by replacing the problem-causing function.
  • Compared the original trace file with the trace file of the revised application.
  • Analyzed the improved communications in the Event Timeline.

Key Tutorial Take-aways

  • Ungroup MPI functions to identify which functions slow down the application.
  • Use the Function Profile and Message Profile charts to see how much time is spent in MPI.
  • Generate the idealized trace and compare it with the original trace to see how your application behaves under ideal circumstances and to isolate problematic interactions.
  • In real-world cases, you may need to formulate a hypothesis about how the program should behave and check it using the most suitable chart.
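
The idea of forming a hypothesis and checking it can be illustrated outside the tools. The following is a minimal Python sketch, not Intel® Trace Analyzer and Collector output: the per-rank timings are hypothetical numbers, and `mpi_fraction` and `looks_serialized` are made-up helper names. It assumes that serialization shows up as MPI wait time that grows steadily from one rank to the next, which is only one possible hypothesis you might check against the Event Timeline.

```python
def mpi_fraction(compute_s, mpi_s):
    """Share of a rank's run time spent inside MPI calls."""
    return mpi_s / (compute_s + mpi_s)


def looks_serialized(mpi_wait_s, min_step_s=0.01):
    """Return True if MPI wait time rises or falls steadily across ranks."""
    steps = [b - a for a, b in zip(mpi_wait_s, mpi_wait_s[1:])]
    if not steps:
        return False
    rising = all(s > min_step_s for s in steps)
    falling = all(s < -min_step_s for s in steps)
    return rising or falling


# Hypothetical per-rank timings in seconds (rank 0 first); not measured data.
compute_s = [2.0, 2.0, 2.0, 2.0]
original_wait_s = [0.5, 1.0, 1.5, 2.0]   # staircase: each rank waits on its neighbor
revised_wait_s = [0.5, 0.52, 0.49, 0.5]  # roughly even after the fix

for label, waits in (("original", original_wait_s), ("revised", revised_wait_s)):
    shares = [round(mpi_fraction(c, w), 2) for c, w in zip(compute_s, waits)]
    print(label, "MPI share per rank:", shares, "serialized:", looks_serialized(waits))
```

A check like this only flags a candidate pattern. Confirming it still means looking at the actual message flow in the charts.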

2. Improve intra-process performance

Tutorial Recap

  • Built the target and launched the Basic Hotspots data collection using the interoperability features of the tools.
  • Analyzed function calls and the CPU time spent in each program unit of your application, and identified the function that took the most CPU time.
  • Found a possible way to resolve the issue and optimize the source code.

Key Tutorial Take-aways

  • Start analyzing the performance of your application from the Summary window to explore the performance metrics for the whole application.
  • Then, move to the Bottom-up window to analyze the performance per function. Focus on the hotspots, the functions that took the most CPU time. By default, they are located at the top of the table.
  • Double-click the hotspot function in the Bottom-up pane or Call Stack pane to open its source code.
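
To make the bottom-up idea concrete, here is a minimal Python sketch of how a hotspot ranking can be derived from sampled call stacks. It is an illustration under stated assumptions, not how Intel® VTune™ Amplifier is implemented: the call stacks and function names are hypothetical, `bottom_up` is a made-up helper, and each sample is assumed to represent an equal slice of CPU time.

```python
from collections import Counter

# Hypothetical sampled call stacks, outermost frame first. Each sample is
# assumed to stand for one equal interval of CPU time.
SAMPLES = [
    ("main", "solve", "exchange"),
    ("main", "solve", "compute_residual"),
    ("main", "solve", "compute_residual"),
    ("main", "solve", "compute_residual"),
    ("main", "init_grid"),
    ("main", "solve", "compute_residual"),
]


def bottom_up(samples):
    """Return (function, self_samples) pairs, hottest first.

    Only the innermost frame of each stack is charged, so the count
    reflects time spent in the function itself, not in its callees.
    """
    self_counts = Counter(stack[-1] for stack in samples)
    return self_counts.most_common()


hotspots = bottom_up(SAMPLES)
top_name, top_count = hotspots[0]
print(f"hotspot: {top_name} ({top_count / len(SAMPLES):.0%} of samples)")
```

Sorting by self time rather than total time is what puts the function doing the actual work, rather than `main`, at the top of the list.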

Next step: Use the Intel® Trace Analyzer and Collector and Intel® VTune™ Amplifier to analyze your own application.
