Perform Code Timing and Profiling for Linux on 64-Bit Architecture


Challenge

Measure the time a program and its functions take to execute as part of the diagnosis phase of performance optimization. Such measurements are extremely valuable as a simple means to become familiar with how an application behaves during execution.


Solution

Use the Linux time command or the clock function in the C library for coarse measurements, and compile the application with profiling instrumentation for detailed ones. The time command is used as follows:

prompt> time <program> [arguments]

It gives the following information:

  • User time (time spent in user mode, including time for cache misses)
  • System time (time spent in kernel mode)
  • Elapsed time (the actual time that has passed since the start of program execution)

 

For example, a system time much greater than the user time can point to problems such as excessive page faults, misaligned memory references, or floating-point exceptions.

It is often more useful to time a portion of a program, which yields performance figures for individual loops or functions. The code below uses clock to measure the CPU time consumed by the dot_product function:

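#include <time.h>

...

int main(void) {
    clock_t start, finish;
    double duration;

    ...

    start = clock();                /* processor time used so far */
    result = dot_product(A, B);
    finish = clock();

    /* clock() counts in ticks; divide by CLOCKS_PER_SEC to get seconds */
    duration = (double) (finish - start) / CLOCKS_PER_SEC;

    ...
}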

 

To obtain the most detailed timing information, profile your application. When compiling, supply ecc with the -prof_gen switch. ecc instruments your code so that it generates an execution profile when it is run. This profile can then be analyzed with the Intel® VTune™ Performance Analyzer, which can be used to find the "hotspots" and bottlenecks in your code (see Richard Greco's article "Performance Analysis of Applications Running on Itanium Processors").

Alternatively, after running the instrumented program, recompile the application with the -prof_use switch. This is called profile-guided optimization (PGO), and it assists the compiler by giving it real-world usage and performance information. The following table summarizes the ecc profiling switches:

Switch | Description | Default
-prof_file filename | Specifies the filename for the profiling summary file. | OFF
-prof_gen[x] | Instruments the program to prepare for instrumented execution and creates a new static profile information file (.spi). With the x qualifier, extra information is gathered. | OFF
-prof_use | Uses dynamic feedback information. | OFF
-qp, -p, -pg | Compile and link for function profiling with the UNIX* prof tool. | OFF

 


Source

Directives and Pragmas and Switches Oh My!

 

