Finding hotspot within a #define statement

We are using a #define macro that sometimes expands to as many as 100 lines of code.

VTune treats the whole expansion as one line, but I would love to find hotspots within the individual lines it produces.

Is there an easy way to do so?


For example, suppose you define a function through a macro. The C/C++ preprocessor expands the macro first, and only then does the compiler parse and build the code. So every line of the function is attributed to the single source line where the macro is expanded, even if the macro definition itself spans many lines joined with `\`.
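To illustrate why everything collapses onto one line, here is a minimal sketch. It uses `__LINE__` as a stand-in for the line numbers the compiler records in debug info, which is an assumption made for illustration. The macro and function names are hypothetical and not taken from any real project.

```cpp
#include <cassert>
#include <vector>

// Hypothetical multi-statement macro: each statement records the source
// line the compiler associates with it, using __LINE__ as a stand-in.
#define RECORD_THREE_LINES(out)    \
    do {                           \
        (out).push_back(__LINE__); \
        (out).push_back(__LINE__); \
        (out).push_back(__LINE__); \
    } while (0)

// Three statements coming out of one macro expansion.
inline std::vector<int> lines_seen_by_macro() {
    std::vector<int> lines;
    RECORD_THREE_LINES(lines);
    return lines;
}

// The same three statements written out by hand, one per source line.
inline std::vector<int> lines_seen_by_plain_code() {
    std::vector<int> lines;
    lines.push_back(__LINE__);
    lines.push_back(__LINE__);
    lines.push_back(__LINE__);
    return lines;
}

inline void check_line_attribution() {
    const std::vector<int> m = lines_seen_by_macro();
    assert(m[0] == m[1] && m[1] == m[2]);  // one line for the whole macro
    const std::vector<int> p = lines_seen_by_plain_code();
    assert(p[0] < p[1] && p[1] < p[2]);    // one line per statement
}
```

Calling `check_line_attribution()` from `main` shows the difference: the three statements inside the macro share the line of the macro invocation, while the hand-written ones each get their own line.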

I suspect you could use the preprocessor to generate medium files in which everything is expanded (macros, include files and so on), and then have the compiler build these medium files, treating them as source files. VTune Amplifier may then attribute the performance data to the right source lines.

Thanks for the reply,

The situation is not too different from a function call, I think: the code is always attributed to one line there as well. What makes it a little more difficult is that we pass code as a macro parameter...

I will give the gcc option -save-temps a try, following your advice :)

Options like -debug inline-debug-info may also be needed, as would be the case when inline functions are used in place of macros.
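As a rough sketch of what using an inline function in place of a macro could look like when code is passed as a parameter: the macro `FOR_EACH_INDEX` and the template `for_each_index` below are hypothetical names invented for this example, and it assumes a C++11 compiler for the lambda. Whether a profiler then reports the lambda's lines separately still depends on the debug options mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical macro in the style discussed here: the loop body is passed
// in as a macro argument, so the whole expansion sits on the call line.
#define FOR_EACH_INDEX(n, body) \
    for (std::size_t i = 0; i < (n); ++i) { body; }

// Possible replacement: an inline function template taking a callable.
// The statements of the callable stay on their own source lines.
template <typename Body>
inline void for_each_index(std::size_t n, Body body) {
    for (std::size_t i = 0; i < n; ++i) {
        body(i);
    }
}

inline long sum_of_squares_macro(const std::vector<long>& v) {
    long total = 0;
    FOR_EACH_INDEX(v.size(), total += v[i] * v[i]);
    return total;
}

inline long sum_of_squares_inline(const std::vector<long>& v) {
    long total = 0;
    for_each_index(v.size(), [&](std::size_t i) {
        long x = v[i];   // its own source line
        total += x * x;  // its own source line
    });
    return total;
}

inline void check_same_result() {
    const std::vector<long> v = {1, 2, 3};
    assert(sum_of_squares_macro(v) == 14);
    assert(sum_of_squares_inline(v) == 14);
}
```

Both versions compute the same result. The difference is only in where the loop body lives in the source, which is what line-level attribution depends on.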

I think where Peter said "medium", the more usual term is "intermediate", but his meaning was understood.

Thanks to Tim for the correction: "intermediate" is the right term, not "medium" :-)

Thanks for your suggestions!

I have now used -save-temps, and I get the .ii intermediate file. But I cannot convince VTune to use it?!

In the VTune documentation I find very little information on .ii files. More or less all I can find is that they are supported...

In the VTune source display I would expect the .ii file to be an additional choice alongside the .cc and assembly views?!

Thanks for any ideas:) 

Hi Detlef S.,

Unfortunately, VTune currently can't attribute performance metrics to individual lines of C macros, though that should be possible for gcc if the debug info is generated in DWARF.

I guess Peter meant running the preprocessor on the source file manually, replacing the original file with the preprocessed one in your source tree, recompiling your application, and rerunning the analysis.


Thank you for clarifying that.


Here is a simple example:

# g++ -E -P primes.cpp > primes.pre.cpp
# g++ -g primes.pre.cpp -fopenmp -o primes -lpthread

Now you can use VTune Amplifier XE to collect data, and primes.pre.cpp will be treated as the source file in the VTune report.

My colleague Feilong Huang (compiler guy) told me that "-P" is important: it suppresses the line markers that would otherwise point from primes.pre.cpp back to the original source file.


Thank you very much!

As my project is quite large, this approach is a little unwieldy. More importantly, the gcc preprocessor again puts the whole expanded define on one line, so you would again get only one number.

Therefore you would have to add a search/replace step to introduce line breaks...

For now I did the following: I turned on -save-temps and compiled. In addition to the normally compiled executable, I now got .ii files with the macros expanded. I took the expanded macro in which I suspected the hotspot, based on the previous VTune run where I got a number for that line.

I did a search/replace to break the .ii line into multiple lines and replaced the macro in the .cc file with the result by hand. Now recompile and run VTune again, and there you are.
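The search/replace step could be automated along the following lines. This is only a sketch, not the exact replacement I used: the function name `break_statements` is made up for this example, and it assumes the input is a single expanded macro line without raw string literals or digit separators such as `1'000`. It inserts a newline after `;`, `{` and `}`, skips string and character literals, and leaves the `;` inside parentheses (for example in a for-header) alone.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Insert a newline after every ';', '{' and '}' outside string/character
// literals. A ';' inside parentheses (e.g. a for-header) is left in place.
inline std::string break_statements(const std::string& line) {
    std::string out;
    int paren_depth = 0;  // > 0 while inside ( ... )
    char quote = '\0';    // '"' or '\'' while inside a literal
    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        out += c;
        if (quote != '\0') {
            if (c == '\\' && i + 1 < line.size()) {
                out += line[++i];  // copy the escaped character unchanged
            } else if (c == quote) {
                quote = '\0';      // end of the literal
            }
            continue;
        }
        if (c == '"' || c == '\'') {
            quote = c;
        } else if (c == '(') {
            ++paren_depth;
        } else if (c == ')') {
            if (paren_depth > 0) --paren_depth;
        } else if ((c == ';' && paren_depth == 0) || c == '{' || c == '}') {
            out += '\n';
        }
    }
    return out;
}

inline void check_break_statements() {
    const std::string in = "a = 1; for (i = 0; i < 3; ++i) { b = \";\"; }";
    const std::string expected =
        "a = 1;\n for (i = 0; i < 3; ++i) {\n b = \";\";\n }\n";
    assert(break_statements(in) == expected);
}
```

Since a newline outside a literal is just whitespace to the compiler, the broken-up text should compile the same as the one-line version. The result still has to be pasted over the macro call in the .cc file by hand, as described above.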

Thanks so much for your help!