Some developers like to see performance data as a “call graph”, where nodes are functions, edges are calls from caller to callee, and each function is annotated with its CPU time. This kind of call-sequence visualization is not strictly necessary, and VTune™ Amplifier XE doesn’t have such a graph; it offers many other powerful interactive ways to view, sort and filter performance data.
But there is good news for those who like call graphs: it’s possible to build them from VTune Amplifier XE results. This blog post describes how to do it.
First, I need to emphasize that the way of building a call graph described in this blog post is not suggested or supported by Intel. It was discovered by enthusiasts and uses three third-party open source tools that simply manipulate the performance profile data provided by VTune Amplifier XE. So don’t submit tickets to support if something goes wrong.
Step 1. Collect performance profile
Start profiling your application as usual, either from the GUI or from the command line. I profiled “tachyon” from the default VTune Amplifier XE samples on the command line:
$ amplxe-cl -collect hotspots -result-dir r000hs -- tachyon_find_hotspots
Step 2. “gprof-cc” output
The VTune Amplifier XE command line interface can print performance results in a gprof-like format. This representation is needed for the next steps, which convert the profile data into a call graph. Save the “amplxe-cl -report” output to a file for further processing:
$ amplxe-cl -report gprof-cc -result-dir r000hs -format text -report-output r000hs_gprof_cc.txt
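If you prefer to script steps 1 and 2, the two commands above can be wrapped in a few lines of Python. This is only a minimal sketch: the helper names (collect_cmd, report_cmd, run_steps) are made up for illustration, and it assumes amplxe-cl is on your PATH. The options are exactly the ones used on the command lines above.

```python
import shutil
import subprocess


def collect_cmd(result_dir, app_args):
    """Build the hotspots collection command from step 1."""
    return ["amplxe-cl", "-collect", "hotspots",
            "-result-dir", result_dir, "--"] + list(app_args)


def report_cmd(result_dir, report_file):
    """Build the gprof-cc report command from step 2."""
    return ["amplxe-cl", "-report", "gprof-cc",
            "-result-dir", result_dir,
            "-format", "text",
            "-report-output", report_file]


def run_steps(result_dir, report_file, app_args):
    """Run the collection, then write the gprof-cc report (hypothetical helper)."""
    if shutil.which("amplxe-cl") is None:
        raise RuntimeError("amplxe-cl not found on PATH")
    subprocess.run(collect_cmd(result_dir, app_args), check=True)
    subprocess.run(report_cmd(result_dir, report_file), check=True)


# Example call (commented out so that loading this file runs nothing):
# run_steps("r000hs", "r000hs_gprof_cc.txt", ["tachyon_find_hotspots"])
```

Keeping the command construction separate from the execution makes it easy to print and inspect the exact command line before running a long collection.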
The r000hs_gprof_cc.txt file contains the VTune Hotspots analysis data in a gprof-style format. It should look like this:
Step 3. Gprof2Dot tool
Gprof2Dot is a free open source tool from “Jose Fonseca's utilities”. It can create a call graph from the output of different performance profilers. The graph is represented in the DOT language. Gprof2Dot is a single Python script that you can download from the author’s site.
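To give an idea of what a call graph in the DOT language looks like, here is a small Python sketch that prints one. The function names and percentages are invented for illustration, and the layout is much simpler than what Gprof2Dot actually emits.

```python
def to_dot(cpu_share, calls):
    """Return DOT text for a call graph.

    cpu_share: {function name: share of CPU time in percent}
    calls:     iterable of (caller, callee) pairs
    """
    lines = ["digraph callgraph {"]
    for name, share in sorted(cpu_share.items()):
        # One node per function, labelled with its name and CPU time share.
        lines.append('    "%s" [label="%s\\n%.1f%%"];' % (name, name, share))
    for caller, callee in calls:
        # One edge per caller -> callee relationship.
        lines.append('    "%s" -> "%s";' % (caller, callee))
    lines.append("}")
    return "\n".join(lines)


# Hypothetical data, not taken from a real profile.
example = to_dot({"main": 5.0, "render": 95.0}, [("main", "render")])
print(example)
```

Text in this form is what Graphviz takes as input when it draws the picture.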
We will need to apply a patch to it. I used Git for this, so I cloned the Git repository as described here:
$ git clone https://code.google.com/p/jrfonseca.gprof2dot/
I’ve used this Git client to clone the repository on Windows. Note that there may be problems connecting to the repository from behind a corporate proxy. After a successful clone you should have the “gprof2dot.py” file in your local repository.
There are also patch utilities that you can use to apply the patch without cloning the repository.
Step 4. David Flater’s patch
Gprof2Dot can make a DOT graph from GNU gprof results. However, VTune Amplifier XE produces output that is gprof-like but not identical to the original gprof output. For example, it has no “called” column, which in GNU gprof contains the number of function calls. So Gprof2Dot can’t process VTune results out of the box. Thanks go to David Flater, who created a special patch that implements processing of VTune’s “gprof-cc” output. Download the patch and put it in the same directory as the original gprof2dot.py script. Then apply the patch:
$ git apply gprof2dot.py-20121125-DWF1.2.1.patch
Note that the patch was created against gprof2dot.py 1.0 rev. 2012-11-25. Again, you may apply the patch in other ways, for example with patch utilities.
Step 5. Graphviz tool
Step 6. Create a call graph picture
Once you have gprof2dot with the patch applied and Graphviz installed, you can finally create the call graph. Run the gprof2dot.py script with “-f axe” and the report file as input parameters. “-f axe” tells the tool to interpret the input data as VTune Amplifier XE gprof-cc output. This prints a call graph in DOT format to the console:
$ python gprof2dot.py -f axe r000hs_gprof_cc.txt
Graphviz has a GUI, but I’ve used the CLI for all steps in this blog, so let’s use the command line version here as well. You can simply pipe the gprof2dot output to Graphviz (the “dot” command) to produce a picture in the desired format:
$ python gprof2dot.py -f axe r000hs_gprof_cc.txt | dot -Tpng -or000hs_call_graph.png
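The same pipe can also be driven from Python, which may be handy if you want to generate graphs for many results from one script. A minimal sketch, assuming gprof2dot.py sits in the current directory and dot is on your PATH; the helper names are hypothetical, and the current Python interpreter is used to run the script.

```python
import subprocess
import sys


def gprof2dot_cmd(report_file, script="gprof2dot.py"):
    """Command that converts a VTune gprof-cc report to DOT on stdout."""
    return [sys.executable, script, "-f", "axe", report_file]


def dot_cmd(image_file, fmt="png"):
    """Command that reads DOT on stdin and writes an image file."""
    return ["dot", "-T" + fmt, "-o" + image_file]


def render_call_graph(report_file, image_file):
    """Equivalent of: python gprof2dot.py -f axe REPORT | dot -Tpng -oIMAGE"""
    producer = subprocess.Popen(gprof2dot_cmd(report_file),
                                stdout=subprocess.PIPE)
    consumer = subprocess.Popen(dot_cmd(image_file), stdin=producer.stdout)
    # Close our copy of the pipe so the producer notices if dot exits early.
    producer.stdout.close()
    consumer.wait()
    if producer.wait() != 0 or consumer.returncode != 0:
        raise RuntimeError("call graph generation failed")


# Example call (commented out so that loading this file runs nothing):
# render_call_graph("r000hs_gprof_cc.txt", "r000hs_call_graph.png")
```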
That’s all. You now have a PNG image of a call graph built from a VTune Amplifier XE performance profile.
Here is a fragment of the call graph from my test:
Update April 9th 2013
Good news! Jose Fonseca has integrated David Flater’s patch into Gprof2Dot. So skip steps 3 and 4: just download the gprof2dot.py script, which you can now use as is. After step 2, move to step 5.
If you’re fond of call graphs and have missed this functionality in VTune Amplifier XE, these steps may be what you need. They rely on some tweaks, but you may find a more efficient solution based on this proposal. The approach described in this blog works on both Windows* and Linux*.