Making a visualized call graph from Intel® VTune™ Amplifier XE results

Some developers like performance data represented as a “call graph”, where nodes are functions and edges are function calls from caller to callee, with each function annotated with its CPU time. This kind of call sequence visualization is not strictly necessary, and VTune™ Amplifier XE doesn’t provide such a graph: it offers many other powerful interactive ways to view, sort and filter performance data.
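To make the idea concrete, here is a minimal Python sketch of such a structure. It is only an illustration: the `CallGraph` class, the function names and the timings are all made up and are not taken from any VTune Amplifier XE result.

```python
# Toy call graph: nodes are functions, edges are caller -> callee calls.
# All function names and timings below are made up for illustration.

class CallGraph:
    def __init__(self):
        self.self_time = {}   # function name -> CPU seconds spent in its own code
        self.callees = {}     # caller name -> set of callee names

    def add_function(self, name, self_time=0.0):
        """Register a function (node) and accumulate its self time."""
        self.self_time[name] = self.self_time.get(name, 0.0) + self_time
        self.callees.setdefault(name, set())

    def add_call(self, caller, callee):
        """Register a caller -> callee edge, creating missing nodes."""
        self.add_function(caller)
        self.add_function(callee)
        self.callees[caller].add(callee)

    def hottest(self, count=3):
        """Return (name, self time) pairs, largest self time first."""
        ranked = sorted(self.self_time.items(),
                        key=lambda item: item[1], reverse=True)
        return ranked[:count]


graph = CallGraph()
graph.add_function("main", 0.1)
graph.add_function("render", 0.4)
graph.add_function("intersect", 2.5)
graph.add_call("main", "render")
graph.add_call("render", "intersect")

for name, seconds in graph.hottest():
    print(f"{name}: {seconds:.1f}s self time")
```

A drawn call graph is just a picture of this kind of data: one box per function, one arrow per caller-to-callee edge, and the CPU time written inside the box.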

But there is good news for those who like call graphs: it’s possible to build them from VTune Amplifier XE results. This blog post describes how to do it.

First, I need to emphasize that the way of building a call graph described in this blog post is not suggested or supported by Intel. It was discovered by enthusiasts and uses three third-party open source tools that simply manipulate the performance profile data provided by VTune Amplifier XE. So don’t submit tickets to support if something goes wrong.

Step 1. Collect performance profile

Start profiling your application as usual. You can do it either from the GUI or from the command line. I profiled “tachyon” from the default VTune Amplifier XE samples on the command line:

$ amplxe-cl -collect hotspots -result-dir r000hs -- tachyon_find_hotspots

Step 2. “gprof-cc” output

The VTune Amplifier XE command line interface can print performance results in a gprof-like format. This representation is needed for the next steps, which convert the profile data into a call graph. Save the “amplxe-cl -report” output to a file for further processing:

$ amplxe-cl -report gprof-cc -result-dir r000hs -format text -report-output r000hs_gprof_cc.txt
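If you prefer to script steps 1 and 2, the two commands can be wrapped in a few lines of Python. This is only a sketch: the helper names `build_commands` and `collect_and_report` are my own, and the options are exactly the ones shown in the commands above, nothing more. It assumes `amplxe-cl` is on your PATH.

```python
import subprocess


def build_commands(result_dir, app_command, report_file):
    """Return the collect and report command lines as argument lists.

    The options mirror the two amplxe-cl commands shown in this post.
    """
    collect = ["amplxe-cl", "-collect", "hotspots",
               "-result-dir", result_dir, "--"] + list(app_command)
    report = ["amplxe-cl", "-report", "gprof-cc",
              "-result-dir", result_dir,
              "-format", "text", "-report-output", report_file]
    return collect, report


def collect_and_report(result_dir, app_command, report_file):
    """Run both steps in order; raises CalledProcessError if one fails."""
    for command in build_commands(result_dir, app_command, report_file):
        subprocess.run(command, check=True)


# Print the command lines without running them.
collect, report = build_commands(
    "r000hs", ["tachyon_find_hotspots"], "r000hs_gprof_cc.txt")
print(" ".join(collect))
print(" ".join(report))
```

Calling `collect_and_report("r000hs", ["tachyon_find_hotspots"], "r000hs_gprof_cc.txt")` would run the same two commands as above, one after the other.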

The r000hs_gprof_cc.txt file contains the VTune Hotspots analysis data in gprof-style format. It should look like this:

Step 3. Gprof2Dot tool

Gprof2Dot is a free open source tool from “Jose Fonseca's utilities”. It can create a call graph from the output of different performance profilers. The graph is represented in the DOT language. Gprof2Dot is just a single Python script that you can download from the author’s site.
We will need to apply a patch to it. I used Git for this, so I cloned the Git repository as described here:

$ git clone

I used this Git client to clone the repository on Windows. Note that there may be problems connecting to the repository from behind a corporate proxy. After successful cloning you should have the “” file in your local repository.

There are also patch utilities that you can use to apply the patch without cloning the repository.

Step 4. David Flater’s patch

Gprof2Dot can make a DOT graph from GNU gprof results. But the output that VTune Amplifier XE provides is gprof-like, not identical to the original gprof output. For example, it lacks the “called” column, which in GNU gprof contains the number of function calls. So Gprof2Dot can’t process VTune results out of the box. Thanks to David Flater, who created a special patch that implements processing of VTune’s “gprof-cc” output. Download the patch and put it in the same directory as the original script. Then apply the patch:

$ git apply

Note that the patch was created against 1.0 rev. 2012-11-25. Again, you may apply the patch in other ways, for example with patch utilities.

Step 5. Graphviz tool

The Graphviz open source tool can draw graph pictures from the DOT language. There are versions for multiple operating systems; I used the Windows one. Download and install the package.
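If you have never seen the DOT language, the following Python sketch shows roughly what a call graph looks like in it. The `to_dot` helper and the sample functions are made up for illustration; real Gprof2Dot output carries more attributes (colors, percentages and so on), so treat this as a simplified picture rather than the tool's actual format.

```python
def to_dot(self_time, calls):
    """Render a tiny call graph as DOT text.

    self_time: dict mapping function name -> CPU seconds (illustrative values)
    calls: iterable of (caller, callee) pairs
    Names are assumed not to contain double quotes.
    """
    lines = ["digraph callgraph {"]
    for name, seconds in self_time.items():
        # A literal \n inside a DOT label starts a new line in the drawn node.
        lines.append(f'    "{name}" [label="{name}\\n{seconds:.2f}s"];')
    for caller, callee in calls:
        # "a" -> "b" declares a directed edge from caller to callee.
        lines.append(f'    "{caller}" -> "{callee}";')
    lines.append("}")
    return "\n".join(lines)


print(to_dot({"main": 0.1, "render": 0.4},
             [("main", "render")]))
```

Saving the printed text to a file and passing it to the “dot” command produces a picture with one box per function and one arrow per call.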

Step 6. Create a call graph picture

Once you have gprof2dot with the patch applied and Graphviz installed, you can finally create the call graph. Run the script with “-f axe” and the report file as input parameters. “-f axe” tells the tool to interpret the input data as VTune Amplifier XE gprof-cc output. This prints a call graph in DOT format to the console:

$ python -f axe r000hs_gprof_cc.txt

Graphviz has a GUI, but I’ve used the CLI for all steps in this blog, so let’s use the command line version here as well. You can simply pipe the gprof2dot output to Graphviz (the “dot” command) to produce a picture in the desired format:

$ python -f axe r000hs_gprof_cc.txt | dot -Tpng -or000hs_call_graph.png
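The same pipe can be set up from Python if you want to automate this step. The sketch below is an assumption-laden illustration: `build_pipeline` and `render_call_graph` are my own helper names, and `script_path` is a placeholder for wherever you saved the gprof2dot script. It also assumes that `python` and `dot` are on your PATH.

```python
import subprocess


def build_pipeline(script_path, report_file, image_file):
    """Return the two command lines of the pipe as argument lists.

    script_path is a placeholder for the location of the gprof2dot script.
    """
    gprof2dot_cmd = ["python", script_path, "-f", "axe", report_file]
    dot_cmd = ["dot", "-Tpng", "-o" + image_file]
    return gprof2dot_cmd, dot_cmd


def render_call_graph(script_path, report_file, image_file):
    """Pipe gprof2dot output into dot; return True if both commands succeed."""
    gprof2dot_cmd, dot_cmd = build_pipeline(script_path, report_file, image_file)
    producer = subprocess.Popen(gprof2dot_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(dot_cmd, stdin=producer.stdout)
    # Close our copy of the pipe so the producer notices if dot exits early.
    producer.stdout.close()
    consumer_status = consumer.wait()
    producer_status = producer.wait()
    return producer_status == 0 and consumer_status == 0
```

For example, `render_call_graph("path/to/script.py", "r000hs_gprof_cc.txt", "r000hs_call_graph.png")` would be the scripted equivalent of the command above, with the placeholder path replaced by your own.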

That’s all. You now have a PNG image of a call graph built from a VTune Amplifier XE performance profile.
Here is a fragment of the call graph from my test:

Update April 9th 2013

Good news! Jose Fonseca has integrated David Flater’s patch into Gprof2Dot. So skip steps 3 and 4: just download the script, which you can now use as is. After step 2, move to step 5.


If you’re fond of call graphs and missed this functionality in VTune Amplifier XE, these steps may be what you need. Although they rely on some tweaks, you may find a more efficient solution based on this proposal. The way of building a call graph described in this blog works on both Windows* and Linux*.



Ayam wrote:

This is a really interesting post. I was looking for something like this.

I tried this tutorial, but the output graph is a flat 1D representation of the application. Can you please tell me what might be causing this behavior?

