I'd like to compare some aspects of the functionality of Intel's VTune 6.0 and the profiler in TI's Code Composer Studio 2.0. I hope this comparison will be helpful for the development and application of VTune.
My development work mostly focuses on speech signal processing. About 6 months ago I was developing a G.729 speech codec on TI's 62x DSP platform. Both the hardware and the development environment (Code Composer Studio, CCS 2.0) are powerful. CCS 2.0 also has a profiler. Its working method differs from VTune's in some ways, at least from my point of view as a user. The profiler is integrated with the IDE. A user can add any function or segment of code to the profiler. To me, the most important point is that the profiler seems to report a precise number of CPU cycles (for example, 665795 for one of the key functions in my algorithm) for everything being profiled. The profile results are displayed in a view inside the IDE, with the values changing as the program runs. I don't know how they implement this kind of profiling. Does this precise counting of CPU cycles need some special support from the CPU hardware?
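To make concrete what I mean by adding only selected functions to a profiler, here is a minimal sketch in Python. It is not how CCS implements its profiler (I don't know that); it only illustrates measuring at function entry and exit instead of sampling. The names `RegionProfiler`, `profile` and `stats` are hypothetical, and the clock is a nanosecond timer rather than a real CPU cycle counter.

```python
import time
from functools import wraps


class RegionProfiler:
    """Accumulates call counts and elapsed clock ticks for registered functions only."""

    def __init__(self, clock=time.perf_counter_ns):
        # The clock is injectable; by default it is a nanosecond timer,
        # which is an assumption standing in for a hardware cycle counter.
        self.clock = clock
        self.stats = {}  # function name -> [calls, total_ticks]

    def profile(self, func):
        """Decorator: only functions wrapped here are measured."""
        entry = self.stats.setdefault(func.__name__, [0, 0])

        @wraps(func)
        def wrapper(*args, **kwargs):
            start = self.clock()
            try:
                return func(*args, **kwargs)
            finally:
                # Record the call even if the function raises.
                entry[0] += 1
                entry[1] += self.clock() - start

        return wrapper
```

Everything that is not wrapped with `profile` costs nothing and produces no data, which is the behaviour I found convenient in CCS.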
VTune, on the other hand, uses EBS (event-based sampling) to count CPU events. But there is a Sample After Value, which cannot be too small. If I set it to 1, my hard disk will fill up in no time. VTune's sampling is also a system-wide operation. This is necessary, because it lets me know how much time my program spends calling system APIs or MFC functions related to the user interface. But if VTune allowed the user to specify which part of his own program to profile and excluded everything else, a lot of time, disk space and memory would be saved. At the same time, profiling targeted at a small part of the program could be more fine-grained and could generate more accurate results.
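The trade-off with the Sample After Value can be shown with simple arithmetic. This is a simplified sketch, assuming one sample is recorded each time the event counter reaches the Sample After Value and ignoring how samples are attributed across the whole system. The function name `sampling_estimate` and the value 10000 are my own assumptions for illustration; 665795 is the cycle count mentioned above.

```python
def sampling_estimate(true_events, sample_after_value):
    """Return (samples, estimated_events, uncounted_events) for one region.

    Simplified model: a sample is taken every `sample_after_value` events,
    so any remainder below that threshold is never observed.
    """
    if sample_after_value < 1:
        raise ValueError("sample_after_value must be at least 1")
    samples = true_events // sample_after_value
    estimated = samples * sample_after_value
    return samples, estimated, true_events - estimated


# A function that really takes 665795 cycles, sampled every 10000 cycles
# (10000 is an assumed value, not a VTune default).
print(sampling_estimate(665795, 10000))  # (66, 660000, 5795)

# With a Sample After Value of 1 the count is exact, but every single
# event produces a sample record, which is why the disk fills up.
print(sampling_estimate(665795, 1))      # (665795, 665795, 0)
```

In this model a smaller Sample After Value improves resolution at the cost of far more sample records, so restricting sampling to a small part of the program would make a small value affordable.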
VTune has a very powerful utility, the Call Graph, for which TI has no counterpart as far as I know. It's very useful for locating the heaviest execution path and function in the program being profiled. But once I have located the heaviest function, the remaining work mostly involves tuning this function (or even a segment of code) again and again. At that point, generating the same profiling results (including call graph and sampling) for the parts that the user is currently not interested in (including "light" parts of the user's program, OS modules, etc.) is a waste of time and resources.
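As a rough illustration of what "locating the heaviest execution path" means, here is a small sketch that walks a call graph from the root and always follows the callee with the largest inclusive time. This is only my simplified idea of the concept, not VTune's actual algorithm. The function names and timings in the example are invented purely for illustration.

```python
def heaviest_path(inclusive_time, callees, root):
    """Follow the most expensive callee at each level, starting at `root`.

    inclusive_time: dict mapping function name -> time including its callees
    callees:        dict mapping function name -> list of functions it calls
    """
    path = [root]
    seen = {root}  # guards against cycles caused by recursion
    node = root
    while True:
        children = [c for c in callees.get(node, []) if c not in seen]
        if not children:
            return path
        node = max(children, key=lambda c: inclusive_time[c])
        path.append(node)
        seen.add(node)


# Hypothetical call graph of a speech codec application (made-up numbers).
example_times = {
    "main": 100, "encode_frame": 80, "update_ui": 15,
    "lpc_analysis": 20, "codebook_search": 55,
}
example_callees = {
    "main": ["encode_frame", "update_ui"],
    "encode_frame": ["lpc_analysis", "codebook_search"],
}

print(heaviest_path(example_times, example_callees, "main"))
# ['main', 'encode_frame', 'codebook_search']
```

Once the last function on that path is known, it is the only part I would want to keep profiling during repeated tuning.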
Anyway, my technical experience and knowledge are very limited. I don't know whether the points I have raised are reasonable or objective. If not, please point out my errors and I will be very happy to receive your correction. I'm sorry if my comments on VTune make you uncomfortable. But I do hope this provides some useful ideas from the user's point of view. I hope VTune will have the most powerful functionality and be easy to use.