You may have received an email inviting you to the Intel® Parallel Studio XE 2016 Beta. VTune Amplifier XE 2016 beta is part of the studio and adds OpenMP* parallelization inefficiency, imbalance, and work-sharing analysis to help you tune for more efficient use of parallel regions. It also now supports multi-rank analysis of MPI* compute nodes with or without OpenMP use. Various ease-of-use enhancements include confidence indicators in General Exploration analysis results, a "super tiny" bird's-eye view timeline, and a "Platform" tab replacing the "Tasks and Frames" tab.
VTune(TM) Amplifier XE 2015 can analyze MPI processes in hybrid codes on a cluster system. This means that VTune Amplifier runs a parallel MPI program on N ranks to collect performance data, then identifies which hot function on which rank consumed the most CPU time.
First of all, you need to set up the tools' environment. These tools are from Intel Cluster Studio XE 2015 (for example):
1. Intel Composer XE
$ source /opt/intel/ics/2015.0.3.032/composer_xe_2015/bin/compilervars.sh intel64
2. Intel MPI Library
My application uses many IPP (Intel Performance Primitives) functions. When I profile it with VTune, I can see how much time the IPP functions take, but I can't see the names of the IPP functions. Instead, I see function names like func@0x18107b040:
I have a 100% reproducible BugCheck with the latest VTune update to date, though the stack trace is not very meaningful.
works fine. But for many others, such as MEM_UNCORE_RETIRED.REMOTE_DRAM,
amplxe-cl gives an error like:
amplxe: Error: Cannot configure sampling event groups. The collection is terminated.
Could anyone help? Thanks
I am using Amplifier XE 2015 on Windows 7 and trying to profile 4 MPI processes running on my local machine. I get three of the above messages when running 4 MPI processes. Is that expected? It seems that Amplifier XE is having problems profiling multiple MPI processes at the same time.
mpiexec -n 4 amplxe-cl -result-dir my_result_ah -collect hotspots -- <my_exe.exe>
We have spin locks from TBB (rw). I would like to know who owns the lock. We do have information about who spins on a particular object, but how can we identify who holds the lock?
I have an Intel(R) Core(TM) i7-4800MQ CPU @ 2.70GHz, which is a Haswell-based processor. I want to estimate the FLOPS of an application. I am using Intel VTune Amplifier XE 2015. Does anybody know how to find FLOPS?
I tried following the steps at https://software.intel.com/en-us/articles/estimating-flops-using-event-b... but I can't find the Processor Event Name on the pages in VTune. Has anybody successfully done this on a Haswell processor?
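For reference, the arithmetic behind an event-based FLOPS estimate can be sketched roughly as below. This is only a minimal sketch under stated assumptions, not a VTune feature: `estimate_flops` is a hypothetical helper, the keys in `OPS_PER_EVENT` are placeholder labels rather than real processor event names, and the counts in the usage example are made up. Whether a given processor exposes suitable floating-point events at all is a separate question that this sketch does not answer.

```python
# Hypothetical weights: floating-point operations per counted instruction.
# The keys are placeholder labels, NOT real hardware event names; substitute
# whatever floating-point events your processor actually exposes.
OPS_PER_EVENT = {
    "scalar_single": 1,
    "scalar_double": 1,
    "packed_128_single": 4,  # a 128-bit vector holds four 32-bit floats
    "packed_128_double": 2,  # a 128-bit vector holds two 64-bit doubles
    "packed_256_single": 8,
    "packed_256_double": 4,
}


def estimate_flops(event_counts, elapsed_seconds):
    """Return the estimated floating-point operations per second.

    event_counts maps a placeholder event label to its total count.
    Fused multiply-add instructions are not modeled here; they would
    need their own weight.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    total_ops = 0
    for name, count in event_counts.items():
        if name not in OPS_PER_EVENT:
            raise KeyError(f"no weight defined for event {name!r}")
        total_ops += count * OPS_PER_EVENT[name]
    return total_ops / elapsed_seconds


if __name__ == "__main__":
    # Made-up counts for illustration only.
    counts = {"scalar_double": 2_000_000_000, "packed_256_double": 500_000_000}
    flops = estimate_flops(counts, elapsed_seconds=2.0)
    print(f"{flops / 1e9:.2f} GFLOPS")
```

The estimate is a weighted sum of event counts divided by elapsed time, so its accuracy depends entirely on how well the chosen events and weights match the instructions the application actually executes.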