How to use the Pause/Resume API in your code and run it correctly on the Intel® Xeon Phi™ coprocessor

(Important: this article applies only to the older product, the 2013 version from U13 onward or the 2015 Beta. Please refer to the new article if you work with the 2015 version.)

VTune™ Amplifier XE 2013 Update 13 now supports the ITT Pause/Resume API on the Intel® Xeon Phi™ coprocessor. This article describes the environment variables that the user has to set for the Intel Xeon Phi coprocessor.

I previously wrote an article about using the Pause/Resume API on a traditional Intel Xeon processor, and I use the same example code here. I expected to follow the same steps as for the Xeon processor, apart from setting the environment variables. In the end I realized there is a trick that we need to pay attention to.

Steps:

1. Build a native Intel Xeon Phi coprocessor application (using Intel C/C++ Composer XE 2013).

# icpc -g -mmic test_api.cpp -I/opt/intel/vtune_amplifier_xe_2013/include /opt/intel/vtune_amplifier_xe_2013/bin64/k1om/libittnotify.a -lpthread -o test_api

2. Copy the native application onto the Intel Xeon Phi coprocessor.

# scp test_api mic0:/root

 

3. Run VTune data collection.

 


# amplxe-cl -collect knc-hotspots -start-paused --search-dir all:rp=./ -- ssh mic0 INTEL_LIBITTNOTIFY64=$MIC_INTEL_LIBITTNOTIFY64 INTEL_JIT_PROFILER64=$MIC_INTEL_JIT_PROFILER64 INTEL_ITTNOTIFY_CONFIG=$MIC_INTEL_ITTNOTIFY_CONFIG /root/test_api
amplxe: Collection started. To stop the collection, either press CTRL-C or enter from another console window: amplxe-cl -r /home/peter/problem_report/r014hs -command stop.
amplxe: Collection paused.
amplxe: Collection resumed.
amplxe: Collection stopped.
amplxe: Using result path `/home/peter/problem_report/r001hs'

		

This result is wrong. I start data collection in the paused state, then the code resumes collection, pauses it again, and resumes it again. There should be two data collection phases, but the output shows only one resume, so the result is incorrect.
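To make that expectation concrete, here is a minimal sketch that counts collection phases for a sequence of pause/resume events. It is only a toy model: the names CollectorModel and count_phases are hypothetical and have nothing to do with how VTune tracks its state internally.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical model of the collector's pause state (not VTune's implementation).
class CollectorModel {
public:
    explicit CollectorModel(bool start_paused)
        : paused_(start_paused), phases_(start_paused ? 0 : 1) {}

    // Models __itt_resume(): a new collection phase begins only if we were paused.
    void resume() {
        if (paused_) {
            paused_ = false;
            ++phases_;
        }
    }

    // Models __itt_pause(): ends the current phase, if there is one.
    void pause() { paused_ = true; }

    std::size_t phases() const { return phases_; }

private:
    bool paused_;
    std::size_t phases_;
};

// Replays a sequence of events: 'R' = resume, 'P' = pause.
std::size_t count_phases(const std::string& events, bool start_paused = true) {
    CollectorModel model(start_paused);
    for (char e : events) {
        assert(e == 'R' || e == 'P');  // only these two events are modeled
        if (e == 'R') {
            model.resume();
        } else {
            model.pause();
        }
    }
    return model.phases();
}
```

In this model the calls made by test_api.cpp correspond to the sequence "RPR", which gives two phases when collection starts paused, while the single resume reported in the console output above corresponds to "R", which gives only one.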

The trick is to create a separate script file that sets all the MIC environment variables and launches the application, for example run_sample.sh with these contents:

ssh mic0 INTEL_LIBITTNOTIFY64=$MIC_INTEL_LIBITTNOTIFY64 INTEL_JIT_PROFILER64=$MIC_INTEL_JIT_PROFILER64 INTEL_ITTNOTIFY_CONFIG=$MIC_INTEL_ITTNOTIFY_CONFIG /root/test_api

Then run the collector on that script:


# amplxe-cl -collect knc-hotspots -search-dir all:rp=./ -start-paused -- /home/peter/problem_report/run_sample.sh
amplxe: Collection started. To stop the collection, either press CTRL-C or enter from another console window: amplxe-cl -r /home/peter/problem_report/r009hs -command stop.
amplxe: Collection paused.
amplxe: Collection resumed.
amplxe: Collection paused.
amplxe: Collection resumed.
amplxe: Collection stopped.
amplxe: Using result path `/home/peter/problem_report/r002hs'

		

This is the result I expected.

Here is another sample, in which I use the Pause/Resume API in an offload program. (I attached two samples that use the API in a Xeon Phi program, one for native mode and one for offload mode.)

 

# icpc -g test_api_offload.cpp -offload-option,mic,compiler,"-I/opt/intel/vtune_amplifier_xe_2013/include /opt/intel/vtune_amplifier_xe_2013/bin64/k1om/libittnotify.a" -lpthread -o test_api_offload

# MIC_ENV_PREFIX=MIC amplxe-cl -collect knc-hotspots -start-paused -- ./test_api_offload

5 comments

Peter Wang (Intel):

Important!

If you work with the VTune(TM) Amplifier XE 2015 initial release, please use the new command-line method; see this article.

If you compile my example "offload" application for Xeon Phi with Composer XE 2015, don't forget to specify the linker option so that the Pthread library is recognized.

Change

# icpc -g test_api_offload.cpp -offload-option,mic,compiler,"-I/opt/intel/vtune_amplifier_xe_2013/include /opt/intel/vtune_amplifier_xe_2013/bin64/k1om/libittnotify.a" -lpthread -o test_api_offload

To

# icpc -g test_api_offload.cpp -offload-option,mic,compiler,"-I/opt/intel/vtune_amplifier_xe_2015/include /opt/intel/vtune_amplifier_xe_2015/bin64/k1om/libittnotify.a" -lpthread -o test_api_offload -offload-option,mic,ld,"-lpthread"
 

Peter Wang (Intel):

The problem has been solved:

# MIC_ENV_PREFIX=MIC amplxe-cl -collect knc-hotspots -search-dir all:rp=./ -start-paused -- /home/peter/problem_report/run_sample.sh 

"MIC_ENV_PREFIX=MIC" should be set when using the Pause/Resume API on MIC, for both "native" and "offload" programs.

 

Peter Wang (Intel):

I can reproduce this problem in U17, although it worked in U13 :-( I will report it to the developers and get back to you soon.

Using the API in an offload program (my case 2) works fine in U17.

Satoshi Ohshima:

Hi, I tried pause/resume on my Xeon Phi (KNC) with VTune XE 2013 Update 17, but it didn't work.

When measuring with "-start-paused", I got the message below. Do you have a solution?

 amplxe: Error: Sampling was started in PAUSE mode and never RESUMED.

 

Peter Wang (Intel):

I don't know why there is no way to attach my examples, so I am putting them here.

// test_api.cpp : Defines the entry point for the console application.
// Native Intel Xeon Phi example of the ITT Pause/Resume API.

#include <stdio.h>
#include "ittnotify.h"   // provides __itt_pause() and __itt_resume()

// Busy loop that should appear in the collected data.
void foo_data_collected()
{
    for (int i = 0; i < 10000000L; i++);
}

// Busy loop that should NOT appear in the collected data.
void foo_data_not_collected()
{
    for (int i = 0; i < 10000000L; i++);
}

int main(int argc, char* argv[])
{
    // Assume that the data collector was paused in the user interface
    foo_data_not_collected();

    // Now resume data collection
    __itt_resume();
    foo_data_collected();

    // Pause data collection again
    __itt_pause();
    foo_data_not_collected();

    // Resume data collection again
    __itt_resume();
    foo_data_collected();

    // The user shouldn't see any data for foo_data_not_collected() in the result
    return 0;
}
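As a possible variation on test_api.cpp, the resume/pause pairs could be wrapped in a small scope guard so that collection is always paused again when a measured region ends. The sketch below is only an illustration: CollectionScope is a hypothetical helper, not part of ittnotify.h, and the fake_resume/fake_pause functions are stand-ins so that it runs without VTune. With the real API one would presumably pass small wrappers such as [] { __itt_resume(); } and [] { __itt_pause(); } instead.

```cpp
#include <cassert>
#include <string>

// Hypothetical RAII helper (not part of ittnotify.h): calls `resume` when the
// scope is entered and `pause` when it is left, including on early return.
class CollectionScope {
public:
    using Hook = void (*)();

    CollectionScope(Hook resume, Hook pause) : pause_(pause) {
        assert(resume != nullptr && pause != nullptr);
        resume();
    }
    ~CollectionScope() { pause_(); }

    CollectionScope(const CollectionScope&) = delete;
    CollectionScope& operator=(const CollectionScope&) = delete;

private:
    Hook pause_;
};

// Stand-ins for the ITT calls; they only record what happened.
static std::string g_trace;
static void fake_resume() { g_trace += 'R'; }
static void fake_pause()  { g_trace += 'P'; }

// Same call pattern as main() in test_api.cpp, recorded as a trace string:
// 'n' = work that should not be collected, 'c' = work that should be.
std::string run_demo() {
    g_trace.clear();
    g_trace += 'n';
    {
        CollectionScope scope(fake_resume, fake_pause);
        g_trace += 'c';
    }
    g_trace += 'n';
    {
        CollectionScope scope(fake_resume, fake_pause);
        g_trace += 'c';
    }
    return g_trace;
}
```

One difference from test_api.cpp: the guard pauses again at the end of the second region, whereas the original leaves collection resumed until the program exits.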

--------------------------------------------------------------------------------------------------------------------------------------
// test_api_offload.cpp : Defines the entry point for the console application.
// Offload-mode example of the ITT Pause/Resume API.

#include <stdio.h>

#pragma offload_attribute (push, target (mic))

// The ITT header is only needed in the coprocessor-side compilation
#if defined(__MIC__) || defined(_MIC_)
#include "ittnotify.h"
#endif

// Busy loop that should appear in the collected data.
void foo_data_collected()
{
    //printf("foo_data_collected\n");
    for (int i = 0; i < 10000000L; i++);
}

// Busy loop that should NOT appear in the collected data.
void foo_data_not_collected()
{
    //printf("foo_data_not_collected\n");
    for (int i = 0; i < 10000000L; i++);
}

void offload_proc()
{
    //printf("start offload\n");
    // Assume that the data collector was paused in the user interface
#if defined(__MIC__) || defined(_MIC_)
    foo_data_not_collected();

    // Now resume data collection
    __itt_resume();
    foo_data_collected();

    // Pause data collection again
    __itt_pause();
    foo_data_not_collected();

    // Resume data collection again
    __itt_resume();
    foo_data_collected();
#endif
    //printf("stop offload\n");
}

#pragma offload_attribute(pop)

int main(int argc, char* argv[])
{
    //printf("start host\n");

    // Run offload_proc() on the coprocessor
    #pragma offload target (mic)
    offload_proc();

    //printf("stop host\n");
    // The user shouldn't see any data for foo_data_not_collected() in the result
    return 0;
}
