Looking for a link that teaches how to use VTune in native mode for Knights Corner.


Hi All,

I have just started learning to use VTune on MIC. I would appreciate your help: where can I find a document that explains how to use VTune to count events for programs running on MIC in native mode (not offload mode)?

Thank you in advance.

Susan


You can find the associated docs at http://software.intel.com/en-us/mic-developer

A recommended article is http://software.intel.com/en-us/articles/optimization-and-performance-tuning-for-intel-xeon-phi-coprocessors-part-1-optimization

For my experience of using VTune(TM) Amplifier XE 2013 on Xeon Phi(TM) processor, read my blog.

Is there anything related to VTune XE 2015?

I am having problems running an analysis in native mode with amplxe-gui.

 

Fabio

>>>My experience of using VTune(TM) Amplifier XE 2013 on Xeon Phi(TM) processor, read my blog>>>

Very interesting blog. Thanks for posting the link.

2015 Updates for MIC:

1. There is a change in how the collector is used on MIC; please see this blog.

2. (Good news) You can use event-based sampling with call stacks on MIC; see this blog.

Hope it helps.

 

 

As Peter pointed out, there have been some changes in the interfaces for accessing Intel coprocessors from VTune Amplifier XE with the 2015 version.  In amplxe-gui you should find a dialog in the project properties that should enable the program to collect data on both native and offload runs, and select which coprocessor in cases where there is more than one.  Fabio, can you provide more details regarding the problems you ran into?

Hello,

I am digging up this thread because I have the same problem.

I have really tried to make it run, but without success.

I'm working on a Xeon-based server running SLES 11 SP 2, with two 5110P MIC cards. My code is a hybrid MPI/OpenMP Fortran application. Thanks to people here, I managed to analyze hotspots for the MPI part (with OpenMP disabled) on the server itself (so no MIC involved).

Now I want to analyze the OpenMP part (with only one MPI process) running on the MIC, that is, in native mode.

I have the same account on the server and on the MIC, with a passwordless ssh connection, so the following command works directly:

ssh myserver-mic0 mpirun -np 1 /home/mic/mycode

For different numbers of OpenMP threads I get the output and the results; everything is fine.

To use VTune on it, here is what I tried from the server:

amplxe-cl -r /home/mic/TEST_MIC -target-system=mic-native -collect advanced-hotspots -- ssh myserver-mic0 mpirun -np 1 -genv OMP_NUM_THREADS 120 /home/mic/mycode

I get the following message:

amplxe: Using target: mic-native
amplxe: Collection started. To stop the collection, either press CTRL-C or enter from another console window: amplxe-cl -r /home/mic/TEST_MIC -command stop.
amplxe: Warning: To enable hardware event-base[d] sampling, VTune Amplifier has disabled the NMI watchdog timer. The watchdog timer will be re-enabled after collection completes.
amplxe: Internal Error

Finally, here are the versions of the Intel packages I use:

ifort --version
ifort (IFORT) 14.0.2 20140120
Copyright (C) 1985-2014 Intel Corporation.  All rights reserved.

mpirun --version
Intel(R) MPI Library for Linux* OS, Version 4.1.0 Build 20120831
Copyright (C) 2003-2012, Intel Corporation. All rights reserved.

amplxe-cl --version
Intel(R) VTune(TM) Amplifier XE 2015 (build 367959) Command Line Tool
Copyright (C) 2009-2014 Intel Corporation. All rights reserved.

 

 

If you have no problem running this application without VTune(TM) Amplifier XE 2015, is it possible that you didn't specify the MIC card number?

> amplxe-cl -r /home/mic/TEST_MIC -target-system=mic-native -collect advanced-hotspots -- ssh mys

Here is my case, which worked:

# amplxe-cl -collect advanced-hotspots -target-system=mic-native:0 --search-dir=. -- ssh mic0 /bin/mpirun -n 240 /root/mpi_pi.MIC

 

Hello,

Thank you for your message.

I did specify the MIC card. Maybe the layout of my previous message was not very readable, so here is the command again:

amplxe-cl -r /home/mic/TEST_MIC -target-system=mic-native -collect advanced-hotspots -- ssh myserver-mic0 mpirun -np 1 -genv OMP_NUM_THREADS 120 /home/mic/mycode

 

I'm sorry this is a bit long, but I need to lay things out.

Launched from my server, the following works:

ssh myserver-mic0 mpirun -genv OMP_NUM_THREADS 240 -n 1  ./mycode

 

Launched from the MIC card, the following works (the code runs correctly on the MIC):

mpirun -genv OMP_NUM_THREADS 240 -n 1  /home/mic/mycode

 

Launched from my server, the following fails:

mpirun -genv OMP_NUM_THREADS 240 -host myserver-mic0 -n 1 ./mycode

with the error message:

bash: /opt/intel/impi/4.1.0.024/intel64/bin/pmi_proxy: No such file or directory

On my server, the file reported as missing does exist:

\ls -al /opt/intel/impi/4.1.0.024/intel64/bin/pmi_proxy
-rwxr-xr-x 1 root root 928884 31 août   2012 /opt/intel/impi/4.1.0.024/intel64/bin/pmi_proxy

Of course, there's no /opt directory on the MIC card.

If I copy the file (taken from the mic subdirectory, not from the intel64 one)

scp /opt/intel/impi/4.1.0.024/mic/bin/pmi_proxy myserver-mic0:/usr/bin

to the MIC, it still fails.

If I create the directories on the MIC card

mkdir -p /opt/intel/impi/4.1.0.024/intel64/bin

and if, on the MIC, I copy:
cp /usr/bin/pmi_proxy /opt/intel/impi/4.1.0.024/intel64/bin

then the following command finally works (still launched from my server):

mpirun -genv OMP_NUM_THREADS 240 -host myserver-mic0 -n 1 ./mycode

I find this very strange, but it solves this problem.
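
The workaround above can be collected in one place. This is only a minimal dry-run sketch, assuming the paths and host name quoted in this post. The function name is hypothetical, and wrapping the on-card steps in ssh is an assumption (the post ran them directly on the card). It only prints the commands, so nothing is copied until you run them yourself:

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the commands of the workaround above.
#   $1 = Intel MPI install root on the host, e.g. /opt/intel/impi/4.1.0.024
#   $2 = host name of the card, e.g. myserver-mic0
print_pmi_proxy_workaround() {
    impi_root=$1; card_host=$2
    # Copy the MIC build of pmi_proxy (mic/bin, not intel64/bin) to the card.
    echo "scp $impi_root/mic/bin/pmi_proxy $card_host:/usr/bin"
    # Recreate on the card the host-side path named in the error message ...
    echo "ssh $card_host mkdir -p $impi_root/intel64/bin"
    # ... and place the MIC binary there (assumption: run via ssh).
    echo "ssh $card_host cp /usr/bin/pmi_proxy $impi_root/intel64/bin"
}

print_pmi_proxy_workaround /opt/intel/impi/4.1.0.024 myserver-mic0
```

Printing first makes it easy to review the three steps before touching the card's file system.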

 

Now back to amplxe, which is what I actually want to do: use it to analyze the behaviour of my application on the MIC card in native mode (for the moment).

I tried several ways to launch the code through the profiler.

amplxe-cl -r /home/mic/TEST_MIC -target-system=mic-native -collect advanced-hotspots -- mpirun -np 1 -host myserver-mic0 -genv OMP_NUM_THREADS 120 /home/mic/mycode

=> The code runs on the MIC but the profiler analyzes mpirun. Some values are strange:

Summary
-------
Elapsed Time:  707936109.048

instead of about 120 s. Moreover, there is no CPU usage data for any subroutine of my code.
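
For scale, 707936109 s is roughly 22 years, so a value like this is easy to catch automatically. This is a minimal sketch, assuming the summary text contains an "Elapsed Time:" line as quoted above. The function names and the 600 s threshold are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: reads summary text on stdin and prints the number
# found on the first "Elapsed Time:" line.
elapsed_time_from_summary() {
    awk '$1 == "Elapsed" && $2 == "Time:" { print $3; exit }'
}

# Hypothetical helper: succeeds if $1 (seconds) does not exceed $2 (seconds).
elapsed_is_plausible() {
    awk -v t="$1" -v max="$2" 'BEGIN { exit (t + 0 <= max + 0) ? 0 : 1 }'
}

# Summary text as quoted in this post.
summary='Summary
-------
Elapsed Time:  707936109.048'

t=$(printf '%s\n' "$summary" | elapsed_time_from_summary)
# 600 s is an arbitrary upper bound for a run expected to take about 120 s.
if elapsed_is_plausible "$t" 600; then
    echo "elapsed time looks plausible: $t s"
else
    echo "implausible elapsed time: $t s"
fi
```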

 

amplxe-cl -r /home/mic/TEST_MIC -target-system=mic-native -collect advanced-hotspots -- ssh myserver-mic0 ./mycode -genv OMP_NUM_THREADS 240 -n 1

=> The code runs on the MIC but the tool analyzes ssh.

 

amplxe-cl -r /home/mic/TEST_MIC -target-system=mic-native -collect advanced-hotspots -- ./mycode -host myserver-mic0 -genv OMP_NUM_THREADS 240 -n 1

=> The code runs on the MIC but the tool still analyzes ssh even if in "Analysis target" the studied application is said to be mine.

 

amplxe-cl -r /home/mic/TEST_MIC -target-system=mic-native -collect advanced-hotspots -- ./mycode -genv OMP_NUM_THREADS 240 -n 1

=> The code runs on the MIC but the tool still analyzes ssh even if in "Analysis target" the studied application is said to be mine.

 

So, for the moment, I have no other ideas.

 

I hate to put you in the position of guinea pig, but I've been trying to sort out the changes for coprocessor invocation in the new VTune Amplifier as well, and might have another incantation to try, but first, a question:

Does your host have an /etc/hosts entry for "mic0" (as opposed to only having myserver-mic0 defined)? If mic0 is a known host name, you might have some success with the following:

amplxe-cl -r /home/mic/TEST_MIC -target-system=mic-native:0 -collect advanced-hotspots mpirun -genv OMP_NUM_THREADS 240 -n 1  /home/mic/mycode

The "-target-system=mic-native:0" should be translated by amplxe-cl into an ssh run on the local mic0, so you shouldn't need to respecify the card with -host as above.
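
To answer the /etc/hosts question quickly, something like the following could be run on the host. This is only a sketch: the function name is hypothetical, and it simply looks for an exact host name or alias match in the file, ignoring comments:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the hosts file $1 has an entry whose
# host name or alias is exactly $2. Comments are ignored.
has_hosts_entry() {
    hosts_file=$1; name=$2
    awk -v name="$name" '
        /^[ \t]*#/ { next }                # skip comment lines
        {
            for (i = 2; i <= NF; i++) {    # field 1 is the IP address
                if ($i ~ /^#/) break       # stop at a trailing comment
                if ($i == name) found = 1
            }
        }
        END { exit found ? 0 : 1 }
    ' "$hosts_file"
}

if has_hosts_entry /etc/hosts mic0; then
    echo "mic0 is defined in /etc/hosts"
else
    echo "mic0 is not defined in /etc/hosts"
fi
```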

Hello,

As I understand it, you have two options for launching mpirun:

1. Launch it right on the card with:

mpirun -genv OMP_NUM_THREADS 240 -n 1  /home/mic/mycode

2. Launch it from the Xeon server host:

mpirun -genv OMP_NUM_THREADS 240 -host myserver-mic0 -n 1 ./mycode

Robert correctly pointed out the launch command for the first case:

amplxe-cl -r /home/mic/TEST_MIC -target-system=mic-native:0 -collect advanced-hotspots mpirun -genv OMP_NUM_THREADS 240 -n 1  /home/mic/mycode

For the second case we need to use the -target-system=mic-host-launch option, since the mpirun that invokes the MIC rank is launched from the host:

amplxe-cl -r /home/mic/TEST_MIC -target-system=mic-host-launch:0 -collect advanced-hotspots mpirun -genv OMP_NUM_THREADS 240 -host myserver-mic0 -n 1  /home/mic/mycode
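
The two command lines above differ only in the -target-system value and in whether -host is passed to mpirun. This is a minimal sketch that uses only the options quoted in this thread. The function name and its "card"/"host" mode names are hypothetical, and it prints the command rather than running it:

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the amplxe-cl command for one
# of the two launch modes discussed in this thread.
#   $1 = mode: "card" (mpirun started on the card) or
#              "host" (mpirun started on the host)
#   $2 = coprocessor index, e.g. 0
#   $3 = result directory
#   remaining arguments = the mpirun command line to profile
build_amplxe_cmd() {
    mode=$1; card=$2; result_dir=$3
    shift 3
    case "$mode" in
        card) target="mic-native:$card" ;;
        host) target="mic-host-launch:$card" ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    echo "amplxe-cl -r $result_dir -target-system=$target -collect advanced-hotspots $*"
}

# Case 1: mpirun started on the card.
build_amplxe_cmd card 0 /home/mic/TEST_MIC \
    mpirun -genv OMP_NUM_THREADS 240 -n 1 /home/mic/mycode
# Case 2: mpirun started on the host, with -host selecting the card.
build_amplxe_cmd host 0 /home/mic/TEST_MIC \
    mpirun -genv OMP_NUM_THREADS 240 -host myserver-mic0 -n 1 /home/mic/mycode
```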

Thanks & Regards, Dmitry

Thank you both for your messages.

@robert, the amplxe-cl command does not exist on the MIC. If I connect to the MIC and type your command, i.e.

amplxe-cl -r /home/mic/TEST_MIC -target-system=mic-native:0 -collect advanced-hotspots mpirun -genv OMP_NUM_THREADS 240 -n 1  /home/mic/mycode

all I get is:

-bash: amplxe-cl: command not found

 

 

 

@dmitry

I tried your command:

amplxe-cl -r /home/mic/TEST_MIC -target-system=mic-host-launch:0 -collect advanced-hotspots mpirun -genv OMP_NUM_THREADS 240 -host myserver-mic0 -n 1  /home/mic/mycode

It works, but none of the information relates to my application. Please look at the two pictures I attached. Moreover, the timeline is completely wrong because of the wrong elapsed time:

Elapsed Time:  707061185.207

 

 


If I attempt a command line run as suggested above:

]$ /opt/intel/vtune_amplifier_xe_2015.1.0.367959/bin64/amplxe-cl -r /home/mic/TEST_MIC -target-system=mic-native:0 -collect advanced-hotspots /home/mic/runffast.sh
amplxe: Using target: mic-native:0
amplxe: Error: Permission denied (publickey,password,keyboard-interactive).
amplxe: Error: Amplifier cannot detect Intel Xeon Phi coprocessor configuration.
amplxe: Fatal error: Cannot create analysis type. Check input parameters or reinstall the product.

 

The ability to set up password-less ssh was lost several MPSS versions ago, but I was still able to run from amplxe-gui successfully with the beta test version.

Now the amplxe-gui run completes "with warnings" but the only warning is "cannot locate debugging symbols for file /bin/bash"

The Summary pane says the result file is 1 MB. There is a "tasks by threads" window showing just 2 threads (118 threads show in the screen echo for the remote run) and an empty "tasks over time" window, but no "bottom-up" or similar window as in previous versions.

I repeated the k1om driver installation script which reported success and instructed me to restart mpss; this didn't change anything.

Due to expiration of my licenses, I am no longer permitted to use the beta version and apparently not any future update.
