VTune Amplifier XE 2011 installation on clusters

VTune™ Amplifier XE 2011¹ was primarily designed as a tool for profiling processes running on SMP systems, i.e. for threading-level and data-parallelism analysis, unlike Intel® Trace Analyzer and Collector, a tool designed for process-level parallelism analysis. However, using VTune Amplifier XE on clusters or other distributed environments that use an LDAP/NIS administration model and share file resources over NFS is as easy as using it on a local machine. The only trick is installing the kernel driver (the part of VTune Amplifier XE that enables hardware Event-Based Sampling analysis).

By default the installation program puts the product files under the /opt directory on the local file system. The top-level installation directory for the product (let’s use an alias for it: <INSTALL-DIR>) is /opt/intel/vtune_amplifier_xe_2011/. This can be changed, however, to any location in the local or shared file system that is accessible to users for reading. Within the installation directory you can find the sepdk directory, which contains all the driver files, including the source code.

Let’s consider a general distributed computing system, which can be either a cluster or a computing network with shared data storage resources. Typically, users have access to the share point with the installed programs, may write to their /home/<user_name> directory, and are allowed to execute programs on some nodes/machines in the network. Some disk storage space is usually available on the nodes as well, at least for the /tmp directory. (Pic. 1)

network1.jpg
Pic.1.

We will discuss specifics of the systems/network later on.

A typical product installation and usage schema in a distributed environment might proceed as follows.
The product is installed by an administrator on a share point available to all the users. The administrator installs and enables the kernel driver for specific nodes and user(s). The users launch VTune Amplifier XE on a node/machine, from the share point mounted on their systems, to analyze program/system behavior on that node/machine. Even though the program’s execution might be distributed among other nodes (via MPI or other libraries), a single instance of the tool collects performance data only on the node where it was launched. With some restrictions, you can even run multiple instances simultaneously on different nodes.

Here are the details for the administrator with examples of how to install VTune Amplifier XE 2011.

1. First, extract the installation package from the archive. Then install the components to the file system by executing the following commands:
tar -xzf <install-package>.tar.gz
cd vtune_amplifier_xe_<release>
./install.sh

2. During installation you may change the install directory and select the components to install. Make sure you are installing the product on a shared file system path accessible for reading by all expected users.

3. In addition, you may select the ‘Change Advanced Options’ menu item to configure the kernel driver installation options.
By default the kernel driver is installed on the current node.

3.1. While many OSes are officially supported (see the Release Notes document), not all are. If the OS is in the support list, a prebuilt driver should be available in the package – choose the [use pre-build driver] option in the ‘Driver install type’ sub-menu. If the OS is not in the support list, choose the [Build driver] option. Choose the [Driver kit files only] option if you will be building the driver yourself later.
3.2. Add to the access list the users who are allowed to run hardware EBS-based analysis on the current node. By default, the installer creates a group called “vtune”. Change the ‘Driver access group’ option in order to specify a different permission group. (A sketch of managing the group manually is shown after this list.)
Note: the specified group can be either a local group or a network group. If a network group name is required, it should already exist. The installation program searches for the network group; if it is not found, it creates a local group with the specified name.
3.3. If you need the kernel driver to be loaded immediately on the current system, set the ‘Load driver’ option to [yes].
3.4. If you need the kernel driver to be loaded on the current system every time it boots, set the ‘Install boot script’ option to [yes].
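
If you later need to manage the driver access group manually (for example, the default vtune group mentioned in 3.2), the standard Linux group commands are sufficient. Here is a minimal sketch for a local group; the group and user names are placeholders, and NIS/LDAP network groups are managed through your directory service instead:
# create the local group if it does not exist yet (run as root)
groupadd vtune
# add an existing user to the group; takes effect at the user's next login
usermod -a -G vtune jsmith
# verify the membership
groups jsmith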

4. You may want to change the driver installation schema to enable hardware EBS analysis on the rest of the nodes, in addition to or instead of the current node. You don’t need to install the entire product on each node, as it already resides on the shared file system. However, the driver must be installed on every node in the network where hardware EBS data is to be collected.

To do that, you need to enter each node and run the installation scripts located in the product installation directory.
Go to the directory with the driver (depending on whether the driver is prebuilt for a supported system or has been built by the installation program):
cd <INSTALL-DIR>/sepdk/prebuilt
or
cd <INSTALL-DIR>/sepdk/src

Run the install scripts:
./insmod-sep3 --group my_group
./boot-script --install --group my_group

Here, my_group is the user group or NIS group that should have access to hardware EBS data collection.

Note: the vtune group is used by default if the --group option is omitted. In that case, make sure the users are included in the vtune group.

The insmod-sep3 script loads the driver into the system on the current node. The boot-script script configures the driver boot script and then installs it in the appropriate system directory. For more details on the available options, run either script with the --help option.

Note: the described installation schema works only if the cluster or network is homogeneous, i.e. the nodes’ hardware configuration and OS are identical. If the machines in the network are not identical, you need to build and install the driver locally on each machine.
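
Repeating these steps by hand on every node of a large cluster is tedious, so the administrator may want to script them. Below is a minimal sketch, assuming passwordless SSH access as root to each node and a hypothetical nodes.txt file that lists one host name per line; <INSTALL-DIR> and my_group are placeholders as above:
# load the driver and install the boot script on every node listed in nodes.txt
for node in $(cat nodes.txt); do
    ssh root@$node "cd <INSTALL-DIR>/sepdk/prebuilt && ./insmod-sep3 --group my_group && ./boot-script --install --group my_group"
done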

5. Now the users from the specified group can use the tool, including hardware EBS analysis, on the nodes. To check whether the kernel driver is installed and loaded on a node, use the following command:
lsmod | grep sep
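
If the driver is loaded, lsmod lists the sampling modules. To check all nodes at once, the same hypothetical nodes.txt file and SSH access can be reused:
for node in $(cat nodes.txt); do
    echo "=== $node ==="
    ssh $node "lsmod | grep sep"
done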


Alternatively, if you want more control over the driver installation, you may proceed as follows.

1. Go to the unpacked product directory and run the install script:
./install.sh --SHARED_INSTALL

The --SHARED_INSTALL option skips the driver installation on the current machine. This is useful because users are expected to launch profiling on their own nodes, not necessarily on the main node or the node the administrator used for installation.

2. Even without the EBS driver, the product can be used for profiling with the predefined analysis types that do not rely on hardware Event-Based Sampling (that is, everything except profiles such as Lightweight Hotspots). Users launch the product from the shared drive, for example: /mnt/nfs/appsrvr/intel/vtune_amplifier_xe_2011/bin64/amplxe-cl
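
For example, a user could collect a user-mode Hotspots profile (which does not need the kernel driver) roughly as follows; the application name and the result path are placeholders, and the exact analysis type names available in your installation can be listed with amplxe-cl -help collect:
/mnt/nfs/appsrvr/intel/vtune_amplifier_xe_2011/bin64/amplxe-cl -collect hotspots -result-dir /home/jsmith/results/r001hs -- ./my_app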

3. Prepare the kernel driver for installation. The driver files are located in the <INSTALL-DIR>/sepdk directory. Choose one of the options below, depending on whether the current OS is supported:

3.1. The OS is in the support list. In this case a prebuilt driver should be located in the <INSTALL-DIR>/sepdk/prebuilt directory. Use this directory to install the driver.

3.2. The OS is not in the support list. You can build the driver for the current OS. Run the following commands:
cd <INSTALL-DIR>/sepdk/src
./build-driver -ni --install-dir=../prebuilt
The driver will be compiled and installed in the prebuilt directory.

If you need to build and install the driver in a custom directory, use the --install-dir option to specify the driver installation directory:
./build-driver --install-dir=/path-to-share/my_vtune_driver/
The my_vtune_driver directory should already exist in the path.

Then copy these scripts to the driver installation directory:
cp insmod-sep3 /path-to-share/my_vtune_driver/
cp rmmod-sep3 /path-to-share/my_vtune_driver/
cp boot-script /path-to-share/my_vtune_driver/
cd pax
cp insmod-pax /path-to-share/my_vtune_driver/pax
cp rmmod-pax /path-to-share/my_vtune_driver/pax
cp boot-script /path-to-share/my_vtune_driver/pax

See the <INSTALL-DIR>/sepdk/src/README.txt document for more details on building the driver, or run the script with the --help option for details on the available options.
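
Note that building the driver requires the kernel sources or headers matching the running kernel; README.txt lists the exact prerequisites. A quick generic check on a node might look like the following (package names differ between distributions, so treat this as an assumption to verify):
# the kernel build directory should exist for the running kernel
ls /lib/modules/$(uname -r)/build
# on RPM-based distributions the kernel-devel package typically provides it
rpm -q kernel-devel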

4. Now install the kernel driver on the selected nodes and add the users who are allowed to run EBS analysis to the access group.

Enter each node that is expected to be used for performance profiling and run the following commands from the shared directory where the appropriate driver is located, e.g. for the prebuilt driver:
cd <INSTALL-DIR>/sepdk/prebuilt
./insmod-sep3 --group my_group
./boot-script --install --group my_group

The insmod-sep3 script loads the driver into the system on the current node. The boot-script script configures the driver boot script and then installs it in the appropriate system directory. For more details on the available options, run either script with the --help option.

If needed, the driver can be unloaded and uninstalled on any node. To do that, enter the selected node and run the following commands from the directory where the appropriate driver is located, e.g. for the prebuilt driver:
cd <INSTALL-DIR>/sepdk/prebuilt
./rmmod-sep3
./boot-script --uninstall

5. Now the users that belong to my_group can run hardware EBS analysis on the nodes. Users may run either the command-line or the GUI version of the tool, depending on their display capabilities. Users are expected to set the result directory path within their home directory; by default the tool proposes the path path-to-user-home/intel/amplxe/Projects/project-name. Users may also direct analysis results to a local path, e.g. /tmp. With a very slow network connection, this can speed up data loading and processing when analyzing the collected results.
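
For instance, a hardware EBS collection that writes its result to local disk might look roughly like this; the analysis type name and the paths are illustrative (check amplxe-cl -help collect for the exact names in your installation):
/mnt/nfs/appsrvr/intel/vtune_amplifier_xe_2011/bin64/amplxe-cl -collect lightweight-hotspots -result-dir /tmp/jsmith/r002lh -- ./my_app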


Let’s consider a more specific case, which is typical for a cluster infrastructure. Users usually have no direct access to any node except one; let’s call it the “master node”, or node1. The only disk space available for writing is the user’s home directory (Pic. 2). The main idea of this configuration is that users keep all their data and software on the file system mounted on the master node and start their tasks using special scripts that rely on MPI mechanisms to dispatch the tasks to the other nodes.

network2.jpg
Pic.2.

Here the installation is not much different from the previous case. The administrator has to make sure the product can be launched on each node and that the kernel driver is installed and loaded on each node.
Note: the cluster has to be homogeneous.

The main difference is in how users run the performance collection on the nodes, since they cannot run the product on the nodes directly. In this case, users should use the scheduling system scripts to launch an analysis. E.g. with Intel MPI, the mpiexec script can be used on the master node to launch the profiling collector on the other nodes, specifying the user application to run as a parameter.
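
A rough sketch of such a launch with Intel MPI is shown below; the rank count, the application name, and the result path are placeholders, and in practice you may want to profile only selected ranks. Each collector instance gathers data on the node where it runs:
mpiexec -n 4 <INSTALL-DIR>/bin64/amplxe-cl -collect hotspots -result-dir /home/jsmith/results -- ./my_mpi_app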

The usage model of VTune Amplifier XE 2011 on clusters will be discussed in detail in a separate article.

¹ VTune is a trademark of Intel Corporation in the U.S. and other countries.
