Intel® Trace Analyzer and Collector

Understand MPI application behavior, quickly find bottlenecks, and achieve high performance for parallel cluster applications

  • Powerful MPI Communications Profiling and Analysis
  • Scalable - Low Overhead & Effective Visualization
  • Flexible to Fit Workflow – Compile, Link or Run

Intel® Trace Analyzer and Collector 9.0 is a graphical tool for understanding MPI application behavior, quickly finding bottlenecks, improving correctness, and achieving high performance for parallel cluster applications based on Intel architecture. Improve weak and strong scaling for small and large applications with Intel Trace Analyzer and Collector.

Benefits:

  • Visualize and understand parallel application behavior
  • Evaluate profiling statistics and load balancing
  • Analyze performance of subroutines or code blocks
  • Learn about communication patterns, parameters, and performance data
  • Identify communication hotspots
  • Decrease time to solution and increase application efficiency

MPI checking

  • A unique MPI Correctness Checker detects deadlocks, data corruption, and errors with MPI parameters, data types, buffers, communicators, point-to-point messages and collective operations.
  • The Correctness Checker allows the user to scale to extremely large systems and detect errors even among a large number of processes.

Interface and Displays

  • Intel® Trace Analyzer and Collector includes a full-color, customizable GUI with many drill-down view options.
  • The analyzer can rapidly unwind the call stack and use debug information to map instruction addresses to source code.
  • With both command-line and GUI interfaces, the user can additionally set up batch runs or do interactive debugging.

Scalability

  • Low overhead allows random access to portions of a trace, making it suitable for analyzing large amounts of performance data.
  • Thread safety lets you use event-based tracing on multithreaded MPI applications as well as on non-MPI threaded applications.

Instrumentation and Tracing

  • Low-intrusion instrumentation supports MPI applications with C, C++, or Fortran.
  • Intel Trace Analyzer and Collector automatically records performance data from parallel threads in C, C++, or Fortran.

What’s new

  • MPI Communications Profile Summary Overview
    • Quickly Understand Computation vs Communications
    • Identify which MPI communications are used most
    • Advice on where to start your analysis

  • Expanded Standards Support with MPI 3.0
    • Automated MPI Communications Analysis with Performance Assistant
    • Detect common MPI performance issues
    • Automated tips on potential solutions

Videos to help you get started.

Register for future Webinars


Previously recorded Webinars:

  • Increase Cluster MPI Application Performance with a "MPI Tune" Up
  • MPI on Intel® Xeon Phi™ coprocessor
  • Quickly discover performance issues with the Intel® Trace Analyzer and Collector 9.0 Beta

You can reply to any of the forum topics below by clicking on the title. Please do not include private information such as your email address or product serial number in your posts. If you need to share private information with an Intel employee, they can start a private thread for you.

mpd failed to boot
By Chao M. (3)
Hi all, when I try to run mpdboot on a single node (RHEL7), I get this error message:

    [root@hpc-test intel64]# mpd &
    [1] 3527
    [root@hpc-test intel64]# mpd_uncaught_except_tb handling:
      <type 'exceptions.IndexError'>: list index out of range
        /opt/intel/impi/5.0.2.044/intel64/bin/mpd  264  pin_Uni_num
            if list.index(list[i]) == i:
        /opt/intel/impi/5.0.2.044/intel64/bin/mpd  1449  pin_Cpuinfo
            info['cache1'] = pin_Uni_num(info['cache1_id'], info['lcpu'])
        /opt/intel/impi/5.0.2.044/intel64/bin/mpd  1658  run
            self.CpuInfo = pin_Cpuinfo(self.PinCase,self.Arch)
        /opt/intel/impi/5.0.2.044/intel64/bin/mpd  3676  <module>
            mpd.run()

I use I_MPI_CPUINFO=proc as a workaround, but I don't know what causes the error or how to fix it correctly. Thank you for your help. Thanks, Chao
Trying to use I_MPI_PIN_DOMAIN=socket
By William N. (3)
I'm running on an IBM cluster with nodes that have dual socket Ivy Bridge processors and 2 Nvidia K40 Tesla cards. I'm trying to run with 4 MPI ranks using Intel MPI 5 Update 2 with a single MPI rank for each socket. I'm trying to learn how to do this by using a simple MPI Hello World program that prints out the host name, rank and cpu ID. When I run with 2 MPI ranks, my simple program works as expected. When I run with 4 MPI ranks and use the mpirun that comes with Intel MPI, all 4 ranks run on the same node that I launched from.

I am doing this interactively and get a set of two nodes using the following command:

    qsub -I -l nodes=2,ppn=16 -q k20

I am using the following commands to run my program:

    source /opt/intel/bin/compilervars.sh intel64; \
    source /opt/intel/impi_latest/intel64/bin/mpivars.sh; \
    export I_MPI_DAPL_PROVIDER=ofa-v2-mlx4_0-1u; \
    /opt/intel/impi_latest/intel64/bin/mpirun -genv I_MPI_PIN=1 -genv I_MPI_PIN_DOMAIN=socket -n 4 hw_ibm_impi

If I use a different ...
Multiple Versions of Intel MPI Library Runtime on one machine
By adambruss (4)
Hi Intel, we appreciate all the time and effort you put into your products. We have a question about installing multiple versions of the Intel MPI Library Runtime Environment on one machine. How can one have multiple versions of the Intel MPI Library Runtime Environment installed on the same machine? We are asking because some versions of our software use 4.0 and some use 4.1. In the near future another version may use 5.0. There seems to be a conflict because the Windows services all use the same name. Thanks for the great products, Adam Bruss
How to initiate MPI in the Fortran subroutine?
By dingjun.chencmgl.ca (0)
Hi all, I am using Intel MPI and OpenMP. My application is a hybrid MPI and OpenMP Fortran application. Currently it is being tested on multicore computers. My question is: can I initiate MPI processes and run all MPI subroutines within a Fortran subroutine? The details are as follows:

    program xsamg2014
    implicit none
    ..........................
    /* Before calling xsamg, all codes are only OpenMP codes */
    /*OpenMP Fortran codes  */
    .....................................
    /* call xsamg to initial MPI processes
    call xsamg (.........)   // all MPI processes are with this subroutine xsamg(.....)
    /* After finish calling XSAMG, then all others codes are OpenMP codes ***/
    /*OpenMP Fortran codes*/
    end program xsamg2014

    subroutine xsamg(......, ....,.....)
    implicit none
    call  MPI_INIT()
    call MPI_COMM_SIZE(MPI_COMM_WORLD, count)
    call MPI_COMM_RANK(MPI_COMM_WORLD, myid)
    ........................................................
    /* doing  some computation work with MP...
mpi_type_vector three dimensional
By diedro (1)
Dear all, I have a three dimensional array AA(:,:,:) and I would like to send it, or at least part of it, from one CPU to another. The idea is to combine MPI_TYPE_VECTOR. This is my program. I do not understand it: sometimes it works and sometimes it does not. What do you think?

    program vector
    USE mpi
    IMPLICIT NONE
    integer SIZE_
    parameter(SIZE_=4)
    integer numtasks, rank, source, dest, tag, i, ierr
    real*4 AA(SIZE_,5,4), BB(SIZE_,5,4)
    integer stat(MPI_STATUS_SIZE), rowtype,colrowtype
    !Fortran stores this array in column major order
    AA=0.
    AA(1,1,1)= 1.0
    AA(1,1,2)= 4.0
    AA(1,1,3)= 10.0
    AA(1,1,4)= 33.0
    AA(2,1,1)= 10.0
    AA(2,1,2)= 40.0
    AA(2,1,3)= 100.0
    AA(2,1,4)= 330.0
    CALL MPI_INIT(ierr)
    CALL MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
    CALL MPI_COMM_SIZE(MPI_COMM_WORLD, numtasks, ierr)
    CALL MPI_TYPE_VECTOR(5, 5, 5, MPI_REAL, rowtype, ierr)
    CALL MPI_TYPE_COMMIT(rowtype, ierr)
    CALL MPI_TYPE_VECTOR(4, 4, 4, rowtype, colrowtype, ierr)
    CALL MPI_TYPE_COMMIT(colro...
Can't install Parallel Studio XE 2015 Update1 on Scientific Linux 6.6
By Zhenzhen B. (3)
When I install Parallel Studio XE 2015 Update1 on Scientific Linux 6.6, the message shows:

    Missing optional prerequisites
    -- Intel(R) MPI Library, Development Kit 5.0 Update 1 for Linux* OS: Unsupported OS
    -- Intel(R) Trace Analyzer and Collector 9.0 Update 1 for Linux* OS: Unsupported OS
    -- Intel(R) VTune(TM) Amplifier XE 2015: Unsupported OS
    -- Intel(R) Inspector XE 2015: Unsupported OS
    -- Intel(R) Advisor XE 2015: Unsupported OS
    -- Intel(R) Parallel Studio XE 2015 Composer Edition for C++ Linux*: Unsupported OS
    -- Intel(R) Parallel Studio XE 2015 Composer Edition for Fortran Linux*: Unsupported OS
    -- Intel(R) Parallel Studio XE 2015 Composer Edition for Fortran and C++ Linux*: Unsupported OS

And when I type a sele...
Can't install Parallel Studio XE 2015 Update1 on CentOS 6.6
By Satoshi Ohshima (8)
Hi, I'm trying to install (upgrade to) Parallel Studio XE 2015 Update1. However, the installer causes a segmentation fault. Do you know of any solutions? The target system has a Xeon E5-2697 v2 and CentOS 6.6 installed. Parallel Studio XE 2015 (Initial Release) is already installed without any errors. The following text is the complete output of "./install.sh". Line 582 contains only "fi".

    --------------------------------------------------------------------------------
    Initializing, please wait...
    --------------------------------------------------------------------------------
    ./install.sh: line 582: 5226 Segmentation fault "$pset_engine_cli_binary" --TEMP_FOLDER="$temp_folder" --PSET_PWD="$runningdir" $params $@

Kind regards
MPI Library Runtime Environment 4.0
By Pablo V. (1)
Hello, I am working through a remote desktop on Cornell University servers and I have no internet connection on that desktop. I am using Visual Studio 2008 with Intel Visual Fortran Composer XE 2011, and supposedly MPI Library Runtime Environment 4.0 is already installed. I can't find the files msmpi.lib or impi.lib, or the include path. Nevertheless, I found the folder with other files like mpichi2mpi.dll, impi.dll, impimt.dll, mpiexec.ex, wmpiexec.exe, etc. The package ID listed in the support file is w_mpi_rt_p_4.0.1.007. How can I include the MPI .dll files in my code? Thanks,
  • What are some key things I can learn about my program using Intel® Trace Analyzer and Collector?
  • The Intel Trace Analyzer and Collector is a graphical tool used primarily for MPI-based programs. It helps you understand your application's behavior across its full runtime. It can help find temporal dependencies in your code and communication bottlenecks across the MPI ranks. It also checks the correctness of your application and points you to potential programming errors, buffer overlaps, and deadlocks.

  • Will Intel Trace Analyzer and Collector only work with Intel MPI Library?
  • No, the Intel Trace Analyzer and Collector supports all major MPICH2-based implementations. If you're wondering whether your MPI library can be profiled using the Intel Trace Analyzer and Collector, you can run a simple ABI-compatibility check by compiling the provided mpiconstants.c file and comparing the values it reports with those listed in the Intel Trace Collector Reference Guide.

  • Can Intel Trace Analyzer and Collector be used on applications for Intel® Many Integrated Core Architecture (Intel® MIC Architecture)?
  • Yes, Intel MIC Architecture is fully supported by the Intel Trace Analyzer and Collector.

  • What file and directory permissions are required to use Intel Trace Analyzer and Collector?
  • You do not need to install special drivers, kernels, or acquire extra permissions. Simply install the Intel Trace Analyzer and Collector in the $HOME directory and link it with your application of choice from there.

  • Should I recompile/relink my application to collect information?
  • It depends on your application. For Windows* OS, you have to relink your application by using the -trace link-time flag.

    For Linux* OS (and if your application is dynamically linked), you do not need to relink or recompile. Simply use the -trace option at runtime (for example: mpirun -trace).
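As a rough sketch of the Linux case, the script below builds the `mpirun -trace` command line and runs it only when both `mpirun` and the program are present. The program name `./my_app` and the rank count of 4 are hypothetical placeholders, not values taken from this FAQ.

```shell
#!/bin/sh
# Sketch only: ./my_app and NP=4 are hypothetical placeholders.
APP=./my_app
NP=4

# -trace asks the launcher to load the trace collector at run time,
# so a dynamically linked binary needs no relink or recompile.
CMD="mpirun -trace -n $NP $APP"

if command -v mpirun >/dev/null 2>&1 && [ -x "$APP" ]; then
    $CMD
else
    # Launcher or program not found: just show what would have been run.
    echo "would run: $CMD"
fi
```

If the run succeeds, the collector should leave a trace in Structured Tracefile Format (STF) that can be opened in the Intel Trace Analyzer; the exact file name and location depend on your configuration.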

  • How do I control which part of my application should be profiled?
  • The Intel Trace Collector provides several options to control the data collection. By default, only information about MPI calls is collected. If you'd like to filter which MPI calls should be traced, create a configuration file and set the VT_CONFIG environment variable.

    If you'd like to expand the information collected beyond MPI and include all user-level routines, recompile your application with the -tcollect switch available as part of the Intel® Compilers. In this case, Intel Trace Collector will gather information about all routines in the application, not just MPI. You can similarly filter this via the -tcollect-filter compiler option.

    If you'd like to be explicit about which parts of the code should be profiled, use the Intel Trace Collector API calls. You can manually turn tracing on and off via a quick API call.

    For more information on all of these methods, refer to the Intel Trace Collector Reference Guide.
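The configuration-file approach might look like the sketch below, which writes a small filter file and points `VT_CONFIG` at it. The file name and the filter rules are illustrative assumptions, and the directive names (`ACTIVITY`, `SYMBOL`) are written from memory of the Intel Trace Collector Reference Guide, so check them against the guide before relying on them.

```shell
#!/bin/sh
# Sketch only: the file name and the filter rules are illustrative.
# Directive names (ACTIVITY, SYMBOL) should be verified against the
# Intel Trace Collector Reference Guide before use.
CONF="${TMPDIR:-/tmp}/itc_filter.conf"

cat > "$CONF" <<'EOF'
# Turn off tracing for all MPI functions...
ACTIVITY MPI OFF
# ...then re-enable only routines whose names match MPI_*send.
SYMBOL MPI_*send ON
EOF

# The collector reads its configuration from the file named by VT_CONFIG.
export VT_CONFIG="$CONF"
echo "VT_CONFIG=$VT_CONFIG"
```

The variable has to be visible to every MPI rank, so export it in the environment from which the job is launched (or pass it through your launcher's environment options).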

  • What file format is the trace data collected in?
  • Intel Trace Collector stores all collected data in Structured Tracefile Format (STF), which allows for better scalability across both time and processes. For more details, refer to the "Structured Tracefile Format" section of the Intel Trace Collector Reference Guide.

  • Can I import or export trace data to/from Intel Trace Analyzer and Collector?
  • Yes, you can export the data from any of the Profile charts (Function Profile, Message Profile, and Collective Operations Profile) as part of the Intel Trace Analyzer interface. To do this, open one of these profiles in the GUI, right-click to bring up the Context Menu, and select the "Export Data" option. The data will be saved in simple text format for easy reading.

    At a separate level, you can save your current working Intel Trace Analyzer environment via the Project Menu. If you choose to "Save Project", your current open trace view and associated charts will be recorded as they are open on your screen. You can later choose to "Load Project" from this same menu, which will bring up a previously-saved session.

  • What size MPI application can I analyze with Intel Trace Analyzer and Collector?
  • It depends on how large or complex your application is, how many MPI calls you are making, and for how long you are running. There are no internal limitations on the size of the MPI job, but there are plenty of external ones. The main constraints are how much memory is available on the system (per core) for the application, the MPI library, and the Intel Trace Collector processes, and how much disk space is available. Any additional flags enabled (for example, storing call stack and source code locations) increase the size of the trace file. Filtering out unimportant information is always a good way to reduce trace file size.

  • How can I control the amount of data collected to a reasonable amount? What is a reasonable amount?
  • Each application is different in terms of the profiling data it can provide. The longer an application runs, and the more MPI calls it makes, the larger the STF files will be. You can filter some of the unnecessary information out by applying appropriate filters (see "How do I control which part of my application should be profiled?" above for more details, or check out some tips on Intel Trace Collector Filtering).

    Additionally, you can be restricted by the resources allocated to your account; consult your cluster administrator about quotas and recommendations.

  • How can I analyze the collected information?
  • Once you have collected the trace data, you can analyze it via the Graphical Interface called the Intel Trace Analyzer. Simply call the command ($ traceanalyzer) or double-click on the Intel Trace Analyzer icon and navigate to your STF files via the File Menu.

    You can get started by opening up the Event Timeline chart (under the Charts Menu) and zooming in at an appropriate level.

    Check out the Detecting and Removing Unnecessary Serialization Tutorial for ideas on how to get started. For details on all Intel Trace Analyzer functionality, refer to the Intel Trace Analyzer Reference Guide.

  • Can I use Intel Trace Analyzer and Collector with Intel® VTune™ Amplifier XE, Intel® Inspector XE, or other analysis tools?
  • While these tools would collect information separate from each other, in their own format, it's easy enough to use the Intel VTune Amplifier XE and Intel Inspector XE tools under an MPI environment. Check each tool's respective User's Guide for more info on Viewing Collected MPI Data.

    You can use tools such as Intel VTune Amplifier XE and Intel Inspector XE for node-level analysis, and use the Intel Trace Analyzer and Collector for cluster-level analysis.
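For the node-level side, one possible shape of such a run is sketched below: each MPI rank is started under the VTune Amplifier XE command-line collector. The program name `./my_app`, the rank count, the result directory name, and the choice of the `hotspots` analysis are all assumptions made for illustration; check the tool's User's Guide for the options that apply to your version.

```shell
#!/bin/sh
# Sketch only: ./my_app, NP=4 and the result directory are hypothetical.
APP=./my_app
NP=4
RESULT_DIR=vtune_result

# Each rank is launched under the VTune Amplifier XE command-line
# collector (amplxe-cl), which gathers node-level data per rank.
CMD="mpirun -n $NP amplxe-cl -result-dir $RESULT_DIR -collect hotspots -- $APP"

if command -v mpirun >/dev/null 2>&1 \
   && command -v amplxe-cl >/dev/null 2>&1 \
   && [ -x "$APP" ]; then
    $CMD
else
    # Tools or program not found: just show what would have been run.
    echo "would run: $CMD"
fi
```

The results of such a run are stored in VTune's own format and are viewed in VTune, separately from any STF trace collected for cluster-level analysis.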


Get Help or Advice

Search Support Articles
Forums - The best place for timely answers from our technical experts and your peers. Use it even for bug reports.
Support - For secure, web-based, engineer-to-engineer support, visit our Intel® Premier Support web site. Intel Premier Support registration is required.
Download, Registration and Licensing Help - Specific help for download, registration, and licensing questions.

Resources

Release Notes - View Release Notes online!
Intel® Trace Analyzer and Collector Product Documentation - View documentation online!
Documentation for other software products
