Debugging Intel® Xeon Phi™ Applications on Windows* Host

Published: 08/25/2015, Last Updated: 08/25/2015


Intel® Xeon Phi™ coprocessor is a product based on the Intel® Many Integrated Core Architecture (Intel® MIC). Intel® offers a debug solution for this architecture that can debug applications running on an Intel® Xeon Phi™ coprocessor.

There are many reasons for the need of a debug solution for Intel® MIC. Some of the most important ones are the following:

  • Developing native Intel® MIC applications is as easy as for IA-32 or Intel® 64 hosts. In most cases they just need to be cross-compiled (/Qmic).
    Yet, Intel® MIC Architecture is different to host architecture. Those differences could unveil existing issues. Also, incorrect tuning for Intel® MIC could introduce new issues (e.g. alignment of data, can an application handle more than hundreds of threads?, efficient memory consumption?, etc.)
  • Developing offload enabled applications induces more complexity as host and coprocessor share workload.
  • General lower level analysis, tracing execution paths, learning the instruction set of Intel® MIC Architecture, …

Debug Solution for Intel® MIC

For Windows* host, Intel offers a debug solution, the Intel® Debugger Extension for Intel® MIC Architecture Applications. It supports debugging offload enabled application as well as native Intel® MIC applications running on the Intel® Xeon Phi™ coprocessor.

How to get it?

To obtain Intel® Debugger Extension for Intel® MIC Architecture on Windows* host, you need the following:

Debug Solution as Integration

Debug solution from Intel® based on GNU* GDB:

  • Full integration into Microsoft Visual Studio*, no command line version needed
  • Available with Intel® Composer XE 2013 SP1 and later
    (Intel® Parallel Studio XE Composer Edition is the successor)

Pure native debugging on the coprocessor is also possible by using Intel’s version of GNU* GDB for the coprocessor. This is covered in the following article for Linux* host:

Why integration into Microsoft Visual Studio*?

  • Microsoft Visual Studio* is established IDE on Windows* host
  • Integration reuses existing usability and features
  • Fortran support added with Intel® Parallel Studio XE Composer Edition for Fortran (former Intel® Fortran Composer XE)

Components Required

The following components are required to develop and debug for Intel® MIC Architecture:

  • Intel® Xeon Phi™ coprocessor
  • Windows* Server 2008 RC2, Windows* 7 or later
  • Microsoft Visual Studio* 2012 or later
    Support for Microsoft Visual Studio* 2013 was added with Intel® Composer XE 2013 SP1 Update 1. Microsoft Visual Studio* 2015 is supported with Intel® Parallel Studio XE 2016 Composer Edition and Intel® Parallel Studio XE 2015 Composer Edition Update 4, and later.
  • Intel® MPSS 3.1 or later
  • C/C++ development:
    Intel® C++ Composer XE 2013 SP1 for Windows* or later
  • Fortran development:
    Intel® Fortran Composer XE 2013 SP1 for Windows* or later

Configure & Test

It is crucial to make sure that the coprocessor setup is correctly working. Otherwise the debugger might not be fully functional.

Setup Intel® MPSS:

  • Follow Intel® MPSS readme-windows.pdf for setup
  • Verify that the Intel® Xeon Phi™ coprocessor is running

Before debugging applications with offload extensions:

  • Use official examples from:
    C:\Program Files (x86)\IntelSWTools\samples_2016\en
  • Verify that offloading code works

It is crucial to make sure that the coprocessor setup is correctly working. Otherwise the debugger might not be fully functional.

Prerequisite for Debugging

Debugger integration for Intel® MIC Architecture only works when debug information is being available:

  • Compile in debug mode with at least the following option set:
    /Zi (compiler) and /DEBUG (linker)
  • Optional: Unoptimized code (/Od) makes debugging easier
    (due to removed/optimized away temporaries, etc.)
    Visual Studio* Project Properties (Debug Information & Optimization)

Applications can only be debugged in 64 bit

  • Set platform to x64
  • Verify that /MACHINE:x64 (linker) is set!
    Visual Studio* Project Properties (Machine)

Debugging Applications with Offload Extension

Start Microsoft Visual Studio* IDE and open or create an Intel® Xeon Phi™ project with offload extensions. Examples can be found in the Samples directory of Intel® Parallel Studio XE Composer Edition (former Intel® Composer XE), that is:

C:\Program Files (x86)\IntelSWTools\samples_2016\en

  • compiler_c\psxe\ or
  • compiler_f\psxe\

We’ll use intro_SampleC from the official C++ examples in the following.

Compile the project with Intel® C++/Fortran Compiler.

Characteristics of Debugging

  • Set breakpoints in code (during or before debug session):
    • In code mixed for host and coprocessor
    • Debugger integration automatically dispatches between host/coprocessor
  • Run control is the same as for native applications:
    • Run/Continue
    • Stop/Interrupt
    • etc.
  • Offloaded code stops execution (offloading thread) on host
  • Offloaded code is executed on coprocessor in another thread
  • IDE shows host/coprocessor information at the same time:
    • Breakpoints
    • Threads
    • Processes/Modules
    • etc.
  • Multiple coprocessors are supported:
    • Data shown is mixed:
      Keep in mind the different processes and address spaces
    • No further configuration needed:
      Debug as you go!

Setting Breakpoints

Debugging Applications with Offload Extension - Setting Breakpoints

Note the mixed breakpoints here:
The ones set in the normal code (not offloaded) apply to the host. Breakpoints on offloaded code apply to the respective coprocessor(s) only.
The Breakpoints window shows all breakpoints (host & coprocessor(s)).

Start Debugging

Start debugging as usual via menu (shown) or <F5> key:
Debugging Applications with Offload Extension - Start Debugging

While debugging, continue till you reach a set breakpoint in offloaded code to debug the coprocessor code.

Thread Information

Debugging Applications with Offload Extension - Thread Information

Information of host and coprocessor(s) is mixed. In the example above, the threads window shows two processes with their threads. One process comes from the host, which does the offload. The other one is the process hosting and executing the offloaded code, one for each coprocessor.

Additional Requirements

For debugging offload enabled applications additional environment variables need to be set:

  • Intel® MPSS 2.1:

  • Intel® MPSS 3.*:

Set those variables before starting Visual Studio* IDE!

Those are currently needed but might become obsolete in the future. Please be aware that the debugger cannot and should not be used in combination with Intel® VTune™ Amplifier XE. Hence disabling SEP (as part of Intel® VTune™ Amplifier XE) is valid. The watchdog monitor must be disabled because a debugger can stop execution for an unspecified amount of time. Hence the system watchdog might assume that a debugged application, if not reacting anymore, is dead and will terminate it. For debugging we do not want that.

Do not set those variables for a production system!

Debugging Native Coprocessor Applications


Create a native Intel® Xeon Phi™ coprocessor application and transfer & execute the application to the coprocessor target:

  • Use micnativeloadex.exe provided by Intel® MPSS for an application C:\Temp\mic-examples\bin\myApp, e.g.:

    > "C:\Program Files\Intel\MPSS\bin\micnativeloadex.exe" "C:\Temp\mic-examples\bin\myApp" -d 0
  • Option –d 0 specifies the first device (zero based) in case there are multiple coprocessors per system
  • The application is executed directly after transfer

micnativeloadex.exe transfers the specified application to the specified coprocessor and directly executes it. The command itself will be blocked until the transferred application terminates.
Using micnativeloadex.exe also takes care about dependencies (i.e. libraries) and transfers them, too.

Other ways to transfer and execute native applications are also possible (but more complex):

  • NFS
  • FTP
  • etc.

Debugging native applications with Start Visual Studio* IDE is only possible via Attach to Process…:

  • micnativeloadex.exe has been used to transfer and execute the native application
  • Make sure the application waits till attached, e.g. by:
    		static int lockit = 1;
    		while(lockit) { sleep(1); }
  • After having attached, set lockit to 0 and continue.
  • No Visual Studio* solution/project is required.

Only one coprocessor at a time can be debugged this way.


Open the options via TOOLS/Options… menu:Debugging Native Coprocessor Applications - Configuration

It tells the debugger extension where to find the binary and sources. This needs to be changed every time a different coprocessor native application is being debugged.

The entry solib-search-path directories works the same as for the analogous GNU* GDB command. It allows to map paths from the build system to the host system running the debugger.

The entry Host Cache Directory is used for caching symbol files. It can speed up lookup for big sized applications.


Open the options via TOOLS/Attach to Process… menu:Debugging Native Coprocessor Applications - Attach to Process...

Specify the Intel(R) Debugger Extension for Intel(R) MIC Architecture. Set the IP and port the GDBServer should be executed with. The usual port for GDBServer is 2000 but we recommend to use a non-privileged port (e.g. 16000).
After a short delay the processes of the coprocessor card are listed. Select one to attach.

Note Checkbox Show processes from all users does not have a function for the coprocessor as user accounts cannot be mapped from host to target and vice versa (Linux* vs. Windows*).


More information can be found in the official documentation from Intel® Parallel Studio XE Composer Edition:
C:\Program Files (x86)\IntelSWTools\documentation_2016\en\debugger\ps2016\get_started.htm

Additional Resources

Intel® Xeon Phi™ Coprocessor developer zone

Intel® Many Integrated Core Architecture Forum

Product and Performance Information


Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804