Parallel Computing

forwarding technique

Hi, I have a question about the pipeline of the Xeon Phi. That is, does the Xeon Phi pipeline support the forwarding (bypassing) technique? If it does, there should not be any stalls for instructions with RAW dependences (assuming the instructions access registers only, with no memory access).

(1) ADD R1, R2, R3    // write the result into register R1

(2) MUL R5, R1, R4    // read the content of register R1

In theory, if the pipeline supports the forwarding technique, there should not be any stalls between instruction (1) and instruction (2).


Best Known Method: Avoid heterogeneous precision in control flow calculations

Best Known Method

Running an MPI program in symmetric mode on an Intel® Xeon® host and an Intel® Xeon Phi™ coprocessor may deadlock in specific cases because of heterogeneous precision in replicated control-flow calculations. The advice is to determine the control flow on a single master MPI process only.

The Issue

Intel MPI applications can be executed on multiple combinations of Intel Xeon processors and Intel Xeon Phi coprocessors. Four different execution models are supported:

  • Host-only: all MPI ranks run on the Intel Xeon host.
  • Offload: MPI ranks run on the host and offload parts of the computation to the coprocessor.
  • Coprocessor-only (native): all MPI ranks run on the Intel Xeon Phi coprocessor.
  • Symmetric: MPI ranks run on both the host and the coprocessor.
How To Force a Kernel Dump


    I have been working on MIC kernel modules and had kernel crash dumps working for a while. After upgrading MPSS to mpss_gold_update_3, somehow the vmcore can no longer be generated. Does anyone know a good way to force a kernel dump? My kmod is hanging and I really want to know what the various kernel threads have been doing.

    Thanks, Wendy

    dgemm: very slow with rectangular matrices

    I need to calculate the products of some relatively small matrices. I've written a test program (attached, hopefully) which demonstrates several performance problems with MKL's dgemm function on Phi. The matrices are tall and narrow, which seems to be particularly bad for Phi. (All of my dimensions are multiples of 64 bytes, so I don't think alignment is the culprit.)

    New white paper: in-place multithreaded transposition with common code for CPU and MIC

    A few months ago I posted about issues with producing a matrix transposition code that works well on Xeon Phi. Since then, I did more homework and improved the code to yield a satisfactory 113 GB/s transposition rate on 7110P (67% of the STREAM copy bandwidth).

    Intel Premier Support is upgrading to a new version

    Hi everyone,

    I'm very happy to announce that a new version of Intel® Premier Support is about to be released. For details, check out the New Intel Premier Support.

    Please expect some downtime: Thursday, August 15th ~6:00pm PDT to Sunday, August 18th ~5:00 PDT. During this period, please use the forums for issue submission.


    Cannot update the Flash of MIC to the latest version

    Hello everybody~

    I got the MIC card in January 2013, but I only finally set up a server to use it a few days ago. However, when I tried to install the latest MPSS (mpss_gold_update_3-2.1.6720-16), I ran into a problem. Finishing the MPSS installation requires updating the SMC & Flash of the MIC, and to update the SMC & Flash you should meet the following requirements:
