Hi, I have a question about the Xeon Phi pipeline: does it support forwarding (bypassing)? If it does, there should be no stalls between instructions with a RAW (read-after-write) dependency, assuming the instructions access only registers and not memory.

(1) ADD R1, R2, R3    // write the result into register R1

(2) MUL R5, R1, R4    // read the content of register R1

In theory, if the pipeline supports forwarding, there should be no stalls between instruction (1) and instruction (2).
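To make that expectation concrete, here is a minimal sketch of the stall count for a RAW pair. It assumes the textbook five-stage in-order pipeline (IF, ID, EX, MEM, WB) with a single-cycle producer and a register file that is written in the first half of WB and read in the second half of ID. This is not a model of the actual Xeon Phi pipeline; the stage list and the `raw_stalls` helper are hypothetical names used only for illustration.

```python
# Textbook five-stage pipeline (an assumption, not the Xeon Phi pipeline).
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def raw_stalls(forwarding: bool, distance: int = 1) -> int:
    """Stall cycles for a consumer issued `distance` instructions after its producer.

    Cycles are counted relative to the producer's IF stage (cycle 0).
    """
    if forwarding:
        # The producer's result exists at the end of its EX cycle and is
        # bypassed straight to the consumer's EX input in the next cycle.
        ready = STAGES.index("EX") + 1
        needed = STAGES.index("EX")
    else:
        # Without forwarding the consumer reads the register file in ID,
        # and the value is only there once the producer reaches WB
        # (write in the first half of the cycle, read in the second half).
        ready = STAGES.index("WB")
        needed = STAGES.index("ID")
    # Without stalls, the consumer reaches stage `needed` in cycle
    # distance + needed; any shortfall has to be covered by stall cycles.
    return max(0, ready - (distance + needed))


if __name__ == "__main__":
    print("with forwarding:   ", raw_stalls(True))   # ADD then MUL, back to back
    print("without forwarding:", raw_stalls(False))
```

Under these assumptions the back-to-back ADD/MUL pair above needs 0 stall cycles with forwarding and 2 without. A real core can still stall when the producer has a multi-cycle latency, so the result says nothing definite about Xeon Phi itself.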


Best Known Method: Avoid heterogeneous precision in control flow calculations

Best Known Method

Running an MPI program in symmetric mode on an Intel® Xeon® host and an Intel Xeon Phi™ coprocessor may deadlock in specific cases because replicated control flow calculations can be evaluated with different precision on the two architectures. The advice is to determine the control flow on one master MPI process only.
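The following minimal sketch illustrates the idea without MPI. It assumes, purely for illustration, that one rank evaluates a branch condition in double precision and another evaluates the same condition in single precision; which side differs on real hardware, and why, is not modelled here. All function names are hypothetical. In a real MPI program the shared decision would be distributed with a broadcast such as `MPI_Bcast`; the dictionary below only stands in for that step.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (IEEE double) to the nearest IEEE single."""
    return struct.unpack("f", struct.pack("f", x))[0]


def needs_another_step_f64(step: float, limit: float) -> bool:
    """Branch condition evaluated in double precision."""
    return step * 3 > limit


def needs_another_step_f32(step: float, limit: float) -> bool:
    """Same condition with every value rounded to single precision."""
    return to_f32(to_f32(step) * 3) > to_f32(limit)


def replicated_decisions(step: float, limit: float) -> dict:
    """Problematic pattern: every rank evaluates the condition itself."""
    return {
        "rank0": needs_another_step_f64(step, limit),  # assumed double
        "rank1": needs_another_step_f32(step, limit),  # assumed single
    }


def master_decisions(step: float, limit: float) -> dict:
    """Recommended pattern: rank 0 decides, the others adopt its result."""
    decision = needs_another_step_f64(step, limit)
    # Stand-in for broadcasting `decision` from rank 0 to all ranks.
    return {"rank0": decision, "rank1": decision}


if __name__ == "__main__":
    print("replicated:", replicated_decisions(0.1, 0.3))
    print("master:    ", master_decisions(0.1, 0.3))
```

With `step = 0.1` and `limit = 0.3`, the replicated version gives `True` on rank0 and `False` on rank1. If the next iteration contains a collective call, rank0 would enter it while rank1 would not, which is the deadlock pattern described above. The master version yields the same decision on every rank by construction.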

The Issue

Intel MPI applications can be executed on multiple combinations of Intel Xeon processors and Intel Xeon Phi coprocessors. Four different execution models are supported:

