Hi everybody,
I would be very obliged if someone could help me with this. Suppose, for simplicity, that there is a parallel loop and two matrices A and B, both of dimension m*n, and that we have P processors. Matrix A is distributed row-wise and B is distributed column-wise over the P processors. Can you suggest an elegant and fast way to implement the following pseudo-code using MPI routines (such as MPI_Gather, MPI_Type_struct, ...) in C?
for k=1,...,KMAX // parallel loop on all P processors
// some parallel calculation on B to produce new results
// dim A = dim B = m*n
// A is distributed row-wise on P processors, for example in local_A
// B is distributed column-wise on P processors, for example in local_B
A=B // ???? how to do this fast in parallel, using MPI
For example, suppose that P=3 (p0, p1, p2) and m=n=4, and note that the first row of A and the first column of B are stored on p0, ... (see the following diagram). Please note that each process stores a different amount of A and B, and that we want a direct assignment, i.e. a(i,j)=b(i,j) (no transpose). The bottleneck is that A=B has to be set many times inside the loop, while each process holds a different amount of A and B.
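To make the question more concrete, here is a minimal sketch of one possible approach, not a tested solution: treat A=B as an all-to-all exchange, where process r sends to process q the part of its columns of B that falls in q's rows of A. The sketch only computes the count and displacement arrays that a call to MPI_Alltoallv would take, plus the unpacking step on the receiving side. It rests on assumptions that are not in the description above: rows and columns are split into near-equal contiguous blocks (lower ranks get the extra ones), local_B is stored row-major as m x (local columns), local_A is stored row-major as (local rows) x n, and the elements are doubles. All function names are hypothetical.

```c
#include <assert.h>

/* Number of rows (or columns) owned by `rank` when `total` of them are
   split into near-equal contiguous blocks over `nprocs` ranks.
   ASSUMPTION: this block layout is illustrative, not taken from the post. */
int block_count(int total, int nprocs, int rank) {
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);
    return total / nprocs + (rank < total % nprocs ? 1 : 0);
}

/* Global index of the first row (or column) owned by `rank`. */
int block_start(int total, int nprocs, int rank) {
    int base = total / nprocs, extra = total % nprocs;
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);
    return rank * base + (rank < extra ? rank : extra);
}

/* Fill the four arrays (each of length nprocs) in the form expected by
   MPI_Alltoallv, counted in elements.
   Send side: local_B is row-major with my_cols columns, so the rows that
   belong to rank q already form one contiguous chunk and local_B can be
   used directly as the send buffer.
   Receive side: the block coming from rank q has my_rows rows and as many
   columns as q owns; blocks are stored one after another in recvbuf. */
void redistribution_layout(int m, int n, int nprocs, int rank,
                           int *sendcounts, int *sdispls,
                           int *recvcounts, int *rdispls) {
    int my_rows = block_count(m, nprocs, rank);
    int my_cols = block_count(n, nprocs, rank);
    for (int q = 0; q < nprocs; ++q) {
        sendcounts[q] = block_count(m, nprocs, q) * my_cols;
        sdispls[q]    = block_start(m, nprocs, q) * my_cols;
        recvcounts[q] = my_rows * block_count(n, nprocs, q);
        rdispls[q]    = my_rows * block_start(n, nprocs, q);
    }
}

/* Copy the received blocks into local_A (my_rows x n, row-major).
   The block from rank q covers the global columns owned by q. */
void unpack_into_local_A(int m, int n, int nprocs, int rank,
                         const double *recvbuf, const int *rdispls,
                         double *local_A) {
    int my_rows = block_count(m, nprocs, rank);
    for (int q = 0; q < nprocs; ++q) {
        int q_cols = block_count(n, nprocs, q);
        int col0   = block_start(n, nprocs, q);
        for (int il = 0; il < my_rows; ++il)
            for (int jl = 0; jl < q_cols; ++jl)
                local_A[il * n + col0 + jl] =
                    recvbuf[rdispls[q] + il * q_cols + jl];
    }
}
```

With these arrays, the exchange itself would presumably be a single call such as MPI_Alltoallv(local_B, sendcounts, sdispls, MPI_DOUBLE, recvbuf, recvcounts, rdispls, MPI_DOUBLE, MPI_COMM_WORLD), followed by unpack_into_local_A. Since the layout depends only on m, n and P, it could be computed once before the k loop, so that each iteration costs one collective call and one local copy. Whether derived datatypes with MPI_Alltoallw would avoid the copy and be faster is something I have not measured.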
Thanks very much in advance.
set A=B in MPI