Hi everybody,
I would be very obliged if someone could help me with this. Suppose, for simplicity, that there is a parallel loop and two matrices A and B, both of dimension m x n, and that we have P processors. Matrix A is distributed row-wise and B is distributed column-wise over the P processors. Can you suggest an elegant and fast way to do the following pseudo-code using MPI calls (such as MPI_Gather, MPI_Type_struct, ...) in C?
for k = 1, ..., KMAX   // parallel loop on all P processors
    // some parallel calculation on B to produce new results
    // dim A = dim B = m x n
    // A is distributed row-wise on P processors, for example in local_A
    // B is distributed column-wise on P processors, for example in local_B
    A = B   // ???? how to do this fast in parallel, using MPI
end for
As an example, suppose that P=3 (p0, p1, p2) and m=n=4, and note that the first row of A and the first column of B are stored on p0, and so on (see the following diagram). Please note that each process stores a different amount of A and B, and that we want a direct assignment, i.e. a(i,j)=b(i,j) (no transpose). The bottleneck is that we want to set A=B many times inside a loop, and each process stores a different amount of A and B.
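To make the communication pattern concrete, here is a minimal sketch (plain C, no MPI calls) of how many matrix elements each process would have to send to and receive from every other process for A=B. The helper names block_count, block_start and exchange_counts are hypothetical, and the splitting rule (floor(n/P) items per process, remainder on the last process) is only an assumption chosen to match the P=3, m=n=4 example above.

```c
#include <assert.h>

/* Hypothetical helper: number of rows (or columns) owned by `rank` when
 * n items are split over P processes. ASSUMPTION: floor(n/P) items each,
 * with the remainder going to the last process (as in the P=3, n=4 example). */
int block_count(int n, int P, int rank)
{
    assert(P > 0 && rank >= 0 && rank < P);
    int base = n / P;
    return (rank == P - 1) ? n - base * (P - 1) : base;
}

/* Hypothetical helper: zero-based index of the first item owned by `rank`. */
int block_start(int n, int P, int rank)
{
    assert(P > 0 && rank >= 0 && rank < P);
    return rank * (n / P);
}

/* For process `me`, fill sendcounts[r] and recvcounts[r] (arrays of length P).
 * Process `me` owns some columns of B and some rows of A, so:
 *   - it sends to r the entries of its B columns that lie in r's A rows;
 *   - it receives from r the entries of its A rows that lie in r's B columns. */
void exchange_counts(int m, int n, int P, int me,
                     int *sendcounts, int *recvcounts)
{
    int my_rows = block_count(m, P, me);  /* rows of A stored on `me`    */
    int my_cols = block_count(n, P, me);  /* columns of B stored on `me` */
    for (int r = 0; r < P; ++r) {
        sendcounts[r] = block_count(m, P, r) * my_cols;
        recvcounts[r] = my_rows * block_count(n, P, r);
    }
}
```

For the example, p2 would send 2, 2 and 4 elements to p0, p1 and p2 respectively, and receive the same amounts. Counts of this kind are what a collective such as MPI_Alltoallv takes as its sendcounts and recvcounts arguments, but the blocks are not contiguous in local_B, so some packing or derived datatypes would still be needed, and whether that is the fastest route is exactly what I am asking.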
Thanks very much in advance
Best regards,
Ham. Sha.
Diagram: set A=B in MPI

        Mat. A (row-wise)             Mat. B (column-wise)
                                      p0   p1   p2   p2
p0      a11  a12  a13  a14            b11  b12  b13  b14
p1      a21  a22  a23  a24     =      b21  b22  b23  b24
p2      a31  a32  a33  a34            b31  b32  b33  b34
p2      a41  a42  a43  a44            b41  b42  b43  b44

