Scalapack p*gemr2d return codes


Posted by John Young

Hi,

I'm using the MKL ScaLAPACK functions P*RGEM2D to distribute matrices between different contexts. Most MKL ScaLAPACK functions take an 'info' argument that returns error information from the function. I cannot find any Intel documentation on the meaning of the values returned in 'info' for the P*RGEM2D functions. Any help with the meaning of the info values would be appreciated.

Thanks,

John

Posted by John Young

Sorry, I meant P*GEMR2D and not P*RGEM2D

Posted by Zhang Z (Intel)

MKL doesn't seem to offer this routine, although it is found in Netlib ScaLAPACK. I'm trying to get an explanation from the MKL engineering team and will report back here.

Posted by John Young

It's definitely in Intel's MKL library because we have been using it for several years.  However, it is undocumented in the Intel MKL guides.  It is documented in the Scalapack book on the netlib web site.  It is an extremely useful Scalapack routine as it's the only one I know of that does an inter-context transfer. 

Posted by Zhang Z (Intel)

It looks like MKL has a documentation gap for this routine. A bug report will be created to have the gap filled in future MKL releases.

Posted by Zhang Z (Intel)

Until the documentation is fixed, please use the description of the PDGEMR2D routine below. Note that it is a Fortran routine, but it can be called from C code as pdgemr2d_(...).

    -- ScaLAPACK routine (version 1.7) --

       Oak Ridge National Laboratory, Univ. of Tennessee, and Univ. of

       California, Berkeley.

       October 31, 1994.

 

      SUBROUTINE PDGEMR2D( M, N,

     $                     A, IA, JA, ADESC,

     $                     B, IB, JB, BDESC,

     $                     CTXT)

  ------------------------------------------------------------------------

    Purpose

    =======

 

    PDGEMR2D copies a submatrix of A onto a submatrix of B.

    A and B can have different distributions: they can be on different

    processor grids, they can have different blocksizes, and the beginning

    of the area to be copied can be at different places in A and B.

 

    The parameters can be confusing when the grids of A and B are

    partially or completely disjoint. If a processor calls this

    routine but is not in the A context or not in the B context, then

    ADESC[CTXT] or BDESC[CTXT], respectively, must be equal to -1 so

    that the routine recognises this situation.

    To summarize the rule:

    - If a processor is in A context, all parameters related to A must be valid.

    - If a processor is in B context, all parameters related to B must be valid.

    -  ADESC[CTXT] and BDESC[CTXT] must be either valid contexts or equal to -1.

    - M and N must be valid for everyone.

    - other parameters are not examined.

 

 

    Notes

    =====

 

    A description vector is associated with each 2D block-cyclically

    distributed matrix.  This vector stores the information required to

    establish the mapping between a matrix entry and its corresponding

    process and memory location.

 

    In the following comments, the character _ should be read as

    "of the distributed matrix".  Let A be a generic term for any 2D

    block-cyclically distributed matrix.  Its description vector is DESC_A:

 

   NOTATION        STORED IN      EXPLANATION

   --------------- -------------- --------------------------------------

   DT_A   (global) DESCA( DT_ )   The descriptor type.

   CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating

                                  the BLACS process grid A is distribu-

                                  ted over. The context itself is glo-

                                  bal, but the handle (the integer

                                  value) may vary.

   M_A    (global) DESCA( M_ )    The number of rows in the distributed

                                  matrix A.

   N_A    (global) DESCA( N_ )    The number of columns in the distri-

                                  buted matrix A.

   MB_A   (global) DESCA( MB_ )   The blocking factor used to distribute

                                  the rows of A.

   NB_A   (global) DESCA( NB_ )   The blocking factor used to distribute

                                  the columns of A.

   RSRC_A (global) DESCA( RSRC_ ) The process row over which the first

                                  row of the matrix A is distributed.

   CSRC_A (global) DESCA( CSRC_ ) The process column over which the

                                  first column of A is distributed.

   LLD_A  (local)  DESCA( LLD_ )  The leading dimension of the local

                                  array storing the local blocks of the

                                  distributed matrix A.

                                  LLD_A >= MAX(1,LOCp(M_A)).

 

 

 

    Important notice

    ================

     The parameters of the routine changed in April 1996.

     There is a new last argument. It must be a context encompassing

     all processors involved in the initial and final distributions.

 

     Be aware that all processors  included in this

      context must call the redistribution routine.

 

    Parameters

    ==========

 

 

    M        (input) INTEGER.

             On entry, M specifies the number of rows of the

             submatrix to be copied.  M must be at least zero.

             Unchanged on exit.

 

    N        (input) INTEGER.

             On entry, N specifies the number of columns of the

             submatrix to be copied.  N must be at least zero.

             Unchanged on exit.

 

    A        (input) DOUBLE PRECISION

             On entry, the source matrix.

             Unchanged on exit.

 

    IA,JA    (input) INTEGER

             On entry, the coordinates of the beginning of the submatrix

             of A to copy.

             1 <= IA <= M_A - M + 1, 1 <= JA <= N_A - N + 1.

             Unchanged on exit.

 

    ADESC    (input) A description vector (see Notes above)

             If the current processor is not part of the context of A,

             ADESC[CTXT] must be equal to -1.

 

 

    B        (output) DOUBLE PRECISION

             On entry, the destination matrix.

             On exit, the portion corresponding to the defined submatrix is updated.

 

    IB,JB    (input) INTEGER

             On entry, the coordinates of the beginning of the submatrix

             of B that will be updated.

             1 <= IB <= M_B - M + 1, 1 <= JB <= N_B - N + 1.

             Unchanged on exit.

 

    BDESC    (input) B description vector (see Notes above)

             For processors not part of the context of B

             BDESC[CTXT] must be equal to -1.

 

    CTXT     (input) A context encompassing at least all processors

             included in either the A context or the B context.

   Memory requirement :

   ====================

 

   for the processors belonging to grid 0, one buffer of size block 0

   and for the processors belonging to grid 1, also one buffer of size

   block 1.
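The descriptor layout and the "-1 context" rule above can be sketched in plain C. This is only an illustration: `fill_desc` and `local_count` are hypothetical helper names (ScaLAPACK's own DESCINIT and NUMROC routines normally do this work), plain `int` stands in for `MKL_INT`, and the code assumes that a process outside a grid is recognised by a negative `myrow`, which is how BLACS_GRIDINFO is usually used.

```c
#include <assert.h>

/* Zero-based C positions of the nine descriptor entries listed in the Notes. */
enum { DT_ = 0, CTXT_, M_, N_, MB_, NB_, RSRC_, CSRC_, LLD_, DLEN_ };

/* Number of rows (or columns) of an n-long dimension, cut into blocks of nb,
   that land on process iproc of nprocs when the first block sits on process
   isrc. Same arithmetic as NUMROC, i.e. LOCp() in the Notes (hypothetical name). */
static int local_count(int n, int nb, int iproc, int isrc, int nprocs)
{
    assert(n >= 0 && nb > 0 && nprocs > 0);
    int dist    = (nprocs + iproc - isrc) % nprocs; /* distance from source process */
    int nblocks = n / nb;                           /* number of full blocks        */
    int count   = (nblocks / nprocs) * nb;          /* blocks every process gets    */
    int extra   = nblocks % nprocs;                 /* leftover full blocks         */
    if (dist < extra)
        count += nb;                                /* one extra full block         */
    else if (dist == extra)
        count += n % nb;                            /* the trailing partial block   */
    return count;
}

/* Fill a descriptor for a dense block-cyclic matrix (descriptor type 1).
   Assumption: myrow < 0 means this process is not in the grid, in which case
   only the context entry matters and it is set to -1. */
static void fill_desc(int desc[DLEN_], int ctxt, int m, int n, int mb, int nb,
                      int rsrc, int csrc, int myrow, int nprow)
{
    for (int k = 0; k < DLEN_; ++k)
        desc[k] = 0;
    if (myrow < 0) {
        desc[CTXT_] = -1;                           /* "not in this context"        */
        return;
    }
    int locr = local_count(m, mb, myrow, rsrc, nprow);
    desc[DT_]   = 1;
    desc[CTXT_] = ctxt;
    desc[M_]    = m;
    desc[N_]    = n;
    desc[MB_]   = mb;
    desc[NB_]   = nb;
    desc[RSRC_] = rsrc;
    desc[CSRC_] = csrc;
    desc[LLD_]  = locr > 1 ? locr : 1;              /* LLD_A >= MAX(1, LOCp(M_A))   */
}
```

For a 10 x 10 matrix with 2 x 2 blocks on a grid with two process rows, process row 0 would hold blocks 0, 2 and 4 of the rows, so its leading dimension comes out as 6, and process row 1 gets 4.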

C interface:

void pdgemr2d_(MKL_INT *m, MKL_INT *n, double *A, MKL_INT *ia, MKL_INT *ja, MKL_INT *desc_A,

               double *B, MKL_INT *ib, MKL_INT *jb, MKL_INT *desc_B, MKL_INT *gcontext);
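The argument list above has no INFO output, so a caller who wants an early diagnostic could check the documented constraints before making the call. A minimal sketch follows, assuming plain `int` in place of `MKL_INT`; `side_ok` and `gemr2d_args_ok` are hypothetical names and not part of MKL or ScaLAPACK.

```c
#include <assert.h>
#include <stddef.h>

/* Zero-based C positions of the descriptor entries used here. */
enum { CTXT_ = 1, M_ = 2, N_ = 3 };

/* Check one side (A or B) against the documented bounds:
   1 <= I <= M_X - M + 1 and 1 <= J <= N_X - N + 1.
   A side whose context entry is -1 is skipped, because the description says
   its other parameters are not examined. */
static int side_ok(int m, int n, int i, int j, const int *desc)
{
    assert(desc != NULL);
    if (desc[CTXT_] == -1)
        return 1;
    return i >= 1 && j >= 1 &&
           i <= desc[M_] - m + 1 &&
           j <= desc[N_] - n + 1;
}

/* Returns 1 when the arguments satisfy the constraints quoted above, else 0.
   M and N must be valid on every calling process. */
static int gemr2d_args_ok(int m, int n,
                          int ia, int ja, const int *desc_a,
                          int ib, int jb, const int *desc_b)
{
    if (m < 0 || n < 0)
        return 0;
    return side_ok(m, n, ia, ja, desc_a) && side_ok(m, n, ib, jb, desc_b);
}
```

A caller might run `gemr2d_args_ok(...)` on every process of the enclosing context and only call `pdgemr2d_` when it returns 1. Note that the indices are 1-based, as in the Fortran description, and that every argument is passed to `pdgemr2d_` by pointer.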
