pardiso_setenv crashes pardiso

I have a simple example as follows:

  int n = 5;
  std::vector<int> ia(6);
  for (unsigned i = 0; i < ia.size(); i++) ia[i] = i;
  std::vector<int> ja(5);
  for (unsigned i = 0; i < ja.size(); i++) ja[i] = i;

  std::vector<double> a(5, 1.);

If I call pardiso_setenv to set PARDISO_OOC_FILE_NAME, it crashes during the numerical factorization. However, if I use a different example,

int n = 8;
  int nrhs = 2;
  int _ia[9] = { 1, 5, 8, 10, 12, 15, 17, 18, 19};
  int _ja[18] =
  { 1,    3,       6, 7,
      2, 3,    5,
      3,             8,
      4,       7,
      5, 6, 7,
      6,    8,
      7,
      8
  };
  for(int i = 0; i <9; i++) _ia[i]--;
  for(int i = 0; i<18; i++) _ja[i]--;
  double _a[18] =
  { 7.0,      1.0,           2.0, 7.0,
      4.0, 8.0,      2.0,
      1.0,                     5.0,
      7.0,           9.0,
      5.0, 1.0, 5.0,
      1.0,      5.0,
      11.0,
      5.0
  };

std::vector<int> ia (_ia, _ia + sizeof(_ia) / sizeof(int) );
  std::vector<int> ja (_ja, _ja + sizeof(_ja) / sizeof(int) );
  std::vector<double> a (_a, _a + sizeof(_a) / sizeof(double) );

It works fine. Here is how I call pardiso:

for (int i = 0; i < 64; i++) {
    iparm[i] = 0;
  }
  iparm[0] = 1; /* No solver default */
  iparm[1] = 2; /* Fill-in reordering from METIS */
  /* Numbers of processors, value of OMP_NUM_THREADS */
  iparm[2] = 1;
  iparm[3] = 0; /* No iterative-direct algorithm */
  iparm[4] = 0; /* No user fill-in reducing permutation */
  iparm[5] = 0; /* Write solution into x */
  iparm[6] = 0; /* Not in use */
  //iparm[7] = 0; /* Max numbers of iterative refinement steps */
  iparm[8] = 0; /* Not in use */
  iparm[9] = 13; /* Perturb the pivot elements with 1E-13 */
  iparm[10] = 1; /* Use nonsymmetric permutation and scaling MPS */
  iparm[11] = 0; /* Not in use */
  iparm[12] = 1; /* Maximum weighted matching algorithm is switched-on (default for non-symmetric) */
  iparm[13] = 0; /* Output: Number of perturbed pivots */
  iparm[14] = 0; /* Not in use */
  iparm[15] = 0; /* Not in use */
  iparm[16] = 0; /* Not in use */
  iparm[17] = -1; /* Output: Number of nonzeros in the factor LU */
  iparm[18] = -1; /* Output: Mflops for LU factorization */
  iparm[19] = 0; /* Output: Numbers of CG Iterations */
  iparm[27] = 0;
  iparm[34] = 1;
  iparm[59] = 1; // 0: in-core; 1: in-core first; then OOC if not enough memory; 2: ooc

  maxfct = 1; /* Maximum number of numerical factorizations. */
  mnum = 1; /* Which factorization to use. */
  error = 0; /* Initialize error flag */

for (int i = 0; i < 64; i++) {
    pt[i] = 0;
  }

std::string fname = this->tmpDir + "/" + ooc_prefix;
  PARDISO_ENV_PARAM param = PARDISO_OOC_FILE_NAME;
  pardiso_setenv(pt, &param, fname.c_str());

phase = 11;

callPARDISO(pt, &maxfct, &mnum, &mtype, &phase,
      &n, (AT*)&a[0], (int*)&ia[0], (int*)&ja[0], &idum, &idum,
      iparm, &mkl_msglvl, &ddum, &ddum, &error);

phase = 22;
  callPARDISO (pt, &maxfct, &mnum, &mtype, &phase,
      &n, (AT*)&a[0], (int*)&ia[0], (int*)&ja[0], &idum, &idum,
      iparm, &mkl_msglvl, &ddum, &ddum, &error);

By the way, I tried to modify the MKL example pardiso_sym_0_based.c to reproduce this problem, but got compile errors:

./source/pardiso_sym_0_based.c: In function ‘main’:                                                                              
./source/pardiso_sym_0_based.c:120: error: ‘PARDISO_ENV_PARAM’ undeclared (first use in this function)                           
./source/pardiso_sym_0_based.c:120: error: (Each undeclared identifier is reported only once                                     
./source/pardiso_sym_0_based.c:120: error: for each function it appears in.)                                                     
./source/pardiso_sym_0_based.c:120: error: expected ‘;’ before ‘param’                                                           
./source/pardiso_sym_0_based.c:121: error: stray ‘@’ in program                                                                  
./source/pardiso_sym_0_based.c:121: error: ‘param’ undeclared (first use in this function)

As a workaround, I set the OOC file name using the configuration file.

I am using MKL 11.1.0

Major version: 11
Minor version: 1
Update version: 0
Product status:  Product
Build: n20130711
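
For reference, both sets of arrays above are intended as CSR patterns with ascending column indices in each row and, if the matrix is treated as symmetric (mtype = -2 in the example posted later in this thread), only the upper triangle stored. A minimal sketch of a structural check is below. It does not call MKL, and the helper name csr_is_valid and its parameters are illustrative assumptions, not part of any library:

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if (ia, ja) describe a structurally valid CSR pattern for an
 * n-by-n matrix with the given index base (0 or 1), and 0 otherwise.
 * When upper_only is nonzero, every column index must also be >= its row,
 * as needed when only the upper triangle of a symmetric matrix is stored. */
static int csr_is_valid(int n, const int *ia, const int *ja, int base,
                        int upper_only)
{
    assert(ia != NULL && ja != NULL);
    if (n <= 0 || ia[0] != base)
        return 0;
    for (int row = 0; row < n; row++) {
        if (ia[row + 1] < ia[row])
            return 0;               /* row pointers must not decrease */
        int prev = base - 1;        /* one below the smallest legal column */
        for (int k = ia[row] - base; k < ia[row + 1] - base; k++) {
            int col = ja[k];
            if (col < base || col >= n + base)
                return 0;           /* column index out of range */
            if (col <= prev)
                return 0;           /* columns must strictly increase */
            if (upper_only && col < row + base)
                return 0;           /* entry below the diagonal */
            prev = col;
        }
    }
    return 1;
}
```

Called with the 5x5 arrays and base 0, or with the unshifted 8x8 arrays and base 1, this check returns 1, so neither input is structurally malformed.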


I am using the GNU compiler:

make sointel64 function=pardiso_sym_0_based compiler=gnu

----- Compiling gnu_lp64_parallel_intel64_so ----- pardiso_sym_0_based
gcc -m64  -w -I"../../include" \
        ./source/pardiso_sym_0_based.c  \
        -L"../../lib/intel64" -lmkl_intel_lp64 \
        -lmkl_gnu_thread \
        -lmkl_core \
         -L"../../../compiler/lib/intel64" -liomp5 -lpthread -lm -ldl -o _results/gnu_lp64_parallel_intel64_so/pardiso_sym_0_based.out
./source/pardiso_sym_0_based.c: In function ‘main’:
./source/pardiso_sym_0_based.c:110: error: ‘PARDISO_ENV_PARAM’ undeclared (first use in this function)
./source/pardiso_sym_0_based.c:110: error: (Each undeclared identifier is reported only once
./source/pardiso_sym_0_based.c:110: error: for each function it appears in.)
./source/pardiso_sym_0_based.c:110: error: expected ‘;’ before ‘param’
./source/pardiso_sym_0_based.c:111: error: ‘param’ undeclared (first use in this function)
./source/pardiso_sym_0_based.c:111: error: ‘fname’ undeclared (first use in this function)
make[1]: *** [pardiso_sym_0_based] Error 1
make: *** [sointel64] Error 2

 

Here is the traceback:

SIGSEGV: memory access exception
Command: StepSimulation
   Recoverability: Non-recoverable
   ServerStack: [
libStarNeo.so: SignalHandler::signalHandlerFunction(int, siginfo*, void*),
libpthread.so.0(),
libmkl_core.so(mkl_pds_lp64_check_precision_c+0x46),
libmkl_core.so(mkl_pds_lp64_pardiso+0x83),
libDirectSolver.so: MKLPARDISO<double>::numericalFact(),

Let's establish a basic fact. You took a standard MKL example source file that works correctly, and modified it. The modified source caused the compiler to emit error messages. You have shown many disjointed pieces of code, but did not state exactly what the modifications to the file pardiso_sym_0_based.c were. Nevertheless, you report error messages and ask for a fix. Finding one without seeing the changes would take considerable effort, to say the least, and may well be impossible.

This is what I suggest: describe the modifications precisely and completely, or attach the modified source file.

 

I did want to reproduce the bug using your example, but it gives me compile errors (see above and below). The only way I can reproduce the bug is with my integration given above.

make sointel64 function=pardiso_sym_0_based compiler=gnu

----- Compiling gnu_lp64_parallel_intel64_so ----- pardiso_sym_0_based
gcc -m64  -w -I"../../include" \
        ./source/pardiso_sym_0_based.c  \
        -L"../../lib/intel64" -lmkl_intel_lp64 \
        -lmkl_gnu_thread \
        -lmkl_core \
         -L"../../../compiler/lib/intel64" -liomp5 -lpthread -lm -ldl -o _results/gnu_lp64_parallel_intel64_so/pardiso_sym_0_based.out
./source/pardiso_sym_0_based.c: In function ‘main’:
./source/pardiso_sym_0_based.c:110: error: ‘PARDISO_ENV_PARAM’ undeclared (first use in this function)
./source/pardiso_sym_0_based.c:110: error: (Each undeclared identifier is reported only once
./source/pardiso_sym_0_based.c:110: error: for each function it appears in.)
./source/pardiso_sym_0_based.c:110: error: expected ‘;’ before ‘param’
./source/pardiso_sym_0_based.c:111: error: ‘param’ undeclared (first use in this function)
./source/pardiso_sym_0_based.c:111: error: ‘fname’ undeclared (first use in this function)
make[1]: *** [pardiso_sym_0_based] Error 1
make: *** [sointel64] Error 2

If you tell me how to fix the compile error in your example, I will try to reproduce the bug with it.

Inside a C source file, instead of

PARDISO_ENV_PARAM param = PARDISO_OOC_FILE_NAME;

you should write

enum PARDISO_ENV_PARAM penv = PARDISO_OOC_FILE_NAME;

However, since the example pardiso_sym_0_based.c does not do any out of core solution, calling pardiso_setenv() does not do anything significant.
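
The reason for the difference is that in C an enum tag is not a type name on its own, whereas in C++ it is, so the same declaration can compile in a .cpp file and fail in a .c file. A small sketch with a stand-in enum (DEMO_ENV_PARAM and DEMO_OOC_FILE_NAME are hypothetical names used only for illustration, not the MKL declarations):

```c
/* Hypothetical stand-in for an enum that a header declares without a
 * typedef. The names are illustrative only and are not MKL's. */
enum DEMO_ENV_PARAM { DEMO_OOC_FILE_NAME = 1 };

static int demo_param_value(void)
{
    /* In C, tags live in their own namespace, so the `enum` keyword is
     * required here. Writing `DEMO_ENV_PARAM p = ...;` is accepted by a
     * C++ compiler but rejected by a C compiler as an undeclared name. */
    enum DEMO_ENV_PARAM p = DEMO_OOC_FILE_NAME;
    return (int)p;
}
```

The form with the explicit `enum` keyword is valid in both languages, so it is the safer spelling for code that may be built either way.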

Attached please find the modified pardiso_sym_0_based.c, which reproduces the crash.

Did you forget to attach the file? If you have problems doing so, you can also paste short source code inline, using the "{....}/code" button in the toolbar and selecting "C++".

/*
********************************************************************************
*   Copyright(C) 2004-2014 Intel Corporation. All Rights Reserved.
*   
*   The source code, information  and  material ("Material") contained herein is
*   owned  by Intel Corporation or its suppliers or licensors, and title to such
*   Material remains  with Intel Corporation  or its suppliers or licensors. The
*   Material  contains proprietary information  of  Intel or  its  suppliers and
*   licensors. The  Material is protected by worldwide copyright laws and treaty
*   provisions. No  part  of  the  Material  may  be  used,  copied, reproduced,
*   modified, published, uploaded, posted, transmitted, distributed or disclosed
*   in any way  without Intel's  prior  express written  permission. No  license
*   under  any patent, copyright  or  other intellectual property rights  in the
*   Material  is  granted  to  or  conferred  upon  you,  either  expressly,  by
*   implication, inducement,  estoppel or  otherwise.  Any  license  under  such
*   intellectual  property  rights must  be express  and  approved  by  Intel in
*   writing.
*   
*   *Third Party trademarks are the property of their respective owners.
*   
*   Unless otherwise  agreed  by Intel  in writing, you may not remove  or alter
*   this  notice or  any other notice embedded  in Materials by Intel or Intel's
*   suppliers or licensors in any way.
*
********************************************************************************
*   Content : MKL PARDISO C example
*
********************************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#include "/u/xeons46/people/xian/intel/composer_xe_2013_sp1.3.174/mkl/include/mkl_pardiso.h"
#include "mkl_types.h"
#include "mkl.h"

MKL_INT main (void)
{
    /* Matrix data. */
/* my example */
    MKL_INT n = 5;
    MKL_INT ia[6] = { 0, 1, 2, 3, 4, 5};
    MKL_INT ja[5] = { 0, 1, 2, 3, 4};
    double a[5] = {1.0, 1.0, 1.0, 1.0, 1.0};
    double b[5], x[5];
/* end of my example */
/* original example */
/*
    MKL_INT n = 8;
    MKL_INT ia[9] = { 0, 4, 7, 9, 11, 14, 16, 17, 18};
    MKL_INT ja[18] =
    { 0,   2,       5, 6,
        1, 2,    4,
           2,             7,
              3,       6,
                 4, 5, 6,
                    5,    7,
                       6,
                          7
    };
    double a[18] =
    { 7.0,      1.0,           2.0, 7.0,
          -4.0, 8.0,      2.0,
                1.0,                     5.0,
                     7.0,           9.0,
                          5.0, 1.0, 5.0,
                              -1.0,      5.0,
                                   11.0,
                                         5.0
    };
    double b[8], x[8];
*/
/* end of original example */

    MKL_INT mtype = -2;       /* Real symmetric matrix */
    /* RHS and solution vectors. */
    MKL_INT nrhs = 1;     /* Number of right hand sides. */
    /* Internal solver memory pointer pt, */
    /* 32-bit: int pt[64]; 64-bit: long int pt[64] */
    /* or void *pt[64] should be OK on both architectures */
    void *pt[64];
    /* Pardiso control parameters. */
    MKL_INT iparm[64];
    MKL_INT maxfct, mnum, phase, error, msglvl;
    /* Auxiliary variables. */
    MKL_INT i;
    double ddum;          /* Double dummy */
    MKL_INT idum;         /* Integer dummy. */
/* -------------------------------------*/
/* .. Setup Pardiso control parameters. */
/* -------------------------------------*/
    for ( i = 0; i < 64; i++ )
    {
        iparm[i] = 0;
    }
    iparm[0] = 1;         /* No solver default */
    iparm[1] = 2;         /* Fill-in reordering from METIS */
    iparm[3] = 0;         /* No iterative-direct algorithm */
    iparm[4] = 0;         /* No user fill-in reducing permutation */
    iparm[5] = 0;         /* Write solution into x */
    iparm[7] = 2;         /* Max numbers of iterative refinement steps */
    iparm[9] = 13;        /* Perturb the pivot elements with 1E-13 */
    iparm[10] = 1;        /* Use nonsymmetric permutation and scaling MPS */
    iparm[12] = 0;        /* Maximum weighted matching algorithm is switched-off (default for symmetric). Try iparm[12] = 1 in case of inappropriate accuracy */
    iparm[13] = 0;        /* Output: Number of perturbed pivots */
    iparm[17] = -1;       /* Output: Number of nonzeros in the factor LU */
    iparm[18] = -1;       /* Output: Mflops for LU factorization */
    iparm[19] = 0;        /* Output: Numbers of CG Iterations */
    iparm[34] = 1;        /* PARDISO use C-style indexing for ia and ja arrays */
    maxfct = 1;           /* Maximum number of numerical factorizations. */
    mnum = 1;         /* Which factorization to use. */
    msglvl = 1;           /* Print statistical information in file */
    error = 0;            /* Initialize error flag */
/* ----------------------------------------------------------------*/
/* .. Initialize the internal solver memory pointer. This is only  */
/*   necessary for the FIRST call of the PARDISO solver.           */
/* ----------------------------------------------------------------*/
    for ( i = 0; i < 64; i++ )
    {
        pt[i] = 0;
    }
    enum PARDISO_ENV_PARAM penv = PARDISO_OOC_FILE_NAME;
    PARDISO_SETENV(pt, &penv, "/OOC");
/* --------------------------------------------------------------------*/
/* .. Reordering and Symbolic Factorization. This step also allocates  */
/*    all memory that is necessary for the factorization.              */
/* --------------------------------------------------------------------*/
    phase = 11;
    PARDISO (pt, &maxfct, &mnum, &mtype, &phase,
             &n, a, ia, ja, &idum, &nrhs, iparm, &msglvl, &ddum, &ddum, &error);
    if ( error != 0 )
    {
        printf ("\nERROR during symbolic factorization: %d", error);
        exit (1);
    }
    printf ("\nReordering completed ... ");
    printf ("\nNumber of nonzeros in factors = %d", iparm[17]);
    printf ("\nNumber of factorization MFLOPS = %d", iparm[18]);
/* ----------------------------*/
/* .. Numerical factorization. */
/* ----------------------------*/
    phase = 22;
    PARDISO (pt, &maxfct, &mnum, &mtype, &phase,
             &n, a, ia, ja, &idum, &nrhs, iparm, &msglvl, &ddum, &ddum, &error);
    if ( error != 0 )
    {
        printf ("\nERROR during numerical factorization: %d", error);
        exit (2);
    }
    printf ("\nFactorization completed ... ");
/* -----------------------------------------------*/
/* .. Back substitution and iterative refinement. */
/* -----------------------------------------------*/
    phase = 33;
    iparm[7] = 2;         /* Max numbers of iterative refinement steps. */
    /* Set right hand side to one. */
    for ( i = 0; i < n; i++ )
    {
        b[i] = 1;
    }
    PARDISO (pt, &maxfct, &mnum, &mtype, &phase,
             &n, a, ia, ja, &idum, &nrhs, iparm, &msglvl, b, x, &error);
    if ( error != 0 )
    {
        printf ("\nERROR during solution: %d", error);
        exit (3);
    }
    printf ("\nSolve completed ... ");
    printf ("\nThe solution of the system is: ");
    for ( i = 0; i < n; i++ )
    {
        printf ("\n x [%d] = % f", i, x[i]);
    }
    printf ("\n");
/* --------------------------------------*/
/* .. Termination and release of memory. */
/* --------------------------------------*/
    phase = -1;           /* Release internal memory. */
    PARDISO (pt, &maxfct, &mnum, &mtype, &phase,
             &n, &ddum, ia, ja, &idum, &nrhs,
             iparm, &msglvl, &ddum, &ddum, &error);
    return 0;
}

 

OK, now I can see the error appearing even on Windows with the 14.0.2.176 Icl and the associated MKL 11.1.3 (IA32 and X64). I think that the combination that causes the access violation to occur is (i) a diagonal matrix, and (ii) a call to pardiso_setenv before the first call to pardiso().

The program in #10 uses an OOC file path of /OOC, but changing this to a file name in one of the user's directories, where there is no access problem, still causes the access violation to occur.
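
If the trigger really is a purely diagonal matrix, a caller could in principle detect that case and bypass the solver for it. The following is only a sketch of that idea, not a fix recommended by Intel. The helper names csr_is_diagonal and solve_diagonal are assumptions made for illustration, and the code does not use MKL at all:

```c
#include <stddef.h>

/* Returns 1 if the n-by-n CSR pattern has exactly one entry per row and
 * that entry lies on the diagonal; base is the index base (0 or 1). */
static int csr_is_diagonal(int n, const int *ia, const int *ja, int base)
{
    if (n <= 0 || ia[0] != base)
        return 0;
    for (int row = 0; row < n; row++) {
        if (ia[row + 1] - ia[row] != 1)
            return 0;                   /* more or fewer than one entry */
        if (ja[ia[row] - base] != row + base)
            return 0;                   /* the entry is off the diagonal */
    }
    return 1;
}

/* Solves D x = b for a diagonal matrix whose values are stored in CSR
 * order, so a[i] is the i-th diagonal entry.
 * Returns 0 on success and -1 if a zero diagonal entry is found. */
static int solve_diagonal(int n, const double *a, const double *b, double *x)
{
    for (int i = 0; i < n; i++) {
        if (a[i] == 0.0)
            return -1;
        x[i] = b[i] / a[i];
    }
    return 0;
}
```

For the 5x5 arrays in #10 the check returns 1, and for the 8x8 matrix of the original example it returns 0, so only the failing input would take the direct path.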

Hi xian-zhong.guous.cd-adapco.com, mecej4,

Thanks very much for the discussion. I can reproduce the problem. It looks like we have a memory corruption in the diagonal matrix computation. The issue has been escalated to our developers, and I will keep you updated if there is any news.

Thanks

Ying

Dear all, 

I heard from our developer that the issue has been fixed. The fix will be in MKL 11.2.1, which is targeted for release around November or December. You are welcome to try it then and let us know if there are any issues.

Thanks

Ying 

 
