problems with MKL_DSS_DIAGONAL_SOLVE

problems with MKL_DSS_DIAGONAL_SOLVE

Bild des Benutzers Pietro B.

Hello. I am using the following example (heavily inspired by mkl/examples/solverc/source/dss_sym_c.c) to try and solve a system on a very simple diagonal matrix. The code below works correctly, and the output is as follows:

(0):  1 2 -3 -3
(1):  1 -2 -3 3
(2):  0 0 -0 -0
(3):  1 0.333333 -0.142857 -0.142857

However, if I assign MKL_DSS_DIAGONAL_SOLVE to opt_solve, instead of asking for a full solve, I get garbage as the resulting solution. The output is as follows:

(0):  2.02053e-319 1.39142e-317 2.27227e-318 1.39142e-317
(1):  3.52525e-294 2.64829e-314 0 0
(2):  0 0 2.82113e-317 2.81698e-317
(3):  0 0 0 2.122e-314

Am I doing anything wrong? The only difference between the first and second result is the changed assignment to opt_solve.

Thanks,

Pietro

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include "mkl_dss.h"
#include "mkl_types.h"
#define NROWS     4
#define NCOLS     4
#define NNONZEROS 4
#define NRHS      4
MKL_INT main () {
  const MKL_INT nRows     = NROWS;
  const MKL_INT nCols     = NCOLS;
  const MKL_INT nNonZeros = NNONZEROS;
  const MKL_INT nRhs      = NRHS;
  _INTEGER_t rowIndex[NROWS + 1] = { 1, 2, 3, 4, 5};
  _INTEGER_t columns[NNONZEROS] = { 1,2,3,4};
  _DOUBLE_PRECISION_t values[NNONZEROS] = { 1,3,-7,-7};
  _DOUBLE_PRECISION_t rhs[NRHS*NROWS] = { 1, 6, 21, 21,
                                          1,-6, 21, -21,
                                          0, 0,  0,   0,
                                          1, 1,  1,   1};
  _DOUBLE_PRECISION_t solValues[NRHS*NCOLS];
  _MKL_DSS_HANDLE_t handle;
  _INTEGER_t error;
  MKL_INT i,j,
    opt_create  = MKL_DSS_DEFAULTS,
    opt_def_str = MKL_DSS_SYMMETRIC,
    opt_reord   = MKL_DSS_AUTO_ORDER,
    opt_factor  = MKL_DSS_INDEFINITE,
    opt_solve   = MKL_DSS_DEFAULTS,
    opt_default = MKL_DSS_DEFAULTS;
  if ((   (error = dss_create           (handle, opt_create))                                              != MKL_DSS_SUCCESS)
      || ((error = dss_define_structure (handle, opt_def_str, rowIndex, nRows, nCols, columns, nNonZeros)) != MKL_DSS_SUCCESS)
      || ((error = dss_reorder          (handle, opt_reord, 0))                                            != MKL_DSS_SUCCESS)
      || ((error = dss_factor_real      (handle, opt_factor, values))                                      != MKL_DSS_SUCCESS)
      || ((error = dss_solve_real       (handle, opt_solve, rhs, nRhs, solValues))                         != MKL_DSS_SUCCESS)
      || ((error = dss_delete           (handle, opt_default))                                             != MKL_DSS_SUCCESS)) {
    printf ("error: %dn", error);
  } else {
    printf ("Solutions:n");
    for (j=0;j<nRhs; ++j) {
      printf ("(%d): ", j);
      for (i = 0; i < nCols; i++)
        printf (" %g", solValues[j*nCols+i]);
      printf ("n");
    }
  }
}

15 Beiträge / 0 neu
Letzter Beitrag
Nähere Informationen zur Compiler-Optimierung finden Sie in unserem Optimierungshinweis.
Bild des Benutzers Noah Clemons (Intel)

Hi, I am investigating with MKL engineering to confirm that this is a bug. I will get back to you ASAP.

Bild des Benutzers mecej4

Pietro:

It appears to me that you are not making a consistent sequence of calls using the DSS interface.

You have specified that the matrix type is symmetric positive definite, in which case the matrix A is factorized into the form LLT, For such a matrix, there is no provision for the diagonal solve phase. 

Secondly, even for a symmetric indefinite matrix, which is factored into the LDLT form, the diagonal solve phase is to be applied only after the forward solution phase has been successfully substituted. You will find some of these matters described in the sections of the MKL manual for the PARDISO solver.

What is the reason for your attempting  to do only the MKL_DSS_DIAGONAL_SOLVE phase? How do you expect it to work? Are you trying this for a more general case of a positive definite matrix, or for the special case of a diagonal matrix as in the C example code that you gave above? Note that the values you specified for the diagonal elements are such that the matrix is not positive-definite.

If you are sure that the matrix will always be diagonal, you do not need to use DSS to solve the system at all! Perhaps you should elaborate on what you wish to do.

Bild des Benutzers Pietro B.

@Noah: thanks for your reply. I look forward to your follow-up.

@mecej4: I'm following instructions of the manual for the MKL version 10.3.12: http://download-software.intel.com/sites/products/documentation/hpc/mkl/mklman/mklman.pdf

Zitat:

It appears to me that you are not making a consistent sequence of calls using the DSS interface.

The sequence of calls seems to respect the flow chart on page 2292 of the MKL manual: dss_create(), dss_define_structure(), dss_reorder(), dss_factor_real(), dss_solve_real(), dss_delete().

Zitat:

You have specified that the matrix type is symmetric positive definite, in which case the matrix A is factorized into the form LLT, For such a matrix, there is no provision for the diagonal solve phase.

I haven't specified the matrix to be positive definite. The matrix of the problem I am trying to solve can be indefinite. In my call to dss_factor_real(), I'm passing opt_factor=MKL_DSS_INDEFINITE (line 29 of the code above). According to the MKL manual (page 2299), dss_factor_real is the right function to pass that parameter.

Zitat:

Secondly, even for a symmetric indefinite matrix, which is factored into the LDLT form, the diagonal solve phase is to be applied only after the forward solution phase has been successfully substituted. You will find some of these matters described in the sections of the MKL manual for the PARDISO solver.

What is the reason for your attempting  to do only the MKL_DSS_DIAGONAL_SOLVE phase? How do you expect it to work? Are you trying this for a more general case of a positive definite matrix, or for the special case of a diagonal matrix as in the C example code that you gave above? Note that the values you specified for the diagonal elements are such that the matrix is not positive-definite.

I'm not sure why I should call dss_solve_real with forward solution phase and then the diagonal one. The method you mention, discussed on page 2262 of the manual, is for solving a system Ax=b, for example with with A symmetric indefinite. I am _not_ interested in solving such a system. I am interested in D itself, therefore I am trying to solve a system Dx=e, where e is the vector of all ones, so that x will contain the inverse of the diagonal elements of D. Clearly, this only works if there are no null eigenvalues, but that is another matter (I can pass "Inertia" to dss_statistics to get the number of null eigenvalues). I am interested in solving systems of the form Lx=f in other parts of the procedure, but first I would like to solve a system with D only. Also, if matrix D has 2x2 diagonal blocks, two systems Dx=e and Dx=f (where f has -1 and 1 alternating) should suffice to determine which are the 1x1 and which are the 2x2 blocks. I'm resorting to this because that I do not know how to actually access the D matrix. There is an example, mkl/examples/solverc/source/pardiso_sym_getdiag_c.c, that accesses D by computing its inverse, but that again supposes that D is nonsingular.

Zitat:

If you are sure that the matrix will always be diagonal, you do not need to use DSS to solve the system at all! Perhaps you should elaborate on what you wish to do.

The matrix A is nondiagonal in general, if that were the case I would obviously not use the DSS procedure.

Thanks,

Pietro

Bild des Benutzers mecej4

Pietro,

Thanks for the clarifications, and I see from them that I misunderstood some parts of your earlier post. Since DSS, in the MKL implementation, is simply a wrapper around PARDISO, I took the symmetric indefinite example from the MKL PARDISO examples. I used your diagonal matrix and just one of your RHS vectors. The modified file is attached. It works as expected, and gives the solution as [1, 1/3, 0, 1/7]. If you now make a single change, on line-90, changing "phase = 33;" to "phase = 332;", asking it to do only the diagonal solution phase, as you wish to do, the program gives junk results. Without seeing the PARDISO source code one cannot explain why this happens. We can conclude that even though in a hand calculation one may recognize that the matrix is diagonal and skip the factorization and, likewise, skip the forward elimination and backward substitution parts of the solution phase, the code as written does not allow this simplification to be reflected in the PARDISO calls.

As you recognized already, this is only a minor issue since the intended application of your program does not involve diagonal matrices.

Anlagen: 

AnhangGröße
Herunterladen sym1.c3.85 KB
Bild des Benutzers Pietro B.

@mecej4:

Thanks for taking the time to experiment with source code. Yours is another example where a diagonal matrix will give troubles when we try the diagonal solution. It is probably better to check for "diagonality" and run a different, simpler procedure in that case. However, that does not tell us if we get junk every time the matrix is non-diagonal and we set the option to MKL_DSS_DIAGONAL_SOLVE.

Curiously enough, when experimenting with a non-diagonal matrix and asking for both a full and a diagonal solve, I get the correct result both times. This proves little since I only tried a couple of non-diagonal matrices. My guess, however, is that PARDISO might check for diagonality and, in that case, run a different code. Perhaps that different code, when asked for a diagonal solve, returns garbage but works correctly in the full-solve case. It would be strange however that the user has to check for diagonality, since this is not mentioned in the manual. 

Pietro

Bild des Benutzers mecej4

Pietro,

The API style provided by DSS is, in one sense, "object-oriented", and the examples in this thread expose the deficiencies in such API-s.

The library routines require the user to provide a minimal set of problem data, and keep all internal algorithmic data hidden from the user. The algorithm has many stages and the library routines maintain corresponding "state variables" reachable through the "problem handle". Things work fine as long as the calls are made in the intended order.

Out-of-order calls destroy the sanity/coherence of the algorithmic state and the library routines may not be smart enough to detect the sanity loss. They output garbage. Such out-of-order calls are the equivalent of throwing a wrench into finely tuned machinery.

Bild des Benutzers Pietro B.

@mecej4:

I actually think information hiding is very useful and at the base of a lot of successful APIs, for a lot of reasons that benefit both the API authors and their user base. As I said before, I think I make the calls in the correct order and pass the correct parameters. The API interface is some sort of "contract" with the user: if the user calls this function with these parameters, a given output will be returned. I do not think I have breached the contract or thrown any wrench in, but rather, as Noah Clemons points out, there might be a bug in the code. Either the contract (and the manual) should be changed to say "do not pass diagonal matrices" or the code should be changed to accommodate for these cases.

Thanks,

Pietro

Bild des Benutzers mecej4

I agree that if only certain sequences of calls/phases are allowed those sequences should be clearly documented. The current documentationn has shortcomings in that regard. The examples of using DSS that are provided with MKL work fine but if one wishes (as you did) to alter the sequence from that used in an example, there should be better possibilities than blind trial and error.

Bild des Benutzers Noah Clemons (Intel)

Pietro, I have discussed the matter with engineering and this could indeed be a bug. Based on your iterations with mecej4, could you re-characterize your exact problem? From there I can submit an issue and keep you updated on its working into a new product release.

Bild des Benutzers Pietro B.

Noah,

thanks for your reply. The smallest and most complete example I could come up with is below. The exchange with mecej4 indicates that a problem arises when calling dss_solve_real() with option MKL_DSS_DIAGONAL_SOLVE onto a diagonal matrix. When calling dss_solve_real() on a non-diagonal matrix or for a full solve, all is OK.

The code below shows more: if the first call to dss_solve_real() is made with MKL_DSS_DIAGONAL_SOLVE onto a diagonal matrix, the result is garbage:

Solution: 2.64855e-314 3.51861e-314
Solution: 1.6 1.6
Solution: 1.6 1.6

However, the solution with MKL_DSS_DIAGONAL_SOLVE is correct after a call with DSS_DSS_DEFAULTS. When commenting "#define DIAGONAL_MATRIX", this is what I get (correct result):

Solution: 1.6 2.5
Solution: 1 1
Solution: 1.6 2.5

To recap, for diagonal matrices any call to the same function with MKL_DSS_DIAGONAL_SOLVE that is made before another call with MKL_DSS_DEFAULT will give wrong results, but correct if the call is made after a call with MKL_DSS_DEFAULT. For non-diagonal matrices, the results are correct regardless of the order of calls to dss_solve_real().

Thanks,

Pietro

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include "mkl_dss.h"
#include "mkl_types.h"
#define NROWS 2
#define NCOLS 2
#define NRHS 1
#define DIAGONAL_MATRIX
#ifdef DIAGONAL_MATRIX
#define NNONZEROS 2
#else
#define NNONZEROS 3
#endif
void solve_print (_MKL_DSS_HANDLE_t, MKL_INT, _DOUBLE_PRECISION_t *, const MKL_INT, _DOUBLE_PRECISION_t *, MKL_INT);
MKL_INT main () {
  const MKL_INT nRows = NROWS;
  const MKL_INT nCols = NCOLS;
  const MKL_INT nNonZeros = NNONZEROS;
  const MKL_INT nRhs = NRHS;
#ifdef DIAGONAL_MATRIX
  _INTEGER_t rowIndex[NROWS + 1] = {1, 2, 3};
  _INTEGER_t columns[NNONZEROS] = {1, 2};
  _DOUBLE_PRECISION_t values[NNONZEROS] = {5, 5};
#else
  _INTEGER_t rowIndex[NROWS + 1] = {1, 3, 4};
  _INTEGER_t columns[NNONZEROS] = {1, 2, 2};
  _DOUBLE_PRECISION_t values[NNONZEROS] = {5, 3, 5};
#endif
  _DOUBLE_PRECISION_t rhs[NRHS*NROWS] = {8, 8};
  _DOUBLE_PRECISION_t solValues[NRHS*NCOLS];
  _MKL_DSS_HANDLE_t handle;
  _INTEGER_t error;
  MKL_INT
    opt_create = MKL_DSS_DEFAULTS,
    opt_def_str = MKL_DSS_SYMMETRIC,
    opt_reord = MKL_DSS_AUTO_ORDER,
    opt_factor = MKL_DSS_INDEFINITE,
    opt_solve_d = MKL_DSS_DIAGONAL_SOLVE,
    opt_solve_f = MKL_DSS_DEFAULTS,
    opt_default = MKL_DSS_DEFAULTS;
  if (
    ( (error = dss_create (handle, opt_create)) != MKL_DSS_SUCCESS)
    || ((error = dss_define_structure (handle, opt_def_str, rowIndex, nRows, nCols, columns, nNonZeros)) != MKL_DSS_SUCCESS)
    || ((error = dss_reorder (handle, opt_reord, 0)) != MKL_DSS_SUCCESS)
    || ((error = dss_factor_real (handle, opt_factor, values)) != MKL_DSS_SUCCESS)
  ) {
    printf ("Error: %dn", error);
  } else {
    solve_print (handle, opt_solve_d, rhs, nRhs, solValues, nCols);
    solve_print (handle, opt_solve_f, rhs, nRhs, solValues, nCols);
    solve_print (handle, opt_solve_d, rhs, nRhs, solValues, nCols);
    if ((error = dss_delete (handle, opt_default)) != MKL_DSS_SUCCESS)
      printf ("Error in deletionn");
  }
}
//
// solve and print
//
void solve_print (_MKL_DSS_HANDLE_t handle,
  MKL_INT opt_solve,
  _DOUBLE_PRECISION_t *rhs,
  const MKL_INT nRhs,
  _DOUBLE_PRECISION_t *solValues,
  MKL_INT nCols) {
  _INTEGER_t error = dss_solve_real (handle, opt_solve, rhs, nRhs, solValues);
  if (error != MKL_DSS_SUCCESS)
    printf ("error: %dn", error);
  else {
    int i;
    printf ("Solution: ");
    for (i = 0; i < nCols; i++)
      printf (" %g", solValues [i]);
    printf ("n");
  }
}

Bild des Benutzers Noah Clemons (Intel)

This is indeed a bug and I will submit an issue right away. I will keep you updated on its status. Thank you for uncovering this.

Bild des Benutzers Pietro B.

OK, thanks.

Pietro

Bild des Benutzers Gennady Fedorov (Intel)

Pietro, 

the fix of this problem available into the latest 11.1 update 2 released yesterday. You can download this version from intel registration center. Would you please let us know if any further problem you will see.   here is the link to the announce: http://software.intel.com/en-us/forums/topic/505369

 

 

Bild des Benutzers Gennady Fedorov (Intel)

the issue is closed

Melden Sie sich an, um einen Kommentar zu hinterlassen.