.NET Interop and the Math Kernel Library LAPACK Routines

.NET Interop and the Math Kernel Library LAPACK Routines

I am building a program in C# and would like to use the DSYEV LAPACK routine within it. I have built a wrapper DLL in Fortran (code is below) linked against the MKL sequentially. This DLL is then linked into the C# via the System.Runtime.InteropServices.DllImport attribute.Additionally, due to the differences between multi-dimensional arrays in C# and Fortran, I have added two additional methods to convert a 2-dimensional array into a 1-dimensional one and vice versa.The Fortran DLL compiles without a problem, but the DSYEV function within C# is not functioning properly. The INFO parameter returns as -1, so a link seems to have been made, but the arrays A and W are returning with only 1 element when A should have N*N elements (to be reconverted back into 2D array) and W should have N elements.More information on the DSYEV subroutine can be found here.Basically, my question is: what am I doing wrong?INTEROP DECLARATION:

public static extern void DSYEV_INTEROP(string JOBZ, string UPLO, int N, ref double[] A, int LDA, ref double[] W, ref double[] WORK, int LWORK, ref int INFO);

FORTRAN DLL:

SUBROUTINE DSYEV_INTEROP(JOBZ, UPLO, N, A, LDA, W, WORK, LWORK, INFO)

    !DEC$ ATTRIBUTES DLLEXPORT::DSYEV_INTEROP
    !DEC$ ATTRIBUTES STDCALL::DSYEV_INTEROP
    !DEC$ ATTRIBUTES ALIAS:"DSYEV_INTEROP"::DSYEV_INTEROP

    !DEC$ ATTRIBUTES VALUE::JOBZ
    !DEC$ ATTRIBUTES VALUE::UPLO
    !DEC$ ATTRIBUTES VALUE::N
    !DEC$ ATTRIBUTES REFERENCE::A
    !DEC$ ATTRIBUTES VALUE::LDA
    !DEC$ ATTRIBUTES REFERENCE::W
    !DEC$ ATTRIBUTES REFERENCE::WORK
    !DEC$ ATTRIBUTES VALUE::LWORK
    !DEC$ ATTRIBUTES REFERENCE::INFO

    CHARACTER(LEN=1), INTENT(IN) :: JOBZ, UPLO
    INTEGER, INTENT(IN) :: N, LDA, LWORK
    INTEGER, INTENT(OUT) :: INFO
    !REAL*8, DIMENSION(:,:), INTENT(INOUT) :: A
    REAL*8, POINTER, DIMENSION(:,:) :: A
    REAL*8, DIMENSION(:), INTENT(INOUT) :: W, WORK

    CALL dsyev(JOBZ, UPLO, N, A, LDA, W, WORK, LWORK, INFO)

END SUBROUTINE DSYEV_INTEROP
19 posts / 0 new
Last post
For more complete information about compiler optimizations, see our Optimization Notice.

Please check the first parameter first. the info == -1 points thatthe 1-th parameter had an illegal value.

Hello,
You do not need to write wrappers. Please see examples of using MKL in C#:
http://software.intel.com/en-us/articles/using-intel-mkl-in-your-c-program/
You need not also to build custom dll if you use MKL 10.3.
Thanks,
Vladimir

Many thanks for the responses, and apologies for my delayed reply! I am looking into this option for .NET. Is there a simple way of knowing which data types should be passed by reference or value?

by reference.as an example, the Pardiso calling will be looks likephase = 11; Pardiso.pardiso (pt, ref maxfct, ref mnum, ref mtype, ref phase, ref n, a, ia, ja, idum, ref nrhs, iparm, ref msglvl, ddum, ddum, ref error);

I've tried the example in the download: the C# code is written. I am, however, experiencing some difficulty building the custom mkl.dll for the DSYEV routine.

I am not using the makefile, but am using it as an example to follow. I first compiled the _fseeki64.c file as in the makefile using the Microsoft C compiler (I don't have the Intel one). I then tried a variation on the link command, having set additional environment variables for the MKL libraries and compiler libraries:

MKLROOT=C:\Program Files (x86)\Intel\ComposerXE-2011\mkl\lib\intel64
OMPLIB=C:\Program Files (x86)\Intel\ComposerXE-2011\compiler\lib\intel64

where each file was preceeded with the necessary environment variable:

link /DLL /MACHINE:X64 /def:user.def _fseeki64.obj %MKLROOT%\mkl_intel_lp64_dll.lib %MKLROOT%\mkl_intel_thread_dll.lib %MKLROOT%\mkl_core_dll.lib %OMPLIB%\libiomp5md.lib msvcrt.lib user32.lib /out:mkl.dll

All files are found, but then I get the error:

LINK : error LNK2001: unresolved external symbol _DllMainCRTStartup
mkl.dll : fatal error LNK1120: 1 unresolved externals

Basically, what am I doing wrong?

may be You didn't set the MKLROOT?I checked how it works.nmake ia32 MKLROOT=C:\APPS\Intel\MKL\10.2.7
Microsoft Program Maintenance Utility Version 8.00.50727.42Copyright (C) Microsoft Corporation. All rights reserved.Add path of the MKL libraries to the lib environment variable set lib=%MKLROOT%\ia32\lib;%lib%MKL entries for custom dllWorkaround for pardisoMicrosoft 32-bit C/C++ Optimizing Compiler Version 14.00.50727.42 for 80x86Copyright (C) Microsoft Corporation. All rights reserved._fseeki64.cBuild MKL custom dll nmake mkl.dll MACHINE=IX86 MKL_LIB="mkl_intel_c_dll.lib mkl_intel_thread_dll.lib mkl_core_dll.lib" MSLIB=user32.libMicrosoft Program Maintenance Utility Version 8.00.50727.42Copyright (C) Microsoft Corporation. All rights reserved. link /DLL /MACHINE:IX86 /def:user.def _fseeki64.obj mkl_intel_c_dll.libmkl_intel_thread_dll.lib mkl_core_dll.lib libiomp5md.lib user32.lib /out:mkl.dllMicrosoft Incremental Linker Version 8.00.50727.42Copyright (C) Microsoft Corporation. All rights reserved. Creating library mkl.lib and object mkl.expAdd path of the mkl.dll and path of the MKL binaries to the path environment variable set path=c:\!!;%MKLROOT%\ia32\bin;%path%Build and run examples nmake /a dgemm.exe dgeev.exe dfti_d1.exe pardiso.exe vddiv.exeMicrosoft Program Maintenance Utility Version 8.00.50727.42Copyright (C) Microsoft Corporation. All rights reserved.Compile dgemm.cs csc .\dgemm.csMicrosoft Visual C# 2005 Compiler version 8.00.50727.4927for Microsoft Windows 2005 Framework version 2.0.50727Copyright (C) Microsoft Corporation 2001-2005. All rights reserved.Run dgemm example dgemm.exe....................

Right - I've got it working! I didn't realise I could just link the provided DLLs rather than build a custom one from the static libraries!Many thanks for all the assistance!

Hi,
I am using eigen values/vectors decomposition DSYEV in mkl for a university project and I need results to be reproductible.
I wrote a little wrapper (according to examples) to call DSYEV from C# :

namespace Eigen

{

class Solve

{

public double[,] decompo(double[,] A)

{

int lwork = -1;

int n = A.GetLength(0);

int lda = n;

double[] A_bis = new double[n * n];

double[] w = new double[n];

for (int i = 0; i < n; i++)

{

int cte = i * n;

for (int j = 0; j < n; j++)

{

A_bis[cte + j] = A[j, i];

}

}

lwork = -1;

double[] work_1 = new double[1];

int info = LAPACK_mkl.dsyev('V', 'U', n, A_bis, lda, w, work_1, lwork);

lwork = (int)work_1[0];

double[] work = new double[lwork];

info = LAPACK_mkl.dsyev('V', 'U', n, A_bis, lda, w, work, lwork);

double[,] res2 = new double[n, n + 1];

for (int i = 0; i < n; i++)

{

res2[i, 0] = w[i];

for (int j = 0; j < n; j++)

{

res2[i, j + 1] = A_bis[i + j * n];

}

}

GC.Collect(0);

return res2;

}

}

}

namespace mkl

{

public sealed class LAPACK_mkl

{

private LAPACK_mkl() { }

public static int dsyev(char jobz, char uplo, int N, double[] A, int LDA, double[] w, double[] work, int lwork)

{

LAPACKNative.kmp_set_warnings_off();

int num = Environment.ProcessorCount;

LAPACKNative.omp_set_num_threads(ref num);

int info = -1;

LAPACKNative.dsyev(ref jobz, ref uplo, ref N, A, ref LDA, w, work, ref lwork, ref info);

return info;

}

}

/** LAPACK native declarations */

[SuppressUnmanagedCodeSecurity]

internal sealed class LAPACKNative

{

[DllImport("libiomp5md", EntryPoint = "omp_set_num_threads")]

internal static extern void omp_set_num_threads(ref int num);

[DllImport("libiomp5md", EntryPoint = "kmp_set_warnings_off")]

internal static extern void kmp_set_warnings_off();

[DllImport("mkl", EntryPoint = "DSYEV", CallingConvention = CallingConvention.Cdecl), SuppressUnmanagedCodeSecurity]

internal static extern void dsyev(ref char jobz, ref char uplo, ref int n, double[] A, ref int lda, double[] w, double[] work, ref int lwork, ref int info);

private LAPACKNative()

{

kmp_set_warnings_off();

int num = Environment.ProcessorCount;

omp_set_num_threads(ref num);

}

}

}



I red something about memory alignment and 16-byte boundaries... does this apply here ?

Thank you for any answer or comment on my code

Yes, _aligned_malloc(size,16) rather than the default new for the arrays passed to MKL should improve numerical consistency and performance, at least on 32-bit Windows. On 64-bit Windows, perhaps new would default to 16-byte alignment.
I've seen an assertion that Intel C++ new would observe alignments specified by _declspec(align(16)). The Microsoft page states clearly that you must use _declspec, with _declspec(align(32)) preferred for improved caching and compatibility with AVX.
You omitted needed header definitions, which I wouldn't consider obvious.

First, thank you for the quick reply !As I mentionned in my previous post, this is C# code... and_aligned_malloc you are talking about is C++ (am I wrong?). Is there such a command, or a trick to work around and create arrays which are16-byte aligned, for C# ?And here is a list of definitions :using System;using System.Collections.Generic;using System.Linq;using System.Text;using System.Security;using System.Runtime.InteropServices;using mkl;Thank you

_aligned_malloc is probably simply a wrapper around malloc. The usual technique is to allocate a region, say 12 bytes longer than the data size (as you should be able to depend on 4-byte aligned malloc). Then you use the sub-region beginning at the first 16-byte aligned address in that region. If you free the region, you must free the entire region, including the bytes you skip. You shouldn't need a loop to find the aligned address, such as some of the web search references on c# data alignment show. You simply add 15 to the base pointer and mask off the low order bits, or find the remainder %16 of the base pointer and figure out how far up the region to start. For AVX, you probably want the extra 28 bytes, add 31 and mask to 32-byte alignment. I'm not fluent in C#, so should shut up now.

I discussed with a colleague who uses C#, and we suggest you use the mkl_malloc() function rather than c# new.

I am not fluent in C# either: the main resaon I have chosen C# instead of C++ is memory management, and it occurs to me that it becomes indeed more complex ...Do you have a code example on how to use mkl_malloc() in C# ?Thank you very much

No, I could get from my colleague an example of ipp malloc in C#. It should work like a plain malloc() with the addition of the alignment argument (by value).

That would be great !

 [DllImport("ippcore-7.0.dll",EntryPoint="ippMalloc")] private static extern
 IntPtr Malloc_x32 
 (int length );
 [DllImport("ippcoreem64t-7.0.dll",EntryPoint="ippMalloc")] private static extern
 IntPtr Malloc_x64 
 (int length );
 public IntPtr Malloc (int length ) {
  if (IntPtr.Size == 8)  
   return Malloc_x64(length);
  else
   return Malloc_x32(length);
 }

He checks whether to use the 32-bit or X64 ipp .dll (according to IntPtr.Size), then uses the ippMalloc, which is said to return a 64-byte aligned region. You could use that one, if you don't mind involving another big .dll, or use mkl_malloc(length, alignment)

For that matter, I would think you could still use the _aligned_malloc provided by Visual Studio.

We just noticed that VS2010 doc says the standard default malloc() provides 16-byte alignment. No idea if that would carry over to c# new. Prior to VS2010, malloc was 16-byte aligned only in the X64 version.

Okay, thank you.Using dependency walker on mkl_rt.dll, I foundMKL_MALLOC function, so it is ok.In mkl documentation, it seems that syntax is :
a_ptr=mkl_malloc(alloc_size,alignment);So i defined this :

        public static unsafe IntPtr mkl_malloc(int length)
        {
            LAPACKNative.kmp_set_warnings_off();
            int align = 32;
            IntPtr pointer = LAPACKNative.Malloc(length, align);
            return pointer;
        }
        [DllImport("mkl_rt.dll", EntryPoint = "MKL_MALLOC", CallingConvention = CallingConvention.Cdecl), SuppressUnmanagedCodeSecurity]
        internal static extern IntPtr Malloc(int length, int align);

Problem is, when I call this function this way :
IntPtr pointer_A_bis = LAPACK_mkl.mkl_malloc(n * n * sizeof(double));
I get an error concerning "an attempt to to read or write in protected memory. It often indicates another memory is damaged" ...Any idea ?

Finally, here is what I did (as I could not make mkl_malloc work...)

public unsafe double[] linsolve(ref double[,] A, ref double[] B)
        {
            int m = A.GetLength(0);
            int n = A.GetLength(1);
            int lwork = -1;
            int lda = m;
            int ldb = m;
            int nrhs = 1;
            int rank = 0;
            int align = 128;
            IntPtr pointer_A_bis_ini = Marshal.AllocHGlobal(m * n * sizeof(double) + align);
            IntPtr pointer_A_bis = pointer_A_bis_ini;
            IntPtr pointer_B_bis_ini = Marshal.AllocHGlobal(n * nrhs * sizeof(double) + align);
            IntPtr pointer_B_bis = pointer_B_bis_ini;
            Int64 adresse_A_bis = pointer_A_bis.ToInt64() % align;
            Int64 adresse_B_bis = pointer_B_bis.ToInt64() % align;
            int offset = 0;
            while (adresse_A_bis != 0)
            {
                offset++;
                pointer_A_bis = new IntPtr(pointer_A_bis_ini.ToInt64() + offset);
                adresse_A_bis = pointer_A_bis.ToInt64() % align;
            }
            offset = 0;
            while (adresse_B_bis != 0)
            {
                offset++;
                pointer_B_bis = new IntPtr(pointer_B_bis_ini.ToInt64() + offset);
                adresse_B_bis = pointer_B_bis.ToInt64() % align;
            }
            double* A_bis = (double*)pointer_A_bis.ToPointer();
            double* B_bis = (double*)pointer_B_bis.ToPointer();
            for (int i = 0; i < nrhs; i++)
            {
                int cte = i * m;
                for (int j = 0; j < m; j++)
                {
                    B_bis[cte + j] = B[j];
                }
            }
            for (int i = 0; i < n; i++)
            {
                int cte = i * m;
                for (int j = 0; j < m; j++)
                {
                    A_bis[cte + j] = A[j, i];
                }
            }
            if (m == n)
            {
                IntPtr pointer_IPIV_ini = Marshal.AllocHGlobal(m * sizeof(int) + align);
                IntPtr pointer_IPIV = pointer_IPIV_ini;
                Int64 adresse_IPIV = pointer_IPIV.ToInt64() % align;
                offset = 0;
                while (adresse_IPIV != 0)
                {
                    offset++;
                    pointer_IPIV = new IntPtr(pointer_IPIV_ini.ToInt64() + offset);
                    adresse_IPIV = pointer_IPIV.ToInt64() % align;
                }
                int* IPIV = (int*)pointer_IPIV.ToPointer();
                for (int i = 0; i < m; i++)
                {
                    IPIV[i] = 0;
                }
                LAPACK_mkl.dgetrf(m, m, A_bis, m, IPIV);
                LAPACK_mkl.dgetrs('N', m, nrhs, A_bis, m, IPIV, B_bis, m);
                double[] res2 = new double[n];
                for (int i = 0; i < n; i++)
                {
                    res2[i] = B_bis[i];
                }
                Marshal.FreeHGlobal(pointer_A_bis_ini);
                Marshal.FreeHGlobal(pointer_B_bis_ini);
                Marshal.FreeHGlobal(pointer_IPIV_ini);
                return res2;
            }
            else
            {
                int maxMN;
                if (m > n) { maxMN = m; } else { maxMN = n; }
                IntPtr pointer_JPVT_ini = Marshal.AllocHGlobal(n * sizeof(int) + align);
                IntPtr pointer_JPVT = pointer_JPVT_ini;
                Int64 adresse_JPVT = pointer_JPVT.ToInt64() % align;
                offset = 0;
                while (adresse_JPVT != 0)
                {
                    offset++;
                    pointer_JPVT = new IntPtr(pointer_JPVT_ini.ToInt64() + offset);
                    adresse_JPVT = pointer_JPVT.ToInt64() % align;
                }
                int* JPVT = (int*)pointer_JPVT.ToPointer();
                for (int i = 0; i < n; i++)
                {
                    JPVT[i] = 0;
                }
                double RCOND = 2.0E-16;
                IntPtr pointer_work_1 = Marshal.AllocHGlobal(sizeof(double));
                double* work_1 = (double*)pointer_work_1.ToPointer();
                LAPACK_mkl.dgelsy(m, n, nrhs, A_bis, lda, B_bis, maxMN, JPVT, RCOND, rank, work_1, lwork);
                lwork = (int)work_1[0];
                IntPtr pointer_work_ini = Marshal.AllocHGlobal(lwork * sizeof(double) + align);
                IntPtr pointer_work = pointer_work_ini;
                Int64 adresse_work = pointer_work.ToInt64() % align;
                offset = 0;
                while (adresse_work != 0)
                {
                    offset++;
                    pointer_work = new IntPtr(pointer_work_ini.ToInt64() + offset);
                    adresse_work = pointer_work.ToInt64() % align;
                }
                double* work = (double*)pointer_work.ToPointer();
                LAPACK_mkl.dgelsy(m, n, nrhs, A_bis, m, B_bis, maxMN, JPVT, RCOND, rank, work, lwork);
                
                double[] res2 = new double[n];
                for (int i = 0; i < n; i++)
                {
                    res2[i] = B_bis[i];
                }
                Marshal.FreeHGlobal(pointer_work_1);
                Marshal.FreeHGlobal(pointer_work_ini);
                Marshal.FreeHGlobal(pointer_A_bis_ini);
                Marshal.FreeHGlobal(pointer_B_bis_ini);
                Marshal.FreeHGlobal(pointer_JPVT_ini);
                GC.Collect(0);
                return res2;
            }
        }

It seems it's working fine.If anyone has a better idea, feel free to share !Best regards, Eric

Leave a Comment

Please sign in to add a comment. Not a member? Join today