# 3D FFT in MKL with data larger than cache

Hi,

I am working on a 3D numerical integrator for a non-linear PDE using the parallel FFT library included in MKL.

My arrays consist of 2^30 data points, which is far larger than the cache. As a result, ~50% of cache references are misses, so a large share of the execution time is spent purely on memory access.
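To put a rough number on the size mismatch, here is a small sketch of the arithmetic. It assumes each data point is a double-precision complex value (16 bytes) and a hypothetical 16 MiB last-level cache. Both figures are assumptions for illustration, and the function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to hold n_points elements of bytes_per_point each.
 * Illustrative helper; not an MKL function. */
uint64_t array_bytes(uint64_t n_points, uint64_t bytes_per_point)
{
    /* Guard against overflow of the 64-bit product. */
    assert(bytes_per_point == 0 || n_points <= UINT64_MAX / bytes_per_point);
    return n_points * bytes_per_point;
}

/* How many times larger the array is than a cache of cache_bytes. */
double array_to_cache_ratio(uint64_t array_size, uint64_t cache_bytes)
{
    assert(cache_bytes > 0);
    return (double)array_size / (double)cache_bytes;
}
```

Under those assumptions, `array_bytes(1ULL << 30, 16)` is 2^34 bytes (16 GiB), and `array_to_cache_ratio` against `16ULL << 20` gives 1024, so only a tiny fraction of the array can be cache-resident at any time.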

Is there a clever way I can deal with this? Is it expected to have 50% cache misses using an array this large?

Any help would be much appreciated.

Thanks,

Dylan

# MKL sparse BLAS

Hi all,

Are the following MKL sparse BLAS functions threaded?
'mkl_cspblas_?coosymv', 'mkl_?coosymv' and 'mkl_sparse_?_mv' (11.3 beta)

I am not able to run these functions with more than one thread.

Thanks in advance,

Roberto Souto.

# Problem with ?gemm and AMD processor

My program is compiled with the latest revision of IVF 2015 Composer edition. It uses a number of MKL functions, including CGEMM and ZGEMM. The code that calls the MKL functions is unchanged and has worked flawlessly for many years through numerous versions of IVF, and of CVF before that.

# MKL 3D FFT of 1D function (fortran)

Dear all,

I want to perform a 3D Fourier transform (using Fortran) of a complex function stored in an array of 10^9 elements, where each element corresponds to a 3D vector k. Each element of the array is in fact a particle in Fourier space, with the vector k as its wavenumber.

In a mathematical notation I want to do:

$$F(\mathbf{x}) = \iiint f(\mathbf{k})\, e^{i\,\mathbf{k}\cdot\mathbf{x}}\, d^3k$$

F(x) is a real function and x is a real 3D vector

f(k) is a complex function of the vector k

k·x is the dot product of the two 3D vectors

# Why can't MKL 11.2.3 use all the CPUs on an Intel(R) Core(TM) i7-4700HQ CPU @ 2.40GHz?

Hi:

I use mkl_dss to solve a problem. I already call mkl_set_num_threads(8) to set the maximum number of threads for my machine. But when I run the program and watch it with the top command, only 4 CPUs are running at 100%; the other 4 sit at about 1%. My CPU is an Intel(R) Core(TM) i7-4700HQ CPU @ 2.40GHz, with 4 cores and 8 threads.

However, when I run the same program on another machine, an Intel(R) Xeon(R) CPU X5650 @ 2.67GHz with 6 cores and 12 threads, and call mkl_set_num_threads(12) to set the maximum number of threads, the program takes full advantage of all 12 CPUs.

# about in-place VML operations

Hi all,

I am writing to check whether the MKL VML library supports in-place operations such as vzmul(100, x, y, x) or vzmul(100, x, y, y). I saw a post in this forum that said VML supports in-place operations, but when I followed the link to the documentation, it did not say anything either way about in-place operations. Thanks.

# Same numbers, different sign

Dear MKL forum,

I am rewriting some code that used Fortran plus an open-source math library so that it uses MKL instead. During validation I found that the function LAPACKE_dsyev reproduces the solution of the former code exactly. However, in the eigenvectors some signs are changed: in several cases, entries that should be positive are negative.

The input matrix is exactly the same. The Fortran code:

call dsyev ( jobz , uplo , m, dd , m, w, work , lm, info )

The C code:

LAPACKE_dsyev(LAPACK_COL_MAJOR, jobz, uplo, m, temp, m, w);
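An eigenvector is only defined up to a nonzero scale factor, so for normalized real eigenvectors both v and -v are equally valid, and two correct LAPACK builds may return columns with opposite signs. When comparing outputs, one option is to bring both sets to a common sign convention first. The following is a minimal sketch, assuming the eigenvectors are stored as columns in column-major order; `normalize_eigenvector_signs` is a hypothetical helper, not part of LAPACK or MKL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of LAPACK or MKL.
 * v holds ncols eigenvectors as columns of length n, column-major,
 * with leading dimension ldv. Each column is flipped, if needed,
 * so that its largest-magnitude entry is positive. */
void normalize_eigenvector_signs(size_t n, size_t ncols, double *v, size_t ldv)
{
    assert(v != NULL && ldv >= n);
    for (size_t j = 0; j < ncols; ++j) {
        double *col = v + j * ldv;
        size_t imax = 0;
        double amax = 0.0;
        /* Find the entry with the largest absolute value. */
        for (size_t i = 0; i < n; ++i) {
            double a = col[i] < 0.0 ? -col[i] : col[i];
            if (a > amax) { amax = a; imax = i; }
        }
        /* Flip the whole column if that entry is negative. */
        if (n > 0 && col[imax] < 0.0) {
            for (size_t i = 0; i < n; ++i) col[i] = -col[i];
        }
    }
}
```

Applying the same normalization to the eigenvectors from both codes before comparing them removes differences that are purely a matter of sign. It does not address repeated eigenvalues, where the eigenvectors can differ by more than a sign.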

# Failure during collective

Hi ,

I have compiled espresso with Intel MPI and the MKL library, but I get a "Failure during collective" error, whereas it works fine with Open MPI.

Is there a problem with Intel MPI?

# FEAST Eigensolver not returning all eigenvalues in specified range

Hi,

I have come across a problem with FEAST in MKL 11.2: it is not returning all the eigenvalues in the specified search range. The problem type is Generalized Sparse (feast_scsrgv).

# Compiling error: undefined reference to `__isoc99_sscanf'

Hello, everyone,

I tried to use MKL to compile some Fortran code, but I got the errors below. I have also attached the makefile. It seems that something was not linked. Can anyone give me some hints? Thanks!

ifort  -w -fast -DMKL_ILP64 -m64  -c LinearSolverCSR.f