Yes, this example crashes. We will check more carefully what is going wrong with this code.
MKL FFT crashes when multi-threaded and for non-power 2 size
We have discovered that the problem is caused by the AVX code path. As a temporary workaround, please try turning off the AVX branch by setting, for example, MKL_CBWR=SSE4_2.
I checked this approach on win7 and it works on my side.
--Gennady
Gennady,
Thanks for the quick response.
Setting SSE4_2 worked.
I could now run more tests, and the next example crashes for DFTI_COMPLEX_COMPLEX (but not for DFTI_COMPLEX_REAL).
The crash typically happens at nrOfTransforms 3, nrOfSamples 2658:
for (unsigned nrOfTransforms = 1; nrOfTransforms <= 5; ++nrOfTransforms)
{
for (unsigned nrOfSamples = 1; nrOfSamples <= 10000; ++nrOfSamples)
{
std::cout << "Test 3c, Forward FFT Real-2-complex out-of-place nrOfTransforms " << nrOfTransforms << ", nrOfSamples " << nrOfSamples << std::endl;
MKL_LONG status;
DFTI_DESCRIPTOR_HANDLE _fft;
// allocate buffers (deliberately oversized, to be sure that the FFT does not write beyond the allocated memory)
float *x_in = new float[nrOfSamples*nrOfTransforms*10];
std::complex<float> *x_out = new std::complex<float>[nrOfSamples*nrOfTransforms*10];
status = DftiCreateDescriptor( &_fft, DFTI_SINGLE, DFTI_REAL, 1, nrOfSamples);
checkStatus(status);
status = DftiSetValue(_fft, DFTI_PLACEMENT, DFTI_NOT_INPLACE);
checkStatus(status);
// Specify the number of transforms
status = DftiSetValue(_fft, DFTI_NUMBER_OF_TRANSFORMS, nrOfTransforms);
checkStatus(status);
//status = DftiSetValue(_fft, DFTI_CONJUGATE_EVEN_STORAGE, DFTI_COMPLEX_REAL);
status = DftiSetValue(_fft, DFTI_CONJUGATE_EVEN_STORAGE, DFTI_COMPLEX_COMPLEX);
checkStatus(status);
// The FFT is now fully specified
status = DftiCommitDescriptor( _fft );
// Calculate forward FFT
status = DftiComputeForward(_fft, x_in, x_out);
checkStatus(status);
// cleanup
delete[] x_in;
delete[] x_out;
status = DftiFreeDescriptor(&_fft);
checkStatus(status);
}
}
To specify how the multiple input and output vectors are laid out, you should do something like this before committing the descriptor:
DftiSetValue(_fft, DFTI_INPUT_DISTANCE, nrOfSamples);
DftiSetValue(_fft, DFTI_OUTPUT_DISTANCE, nrOfSamples/2+1);
This would tell the compute function that
1) real input element n of vector k is located in x_in[ n + nrOfSamples*k] (here n=0...nrOfSamples-1)
2) complex output element n of vector k is located in x_out[ n + (nrOfSamples/2+1)*k] (here n=0...nrOfSamples/2)
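The index arithmetic above can be sketched in plain C++ without calling MKL. This is only an illustration, assuming unit stride within each vector and the distances suggested above; the helper names (`inputDistance`, `outputDistance`, `realInputIndex`, `complexOutputIndex`, `requiredInputFloats`, `requiredOutputComplex`) are hypothetical and are not part of the MKL API.

```cpp
#include <cstddef>

// Hypothetical helpers (not MKL functions) that mirror the layout described above.

// Distance, in floats, between the first elements of consecutive real input vectors.
constexpr std::size_t inputDistance(std::size_t nrOfSamples) {
    return nrOfSamples;
}

// Distance, in complex elements, between consecutive output vectors.
// A real-to-complex transform of length N stores N/2+1 complex values.
constexpr std::size_t outputDistance(std::size_t nrOfSamples) {
    return nrOfSamples / 2 + 1;
}

// Position in x_in of real element n (0 <= n < nrOfSamples) of vector k.
constexpr std::size_t realInputIndex(std::size_t n, std::size_t k,
                                     std::size_t nrOfSamples) {
    return n + inputDistance(nrOfSamples) * k;
}

// Position in x_out of complex element n (0 <= n <= nrOfSamples/2) of vector k.
constexpr std::size_t complexOutputIndex(std::size_t n, std::size_t k,
                                         std::size_t nrOfSamples) {
    return n + outputDistance(nrOfSamples) * k;
}

// Minimum number of floats the input buffer must hold for this layout.
constexpr std::size_t requiredInputFloats(std::size_t nrOfSamples,
                                          std::size_t nrOfTransforms) {
    return inputDistance(nrOfSamples) * nrOfTransforms;
}

// Minimum number of complex elements the output buffer must hold for this layout.
constexpr std::size_t requiredOutputComplex(std::size_t nrOfSamples,
                                            std::size_t nrOfTransforms) {
    return outputDistance(nrOfSamples) * nrOfTransforms;
}
```

For example, with nrOfSamples = 8 and nrOfTransforms = 3 this layout needs 24 floats of input and 15 complex elements of output, and the last output element of the last vector sits at index 14 of x_out.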
Thanks
Dima
Dirk-Jan,
I have reproduced the problem, and the only workaround I can suggest is a sequential FFT.
In MKL 11.0.1 there is a DFTI_THREAD_LIMIT configuration setting, which should be set to 1 before calling DftiCommitDescriptor.
Thanks
Dima




MKL FFT crashes when multi-threaded and for non-power 2 size
BUG:
MKL FFT crashes (segmentation fault) for certain FFT sizes (for example 2496, when using complex numbers).
Crash observed with cpp_studio_xe_2013_update1_intel64.tgz
when compiled with icc and with gcc.
Crash not observed when compiled with icc and -mkl=sequential.
I am running it on an Intel® Xeon® Processor E5-2670 (8 cores per CPU).
for (unsigned nrOfSamples = 1; nrOfSamples < 10000; ++nrOfSamples)
{
std::cout << "nrOfSamples " << nrOfSamples << std::endl;
fflush(NULL);
MKL_LONG status;
DFTI_DESCRIPTOR_HANDLE _fft;
// Create the MKL FFT descriptor
status = DftiCreateDescriptor(&_fft, DFTI_SINGLE, DFTI_COMPLEX,1, nrOfSamples);
checkStatus(status);
// The FFT is now fully specified
status = DftiCommitDescriptor(_fft);
checkStatus(status);
// allocate buffer (deliberately oversized, to be sure that the in-place FFT does not write beyond the allocated memory)
std::complex<float> *x = new std::complex<float>[nrOfSamples*100];
// Calculate forward FFT
status = DftiComputeForward(_fft, x);
checkStatus(status);
// cleanup
delete[] x;
status = DftiFreeDescriptor(&_fft);
checkStatus(status);
}
-------------------------------------------------------------------
installed : cpp_studio_xe_2013_update1_intel64.tgz
OS : opensuse 12.2
-------------------------------------------------------------------
ICC compiler: crash observed
icc link options : -L$(MKLROOT)/lib/intel64 -lmkl_rt -lpthread -lm
compile options -mkl=parallel : crash (Signal name : SIGSEGV, Signal meaning : Segmentation fault)
Note : compile options -mkl=sequential : no crash observed
-------------------------------------------------------------------
GCC compiler: 4.7.1 : crash also observed
-------------------------------------------------------------------