FFT Crash in 1024-point, Multicore, C->C, Out-of-place, w/ MKL10.2

I'm getting a crashing memory fault when computing a 1D, 1024 point, double precision, C->C FFT, out-of-place, on multiple cores. I'm using the latest 10.2 MKL multithreaded libraries.

Has anybody seen this? The bug does not surface on a single core or for in-place computation (I ran millions over the weekend). With the same code, a 1023 or a 1025 point C->C FFT computes fine. 2D seems fine.

Any information would be helpful.

Paul

CenterSpace Software

Quoting - paulshirkey

Hi Victor,

sorry for the delay. Yes, I do have this crashing problem for other power of 2 lengths - I have verified the crash for out-of-place c->c FFT lengths of 512 & 256.

Can you give me any update on when I could expect a patch for this issue?

Thank you for your help on this.

Paul

Research Scientist
CenterSpace Software

Hi Paul,

Could you please send us a small C# test case that reproduces your problem? If you also see crashes with sizes 256 and 512, then your original problem is most likely not related to 8-byte alignment of the data.

Thanks in advance for your time

Thanks,
-- Victor

Here is the code that is causing the crash. I also found that the SINGLE precision case is fine - this only crashes for the DOUBLE precision cases.

I've only seen this crash on a 64-bit Windows OS running on multiple cores (eight). The single-core case, on my 32-bit Windows machine, runs fine.

Running MKL for Windows 10.1, update 3.

Any thoughts?

Thanks,
Paul

// FFTBug.c : Defines the entry point for the console application.
//

#include <stdio.h>
#include "stdlib.h"
#include "mkl_dfti.h"

typedef struct Complex
{
    double real;
    double imag;
} Complex16;

// This computes a series of C->C double precision FFT's, doubling the
// length of the FFT at each iteration. The starting length is
// defined by the var 'fftlength'.
int main(int argc, char* argv[])
{

    Complex16 *x_in = (Complex16*) malloc(1024*1024*sizeof(Complex16));
    Complex16 *x_out = (Complex16*) malloc(1024*1024*sizeof(Complex16));

    int fftlength = 512; // Program fails starting with this length.

    //int fftlength = 513; // Program finished starting with this length.

    int doublings = 9;
    int iterations = 10;

    DFTI_DESCRIPTOR *my_desc1_handle;

    int d, i;
    for( d = 0; d < doublings ; d++)
    {

        printf("\nProcessing fft of length: %i\n",fftlength);

        // Must be a DFTI_DOUBLE precision FFT for the failure to occur.
        // If we switch to DFTI_SINGLE the program will finish.
        DftiCreateDescriptor( &my_desc1_handle, DFTI_DOUBLE, DFTI_COMPLEX, 1, fftlength);
        DftiCommitDescriptor( my_desc1_handle );

        if(DftiSetValue( my_desc1_handle, DFTI_PLACEMENT, DFTI_NOT_INPLACE) == 0)
        {
            printf("Successfully set descriptor to DFTI_NOT_INPLACE.\n");
        }
        else
        {
            printf("Failed to set descriptor to DFTI_NOT_INPLACE.\n");
        }

        if(DftiCommitDescriptor( my_desc1_handle ) == 0)
        {
            printf("Successfully committed descriptor.\n");
        }
        else
        {
            printf("Failed to commit descriptor.\n");
        }

        for ( i = 0; i < iterations; i++)
        {
            DftiComputeForward( my_desc1_handle, x_in, x_out);
        }

        printf("Finished processing fft of length: %i\n\n",fftlength);

        fftlength *= 2;
    }

    // Note: the descriptor is created on every pass but freed only once here.
    DftiFreeDescriptor(&my_desc1_handle);

    free(x_in);
    free(x_out);

    printf("Finished.");
    getchar();

    return 0;
}

CenterSpace Software

Hi,

Looks like the computation fails on 8-byte aligned data. As a workaround, please make sure x_in/x_out are 16-byte aligned.

Thanks
Dima
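
A minimal sketch of what the 16-byte alignment workaround could look like in plain C. The helper names (is_aligned, aligned_malloc_sketch, aligned_free_sketch) are hypothetical and not part of MKL. The sketch simply over-allocates with malloc and rounds the pointer up. On Windows, _aligned_malloc/_aligned_free, or MKL's own mkl_malloc/mkl_free if your MKL version provides them, are probably the more natural choice.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if p lies on an `alignment`-byte boundary.
   `alignment` must be a power of two. */
int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Hypothetical helper: over-allocates with malloc and returns an aligned
   pointer inside the block. The pointer originally returned by malloc is
   stored just below the aligned address so it can be freed later. */
void *aligned_malloc_sketch(size_t bytes, size_t alignment)
{
    assert(alignment >= sizeof(void *) && (alignment & (alignment - 1)) == 0);

    void *raw = malloc(bytes + alignment + sizeof(void *));
    if (raw == NULL)
        return NULL;

    /* Leave room for the bookkeeping pointer, then round up. */
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);

    ((void **)aligned)[-1] = raw;
    return (void *)aligned;
}

/* Releases a block obtained from aligned_malloc_sketch. */
void aligned_free_sketch(void *p)
{
    if (p != NULL)
        free(((void **)p)[-1]);
}
```

In the test program above, x_in and x_out would then come from aligned_malloc_sketch(1024*1024*sizeof(Complex16), 16) instead of malloc, and be released with aligned_free_sketch.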

Quoting - Dmitry Baksheev (Intel)

Hi,

Looks like the computation fails on 8-byte aligned data. As a workaround, please make sure x_in/x_out are 16-byte aligned.

Thanks
Dima

I appreciate the quick reply and verification Dima. Also, I don't know if this affects this issue, but I've verified that we are using the latest MKL version 10.2.

We have opened support ticket #563230 for this issue.

We build .NET/C# computational libraries and I'm calling the MKL routines from C++/cli. The arrays that I am processing are passed into my C++/cli kernel from C#, so I don't have any alignment control. If the array is misaligned, I'd have to do an array copy which is computationally infeasible.

Thanks again,

Paul

Center Space Software
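
When the caller does not control alignment, one possible compromise is to copy only the buffers that actually arrive misaligned. The sketch below is an illustration under assumptions, not something proposed in this thread: aligned_or_copied is a hypothetical helper, and scratch is assumed to be a preallocated, 16-byte aligned buffer at least as large as the input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns `src` unchanged when it is already 16-byte
   aligned, so no copy is made. Otherwise it copies `bytes` bytes into the
   caller-supplied aligned `scratch` buffer and returns that instead. */
const void *aligned_or_copied(const void *src, size_t bytes, void *scratch)
{
    /* The scratch buffer itself must satisfy the alignment requirement. */
    assert(((uintptr_t)scratch & 15u) == 0);

    if (((uintptr_t)src & 15u) == 0)
        return src;

    memcpy(scratch, src, bytes);
    return scratch;
}
```

The returned pointer would be passed as the input to the FFT call. Whether the copy cost is acceptable depends on how often the incoming arrays turn out to be misaligned.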

I'm aligning my x_in/x_out arrays to 16 byte address boundaries and I unfortunately still get an occasional crash.

Paul

CenterSpace Software

Hi Paul,

Could the occasional crashes you see be caused by the descriptor being freed only once at the end instead of inside the loop? Thank you for submitting the issue. Unfortunately, the problem is present in MKL 10.2 too.

Thanks
Dima

Paul,

If you use the C# interfaces to MKL DFTI, problems will likely occur when you call the DftiSetValue function to set forward/backward scales. The cause is an improper call of the variable-argument function from the C# wrapper. The wrappers will be updated in the future, but there is a workaround; let me know if you need it.

Thanks
Dima

Quoting - Dmitry Baksheev (Intel)

Hi Paul,

Could the occasional crashes you see be caused by the descriptor being freed only once at the end instead of inside the loop? Thank you for submitting the issue. Unfortunately, the problem is present in MKL 10.2 too.

Thanks
Dima

I noticed that later and fixed it, but it didn't affect the issue.

Paul

CenterSpace Software

Quoting - Dmitry Baksheev (Intel)
Paul,

If you use the C# interfaces to MKL DFTI, problems will likely occur when you call the DftiSetValue function to set forward/backward scales. The cause is an improper call of the variable-argument function from the C# wrapper. The wrappers will be updated in the future, but there is a workaround; let me know if you need it.

Thanks
Dima

I'm not right now, but thanks for the heads up Dima. Our bug report ticket is in process.

Paul

CenterSpace Software

Hi Paul,

You found that the FFT crashes for the 1024-point c2c out-of-place double precision case, but if the cause is alignment you should see analogous problems for other powers of 2. Please confirm that.

Thanks in advance

Thanks,
-- Victor

Quoting - Victor Pasko (Intel)
Hi Paul,

You found that the FFT crashes for the 1024-point c2c out-of-place double precision case, but if the cause is alignment you should see analogous problems for other powers of 2. Please confirm that.
Thanks in advance

-- Victor

Hi Victor,

sorry for the delay. Yes, I do have this crashing problem for other power of 2 lengths - I have verified the crash for out-of-place c->c FFT lengths of 512 & 256.

Can you give me any update on when I could expect a patch for this issue?

Thank you for your help on this.

Paul

Research Scientist
CenterSpace Software

Quoting - Victor Pasko (Intel)

Hi Paul,

Could you please send us small C# test-case to reproduce your problem? If ..... [deleted text]

Thanks in advance for your time

Hi Victor,

I can do this but it is quite a bit of work that I would like to avoid unless it is absolutely necessary. I don't have a pure C# example.

I've already sent an example that crashes in C; why is that not sufficient? Have you not been able to reproduce this bug using the C code? I have also sent all of the machine parameters describing the 8-core, 64-bit OS machine on which we see this issue. I've never seen this crash on my single-core, 32-bit OS machine.

Paul

CenterSpace Software

Hi Paul,

As for 1024 and other powers of 2 (double precision), the problem with 8-byte aligned data is fixed.

But you wrote:
Yes, I do have this crashing problem for other power of 2 lengths - I have verified the crash for out-of-place c->c FFT lengths of 512 & 256.

Therefore, we need your C# test case in order to reproduce the crash for 256 and 512, because it is not related to data alignment.

Thanks
--Victor

Thanks,
-- Victor

Quoting - Victor Pasko (Intel)
Hi Paul,

As to 1024 and other powers of 2 (double precision) the problem with 8-byte aligned data is fixed.

[...]

Thanks
--Victor

That is good news that the 1024 issue has been fixed. Thank you.

How/When will that fix be released?

Paul

CenterSpace Software

Quoting - Paul

That is good news that the 1024 issue has been fixed. Thank you.

How/When will that fix be released?

Paul

Hi Paul,

The fixes are in MKL 10.2.3, which will be available soon.

--Victor

Thanks,
-- Victor

Quoting - Paul

I appreciate the quick reply and verification Dima. Also, I don't know if this affects this issue, but I've verified that we are using the latest MKL version 10.2.

We have opened support ticket #563230 for this issue.

We build .NET/C# computational libraries and I'm calling the MKL routines from C++/cli. The arrays that I am processing are passed into my C++/cli kernel from C#, so I don't have any alignment control. If the array is misaligned, I'd have to do an array copy which is computationally infeasible.

Thanks again,

Paul

Center Space Software

Intel MKL 10.2 Update 3 is now available.
The problem discussed in this thread has been fixed in this update.
Please see the "Intel MKL 10.2 Update 3 is now available" announcement, which contains the link to the Intel Registration Center for the download. Could you please let us know if the problem still exists?
--Gennady
