How do I specify a stride between elements when computing a 3D DCT in terms of 1D DCTs?

How do I specify a stride between elements when computing a 3D DCT in terms of 1D DCTs?

Hi,

I am working on software for acoustics simulation, which requires 3D Discrete Cosine Transforms to be performed. FFTW3 provides the fftwf_plan_r2r_3d function to create plans for just this purpose:

fftwf_plan plan = fftwf_plan_r2r_3d(Nx, Ny, Nz, gInData, gOutData, FFTW_REDFT10, FFTW_REDFT10, FFTW_REDFT10, FFTW_PRESERVE_INPUT);
fftwf_execute(plan);

However, MKL does not support 3D DCTs. So I tried to implement the 3D DCT in terms of multiple 1D DCTs. For some of these 1D DCTs, I need to specify strides between consecutive elements. Again, FFTW3 allows this via the function fftwf_plan_many_r2r, which also doesn't seem to be supported in MKL (the returned plans from MKL's FFTW3 wrappers are always NULL).

fftwf_r2r_kind kind = FFTW_REDFT10;
fftwf_plan planX = fftwf_plan_many_r2r(1, &Nx, 1, gInData, NULL, Ny*Nz, 1, gOutData, NULL, Ny*Nz, 1, &kind, FFTW_PRESERVE_INPUT);
fftwf_plan planY = fftwf_plan_many_r2r(1, &Ny, 1, gInData, NULL, Nz, 1, gOutData, NULL, Nz, 1, &kind, FFTW_PRESERVE_INPUT);
fftwf_plan planZ = fftwf_plan_many_r2r(1, &Nz, 1, gInData, NULL, 1, 1, gOutData, NULL, 1, 1, &kind, FFTW_PRESERVE_INPUT);
for (int x = 0; x < Nx; x++)
    for (int y = 0; y < Ny; y++)
        fftwf_execute_r2r(planZ, &gInData[x*Ny*Nz + y*Nz], &gOutData[x*Ny*Nz + y*Nz]);
for (int x = 0; x < Nx; x++)
    for (int z = 0; z < Nz; z++)
        fftwf_execute_r2r(planY, &gOutData[x*Ny*Nz + z], &gInData[x*Ny*Nz + z]);
for (int y = 0; y < Ny; y++)
    for (int z = 0; z < Nz; z++)
        fftwf_execute_r2r(planX, &gInData[y*Nz + z], &gOutData[y*Nz + z]);
 

I next tried the raw Trigonometric Transforms functions, s_init_trig_transform and the like. These don't seem to offer any way of setting a stride between elements. Since s_commit_trig_transform returns a DFTI_DESCRIPTOR_HANDLE, I tried using DftiSetValue to specify a stride on the descriptor, followed by a call to DftiCommitDescriptor.

int tt_type = MKL_STAGGERED_COSINE_TRANSFORM;
 int ir;
 DFTI_DESCRIPTOR_HANDLE dftiX, dftiY, dftiZ;
int iparX[128];
 float* sparX = new float[3*Nx/2 + 2];
 s_init_trig_transform(&Nx, &tt_type, iparX, sparX, &ir);
 s_commit_trig_transform(gInData, &dftiX, iparX, sparX, &ir);
 int xStride[2] = {0, Ny*Nz};
 DftiSetValue(dftiX, DFTI_INPUT_STRIDES, xStride);
 DftiSetValue(dftiX, DFTI_OUTPUT_STRIDES, xStride);
 DftiCommitDescriptor(dftiX);
int iparY[128];
 float* sparY = new float[3*Ny/2 + 2];
 s_init_trig_transform(&Ny, &tt_type, iparY, sparY, &ir);
 s_commit_trig_transform(gInData, &dftiY, iparY, sparY, &ir);
 int yStride[2] = {0, Nz};
 DftiSetValue(dftiY, DFTI_INPUT_STRIDES, yStride);
 DftiSetValue(dftiY, DFTI_OUTPUT_STRIDES, yStride);
 DftiCommitDescriptor(dftiY);
int iparZ[128];
 float* sparZ = new float[3*Nz/2 + 2];
 s_init_trig_transform(&Nz, &tt_type, iparZ, sparZ, &ir);
 s_commit_trig_transform(gInData, &dftiZ, iparZ, sparZ, &ir);
iparX[6] = 0;
 iparY[6] = 0;
 iparZ[6] = 0;
iparX[10] = 1;
 iparY[10] = 1;
 iparZ[10] = 1;
iparX[14] = Nx;
 iparY[14] = Ny;
 iparZ[14] = Nz;
memcpy(gOutData, gInData, Nx*Ny*Nz*sizeof(float));
for (int x = 0; x < Nx; x++)
    for (int y = 0; y < Ny; y++)
        s_backward_trig_transform(&gOutData[x*Ny*Nz + y*Nz], &dftiZ, iparZ, sparZ, &ir);
for (int x = 0; x < Nx; x++)
    for (int z = 0; z < Nz; z++)
        s_backward_trig_transform(&gOutData[x*Ny*Nz + z], &dftiY, iparY, sparY, &ir);
for (int y = 0; y < Ny; y++)
    for (int z = 0; z < Nz; z++)
        s_backward_trig_transform(&gOutData[y*Nz + z], &dftiX, iparX, sparX, &ir);
 

However, this does not produce correct results (i.e., the results don't match at all with the FFTW 3D DCT or the FFTW multiple-1D-DCT results). I am using MKL 10.3 Update 9 on Windows (32-bit).

Have I missed something here? Is there a way to either a) specify strides for 1D DCTs, or b) perform 3D DCTs using MKL? In the absence of a way to use MKL to compute 3D DCTs, I will be forced to switch away from using MKL.

Thanks,
Lakulish Antani. 

3 posts / 0 new
Last post
For more complete information about compiler optimizations, see our Optimization Notice.
Dmitry Baksheev (Intel)'s picture

Hi Lakulish,

Thank you very much for very detailed description of the problem and the ways you have been trying to solve it.

Unfortunately, MKL does not support non-unit stride DCTs.

Thanks
Dima

Hi Dmitry,

Thank you for your reply. It's unfortunate that MKL doesn't support non-unit-stride DCTs. Therefore, I would like to report a feature request to the MKL team, for a) non-unit-stride DCTs, and b) 3D DCTs. Let me know if this forum post suffices for this purpose, or if there's another place for me to report such a feature request.

Meanwhile, I will try implementing what I need in terms of non-unit-stride FFTs, which seem to support non-unit strides.

Thanks,
Lakulish.

Login to leave a comment.