Hello,
I am looking for ways to improve the performance of my 4096x4096 complex-to-complex DFT on MIC using the MKL DFT functions. On the following page, I found an attractive method:
Tip 3: the leading dimension size for a multi-dimensional DFT can increase the DFT performance from 62.137836 GFLOPS to 103.776588 GFLOPS.
( https://software.intel.com/en-us/articles/tuning-the-intel-mkl-dft-funct...)
I was surprised by the size of the performance gain, so I wanted to try padding_leading_dim. But when I run the example on my CPU, I do not get the higher performance; the result hardly changes at all.
Can I improve my DFT performance with this tip? I ask because it seems that a complex-to-complex transform may not need the input strides and output strides to be set.
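To make the question concrete, here is a minimal sketch of how I understand the padded layout. It is written in Python only to show the index arithmetic. The helper names and the pad of 8 elements are my own assumptions, not values from the article. As far as I understand, in MKL the stride triple would be passed with DftiSetValue using DFTI_INPUT_STRIDES and DFTI_OUTPUT_STRIDES.

```python
def padded_strides(cols, pad):
    """Stride triple (offset, row stride, column stride) for a row-major
    2D array whose rows are padded by `pad` extra elements."""
    return (0, cols + pad, 1)


def element_offset(strides, i, j):
    """Linear index of element (i, j) under the given stride triple."""
    return strides[0] + i * strides[1] + j * strides[2]


N = 4096   # transform size from my case
PAD = 8    # hypothetical padding, chosen only for illustration

tight = padded_strides(N, 0)     # rows stored back to back
padded = padded_strides(N, PAD)  # rows separated by PAD unused elements

# Where does row 1 start in each layout?
print(tight, element_offset(tight, 1, 0))
print(padded, element_offset(padded, 1, 0))

# The padded buffer must hold N rows of N + PAD elements each.
print("elements to allocate:", N * (N + PAD))
```

With the padding, each row starts N + PAD elements after the previous one instead of N, so consecutive rows are no longer exactly a power of two apart in memory. If that is the intended effect of the tip, I would expect to have to set the strides myself even for a complex-to-complex transform.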
Best Regards!