Hi!
I am steadily progressing in my quest to convert from a gcc to an icc (and MKL) based build of my code. The reason for the shift is to be able to run my code on the Xeon Phi.
I have managed to get the thing to compile. However, when I run the code, I get an error:
fftw_die: rfftw() is not implemented because MKL DFTI doesn't support half-complex data layout
It looks like I have hit a function that MKL does not support. I can avoid the branch that calls those functions (I basically pre-generate the needed data and reload it when using the Phi), so I can already run a limited version of the code, but that is not a long-term solution. So what are my options?
- Option 1: Get FFTW (I am using v.2, because I need the MPI-parallelized functions) to compile on the Phi. I tried to force FFTW to cross-compile, but that didn't work (I am not a cross-compiling/configure guru though). Since FFTW is pretty clever at optimizing itself at compile time, I suppose that getting it to compile for the Phi is not trivial. Has anybody succeeded? Are there any "magic lines" that would give me a running FFTW v2 on the Phi? At this point, I just need functioning rfftw functions; the others would be taken from MKL. Performance is not really an issue, as the rfftw functions are only called at an initialization stage of the code.
- Option 2: Replace the rfftw functions with supported functions. Unfortunately, the part of the code using rfftw is in C++ (I don't do C++ at all, so I would have to learn at least the basics), and I have no idea how the code internals work (it wasn't written by me, I just use it as a library; of course I understand the final result it produces, just not its internal workings). Fortunately (?), the rfftw functions don't seem to be used widely in that library. When I grep for rfftw, only a few hits appear. This makes me think that replacing the functions with something else would not be such a daunting task. The grep hits are:
in file fftw.cc:
[...]
p = rfftw_create_plan(N, FFTW_REAL_TO_COMPLEX, FFTW_MEASURE | FFTW_USE_WISDOM); \
[...]
p = rfftw2d_create_plan(N, M, FFTW_REAL_TO_COMPLEX, FFTW_MEASURE | FFTW_USE_WISDOM);
And in another file, called fftw.h:
[...]
rfftw_destroy_plan(p);
[...]
rfftw_plan p;
[...]
inline void fft::run(fftw_real *in, int pp) {
rfftw_one(p, in, out + pp*N);
}
[...]
/*! fftw plan generation */
void make_plan();
int
N, /*!< number of rows */
M, /*!< number of columns */
out_length; /*!< fft2d output length */
rfftwnd_plan p; /*!< fftw plan for fft2d computation */
fftw_complex *out; /*!< the output of the fft2d of type fftw_complex which is a structure type containing two doubles, the real part (real) and the imaginary part (im). */
};
/* computates a 2D fft of N*M elements*/
inline void fft2d::run(fftw_real *in) {
run(in, 0);
}
/* computes the ppth 2D fft of N*M elements*/
inline void fft2d::run(fftw_real *in, int pp) {
rfftwnd_one_real_to_complex(p, in, out + pp*out_length);
}
And that's about it. I am not really worried about the plan creation and destruction functions, as those are basically ignored by MKL (right?). So I would just need to change a couple of functions (and first figure out what they do).
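For reference, here is a minimal sketch of what the half-complex layout from the error message looks like, assuming the FFTW 2 convention for rfftw_one: for a real input of length n, the output is a real array r0, r1, ..., r(n/2), i((n+1)/2-1), ..., i2, i1. The helper names naive_r2c and to_halfcomplex are hypothetical and not part of FFTW, MKL or the library in question. The naive O(n^2) transform only stands in for whatever supported real-to-complex routine would actually produce the n/2+1 complex bins; the repacking step is the part that would be needed for Option 2.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a supported real-to-complex FFT:
// naive O(n^2) forward DFT, Y[k] = sum_j x[j] * exp(-2*pi*i*j*k/n),
// returning only the n/2+1 non-redundant bins.
std::vector<std::complex<double>> naive_r2c(const std::vector<double>& in) {
    const std::size_t n = in.size();
    if (n == 0) return {};
    const double two_pi = 2.0 * std::acos(-1.0);
    std::vector<std::complex<double>> out(n / 2 + 1);
    for (std::size_t k = 0; k < out.size(); ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            const double angle =
                -two_pi * static_cast<double>(j * k) / static_cast<double>(n);
            sum += in[j] * std::complex<double>(std::cos(angle), std::sin(angle));
        }
        out[k] = sum;
    }
    return out;
}

// Repack n/2+1 complex bins into the half-complex layout (assumed FFTW 2
// convention): hc[k] holds the real part of bin k for 0 <= k <= n/2, and
// hc[n-k] holds the imaginary part of bin k for 0 < k < (n+1)/2.
// The imaginary parts of bin 0 (and of bin n/2 for even n) are zero
// and are not stored.
std::vector<double> to_halfcomplex(const std::vector<std::complex<double>>& bins,
                                   std::size_t n) {
    std::vector<double> hc(n);
    if (n == 0) return hc;
    for (std::size_t k = 0; k <= n / 2; ++k) {
        hc[k] = bins[k].real();
    }
    for (std::size_t k = 1; k < (n + 1) / 2; ++k) {
        hc[n - k] = bins[k].imag();
    }
    return hc;
}
```

Calling to_halfcomplex(naive_r2c(x), x.size()) on a real vector x gives an array of the same length as x, which is what the existing fft::run would expect to find at out + pp*N if the assumption about the layout holds.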
So, for now, I am rooting for Option 1, as that would involve little work (I hope). But Option 2 doesn't sound completely undoable.
What do you experts think?
Thanks in advance !
Miska