Hi
I am trying to use MKL to do some matrix multiplications. However, I need to multiply specific columns of one matrix by another matrix. The problem is that cblas_dgemm operates on successive columns, so how can I multiply only certain columns, given that I know the starting pointer of each one? I tried using a for loop to multiply each column by the matrix, but the execution time is much larger than when taking successive columns all at once. I also don't want to copy the columns into a new variable to put them in successive order, because this also takes a long time.
I tried using pthreads to do it in a more parallel way, but there is still not much improvement.
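To make the per-column loop concrete, here is a minimal sketch of that approach. It assumes column-major storage and that the goal is C[:, k] = B * (k-th selected column). The function name multiply_selected_columns and all parameter names are made up for illustration, and plain loops stand in for the BLAS call so the snippet compiles without MKL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an MKL routine).
 * Computes C[:, k] = B * x_k for k = 0 .. ncols-1, where x_k = cols[k].
 *   B    : m x n matrix, column-major, leading dimension ldb
 *   cols : ncols pointers, each to the start of one selected column
 *          (n contiguous doubles)
 *   C    : m x ncols result, column-major, leading dimension ldc
 */
void multiply_selected_columns(size_t m, size_t n, size_t ncols,
                               const double *B, size_t ldb,
                               const double *const *cols,
                               double *C, size_t ldc)
{
    assert(B != NULL && cols != NULL && C != NULL);

    for (size_t k = 0; k < ncols; ++k) {
        const double *x = cols[k];   /* k-th selected column */
        double *y = C + k * ldc;     /* k-th output column   */

        for (size_t i = 0; i < m; ++i)
            y[i] = 0.0;

        /* One matrix-vector product per selected column. With a BLAS
         * library this inner part would be a single dgemv-style call. */
        for (size_t j = 0; j < n; ++j) {
            const double xj = x[j];
            const double *bj = B + j * ldb;
            for (size_t i = 0; i < m; ++i)
                y[i] += bj[i] * xj;
        }
    }
}
```

Each pass through the outer loop is an independent matrix-vector product, so B is read once per selected column instead of being reused across columns as in a single matrix-matrix call. That is probably a large part of why this version is slower.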
How does MKL handle parallelization and make it much better than OpenMP and pthreads?
Thanks in advance
Ahmed