mkl_dcsrmm multiplies a CSR sparse matrix by a dense matrix, storing the result in a dense matrix.
From what I can tell, this function expects the B and C dense matrices to be in row-major order, as opposed to the column-major order used in most other BLAS routines. I've attached a test case which demonstrates that, given the same B and C inputs in the same format, mkl_dcsrmm produces completely different results than dgemm. dgemm calculates C = alpha*A*B + beta*C (as expected), but mkl_dcsrmm appears to calculate C' = alpha*A*B' + beta*C' instead.
This seems to make it very difficult to mix sparse functions with normal BLAS function. Must I transpose my matrices before and after using the sparse functions?
I tried using mkl_dcscmm instead, but it seems to do the same thing. In fact, I think that mkl_dcsrmm and mkl_dcscmm are identical except that they invert the meaning of the transa parameter.
Do all of the sparse matrix-matrix multiply functions expect the dense parts to be row-major? If so, what is the best practise for using these beside traditional column-major BLAS functions?
(I've found that this question has been asked before, but not answered.)