I need to solve a system of linear equations with three right-hand-side (rhs) vectors. Initially, I was using the sequential version of MKL (linking against "libmkl_sequential.a") and solving for each rhs vector in turn:
dss_solve_real(DSS_handle, solOpt, rhs1, 1, x1);
dss_solve_real(DSS_handle, solOpt, rhs1 + numOfVars, 1, x1 + numOfVars);
dss_solve_real(DSS_handle, solOpt, rhs1 + 2*numOfVars, 1, x1 + 2*numOfVars);
where numOfVars is the number of variables.
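To make the indexing explicit: the calls above assume that the three rhs vectors are stored back to back in one buffer, so vector k (counting from 0) starts at offset k*numOfVars. The helper names below (rhs_column, fill_demo_rhs) are made up for illustration and are not part of MKL; this is only a sketch of the storage layout, not of the solver call itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: pointer to the k-th right-hand side (0-based) inside
   a buffer that stores nRhs vectors of length numOfVars back to back. */
static double *rhs_column(double *base, size_t numOfVars, size_t nRhs, size_t k)
{
    assert(k < nRhs); /* guard against indexing past the last vector */
    return base + k * numOfVars;
}

/* Hypothetical helper: fill the buffer so that entry i of vector k holds
   100*k + i, which makes the layout easy to inspect. */
static void fill_demo_rhs(double *base, size_t numOfVars, size_t nRhs)
{
    for (size_t k = 0; k < nRhs; ++k) {
        double *col = rhs_column(base, numOfVars, nRhs, k);
        for (size_t i = 0; i < numOfVars; ++i)
            col[i] = 100.0 * (double)k + (double)i;
    }
}
```

With numOfVars = 4 and three vectors, rhs_column(buf, 4, 3, 1) is the same address as buf + 4, which corresponds to the rhs1 + numOfVars argument in the second call above.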
Then I decided to ask dss_solve_real to solve for all rhs vectors at once, assuming this would give roughly a threefold speedup. So I linked the code against "libmkl_intel_thread.a" and used the following call:
dss_solve_real(DSS_handle, solOpt, rhs1, 3, x1);
To my surprise, the timing is very weird. The sequential version takes 0.548 sec, whereas solving for all rhs vectors at once takes 5.024 sec, which is almost 10 times slower.
I feel something is wrong here and that I may need to set some environment variables. Please let me know if you have had a similar experience.
Any help is appreciated.