Hi,
My question is about improving the performance of the following line:
------------------------
MKM = MD*FA1 - MATMUL(MATMUL(MATMUL(ME,MQ),TRANSPOSE(MG)),TRANSPOSE(ME)) + MATMUL(MATMUL(MATMUL(ME,MG),VA),VR)
------------------------
This line is executed for every element in a finite element implementation and, according to the performance wizard, it is the bottleneck.
All the matrices are at most 12x12. I have tried using DGEMM in the following way:
------------------------
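! BETA must be DOUBLE PRECISION (0.0D0); an integer 0 is not a valid argument here
! MDUMMY3 (12x3) = ME * MQ
CALL DGEMM('N', 'N', 12, 3, 12, 1.0D0, ME, 12, MQ, 12, 0.0D0, MDUMMY3, 12)
! MDUMMY4 (12x12) = MDUMMY3 * TRANSPOSE(MG)
CALL DGEMM('N', 'T', 12, 12, 3, 1.0D0, MDUMMY3, 12, MG, 12, 0.0D0, MDUMMY4, 12)
! MDUMMY5 (12x12) = MDUMMY4 * TRANSPOSE(ME)
CALL DGEMM('N', 'T', 12, 12, 12, 1.0D0, MDUMMY4, 12, ME, 12, 0.0D0, MDUMMY5, 12)
! MDUMMY6 (12x3) = ME * MG
CALL DGEMM('N', 'N', 12, 3, 12, 1.0D0, ME, 12, MG, 12, 0.0D0, MDUMMY6, 12)
! MDUMMY7 (12x1) = MDUMMY6 * VA
CALL DGEMM('N', 'N', 12, 1, 3, 1.0D0, MDUMMY6, 12, VA, 12, 0.0D0, MDUMMY7, 12)
! MDUMMY8 (12x12) = MDUMMY7 * VR
CALL DGEMM('N', 'N', 12, 12, 1, 1.0D0, MDUMMY7, 12, VR, 1, 0.0D0, MDUMMY8, 12)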
MKM = MD*FA1 - MDUMMY5 + MDUMMY8
------------------------
However, it did not provide any improvement (I think it was even slightly slower).
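In case it helps to reproduce this, here is a minimal self-contained sketch that checks the DGEMM chain against the MATMUL expression. The array shapes (ME 12x12, MQ 12x3, MG 12x3, VA 3x1, VR 1x12, MD 12x12) are assumptions inferred from the DGEMM arguments above. The random test data, the program name check_mkm and the temporaries T3..T8 are illustrative only. It has to be linked against a BLAS library such as MKL.

```fortran
program check_mkm
  implicit none
  ! Assumed shapes, inferred from the M, N, K arguments of the DGEMM calls
  double precision :: ME(12,12), MQ(12,3), MG(12,3), VA(3,1), VR(1,12)
  double precision :: MD(12,12), FA1
  ! Temporaries standing in for MDUMMY3..MDUMMY8
  double precision :: T3(12,3), T4(12,12), T5(12,12)
  double precision :: T6(12,3), T7(12,1), T8(12,12)
  double precision :: MKM_REF(12,12), MKM_BLAS(12,12)
  external :: DGEMM

  ! Arbitrary test data (not the real element matrices)
  call random_number(ME)
  call random_number(MQ)
  call random_number(MG)
  call random_number(VA)
  call random_number(VR)
  call random_number(MD)
  FA1 = 0.5D0

  ! Reference: the original MATMUL expression
  MKM_REF = MD*FA1 - MATMUL(MATMUL(MATMUL(ME,MQ),TRANSPOSE(MG)),TRANSPOSE(ME)) &
          + MATMUL(MATMUL(MATMUL(ME,MG),VA),VR)

  ! Same products via DGEMM; ALPHA and BETA are both DOUBLE PRECISION
  call DGEMM('N', 'N', 12, 3, 12, 1.0D0, ME, 12, MQ, 12, 0.0D0, T3, 12)
  call DGEMM('N', 'T', 12, 12, 3, 1.0D0, T3, 12, MG, 12, 0.0D0, T4, 12)
  call DGEMM('N', 'T', 12, 12, 12, 1.0D0, T4, 12, ME, 12, 0.0D0, T5, 12)
  call DGEMM('N', 'N', 12, 3, 12, 1.0D0, ME, 12, MG, 12, 0.0D0, T6, 12)
  ! LDB is 3 here because VA is declared as (3,1) in this sketch
  call DGEMM('N', 'N', 12, 1, 3, 1.0D0, T6, 12, VA, 3, 0.0D0, T7, 12)
  call DGEMM('N', 'N', 12, 12, 1, 1.0D0, T7, 12, VR, 1, 0.0D0, T8, 12)
  MKM_BLAS = MD*FA1 - T5 + T8

  ! The two results should agree up to rounding error
  print *, 'max abs difference:', maxval(abs(MKM_REF - MKM_BLAS))
end program check_mkm
```

Wrapping either version in a loop over many elements and timing it would give a like-for-like comparison outside the full finite element code.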
Is there an MKL function or setting that would help to speed up this line?
Thank you very much in advance,
Murat