Hi
We ran into inconsistent numerical results across some of our grid nodes, which I thought was worth sharing. After investigation we narrowed it down to two OS configurations that returned different results:
- Bare-metal Windows 2008
- Virtual Windows 2008 running in KVM on RHEL
Both machines have identical hardware (Xeon E7-4870), which supports SSE4.1 and SSE4.2. At the time we were using MKL v11.1.2.
We use MKL’s CNR mode to restrict MKL to SSE3 instructions, thus achieving numerical consistency across a range of hardware. We discovered that on the VM, the call to MKL_CBWR_Get_Auto_Branch was returning SSE3, and as a result our code skipped the call to ::MKL_CBWR_Set(SSE3). Despite that return value, calculations on that machine were actually using SSE4 instructions, and this turned out to be the source of the numerical differences we were seeing.
The only numerical differences we saw between SSE3 and SSE4 came from BLAS, although this may be coincidental.
Although this was easily fixed (by always calling ::MKL_CBWR_Set(SSE3) regardless of what MKL_CBWR_Get_Auto_Branch returns) it took a great deal of investigation to pinpoint the problem.
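For anyone who wants to apply the same workaround, here is a minimal C++ sketch of the "always set the branch" approach described above. It is an illustration, not the code from our grid: `pin_cnr_branch` is a hypothetical helper name, and the setter is passed in as a callable so the sketch compiles without MKL installed. With MKL, the callable would be `mkl_cbwr_set` and the constants `MKL_CBWR_SSE3` and `MKL_CBWR_SUCCESS` from `mkl.h`.

```cpp
#include <stdexcept>

// Hypothetical helper (not part of MKL): pins the CNR branch unconditionally.
//
// set_branch is assumed to behave like mkl_cbwr_set: it takes a branch
// constant and returns a status code. success is the status value that
// means "OK" (MKL_CBWR_SUCCESS when using MKL).
template <typename SetBranch>
void pin_cnr_branch(SetBranch set_branch, int branch, int success) {
    // Deliberately no query of the auto-detected branch here. Skipping the
    // set call because auto-detection already reported SSE3 is exactly the
    // shortcut that let the VM run SSE4 code paths.
    if (set_branch(branch) != success) {
        throw std::runtime_error("could not set the requested CNR branch");
    }
}

// Intended use with MKL (requires #include <mkl.h> and linking against MKL):
//
//     pin_cnr_branch(mkl_cbwr_set, MKL_CBWR_SSE3, MKL_CBWR_SUCCESS);
//
// The MKL documentation requires the CNR branch to be set before any other
// MKL function is called, so this belongs at the very start of the program.
```

Checking the returned status matters because a failed set call would otherwise leave the process on whatever branch MKL picks by itself, which is the situation this workaround is meant to avoid.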
I do not know whether this issue stems from KVM or from MKL itself, but I thought it was worth sharing.
Thanks,