Channel: Intel® Software - Intel® oneAPI Math Kernel Library & Intel® Math Kernel Library

Wrong results from cblas_sgemm

Here is the code:

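#include <cmath>
#include <cstddef>
#include <cstdio>
#include <mkl.h>

// Returns the largest absolute value in vec[0..sz).
float max_val(const float * vec, size_t sz)
{
    float result = 0.0f;
    for (size_t i = 0; i < sz; ++i)
    {
        // std::fabs avoids accidentally calling the integer abs(), which would truncate.
        float val = std::fabs(vec[i]);
        if (val > result)
            result = val;
    }

    return result;
}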

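int main()
{
    const int M = 64;
    const int N = 50176;
    const int K = 576;
    const float alpha = 1.0f;
    const float beta = 0.0f;

    // 32-byte-aligned buffers: A is M x K, B is K x N, C is M x N.
    float *A, *B, *C;
    A = (float *)mkl_malloc( (size_t)M*K*sizeof( float ), 32 );
    B = (float *)mkl_malloc( (size_t)K*N*sizeof( float ), 32 );
    C = (float *)mkl_malloc( (size_t)M*N*sizeof( float ), 32 );
    if (!A || !B || !C)
    {
        printf("allocation failed\n");
        return 1;
    }

    for (size_t i = 0; i < (size_t)M*K; ++i)
        A[i] = 1.0f;
    for (size_t i = 0; i < (size_t)K*N; ++i)
        B[i] = 2.0f;
    // C is deliberately nonzero; with beta = 0 it should be overwritten.
    for (size_t i = 0; i < (size_t)M*N; ++i)
        C[i] = 1.0f;

    // Row-major C = alpha*A*B + beta*C with lda = K, ldb = N, ldc = N.
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, M, N, K, alpha, A, K, B,
       N, beta, C, N);
    printf("%f\n", max_val(C,(size_t)M*N));
    mkl_free(A);
    mkl_free(B);
    mkl_free(C);

    return 0;
}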

Compile:

icc -I/opt/intel/compilers_and_libraries_2016.0.109/linux/mkl/include -L/opt/intel/compilers_and_libraries_2016.0.109/linux/mkl/lib/intel64_lin test.cpp -o test_cblas -lmkl_rt

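Every element of C should be 1152: each one is a sum of K = 576 products of 1.0 and 2.0, and since beta = 0 the initial contents of C should not contribute. However, when I run this, the program prints 1153.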
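To pin down the expected value independently of MKL, a naive triple-loop GEMM can serve as a reference. The following is only a sketch: `ref_sgemm_row_major` and `run_reference` are hypothetical helper names, not MKL functions, and the reference is meant to be run at a reduced N because the naive loop is slow at the full problem size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference helper (not part of MKL):
// C = alpha*A*B + beta*C for row-major, non-transposed
// A (M x K), B (K x N), and C (M x N).
void ref_sgemm_row_major(int M, int N, int K, float alpha,
                         const float* A, int lda,
                         const float* B, int ldb,
                         float beta, float* C, int ldc)
{
    // Row-major leading dimensions must cover a full row.
    assert(lda >= K && ldb >= N && ldc >= N);
    for (int i = 0; i < M; ++i) {
        for (int j = 0; j < N; ++j) {
            float acc = 0.0f;
            for (int k = 0; k < K; ++k)
                acc += A[(std::size_t)i * lda + k] * B[(std::size_t)k * ldb + j];
            float& c = C[(std::size_t)i * ldc + j];
            // BLAS convention: when beta is zero, the old C is not used.
            c = (beta == 0.0f) ? alpha * acc : alpha * acc + beta * c;
        }
    }
}

// Same fill values as the post (A = 1, B = 2, C = 1), alpha = 1, beta = 0.
std::vector<float> run_reference(int M, int N, int K)
{
    std::vector<float> A((std::size_t)M * K, 1.0f);
    std::vector<float> B((std::size_t)K * N, 2.0f);
    std::vector<float> C((std::size_t)M * N, 1.0f);
    ref_sgemm_row_major(M, N, K, 1.0f, A.data(), K, B.data(), N,
                        0.0f, C.data(), N);
    return C;
}
```

With K = 576, every element returned by `run_reference` is exactly 1152, since sums of small integers are exact in single precision. Any other value in the MKL output therefore cannot be explained by rounding.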

Looking closer, most of the values in C are 1152, but several contiguous chunks are 1153 or, more generally, 1152 + (initial value of C). In those chunks the old contents of C appear to be added to the result instead of being overwritten.
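One way to locate those chunks is to scan C for runs of consecutive elements that differ from the expected value. This is a minimal sketch; `find_bad_runs` is a hypothetical helper name introduced here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns half-open index ranges [first, last) of
// consecutive elements of c that are not equal to `expected`.
std::vector<std::pair<std::size_t, std::size_t>>
find_bad_runs(const float* c, std::size_t n, float expected)
{
    assert(c != nullptr || n == 0);
    std::vector<std::pair<std::size_t, std::size_t>> runs;
    std::size_t i = 0;
    while (i < n) {
        if (c[i] == expected) {
            ++i;
            continue;
        }
        // Start of a bad run: extend it while elements keep mismatching.
        std::size_t start = i;
        while (i < n && c[i] != expected)
            ++i;
        runs.emplace_back(start, i);
    }
    return runs;
}
```

Calling `find_bad_runs(C, (size_t)M*N, 1152.0f)` after the `cblas_sgemm` call and printing `first / N` and `first % N` for each run gives the row and column where each chunk begins in the row-major layout, which may help show whether the chunks line up with some internal block or thread boundary.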

If I instead use CblasColMajor (and change the leading dimensions accordingly), every element of C is 1152 as expected.
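The exact column-major call is not shown above, so the following is an assumption about what "accordingly" means: for densely packed, non-transposed matrices, the minimum leading dimensions for each layout can be sketched as below. `Layout`, `LeadingDims`, and `packed_leading_dims` are hypothetical names standing in for the CBLAS layout argument and the `lda`, `ldb`, `ldc` parameters.

```cpp
#include <cassert>

// Hypothetical stand-in for the CBLAS layout argument.
enum class Layout { RowMajor, ColMajor };

struct LeadingDims {
    int lda;
    int ldb;
    int ldc;
};

// Minimum leading dimensions for packed, non-transposed
// A (M x K), B (K x N), and C (M x N).
LeadingDims packed_leading_dims(Layout layout, int M, int N, int K)
{
    assert(M > 0 && N > 0 && K > 0);
    if (layout == Layout::RowMajor)
        return {K, N, N};   // leading dimension = number of columns
    return {M, K, M};       // leading dimension = number of rows
}
```

Under that assumption the column-major variant would be `cblas_sgemm(CblasColMajor, CblasNoTrans, CblasNoTrans, M, N, K, alpha, A, M, B, K, beta, C, M)`. Because A, B, and C are filled with constants here, the expected result is 1152 everywhere in either layout.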

What is going on?
