Channel: Intel® Software - Intel® oneAPI Math Kernel Library & Intel® Math Kernel Library

Random Number Generator


Hello,

 

I am trying to generate a vector of random numbers using the following code. However, every time I execute the code, the output vector is the same as in the previous run! Can you please help? Thank you in advance. - Afshin

 

    const int N = 30;
    double r[N];
    VSLStreamStatePtr stream;
    int i, errcode;
    double a = 0, sigma = 0.5;

    /***** Initialize *****/
    errcode = vslNewStream( &stream, VSL_BRNG_MT2203, 10000 );
    printf("err = %i\n", errcode);
    /***** Call RNG *****/
    errcode = vdRngGaussian( VSL_RNG_METHOD_GAUSSIAN_BOXMULLER, stream, N, r, a, sigma );
    printf("err = %i\n", errcode);
    vslDeleteStream(&stream);

    for (i = 0; i < N; i++) {
        printf("r = %f\n", r[i]);
    }

 


spffrt2 issue


I am trying to use mkl_dspffrt2, but it does not give correct results. I used packed storage as recommended. The routine does not work for n = 1. Let's say the matrix is A = [10]. Has anyone used this routine?

Thanks,

How to configure libs when trying to use Inspector-executor Sparse BLAS Routines and Pardiso


Here is my environment: Visual Studio Community 2017 + Parallel Studio XE 2019 Update 1

I've tried the Link Line Advisor, but it failed with 'No symbolic file loaded for mkl_avx2.dll'

And the current libs linked in are:

mkl_core.lib

mkl_intel_thread.lib

mkl_intel_ilp64.lib

mkl_blas95_ilp64.lib

impi.lib

Besides, I've tried adding libiomp5md.lib, but it didn't solve the problem.

Finding inverse of a binary matrix by using LAPACKE_dgetrf and LAPACKE_dgetri


I'm trying to find the inverse of the following binary matrix using the following functions:

lapack_int LAPACKE_dgetrf (int matrix_layout , lapack_int m , lapack_int n , double * a , lapack_int lda , lapack_int * ipiv );

lapack_int LAPACKE_dgetri (int matrix_layout , lapack_int n , double * a , lapack_int lda , const lapack_int * ipiv );

But the expected output is not achieved (the expected output is given below; it was found with MATLAB's inv(D)).

My_code.c

#define N 84

int main()
{

int i,j,m=N,n=N,lda=84,ipiv[N];

double D[N*N]={            //84 x 84 (Input matrix)
                    0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    1,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,1,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,1,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,
                    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,
                    1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,
                    0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,
                    0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,
                    0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,
                    0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,
                    0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,
                    0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,
                    1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,
                    0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,
                    0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1};

printf("LU info:%d\n",LAPACKE_dgetrf(LAPACK_ROW_MAJOR,m,n,D,lda,ipiv));

printf("Inverse Info:%d\n",LAPACKE_dgetri (LAPACK_ROW_MAJOR,N,D,lda,ipiv));

return 0;
}

 

Output: I'm getting the expected inverse of matrix D, except for column 16 (the entire column is 0).

I've verified the result of LAPACKE_dgetrf; up to that point everything is correct.

Could anyone explain what is happening here and how to solve this problem?

The expected 16th column is:

0
1.85037170770859e-17
-1.85037170770859e-17
0
1.85037170770859e-17
0
0
0
0
1.85037170770859e-17
0
1.85037170770859e-17
0
1.85037170770859e-17
7.40148683083438e-17
1
-1.85037170770859e-17
0
0
0
0
0
0
0
0
1.85037170770859e-17
0
0
3.70074341541719e-17
0
1.85037170770859e-17
-1.85037170770859e-17
1.85037170770859e-17
-1.85037170770859e-17
0
0
0
0
0
-1.85037170770859e-17
0
1.85037170770859e-17
0
0
0
0
0
0
0
0
0
0
1.85037170770859e-17
-7.40148683083438e-17
0
0
0
0
0
0
0
0
0
0
3.70074341541719e-17
0
3.70074341541719e-17
0
0
0
0
-3.70074341541719e-17
0
0
-3.70074341541719e-17
-7.40148683083438e-17
1.85037170770859e-17
0
-1.85037170770859e-17
0
0
0
0
1.85037170770859e-17

Different computation results on different processors using mkl_cbwr_set(int)

  1. There are two desktop PCs with the following processors: Intel Core i5 4570 (Haswell) and Intel Core i5 3330 (Ivy Bridge).
  2. MKL 2019.0.1, build 20180928, is used.
  3. The app uses PARDISO with the following settings: MKL_INT _mtype = 11; MKL_INT _nrhs = 1; MKL_INT _iparm1 = 2; MKL_INT _iparm3 = 0; MKL_INT _iparm4 = 2; MKL_INT _iparm7 = -1; MKL_INT _iparm9 = 13; MKL_INT _iparm10 = 1; MKL_INT _iparm12 = 1; MKL_INT _iparm20 = 2; MKL_INT _iparm23 = 0; MKL_INT _iparm33 = 1; MKL_INT _iparm34 = 1; mkl_domain_set_num_threads(1, MKL_DOMAIN_PARDISO);
  4. mkl_cbwr_set(MKL_CBWR_SSE2) is called, followed by PARDISO.
  5. The native library was built using Microsoft Visual Studio 2017 (v141).
  6. mkl_intel_lp64.lib, mkl_intel_thread.lib, and mkl_core.lib are used as additional dependencies.
  7. The app runs on both PCs.
  8. The calculation results are different. See attached files.

Please suggest a solution to my problem.

 

Attachment: SSE2_Results.zip (application/zip, 454.63 KB)

Pardiso Optimization


Hello,

I am currently solving a relatively small system of equations (3,000-10,000 equations), but since I have to solve it millions of times, each time with slightly different system matrices and right-hand sides, it still takes a lot of time overall. So now I am trying to shave a few more ms off. My system matrix is real, non-symmetric, and contains about five nonzero entries per row. So now I am wondering how to optimize the problem further. I already did the following: call phase=13 only for the first solve, then switch to phase=23 and reuse pt. This saved about 30% of the calculation time.

Iparm(24)=10 also saved some time, but unfortunately only in Debug mode.
Iparm(2)=0 shaved a few ms off as well.

I also have a few questions: is structurally symmetric the same as symmetric? I thought structurally symmetric meant that the pattern of nonzero entries is symmetric, which should be the case for my matrix, but using mtype=1 resulted in an access violation in PARDISO.

My system matrix does not change very much from iteration to iteration. About half of the equations stay the same, and I can predict quite well which ones. Is there any way to exploit that to speed up the solution?

Also, are there other solvers that might be better suited to my problem?

The function is below; an example matrix in CSR format can be found here: http://s000.tinyupload.com/?file_id=00837828014833851584
Any help is much appreciated! Even the smallest gain in execution speed will help me a great deal.

 

  

 SUBROUTINE QOED_SOLVE_PARDISO_S(A,ia,ja,pt,schleife, ZUST, GIT, SYS)
    USE MODULE_QOED_SUB
    
    IMPLICIT NONE
    TYPE(GITTER(GIT%NX,GIT%NY))                                     ,INTENT(IN)             :: GIT
    INTEGER             , DIMENSION(GIT%NX*GIT%NY+1)                ,INTENT(IN)             :: ia
    INTEGER             , DIMENSION(GIT%NX*(GIT%NY-2)*5+2*GIT%NX)   ,INTENT(IN)             :: ja
    REAL(KIND=8)        , DIMENSION(GIT%NX*(GIT%NY-2)*5+2*GIT%NX)   ,INTENT(IN)             :: A    
    TYPE(ZUSTAND(GIT%NX,GIT%NY))                                    ,INTENT(INOUT)          :: ZUST
    TYPE(SYSTEMMATRIX(GIT%NX,GIT%NY))                               ,INTENT(IN)             :: SYS
    INTEGER                                                         ,INTENT(IN)             :: schleife
    INTEGER                                                                                 :: maxfct,mnum
    INTEGER                                                                                 :: mtype
    INTEGER                                                                                 :: phase
    INTEGER                                                                                 :: n
    INTEGER                                                                                 :: nrhs
    INTEGER                                                                                 :: error
    INTEGER(KIND=8)  , DIMENSION(64)                            , INTENT(INOUT)             :: pt  
    INTEGER(KIND=8)  , DIMENSION(64)                                                        :: pt_temp!Internal pardiso data storage, initialized to zero, needs to be saved for following pardiso calls or memory leaks can occur
    INTEGER                                                                                 :: msglvl !PARDISO output, 1 -> generate statistical output
    INTEGER          , DIMENSION(64)                                                        :: IPARM
    INTEGER          , DIMENSION(GIT%NX*GIT%NY)                                             :: PERM
 

    MSGLVL=2
    PERM=0
    phase=23
    mtype=11
    error=0
    nrhs=1
    maxfct=1
    mnum=1

    n=GIT%NX*GIT%NY
    iparm=0_8
    
    CALL PARDISOINIT(pt_temp,mtype,iparm)
    if (schleife == 1) then
        pt = pt_temp
        phase = 13
    endif
    
    CALL PARDISO(pt, maxfct, mnum, mtype, phase, n, a, ia, ja, perm, nrhs, iparm, msglvl, SYS%RHS, ZUST%P0, error)

    END SUBROUTINE

The picture shows the first call; subsequent calls need significantly less time for malloc.


LAPACKE_dsyevr doesn't finish but LAPACKE_dsyev works well


I have a symmetric matrix with some parameters that change. For some parameter values, LAPACKE_dsyevr does not return for a very long time, so I have to kill the program.

If I switch to LAPACKE_dsyev, the speed is slower than dsyevr in most cases, but dsyev returns correctly.

What is the reason?

Thanks for your help. 

Signature of v?Fdim, v?Fmax, ...


Hello,

why is the second input array of the functions v?Fdim, v?Fmax, v?Fmin, etc. not const?

The signature in the file mkl_vml_functions.h is:

_Mkl_Api(void, vdFdim, (const MKL_INT n, const double a[], double r1[], double r2[]))
_Mkl_Api(void, vdFmax, (const MKL_INT n, const double a[], double r1[], double r2[]))
_Mkl_Api(void, vdFmin, (const MKL_INT n, const double a[], double r1[], double r2[]))

I think it would be correct as:

_Mkl_Api(void, vdFdim, (const MKL_INT n, const double a[], const double b[], double r[]))
_Mkl_Api(void, vdFmax, (const MKL_INT n, const double a[], const double b[], double r[]))
_Mkl_Api(void, vdFmin, (const MKL_INT n, const double a[], const double b[], double r[]))

Kind regards


Question about the allocate procedure


Hi!

I'm wondering how it can be that an allocate statement like

if(.not. allocated(a)) then

allocate(a(size),stat=stat,errmsg=errmsg)

else if (size(a) /= size) then

deallocate(a,stat=stat,errmsg=errmsg)

allocate(a(size),stat=stat,errmsg=errmsg)

end if

gives a stat of 0 while errmsg is 'Allocatable array or pointer is not allocated'?

CPardiso phase 33 scaling


Hi,

We want to use Cluster Pardiso for our finite element application. To get an estimate of performance, we used a simple code (attached) to read a matrix from the SuiteSparse Matrix Collection (Matrix Market format) and then measured execution time for each phase (11, 22 and 33).

The factorisation phase (22) shows good scale-up with MPI and OpenMP parallelization, but the solve phase (33) performance is not nearly as good.

For example, the table below shows running times (in seconds) for different combinations of MPI processes and OMP threads (per process).

Serena.mtx:

MPI / OMP    Phase=22    Phase=33

2 / 2        408.70      2.4668
2 / 4        249.52      1.3382
2 / 8        234.87      3.7524
2 / 16       93.879      1.3181
4 / 2        327.69      1.8661
4 / 4        162.16      1.9664
4 / 8        96.526      4.4899
4 / 16       58.619      1.3763
8 / 2        175.61      1.1638
8 / 4        90.975      1.1006
8 / 8        67.704      2.4264
8 / 16       39.654      0.9049
16 / 2       127.61      1.4321
16 / 4       62.155      0.9136
16 / 8       53.761      2.0407
16 / 16      26.957      0.7122
32 / 8       36.447      2.1856
32 / 16      24.977      0.3729

 

We can observe that the solve time does not always decrease with more MPI processes or OpenMP threads. We tested other matrices (RM07R), but the same behaviour was observed. Is this normal, or is it an issue? Is there a way to get better scaling?

 

Thanks a lot for any advice

Guillaume

Attachment: CPMM.tar.gz (application/x-gzip, 7.73 KB)

Block sparse matrix with sparse matrix


I have a sparse matrix M and a sparse matrix K; both have the same non-zero locations. How do I form a sparse matrix with the following format:

A = [ M,  0.2K;  0.1M, 0.1K ]

This is used to solve the linear system Ax = b.

Thanks a lot.

 

Threading Problem in Pardiso


Hello,

We have run into a problem using PARDISO with multiple threads. Our goal is to solve AX = B, where A is a non-symmetric sparse matrix and X & B are vectors. What we do is initialize PARDISO, do the LU decomposition with phase 22, then call "pardiso()" with phase 33 to solve.

The phenomenon is: if we set up MKL with "mkl_set_dynamic( true )" and do nothing else, the solver always uses 1 thread only, no matter how big the matrix is, from 1K unknowns to 1M unknowns. If we set "mkl_set_dynamic( true )" and "mkl_set_num_threads(4)", we can see 4 threads being used (CPU: i7-3770, i7-5820K, and dual Xeon X5460), but the solve slows down, taking about twice as long.

We also use MKL BLAS functions in other parts of our code; there "mkl_set_dynamic( true )" works well, and we can observe multiple threads in use and a linear speedup.

So, our question is: how do we enable PARDISO with multiple threads and get a reasonable speedup?

I'd appreciate any input and ideas,

mq

 

 

 

undefined symbol: mkl_serv_check_ptr_and_warn when calling function from shared library using Python


I'm calling a C function from within my Python code using the ctypes Python library. The C function calls an MKL function. However, this gives me an error:

symbol lookup error: /cm/shared/apps/intel/composer_xe/2015.5.223/mkl/lib/intel64/libmkl_avx2.so: undefined symbol: mkl_serv_check_ptr_and_warn

Below is my setup, the code and the compiler command and flags used.

My setup:
intel/compiler/64/15.0/2015.5.223
intel/mkl/64/11.2/2015.5.223

LD_PRELOAD=/cm/shared/apps/intel/composer_xe/2015.5.223/mkl/lib/intel64/libmkl_core.so

The LD_LIBRARY_PATH contains, among other things, the following: LD_LIBRARY_PATH=/cm/shared/apps/intel/composer_xe/2015.5.223/mkl/lib/intel64:/cm/shared/apps/intel/composer_xe/2015.5.223/mpirt/lib/intel64:/cm/shared/apps/intel/composer_xe/2015.5.223/compiler/lib/intel64:

My MM.c file:

#include "mkl.h"
void matrix_mult(double *A, double *B, double *C, int N, int M, int P) {
   mkl_set_num_threads(64);
   cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                   N, P, M, 1.0, A, M, B, P, 0.0, C, P);
}

Then I compile this file to a shared library using the following command:

icc -shared -fPIC -mkl MM.c -o MM.so

I had to change the LD_PRELOAD env variable so it contains the following: LD_PRELOAD=/cm/shared/apps/intel/composer_xe/2015.5.223/mkl/lib/intel64/libmkl_core.so

My Python file, test.py:
from ctypes import *
mkl = cdll.LoadLibrary("./MM.so")
dgemm = mkl.matrix_mult
N = 22

double_array = c_double*(N*N)
A = double_array(*[1]*N*N)
B = double_array(*[1]*N*N)
C = double_array(*[0]*N*N)
dgemm(byref(A), byref(B), byref(C), c_int(N), c_int(N), c_int(N))
 

Up to and including N=21 this works fine. However, when N is 22 or bigger I get the following error:

I run my code: python test.py

python: symbol lookup error: /cm/shared/apps/intel/composer_xe/2015.5.223/mkl/lib/intel64/libmkl_avx2.so: undefined symbol: mkl_serv_check_ptr_and_warn

The following thread [0] suggests that I might have multiple MKL versions on my machine. AFAIK this is not the case.

Has any of you had this problem too? Or does somebody know how to fix it?

[0] https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/...

Thanks a lot!


Help on running GMRES with MKL in C++


Hi,

I am new to the MKL library and tried to use it to solve a sparse linear system with its GMRES functions.

I tried the steps given in the examples that come with the MKL library (“fgmres_st_criterion_c.c”). I was able to compile and run the example with ICC v19.0 (I am using Windows 10 64-bit with Intel Parallel Studio 2019; I have a student license). The problem is that when I ported the code into my C++ program, nothing seems to work.

 

My code is written in C++ and I am using template classes; I am not sure whether this is possible or not. I have included my code up to the point where the GMRES solver is initialized. The initialization fails, and the error code is not listed in the documentation (RCI_request = -3689348814741910324). The “ipar” variable has wrong values.

 

Additionally, I notice something strange: if I set RCI_request = 0 (which should not be needed) before calling dfgmres_init, then dfgmres_init simply returns RCI_request = 0. But the ipar values are still wrong. I notice the same behavior with the Conjugate Gradient method.

I have 16k elements in my problem (n = 16000), but it will be much larger when the mesh is refined.

 

Thank you all in advance

Dilshan

 

template<class T>
void SPMV<T>::call_gmres_solver_mkl(const vector<T>& coefNeigbor, const vector<T>& RHS,
        vector<T>& phi, vector<T>& phiOld) {
    if (sizeof(T) == sizeof(float)) {
        cout << "currently this program is written for double floating point numbers only.\nplease update the source code and recompile" << endl;
        cout << "Program will terminate now" << endl;
        cin.get();
        exit(0);
    }

    MKL_INT n = nFl; // number of unknowns
    MKL_INT nEle = (nFl + (nFl*nface) - (nTtl - nFl));
    if (nEle <= 0) {
        cout << "The number of interfaces must be greater than zero. current value = " << n << endl;
        cout << "Program will exit now" << endl;
        cin.get();
        exit(0);
    }

    MKL_INT RCI_request, itercount, expected_itercount = 8;
    double *A = (double*)malloc(nEle * sizeof(double));
    double *residual = (double*)malloc(n * sizeof(double));

    int id = 0;
    ia[0] = 0;

    int n0 = n * (2 * n + 1) + (n * (n + 9)) / 2 + 1;
    double *tmp = (double *)malloc(n0 * sizeof(double));
    double *rhs = (double *)malloc(n * sizeof(double));
    double *solution = (double *)malloc(n * sizeof(double));
    MKL_INT ipar[128];
    double dpar[128];

    // At first run phiOld = 0
    for (int i = 0; i < n; i++) {
        solution[i] = phiOld[i];
        rhs[i] = RHS[i];
    }

    dfgmres_init(&n, solution, rhs, &RCI_request, ipar, dpar, tmp);
    if (RCI_request != 0) {
        cout << "Some error occurred during the gmres initialisation phase. Error code = " << RCI_request << endl;
        goto FAILED;
    }

    // rest of the code
}

Batch normalization example


Hello,

This is my first program using the MKL library, and I want to include a simple batch normalization call after convolution. This is my code for BN.
Does anyone have an idea what I did wrong and why the execution of my code is failing?

 

/*** BN1 section ***/

CHECK_ERR(dnnBatchNormalizationCreateForward_F64(&bn1, attributes, lt_conv1_output, 0), err);

resBn1[dnnResourceSrc] = resConv1[dnnResourceDst];

CHECK_ERR(dnnLayoutCreateFromPrimitive_F64(&lt_bn1_output, bn1, dnnResourceDst), err);

CHECK_ERR(dnnAllocateBuffer_F64((void **)&resBn1[dnnResourceDst], lt_bn1_output), err);

CHECK_ERR(init_conversion(&cv_relu1_to_user_output, &user_o, lt_user_output, lt_bn1_output, resBn1[dnnResourceDst]), err);

...............

CHECK_ERR(dnnExecute_F64(bn1, (void *)resBn1), err);

 


Problem with random number generator?


This isn't a big problem, but it might trip someone up if they are not aware of it.

When I call RANDOM_NUMBER(XX), where XX is real(8), the first value is always a very small number, typically under 1.D-4. So the first value obtained is not really a random number.

Of course one can use RANDOM_SEED to get around this, but I thought it was supposed to use the time and date by default.

At any rate, the first number should be as random as all the following ones, right?

Has this been addressed before?

MKL and CMAKE


Does a universal CMake script for finding and linking MKL exist?

For example, I want to create a solution which uses CMake and MKL:

On the first machine I want to compile my application with Intel C++ Compiler 19.0 on Windows; on the second machine I want to compile it with gcc and link MKL manually.

I found some examples of a FindMKL.cmake file on the internet, but they didn't work.

Please, help!

Thank you
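Until a project-specific FindMKL.cmake is available, one portable option (a sketch, not a complete solution) is to let CMake's stock FindBLAS module locate MKL through the BLA_VENDOR hint, which works with both the Intel compiler and gcc; the exact vendor strings depend on the CMake version:

```cmake
cmake_minimum_required(VERSION 3.10)
project(mkl_demo C)

# Hint FindBLAS at MKL's LP64, threaded layer; Intel10_64lp_seq selects
# the sequential layer instead.  (Vendor names vary by CMake version.)
set(BLA_VENDOR Intel10_64lp)
find_package(BLAS REQUIRED)

add_executable(demo main.c)
target_link_libraries(demo ${BLAS_LIBRARIES})
```

With this approach the same CMakeLists works on both machines, as long as MKL's environment scripts (or MKLROOT) are set so that CMake can find the libraries.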

Why is the VM library so slow on new models of CPUs?


Hi, I have run into a strange problem: a calculation using a plain for loop is more efficient than using the VM library of MKL. I tested several examples, and they all show the same kind of result.

For example, working with an Intel(R) Xeon(R) Gold 6148 CPU and Intel Parallel Studio 2018u5, running 'test' gives:

./test
Time for normal distribution
    serial:    5s
    vector (HA):    12s
    vector (LA):    11s
    vector (EP):    12s

The compile and link flags are:

-O3  -xHost -DMKL_ILP64 -I${MKLROOT}/include

And for link:

-Wl,--start-group ${MKLROOT}/lib/intel64/libmkl_intel_ilp64.a ${MKLROOT}/lib/intel64/libmkl_sequential.a ${MKLROOT}/lib/intel64/libmkl_core.a -Wl,--end-group -lpthread -lm -ldl

I also used VTune to check the computation. It shows that the time consumed by vdExp is almost equal to that of the serial for loop. The source code and Makefile are in the attached tar package. The OS is CentOS 7.3.

I don't understand why the VM library runs so slowly. Is there anything wrong with the flags or the code?

Attachment: forintelforum.tar (application/x-tar), 7.5 KB

How to implement numpy broadcast mechanism with mkl?


How to implement numpy broadcast mechanism with mkl?

I have been confused about how to use MKL to efficiently implement the broadcast mechanism in numpy (element-wise operators "+", "-", "*").
such as
2-D array sub 1-D array
[[1,2,3],
[4,5,6],
[7,8,9]]
-
[1, 2, 3]
=
[[0, 0, 0],
[3, 3, 3],
[6, 6, 6]]

And the second operation (which can be understood as a matrix multiplied by a diagonal matrix):
2-D array multiplied element-wise by a 1-D array
[[1,2,3],
[4,5,6],
[7,8,9]]
*
[1, 2, 3]
=
[[1, 4, 9],
[4, 10, 18],
[7, 16, 27]]

I tried to implement it with a for loop plus cblas_dscal/vdSub, but I think this is not efficient. I don't know if there is a better implementation.

Viewing all 3005 articles
Browse latest View live