Channel: Intel® Software - Intel® oneAPI Math Kernel Library & Intel® Math Kernel Library

How to compile and run Quantum Espresso with intel MKL and Open MPI on Mac OS X


I want to run Quantum Espresso faster on a Mac Pro (multi-core). I think using Intel MKL (+ FFTW3) and Open MPI with the Intel compilers is the best solution. It was a process of trial and error, but I finally managed to compile pw.x. However, when I run pw.x, for example

~/q-e/bin/pw.x < graphene.scf.in > graphene.scf.out

I got an error:

forrtl: severe (174): SIGSEGV, segmentation fault occurred 
Image PC Routine Line Source
pw.x 000000011021EB14 Unknown Unknown Unknown
libsystem_platfor 00007FFF560EFF5A Unknown Unknown Unknown
pw.x 000000010FE3D7CB Unknown Unknown Unknown
....

I assume that something in the compilation is wrong. My machine environment and build steps are shown below.

  • Mac Pro : 2.7 GHz 12-Core Intel Xeon E5 : 64 GB 1866 MHz DDR3 : High Sierra 10.13.6
  • Xcode : 9.4.1
  • Intel compiler(fortran) : compilers_and_libraries_2019.1.144
  • Intel compiler(C++,MKL) : compilers_and_libraries_2019.2.184
  • Open MPI : 3.0.3

 I also ran a "modify install" to add "Cluster support" and "Intel MKL core libraries for Fortran" to compilers_and_libraries_2019.2.184.

  1. path settings
    1. source /opt/intel/compilers_and_libraries_2019.1.144/mac/bin/compilervars.sh intel64
      source /opt/intel/compilers_and_libraries_2019.2.184/mac/bin/compilervars.sh intel64
    2. source /opt/intel/compilers_and_libraries_2019/mac/mkl/bin/mklvars.sh intel64
      
  2. compile FFTW3 with intel compiler

    1. cd ${MKLROOT}/interfaces/fftw3xf
      sudo make libintel64 compiler=intel
      
    2. cp libfftw3xf_intel.a ${MKLROOT}/lib/
      
  3. compile Open MPI with intel compiler
    1. ./configure --prefix=/opt/openmpi CC=icc CXX=icpc F77=ifort FC=ifort
      sudo make
      sudo make install
      
    2. echo 'export PATH=/opt/openmpi/bin:$PATH'>> .bashrc
      
  4. make settings for using mkl on openmpi
    1. cd ${MKLROOT}/interfaces/mklmpi
      sudo make libintel64 interface=ilp64 MPICC='mpicc' INSTALL_LIBNAME='libmkl_blacs_openmpi_ilp64'
    2. cp obj_ilp64/libmkl_blacs_openmpi_ilp64.a ${MKLROOT}/lib/
      
  5. make QE (only PW)
    1. ./configure --enable-parallel --with-scalapack=openmpi --disable-openmp F90=ifort
        MPIF90=mpif90 CC=mpicc CXX=icc F77=ifort
        LAPACK_LIBS="${MKLROOT}/lib/libmkl_intel_ilp64.a ${MKLROOT}/lib/libmkl_sequential.a
        ${MKLROOT}/lib/libmkl_core.a -lpthread -lm -ldl"
        BLAS_LIBS="${MKLROOT}/lib/libmkl_intel_ilp64.a ${MKLROOT}/lib/libmkl_sequential.a
        ${MKLROOT}/lib/libmkl_core.a -lpthread -lm -ldl"
        FFT_LIBS="${MKLROOT}/lib/libfftw3xf_intel.a -lmkl_intel_ilp64 -lmkl_sequential
        -lmkl_core -lpthread"
        MPI_LIBS="-L/opt/openmpi/lib"
        DFLAGS="-D__INTEL -D__FFTW3 -D__MPI -D__SCALAPACK -D__PARA"
        IFLAGS="-I/Users/username/q-e/include -I/Users/username/q-e/FoX/finclude
        -I/Users/username/q-e/S3DE/iotk/include/
        -I${MKLROOT}/include -I${MKLROOT}/include/fftw"
        SCALAPACK_LIBS="${MKLROOT}/lib/libmkl_scalapack_ilp64.a
        ${MKLROOT}/lib/libmkl_cdft_core.a ${MKLROOT}/lib/libmkl_intel_ilp64.a
        ${MKLROOT}/lib/libmkl_sequential.a ${MKLROOT}/lib/libmkl_core.a
        ${MKLROOT}/lib/libmkl_blacs_openmpi_ilp64.a -lpthread -lm -ldl"
        LDFLAGS="-static-intel"
    2. make -j8 pw

* At step 4.1, the following warnings appeared:

mpicc -c -Wall -fPIC    -DMKL_ILP64 -I../../include mklmpi-impl.c -o obj_ilp64/mklmpi-impl.o
mklmpi-impl.c(87): warning #1786: variable "ompi_mpi_ub" (declared at line 928 of "/opt/openmpi/include/mpi.h") was declared deprecated ("MPI_UB is deprecated in MPI-2.0")
      RETURN_IF(xdatatype,MPI_UB);
      ^

mklmpi-impl.c(338): warning #1786: function "MPI_Address" (declared at line 1201 of "/opt/openmpi/include/mpi.h") was declared deprecated ("MPI_Address is superseded by MPI_Get_address in MPI-2.0")
  	int err = MPI_Address(location, address);
  	          ^

mklmpi-impl.c(397): warning #1786: function "MPI_Attr_get" (declared at line 1241 of "/opt/openmpi/include/mpi.h") was declared deprecated ("MPI_Attr_get is superseded by MPI_Comm_get_attr in MPI-2.0")
      int res = MPI_Attr_get(X2COMM(comm),X2KEYVAL(keyval),attribute_val,flag);
                ^

Sorry for writing such a long message. How should I modify the above steps to get Quantum Espresso running?

FYI: I read the topic "How to correctly compile Quantum Espresso with intel MKL especially intel FFTW" before posting my question in this forum.


MKL 5 times slower when called from a MEX function in Octave


Hi,

I'm trying to use the Intel MKL library in an Octave MEX function, but the performance I achieve with some MKL functions such as cblas_cgemm is 5 times slower when called from Octave than from a compiled C executable. I'm using the same compilation flags for the C code and the MEX function in my testing, where I basically compare the speed of a very simple C matrix multiplication script and the same script wrapped in a MEX function (find this short example attached).

This is how I compile the C code:

gcc -I/opt/intel/compilers_and_libraries_2019.0.117/linux/mkl/include -Wall -L/opt/intel/compilers_and_libraries_2019.0.117/linux/mkl/lib/intel64 -o cgemm_test_c matmult_c.c -lmkl_gnu_thread -lmkl_rt -lmkl_core -lmkl_intel_ilp64 -lgomp -lpthread -lm -ldl

This is how I compile the MEX function:

mex -v -I/opt/intel/compilers_and_libraries_2019.0.117/linux/mkl/include -Wall -L/opt/intel/compilers_and_libraries_2019.0.117/linux/mkl/lib/intel64 -o cgemm_test_mex matmult_c.c matmult_mex.c -lmkl_gnu_thread -lmkl_rt -lmkl_core -lmkl_intel_ilp64 -lgomp -lpthread -lm -ldl

And this is what the mex command is really doing:

gcc -c -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/octave-4.2.2/octave/.. -I/usr/include/octave-4.2.2/octave -I/usr/include/hdf5/serial  -pthread -fopenmp -g -O2 -fdebug-prefix-map=/build/octave-DtqyIg/octave-4.2.2=. -fstack-protector-strong -Wformat -Werror=format-security  -Wall  -I. -I/opt/intel/compilers_and_libraries_2019.0.117/linux/mkl/include  -DMEX_DEBUG matmult_c.c -o matmult_c.o

gcc -c -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/octave-4.2.2/octave/.. -I/usr/include/octave-4.2.2/octave -I/usr/include/hdf5/serial  -pthread -fopenmp -g -O2 -fdebug-prefix-map=/build/octave-DtqyIg/octave-4.2.2=. -fstack-protector-strong -Wformat -Werror=format-security  -Wall  -I. -I/opt/intel/compilers_and_libraries_2019.0.117/linux/mkl/include  -DMEX_DEBUG matmult_mex.c -o matmult_mex.o

g++ -I/usr/include/octave-4.2.2/octave/.. -I/usr/include/octave-4.2.2/octave -I/usr/include/hdf5/serial -I/usr/include/mpi  -pthread -fopenmp -g -O2 -fdebug-prefix-map=/build/octave-DtqyIg/octave-4.2.2=. -fstack-protector-strong -Wformat -Werror=format-security -shared -Wl,-Bsymbolic  -Wall -o cgemm_test_mex.mex  matmult_c.o matmult_mex.o   -L/opt/intel/compilers_and_libraries_2019.0.117/linux/mkl/lib/intel64 -lmkl_gnu_thread -lmkl_rt -lmkl_core -lmkl_intel_ilp64 -lgomp -lpthread -lm -ldl -L/usr/lib/x86_64-linux-gnu/octave/4.2.2 -L/usr/lib/x86_64-linux-gnu -loctinterp -loctave -Wl,-Bsymbolic-functions -Wl,-z,relro

Test results:

C code: Elapsed time per multiplication: ~1.86 ms

MEX code: Elapsed time per multiplication: ~8.55 ms

 

I have tested different optimisation flags, but the results are virtually the same. This has been tested on two Intel machines with Ubuntu 18.04 and Ubuntu 14.04, yielding very similar results in all cases. MKL environment variables are set via "source /opt/intel/mkl/bin/mklvars.sh intel64".
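
For reference, this is roughly what the timed kernel looks like; a minimal sketch written for this post (the matrix size, repetition count and data are made up, and it is not the attached matmult_c.c):

/* Hypothetical reconstruction of the benchmark kernel (not the attached file). */
#include <stdio.h>
#include "mkl.h"

int main(void)
{
    const MKL_INT n = 512;          /* assumed matrix size */
    const int reps = 100;           /* assumed repetition count */
    MKL_Complex8 *a = mkl_malloc(sizeof(MKL_Complex8) * n * n, 64);
    MKL_Complex8 *b = mkl_malloc(sizeof(MKL_Complex8) * n * n, 64);
    MKL_Complex8 *c = mkl_malloc(sizeof(MKL_Complex8) * n * n, 64);
    MKL_Complex8 alpha = {1.0f, 0.0f}, beta = {0.0f, 0.0f};

    for (MKL_INT i = 0; i < n * n; ++i) {
        a[i].real = 1.0f; a[i].imag =  0.5f;
        b[i].real = 2.0f; b[i].imag = -0.5f;
        c[i].real = 0.0f; c[i].imag =  0.0f;
    }

    double t0 = dsecnd();           /* MKL wall-clock timer */
    for (int r = 0; r < reps; ++r)
        cblas_cgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    n, n, n, &alpha, a, n, b, n, &beta, c, n);
    double t1 = dsecnd();

    printf("Elapsed time per multiplication: %.3f ms\n", (t1 - t0) / reps * 1e3);

    mkl_free(a); mkl_free(b); mkl_free(c);
    return 0;
}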

 

Many thanks in advance,

Juan.

Attachments: matmult_c.c (1.45 KB), matmult_mex.c (136 bytes), matmult.h (134 bytes)

Memory Leak in MKL


Our program analyzes several images every few seconds. Each image is processed in its own thread. The threads are ordinary Windows threads. MKL functions are used in the process. We used a memory leak detection tool to find leaks in the application (Memory Validator by Software Verification). The tool revealed a leak in calls to MKL vsMul function (32 bits for each image). The call stack is as follows:

vsMul->mkl_vml_kernel_GetTTableIndex->_vmlGetThreadLocalData

I found a note in the Intel documentation (https://software.intel.com/en-us/mkl-windows-developer-guide-avoiding-me...) and accordingly tried both the mkl_disable_fast_mm function and the MKL_DISABLE_FAST_MM environment variable. Neither of them helped to eliminate the memory leak.
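
For context, here is a minimal sketch of how those two mitigations were applied, plus a per-thread buffer release; the worker routine and the idea that mkl_thread_free_buffers() might help here are my own assumptions, not something the documentation promises:

#include "mkl.h"

/* Sketch only: disable MKL's fast memory manager before any MKL call
   (equivalent to setting MKL_DISABLE_FAST_MM=1 in the environment),
   and release the calling thread's MKL buffers when a worker finishes. */
static void worker_process_image(const float *a, const float *b, float *r, MKL_INT n)
{
    vsMul(n, a, b, r);            /* element-wise product, as in the leak report */

    /* Assumption: freeing this thread's MKL buffers on exit may release
       memory that the leak detector attributes to vsMul. */
    mkl_thread_free_buffers();
}

int main(void)
{
    /* Must be called before the first MKL routine to take effect. */
    mkl_disable_fast_mm();

    /* In the real application this runs in each Windows worker thread. */
    float a[4] = {1, 2, 3, 4}, b[4] = {5, 6, 7, 8}, r[4];
    worker_process_image(a, b, r, 4);
    return 0;
}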

How can I overcome this memory leak problem?

Typos in the mkl_spblas.h?


Some of the function declarations in mkl_spblas.h do not agree with the documentation. In particular, in the create and export routines, the parameter names for the CSC format coincide with those for CSR format, which I don't believe accurately describes the nature of the parameters (e.g. `rows_start` and `col_indx` should be `cols_start` and `row_indx`). 

Below are a few of the declarations I find in the mkl_spblas.h file:

sparse_status_t mkl_sparse_d_create_csc(
                                             sparse_matrix_t           *A,
                                             const sparse_index_base_t indexing, /* indexing: C-style or Fortran-style */
                                             const MKL_INT             rows,
                                             const MKL_INT             cols,
                                             MKL_INT             *rows_start,
                                             MKL_INT             *rows_end,
                                             MKL_INT             *col_indx,
                                             double              *values );
/* cf. https://software.intel.com/en-us/mkl-developer-reference-c-mkl-sparse-cr... */

sparse_status_t mkl_sparse_d_export_csc(
                                             const sparse_matrix_t  source,
                                             sparse_index_base_t    *indexing,      /* indexing: C-style or Fortran-style */
                                             MKL_INT                *rows,
                                             MKL_INT                *cols,
                                             MKL_INT                **rows_start,
                                             MKL_INT                **rows_end,
                                             MKL_INT                **col_indx,
                                             double                 **values );
/* cf. https://software.intel.com/en-us/mkl-developer-reference-c-mkl-sparse-ex... */

Bad interaction between mkl_dynamic and potri in MKL 19.02?


Hi

The program below implements the inversion of an autoregressive matrix.

Program Test
  use blas95
  use lapack95
  USE IFPORT
  use mkl_service
  implicit none
  integer(kind=8) :: istat, n, c1, c2, ise
  integer(kind=4) :: dy
  character(len=200) :: msg
  Real(kind=8), allocatable :: A(:,:)
  real(kind=8) :: r1=0.0D0, r2=0.0D0
  outer:block
    dy=1
    write(*,*) "dynamic: ", dy
    call mkl_set_dynamic(dy)
    call mkl_set_num_threads(mkl_get_max_threads())
    n=10000
    write(*,"(*(g0"",""))") n
    r1=dclock()
    !!start building the matrix
    allocate(&
      &A(n,n),&
      &stat=istat,errmsg=msg)
    if(istat/=0) Then
      write(*,*) msg;exit outer
    end if
    !$OMP PARALLEL DO PRIVATE(c1)
    Do c1=1,size(A,2)
      Do c2=c1,size(A,1)
        A(c2,c1)=0.5**(c2-c1)
      end Do
    end Do
    !$OMP END PARALLEL DO
    ise=size(A,1)
    !$OMP PARALLEL DO PRIVATE(c1) FIRSTPRIVATE(ise)
    Do c1=1,ise-1
      A(c1,(c1+1):ise)=A((c1+1):ise,c1)
    End Do
    !$OMP END PARALLEL DO
    r2=Dclock()
    write(*,*) "alloc: ", r2-r1
    !!end building matrix
    r2=Dclock()
    call potrf(A=A,UPLO="U",INFO=istat)
    r1=dclock()
    write(*,*) "potrf: ",r1-r2
    call POTRI(A=A,Info=istat)
    r2=dclock()
    write(*,*) "potri: ",r2-r1
  End block outer
End Program Test

 

With mkl_dynamic set to 0 or 1, I noticed hardly any difference in processing time when using MKL 17.08.

mkl_dynamic=0:

potrf: 0.88 seconds, potri: 2.12 seconds

mkl_dynamic=1

potrf: 0.58 seconds, potri: 2.09 seconds

However, with mkl 19.02 the differences are such that mkl_dynamic=0 makes the program unusable.

mkl_dynamic=0:

potrf: 0.37 seconds, potri: 110.69 seconds

mkl_dynamic=1

potrf: 0.37 seconds, potri: 1.11 seconds

Times were obtained on Intel(R) Xeon(R) CPU E5-2697 v4 @ 2.30GHz.

Environment variables were:

MKL_NUM_THREADS=36

KMP_AFFINITY=granularity=core,scatter

 

I noticed that potri in MKL 19.02 uses all 72 threads (including hyperthreads) for much of the time.
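
For reference, a reduced C sketch of the same measurement (my own reconstruction with LAPACKE and dsecnd, not the Fortran test above; the threading calls and layout are assumed):

/* Reduced sketch of the timing comparison: toggle mkl_set_dynamic(0/1)
   and compare dpotrf/dpotri times on the same autoregressive matrix. */
#include <stdio.h>
#include <math.h>
#include "mkl.h"

int main(void)
{
    const MKL_INT n = 10000;
    double *a = mkl_malloc(sizeof(double) * n * n, 64);

    mkl_set_dynamic(1);                        /* change to 0 to reproduce the slow case */
    mkl_set_num_threads(mkl_get_max_threads());

    /* Symmetric matrix a(i,j) = 0.5**|i-j|, column-major */
    for (MKL_INT j = 0; j < n; ++j)
        for (MKL_INT i = 0; i < n; ++i) {
            MKL_INT d = (i > j) ? i - j : j - i;
            a[i + j * n] = pow(0.5, (double)d);
        }

    double t0 = dsecnd();
    MKL_INT info = LAPACKE_dpotrf(LAPACK_COL_MAJOR, 'U', n, a, n);
    double t1 = dsecnd();
    printf("potrf: %.2f s (info=%lld)\n", t1 - t0, (long long)info);

    info = LAPACKE_dpotri(LAPACK_COL_MAJOR, 'U', n, a, n);
    double t2 = dsecnd();
    printf("potri: %.2f s (info=%lld)\n", t2 - t1, (long long)info);

    mkl_free(a);
    return 0;
}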

 

Is this a newly introduced bug, or am I doing something wrong?

 

Thanks

Keep some values of a vector


Hello, I want to keep some elements of a vector without using a for-loop. For example,

x = [ 10 20 30 40 .... 100] and index = [1 5 10];

In MATLAB you use x(1,index). 

Could you please tell me if there is a routine for this in MKL?
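
To make the question concrete, here is a small sketch of that selection using cblas_dgthr (MKL's sparse BLAS level 1 gather), assuming that routine is the right fit and that the C interface takes zero-based indices:

/* Sketch only: picked[i] = x[index[i]], i.e. MATLAB's x(index).
   Assumes cblas_dgthr is available and zero-based indices for the C interface. */
#include <stdio.h>
#include "mkl.h"

int main(void)
{
    double  x[10]    = {10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
    MKL_INT index[3] = {0, 4, 9};   /* MATLAB's [1 5 10], shifted to base 0 */
    double  picked[3];

    cblas_dgthr(3, x, picked, index);

    printf("%g %g %g\n", picked[0], picked[1], picked[2]);  /* 10 50 100 */
    return 0;
}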

Thank you very much.

F.

pardiso out of memory (-2) in phase 33


Hi,

PARDISO stops with error message -2 in phase 33 when increasing the number of right-hand sides from 600 to 700. With 600 RHS it runs fine, with a memory use of 146.2 GB ("RES") reported by "top". The system has 256 GB of RAM, so more than 40% is still free. What is the issue when increasing the number of RHS by 100?

Thanks

Some questions about the FFT's half-length result


Hi Intel experts,

I am writing to ask about the FFT's half-length result. In MATLAB I use fft to transform a real array and then ifft to recover the original array. However, in MKL I only get N/2 + 1 results. Is there a configuration that returns the full-length result? If not, maybe I could conjugate the first half of the result and mirror it to fill the second half, since that is how MATLAB's FFT output of a real input is arranged.
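
For what it's worth, here is a minimal sketch of that mirroring written out explicitly (my own example: the array length, the DFTI configuration and the conjugate-even unpacking are assumed, and it is untested):

/* Sketch: forward real FFT with DFTI, then rebuilding the full-length
   spectrum from the N/2+1 conjugate-even results via X[N-k] = conj(X[k]). */
#include <stdio.h>
#include "mkl.h"

#define N 8

int main(void)
{
    double x[N] = {1, 2, 3, 4, 5, 6, 7, 8};    /* real input */
    MKL_Complex16 half[N / 2 + 1];              /* what MKL returns */
    MKL_Complex16 full[N];                      /* MATLAB-style full spectrum */

    DFTI_DESCRIPTOR_HANDLE h = NULL;
    MKL_LONG status = DftiCreateDescriptor(&h, DFTI_DOUBLE, DFTI_REAL, 1, (MKL_LONG)N);
    status = DftiSetValue(h, DFTI_PLACEMENT, DFTI_NOT_INPLACE);
    status = DftiSetValue(h, DFTI_CONJUGATE_EVEN_STORAGE, DFTI_COMPLEX_COMPLEX);
    status = DftiCommitDescriptor(h);
    status = DftiComputeForward(h, x, half);
    DftiFreeDescriptor(&h);
    if (status != 0) { printf("DFTI error %ld\n", (long)status); return 1; }

    /* First half is returned directly, second half is its mirrored conjugate. */
    for (int k = 0; k <= N / 2; ++k)
        full[k] = half[k];
    for (int k = N / 2 + 1; k < N; ++k) {
        full[k].real =  half[N - k].real;
        full[k].imag = -half[N - k].imag;
    }

    for (int k = 0; k < N; ++k)
        printf("X[%d] = %g %+gi\n", k, full[k].real, full[k].imag);
    return 0;
}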

I would be grateful to receive a prompt reply and I am sorry for my broken English.

Thanks, xD.


Question about matrix inverse


Recently, I was trying to replace our matrix-inversion function with MKL routines. However, after doing so, I found that the computation time was not improved and was actually slower. I don't know the real reason. I would appreciate it if anyone could help me.

The original subroutine was syminvg.

SUBROUTINE syminvg(n,q,norm,nsing,parflg,sbrName,bigajj_limit,utest)

! -------------------------------------------------------------------------
! Purpose:    Inversion symm. matrix with pivot search on the diagonal.
!             Algorithm acm 150, H. Rutishauser 1963
!             Change: upper triangle of matrix in vektor q(n*(n+1)/2)
!
! Author:     H. Rutishauser, C. Urschl
!
! Created:    24-Jan-2003
!
! Changes:    09-Aug-2010 RD: Generic use, only one version for all purposes
!             27-Oct-2010 SL: use m_bern with ONLY
!             23-May-2011 RD: Special handling for diagonal elements with NaN
!             30-Jul-2012 RD: Evaluate the inversion (only for GRP_AIUB)
!             22-Sep-2012 RD: Change variable name "unit" to "utest"
!
! Copyright:  Astronomical Institute
!             University of Bern
!             Switzerland
! -------------------------------------------------------------------------

 

I used two Intel MKL subroutines:

CALL DPPTRF('U',n,q0,INFO)
CALL DPPTRI('U',n,q0,INFO)
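
For reference, a minimal C sketch of the same two-call sequence through LAPACKE with a toy 3x3 matrix (my own example, not the original routine or data):

/* Minimal sketch of the DPPTRF/DPPTRI sequence via LAPACKE.
   Toy SPD matrix [4 1 1; 1 3 0; 1 0 2], upper triangle in
   column-major packed storage. */
#include <stdio.h>
#include "mkl.h"

int main(void)
{
    double ap[6] = {4.0, 1.0, 3.0, 1.0, 0.0, 2.0};  /* packed upper triangle */
    const MKL_INT n = 3;

    MKL_INT info = LAPACKE_dpptrf(LAPACK_COL_MAJOR, 'U', n, ap);  /* Cholesky */
    if (info == 0)
        info = LAPACKE_dpptri(LAPACK_COL_MAJOR, 'U', n, ap);      /* inverse  */
    printf("info = %lld\n", (long long)info);

    /* ap now holds the upper triangle of the inverse in packed storage. */
    return 0;
}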

mkl_sparse_d_mv returns different values than mkl_dcsrsymv


Hello,

 

We are updating the code in my organisation to get rid of the deprecated Sparse BLAS functions. Our software performs structural calculations with sparse matrices stored as CSR (symmetric, upper, one-based indexing, non-unit diagonal) and force/displacement vectors. Until now we were using mkl_dcsrsymv to multiply matrix by vector (we obtained exactly the same values by encoding the vector as a one-row matrix and using mkl_dcsrmm, a workaround for some segmentation faults we experienced with an older MKL version; btw, we are working with MS Visual Studio C++ on an Intel Core i7).

The fact is, all our operations with the Inspector-Executor functions (building the matrix from a COO, adding, even the results from PARDISO) are fine, but the multiplication with mkl_sparse_d_mv returns a few significantly different values compared with the deprecated function, and that makes our structural analysis fail. For instance, with rank = 1792 and a vector with 496 non-zero values, we find 144 differing values in the result, 36 of them significantly different (at least 1% in absolute value).

This is the code for the operation V_out = M x V_in:

    matrix_descr matrix_descr;
    matrix_descr.type = sparse_matrix_type_t::SPARSE_MATRIX_TYPE_SYMMETRIC;
    matrix_descr.mode = sparse_fill_mode_t::SPARSE_FILL_MODE_UPPER;
    matrix_descr.diag = sparse_diag_type_t::SPARSE_DIAG_NON_UNIT;

    sparse_status_t res = mkl_sparse_d_mv (sparse_operation_t::SPARSE_OPERATION_NON_TRANSPOSE, 1.0, M, matrix_descr, V_in, 0.0, V_out);

The return value is always sparse_status_t::SPARSE_STATUS_SUCCESS, but the values returned into V_out do not match the ones expected.
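
For reference, a reduced sketch that runs both call paths on a tiny symmetric-upper, one-based CSR matrix (toy data made up for this post, not our production matrices):

/* Sketch comparing the deprecated mkl_dcsrsymv with mkl_sparse_d_mv
   on the same symmetric-upper, one-based CSR data.
   A = [1 2 0; 2 3 4; 0 4 5], x = ones -> y should be [3 9 9]. */
#include <stdio.h>
#include "mkl.h"

int main(void)
{
    MKL_INT ia[4] = {1, 3, 5, 6};          /* one-based row pointers (upper part) */
    MKL_INT ja[5] = {1, 2, 2, 3, 3};
    double  a[5]  = {1.0, 2.0, 3.0, 4.0, 5.0};
    MKL_INT n = 3;
    double x[3] = {1.0, 1.0, 1.0};
    double y_old[3], y_new[3];

    /* Deprecated path */
    mkl_dcsrsymv("U", &n, a, ia, ja, x, y_old);

    /* Inspector-Executor path */
    sparse_matrix_t A = NULL;
    sparse_status_t st = mkl_sparse_d_create_csr(&A, SPARSE_INDEX_BASE_ONE, n, n,
                                                 ia, ia + 1, ja, a);
    if (st != SPARSE_STATUS_SUCCESS) return 1;

    struct matrix_descr descr;
    descr.type = SPARSE_MATRIX_TYPE_SYMMETRIC;
    descr.mode = SPARSE_FILL_MODE_UPPER;
    descr.diag = SPARSE_DIAG_NON_UNIT;
    mkl_sparse_d_mv(SPARSE_OPERATION_NON_TRANSPOSE, 1.0, A, descr, x, 0.0, y_new);
    mkl_sparse_destroy(A);

    for (int i = 0; i < 3; ++i)
        printf("row %d: old=%g new=%g\n", i, y_old[i], y_new[i]);
    return 0;
}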

Is this a problem you were aware of, or have I found some bug?

 

Thanks,

David B.

Slight Discrepancies in AXPY results vs explicit loops


Hi,

In some of our codes, we have noticed some slight discrepancies in the results when using MKL AXPY calls and our own explicit loops to calculate Y=Y+ALPHA*X.  When we are performing iterative matrix solutions, the final solutions using AXPY can be significantly worse than the solutions using explicit loops instead of AXPY. 

I've attached a test case under 64-bit MS Windows using Intel MKL 2019 update 2.  For some vectors and scale factors, the results are identical down to machine precision. For other vector and scale factor combinations, you can barely maintain the specified floating point precision.  In the attached screen shot, the left column of numbers shows the RMS error between the AXPY result and the explicit loop result for two cases.  In the first case, the two results are different and in the second case no difference is discernible between the two results.  Since AXPY loops have no dependencies between vector entries, it does not seem like it could be a threading issue.  While we are aware that floating point operations are always tricky, we don't understand why independent operations of y(i) = y(i) + alpha*x(i) are returning such different results.

We've tried the various suggestions from the documentation about getting repeatable floating point solutions to no avail.  Any advice would be much appreciated.  Is this just to be expected or is there some other MKL issue going on?
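
For reference, the comparison we are doing reduces to something like the following sketch (made-up data, not the attached test case):

/* Sketch of the comparison: cblas_daxpy versus an explicit loop for
   y = y + alpha*x, reporting the RMS difference between the two results. */
#include <math.h>
#include <stdio.h>
#include "mkl.h"

int main(void)
{
    const MKL_INT n = 1000000;
    double alpha = 0.3330000000000001;          /* arbitrary scale factor */
    double *x  = mkl_malloc(sizeof(double) * n, 64);
    double *y1 = mkl_malloc(sizeof(double) * n, 64);
    double *y2 = mkl_malloc(sizeof(double) * n, 64);

    for (MKL_INT i = 0; i < n; ++i) {
        x[i]  = sin((double)i);
        y1[i] = y2[i] = cos((double)i);
    }

    cblas_daxpy(n, alpha, x, 1, y1, 1);         /* MKL version   */
    for (MKL_INT i = 0; i < n; ++i)             /* explicit loop */
        y2[i] = y2[i] + alpha * x[i];

    double rms = 0.0;
    for (MKL_INT i = 0; i < n; ++i)
        rms += (y1[i] - y2[i]) * (y1[i] - y2[i]);
    printf("RMS difference: %g\n", sqrt(rms / n));

    mkl_free(x); mkl_free(y1); mkl_free(y2);
    return 0;
}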

Thanks,

John

Attachments: xaxpy.zip (828.48 KB), screen_output.png (71.95 KB)

Error building application with mkl


I am having issues building an application. When running make, it runs successfully until linking the executable, where I encounter the following error: "ld: library not found for -liomp5". I tried building the examples in the MKL library and there were no problems. MKLROOT has been set to Intel's default install location. Any idea how I could solve this issue?

I am using MacOS 10.13.6.

Intel® Parallel Studio XE 2019 On linux


Hi,

I've been evaluating Parallel Studio XE 2019 on Linux. I had a problem when compiling a very simple Fortran test code using

"ifort source.f90 -mkl -qopenmp". It worked on the development machine, but when the executable was run on a different machine where Parallel Studio is not installed, I got the error "error while loading shared libraries". What I did was copy libiomp5.so to /usr/lib on that clean machine, so the simple code that prints the number of threads now works. However, as soon as I added the PARDISO solver from MKL to the code, it still works on the development machine, but on the clean machine I get: "kmp_aligned_malloc version VERSION not defined in file libiomp5.so with link time reference".

Any help is appreciated.

Thanks !

Trust Region optimization dtrnlsp_solve aborts intermittently returning 1502


I'm calling dtrnlsp_solve from C++. Depending on changes to the surrounding code, compile options (debug vs. optimized), etc., sometimes it works and sometimes it fails on the first call, returning code 1502. What does that code mean? Any suggestions on how to fix it? I can't find anything wrong in my code.

CENTRAL_MKL_LIB = /depot/intel/math_kernel_library_2018.1.163/mkl

MATH_LIBS_linux64      = -Wl,--start-group $(CENTRAL_MKL_LIB)/intel64/libmkl_intel_lp64.a $(CENTRAL_MKL_LIB)/intel64/libmkl_sequential.a $(CENTRAL_MKL_LIB)/intel64/libmkl_core.a -Wl,--end-group -lpthread -lm

For a while it was working when compiled optimized but aborting when compiled debug. Then I included the call to dtrnlsp_check (see below), and then it would work compiled debug but not optimized. Then I changed some code in another file and now it always fails.

double *xx = (double*) malloc (sizeof (double)*nn);
  double *fjac = (double*) malloc (sizeof (double)*nn*sz2);
  double *fvec= (double*) malloc (sizeof (double)*sz2);
  for(int i=0; i<nn; i++) xx[i]=0.0;
  for(int i=0; i<sz2*nn; i++) fjac[i]=0.0;
  for(int i=0; i<sz2; i++) fvec[i]=0.0;
  double eps[6]={.00001, 0.00001, 0.00001, 0.00001, 0.00001, 0.00001};
  MKL_INT iter1=100;
  MKL_INT iter2=10;
  double rs=0.0;
  int res = dtrnlsp_init(&handle, &nn, &sz2, xx, eps, &iter1, &iter2, &rs);
  if(res != TR_SUCCESS) {
    printf("ERROR SCE dtrnlsp_init failed\n");
    exit(1);
  }
#if 0 
  MKL_INT info[6];
  res = dtrnlsp_check(&handle, &nn, &sz2, fjac, fvec, eps, info);
  if(res != TR_SUCCESS) {
    printf("ERROR SCE dtrnlsp_check failed  code=%d\n", res);
    exit(1);
  }
#endif
  MKL_INT  RCI_Request;
  int success=0;
  while(success == 0){
    res = dtrnlsp_solve(&handle, fvec, fjac, &RCI_Request);
    if(res != TR_SUCCESS) {
      printf("ERROR SCE dtrnlsp_solve failed code=%d\n", res);
      exit(1);
    }

......

Thanks

Greg

Batch normalization implementation


mkl_sparse_d_create_csc different definitions


Hi,

I'm having a problem with CSC-type sparse matrix generation. The function mkl_sparse_d_create_csc is defined in the Intel MKL documentation in the following manner:

sparse_status_t mkl_sparse_d_create_csc (sparse_matrix_t       *A,
                                                                sparse_index_base_t indexing,
                                                                MKL_INT rows,
                                                                MKL_INT cols,
                                                                MKL_INT *cols_start,
                                                                MKL_INT *cols_end,
                                                                MKL_INT *row_indx,
                                                                double *values);

However in the header mkl_spblas.h it is defined differently:

sparse_status_t mkl_sparse_d_create_csc(sparse_matrix_t        *A,
                                                               sparse_index_base_t    indexing, /* indexing: C-style or Fortran-style */
                                                               MKL_INT    rows,
                                                               MKL_INT    cols,
                                                               MKL_INT    *rows_start,
                                                               MKL_INT    *rows_end,
                                                               MKL_INT    *col_indx,
                                                               double        *values );

The definition in the documentation looks like a true definition of a CSC-type matrix, whereas the declaration in the header file looks like a CSR-type sparse matrix definition. When I try generating a CSC matrix I get a memory access violation, even though I manually checked the input vectors and they are all correctly specified as in the documentation. So I wonder: since the function appears to be declared in the header like a CSR-format sparse matrix, should I also pass the arrays in that structure?
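
To make the difference concrete, here is a small sketch that passes the arrays following the documented (CSC-style) parameter meaning, with a toy 3x3 matrix and zero-based indexing (my own example, not my actual code):

/* Toy example following the documented CSC parameter meaning
   (cols_start / cols_end / row_indx), zero-based indexing.
   A = [1 0 2]
       [0 3 0]
       [4 0 5]   stored column by column. */
#include <stdio.h>
#include "mkl.h"

int main(void)
{
    double  values[5]     = {1.0, 4.0, 3.0, 2.0, 5.0};
    MKL_INT row_indx[5]   = {0, 2, 1, 0, 2};
    MKL_INT cols_start[3] = {0, 2, 3};
    MKL_INT cols_end[3]   = {2, 3, 5};

    sparse_matrix_t A = NULL;
    sparse_status_t st = mkl_sparse_d_create_csc(&A, SPARSE_INDEX_BASE_ZERO,
                                                 3, 3,          /* rows, cols */
                                                 cols_start, cols_end,
                                                 row_indx, values);
    printf("status = %d\n", (int)st);   /* SPARSE_STATUS_SUCCESS == 0 */
    if (st == SPARSE_STATUS_SUCCESS)
        mkl_sparse_destroy(A);
    return 0;
}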

I'm using MKL version 2019.0.2 for Windows.

Thanks for the help,

Adriaan

MKL FEAST (internal memory problem: info=-2)


Hi,

I'm trying to solve (sparse) symmetric generalized eigenvalue problems using dfeast_scsrgv().

It always worked fine with relatively small problems (up to a 2000x2000 sparse matrix), but it turned out I can't solve the bigger ones (around a 60000x60000 sparse matrix), as the routine always returns info = -2, which I understand refers to an internal memory problem, but I don't know how to fix it.

The problem is solvable using other algorithms (not  included in MKL).

I'm using the latest Intel MKL 2019 library and Intel C++ Compiler 19.0 on Visual Studio 2013 Platform Toolset.

May I ask you for some help?

Thanks in advance,

Daniele

 

 

 

Linking MKL in the makefile


Hi:

     I went to the intel-mkl-link-line-advisor and got the following:

Use this link line: 

  ${MKLROOT}/lib/intel64/libmkl_scalapack_lp64.a -Wl,--start-group ${MKLROOT}/lib/intel64/libmkl_cdft_core.a ${MKLROOT}/lib/intel64/libmkl_intel_lp64.a ${MKLROOT}/lib/intel64/libmkl_intel_thread.a ${MKLROOT}/lib/intel64/libmkl_core.a ${MKLROOT}/lib/intel64/libmkl_blacs_openmpi_lp64.a -Wl,--end-group -liomp5 -lpthread -lm -ldl

Compiler options: 

 -I${MKLROOT}/include

  Please tell me how to include them in the makefile.

 

Thanks,

 

cblas_ddot access violation


Hi,

I was just trying to learn how to use the 64-bit platform in Parallel Studio XE. When running a very simple example (using cblas_ddot(), as shown below), I get the following exception:

Exception thrown at 0x00007FF7AA60EB36 in mkl64bit.exe: 0xC0000005: Access violation reading location 0xFFFFFFFFFFFFFFFF.

Then I tried the Intel Inspector tool and it says there was an invalid memory access: 0x1eb36    vmovsd xmm4, qword ptr [r9+rax*8]

The code is:

#include <stddef.h>
#include "mkl.h"   /* for cblas_ddot */

int main() {
    double v1[3], v2[3];
    for (size_t i = 0; i != 3; ++i)
        v1[i] = v2[i] = 1.5;

    double result;
    result = cblas_ddot(3, v1, 1, v2, 1);

    return 0;
}

I'm using Intel Parallel Studio XE Cluster Edition 2019 for Windows on Visual Studio 2017.

May you please help me to understand what's wrong and how could I fix the problem?

Thanks in advance,

Daniele

MKL_SPARSE_D_TRSM: no result when called from Fortran


Hi

The program below implements a "csr" class which holds a sparse CSR matrix copied from the MKL manual. This matrix is used with the Inspector-Executor routines MKL_SPARSE_D_CREATE_CSR, MKL_SPARSE_COPY and MKL_SPARSE_D_TRSM. All routines finish without error. However, MKL_SPARSE_D_TRSM does not seem to solve for anything. Am I doing something wrong here?

include "mkl_spblas.f90"
Module Data_Kind
  Implicit None
  Integer, Parameter :: Large=Selected_Int_Kind(12)
  Integer, Parameter :: Medium=Selected_Int_Kind(8)
  Integer(Medium), Parameter :: Double=Selected_Real_Kind(15,100)
End Module Data_Kind
Module Mod_csr
  use data_kind
  implicit None
  Type :: csr
    integer(large), allocatable :: nrows, ncols
    integer(large), allocatable :: rowpos(:), colpos(:),&
      & pointerB(:), pointerE(:)
    real(double), allocatable :: values(:)
  contains
    Procedure :: iii => Subiii
  end type csr
contains
  Subroutine Subiii(this)
    Implicit None
    Class(csr), intent(inout) :: this
    !!2017 mkl manual page 3233
    allocate(this%nrows,this%ncols,source=5)
    allocate(this%rowpos(6),source=(/1,4,6,9,12,14/))
    allocate(this%colpos(13),source=(/1,2,4,1,2,3,4,5,1,3,4,2,5/))
    allocate(this%pointerB(5),source=(/1,4,6,9,12/))
    allocate(this%pointerE(5),source=(/4,6,9,12,14/))
    allocate(this%values(13),&
      &source=(/1.0D0,-1.0D0,-3.0D0,-2.0D0,5.0D0,&
      &4.0D0,6.0D0,4.0D0,-4.0D0,2.0D0,7.0D0,8.0D0,-5.0D0/))
  end Subroutine Subiii
End Module Mod_Csr
Program Test
  use data_kind
  USE IFPORT
  use Mod_csr, only: csr
  use MKL_SPBLAS
  USE, INTRINSIC :: ISO_C_BINDING
  Implicit none
  Type(csr) :: tscsr
  integer(c_int) :: isstat=0
  Type(Sparse_Matrix_T) :: handle, handle1
  Type(MATRIX_DESCR) :: descr
  real(double), allocatable :: in(:,:), out(:,:)
  outer:block
    descr%TYPE=SPARSE_MATRIX_TYPE_GENERAL
    descr%DIAG=SPARSE_DIAG_NON_UNIT
    call tscsr%iii()
    isstat=MKL_SPARSE_D_CREATE_CSR(&
      &handle,&
      &SPARSE_INDEX_BASE_ONE,&
      &tscsr%nrows,&
      &tscsr%ncols,&
      &tscsr%pointerB,&
      &tscsr%pointerE,&
      &tscsr%colpos,&
      &tscsr%values&
      &)
    if(isstat/=0) Then
      write(*,*) "error 1 ",isstat;exit outer
    End if
    isstat=mkl_sparse_copy(handle,descr,handle1)
    if(isstat/=0) Then
      write(*,*) "error 2 ",isstat;exit outer
    End if
    allocate(in(tscsr%nrows,2),out(tscsr%nrows,2))
    in=1.0;out=0.0
    isstat=MKL_SPARSE_D_TRSM(&
      &SPARSE_OPERATION_NON_TRANSPOSE,&
      &1.0_double,&
      &handle1,&
      &descr,&
      &SPARSE_LAYOUT_COLUMN_MAJOR,&
      &in,&
      &2,&
      &size(in,1),&
      &out,&
      &size(out,1)&
      &)
    if(isstat/=0) Then
      write(*,*) "error 3 ",isstat;exit outer
    end if
    write(*,*) maxval(in), maxval(out), minval(out)
  end block outer
End Program Test

Matrix "out" is supposed to contain the sum over the rows of the inverse of the sparse matrix. But "out" contains only zeros. Also isstat from MKL_SPARSE_D_TRSM does not indicate any error.

The compiling and linking was:

ifort --version
ifort (IFORT) 19.0.2.187 20190117

ifort -i8 -warn nounused -warn declarations -O3 -static -align array64byte -mkl=parallel -qopenmp -parallel -c -o OMP_MKLPARA_ifort_4.20.12-arch1-1-ARCH/Test.o Test.f90 -I /opt/intel/compilers_and_libraries_2019.2.187/linux/mkl/include/

ifort -i8 -warn nounused -warn declarations -O3 -static -align array64byte -mkl=parallel -qopenmp -parallel -o Test_OMP_MKLPARA_4.20.12-arch1-1-ARCH OMP_MKLPARA_ifort_4.20.12-arch1-1-ARCH/Test.o /opt/intel/compilers_and_libraries_2019.2.187/linux/mkl/lib/intel64/libmkl_blas95_ilp64.a /opt/intel/compilers_and_libraries_2019.2.187/linux/mkl/lib/intel64/libmkl_lapack95_ilp64.a -Wl,--start-group /opt/intel/compilers_and_libraries_2019.2.187/linux/mkl/lib/intel64/libmkl_intel_ilp64.a /opt/intel/compilers_and_libraries_2019.2.187/linux/mkl/lib/intel64/libmkl_core.a /opt/intel/compilers_and_libraries_2019.2.187/linux/mkl/lib/intel64/libmkl_intel_thread.a -Wl,--end-group -lpthread -lm -ldl

 

Any suggestion?
