Signature of LU factorisation and Armadillo

May 3, 2020, 10:40 am

Latest and popular articles on Intel Technologies

In our project we use Armadillo and Intel MKL extensively, and Armadillo expects the LAPACK signatures to conform to the ones on Netlib.

Similar to what happened with a previous release (https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/...), the latest version of Intel MKL (2020.1.216) seems to declare the LU factorisation routines ?getrf as NOTHROW. This breaks compilation, which is fixed by declaring the corresponding Armadillo methods noexcept.

For those who might need it: https://gitlab.com/vincenzo.ferrazzano/armadillo-code/-/commits/fix/9.87...

Why are these methods, and apparently only these methods, declared as NOTHROW?

Should we expect more of these signature changes in the future?
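The mismatch can be reproduced without MKL. The following is a minimal sketch, assuming that NOTHROW expands to noexcept when the header is compiled as C++11 or later; `vendor_dgetrf_` is a hypothetical stand-in and not the real MKL declaration.

```cpp
// Hypothetical stand-in for a vendor header that marks a LAPACK routine
// as non-throwing (the role NOTHROW is assumed to play in the MKL headers).
extern "C" void vendor_dgetrf_(const int* m, const int* n, double* a,
                               const int* lda, int* ipiv, int* info) noexcept;

// A second declaration of the same function, as a wrapper library might
// write it. It has to repeat the exception specification: without the
// trailing `noexcept`, compilers typically reject it because the exception
// specification differs from the earlier declaration.
extern "C" void vendor_dgetrf_(const int* m, const int* n, double* a,
                               const int* lda, int* ipiv, int* info) noexcept;

// Dummy definition so that the sketch links; it only reports success.
extern "C" void vendor_dgetrf_(const int*, const int*, double*,
                               const int*, int*, int* info) noexcept {
    *info = 0;
}

// The exception specification is visible to callers at compile time.
static_assert(noexcept(vendor_dgetrf_(nullptr, nullptr, nullptr,
                                      nullptr, nullptr, nullptr)),
              "declaration is expected to be noexcept");
```

Removing `noexcept` from the second declaration should reproduce the kind of error described above.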

↧

Unhandled exception at 0x75724192 (KernelBase.dll). Module not found.

May 1, 2020, 1:10 pm

I'm using the latest Windows update, the latest update of the 2020 version of the MKL library, and the latest update of Microsoft Visual Studio 2019 Community edition. I've run the Windows system file checker to verify my operating system files. I'm using C++. My application is a simple console application.

The code below throws an exception which I am unable to catch. It seems to indicate that I am having a path problem. I've pursued this in the Microsoft forums. They referred me to you. Can you help?

try
{
    info = LAPACKE_dgetrf(LAPACK_ROW_MAJOR, A.nRows, A.nColumns, aP, A.nColumns, ipiv);
}
catch (int eNumber)
{
    printf("LAPACKE_dgetrf() threw exception number: %d\n", eNumber);
    free(aP);
    delete ipiv;
    return Result;
}
if (info != 0)
{
    free(aP);
    delete ipiv;
    return Result;
}

Here is the debugger output I receive.

Unhandled exception at 0x75724192 (KernelBase.dll) in CraigsSystem.exe: 0xC06D007E: Module not found (parameters: 0x00F5F9A4). occurred

↧

About COO with duplicate entries, and MKL_sparse_export_csr

April 13, 2020, 1:42 am

Hello everyone,

Because I am working on FEM, the global stiffness matrix is sparse and can easily be assembled in COO format. However, the assembled COO matrix has many unsorted duplicate entries (many values share the same row and column indices), which need to be sorted and consolidated, i.e. summed. On many other platforms, such as MATLAB and Python, the sparse function sorts and sums these duplicates automatically, producing the correct final COO matrix.

But in MKL for Fortran users, I use mkl_?csrcoo, and it doesn't consolidate these duplicates even though it produces a sorted CSR matrix. Now I am using the Matrix Manipulation Routines in the IE Sparse BLAS to do this, but I am not sure whether these new routines can handle COO input with duplicate entries.

In addition, the routine mkl_sparse_?_export_csr always gives wrong results. I also don't know how to allocate the arrays 'col_indx' and 'values', because I don't know their lengths once the duplicate entries have been consolidated.
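For reference, the consolidation described above can be done by hand before the matrix is passed to any library. The sketch below is plain C++ rather than Fortran and does not use MKL; the names `Csr` and `coo_to_csr_sum_duplicates` are made up for illustration, and zero-based indices are assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// CSR matrix with zero-based indices.
struct Csr {
    std::vector<int> row_ptr;    // size n_rows + 1
    std::vector<int> col_ind;    // one entry per unique (row, col)
    std::vector<double> values;  // one entry per unique (row, col)
};

// Convert unsorted COO triplets to CSR, summing entries that share (row, col).
Csr coo_to_csr_sum_duplicates(int n_rows,
                              const std::vector<int>& rows,
                              const std::vector<int>& cols,
                              const std::vector<double>& vals) {
    assert(rows.size() == cols.size() && cols.size() == vals.size());

    // Sort a permutation of the triplets by (row, col).
    std::vector<std::size_t> order(rows.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
        return rows[a] != rows[b] ? rows[a] < rows[b] : cols[a] < cols[b];
    });

    Csr out;
    out.row_ptr.assign(static_cast<std::size_t>(n_rows) + 1, 0);
    int last_row = -1, last_col = -1;
    for (std::size_t k : order) {
        if (rows[k] == last_row && cols[k] == last_col) {
            out.values.back() += vals[k];    // duplicate: accumulate
        } else {
            out.col_ind.push_back(cols[k]);  // new unique entry
            out.values.push_back(vals[k]);
            ++out.row_ptr[static_cast<std::size_t>(rows[k]) + 1];
            last_row = rows[k];
            last_col = cols[k];
        }
    }
    // Turn the per-row counts into row offsets.
    std::partial_sum(out.row_ptr.begin(), out.row_ptr.end(), out.row_ptr.begin());
    return out;
}
```

The number of stored entries is only known after the duplicates have been merged (`values.size()` here), which is why the output arrays cannot be sized from the COO length alone.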

My computer environment is VS 2015 community and Intel composer XE2018 update1.

Thanks in advance.

Eric

↧

call to ZHEGV failed

April 14, 2020, 8:29 am

Hello,

I am troubleshooting a warning from the VASP plane-wave electronic structure code. The application version is 5.4.4, built against ScaLAPACK.

The warning is of the form

WARNING in EDDRMM: call to ZHEGV failed, returncode =   6  3     16

With the test case I have, the returncode can also be "8 4 16".
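As background, the LAPACK documentation gives the INFO value of ZHEGV a meaning that depends on the matrix order n. The helper below is a small sketch of that rule; `describe_zhegv_info` is a made-up name, and it is only an assumption that the first number in the warning is INFO. The post does not say what the other two numbers mean.

```cpp
#include <cassert>
#include <string>

// Interpret the INFO value returned by LAPACK ZHEGV for a problem of order n.
std::string describe_zhegv_info(int info, int n) {
    assert(n > 0);
    if (info == 0) return "success";
    if (info < 0)
        return "argument " + std::to_string(-info) + " had an illegal value";
    if (info <= n)
        return "ZHEEV failed to converge: " + std::to_string(info) +
               " off-diagonal elements of an intermediate tridiagonal form"
               " did not converge to zero";
    if (info <= 2 * n)
        return "leading minor of order " + std::to_string(info - n) +
               " of B is not positive definite";
    return "INFO outside the documented range";
}
```

Whether a value such as 6 falls in the convergence range or the "B not positive definite" range therefore depends on n, which would need to be checked in the calling code.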

Operating system and version: CentOS 7.4
Library version: MKL 2019.5
Compiler version: Intel 2019.5 mpiifort
GNU Compiler Collection (GCC)* or Microsoft Visual Studio* version (if applicable): GCC 8.2.0 (underneath the Intel installation)
Steps to reproduce the error (include makefiles, command lines, small test cases, and build instructions): I can send our makefile.include and input deck under separate cover if required.
Working compiler, tool, or library version, and accelerator driver version (for regressions): The warning has been seen since Intel Parallel Studio 2018 and perhaps earlier.

MKL ldd links:

	libmkl_intel_lp64.so => /nopt/nrel/apps/compilers/intel/2019.5/mkl/lib/intel64/libmkl_intel_lp64.so (0x00007fd8ee5d6000)
	libmkl_cdft_core.so => /nopt/nrel/apps/compilers/intel/2019.5/mkl/lib/intel64/libmkl_cdft_core.so (0x00007fd8ee3ae000)
	libmkl_scalapack_lp64.so => /nopt/nrel/apps/compilers/intel/2019.5/mkl/lib/intel64/libmkl_scalapack_lp64.so (0x00007fd8edaa5000)
	libmkl_blacs_intelmpi_lp64.so => /nopt/nrel/apps/compilers/intel/2019.5/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so (0x00007fd8ed863000)
	libmkl_sequential.so => /nopt/nrel/apps/compilers/intel/2019.5/mkl/lib/intel64/libmkl_sequential.so (0x00007fd8ec24a000)
	libmkl_core.so => /nopt/nrel/apps/compilers/intel/2019.5/mkl/lib/intel64/libmkl_core.so (0x00007fd8e7f18000)
	libiomp5.so => /nopt/nrel/apps/compilers/intel/2019.5/compilers_and_libraries_2019.5.281/linux/compiler/lib/intel64/libiomp5.so (0x00007fd8e7b23000)

↧

mkl_cluster_sparse_solver.f90 missing

April 15, 2020, 5:42 am

I'm running parallel_studio_xe_2020.0.088 on CentOS 7. I'm trying to compile the PARDISO example for complex unsymmetric matrices using the following:

make libintel64 mpi=intelmpi compiler=gnu interface=lp64 ompthreads=8 mpidir=/opt/intel/impi/2019.6.166/intel64/bin examples=cl_solver_complex_unsym

I get the following output

make ext=a  run
make[1]: Entering directory `/opt/intel/compilers_and_libraries_2020.0.166/linux/mkl/examples/cluster_sparse_solverf'
make[1]: *** No rule to make target `/opt/intel/compilers_and_libraries_2020.0.166/linux/mkl/include/mkl_cluster_sparse_solver.f90', needed by `_results/gnu_intelmpi_lp64_intel64_a/mkl_cluster_sparse_solver.o'.  Stop.
make[1]: Leaving directory `/opt/intel/compilers_and_libraries_2020.0.166/linux/mkl/examples/cluster_sparse_solverf'
make: *** [libintel64] Error 2

Indeed when I look in /opt/intel/compilers_and_libraries_2020.0.166/linux/mkl/include/, the file mkl_cluster_sparse_solver.f90 is not there. I have an older 2018 version of the suite, and it's not there either.

What am I missing?

↧

about non-zeros distribution used by the mkl_sparse_?_mv function.

April 15, 2020, 6:31 am

Hi all,

I am using the sparse matrix-vector multiplication operation in the MKL library.

I started with a CSR representation (the classical three arrays of the CSR format) and use the mkl_sparse_d_create_csr() function to create a "sparse_matrix_t" handle. Then I ran the mkl_sparse_optimize () function using the handle, and finally the mkl_sparse_d_mv() function for the desired operation.

It works. So far so good. The answers I am getting are correct.

I am able to control the number of threads used in the solution by setting the environment variable "OMP_NUM_THREADS". This also works as expected.

My question is:

How is the sparse matrix distributed among the threads?

Is the distribution based on a similar number of rows per thread?

Is it based on a similar number of non-zeros per thread?

Or something else?

One more question: Can the user manipulate the distribution?
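To make the two candidate strategies concrete, here is a sketch of how a CSR matrix could be split either way. It does not describe what MKL does internally, which the post does not establish; `split_by_rows` and `split_by_nnz` are illustrative names, and zero-based CSR row pointers are assumed.

```cpp
#include <cassert>
#include <vector>

// Split rows into `parts` contiguous blocks with the same number of rows
// (up to rounding). Returns parts + 1 row boundaries.
std::vector<int> split_by_rows(int n_rows, int parts) {
    assert(parts > 0 && n_rows >= 0);
    std::vector<int> bounds(parts + 1);
    for (int p = 0; p <= parts; ++p)
        bounds[p] = static_cast<int>(static_cast<long long>(n_rows) * p / parts);
    return bounds;
}

// Split rows into `parts` contiguous blocks holding roughly the same number
// of non-zeros, using the CSR row pointer (zero-based, size n_rows + 1).
std::vector<int> split_by_nnz(const std::vector<int>& row_ptr, int parts) {
    assert(parts > 0 && !row_ptr.empty());
    const int n_rows = static_cast<int>(row_ptr.size()) - 1;
    const long long nnz = row_ptr.back();
    std::vector<int> bounds(parts + 1, n_rows);
    bounds[0] = 0;
    int row = 0;
    for (int p = 1; p < parts; ++p) {
        const long long target = nnz * p / parts;
        // Advance to the first row whose starting offset reaches the target.
        while (row < n_rows && row_ptr[row] < target) ++row;
        bounds[p] = row;
    }
    return bounds;
}
```

For a matrix with one dense row and many short rows, the two functions return visibly different boundaries, which is the situation where the choice of strategy matters for load balance.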

Thanks

↧

MKL - which dependencies to include in the distribution package?

April 15, 2020, 10:35 pm

Hi everyone,

Is there a systematic way to determine which libraries to include in an application's distribution package?

Up to now, I have been proceeding by trial and error, adding the libs when a user complains that they are missing on his system. For instance for win64, the end result is a package which includes

libiomp5md.dll
mkl_avx2.dll
mkl_core.dll
mkl_def.dll
mkl_intel_thread.dll
mkl_mc3.dll
mkl_sequential.dll

The app itself uses such functions as LAPACKE_dgetrs, _sgetrs, _dgetrf, _sgetrf, _dgesv, _dgels etc.

Sometimes a library seems to be missing for one user but not for another. This happens for instance on macOS.

Couldn't find relevant documentation about this, so any help will be greatly appreciated.

Thanks.

↧

MKL - why beta?

April 15, 2020, 11:06 pm

Hi everyone,

Until recently, I had been packaging my app with libraries dating back to 2018. These were wonderfully stable.

Recently however, due to a change of hardware, I installed mkl 2021.1-beta03 and distributed the app with these updated libraries. As a result, some users are now complaining of program crashes in the calls to the libraries.

So I have upgraded to beta05, repackaged the app, and I'm crossing my fingers...

The question is: why does this latest distribution have the attribute "beta" and does an LTS, i.e., stable version, exist?

Thanks.

↧

cblas_gemm_s8s8s32 not supported?

April 16, 2020, 1:12 am

Dear sir,

I have a question:

I read the oneDNN code, and it seems that cblas_gemm_s8s8s32() is not implemented, only cblas_gemm_s8u8s32. Why?

Is it because the Intel ISA (AVX2?) has no special instructions that can execute the multiply or add operations when the two vectors have the same data type (either s8/s8 or u8/u8)?
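For context, the x86 byte multiply-add instructions (`vpmaddubsw` in AVX2, `vpdpbusd` in AVX-512 VNNI) take one unsigned and one signed 8-bit operand. A same-signed product can still be built on top of a mixed-sign kernel by shifting one operand and subtracting a compensation term. The scalar sketch below shows only the arithmetic; it is not oneDNN or MKL code, the function names are made up, and it accumulates directly in 32 bits, so it does not model the 16-bit saturation of `vpmaddubsw`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Dot product of unsigned 8-bit and signed 8-bit vectors with 32-bit
// accumulation: the operand combination a mixed-sign kernel provides.
std::int32_t dot_u8s8(const std::vector<std::uint8_t>& a,
                      const std::vector<std::int8_t>& b) {
    assert(a.size() == b.size());
    std::int32_t acc = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        acc += static_cast<std::int32_t>(a[i]) * static_cast<std::int32_t>(b[i]);
    return acc;
}

// s8*s8 dot product expressed through the u8*s8 kernel:
// sum(a*b) = sum((a + 128)*b) - 128 * sum(b), where a + 128 fits in u8.
std::int32_t dot_s8s8_via_u8s8(const std::vector<std::int8_t>& a,
                               const std::vector<std::int8_t>& b) {
    assert(a.size() == b.size());
    std::vector<std::uint8_t> shifted(a.size());
    std::int32_t sum_b = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        shifted[i] = static_cast<std::uint8_t>(static_cast<int>(a[i]) + 128);
        sum_b += b[i];
    }
    return dot_u8s8(shifted, b) - 128 * sum_b;
}
```

The extra pass that computes the compensation term is one plausible reason a library might expose only the mixed-sign variant and leave the shift to the caller.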

Thank you!

↧

random malloc error in d_commit_trig_transform

April 17, 2020, 1:40 pm

Greetings,

I'm experiencing a random malloc error in d_commit_trig_transform. It's persistent, but happens at different times during the code execution as this routine is called repeatedly. I'm currently testing on a Mac using the latest available Intel C++ compiler and MKL versions. The routine that contains the d_commit_trig_transform call is given below. The error occurs regardless of the number of cores used, but usually after a few thousand calls. Does anything look suspicious in the code below? Any advice would be appreciated.

void TwoDCylRZPotSolver::RHSVectorDST()
{
    /*
    This method performs the discrete sine transform of the first kind (DST-I) on
    rhsvector in preparation to solve the linear tridiagonal system. The transform
    is performed in chunk sizes of nz. Due to the manner in which the DST is
    calculated, an input array (a) of size nz+2 must be used with a[0]=a[nz+1]=0,
    and a[1 to nz]=data. A normalization factor of sqrt(2/(nz+1)) must be applied
    when copying the transformed data back into rhsvector.
    */

    double normfac=sqrt(2/double(nz+1));

#pragma omp parallel for
    for(int i=0; i<nrad; i++)
    {
        int error, ipar[128],n=nz+1,tt_type=0;
        double dpar[5*(nz+2)/2+2];
        DFTI_DESCRIPTOR_HANDLE handle = 0; //data structures used in transform

        double datatemp[nz+2];
        datatemp[0]=0;
        datatemp[nz+1]=0;

        d_init_trig_transform(&n,&tt_type,ipar,dpar,&error);
        d_commit_trig_transform(datatemp,&handle,ipar,dpar,&error);

        //copy data from rhsvector
        for(int j=0; j<nz; j++)
            datatemp[j+1]=rhsvector[i*nz+j];

        //perform transformation
        d_backward_trig_transform(datatemp,&handle,ipar,dpar,&error);

        //copy transformed data back to rhsvector
        for(int j=0; j<nz; j++)
            rhsvector[i*nz+j]=normfac*datatemp[j+1];

        free_trig_transform(&handle,ipar,&error);

        if(error != 0)
            cout<<"Error = "<<error<<" in free_trig_transform in method RHSVectorDST."<<endl;
    }
}
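When chasing a failure like this, it can help to have a slow reference transform that does not touch MKL at all, so that results can be compared on small inputs. The following is a naive O(n^2) sketch of the normalised DST-I described in the comment above; `dst1_normalised` is a made-up helper and is not part of the original code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Naive O(n^2) DST-I with the sqrt(2/(n+1)) normalisation:
//   X[k] = sqrt(2/(n+1)) * sum_{j=1..n} x[j] * sin(pi*j*k/(n+1)),  k = 1..n.
// Input and output are stored zero-based, so element j-1 holds x[j].
std::vector<double> dst1_normalised(const std::vector<double>& x) {
    assert(!x.empty());
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    const double norm = std::sqrt(2.0 / static_cast<double>(n + 1));
    std::vector<double> out(n, 0.0);
    for (std::size_t k = 1; k <= n; ++k) {
        double sum = 0.0;
        for (std::size_t j = 1; j <= n; ++j)
            sum += x[j - 1] * std::sin(pi * static_cast<double>(j * k) /
                                       static_cast<double>(n + 1));
        out[k - 1] = norm * sum;
    }
    return out;
}
```

With this normalisation the transform is its own inverse, so applying it twice to a chunk of length nz should return the original data up to rounding.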

↧

Calling 'pbtrf' and 'pbtrs' directly from a C# .Net Core library

April 20, 2020, 9:23 am

I'm currently porting a small Fortran FE solver to C#. The solver uses MKL, and I'm trying to get the best of both worlds by calling the MKL functions in question directly from C#.
According to Intel's documentation, this can be done by using the DllImport statement in C# and calling the relevant function in mkl_rt.dll directly. This Intel tutorial gives a short description of how this can be done, and even provides some C# code examples.

The examples provided compile on my computer. What I want to do is basically the same, only targeting the functions 'pbtrf' and 'pbtrs'. But it seems these functions are not exposed from mkl_rt.dll. Using Dependency Walker, I looked into mkl_rt.dll and found that only the F77 versions are available. So I tried setting up a function call using 'dpbtrf' and 'dpbtrs' instead. These require several more arguments than the F95 versions.

About the case:

My setup is in the attached .cs file (also shown in below image). Some input is hardcoded for testing purposes. The actual case is not provided, but it is a simple static FE problem with a stiffness matrix and a load vector. Mathematically, the stiffness matrix is a band matrix of size n x n. In the code, it is written in compact form as a matrix of size (nSuperDiagonals + 1) x n, that is 4 x n. Only one right hand side is used, i.e. the load vector of length n.

The call does not throw any error messages; I get info=1 in return (not 0) and the stiffness matrix is never factorized.
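For ?pbtrf, info = i > 0 means that the leading minor of order i is not positive definite, so info = 1 says the first diagonal entry, as LAPACK reads it, is not positive. One possible cause, which cannot be confirmed without the attachment, is the storage layout: a C# `double[,]` is row-major, whereas the F77 interface expects band storage in column-major order. The sketch below shows the expected layout for uplo = 'U'; `pack_upper_band` is an illustrative name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Pack the upper triangle of a symmetric band matrix (dense, row-major,
// n x n, kd super-diagonals) into LAPACK band storage for uplo = 'U':
// a column-major (kd + 1) x n array with AB(kd + 1 + i - j, j) = A(i, j)
// in one-based notation.
std::vector<double> pack_upper_band(const std::vector<double>& dense,
                                    int n, int kd) {
    assert(n > 0 && kd >= 0);
    assert(dense.size() == static_cast<std::size_t>(n) * n);
    const int ldab = kd + 1;
    std::vector<double> ab(static_cast<std::size_t>(ldab) * n, 0.0);
    for (int j = 0; j < n; ++j)
        for (int i = std::max(0, j - kd); i <= j; ++i)
            ab[static_cast<std::size_t>(kd + i - j) +
               static_cast<std::size_t>(j) * ldab] =
                dense[static_cast<std::size_t>(i) * n + j];
    return ab;
}
```

In this layout the diagonal sits at flat offsets kd, kd + ldab, kd + 2*ldab and so on. If the first of those offsets holds a zero or a super-diagonal value, info = 1 is the expected outcome.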

I know that this specific case will run when the whole shebang is coded in Fortran on the same computer.

Any thoughts on what the issue could be?

Attachment	Size
Download LaPackCallers.zip	607 bytes

↧

Extracting U from getrf

April 21, 2020, 1:17 pm

I'm trying to use getrf to put an MxN matrix in RREF (get U, divide by leading entry), but either I'm misunderstanding the function description or something strange is going on. My understanding is that on return A contains both U and L, and IPIV contains the pivot indices (i.e. the index of the first non-zero entry in U on each row) so that you can determine where L stops and U begins within A. But the values being returned in IPIV often don't make any sense under this interpretation: the returned index often refers to a point before or after the "actual" start of U, and in some cases the values aren't even non-decreasing (e.g. [1,3,2,4,4]). I've confirmed through MATLAB that the U and L being returned in A are correct, but I can't make any sense of IPIV, and without it I'm not able to automatically extract those two matrices from the returned A.

Am I missing something in what IPIV is, or does anyone have some advice on what might be going wrong?
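In the LAPACK documentation, IPIV records row interchanges rather than column positions: row i of the matrix was swapped with row IPIV(i) during step i, so that A = P*L*U. The split between L and U is fixed by the diagonal: U is on and above it, L is strictly below it with an implicit unit diagonal. The sketch below assumes column-major storage and one-based IPIV values, as returned by the Fortran interface; the function names are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Turn the one-based IPIV array into a zero-based row permutation:
// perm[i] is the index of the original row that ends up in row i of P*A.
std::vector<int> ipiv_to_permutation(const std::vector<int>& ipiv, int m) {
    std::vector<int> perm(static_cast<std::size_t>(m));
    std::iota(perm.begin(), perm.end(), 0);
    for (std::size_t i = 0; i < ipiv.size(); ++i) {
        assert(ipiv[i] >= 1 && ipiv[i] <= m);
        // At step i, row i was interchanged with row ipiv[i] (one-based).
        std::swap(perm[i], perm[static_cast<std::size_t>(ipiv[i] - 1)]);
    }
    return perm;
}

// U: min(m,n) x n, column-major, entries on and above the diagonal of a.
std::vector<double> extract_u(const std::vector<double>& a, int m, int n, int lda) {
    assert(lda >= m && a.size() >= static_cast<std::size_t>(lda) * n);
    const int k = std::min(m, n);
    std::vector<double> u(static_cast<std::size_t>(k) * n, 0.0);
    for (int j = 0; j < n; ++j)
        for (int i = 0; i <= std::min(j, k - 1); ++i)
            u[i + static_cast<std::size_t>(j) * k] =
                a[i + static_cast<std::size_t>(j) * lda];
    return u;
}

// L: m x min(m,n), column-major, unit diagonal plus entries below it.
std::vector<double> extract_l(const std::vector<double>& a, int m, int n, int lda) {
    assert(lda >= m && a.size() >= static_cast<std::size_t>(lda) * n);
    const int k = std::min(m, n);
    std::vector<double> l(static_cast<std::size_t>(m) * k, 0.0);
    for (int j = 0; j < k; ++j) {
        l[j + static_cast<std::size_t>(j) * m] = 1.0;
        for (int i = j + 1; i < m; ++i)
            l[i + static_cast<std::size_t>(j) * m] =
                a[i + static_cast<std::size_t>(j) * lda];
    }
    return l;
}
```

Because the entries are swap records applied in sequence, they need not be non-decreasing, and IPIV is not needed to separate L from U.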

↧

Fat and Narrow matrix multiplication with "cblas_cgemm"

April 25, 2020, 11:20 pm

Hi,

I am multiplying a 40 x 40 matrix A by a 40 x 10k matrix B with the MKL function "cblas_cgemm". It takes 30 milliseconds, which I believe is too long, even though I have enabled MKL multithreading.

I have read on the internet that "MKL functions are optimized for generic matrix multiplications".

Does anybody agree or disagree with me?
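A quick way to judge the timing is to convert it into an achieved floating-point rate. The helper below is a sketch that counts a complex multiply-add as 8 real operations and takes "10k" to mean 10,000 columns; both are assumptions, and `complex_gemm_gflops` is a made-up name.

```cpp
#include <cassert>

// Achieved rate in GFLOP/s for C = A * B with complex entries, where A is
// m x k and B is k x n. Each complex multiply-add is counted as 8 real
// floating-point operations (4 multiplications and 4 additions).
double complex_gemm_gflops(long long m, long long n, long long k, double seconds) {
    assert(seconds > 0.0);
    const double flops = 8.0 * static_cast<double>(m) *
                         static_cast<double>(n) * static_cast<double>(k);
    return flops / seconds / 1e9;
}
```

For m = k = 40 and n = 10,000 this gives 128 million operations, so 30 ms corresponds to roughly 4.3 GFLOP/s, a figure that can be compared with the peak rate of the machine in question.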

Thanks in advance .

↧

Sparse-Dense Matrix Multiplication

April 28, 2020, 8:44 am

Hi,

I am working with the mkl_sparse_s_mm routine to perform:

C = A * B

where A is a sparse matrix and B and C are dense matrices. I would like some details about the algorithm for sparse-dense matrix multiplication implemented by mkl_sparse_s_mm, e.g. whether it uses cache-aware strategies or specific micro-kernel implementations to fully leverage the CPU registers.

Just to provide an example, high performance dense matrix multiplication (cblas_gemm) is usually implemented following the block pack algorithm.

Thank you

Cosimo Rullli

↧

Matrix Inversion and matrix-vector multiplication or solve linear equation for simulation

April 29, 2020, 12:48 am

Hi,

I have seen multiple times that matrix inversion is not recommended when solving linear equations and everyone says to just use a solver but I may reduce execution time significantly by inverting instead (or will I?).

I have a simulation where I could either calculate the inverse of a sparse symmetric matrix once at the beginning and then, at each time step, multiply it by the new vector to solve the system,

or just keep the original sparse matrix and solve the linear system at each time step, even though the matrix doesn't change.

My matrix-vector has: n ~= 20,000 and simulation has approx 10^7 time steps. So what is the optimum method?

I found pardiso to solve the system unless someone has a better recommendation.
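A third option sits between the two: factor the matrix once and reuse the factors for every right-hand side. With PARDISO this corresponds to running the analysis and factorisation phases (11 and 22) once and only the solve phase (33) per time step. It also avoids forming the inverse, which for a sparse matrix is generally dense: 20,000 x 20,000 doubles is about 3.2 GB. The dense toy below only illustrates the pattern and is not meant for a matrix of that size; `LuFactors`, `lu_factor` and `lu_solve` are made-up names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Dense LU factors with partial pivoting, stored row-major.
struct LuFactors {
    int n;
    std::vector<double> lu;  // L below the diagonal (unit diagonal implied), U on and above
    std::vector<int> piv;    // piv[k]: row swapped with row k at step k
};

// Factor once: this is the expensive step.
LuFactors lu_factor(std::vector<double> a, int n) {
    assert(n > 0 && a.size() == static_cast<std::size_t>(n) * n);
    LuFactors f;
    f.n = n;
    f.lu = std::move(a);
    f.piv.assign(static_cast<std::size_t>(n), 0);
    auto at = [&](int i, int j) -> double& {
        return f.lu[static_cast<std::size_t>(i) * n + j];
    };
    for (int k = 0; k < n; ++k) {
        int p = k;
        for (int i = k + 1; i < n; ++i)
            if (std::fabs(at(i, k)) > std::fabs(at(p, k))) p = i;
        assert(at(p, k) != 0.0);  // singular matrices are not handled here
        f.piv[k] = p;
        if (p != k)
            for (int j = 0; j < n; ++j) std::swap(at(k, j), at(p, j));
        for (int i = k + 1; i < n; ++i) {
            at(i, k) /= at(k, k);
            for (int j = k + 1; j < n; ++j) at(i, j) -= at(i, k) * at(k, j);
        }
    }
    return f;
}

// Solve for one right-hand side: cheap, and repeated every time step.
std::vector<double> lu_solve(const LuFactors& f, std::vector<double> b) {
    const int n = f.n;
    assert(b.size() == static_cast<std::size_t>(n));
    auto at = [&](int i, int j) {
        return f.lu[static_cast<std::size_t>(i) * n + j];
    };
    for (int k = 0; k < n; ++k) std::swap(b[k], b[f.piv[k]]);  // apply P
    for (int i = 1; i < n; ++i)                                // forward: L y = P b
        for (int j = 0; j < i; ++j) b[i] -= at(i, j) * b[j];
    for (int i = n - 1; i >= 0; --i) {                         // backward: U x = y
        for (int j = i + 1; j < n; ++j) b[i] -= at(i, j) * b[j];
        b[i] /= at(i, i);
    }
    return b;
}
```

In a time-stepping loop, `lu_factor` would be called once before the loop and `lu_solve` once per step with the new vector.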

Using MKL 2016.4 on a cluster so I could request more CPUs but my code isn't parallelized.

Thanks for your time

↧

Intel MKL Library for Linux (`libmkl_rt.so`) Missing `SONAME`

April 29, 2020, 3:14 am

There is an issue with the Intel MKL RT library file on Linux (`libmkl_rt.so`).

Their `soname` isn't defined.
Please have a look at https://github.com/JuliaSparse/Pardiso.jl/issues/69#issuecomment-620898554.

Any chance to fix it in the next update?

↧

Which subroutine can achieve the same result as matlab's mldivide?

April 30, 2020, 2:30 am

I know that DGETRF and DGETRI are used to invert large matrices.

However, that is not the same as MATLAB's mldivide.

I want to know which subroutine achieves the same result as MATLAB's mldivide.

Thanks for the help.

S. Kim

↧

Cloudera CDH 5.16 MKL Parcel offline install

April 30, 2020, 9:13 am

Good afternoon, Folks

Can anyone tell me from where I can download the MKL parcel for Cloudera CDH 5.16 for an offline installation?

The link http://parcels.repos.intel.com/mkl/latest and its version-specific variants seem to be accessible only from a Cloudera Manager.

Our Cloudera cluster is on-prem and has no access to the internet, so an offline install is necessary.

Kind Regards

Mark

↧

Pardiso example tutorial.

May 2, 2020, 2:56 am

Hello !!

Can I get an example tutorial file for the pardiso subroutine?

Ultimately, I want to use the pardiso subroutine as a solver for structural analysis.

So I just want to get any example file for understanding how to use pardiso. (symmetric case and unsymmetric case)

Thanks all,

S. Kim

↧