Channel: Intel® Software - Intel® oneAPI Math Kernel Library & Intel® Math Kernel Library

ATTENTION: Read if you cannot install from deb feed: The APT distribution key has expired today


The key listed on the instruction page

https://software.intel.com/en-us/articles/installing-intel-free-libs-and...

specifically, this file:

wget https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS...

is old and has just expired. There will be a million failures; do not panic. I hope Intel will fix this very quickly.

# curl -Ss https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB | gpg -
gpg: WARNING: no command supplied.  Trying to guess what you mean ...
pub   rsa2048 2016-09-28 [SC] [expired: 2019-09-27]  <<<=== THIS
      BF4385F91CA5FC005AB39E1C1A8497B11911E097
uid           "CN = Intel(R) Software Development Products", O=Intel Corporation

unable to find @rpath/libiomp5.dylib


I am new to MKL on Mac. 

I am running code in Fortran as follows: 

ifort -o name.x -fast -mkl program.f90

and I get the following error 

ipo: warning #11012: unable to find @rpath/libiomp5.dylib

I tried looking online for a solution, but I couldn't find anything straightforward.

The closest I got to a solution is the following, which omits the '-fast' option:

ifort -o name.x -mkl program.f90 -Wl,-rpath,${MKLROOT}/lib -Wl,-rpath,$MKLROOT/../compiler/lib/

This works, but when I do the following I get the same error:

ifort -o name.x -fast -mkl program.f90 -Wl,-rpath,${MKLROOT}/lib -Wl,-rpath,$MKLROOT/../compiler/lib/

ipo: warning #11012: unable to find @rpath/libiomp5.dylib

How can I add the path?

Diagonalization of symmetric matrices - form of matrix?


I would like to compute the eigenvalues of a symmetric matrix and wanted to use the LAPACKE_dsyev function from the MKL Library in C++ for that.

From the documentation https://software.intel.com/en-us/mkl-developer-reference-c-syev, I concluded that I would have to pass only the upper/lower triangular part of the matrix. It says about the argument that it "is an array containing either upper or lower triangular part of the symmetric matrix A".

However, it seems that actually one needs to pass the pointer to the full matrix to the routine. Say I want to diagonalize the following matrix:

[[-2,    0,    0.5,  0  ],
 [ 0,    0.5, -2,    0.5],
 [ 0.5, -2,    0.5,  0  ],
 [ 0,    0.5,  0,   -1  ]],
which has eigenvalues [2.56, -2.22, -1.53, -0.81].

Then in the following code, only the first option gives the correct values.

#include <iostream>
#include"mkl_lapacke.h"
using namespace std;
int main(){
	MKL_INT N = 4;
	double matrix_ex_full[16] = {-2,0,0.5,0,0,0.5,-2,0.5,0.5, -2, 0.5, 0, 0,0.5,0,-1};
	double evals_full[4];
// Pass over the full matrix
	MKL_INT test1 = LAPACKE_dsyev(LAPACK_ROW_MAJOR, 'N', 'U', N, matrix_ex_full,N, evals_full);
	cout << "success = "<<test1 << endl;
	for (MKL_INT i = 0;i<4;i++)
		cout << evals_full[i] << endl;
// Pass only the upper triagonal
	double matrix_ex_uppertri[10] = {-2, 0, 0.5, 0, 0.5, -2, 0.5, 0.5, 0, -1};
	double evals_uppertri[4];
	MKL_INT test2 = LAPACKE_dsyev(LAPACK_ROW_MAJOR, 'N', 'U', N, matrix_ex_uppertri,N, evals_uppertri);
	cout << "success = "<<test2 << endl;
	for (MKL_INT i = 0;i<4;i++)
		cout << evals_uppertri[i] << endl;

}
//Compiled with g++ test.cpp -o main -m64 -I/share/opt/intel/compilers_and_libraries_2018.0.128/linux/mkl/include -L/share/opt/intel/compilers_and_libraries_2018.0.128/linux/mkl/lib/intel64 -Wl,--no-as-needed -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core -lgomp -lpthread -lm -ldl

I guess I am missing something obvious here, but why, if the full matrix has to be given anyway, is it necessary at all to specify 'U' or 'L'? Or am I doing something wrong elsewhere?
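For what it is worth, one way to see what LAPACKE_dsyev actually reads is to keep the full N-by-N row-major array but overwrite the strictly lower triangle with garbage before calling it with uplo = 'U'. The minimal sketch below reuses the matrix from above and introduces nothing new except the junk value 999; the eigenvalues come out identical, which is consistent with dsyev expecting full (not packed) storage and simply ignoring the triangle that was not selected. As far as I can tell, the packed 10-element layout tried in the second call above is what the packed-storage routines such as LAPACKE_dspev expect instead.

#include <iostream>
#include "mkl_lapacke.h"

int main() {
	const MKL_INT N = 4;
	// Full row-major symmetric matrix from the question.
	double a_full[16] = {-2, 0, 0.5, 0,
	                      0, 0.5, -2, 0.5,
	                      0.5, -2, 0.5, 0,
	                      0, 0.5, 0, -1};
	// Same matrix, but the strictly lower triangle is overwritten with junk.
	double a_junk_lower[16];
	for (int i = 0; i < 16; ++i) a_junk_lower[i] = a_full[i];
	for (int i = 0; i < 4; ++i)
		for (int j = 0; j < i; ++j)
			a_junk_lower[i * 4 + j] = 999.0;   // never read when uplo = 'U'

	double w1[4], w2[4];
	MKL_INT info1 = LAPACKE_dsyev(LAPACK_ROW_MAJOR, 'N', 'U', N, a_full, N, w1);
	MKL_INT info2 = LAPACKE_dsyev(LAPACK_ROW_MAJOR, 'N', 'U', N, a_junk_lower, N, w2);
	std::cout << "info = " << info1 << ", " << info2 << std::endl;
	for (int i = 0; i < 4; ++i)
		std::cout << w1[i] << "  vs  " << w2[i] << std::endl;  // identical eigenvalues
	return 0;
}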

Thanks for any help, and apologies if this question is trivial (which I have the feeling it must be), I am rather new to using MKL.

Traceback MKL ERROR (Environmental variable?)


Is there an MKL environment variable that will make all errors fatal? I would like to use it to find the location of an error in a large code. (Currently the code completes correctly; the error does not matter.)
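There may not be a single environment variable for this (I am not aware of one), but for the input-consistency errors that MKL reports through xerbla one possible workaround is to install your own handler with mkl_set_xerbla and abort inside it, so the first reported error becomes fatal and shows the call site under a debugger or in a core dump. A rough sketch, assuming the errors in question are xerbla-reported parameter errors; fatal_xerbla is a hypothetical handler name:

#include <stdio.h>
#include <stdlib.h>
#include "mkl.h"

/* Hypothetical fatal handler: called by MKL when a routine detects an
   invalid argument (the errors normally just printed by xerbla).      */
static void fatal_xerbla(const char *name, const int *num, const int len)
{
    fprintf(stderr, "MKL error: parameter %d of %.*s is invalid\n", *num, len, name);
    abort();   /* the core dump / debugger break points at the offending call */
}

int main(void)
{
    mkl_set_xerbla(fatal_xerbla);
    /* ... run the application; the first xerbla-reported error now aborts ... */
    return 0;
}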

cluster sparse solver returns error = -1


I can't understand why I get the "input inconsistent" error when I run the code that I am attaching. I am using a toy example to check it, and that works for P = 4, where P is the number of MPI processes.
I set: 
iparm[34] = 1;// indexing from 0
iparm[36] = 0; //csr format 
iparm[39] =  2; //matrix distributed

In the csrA_rand.txt file, the first element is n, the number of rows and columns of the sparse matrix A. Process 0 reads n and broadcasts it to all other processes. The first n%P processes have (n / P) + 1 rows each, whereas the remaining P - (n%P) processes have (n / P) rows each. In this case the matrix size n is 48, so all processes have the same number of rows (12). Since indexing starts from 0, all the local ia arrays are such that ia[0] = 0 and ia[myn] = nA. All processes read their portions of a, ja and ia correctly, or at least that is how it seems to me.
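As a sanity check of that partitioning, here is a tiny stand-alone sketch (local_rows is a hypothetical helper, not taken from the attached code) that computes each rank's first row and row count according to the description above:

#include <stdio.h>

/* Hypothetical helper mirroring the partitioning described above:
   the first n%P ranks own n/P + 1 rows, the remaining ranks own n/P rows. */
static void local_rows(int n, int P, int rank, int *first, int *count)
{
    int base = n / P, extra = n % P;
    *count = base + (rank < extra ? 1 : 0);
    *first = rank * base + (rank < extra ? rank : extra);  /* 0-based first row */
}

int main(void)
{
    int n = 48, P = 4;   /* the case from the post: every rank should get 12 rows */
    for (int rank = 0; rank < P; ++rank) {
        int first, count;
        local_rows(n, P, rank, &first, &count);
        /* With iparm[34] = 1 (0-based indexing) each local ia must then satisfy
           ia[0] = 0 and ia[count] = local nnz, as described above. */
        printf("rank %d: rows %d..%d (%d rows)\n", rank, first, first + count - 1, count);
    }
    return 0;
}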

*** Edit, found an error when generating ja

Incorrect band matrix storage description


In the MKL C manual (page 399 of the PDF), the row-major layout formula for band storage reads: "row major layout: k(i, j) = (i - j)*ldab + kl + j - 1; 1 ≤ i ≤ m, max(1, i - kl) ≤ j ≤ min(n, i + ku)". Take row i = 1 and column j = i + ku, with kl = ku = k (an equal number of bands) and ldab = kl + ku + 1 as suggested earlier; then k(i, j) = (1 - (1 + k))*(2k + 1) + k + (1 + k) - 1 = k - 2k^2, which is negative for all k > 0. Please adjust the formula to make it correct. Thank you!

mkl_sparse_sp2m: Conditional jump or move

struct matrix_descr descrA;
struct matrix_descr descrB;

descrA.type = SPARSE_MATRIX_TYPE_GENERAL;
descrB.type = SPARSE_MATRIX_TYPE_GENERAL;

std::cout << mkl_sparse_sp2m(SPARSE_OPERATION_TRANSPOSE, descrA, A, SPARSE_OPERATION_TRANSPOSE, descrB, B, SPARSE_STAGE_FULL_MULT, &C) << std::endl;

Assume there are two csr matrices A and B and one declared csr matrix handle C. The code above returns status 0 (SPARSE_STATUS_SUCCESS). However, Valgrind reports a "Conditional jump or move depends on uninitialised value(s)" error. If I create the C matrix with dummy csr arrays I do not get the error, but then that memory is leaked. Is this a bug? I'm using MKL 2019 Update 5.
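For comparison, here is a minimal self-contained sketch of the same call with tiny made-up 2x2 matrices (all sizes and values are illustrative), where C starts out as a plain NULL handle and is destroyed afterwards. It also initializes all three fields of each matrix_descr rather than only the type field, in case the uninitialised-value report comes from the unused mode/diag members rather than from C itself; that is only a guess.

#include <stdio.h>
#include "mkl.h"
#include "mkl_spblas.h"

int main(void)
{
    /* Two tiny CSR matrices, zero-based indexing: A = I, B = diag(2, 3). */
    MKL_INT ia[3] = {0, 1, 2}, ja[2] = {0, 1};
    double  va[2] = {1.0, 1.0};
    MKL_INT ib[3] = {0, 1, 2}, jb[2] = {0, 1};
    double  vb[2] = {2.0, 3.0};

    sparse_matrix_t A = NULL, B = NULL, C = NULL;   /* C is just a NULL handle */
    mkl_sparse_d_create_csr(&A, SPARSE_INDEX_BASE_ZERO, 2, 2, ia, ia + 1, ja, va);
    mkl_sparse_d_create_csr(&B, SPARSE_INDEX_BASE_ZERO, 2, 2, ib, ib + 1, jb, vb);

    /* Fully initialized descriptors (mode and diag are ignored for GENERAL). */
    struct matrix_descr dA = { SPARSE_MATRIX_TYPE_GENERAL, SPARSE_FILL_MODE_LOWER, SPARSE_DIAG_NON_UNIT };
    struct matrix_descr dB = { SPARSE_MATRIX_TYPE_GENERAL, SPARSE_FILL_MODE_LOWER, SPARSE_DIAG_NON_UNIT };

    sparse_status_t st = mkl_sparse_sp2m(SPARSE_OPERATION_TRANSPOSE, dA, A,
                                         SPARSE_OPERATION_TRANSPOSE, dB, B,
                                         SPARSE_STAGE_FULL_MULT, &C);
    printf("sp2m status = %d\n", (int)st);

    /* Export and print C = A^T * B^T so the result memory is actually touched. */
    sparse_index_base_t base;
    MKL_INT rows, cols, *rs, *re, *ci;
    double *cv;
    mkl_sparse_d_export_csr(C, &base, &rows, &cols, &rs, &re, &ci, &cv);
    for (MKL_INT i = 0; i < rows; ++i)
        for (MKL_INT k = rs[i]; k < re[i]; ++k)
            printf("C(%lld,%lld) = %g\n", (long long)i, (long long)ci[k], cv[k]);

    mkl_sparse_destroy(A);
    mkl_sparse_destroy(B);
    mkl_sparse_destroy(C);
    return 0;
}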

error on description page


There is a line of code on the two-stage-algorithm-for-inspector-executor-sparse-blas-routines page that seems to be incorrect:

 

status = mkl_sparse_x_export_csr ( csrC, &indexing, &rows, &cols, &rows_start, &rows_end, &col_indx, &values);

MKL_INT nnz = rows_end[rows] - rows_start[0];

"rows_end[rows]" accesses uninitialized space. "rows_end[rows - 1]" addresses the last element.


Where is the preconditioning of coefficient matrix -A in FGMRES


Hello, everyone.

I want to use the ILUT-preconditioned FGMRES RCI solver for a large Poisson equation in a lab CFD code based on non-uniform Cartesian grids and a standard 7-point discretization scheme. I have read the MKL Developer Reference and the example dcsrilut_exampl2.f90, and I understand the RCI mechanism, the GMRES method and ILU well, but I am still confused about the flow of the preconditioned FGMRES method.

As I understand it, to use ILUT+FGMRES the user first generates a CSR matrix for both A (csrA) and the preconditioner B (csrL). Next, MKL invokes the preconditioned version of FGMRES when ipar(11) = 1 and performs an additional matrix (csrL) - vector step (RCI_request = 3) to precondition the rhs, which probably corresponds to B.inv() * b in GMRES.

What confuses me is: where is the preconditioning step for the coefficient matrix A? Shouldn't there be a B.inv() * A step in the RCI_request = 3 branch (it is not clear to me whether MKL uses right or left preconditioning)? Actually, I was expecting some code like:

                    call mkl_sparse_d_spmm(op, B.inv(), csrA, preconditioned_A_CSR)

following the preconditioning of the vector tmp(ipar(22)).

I also notice that the RCI_request = 1 branch performs a set of A*v_i operations with csrA as the only input, not csrA and csrL. So how the preconditioning of A is handled in FGMRES is not clear to me at all.

My guess is that MKL preconditions A automatically after preconditioning the vector tmp(ipar(22)), and then overwrites the original A matrix under the same variable name, which would explain why the RCI_request = 1 part remains unchanged in the preconditioned version. Can anyone confirm this or explain it to me? Thanks a lot!
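For reference, below is a minimal C sketch of the preconditioned FGMRES RCI loop, patterned on the MKL dfgmres examples. The tiny dense matrix, my_matvec and my_apply_prec (a Jacobi diagonal solve) are made-up placeholders for csrA, the CSR matrix-vector product and the ILUT triangular solves; they are not from the reference code. The point it tries to illustrate: FGMRES is a right-preconditioned method, so (as far as I understand the RCI flow) at RCI_request = 3 the user applies B.inv() to a Krylov basis vector that MKL provides (tmp(ipar(22)) in Fortran terms), writes the result to tmp(ipar(23)), and MKL later pushes that result through the ordinary A*v step at RCI_request = 1. The product B.inv()*A is therefore formed implicitly, one vector at a time; neither A nor the right-hand side is modified, and no explicit B.inv()*A matrix is ever built.

#include <stdio.h>
#include "mkl.h"
#include "mkl_rci.h"

#define NN 4        /* tiny problem size, just to exercise the RCI loop */
#define TMPSZ 1024  /* generous upper bound for the dfgmres work array  */

/* Dense stand-in for the sparse matrix-vector product with csrA (request 1). */
static void my_matvec(const double *A, const double *v, double *w)
{
    for (int i = 0; i < NN; ++i) {
        w[i] = 0.0;
        for (int j = 0; j < NN; ++j) w[i] += A[i * NN + j] * v[j];
    }
}

/* Jacobi (diagonal) solve standing in for the ILUT forward/backward solves
   (request 3): w = B.inv() * v with B = diag(A).                            */
static void my_apply_prec(const double *A, const double *v, double *w)
{
    for (int i = 0; i < NN; ++i) w[i] = v[i] / A[i * NN + i];
}

int main(void)
{
    double A[NN * NN] = { 4, -1,  0,  0,
                         -1,  4, -1,  0,
                          0, -1,  4, -1,
                          0,  0, -1,  4 };
    double b[NN] = {1, 2, 3, 4}, x[NN] = {0, 0, 0, 0};

    MKL_INT n = NN, RCI_request, itercount;
    MKL_INT ipar[128];
    double dpar[128], tmp[TMPSZ];

    dfgmres_init(&n, x, b, &RCI_request, ipar, dpar, tmp);
    if (RCI_request != 0) return 1;

    ipar[8]  = 1;   /* automatic residual stopping test                  */
    ipar[9]  = 0;   /* no user-defined stopping test (no request 2)      */
    ipar[10] = 1;   /* preconditioning ON -- this is ipar(11) in Fortran */
    ipar[11] = 1;   /* automatic test for zero norm of the next vector   */

    dfgmres_check(&n, x, b, &RCI_request, ipar, dpar, tmp);
    if (RCI_request != 0) return 1;

    for (;;) {
        dfgmres(&n, x, b, &RCI_request, ipar, dpar, tmp);
        if (RCI_request == 1) {
            /* w = A * v: MKL supplies the vector; A itself is never changed */
            my_matvec(A, &tmp[ipar[21] - 1], &tmp[ipar[22] - 1]);
        } else if (RCI_request == 3) {
            /* w = B.inv() * v: the preconditioner is applied to the Krylov
               vector MKL hands over, not to A and not to the rhs            */
            my_apply_prec(A, &tmp[ipar[21] - 1], &tmp[ipar[22] - 1]);
        } else {
            break;  /* 0 = converged, negative = failure */
        }
    }

    dfgmres_get(&n, x, b, &RCI_request, ipar, dpar, tmp, &itercount);
    printf("iterations = %lld, x = %g %g %g %g\n",
           (long long)itercount, x[0], x[1], x[2], x[3]);
    MKL_Free_Buffers();
    return 0;
}

In a real code the two stand-ins would be replaced by a sparse matrix-vector product with csrA (for example mkl_sparse_d_mv) and by the two triangular solves with the ILUT factors returned by dcsrilut, as in the dcsrilut example; the structure of the loop itself does not change.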

Lin  Yang

 

 

cannot find libimf.so


Situation:

After installing Intel Composer XE 19 and setting the environment variables, the network on Ubuntu 18.04 and the printing system on OpenSUSE Leap 15.1 stop working. The error messages say that these services (cups, networking, ...) cannot find libimf.so and other Intel Composer libraries. But if I install Intel Composer XE 19 without sudo or root, only within my user environment, the system works smoothly. I asked the same question on the OpenSUSE forum and someone suggested that I shouldn't use LD_LIBRARY_PATH; some articles also say LD_LIBRARY_PATH is evil. For now I just avoid using it.

Question:

If I install Intel Composer XE 19 as administrator, how should I set the environment variables? The installation manual says to set them with "source /opt/intel/.../compilervars.sh intel64", but that causes a lot of trouble for me.

MKL function cblas_sgemv gives different results each time


Hi, I am using cblas_sgemv, but the function gives different results each time even though I have checked that the inputs are always the same. Sometimes the result is correct (about 1e-6 L2-norm error compared to the correct result). Could someone tell me why this happens?
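It might help to narrow things down with a minimal, self-contained call; the sizes, leading dimension and data below are made up purely for illustration. If this sketch is reproducible on your machine but your real call is not, the difference is most likely in how A, x, y, lda or the increments are set up (or overwritten) in the surrounding code.

#include <stdio.h>
#include "mkl.h"

int main(void)
{
    /* y = alpha*A*x + beta*y with a fixed 3x4 row-major A (illustrative data) */
    const MKL_INT m = 3, n = 4, lda = 4;
    float A[12] = { 1,  2,  3,  4,
                    5,  6,  7,  8,
                    9, 10, 11, 12 };
    float x[4] = {1, 1, 1, 1};
    float y[3];

    for (int run = 0; run < 3; ++run) {
        y[0] = y[1] = y[2] = 0.0f;
        cblas_sgemv(CblasRowMajor, CblasNoTrans, m, n,
                    1.0f, A, lda, x, 1, 0.0f, y, 1);
        printf("run %d: %g %g %g\n", run, y[0], y[1], y[2]);  /* expect 10 26 42 */
    }
    return 0;
}

If even this fixed-data version varies from run to run, the next things to look at would be the threading setup and MKL's conditional numerical reproducibility controls (the MKL_CBWR environment variable).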

 

Thanks,

Cindy

MKL's cblas_saxpy outputs incorrect results


Hi,

I need to add two arrays in an efficient way, so I tried MKL's saxpy.

When I use the cblas_saxpy function on two dummy arrays with all values initialized to 1 and 2 respectively, I get completely wrong results, and I can't figure out what's wrong with my code.

(I omitted other includes)

#include "mkl.h"

#define SIZE 10000

int main()
{

        float* buf_x = (float*) malloc(SIZE * sizeof(float));
        float* buf_y = (float*) malloc(SIZE * sizeof(float));

        for (int i = 0; i<SIZE; i++){

            buf_x[i] = 1.f;

            buf_y[i] = 2.f;

        }

        cblas_saxpy ((MKL_INT)SIZE, 1, buf_x, (MKL_INT)1, buf_y, (MKL_INT)1);

        for (int i = 0; i<SIZE; i++)

            printf("%f, ", buf_y[i]);

        return 0;
}

I get 204 as output instead of 3.

Can anybody shed some light on this?

Thank you.

pardiso_handle_store Segmentation fault


Hi,

I have been using `pardiso`, and everything works fine. I can successfully factorize a large sparse matrix and later solve systems using the factorization. Now I wish to save the factorization to a file, so I don't need to do the factorization every time I run the application. I believe `pardiso_handle_store` is the correct function to use, but I keep getting the `Segmentation fault (core dumped)` error.

The relevant code and the console output can be found in this gist: https://gist.github.com/hkalexling/f8e0a22d1a29569a4012f717de8c7798.

Any help would be appreciated!

Alex

Permutation of a large sparse matrix


Hi,

What is the fastest way of permuting a large sparse_matrix_t in csr or csc format?

I could either do manual permutations on the csr arrays or I could create a sparse permutation matrix and use the mkl_sparse_spmm method.

Neither method seems optimal: with the former I don't benefit from parallelism, and with the latter I have to create additional arrays for the permutation matrix plus a new copy of the matrix.

Also, I notice that there might be performance differences between column and row permutations depending on whether the matrix is in csr or csc format.

Is there a better way to do it?
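To make the second option concrete, here is a rough sketch with small made-up data (the 3x3 matrix, the perm array and all values are purely illustrative): it builds a row-permutation matrix P in CSR, with exactly one unit entry per row at column perm[i], and forms P*A with mkl_sparse_spmm. Whether this actually beats a hand-written permutation of the csr arrays would need to be measured.

#include <stdio.h>
#include "mkl.h"
#include "mkl_spblas.h"

int main(void)
{
    /* Small illustrative A (3x3) in CSR, zero-based indexing. */
    MKL_INT a_ia[4] = {0, 2, 3, 5};
    MKL_INT a_ja[5] = {0, 2, 1, 0, 2};
    double  a_va[5] = {1, 2, 3, 4, 5};
    sparse_matrix_t A = NULL, P = NULL, C = NULL;
    mkl_sparse_d_create_csr(&A, SPARSE_INDEX_BASE_ZERO, 3, 3, a_ia, a_ia + 1, a_ja, a_va);

    /* Row permutation perm = {2, 0, 1}: row i of the result is row perm[i] of A.
       A permutation matrix in CSR is trivial: one unit entry per row.          */
    MKL_INT perm[3] = {2, 0, 1};
    MKL_INT p_ia[4] = {0, 1, 2, 3};
    double  p_va[3] = {1, 1, 1};
    mkl_sparse_d_create_csr(&P, SPARSE_INDEX_BASE_ZERO, 3, 3, p_ia, p_ia + 1, perm, p_va);

    /* C = P * A : row i of C is row perm[i] of A. */
    sparse_status_t st = mkl_sparse_spmm(SPARSE_OPERATION_NON_TRANSPOSE, P, A, &C);
    printf("spmm status = %d\n", (int)st);

    /* Export and print the permuted matrix. */
    sparse_index_base_t base;
    MKL_INT rows, cols, *rs, *re, *ci;
    double *cv;
    mkl_sparse_d_export_csr(C, &base, &rows, &cols, &rs, &re, &ci, &cv);
    for (MKL_INT i = 0; i < rows; ++i)
        for (MKL_INT k = rs[i]; k < re[i]; ++k)
            printf("C(%lld,%lld) = %g\n", (long long)i, (long long)ci[k], cv[k]);

    mkl_sparse_destroy(A);
    mkl_sparse_destroy(P);
    mkl_sparse_destroy(C);
    return 0;
}

A column permutation could be handled the same way by multiplying on the right with the transposed permutation matrix; mkl_sparse_sp2m accepts a separate operation flag for each factor, which avoids forming the transpose explicitly.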

use of MKL spline functions strange behavior at second run time


Hi,

I have wrapped the code required to do an Akima spline interpolation in the attached source file. When I link my main application statically with MKL, it runs fine. However, with dynamic linking (i.e. when the MKL runtime libraries are needed) I see strange behavior. At first I did not know which runtime libraries were needed: the call to the dfdNewTask1D routine simply terminated my application. After several tries (running in console mode) I found that mkl_vml_avx2.dll and mkl_vml_p4.dll were required. I load these DLLs manually, my code runs fine, and I unload them afterwards. However, without exiting my app, if I run the same calculation again (after loading the same set of DLLs again), the call to dfdNewTask1D raises an exception and my application crashes (under the debugger, the call to dfdNewTask1D never returns because an exception is thrown somewhere). The runtime DLLs I load/unload at/after every run are:

  • libimalloc.dll
  • libmmd.dll
  • libifcoremd.dll
  • libifportmd.dll
  • libiomp5md.dll
  • msvcr100.dll
  • mkl_vml_avx2.dll
  • mkl_vml_p4.dll
  • mkl_core.dll
  • mkl_sequential.dll

I don't know whether I need other DLLs or what is going wrong.

Best regards,

Phil.

Attachment: AkimaSpline.f90 (3.32 KB)

Serious memory leak problem of mkl_sparse_d_add subroutine


Hi,

I'm currently programming with the new sparse interface and I'm seeing a serious memory leak: when the routine mkl_sparse_d_add is called several thousand times, it takes up all of my 64 GB of memory and the program cannot continue. I'm not sure whether other sparse routines have similar problems, but at least mkl_sparse_d_create_coo, mkl_sparse_convert_csr and mkl_sparse_d_mv do not have this problem.

Please take a look at it. Thank you very much!
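In case it helps to reproduce or rule things out, here is a small loop sketch with made-up 2x2 matrices that calls mkl_sparse_d_add repeatedly and destroys the result handle with mkl_sparse_destroy on every iteration (mkl_sparse_d_add computes C := alpha*op(A) + B and allocates C internally, so each returned handle owns memory that only mkl_sparse_destroy releases). If memory still grows with this pattern, that would point at a leak inside the routine rather than at un-released result handles.

#include <stdio.h>
#include "mkl.h"
#include "mkl_spblas.h"

int main(void)
{
    /* Two tiny CSR matrices, zero-based indexing (illustrative data). */
    MKL_INT ia[3] = {0, 1, 2}, ja[2] = {0, 1};
    double  va[2] = {1.0, 2.0};
    MKL_INT ib[3] = {0, 2, 3}, jb[3] = {0, 1, 1};
    double  vb[3] = {3.0, 4.0, 5.0};

    sparse_matrix_t A = NULL, B = NULL;
    mkl_sparse_d_create_csr(&A, SPARSE_INDEX_BASE_ZERO, 2, 2, ia, ia + 1, ja, va);
    mkl_sparse_d_create_csr(&B, SPARSE_INDEX_BASE_ZERO, 2, 2, ib, ib + 1, jb, vb);

    for (int iter = 0; iter < 10000; ++iter) {
        sparse_matrix_t C = NULL;
        sparse_status_t st = mkl_sparse_d_add(SPARSE_OPERATION_NON_TRANSPOSE,
                                              A, 1.0, B, &C);
        if (st != SPARSE_STATUS_SUCCESS) { printf("add failed: %d\n", (int)st); break; }
        mkl_sparse_destroy(C);   /* release the result handle every iteration */
    }

    mkl_sparse_destroy(A);
    mkl_sparse_destroy(B);
    MKL_Free_Buffers();          /* release MKL's internal buffers */
    printf("done\n");
    return 0;
}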


dtrnlspbc_solve solution outside the allowed range


We are using the MKL Trust-Region API, dtrnlspbc_solve and related functions.
We use the optimization with bound constraints.
The objective function we wrote has worked properly with your algorithm for years,
until a customer of ours reported a strange behaviour: a solution outside the allowed range!

We observed that if we move the initial conditions a little bit, it works! Why can this happen?
Any suggestions?

 

Thank you

Gianluca

Using MKL Features: MKL Direct Call, MKL JIT, MKL Compact API, MKL Batch API, MKL Packed API on the Single Dynamic Library

Why does a different thread count make no difference in performance?


Hi everyone,

I'm testing MKL with Visual Studio 2019 and MKL v2019.5 on an Intel i7-9750H CPU with 6 cores and 12 threads. I'm interested in the time consumed by the vector mathematics and FFT functions in MKL. As I understand it, for these two categories of functions the time consumed should decrease as the maximum thread count increases, but that doesn't happen for the vector mathematics functions. I have tested the vcMul and vcAdd functions: the time consumed hardly differs between a thread count of 1 and 6. It seems weird to me and I can't figure out a reason for it. Can anyone help? The code is attached below, thanks very much!

 

////////////////////////////////
#include <stdio.h>
#include "mkl.h"

int N = 16384;   /* vector length per call */
int M = 2000;    /* number of calls / FFTs */

//#define FFTTEST
#define CMULTEST

int main(void)
{
    double clkfreq = mkl_get_clocks_frequency();   /* CPU frequency in GHz */

    unsigned MKL_INT64 startclk, endclk;
    double time;
    int kk = 0;

    /* Execution status and descriptor for the (disabled) FFT test */
    MKL_LONG status = 0;
    DFTI_DESCRIPTOR_HANDLE hand = 0;

    //mkl_set_dynamic(0);
    //mkl_set_num_threads(1);
    int threadnum = mkl_get_max_threads();
    printf("Max threads: %d\n", threadnum);
    printf("FFT size: %d  FFT count: %d\n", N, M);

    /* Pointers to input/output data */
    MKL_Complex8* x = 0;
    MKL_Complex8* y = 0;
    x = (MKL_Complex8*)mkl_malloc(N * M * sizeof(MKL_Complex8), 64);
    y = (MKL_Complex8*)mkl_malloc(N * M * sizeof(MKL_Complex8), 64);
    MKL_Complex8* x2 = 0;
    MKL_Complex8* y2 = 0;
    x2 = (MKL_Complex8*)mkl_malloc(N * M * sizeof(MKL_Complex8), 64);
    y2 = (MKL_Complex8*)mkl_malloc(N * M * sizeof(MKL_Complex8), 64);
    if (x == NULL || y == NULL || x2 == NULL || y2 == NULL) goto failed;

    /* Fill the inputs (stands in for the original init2(x, x2) helper) */
    for (kk = 0; kk < N * M; kk++) {
        x[kk].real = 1.0f;  x[kk].imag = 2.0f;
        x2[kk].real = 3.0f; x2[kk].imag = 4.0f;
    }

    vmlSetMode(VML_EP);
    mkl_get_cpu_clocks(&startclk);
    for (kk = 0; kk < M; kk++)
    {
        vcAdd(N, &x[N * kk], &x2[N * kk], &y[N * kk]);
    }
    mkl_get_cpu_clocks(&endclk);
    time = (double)(endclk - startclk) / (clkfreq * 1e9) * 1e6 / M;   /* us per call */
    printf("Complex multiply: %f us\n", time);   /* note: this loop actually times vcAdd */

    mkl_free(x);
    mkl_free(y);
    mkl_free(x2);
    mkl_free(y2);

failed:
    return 0;
}

 
