Channel: Intel® Software - Intel® oneAPI Math Kernel Library & Intel® Math Kernel Library

MKL 2019 installed but not showing in Windows 10 Installed Programs

I installed MKL 2019 update 3 under Windows 10 64 bit, trying to use it with Visual Studio C++ 2017.

Initially the install went ok, and Intel Performance Libraries showed up under Configuration in VS2017.

But after uninstalling MKL 2019, and VS2017, and reinstalling, Performance Libraries is not showing up under Configuration, even though MKL install says it finds and integrates with VS2017.

There seems to be a problem with cleaning up the registry after uninstall of MKL 2019. Windows still thinks it is installed, and won't properly install again.

I installed MKL 2018, and it seemed to integrate with VS2017 and show up under Configuration. But there are strange issues that seem to result from Windows not uninstalling MKL 2019 properly.

How can I clean up the mess that MKL 2019 made in the registry, and reinstall it so that it works with VS2017 like it did originally?

This is extremely frustrating as I've spent 5 hours uninstalling and installing Visual Studio and different MKLs.

Please provide a working MKL 2019 uninstaller for Windows. I need to use MKL with Visual Studio, and the broken installer is making it impossible to proceed.

Thank you,
Jason


Pardiso with MPICH2 generates a program exception

Hello,

I'm using Visual Studio 2017 and Parallel Studio XE 2018 Cluster Edition.

I tried to run my code with the PARDISO solver using MPICH2. However, it triggers a "Program exception - exception code = 0x7e (126)".

Should I use the cluster version of PARDISO when running with MPICH2? If not, what should I do to resolve this error?

Thanks.

LAPACKE_zheev with matrix_layout=LAPACK_ROW_MAJOR

There seems to be a bug in the routine LAPACKE_zheev in MKL 2019 update 3. This routine computes the eigenvalues and, optionally, the eigenvectors of a Hermitian matrix 'a'. Its syntax is:

lapack_int LAPACKE_zheev ( int matrix_layout , char jobz , char uplo , lapack_int n , lapack_complex_double* a , lapack_int lda , double* w );

The problem seems to occur only for matrix_layout = LAPACK_ROW_MAJOR with jobz = 'V'. The eigenvectors are returned in 'a', overwriting it. However, the routine overwrites only the upper triangular portion of 'a' and leaves the rest of the matrix unchanged.

For example, the following code calculates the eigenvectors of the 2x2 Pauli matrix sigma_x.

filename: lapack_zheev.cpp

#include <iostream>
#include <complex>
#define MKL_Complex16 std::complex<double>
#include "mkl.h"
#include "mkl_types.h"

using namespace std;

int main() {
    complex<double> a[4];
    a[0] = 0;
    a[1] = 1;
    a[2] = 1;
    a[3] = 0;

    double eigs[2];

    int info = LAPACKE_zheev(LAPACK_ROW_MAJOR, 'V', 'U', 2, a, 2, eigs);

    cout << "Info = "<< info << endl;
    cout << a[0] << " " << a[1] << endl;
    cout << a[2] << " " << a[3] << endl;
}

I compile it with:

icpc -std=c++14 -qopenmp -DMKL_ILP64 -m64 -I${MKLROOT}/include -O3 -DNDEBUG -ansi-alias -o lapack_zheev.out lapack_zheev.cpp -Wl,--start-group ${MKLROOT}/lib/intel64/libmkl_intel_ilp64.a ${MKLROOT}/lib/intel64/libmkl_intel_thread.a ${MKLROOT}/lib/intel64/libmkl_core.a -Wl,--end-group -liomp5 -lpthread -lm -ldl

./lapack_zheev.out

The output is:

Info = 0
(-0.707107,0) (0.707107,0)
(1,0) (0.707107,0)

while it is supposed to be:

Info = 0
(-0.707107,0)   (0.707107,0)
(0.707107,0)   (0.707107,0)

Replacing LAPACK_ROW_MAJOR by LAPACK_COL_MAJOR produces the correct output.
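
The expected output above can be checked independently of MKL. What follows is only a minimal sketch in C, assuming the real-valued sigma_x from the example and its eigenvalues -1 and 1 in ascending order; the helper name eigen_residual_sq is hypothetical and not part of MKL or LAPACKE. It computes the squared residual of A v = lambda v for one column of a row-major eigenvector matrix.

```c
#include <stddef.h>

/* Squared residual ||A v - lambda v||^2 for a real n x n matrix A stored
   row-major, where v is column `col` of the row-major matrix V.
   Hypothetical helper for illustration; not an MKL routine. */
double eigen_residual_sq(const double *A, const double *V, double lambda,
                         size_t n, size_t col)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        /* r = (A v)_i - lambda * v_i */
        double r = -lambda * V[i * n + col];
        for (size_t k = 0; k < n; ++k)
            r += A[i * n + k] * V[k * n + col];
        sum += r * r;
    }
    return sum;
}
```

With A = {0, 1, 1, 0}, both columns of the expected matrix give a zero residual, while the first column of the reported matrix (-0.707107, 1) does not, which is consistent with the lower triangle not having been overwritten.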

Question about matrix storage in Zheevd

There are two choices for the argument uplo.

uplo
Must be 'U' or 'L' .
If uplo = 'U' , a stores the upper triangular part of A.
If uplo = 'L' , a stores the lower triangular part of A.

Is the call still valid if the matrix holds all the elements (both upper and lower parts)?

The matrix is overwritten by the eigenvectors on exit. I am not sure whether the output eigenvectors are still correct when the other triangular part is not initialized to zeros.

 

MKL Memory Management

I am trying to use a custom memory manager for my project. I looked at the example code given in "i_malloc.h", but it does not seem to work for me.

Here is a snippet of what I have. I am only replacing malloc here as an example, but I am replacing all four functions in my actual code.

   #include <stdio.h>
   #include <stdlib.h>
   #include <i_malloc.h>

   void * my_malloc(size_t size){
     printf("Testing my_malloc\n");
     return malloc(size);
   }

   int main( int argc, char **argv )
   {
       // override normal pointers
       i_malloc = my_malloc;

       // Some MKL calls that involve malloc inside the library calls
       return 0;
   }

 

It looks like the code is not using my_malloc for allocation. Any idea what I am missing?

 

Thanks,

Samee

Linker Problem with MKL Fast Poisson solver

Hello all,

I have written a C++ code for fluid dynamics simulations. I develop it in Qt Creator (4.2.1, based on Qt 5.8.0 (MSVC 2015, 32 bit)) on Windows 10 with mingw32.

Part of the code requires the Fast Poisson Solver (#include "mkl_poisson.h") and the corresponding Cartesian routines of MKL.

The code compiles; however, whenever I try to call a function from the library I get the dreaded

---Intel MKL FATAL ERROR: Cannot load mkl_intel_thread.dll.----

error message. This appears to be a common issue, especially for some Python libraries.

I have a brand new copy of the MKL libraries (2019.3.203) and have tried:

1) Including libraries (IntelSWTools\compilers_and_libraries_2019.3.203\windows\redist\ia32_win\mkl) and compiler libs (IntelSWTools\compilers_and_libraries_2019.3.203\windows\redist\ia32_win\compiler) into my path, and separately (after removing paths)

2) Linking directly to the necessary libraries in the build tree

LIBS += -lmkl_rt, LIBS += -lmkl_intel_thread etc.

But I still get the problem. The issue does not seem to happen on my work PC, so I figured it must be some dependency issue, which is why I downloaded the new MKL libraries to trial method 1) above. Unfortunately the same error occurs.

 

Any help would be appreciated.

Linker Problem with MKL Fast Poisson solver

Hi all,

I am writing a small library for fluid dynamic simulations. I use Qt Creator ( 4.2.1, Based on Qt 5.8.0 (MSVC 2015, 32 bit)) and am compiling with mingw32 on Windows 10. I require the mkl fast poisson library for my solver. I dynamically link the library, and the source compiles, however as soon as I try to call the library I receive the dreaded:

"Intel MKL FATAL ERROR: Cannot load mkl_intel_thread.dll."

error. Apparently this happens a lot with Python libraries and generally appears to be caused by the libraries not being on the PATH. After downloading the newest Intel MKL release (2019.3.203) I have already tried the following:

1) Path Variables: ...IntelSWTools\compilers_and_libraries_2019.3.203\windows\redist\ia32_win\mkl and

...IntelSWTools\compilers_and_libraries_2019.3.203\windows\redist\ia32_win\compiler in PATH. No dice, also:

2) Not in Path: I dynamically link and include the following libraries (some perhaps unnecessary) (mkl_rt, mkl_intel_thread, mkl_core)

Regardless of these two approaches, I always get the error. None of the topics I have thus far found appear to help my case.

Help!

Cannot link to MKL from a Fortran program

I just installed the latest version of MKL, 2019 update 3, and I get this error message:

      fatal error LNK1104: cannot open file 'mkl_cdft_core_dll.lib'

I am using Visual Studio 2015, and the installer did say it integrated MKL into VS2015.

I also set "Use MKL" in the Fortran part of the build options, so I was wondering whether I have to set it somewhere else as well.

The Linker part of the build menu does not mention MKL anywhere.

Is there a way to check whether it was properly installed?

Did anyone test this for Fortran users?


illegal value in DESCINIT and PSGESV

I am new to ScaLAPACK and I am trying to run the program ex1.c from this link:

http://geco.mines.edu/software/mkl/index.shtml

When I run it, these two messages are printed: "DESCINIT parameter number 9 had an illegal value" and "PSGESV parameter number 1 had an illegal value".
I don't see why those values are illegal. Can someone help? Thanks.
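
For context, in the reference ScaLAPACK interface the ninth argument of DESCINIT is LLD, the local leading dimension, and it must be at least max(1, number of locally owned rows). Whether that is what goes wrong in ex1.c cannot be told from the post. The sketch below, in C, only mirrors the arithmetic of ScaLAPACK's NUMROC so the required minimum can be computed by hand; the function names numroc_sketch and min_lld are hypothetical.

```c
/* Number of rows (or columns) of a block-cyclically distributed dimension
   of global size n, block size nb, owned by process iproc out of nprocs,
   where isrcproc owns the first block. Mirrors the arithmetic of NUMROC. */
int numroc_sketch(int n, int nb, int iproc, int isrcproc, int nprocs)
{
    int mydist = (nprocs + iproc - isrcproc) % nprocs; /* distance from source */
    int nblocks = n / nb;                               /* whole blocks */
    int count = (nblocks / nprocs) * nb;                /* evenly shared part */
    int extrablks = nblocks % nprocs;                   /* leftover whole blocks */

    if (mydist < extrablks)
        count += nb;        /* this process gets one extra whole block */
    else if (mydist == extrablks)
        count += n % nb;    /* this process gets the trailing partial block */
    return count;
}

/* Smallest LLD accepted for the local array: max(1, local row count). */
int min_lld(int m, int mb, int myrow, int rsrc, int nprow)
{
    int local_rows = numroc_sketch(m, mb, myrow, rsrc, nprow);
    return local_rows > 1 ? local_rows : 1;
}
```

With 9 rows, block size 2 and 2 process rows, process row 0 owns 5 rows and process row 1 owns 4, so the LLD passed on each process must be at least 5 and 4 respectively.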

Computing the Schur-complement with MKL_PARDISO

Hello,

we are currently trying to integrate mkl_pardiso into our software and are facing some questions regarding mkl_pardiso and the computation of the Schur-complement.
Given a real symmetric matrix we wanted to compute the Schur-complement of a certain size and afterwards use the partial factorization from that computation to solve a part of the original linear equation system. We are using sparse matrices and thus also iparm[35] = -2/-1.

First of all, we started out by modifying the supplied pardiso_schur_c.c example and used the pseudocode found here as an orientation. You can find the code we used as an attachment. The compile command is given in a comment in the upper part of the C file.

First, when setting

    int iparm35 = -1;
    
    iparm[1-1] = 1;         /* No solver default */
    iparm[2-1] = 2;         /* Fill-in reordering from METIS */
    iparm[5-1] = 0;
    iparm[10-1] = 8;        /* Perturb the pivot elements with 1E-13 */
    iparm[11-1] = 0;        /* Use nonsymmetric permutation and scaling MPS */
    iparm[13-1] = 0;        /* Maximum weighted matching algorithm is switched-off (default for symmetric). Try iparm[12] = 1 in case of inappropriate accuracy */
    iparm[14-1] = 0;        /* Output: Number of perturbed pivots */
    iparm[18-1] = -1;       /* Output: Number of nonzeros in the factor LU */
    iparm[19-1] = -1;       /* Output: Mflops for LU factorization */
    iparm[24 - 1] = 10;
    iparm[31 - 1] = 0;
    iparm[36 - 1] = iparm35;        /* Use Schur complement */

we ran into a segmentation fault thrown by the pardiso_export function. We do not know why this happens, but setting iparm[23] = 1 instead of 10 resolved the issue. The documentation on iparm[23] simply says that it cannot be 0 when setting iparm[35] to either -1 or -2. Did we do something wrong here?
Another thing we found is that when setting iparm[35] = -2 in the above settings (so with iparm[23] = 10) we don't get the number of non-zeros as output in iparm[35] but instead -2 again (even though error == 0).

After 'resolving' the above issue by setting iparm[23] = 1 we were able to compute the Schur-complement and get it returned in sparse format. That was all correct, although it was quite surprising to us that pardiso, despite getting all matrices supplied in 1-based indexing, returned the Schur-complement with zero-based indexing. Is there a parameter to control this?
Still we were not quite sure on how to use the parameter perm. If we set perm to something like

perm = {1, 0, 0, 0, 1} 

what exactly does "perm specifies elements for a Schur complement" in the documentation mean? Will pardiso perform a Schur-complement computation equivalent to one where we specify perm as

perm = {0, 0, 0, 1, 1} 

but swap row 1 and 4? If yes, will pardiso always "stable" sort the rows?

A last question we have that is somewhat more general:

Given a linear system

[A11 A12] [x1]   [b1]
[A21 A22] [x2] = [b2]

we want to do two things:

a) compute the Schur-complement S = A22 - A21 A11^-1 A12

b) solve the linear system A11 x1 = b1

We assumed that, similar to pardiso from the pardiso-project, we could do that by computing the Schur-complement with iparm[35] = -2 (so that the factorization is kept for solving phase) and afterwards use that partial factorization in pardiso and phase=33 to compute x1.
From our tests we found that pardiso instead solves the complete system for x1 and x2. Is that the expected behaviour?
If so, is there a way to efficiently only solve A11 x1 = b1?
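
For reference, the quantity in a) can be written down for a small dense matrix. This is a minimal sketch in C, assuming a dense row-major matrix whose leading pivots are nonzero and using no pivoting. It illustrates only the definition S = A22 - A21 A11^-1 A12 and says nothing about how PARDISO computes or stores its result. The function name schur_complement_inplace is hypothetical.

```c
#include <stddef.h>

/* Eliminate the first k unknowns of the dense row-major n x n matrix `a`
   in place by Gaussian elimination without pivoting. Afterwards the
   trailing (n-k) x (n-k) block holds S = A22 - A21 * inv(A11) * A12.
   Entries outside that trailing block are left partially updated and
   should not be used. Returns -1 if a zero pivot is met, 0 otherwise. */
int schur_complement_inplace(double *a, size_t n, size_t k)
{
    for (size_t p = 0; p < k; ++p) {
        double pivot = a[p * n + p];
        if (pivot == 0.0)
            return -1; /* would need pivoting; out of scope for this sketch */
        for (size_t i = p + 1; i < n; ++i) {
            double factor = a[i * n + p] / pivot;
            /* update row i to the right of the pivot column */
            for (size_t j = p + 1; j < n; ++j)
                a[i * n + j] -= factor * a[p * n + j];
        }
    }
    return 0;
}
```

For the 3x3 matrix with rows (2, 1, 0), (1, 3, 1), (0, 1, 4) and k = 2, the trailing entry becomes 4 - 0.4 = 3.6, which matches A22 - A21 A11^-1 A12 computed by hand.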

 

With best regards,

Nils

Attachment: pardiso_schur_c.c (text/x-csrc, 9.88 KB)

Floating Point Exception in MKL FFT from 18.0.4 onwards.

A floating point overflow is raised in the code below giving the following backtrace.

Program received signal SIGFPE, Arithmetic exception.
0x0000000000d4a1e6 in mkl_dft_avx2_coDFTColTwid_Compact_Fwd_v_10_s ()
(gdb) backtrace
#0  0x0000000000d4a1e6 in mkl_dft_avx2_coDFTColTwid_Compact_Fwd_v_10_s ()
#1  0x00000000005e6e0d in compute_colbatch_fwd ()
#2  0x00000000004057dc in MAIN__ ()

 

The same code runs fine with a previous version of mkl (11.1.1) or if the CNR mode is set to SSE4_2.

Seems something specific to the avx2 code path.

 

program mkl_test

   USE MKL_DFTI

  include  'mkl.fi'

  integer, parameter :: len_i = 1025
  integer, parameter :: len_j = 1920
  complex :: values_in(len_i * len_j)
  complex :: values_out(len_i * len_j)
  real :: temp_r, temp_i
  integer :: ieee_flags
  character*16 :: out
  integer :: i, j, unit,  status
  integer stride_in(2)
  integer stride_out(2)

  type(dfti_descriptor), pointer :: My_Desc1_Handle
!---------------------------------------------------------------------------------------------------  
  values_out(:) = cmplx(0,0)

  print*, "Started and reading in data..."
  ! newunit assigns a free unit number; "unit" was otherwise never initialized
  open(newunit=unit, file='data2_CFFT.txt')
  do j=1, len_j
    do i=1, len_i
      read(unit, '(2f15.8)') temp_r, temp_i
      values_in((j-1) * len_i + i) = cmplx(temp_r,temp_i)
    enddo
  enddo
  close(unit)
  print*, "Done reading data"

!  status = mkl_cbwr_set(MKL_CBWR_SSE4_2)
!  if(status .ne. MKL_CBWR_SUCCESS ) then
!     print *, 'unable to set the mkl environment'

!  endif

  i = ieee_flags('set', 'exception', 'overflow', out)

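  ! arrays declared with size 2 are indexed 1..2 in Fortran:
  ! element 1 is the offset, element 2 the stride between elements
  stride_in(1)=0;
  stride_in(2)=1025;
  stride_out(1)=0;
  stride_out(2)=1025;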

  status = DftiCreateDescriptor(My_Desc1_Handle,DFTI_SINGLE,DFTI_COMPLEX,1,1920)
  status = DftiSetValue(My_Desc1_Handle, DFTI_PLACEMENT, DFTI_NOT_INPLACE);
  status = DftiSetValue(My_Desc1_Handle, DFTI_NUMBER_OF_TRANSFORMS, 1025);
  status = DftiSetValue(My_Desc1_Handle, DFTI_INPUT_DISTANCE, 1);
  status = DftiSetValue(My_Desc1_Handle, DFTI_OUTPUT_DISTANCE, 1);
  status = DftiSetValue(My_Desc1_Handle, DFTI_INPUT_STRIDES, stride_in);
  status = DftiSetValue(My_Desc1_Handle, DFTI_OUTPUT_STRIDES, stride_out);
  status = DftiCommitDescriptor(My_Desc1_Handle);

  status = DftiComputeForward( My_Desc1_Handle, values_in, values_out )

  print*, "Finished successfully."

end program mkl_test

Compile as follows:

$INTEL_HOME/ifort -I$MKL_HOME/include/ cpbtrs.f90 -Wl,--start-group -Wl,-Bstatic -L$MKL_HOME_LIB/lib -lmkl_core -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_lapack95_lp64 -liomp5 -Wl,--end-group

The input file data2_CFFT.txt that is read and passed to the FFT function is attached. The input all looks normal.

MKL 2019 u2 seems to have the same issue.

I am using Debian 9 Linux on an Intel(R) Xeon(R) CPU E3-1240 v3. Can someone please have a look?

Attachment: data2_CFFT.zip (application/zip, 5.24 MB)

Installation fails

I'm on a GCP Ubuntu 14.04 VM and I can't get apt-get to build the mkl. The steps I took, following the instructions here:

> sudo sh -c 'echo deb https://apt.repos.intel.com/mkl all main > /etc/apt/sources.list.d/intel-mkl.list'

> sudo sh -c 'echo deb https://apt.repos.intel.com/mpi all main > /etc/apt/sources.list.d/intel-mpi.list'

> sudo apt-get update

Error message:

W: Conflicting distribution: https://apt.repos.intel.com binary/ InRelease (expected binary but got )
W: Duplicate sources.list entry https://apt.repos.intel.com/mkl/ all/main amd64 Packages (/var/lib/apt/lists/apt.repos.intel.com_mkl_dists_all_main_binary-amd64_Packages)
W: Duplicate sources.list entry https://apt.repos.intel.com/mpi/ all/main amd64 Packages (/var/lib/apt/lists/apt.repos.intel.com_mpi_dists_all_main_binary-amd64_Packages)

Does anyone have instructions for how to build properly?

Segmentation fault in PARDISO phase 33 when changing parameters in iparm

Hello,

we are currently integrating PARDISO into our software and stumbled upon some (from our point of view) weird behaviour.

First the setting:

Given a sparse matrix of type -2 we start with phase 12, so reordering and symbolic factorization. After that we wanted to use phase 33 with a sparse right hand side and thus set iparm[30] = 1.

In the documentation it says that for using iparm[23] = 1 (what we wanted to do), we should

Disable iparm[10] (scaling) and iparm[12] = 1 (matching) when using the two-level factorization algorithm. Otherwise Intel MKL PARDISO uses the classic factorization algorithm.

First of all, we are not quite sure what that is supposed to mean: Does it say set iparm[10] and iparm[12] both to zero? Or does it say set iparm[10] = 0 and iparm[12] = 1?
Nevertheless, we set both to one and in addition set iparm[23] = 1, because we had not read the hint carefully at that time and, in any case, we expected that this would lead to behaviour similar to iparm[23] = 0.
The reason for setting iparm[10] and iparm[12] in the first place was that our matrices come from an interior point method.

Now the weird part.
During phase 12 we did not set iparm[30], since pardisoinit did not and we did not hand over a right hand side at that point. Also, the documentation says iparm[30] "controls the solve step of Intel MKL PARDISO."

For phase 33 we then set iparm[30] = 1 and got a segfault back from pardiso.
Now, we did find two ways to get rid of the segfault, neither of which seems to make any sense:
a) when setting iparm[30] = 1 for phase 12 and 33 the segfault disappears
b) setting iparm[23] = 0 or iparm[23] = 10 during solution phase lets the segfault disappear too

Frankly, we are not quite sure what is going on here; I assume we misunderstood something.

Another thing we saw happening: during phase 12 PARDISO sets parameter iparm[33] = -1. iparm[33] is described as an input parameter and thus should not be set by PARDISO, should it? Or does this indicate some kind of error?

 

We hope you can help us.

With best regards,

Nils

 

PS:
I attached the example we were using; it is a modified version of the pardiso_sym_c.c example. The compiler command is in a comment at the top of the file. You can also find the complete iparm settings we used during phase 12 and the ones we used during phase 33 (iparm2).

Attachment: pardiso_sym_c.c (text/x-csrc, 79.6 KB)

Bug in dsyev row-major

Hello, 

When I compile and run the MKL example `lapacke_dsyev_row.c`, I get the following output:

LAPACKE_dsyev (row-major, high-level) Example Program Results

 Eigenvalues
 -11.07  -6.23   0.86   8.87  16.09

 Eigenvectors (stored columnwise)
  -0.30  -0.61   0.40  -0.37   0.49
   0.00  -0.29  -0.41  -0.36  -0.61
   0.00   0.00  -0.66   0.50   0.40
   0.00   0.00   0.00   0.62  -0.46
   0.00   0.00   0.00   0.00   0.16

The lower part of the matrix is missing.

This looks like a bug. Is this the right place to report it?

It seems that the column-major version works.

This happens with version "2019.3.199" on both Linux and Windows.

Sincerely, 

Marc Lasson. 

 

Pardiso low-rank update question

Hi,

I recently realized this functionality is available with pardiso - great!

I am testing it now and I have some questions. I have read the instructions in https://software.intel.com/en-us/mkl-developer-reference-c-intel-mkl-par....

In particular, I am using matching as I am solving a highly indefinite symmetric system: iparm[12]=1. With iparm[12]=1 the instructions for iparm say A must be filled with relevant values during phase 11. This makes me wonder whether iparm[12]=1 is compatible with the low-rank update functionality (i.e. it suggests phase 11 must be run whenever A is updated as matching is enabled). 

I also note that the improved two-level factorization algorithm must be used with the low rank update (iparm[23]=10). But it is unclear to me whether matching works together with (improved) two-level factorization.

Could you please clarify?

Best,

Jens


Parallel pardiso solve step

running "cl_solver_unsym_c.c"

Hi. 

I need to run the code in examples/cluster_sparse_solverc/source of the Intel installation directory with different input data. 
Just as in the example, I am using CSR sparse format. Arrays a, ja and ia are read from files that I generated, and they are as follows:

a : double array of 52 nonzero elements. 

ja : 1 2 3 12 1 2 3 9 1 2 3 4 5 4 5 6 12 4 5 6 8 2 4 5 6 11 7 8 9 1 7 8 9 2 7 8 9 11 12 5 6 10 11 12 1 10 11 12 8 10 11 12

ja[i] = column index of a[i], counting from 1. 

ia : 1 5 9 14 18 22 27 30 34 40 45 49 53

Assuming that the full matrix is 12x12, ia[i] points to the first element of the i-th row. ia[12] = 53, that is, the number of elements of a and ja, plus 1.

When I execute it, I get the following messages, which appear in phase 22:

 

*** Error in PARDISO  (incorrect input matrix  ) error_num= 21

*** Input check: i=12, ia(i)=49, ia(i+1)=53 are incompatible

ERROR during symbolic factorization: -1

 

I don't see where the mistake is. Why are those values of ia incompatible? Apart from reading these files, the only thing I changed was n, which is now 12.
Thank you in advance.
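
The structure described above can be checked outside the solver. The following is a minimal sketch in C, assuming one-based CSR with strictly ascending column indices in each row; the function name check_csr_one_based is hypothetical, and the checks are not claimed to be the same ones the solver performs internally.

```c
/* Consistency check for a one-based CSR structure of an n x n matrix.
   Returns 0 if consistent, -1 if ia[0] != 1 or ia[n] - 1 != nnz,
   otherwise the one-based number of the first offending row. */
int check_csr_one_based(const int *ia, const int *ja, int n, int nnz)
{
    if (ia[0] != 1 || ia[n] - 1 != nnz)
        return -1;
    for (int row = 0; row < n; ++row) {
        /* row pointers must be non-decreasing and stay inside ja */
        if (ia[row + 1] < ia[row] || ia[row + 1] > nnz + 1)
            return row + 1;
        for (int k = ia[row] - 1; k < ia[row + 1] - 1; ++k) {
            if (ja[k] < 1 || ja[k] > n)
                return row + 1;          /* column index out of range */
            if (k > ia[row] - 1 && ja[k] <= ja[k - 1])
                return row + 1;          /* columns not strictly ascending */
        }
    }
    return 0;
}
```

Applied to the ia and ja arrays exactly as listed in the post, with n = 12 and nnz = 52, this check returns 0, so it may be worth comparing the arrays as they arrive in the program after reading the files with the ones listed here.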

 

eigen value and vector for N=200x200

How do I find the eigenvalues and eigenvectors of a 200 x 200 matrix in Fortran?

[HPCG] "QuickPath" option always selected

Hello everyone,

 

A brief summary of the issue:

  • The Intel MKL and Intel MPI libraries are installed on the cluster I am using (its compute nodes embed 2 sockets (E5-2620 v4));
  • The AVX2 pre-built binary of HPCG delivered with the Intel MKL (xhpcg_avx2) executes successfully, both on a single node and on multiple nodes;
  • I tried to set a target execution time by using either the --rt=180 command line option, or the hpcg.dat configuration file (which is in the same directory as the binary), and by using both at the same time. However, the setting of the target execution time simply seems to be ignored, since the QuickPath option is always used (confirmed by the YAML output file);
  • However, setting the size of the local 3D compute grid works perfectly, thanks to both the command-line options and the configuration file.

 

I could not find any information related to such an issue (however, there is a closed issue in the GitHub repository of HPCG concerning the fact that if the --rt command-line option was not specified, the QuickPath option was enforced).

 

Any idea/workaround? If you need some more information, or if you want me to test something, just ask.

 

Thank you in advance for your help,

--Mathieu.

Problems with mkl_sparse_convert_csr

I'm trying to use the Intel MKL Inspector/Executor Sparse BLAS library and I've been struggling with faulty memory use in the `mkl_sparse_convert_csr` subroutine. The simple program below can reproduce my problem:

program debug
use mkl_spblas
use omp_lib
use, intrinsic :: iso_c_binding, only: c_int, c_double
implicit none
integer, parameter :: DIM = 10000
integer :: stat, i
integer(kind = c_int), dimension(DIM) :: irn, jcn
real(kind = c_double), dimension(DIM) :: val
type(sparse_matrix_t) :: mat1, mat2

do i = 1, DIM
  irn(i) = i
  jcn(i) = i
  val(i) = 1.0d0
end do

call omp_set_num_threads(1)
stat = mkl_sparse_d_create_coo (A = mat1, indexing = SPARSE_INDEX_BASE_ONE, &
  rows = DIM, cols = DIM, nnz = DIM, row_indx = irn, col_indx = jcn, values = val)
if (stat /= 0) stop 'Error in mkl_sparse_d_create_coo'

stat = mkl_sparse_convert_csr (source = mat1, &
  operation = SPARSE_OPERATION_NON_TRANSPOSE, dest = mat2)
if (stat /= 0) stop 'Error in mkl_sparse_convert_csr'

stat = mkl_sparse_destroy (A = mat1)
if (stat /= 0) stop 'Error in mkl_sparse_destroy (mat1)'

stat = mkl_sparse_destroy (A = mat2)
if (stat /= 0) stop 'Error in mkl_sparse_destroy (mat2)'

call mkl_free_buffers
end program debug

When I check with Valgrind I get the following report of memory leaks:

==27267== Memcheck, a memory error detector
==27267== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==27267== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==27267== Command: ../bin/LINKS_debug
==27267== 
==27267== 
==27267== HEAP SUMMARY:
==27267==     in use at exit: 495 bytes in 6 blocks
==27267==   total heap usage: 47 allocs, 41 frees, 463,031 bytes allocated
==27267== 
==27267== 8 bytes in 1 blocks are still reachable in loss record 1 of 6
==27267==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27267==    by 0x504CA98: gomp_malloc (alloc.c:37)
==27267==    by 0x505BA56: gomp_init_num_threads (proc.c:91)
==27267==    by 0x504B06A: initialize_env (env.c:1244)
==27267==    by 0x4010732: call_init (dl-init.c:72)
==27267==    by 0x4010732: _dl_init (dl-init.c:119)
==27267==    by 0x40010C9: ??? (in /lib/x86_64-linux-gnu/ld-2.27.so)
==27267== 
==27267== 8 bytes in 1 blocks are still reachable in loss record 2 of 6
==27267==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27267==    by 0x152F22: mkl_serv_malloc (in /home/rcarvalho/repos/debug/bin/LINKS_debug)
==27267==    by 0x1261B4: mkl_sparse_d_create_coo_i4_avx2 (in /home/rcarvalho/repos/debug/bin/LINKS_debug)
==27267==    by 0x112AF8: MAIN__ (main.f90:49)
==27267==    by 0x112C07: main (main.f90:31)
==27267== 
==27267== 32 bytes in 1 blocks are still reachable in loss record 3 of 6
==27267==    at 0x4C31B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27267==    by 0x590C7E4: _dlerror_run (dlerror.c:140)
==27267==    by 0x590C050: dlopen@@GLIBC_2.2.5 (dlopen.c:87)
==27267==    by 0x150F32: mkl_serv_inspector_suppress (in /home/rcarvalho/repos/debug/bin/LINKS_debug)
==27267==    by 0x150E8C: mkl_serv_lock (in /home/rcarvalho/repos/debug/bin/LINKS_debug)
==27267==    by 0x14EFA1: mkl_serv_cpu_detect (in /home/rcarvalho/repos/debug/bin/LINKS_debug)
==27267==    by 0x112EC4: mkl_sparse_d_create_coo_i4 (in /home/rcarvalho/repos/debug/bin/LINKS_debug)
==27267==    by 0x112AF8: MAIN__ (main.f90:49)
==27267==    by 0x112C07: main (main.f90:31)
==27267== 
==27267== 47 bytes in 1 blocks are still reachable in loss record 4 of 6
==27267==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27267==    by 0x4017880: _dl_exception_create (dl-exception.c:77)
==27267==    by 0x6996250: _dl_signal_error (dl-error-skeleton.c:117)
==27267==    by 0x4009812: _dl_map_object (dl-load.c:2384)
==27267==    by 0x4014EE3: dl_open_worker (dl-open.c:235)
==27267==    by 0x69962DE: _dl_catch_exception (dl-error-skeleton.c:196)
==27267==    by 0x40147C9: _dl_open (dl-open.c:605)
==27267==    by 0x590BF95: dlopen_doit (dlopen.c:66)
==27267==    by 0x69962DE: _dl_catch_exception (dl-error-skeleton.c:196)
==27267==    by 0x699636E: _dl_catch_error (dl-error-skeleton.c:215)
==27267==    by 0x590C734: _dlerror_run (dlerror.c:162)
==27267==    by 0x590C050: dlopen@@GLIBC_2.2.5 (dlopen.c:87)
==27267== 
==27267== 192 bytes in 1 blocks are still reachable in loss record 5 of 6
==27267==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27267==    by 0x504CA98: gomp_malloc (alloc.c:37)
==27267==    by 0x5059B65: gomp_get_thread_pool (pool.h:42)
==27267==    by 0x5059B65: get_last_team (team.c:146)
==27267==    by 0x5059B65: gomp_new_team (team.c:165)
==27267==    by 0x5050DDB: GOMP_parallel_start (parallel.c:126)
==27267==    by 0x17D0A4: mkl_sparse_d_coo_csr_new_omp_i4 (in /home/rcarvalho/repos/debug/bin/LINKS_debug)
==27267==    by 0x17D4A7: mkl_sparse_d_convert_coo_to_csr_i4 (in /home/rcarvalho/repos/debug/bin/LINKS_debug)
==27267==    by 0x17D554: mkl_sparse_d_export_csr_data_i4 (in /home/rcarvalho/repos/debug/bin/LINKS_debug)
==27267==    by 0x126E68: mkl_sparse_d_convert_csr_i4_avx2 (in /home/rcarvalho/repos/debug/bin/LINKS_debug)
==27267==    by 0x112B38: MAIN__ (main.f90:52)
==27267==    by 0x112C07: main (main.f90:31)
==27267== 
==27267== 208 bytes in 1 blocks are still reachable in loss record 6 of 6
==27267==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27267==    by 0x504CA98: gomp_malloc (alloc.c:37)
==27267==    by 0x505AFFA: gomp_new_icv (team.c:968)
==27267==    by 0x504CF24: omp_set_num_threads (libgomp.h:681)
==27267==    by 0x112AB3: MAIN__ (main.f90:47)
==27267==    by 0x112C07: main (main.f90:31)
==27267== 
==27267== LEAK SUMMARY:
==27267==    definitely lost: 0 bytes in 0 blocks
==27267==    indirectly lost: 0 bytes in 0 blocks
==27267==      possibly lost: 0 bytes in 0 blocks
==27267==    still reachable: 495 bytes in 6 blocks
==27267==         suppressed: 0 bytes in 0 blocks
==27267== 
==27267== For counts of detected and suppressed errors, rerun with: -v
==27267== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

It seems that this kind of problem has also been reported before and, as suggested in https://stackoverflow.com/questions/37395541/mkl-sparse-blas-segfault-wh..., I'm already setting the number of threads to 1 and also calling the `mkl_free_buffers` subroutine. However, the problem is still there and, in a bigger project I have, this memory leak leads to a program crash due to invalid writes. Any idea on how to solve this?
