Channel: Intel® Software - Intel® oneAPI Math Kernel Library & Intel® Math Kernel Library

PARDISO - Phase 33

Hello

I am solving a relatively big system with PARDISO (6810921 nonzero terms, 64080 equations, and 508 columns in the load vector). I am calling Phases 11, 22 and 33.

I just noticed that PARDISO tries to allocate a significant amount of memory when Phase 33 is called, and then returns a memory allocation error.

I understand that Phases 11 and 22 need to allocate memory for factorization, but I was not expecting high memory demand for Phase 33 as well.

Is this expected? Is there a way to know how much memory is needed for Phase 33? Is there any way to reduce this memory demand? (The out-of-core option is not helping me with this.)

Regards

Bulent
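
For a rough sense of scale, the size of one dense n-by-nrhs array for the numbers quoted above can be worked out directly. This is only a back-of-the-envelope sketch in Python: it assumes real double-precision (8-byte) entries, which the post does not state, and whether PARDISO actually allocates a temporary array of this shape during Phase 33 is an assumption, not something confirmed in this thread.

```python
# Back-of-the-envelope size of one dense n x nrhs array of doubles.
# Assumption: real double precision (8 bytes per entry); the post does not
# say whether the system is real or complex.
N_EQUATIONS = 64080       # number of equations, from the post
N_RHS = 508               # columns in the load vector, from the post
BYTES_PER_DOUBLE = 8


def dense_block_bytes(n, nrhs, bytes_per_entry=BYTES_PER_DOUBLE):
    """Bytes needed to hold one dense n x nrhs array."""
    return n * nrhs * bytes_per_entry


block_bytes = dense_block_bytes(N_EQUATIONS, N_RHS)
block_mib = block_bytes / 2**20
print(f"{block_bytes} bytes = {block_mib:.1f} MiB per n x nrhs array")
```

The right-hand side and solution arrays alone each take this much. If a workspace of similar shape is allocated per call, solving the 508 columns in smaller batches would shrink it proportionally.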

LU factorization of band matrix

I am trying to use the routine described at https://software.intel.com/sites/products/documentation/doclib/iss/2013/mkl/mklman/GUID-03E2AA41-0886-485D-B0D7-1C2186119220.htm for band matrix factorization. I wrote a simple application that factorizes a 4x4 matrix. However, I can't understand the relation between the elements of the matrix L and the multipliers m_ij used during factorization. Those multipliers represent the correct columns of L, but the order of the elements m_ij within each column is not correct. I assume some permutation of rows should be performed based on the ipiv vector.
Initial matrix:

-0.230 -6.980  0.000  0.000
 2.540  2.460  2.560  0.000
-3.660 -2.730  2.460 -4.780
 0.000 -2.130  4.070 -3.820

Result of dgbtrf:
 0.000  0.000  0.000 -4.780
 0.000  0.000  2.460  0.300
 0.000 -2.730 -0.155 -3.292
-3.660 -6.808  4.254 -0.727
-0.694 -0.083  0.968  0.000
 0.063  0.313  0.000  0.000

Permutation vector: 3 3 3 4

In this case
m21 = -0.694  m32 = -0.083  m43 = 0.968
m31 =  0.063  m42 =  0.313
Those values form matrix L_result:
 1.000  0.000  0.000  0.000
-0.694  1.000  0.000  0.000
 0.063 -0.083  1.000  0.000
 0.000  0.313  0.968  1.000

However, the correct answer is L_correct:
 1.000  0.000  0.000  0.000
 0.063  1.000  0.000  0.000
-0.694 -0.083  1.000  0.000
 0.000  0.313  0.968  1.000

Could you explain how L_result should be transformed into L_correct in the general case?
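
One way to read the numbers above, offered as a sketch rather than a statement of what the library documents: if dgbtrf, unlike dgetrf, does not apply later row interchanges to multipliers that are already stored, then replaying each interchange ipiv(k) on the columns to the left of column k should turn L_result into L_correct. The Python helper below, whose name `apply_later_interchanges` is made up for this illustration, does exactly that on the values from the post.

```python
# Sketch: replay the row interchanges recorded in ipiv on the multipliers
# stored in earlier columns. Assumption: the banded factorization leaves
# earlier columns of L untouched when it swaps rows at a later step.


def apply_later_interchanges(l_stored, ipiv):
    """Return a copy of l_stored with the swap of step k applied to columns 0..k-1.

    l_stored is a dense unit lower triangular matrix (list of rows) built
    from the multipliers as stored; ipiv holds 1-based pivot rows.
    """
    n = len(l_stored)
    l_perm = [row[:] for row in l_stored]   # do not modify the input
    for k in range(n):
        p = ipiv[k] - 1                     # convert to 0-based row index
        if p != k:
            for j in range(k):              # only columns left of the pivot column
                l_perm[k][j], l_perm[p][j] = l_perm[p][j], l_perm[k][j]
    return l_perm


L_RESULT = [
    [1.000, 0.000, 0.000, 0.000],
    [-0.694, 1.000, 0.000, 0.000],
    [0.063, -0.083, 1.000, 0.000],
    [0.000, 0.313, 0.968, 1.000],
]
IPIV = [3, 3, 3, 4]
L_CORRECT = [
    [1.000, 0.000, 0.000, 0.000],
    [0.063, 1.000, 0.000, 0.000],
    [-0.694, -0.083, 1.000, 0.000],
    [0.000, 0.313, 0.968, 1.000],
]

l_fixed = apply_later_interchanges(L_RESULT, IPIV)
for row in l_fixed:
    print(" ".join(f"{v:6.3f}" for v in row))
```

For this 4x4 case the only effective swap is at step 2 (rows 2 and 3), which exchanges m21 and m31 in the first column and reproduces L_correct.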

 

A bug in dcg_get?

I wrote a small program calling the RCI CG routine with the ILU0 preconditioner. The code is attached. It returns an incorrect solution vector of all zeros. I have noticed that the vector tmp contains the correct solution, but it is not transferred to the vector x (the solution).

Zbigniew

Attachment: test-mkl-cg-rci-ilu_1.c (5.64 KB)

Parallelization of dpotri and dpotrf

I measured the time needed to invert a symmetric positive definite matrix with dpotrf and dpotri in parallel on a 32-core Sandy Bridge machine and got quite surprising results:

Although the MKL documentation says the number of flops is 1/3 n^3 for dpotrf and 2/3 n^3 for dpotri, the runtime results were quite different:

On an 8192x8192 matrix, dpotrf took 1.1 sec. and dpotri took 20 sec. On other sizes dpotri always takes more than 10 times as long as dpotrf. For me this is quite surprising, as it only has to do twice the flops and the parallelization of dpotri should be easier (I don't know the MKL code, but I know the algebraic operation that dpotri performs, and it doesn't look that difficult to parallelize, especially for the Intel experts).

I also tested on some other machines and always got similar results: the parallelization of potrf is very good, but potri looks quite slow. Did anyone else get similar results, or does anyone know why I get these results?

Thanks

  Jochen
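
To put the timings above on a common scale, the short Python sketch below converts them to achieved GFLOP/s. It takes the flop counts as quoted from the documentation in the post (n^3/3 for the factorization, 2n^3/3 for the inversion) and the two measured times; nothing else is assumed.

```python
# Achieved throughput implied by the timings in the post.
N = 8192            # matrix order, from the post
T_POTRF = 1.1       # seconds for dpotrf, from the post
T_POTRI = 20.0      # seconds for dpotri, from the post


def gflops(flops, seconds):
    """Convert a flop count and a wall time to GFLOP/s."""
    return flops / seconds / 1e9


potrf_flops = N**3 / 3          # flop count quoted for the factorization
potri_flops = 2 * N**3 / 3      # flop count quoted for the inversion

potrf_rate = gflops(potrf_flops, T_POTRF)
potri_rate = gflops(potri_flops, T_POTRI)
rate_ratio = potrf_rate / potri_rate

print(f"dpotrf: {potrf_rate:.1f} GFLOP/s")
print(f"dpotri: {potri_rate:.1f} GFLOP/s")
print(f"dpotrf runs at {rate_ratio:.1f}x the rate of dpotri")
```

So the gap is roughly a factor of nine in flop rate, not just in wall time, which is what makes the dpotri result look like a parallelization issue rather than a difference in work.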

Steps to link LAPACK in Fortran 90 using Visual Studio 2008

Good afternoon,

I would deeply appreciate it if anybody could help me with clear steps to link against the LAPACK library and call a subroutine from it.

I am working with Fortran 90 via Microsoft Visual Studio 2008 on the IA-32 architecture. I have already checked that I have the MKL folder (with files, some .f90 LAPACK files, etc.). I have found a lot of information on the internet, but none of it has worked for me. I guess these are just pieces of information, not all the steps, from adding source code to including USE statements in my source code. I am also very confused about whether I have to use lapack only, or lapack95, or mkl_lapack95, etc. Can anybody please clarify this for me?

Thank you very much.

Regards,

Juan Diego

passing additional parameters to RHS function of ODE solver

Hi,

I am trying to use Intel's ODE solver to solve a system of equations. Below is how the RHS function is supposed to be defined (from the manual):

subroutine <name>(n, t, y, f)
integer n
double precision t, y(n), f(n)
..................
f(i) = .....
..................
return
end

The problem is I need to pass additional parameters to the RHS function. Is there a way to do that?

Thanks!

Bo

 

log sum and under/overflow

I have converted some neural net code from MATLAB which consists of adding/subtracting very small probabilities and is of the form log(sum(Array)). This may be affected by underflow. There is a common workaround on the internet called the log-sum-exp trick, which involves shifting back and forth by a value equal to maxval(Array); see http://machineintelligence.tumblr.com/post/4998477107/the-log-sum-exp-trick for example. I could replicate this in Fortran, but before I do I thought I would ask: is there an MKL function that computes log(sum(Array)) with minimal underflow/overflow, before I reinvent the wheel?

Alternately, are there any Fortran-specific tricks for handling very small numbers accurately?

Here is the MATLAB code (repmat is similar to Fortran's spread(), and ones creates a matrix of 1's):

if(length(xx(:))==1) ls=xx; return; end

xdims=size(xx);
if(nargin<2)
  dim=find(xdims>1);
end

alpha = max(xx,[],dim)-log(realmax)/2;
repdims=ones(size(xdims)); repdims(dim)=xdims(dim);
ls = alpha+log(sum(exp(xx-repmat(alpha,repdims)),dim));
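
For readers unfamiliar with the trick, here is a minimal one-dimensional sketch in Python (not Fortran, and not an MKL routine). It shifts by the plain maximum, whereas the MATLAB code above shifts by max minus log(realmax)/2; the function name `log_sum_exp` is chosen for this illustration.

```python
import math


def log_sum_exp(values):
    """Return log(sum(exp(v) for v in values)), shifting by the maximum.

    After the shift the largest exponent is exp(0) = 1, so the sum cannot
    overflow and cannot underflow all the way to zero.
    """
    if not values:
        raise ValueError("log_sum_exp needs at least one value")
    shift = max(values)
    if math.isinf(shift):
        # All entries are -inf (result -inf) or some entry is +inf (result +inf).
        return shift
    return shift + math.log(sum(math.exp(v - shift) for v in values))


# Two log-probabilities far below the range where exp() is representable:
# exp(-1000.0) underflows to 0.0, so a naive log(sum(exp(x))) would fail.
tiny = [-1000.0, -1000.0]
print(log_sum_exp(tiny))
```

The same shift works in any language with max, exp and log intrinsics, so a Fortran version needs only maxval, exp, sum and log.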

 

cluster_sparse_solver discrepancy

Hello,

I'm trying to solve a general system with CPARDISO. When using two processes, there is no issue if I don't use the coefficient array during the solution phase. When using only one process, I get a segmentation fault. Could you give me some insight into this issue, please? Thank you in advance.

$ mpicxx -cxx=icpc cl_solver_unsym_complex_c.cpp  -lmkl_intel_thread -lmkl_core -lmkl_intel_lp64 -liomp5 -std=c++11
$ mpirun -np 1 ./a.out
$ echo $?
11
$ mpirun -np 2 ./a.out
$ echo $?
0
$ mpicxx -cxx=icpc cl_solver_unsym_complex_c.cpp  -lmkl_intel_thread -lmkl_core -lmkl_intel_lp64 -liomp5 -std=c++11 -DNSEGFAULT
$ mpirun -np 1 ./a.out
$ echo $?
0
$ mpirun -np 2 ./a.out
$ echo $?
0

 

Attachment: segfault.tar.gz (92 KB)

Installing MKL Blas95 and Lapack95 for Intel Fortran

Hi,

I am using Intel Parallel Studio XE Composer 2013 (formerly Intel Fortran Composer XE 2013) together with Microsoft Visual Studio 2012 Professional to write Fortran programs.  I would like to install the Intel MKL Blas95 and Lapack95 so that I can call routines of Lapack95 from within a Fortran program.  I would like to know the steps that are required for such installation.  I have downloaded the Intel mkl library package.

With Thanks

I-Lok Chang 

 

ILAENV for ?getrf

Hi,

When I call (mkl from fortran)

NB = ILAENV( 1, 'DGETRF', '', 4000, 4000, -1, -1 )

it always gives NB=1. For other methods it seems to work ok (DGEQRF gives NB=128).

I don't want to use mkl's getrf. I'm running Intel(R) Math Kernel Library 11.0 Update 5 for Linux* OS.

Pieter

 

Dynamic Linking fails due to undefined reference to __isoc99_sscanf

I've recently upgraded from Intel Fortran 2013 to 2015 and now I'm having trouble linking with MKL on Linux. Now, any kind of linking (dynamic, static, or with just -static-intel) fails with errors due to undefined references to __isoc99_sscanf. (Dynamic linking also has trouble with __isoc99_fscanf.) I've seen a few posts that at first glance appear similar:

https://software.intel.com/en-us/articles/link-error-when-static-linking-to-intel-mkl-on-linux-6
https://software.intel.com/en-us/forums/topic/393920
https://software.intel.com/en-us/forums/topic/393889

However, each of those posts refers to Intel Fortran 2013, and *none* of them covers problems with dynamic linking. It seems that MKL now depends on some static component that I am apparently missing. I can't figure out how to link successfully. Is this a bug, or is there a viable workaround?

Below I have provided a simplified test case (mkl_test.f90), my Makefile, and the output with the three variations of linking.
 

program mkl_test

use LAPACK95

implicit none
integer :: stat
integer, parameter :: length = 5
real(kind = 8) :: x(length), y(length), A(length, 3), b(length)

print *, 'Start.'

call gels(A, b, info = stat)
if (stat .ne. 0) then
    print *, 'Bad status from gels.'
else
    print *, 'Success'
end if

end program mkl_test

 

FC := ifort

all: mkl_test

clean:
    rm -f mkl_test mkl_test.o

mkl_test: mkl_test.o
    $(FC) -o $@ $^ $(FCFLAGS) $(FLFLAGS) -lmkl_lapack95_lp64 -mkl -static-intel

mkl_test.o: mkl_test.f90
    $(FC) -c -o $@ $< $(FCFLAGS) -mkl

 

$ make clean all
rm -f mkl_test mkl_test.o
ifort -c -o mkl_test.o mkl_test.f90  -mkl
ifort -o mkl_test mkl_test.o   -lmkl_lapack95_lp64 -mkl
/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_thread.so: undefined reference to `__isoc99_sscanf'
/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_core.so: undefined reference to `__isoc99_fscanf'
make: *** [mkl_test] Error 1


$ make clean all
rm -f mkl_test mkl_test.o
ifort -c -o mkl_test.o mkl_test.f90  -mkl
ifort -o mkl_test mkl_test.o   -lmkl_lapack95_lp64 -mkl -static
/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_core.a(load_library.o): In function `mkl_ueaa_prv_load_backend_lib':
loadl_library.c:(.text+0x1d1): warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_core.a(load_dll_static_patched.o): In function `mkl_serv_cpu_detect':
../../../../serv/kernel/load_dll.c(.text+0x86): undefined reference to `__isoc99_sscanf'
/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_thread.a(d__gemm_drv.o): In function `mkl_blas_dgemm':
../../../../blas/thread/level3/common/_gemm.c:(text+0xe29): undefined reference to `__isoc99_sscanf'
make: *** [mkl_test] Error 1


$ make clean all
rm -f mkl_test mkl_test.o
ifort -c -o mkl_test.o mkl_test.f90  -mkl
ifort -o mkl_test mkl_test.o   -lmkl_lapack95_lp64 -mkl -static-intel
/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_core.a(load_dll_static_patched.o): In function `mkl_serv_cpu_detect':
../../../../serv/kernel/load_dll.c(.text+0x86): undefined reference to `__isoc99_sscanf'
/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_thread.a(d__gemm_drv.o): In function `mkl_blas_dgemm':
../../../../blas/thread/level3/common/_gemm.c:(text+0xe29): undefined reference to `__isoc99_sscanf'
make: *** [mkl_test] Error 1

 

use lapack

The documentation refers to  use lapack and use lapack95.   Do I have to add a "use lapack" to access getrf & getrs?   It seems if I put in "use lapack" I get:  This name has already been used as an external module name. [LAPACK]

 

If I add "use lapack95", then I get:

Error in opening the compiled module file. Check INCLUDE paths. [LAPACK95]

 

But if I have neither, I don't get the calls resolved:

Error 2  error LNK2019: unresolved external symbol GETRF referenced in function SMIKE smike.obj 

same for getrs.

 

I am compiling in x64, release, with default real set to kind=8 in compiler options.

! use lapack from intel library PURE SUBROUTINE DGETRF_F95(A,

call getrf(anewt, ipvt, info)

if ( info.ne.0 ) asing = .true.

job = 0

! PURE SUBROUTINE DGETRS1_F95(A,IPIV,B,TRANS,INFO)

! call getrs('N',nnewt,1,anewt,bnewt,mxq,mxq,ipvt,solutn,job)

call getrs(anewt,ipvt,bnewt,'N',job)

 

 

Thanks...


Dealing with MKL DLLs on an in-house server farm

This is the MKL part of my general question, which was helpfully answered in the compiler forum but only regarding the runtime DLLs (https://software.intel.com/en-us/forums/topic/535178). I do not really know the best way to solve the MKL part.

We are a SaaS, and I will be using a DLL built against the MKL in our server farm of Windows servers. In the farm, an application is first deployed, such that all binaries are on a network location accessible to all servers. Then a scheduler may run instances of the application on any server that currently has capacity. The mishmash of servers ranges from Gainestowns to Haswells. At any time, there may be 500 applications installed in the farm. The applications are similar, but may be built with different versions of the tools. Some of them are new; others have not been rebuilt for a year.

A DLL an application depends on may be deployed with the application, or may be installed as an .msi system-wide package. The latter is the case with Microsoft runtimes, required by MSVC-compiled binaries. Now I see a few options of deploying the MKL:

1. Deploy the DLLs with the application package. This is not an ideal option to me, as a typical application is less than 100MB in size, and the sum of MKL DLLs is about 150MB for ia32. So increasing the deployment size by a factor of 2.5 increases storage, network traffic and scheduler load, and opens a whole new can of fresh worms.

2. Compile MKL statically. Well, my DLL bloats up to 10MB in size, which is pretty affordable. Still, not the best solution because the copy of read-only MKL code sections will be loaded for every instance, and not shared by the system. This is mostly the case in the previous item as well.

3. Deploy MKL as we deploy e.g. the Visual C runtimes. Run an msi, and there it is for everyone to use! This would be almost ideal. What I am worrying about is what happens if there is new code built against a newer MKL. If we, for example, upgrade this common, per-server MKL install on a server from e.g. version 10 to 11, will it kill the apps built against version 10? In other words, are MKL releases backward compatible in DLL form?

4. Super-duper ideal solution: deploy MKL into the system side-by-side facility (SxS), so each application can request its own version of the DLLs. That would be ideal. Now tell me it is possible please! : )

Any thoughts and ideas are gladly accepted.

problems of using LAPACKE_dgesv in c++

I am using the LAPACKE_dgesv in visual studio Ultimate 2012 Version 11.0.61030.00 Update 4 with  Intel C++ 14.0 as

info = LAPACKE_dgesv(LAPACK_ROW_MAJOR, n, nrhs, *a, lda, ipiv, *b, ldb);

LAPACKE_dgesv works in debug mode, but crashes in release mode with the following error:

"Unhandled exception at 0x00007ff714da0888 in MyProg.exe: 0xC0000005: Access violation at location 0x00007ff714da0888."

It is hard to post sample code for my program. The command lines are supplied below. Are there any problems with how I invoke the MKL library? I appreciate any suggestions for this type of problem. Thanks!

Compile Command line: 

/Yu"stdafx.h" /GS /W3 /Gy /Zc:wchar_t /I"C:\Program Files (x86)\Intel\Composer XE 2013 SP1\mkl\include" /Zi /O2 /Fd"Release\vc110.pdb" /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_UNICODE" /D "UNICODE" /Qipo /Zc:forScope /Gd /Oi /MD /Fa"Release\" /EHsc /nologo /Fo"Release\" /Fp"Release\MyProg.pch" 

Linker command line:

/OUT:"D:\Release\MyProg.exe" /MANIFEST /NXCOMPAT /PDB:"D:\Release\MyProg.pdb" /DYNAMICBASE "mkl_intel_c.lib" "mkl_core.lib" "mkl_sequential.lib" "mkl_lapack95.lib" "kernel32.lib" "user32.lib" "gdi32.lib" "winspool.lib" "comdlg32.lib" "advapi32.lib" "shell32.lib" "ole32.lib" "oleaut32.lib" "uuid.lib" "odbc32.lib" "odbccp32.lib" /DEBUG /MACHINE:X86 /OPT:REF /SAFESEH /INCREMENTAL:NO /SUBSYSTEM:CONSOLE /MANIFESTUAC:"level='asInvoker' uiAccess='false'" /ManifestFile:"Release\MyProg.exe.intermediate.manifest" /OPT:ICF /NOLOGO /LIBPATH:"C:\Program Files (x86)\Intel\Composer XE 2013 SP1\mkl\lib\ia32" /TLBID:1

Do the MKL FFT Handler creation uses malloc??

Hi,

When we want to use the MKL FFT, we create a handler for it. Does this handler creation involve a call to malloc? If not, which method is used to allocate memory?

 

Thanks

sivaramakrishna


Is there a max num of threads for mkl dss

Hello, everyone.

I used MKL DSS to solve linear systems with OpenMP. When calling DSS with 32 threads, I got a run-time error. The debugging messages are below:

......

[New Thread 0x7fff86b5e700 (LWP 10279)]
[New Thread 0x7fff8675d700 (LWP 10280)]
[New Thread 0x7fff8635c700 (LWP 10281)]
[New Thread 0x7fff85f5b700 (LWP 10282)]
[New Thread 0x7fff85b5a700 (LWP 10283)]
[New Thread 0x7fff85759700 (LWP 10284)]
[New Thread 0x7fff85358700 (LWP 10285)]
[New Thread 0x7fff84f57700 (LWP 10286)]
[New Thread 0x7fff84b56700 (LWP 10287)]
[New Thread 0x7fff84755700 (LWP 10288)]
[New Thread 0x7fff7fafe700 (LWP 10289)]
[New Thread 0x7fff7f6fd700 (LWP 10290)]
[New Thread 0x7fff7f2fc700 (LWP 10291)]
[New Thread 0x7fff7eefb700 (LWP 10292)]
[New Thread 0x7fff7eafa700 (LWP 10293)]
[New Thread 0x7fff7e6f9700 (LWP 10294)]
[New Thread 0x7fff7e2f8700 (LWP 10295)]
[New Thread 0x7fff7def7700 (LWP 10296)]
[New Thread 0x7fff7daf6700 (LWP 10297)]
[New Thread 0x7fff7d6f5700 (LWP 10298)]
[New Thread 0x7fff7d2f4700 (LWP 10299)]
[New Thread 0x7fff7cef3700 (LWP 10300)]
[New Thread 0x7fff7caf2700 (LWP 10301)]
[New Thread 0x7fff7c6f1700 (LWP 10302)]
TAG:3

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff7cef3700 (LWP 10300)]
0x00007ffff6da6ca2 in mkl_pds_lp64_c_blkl_omp_pardiso ()
   from /home/***/intel/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_thread.so
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.149.el6.x86_64 libgcc-4.4.7-11.el6.x86_64 libgomp-4.4.7-11.el6.x86_64

"TAG:3" is a label printed by my program after dss_reorder, so something may be wrong with the call to dss_factor_*. However, when I run the same program with fewer threads, say 16, everything seems OK. No error occurs when calling pardiso with 32 threads either. What is going on?

Thanks,

wzmumu

Access violation with MKL_DSS

Hi there,

I found a strange problem when using MKL DSS, something related with "Access violation writing location", either with DSS_FACTOR_REAL or DSS_SOLVE_REAL. Problem seems to be computer dependent.

I prepared a small program with which the problem can be found.

 

In brief, we have a system of equations with neq=8, which I am trying to solve.

In some cases, like neq=8, when I try to solve the same system of equations twice, the problem appears.

 

Running:

Intel Visual Fortran Composer XE 2013 14.0.1.139 build 20131008

Windows 7/ Visual studio 2010

Already tried with Windows 8 / VS 2013 / Fortran 15.0.0.108 build 20147026 and the problem is the same.

 

Best regards,

Luis Alves

Can't get Linux evaluation copy

Putative on-line resources lead me in circles.  

Rep at reseller who gave me a quote is clueless about product.

Help

Ronald

 

 

Tough and evasive bug in MKL DSS solver

Here is a very small program that solves two linear equations using the MKL DSS interface to Pardiso. First, the test program:

program ptmkl
use mkl_dss
implicit none
TYPE (MKL_DSS_HANDLE) :: handle
INTEGER opt,dss_err
INTEGER, PARAMETER :: NEQ=2, NNZM=NEQ*NEQ
INTEGER :: rowIDX(NEQ+1) = [1,3,5]
INTEGER :: COL(NNZM) = [1,2, 1,2]
INTEGER :: i,j,k, n = NEQ, nnz = NNZM, perm(NEQ)
DOUBLE PRECISION :: A(NNZM) = [1d0, -1d-2,   -1d-2, 1d0]
DOUBLE PRECISION :: B(NEQ) = [1d0, 2d0], X(NEQ)

opt=MKL_DSS_DEFAULTS
dss_err = dss_create(handle, opt)
write(*,10)'Create ',dss_err
dss_err = dss_define_structure(handle,opt,rowIDX,n,n,COL,nnz)
write(*,10)'Define ',dss_err
dss_err = dss_reorder(handle,opt,perm)
write(*,10)'ReOrder',dss_err

dss_err = dss_factor_real(handle,opt,A)
write(*,10)'Factor ',dss_err
dss_err = dss_solve_real(handle,opt,B,1,X)
write(*,10)'Solve  ',dss_err

10 format(A7,2x,I4)
end program ptmkl

I compile this program with IFort 15.0 IA-32 using the command

ifort /Qmkl /traceback /MD dssbug.f90

When I then run the program repeatedly, it works correctly very often but, once in a while, aborts with a C0000005 or C0000374 error. To track the problem down, I ran the program inside Inspector XE 2015, and the screenshot is attached.

This is a shorter reproducer for the problems reported by another user, see https://software.intel.com/en-us/forums/topic/535430 .

Attachment: dssbug.jpg (150.74 KB)

Intel® Math Kernel Library 11.2 Update 1 is now available

Intel® Math Kernel Library (Intel® MKL) is a highly optimized, extensively threaded, and thread-safe library of mathematical functions for engineering, scientific, and financial applications that require maximum performance. Intel MKL 11.2 Update 1 packages are now ready for download. Intel MKL is available as part of the Intel® Parallel Studio XE 2015. Please visit the Intel® Software Evaluation Center to evaluate this product.

Intel® MKL 11.2 Update 1 Bug fixes

What's New in Intel® MKL 11.2 Update 1:

  • Introduced support for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) on Intel® Xeon® processors for Windows* and Linux* versions of Intel MKL. This is in addition to the current support for Intel® AVX-512 instructions for Intel® Many Integrated Core Architecture (Intel® MIC Architecture)
  • Introduced support for LAPACK version 3.5
  • Added support for Schur complement including getting explicit Schur complement matrix and solving the system through Schur complement
  • Deprecations: Intel® MKL Cluster Support for IA-32 is now deprecated, and support will be removed in the next major release of Intel MKL

Check Out the Release Notes
