Channel: Intel® Software - Intel® oneAPI Math Kernel Library & Intel® Math Kernel Library
Viewing all 3005 articles
Browse latest View live

Using multiple DFTI DESCRIPTOR (FFT in MKL)


Is it possible to create and commit several different DFTI descriptors and re-use them later? The FFTs of different sizes will be called many times, and creating and freeing a descriptor for every call seems inefficient. In other words, can the descriptors be created/committed once and then saved in some array?
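For what it's worth, the DFTI API is designed so that a descriptor is committed once and then reused across many compute calls, so caching one descriptor per transform size is a reasonable pattern. Below is a minimal sketch of such a cache; the `create` callback is a stub standing in for the DftiCreateDescriptor/DftiCommitDescriptor pair (the actual MKL calls are not shown, and all names here are hypothetical):

```c
#include <stddef.h>

/* Hypothetical slot for a committed plan; with real MKL, `handle` would be a
   DFTI_DESCRIPTOR_HANDLE produced by DftiCreateDescriptor + DftiCommitDescriptor. */
typedef struct { long fft_size; void *handle; } PlanSlot;

#define MAX_PLANS 16
static PlanSlot plans[MAX_PLANS];
static int plan_count = 0;

/* Look up a committed plan for this size, creating it on first use.
   `create` stands in for the create/commit pair. */
void *get_plan(long fft_size, void *(*create)(long)) {
    for (int i = 0; i < plan_count; i++)
        if (plans[i].fft_size == fft_size)
            return plans[i].handle;
    if (plan_count == MAX_PLANS) return NULL;  /* cache full */
    plans[plan_count].fft_size = fft_size;
    plans[plan_count].handle = create(fft_size);
    return plans[plan_count++].handle;
}
```

With real MKL, a shutdown pass over the cache would call DftiFreeDescriptor on each slot.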

 


Scalapack raise error under certain circumstance


Dear All,

      I am using Intel MPI + ifort + MKL to compile Quantum-Espresso 6.1. Everything works fine except invoking ScaLAPACK routines: calls to PDPOTRF may exit with a non-zero error code under certain circumstances. For example, with 2 nodes * 8 processors per node the program works, but with 4 nodes * 4 processors per node it fails. With I_MPI_DEBUG enabled, the failed case prints the following messages just before the call exits with code 970, while the working case prints no such messages:

[10#18754:18754@node09] MRAILI_Ext_sendq_send(): rail 0,vbuf 0x2676900, operation 2, size 12272, lkey 1879682311
[10#18754:18754@node09] MRAILI_Ext_sendq_send(): rail 0,vbuf 0x2675640, operation 2, size 12272, lkey 1879682311
[10#18754:18754@node09] MRAILI_Ext_sendq_send(): rail 0,vbuf 0x26742b8, operation 2, size 12272, lkey 1879682311
(the same message repeats for ~17 more vbuf addresses, all with size 12272)
[10#18754:18754@node09] MRAILI_Ext_sendq_send(): rail 0,vbuf 0x2675708, operation 2, size 2300, lkey 1879682311

         Could you suggest what the possible cause might be here? Thanks very much.

Feng

SVD hangs periodically


We are noticing that SVD, both dgesvd and dgesdd, hangs periodically. The call stack terminates in a call to one of those functions. Killing the process and rerunning alleviates the problem.

We suspect there's some form of locking going on in SVD somewhere. Something we've tried is the following:

  • start a new thread to compute SVD
  • wait until finish or timeout
  • if timeout, try again
  • if timeout, throw

What was interesting is that the first thread would block (just like before we put the threading in), but the second thread wouldn't even start. There are a limited number of resources and actions that can stop a new thread from starting. One of them is that every new thread must call into the DllMain of every DLL in the process; if one of those DLLs is doing something non-standard there (such as holding a lock), you are dead.

We suspect there's a problem somewhere between MKL and, possibly, another library in our stack. We generally do not mess with DllMain or thread registration.

How to add MKL with TBB threading to application that uses TBB under Visual Studio?


I have installed IntelSWTools 2017.4:
c:\Program Files (x86)\IntelSWTools\compilers_and_libraries\windows

My application uses TBB:
Release(/MD): tbb\lib\intel64\vc12\tbb.lib       -> redist\intel64\tbb\vc12\tbb.dll
Debug(/MDd):  tbb\lib\intel64\vc12\tbb_debug.lib -> redist\intel64\tbb\vc12\tbb_debug.dll

How can I add MKL with TBB threading (so that there is only one TBB instance) to an application that already uses TBB?

The problem is that mkl_tbb_thread_dll.dll is (as I understand it) linked against:
tbb\lib\intel64\vc_mt\tbb.lib -> redist\intel64\tbb\vc_mt\tbb.dll
which is different from what my application uses.

Can the redist\intel64\tbb\vc_mt\tbb.dll used by mkl_tbb_thread_dll.dll be replaced with redist\intel64\tbb\vc12\tbb.dll?
If yes, how should the Debug build be handled (somehow rename redist\intel64\tbb\vc12\tbb_debug.dll to tbb.dll)?

 

Are DNN functions thread safe?


It seems that dnnExecute_F32 cannot be called from multiple threads. Is that correct?

How to store A to get the fastest performance of A^T*x using cblas_dgemv?


Hello,

I am using cblas_dgemv to compute A^T*x. The size of the matrix A is about 10000 rows x 20000 columns. I am storing A in row-major format, i.e. A(i,j+1) is stored next to A(i,j).

My questions are as follows (in order to get fastest execution time):

  1. What is the better way to store A: row-major or column-major format? Does it matter?
  2. Is it better to store A and set TransA=CblasTrans, or to store A^T directly and use TransA=CblasNoTrans?
  3. If the answer to #2 is to use A^T directly, is it better to store A^T in row-major or column-major format?

Another related question has to do with byte alignment. Say we store A in row-major format, with m rows and n columns. I have read that, when multithreading with OpenMP, it helps avoid false sharing if each row of A starts at an aligned boundary. A common way to do that is to pad the number of columns so it is divisible by 8 (64 bytes for 8 doubles), i.e. LDA = n + (8 - n%8). Does doing this help dgemv run faster?
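A small caveat on that padding formula: LDA = n + (8 - n%8) over-pads by a full 8 columns when n is already a multiple of 8. A form that handles that case:

```c
/* Round a column count up to the next multiple of 8 doubles (64 bytes), so
   each row of a row-major matrix can start on a cache-line boundary --
   provided the base pointer itself is 64-byte aligned (e.g. via mkl_malloc). */
int padded_lda(int n) {
    return n + (8 - n % 8) % 8;
}
```

Whether this measurably speeds up dgemv is workload-dependent; gemv is typically memory-bandwidth bound, so alignment effects tend to be modest.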

Finally, for my calculation I need alpha=1 and beta=0. Does cblas_dgemv optimize for this trivial case, or does it perform the extra, unnecessary computations anyway?

Thanks in advance for any help.

Warnings from libiomp5.a when linking on Macintosh with Xcode 8


We are using MKL and linking with libiomp5.a. Since starting to use Xcode 8.3.3 to build, we have been getting a large number of warnings like this:

:-1: warning: pointer not aligned at address 0x11132A942 (anon + 112 from ...libs/libiomp5.a(iomp.o))

Does anyone know how to silence this?

Thanks, John Weeks

Regarding cluster_sparse_solver


I am Mehdi and this is my first time using this forum.

I need to use cluster_sparse_solver in my Fortran finite element program. Because the number of degrees of freedom in my system is very high (about 10^6), the number of nonzero entries in the stiffness matrix (A in Ax=B) is so large that I cannot store it in a default INTEGER(4) and must use INTEGER(8). Therefore the parameter ia (the row index array of the sparse matrix) must be INTEGER(8).

In this situation, how should I compile my program? I have tried both the 4-byte and 8-byte integer libraries when compiling, and neither works. Should I make all of the integers in my program INTEGER(8)? If ia is INTEGER(8) and ja is INTEGER(4), is it still possible to compile the program?

Please help me. I can provide any more information you may need.

Bests

Mehdi
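A note that may help frame the question: MKL provides two interface layers, LP64 (32-bit integers) and ILP64 (64-bit integers), and to my knowledge a single call cannot mix kinds, so an INTEGER(8) ia alongside an INTEGER(4) ja is not expected to work. The threshold that forces ILP64 is simply whether any count or index exceeds the 32-bit range:

```c
#include <limits.h>

/* Returns 1 if a count/index value fits in MKL's LP64 (32-bit) integer
   interface, 0 if the ILP64 (64-bit) interface is required. */
int fits_lp64(long long count) {
    return count >= INT_MIN && count <= INT_MAX;
}
```

Linking against ILP64 also generally requires compiling the Fortran sources with 64-bit default integers (e.g. -i8 with ifort) so that every integer argument matches.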


Pardiso and pardiso_64


Hello,

I have a Fortran program for solving a flow-field (Stokes flow) problem using FEM. At first I used the PARDISO solver for the coupled problem, and I was quite happy with it, because it is faster than MATLAB (I used to work with MATLAB and I am new to ifort). Below I have summarized the code related to PARDISO:

!----------Matrix solution PARDISO--------------------------------------------
!          [M ]*{X} = {RHS}
INTEGER                            :: PT(64), MTYPE, IPARM(64)
INTEGER                            :: MAXFCT, MNUM, PHASE, N, NRHS, MSGLVL, ERROR
INTEGER,ALLOCATABLE                :: PERM(:)
REAL(8),ALLOCATABLE                :: X(:)
INTEGER,ALLOCATABLE                :: ja(:), ia(:)
REAL(8),ALLOCATABLE,DIMENSION(:)   :: M
! ABOVE FOR INTRODUCING THE PARAMETERS
MTYPE = 11
CALL PARDISOINIT (PT, MTYPE, IPARM)
ALLOCATE( PERM(dof) , STAT = ISTAT )  ! dof : degrees of freedom
ALLOCATE( X(dof)    , STAT = ISTAT )
MAXFCT = 1
MNUM   = 1
PHASE  = 13
N      = dof
PERM   = 1
NRHS   = 1
MSGLVL = 0
CALL CPU_TIME(start)
CALL PARDISO (PT, MAXFCT, MNUM, MTYPE, PHASE, N, M_SPARSE, ia, ja, PERM, NRHS, IPARM, MSGLVL, RHS, X, ERROR)
CALL CPU_TIME(finish)
write(*,*) (finish-start)*1000 , 'msec' , ERROR

This code works perfectly, without any problem.

My problem arises when I want to use PARDISO_64. According to the documentation, all of the integer inputs and outputs should be INTEGER(8), therefore:

!----------Matrix solution PARDISO--------------------------------------------
!  MX = RHS
INTEGER(8)                         :: PT(64), MTYPE, IPARM(64)
INTEGER(8)                         :: MAXFCT, MNUM, PHASE, N, NRHS, MSGLVL, ERROR
INTEGER(8),ALLOCATABLE             :: PERM(:)
REAL(8),ALLOCATABLE                :: X(:)
INTEGER(8),ALLOCATABLE             :: ja(:), ia(:)
REAL(8),ALLOCATABLE,DIMENSION(:)   :: M
! ABOVE FOR INTRODUCING THE PARAMETERS
MTYPE = 11
CALL PARDISOINIT (PT, MTYPE, IPARM)
ALLOCATE( PERM(dof) , STAT = ISTAT )  ! dof : degrees of freedom
ALLOCATE( X(dof)    , STAT = ISTAT )
MAXFCT = 1
MNUM   = 1
PHASE  = 13
N      = dof
PERM   = 1
NRHS   = 1
MSGLVL = 0
CALL CPU_TIME(start)
CALL PARDISO_64 (PT, MAXFCT, MNUM, MTYPE, PHASE, N, M_SPARSE, ia, ja, PERM, NRHS, IPARM, MSGLVL, RHS, X, ERROR)
CALL CPU_TIME(finish)
write(*,*) (finish-start)*1000 , 'msec' , ERROR

but this code does not work; it gives the following error:

forrtl: severe (157): Program Exception - access violation

I use the following to compile my code:

ifort USEFULLS.f90 CONSTANTS.f90 PRE_PROCESSOR_3D.f90 DATATYPES.f90 VEL_SUBS.f90 SPARSE_SUB.f90 main00.f90 -o t1 -Qmkl -heap-arrays

 

USEFULLS.f90, CONSTANTS.f90, PRE_PROCESSOR_3D.f90, DATATYPES.f90, VEL_SUBS.f90, and SPARSE_SUB.f90 are modules I developed, and main00.f90 is the main program. I think (though I am not sure) that I need some additional options on the compile command line.

I have a similar problem with cluster_sparse_solver, which works well, while cluster_sparse_solver_64 does not!

Best regards

Mehdi

 

cluster_sparse_solver library and path setting in Ubuntu



Hello,

I have a question regarding compiling a program containing cluster_sparse_solver.

I have developed a program for 3D flow-field calculation using the finite element method, using Intel Parallel Studio on my laptop, which runs Windows. I compiled the program with the following steps:

 

step1 :

set path=C:\MyIntel\IntelSWTools\compilers_and_libraries_2017.4.210\windows\redist\intel64\mkl; C:\MyIntel\IntelSWTools\compilers_and_libraries_2017.4.210\windows\redist\intel64\compiler;C:\MyIntel\IntelSWTools\compilers_and_libraries_2017.4.210\windows\redist\intel64\tbb\vc_mt;%path%

 

step 2 :

set lib=C:\MyIntel\IntelSWTools\compilers_and_libraries_2017.4.210\windows\mkl\lib\intel64;%lib%

 

For the lines above, I used: mkl_link_tool mpiifort C:\FORTRAN\Programmes\MPI\main01.f90

 

Then, after setting the path and libraries, I used the following:

 

step 3:

mpiifort USEFULLS.f90 CONSTANTS.f90 PRE_PROCESSOR_3D.f90 DATATYPES.f90 VEL_SUBS.f90 SPARSE_SUB.f90 -I"C:\MyIntel\IntelSWTools\compilers_and_libraries_2017.4.210\windows\mkl\include" "parallel01.f90" mkl_intel_lp64.lib mkl_intel_thread.lib mkl_core.lib mkl_blacs_intelmpi_lp64.lib impi.lib libiomp5md.lib -o Pstatic -heap-arrays

 

It works perfectly, without any problem. I should add that I have also compiled the program using dynamic libraries. (I used the online link advisor for the line in step 3: https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/)

 

But now I want to use the program on a better computer, which has Intel Parallel Studio only under Linux. I used the online link advisor again and then used the following to compile my code:

 

mpiifort USEFULLS.f90 CONSTANTS.f90 PRE_PROCESSOR_3D.f90 DATATYPES.f90 VEL_SUBS.f90 SPARSE_SUB.f90 -I/opt/intel/compilers_and_libraries_2017.4.196/linux/mkl/include "parallel01.f90" -Wl,--start-group /opt/intel/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64/libmkl_intel_lp64.a /opt/intel/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64/libmkl_intel_thread.a /opt/intel/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64/libmkl_core.a /opt/intel/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.a -Wl,--end-group -liomp5 -lpthread -lm -ldl -o Pstatic -heap-arrays

 

But I do not know:

a – How do I set the path and libraries on a Linux system (the Linux equivalent of steps 1 and 2)?

b – How do I find the suitable link lines and libraries on Linux?

c – Is there anything like mkl_link_tool for Linux, and if yes, where is it?

 

At the moment, it gives the following errors:

 

parallel01.f90(17): error #7002: Error in opening the compiled module file. Check INCLUDE paths. [MKL_CLUSTER_SPARSE_SOLVER]
USE MKL_CLUSTER_SPARSE_SOLVER
------------^
parallel01.f90(61): error #6457: This derived type name has not been declared. [MKL_CLUSTER_SPARSE_SOLVER_HANDLE]
TYPE(MKL_CLUSTER_SPARSE_SOLVER_HANDLE) :: CPT(64)
-------------^
parallel01.f90(268): error #6404: This name does not have a type, and must have an explicit type. [CPT]
CPT(:)%dummy = 0
--------^
parallel01.f90(268): error #6514: Substring or array slice notation requires CHARACTER type or array. [CPT]
CPT(:)%dummy = 0
--------^
parallel01.f90(268): error #6460: This is not a field name that is defined in the encompassing structure. [DUMMY]
CPT(:)%dummy = 0
---------------^
parallel01.f90(268): error #6158: The structure-name is invalid or is missing. [CPT]
CPT(:)%dummy = 0
--------^
compilation aborted for parallel01.f90 (code 1)

 

Best regards

Mehdi

Pardiso iparm(30) not returning equation number correctly


I am using PARDISO with the 2017 Update 2 Intel Fortran compiler in VS 2015, and I find that with mtype=2 (real symmetric positive definite matrix), if my matrix has a singularity, iparm(30) always returns 1 rather than the location of the equation where the singularity occurs. In the 2016 version of the compiler this worked correctly. Has something changed, or is this a bug?

Refer to the Pardiso documentation:
If Intel MKL PARDISO detects zero or negative pivot for mtype=2 or mtype=4 matrix types, the factorization is stopped, Intel MKL PARDISO returns immediately with an error = -4, and iparm(30) reports the number of the equation where the first zero or negative pivot is detected.
 

MKL Memory Allocator


Hello,

I am looking for information regarding the Memory Allocator embedded in MKL.

We use MKL_malloc/MKL_free intensively in a project and plan to add a memory manager on top of them. Our goal is to reuse aligned memory without freeing it, to have per-thread memory pools, and to be able to fine-tune those pools. (We are indeed challenged by memory consumption.)

The mkl_disable_fast_mm page mentions a per-thread memory pool but gives no further detail. Does anyone have more information (lock-free malloc with per-thread heaps, monitoring the available memory in the pool, etc.)?

We are also considering deactivating the MKL memory manager and relying on the Intel TBB malloc implementation instead (either by redefining the memory functions with i_malloc, or with an std::vector-plus-custom-allocator style implementation). Does anyone have feedback on such an implementation?
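As one data point for the "reuse aligned memory without freeing it" idea: the core of such a manager can be as small as a size-class free list over aligned allocations. This is a hypothetical sketch (the names and bucketing scheme are mine, not MKL's or TBB's), with no thread safety; making `freelist` thread-local would give the per-thread pools mentioned above:

```c
#include <stdlib.h>

/* Freed blocks are kept on a free list bucketed by size class instead of
   being returned to the OS, so later same-size requests reuse them. */
#define NBUCKETS 8
typedef struct Block { struct Block *next; } Block;
static Block *freelist[NBUCKETS];

static size_t bucket_size(int b) { return (size_t)4096 << b; }

static int bucket_for(size_t n) {
    for (int b = 0; b < NBUCKETS; b++)
        if (n <= bucket_size(b)) return b;
    return -1;  /* too large for the pool */
}

void *pool_alloc(size_t n) {
    int b = bucket_for(n);
    if (b < 0)  /* fall back to a plain 64-byte-aligned allocation */
        return aligned_alloc(64, (n + 63) / 64 * 64);
    if (freelist[b]) {  /* reuse without touching the OS */
        Block *blk = freelist[b];
        freelist[b] = blk->next;
        return blk;
    }
    return aligned_alloc(64, bucket_size(b));
}

void pool_free(void *p, size_t n) {
    int b = bucket_for(n);
    if (b < 0) { free(p); return; }
    Block *blk = (Block *)p;
    blk->next = freelist[b];
    freelist[b] = blk;
}
```

The obvious trade-off is the one the question raises: pooled blocks are never returned to the OS, so peak memory consumption grows monotonically unless a trim pass is added.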

Thank you
Arnaud

cluster_sparse_solver library and path setting in LINUX



Hello,

I have a question regarding compiling a program containing cluster_sparse_solver.

I have developed a program for 3D flow-field calculation using the finite element method, using Intel Parallel Studio on my laptop, which runs Windows. I compiled the program with the following steps:

 

step1 :

set path=C:\MyIntel\IntelSWTools\compilers_and_libraries_2017.4.210\windows\redist\intel64\mkl; C:\MyIntel\IntelSWTools\compilers_and_libraries_2017.4.210\windows\redist\intel64\compiler;C:\MyIntel\IntelSWTools\compilers_and_libraries_2017.4.210\windows\redist\intel64\tbb\vc_mt;%path%

 

step 2 :

set lib=C:\MyIntel\IntelSWTools\compilers_and_libraries_2017.4.210\windows\mkl\lib\intel64;%lib%

 

For the lines above, I used: mkl_link_tool mpiifort C:\FORTRAN\Programmes\MPI\main01.f90

 

Then, after setting the path and libraries, I used the following:

 

step 3:

mpiifort USEFULLS.f90 CONSTANTS.f90 PRE_PROCESSOR_3D.f90 DATATYPES.f90 VEL_SUBS.f90 SPARSE_SUB.f90 -I"C:\MyIntel\IntelSWTools\compilers_and_libraries_2017.4.210\windows\mkl\include" "parallel01.f90" mkl_intel_lp64.lib mkl_intel_thread.lib mkl_core.lib mkl_blacs_intelmpi_lp64.lib impi.lib libiomp5md.lib -o Pstatic -heap-arrays

 

It works perfectly, without any problem. I should add that I have also compiled the program using dynamic libraries. (I used the online link advisor for the line in step 3: https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/)

 

But now I want to use the program on a better computer, which has Intel Parallel Studio only under Linux. I used the online link advisor again and then used the following to compile my code:

 

mpiifort USEFULLS.f90 CONSTANTS.f90 PRE_PROCESSOR_3D.f90 DATATYPES.f90 VEL_SUBS.f90 SPARSE_SUB.f90 -I/opt/intel/compilers_and_libraries_2017.4.196/linux/mkl/include "parallel01.f90" -Wl,--start-group /opt/intel/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64/libmkl_intel_lp64.a /opt/intel/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64/libmkl_intel_thread.a /opt/intel/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64/libmkl_core.a /opt/intel/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.a -Wl,--end-group -liomp5 -lpthread -lm -ldl -o Pstatic -heap-arrays

 

But I do not know:

a – How do I set the path and libraries on a Linux system (the Linux equivalent of steps 1 and 2)?

b – How do I find the suitable link lines and libraries on Linux?

c – I have found mkl_link_tool in the Linux installation directory, but it does not work. Why?

 

At the moment, it gives the following errors:

 

parallel01.f90(17): error #7002: Error in opening the compiled module file. Check INCLUDE paths. [MKL_CLUSTER_SPARSE_SOLVER]
USE MKL_CLUSTER_SPARSE_SOLVER
------------^
parallel01.f90(61): error #6457: This derived type name has not been declared. [MKL_CLUSTER_SPARSE_SOLVER_HANDLE]
TYPE(MKL_CLUSTER_SPARSE_SOLVER_HANDLE) :: CPT(64)
-------------^
parallel01.f90(268): error #6404: This name does not have a type, and must have an explicit type. [CPT]
CPT(:)%dummy = 0
--------^
parallel01.f90(268): error #6514: Substring or array slice notation requires CHARACTER type or array. [CPT]
CPT(:)%dummy = 0
--------^
parallel01.f90(268): error #6460: This is not a field name that is defined in the encompassing structure. [DUMMY]
CPT(:)%dummy = 0
---------------^
parallel01.f90(268): error #6158: The structure-name is invalid or is missing. [CPT]
CPT(:)%dummy = 0
--------^
compilation aborted for parallel01.f90 (code 1)

 

Best regards

Mehdi

MKL_SINGLE_PATH_ENABLE


Hello,

We have legacy code that calls:

mkl_enable_instructions(MKL_SINGLE_PATH_ENABLE);

MKL_SINGLE_PATH_ENABLE is not in the documentation (anymore ?).

In mkl_service.h, the defines are:

#define  MKL_ENABLE_SSE4_2          0
#define  MKL_ENABLE_AVX             1
#define  MKL_ENABLE_AVX2            2
#define  MKL_ENABLE_AVX512_MIC      3
#define  MKL_ENABLE_AVX512          4
#define  MKL_ENABLE_AVX512_MIC_E1   5
#define  MKL_SINGLE_PATH_ENABLE     0x0600

Does anyone know whether MKL_SINGLE_PATH_ENABLE is still honored in MKL 2017 (it is still present in mkl_service.h)? Can we consider it the most restrictive mode?

Thank you.

Arnaud

Cannot use cluster_sparse_solver in Linux!



Hi everyone,

 

I have developed a code which uses cluster_sparse_solver. I can compile it (statically) on my PC, which runs Windows 10.

But I cannot compile it on Linux!

 

I use the following commands for compilation:

 

$MKLPATH=$MKLROOT/lib/intel64

$MKLINCLUDE=$MKLROOT/include

 

and then

 

$mpiifort USEFULLS.f90 CONSTANTS.f90 PRE_PROCESSOR_3D.f90 DATATYPES.f90 VEL_SUBS.f90 SPARSE_SUB.f90 -L$MKLPATH -I$MKLINCLUDE/ -I$MKLINCLUDE/intel64/lp64 parallel00.f90 -Wl,--start-group $MKLROOT/lib/intel64/libmkl_intel_lp64.a $MKLROOT/lib/intel64/libmkl_intel_thread.a $MKLROOT/lib/intel64/libmkl_core.a $MKLROOT/lib/intel64/libmkl_blacs_intelmpi_lp64.a -Wl,--end-group -liomp5 -lpthread -lm -ldl -o t1

 

But, I see the following error message:

 

parallel00.f90(17): error #7002: Error in opening the compiled module file. Check INCLUDE paths. [MKL_CLUSTER_SPARSE_SOLVER]
USE MKL_CLUSTER_SPARSE_SOLVER
------------^
parallel00.f90(61): error #6457: This derived type name has not been declared. [MKL_CLUSTER_SPARSE_SOLVER_HANDLE]
TYPE(MKL_CLUSTER_SPARSE_SOLVER_HANDLE) :: PT(64)
-------------^
parallel00.f90(268): error #6404: This name does not have a type, and must have an explicit type. [PT]
PT(:)%dummy = 0
--------^
parallel00.f90(268): error #6514: Substring or array slice notation requires CHARACTER type or array. [PT]
PT(:)%dummy = 0
--------^
parallel00.f90(268): error #6460: This is not a field name that is defined in the encompassing structure. [DUMMY]
PT(:)%dummy = 0
---------------^
parallel00.f90(268): error #6158: The structure-name is invalid or is missing. [PT]
PT(:)%dummy = 0
--------^
compilation aborted for parallel00.f90 (code 1)

 

Does anyone have a suggestion to solve this problem? Should I compile MKL_CLUSTER_SPARSE_SOLVER.f90 in the include directory, or is that not necessary? (I should say that even that compilation is not possible on my Linux system, although I can do it on Windows.)

 

Best regards

Mehdi


Access violation error while using dgesvd from C on Visual Studio


Hello, while running the following code on Visual Studio 2015:

 

#include <thread>
#include <mkl.h>
#include <random>
#include <ctime>

const int MatrixLayout = LAPACK_COL_MAJOR;

int GetIndex(int i, int j, int m, int n);
void Initialize2DArray( double** matrix, int rows, int columns );
void PopulateRandMatrix( double **& matrix, int m, int n );
/*Wraps inputs intended for svdcmp to those accepted by the MKL library.*/
int wrapperForSVD( double **u, int m, int n, double *w, double **v );
#define M 500
#define N 400

int main() {
   int m = M;
   int n = N;
   double** A = new double*[m];
   double* w = new double[(m>n)?n:m];
   double** V = new double*[n];

   Initialize2DArray( A, m, n );
   Initialize2DArray( V, n, n );

   PopulateRandMatrix( A, m, n );

   int getReturn = wrapperForSVD( A, m, n, w, V );

}

/*Wraps inputs intended for svdcmp to those accepted by the MKL library.*/
int wrapperForSVD( double **u, int m, int n, double *w, double **v )
{
   mkl_verbose( 1 );
   const unsigned int NumberOfThreads = std::thread::hardware_concurrency() > 0 ? (int)std::thread::hardware_concurrency() : 4;
   //Return only variable
   const int ReturnError = 1;
   const char JobUVt = 'A';
   lapack_int lda = (MatrixLayout == LAPACK_COL_MAJOR) ? m : n;
   lapack_int ldu = m;
   lapack_int ldvt = n;

   //save old thread value and set threads locally in case of future omp threading of application
   int oldThreadNumber = mkl_set_num_threads_local( NumberOfThreads );


   double* aOneDArray = (double*)malloc( m*n * sizeof( double ) );//new double[m*n];
   if (!aOneDArray)
      return ReturnError;

   //convert 2d matrix to 1d array
   for (int i = 0; i < m; i++)
      for (int j = 0; j < n; j++)
         aOneDArray[GetIndex( i, j, m, n )] = u[i][j];

   double * uOneDArray = (double*)malloc( ldu*m * sizeof( double ) );//new double[ldu*m];
   double * vOneDArray = (double*)malloc( ldvt*n * sizeof( double ) );//new double[ldvt*n];
   double * superb = (double*)malloc( sizeof( double )*(m > n) ? n-2 : m-2 );//new double[(m>n)?n:m];
   if (!uOneDArray || !vOneDArray || !superb)
      return ReturnError;

   int testFailConvergence = LAPACKE_dgesvd( MatrixLayout, JobUVt, JobUVt, m, n, aOneDArray, lda, w, uOneDArray, ldu, vOneDArray, ldvt, superb );


   //if matrix converged
   if (testFailConvergence == 0) {
      //convert 1d arrays to 2d arrays
      for (int i = 0; i < m; i++)
         for (int j = 0; j < n; j++)
            u[i][j] = uOneDArray[GetIndex( i, j, m, n )];
      int smallerOfMN = (m < n) ? m : n;
      for (int i = 0; i < smallerOfMN; i++)
         for (int j = 0; j < smallerOfMN; j++)
            v[j][i] = vOneDArray[GetIndex( i, j, smallerOfMN, smallerOfMN )];
   }
   else
      testFailConvergence = ReturnError;

   free( aOneDArray ); //delete[] oneDArray;
   free( uOneDArray ); //delete[] uOneDArray;
   free( vOneDArray ); //delete[] vOneDArray;
   free( superb ); //delete[] superb;

   //reset thread count
   mkl_set_num_threads_local( oldThreadNumber );

   return testFailConvergence;

}


//Maps the correct index from a 2d array to a 1d array
//NOTE: Function is dependent upon MatrixLayout and will return the correct
//layout for both row major and column major re-mapping
int GetIndex(int i, int j, int m, int n)
{
   if (MatrixLayout == LAPACK_ROW_MAJOR)
      return (i*n) + j;
   else
      return (j*m) + i;
}

void Initialize2DArray(double ** matrix, int rows, int columns)
{
   for (int i = 0; i < rows; i++)
      matrix[i] = new double[columns];
   return;
}

void PopulateRandMatrix(double **& matrix, int m, int n)
{

   double** old = nullptr;
   std::srand(std::time(NULL));
   if (matrix != nullptr)
      old = matrix;
   matrix = new double*[m];

   for (int i = 0; i < m; i++) {
      if (old != nullptr)
         delete[] old[i];
      matrix[i] = new double[n];
      for (int j = 0; j < n; j++) {
         double randOut = std::rand();
         matrix[i][j] = randOut;
      }
   }
}

I get the following error:

Exception thrown at 0x013A1C0B in DebugSVD.exe: 0xC0000005: Access violation writing location 0x00AC1000.

While the error message tells us this is an access violation, it is virtually useless for determining where the violation stems from. Visual Studio can only show me disassembly, so debugging that way is impossible as well.

The program outputs the following (because mkl_verbose is set to one) before throwing the error:

MKL_VERBOSE Intel(R) MKL 2017.0 Update 3 Product build 20170413 for 32-bit Intel(R) Advanced Vector Extensions 2 (Intel(R) AVX2) enabled processors, Win 2.30GHz intel_thread
MKL_VERBOSE DGESVD(A,A,500,400,00B60040,500,0055E008,00CF0040,500,00EE0040,400,003FF524,-1,0) 32.05ms CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:8
MKL_VERBOSE DGESVD(A,A,500,400,00B60040,500,0055E008,00CF0040,500,00EE0040,400,01231080,46000,0) 3.98s CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:8

 

My settings are as follows: MKL is enabled in the Visual Studio project and set to Parallel. The only other change is that I copied libiomp5md.dll into the project folder.

I appreciate any help debugging my problem! Thank you so much.
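One hedged observation on the code above (a possibility, not a confirmed diagnosis): the `superb` allocation contains an operator-precedence trap. Because `*` binds tighter than `?:`, `malloc( sizeof( double )*(m > n) ? n-2 : m-2 )` requests only n-2 or m-2 bytes, while LAPACKE_dgesvd's `superb` array needs min(m,n)-1 doubles. The two sizes can be compared directly:

```c
#include <stddef.h>

/* What malloc actually receives from
       sizeof(double)*(m > n) ? n-2 : m-2
   -- the condition is sizeof(double)*(m>n), not (m>n) alone. */
size_t superb_bytes_as_written(int m, int n) {
    return sizeof(double) * (m > n) ? n - 2 : m - 2;
}

/* What the call needs: dgesvd's superb array holds min(m,n)-1 doubles. */
size_t superb_bytes_needed(int m, int n) {
    int mn = m < n ? m : n;
    return (size_t)(mn - 1) * sizeof(double);
}
```

For m=500, n=400 that is 398 bytes requested versus 3192 needed, which would let DGESVD write past the buffer and could produce exactly this kind of access violation.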

Pardiso generates wrong result for a small test matrix


Hi, 

I've written a test program to get familiar with PARDISO. However, I find that, compared with the result from Eigen, PARDISO spits out a totally incorrect answer. I have no idea what went wrong; please help. The code is attached. The compile command I used is:

icpc -I ~/Lib/Eigen -std=c++11 -mkl=parallel -qopenmp -O3 -xCORE-AVX2 PTEST.cpp -o PardisoTest

 

Best,

Izzy 

Attachment: PTEST.cpp (2.8 KB)

Is DFTI_NUMBER_OF_TRANSFORMS data-parallel?


If I set DFTI_NUMBER_OF_TRANSFORMS to 4 on an AVX computer, or 8 on an AVX-512 KNL, will MKL's DftiComputeForward/Backward compute the FFTs of similar but independent, non-overlapping arrays simultaneously in SIMD, or sequentially one after the other?

Thanks

cluster_sparse_solver and cluster_sparse_solver_64


Hello all,

I have developed a code for 3D fluid flow using a coupled FEM method. At first I used cluster_sparse_solver. To compile it, I used the following, from the Intel link advisor:

mpiifort USEFULLS.f90 CONSTANTS.f90 PRE_PROCESSOR_3D.f90 DATATYPES.f90 VEL_SUBS.f90 SPARSE_SUB.f90  parallel00.f90 -I"%MKLROOT%"\include  -heap-arrays mkl_intel_lp64_dll.lib mkl_intel_thread_dll.lib mkl_core_dll.lib mkl_blacs_lp64_dll.lib impi.lib libiomp5md.lib -o t1

When I increase the mesh count, the number of nonzero components of the A matrix (Ax = RHS) exceeds 500000000. So, following the advice of Intel's online documentation, I use cluster_sparse_solver_64. This time all of the integer input parameters are INTEGER(8). To compile, I do the following (again based on the link advisor):

mpiifort USEFULLS.f90 CONSTANTS.f90 PRE_PROCESSOR_3D.f90 DATATYPES.f90 VEL_SUBS.f90 SPARSE_SUB.f90  parallel01.f90  /4I8 -I"%MKLROOT%"\include  mkl_intel_ilp64.lib mkl_intel_thread.lib mkl_core.lib mkl_blacs_intelmpi_ilp64.lib impi.lib libiomp5md.lib  -o t1

But, I see the following error:

parallel01.f90(281): error #6285: There is no matching specific subroutine for this generic subroutine call.   [CLUSTER_SPARSE_SOLVER_64]

Can someone please help me? I had a similar problem with pardiso and pardiso_64, but in that case adding -i8 to the compile command solved it. For cluster_sparse_solver it seems to be more complicated.

Best regards

Mehdi

 

 

 

softmax


Does MKL support the popular DNN operation called softmax?

I cannot find any suitable function.
