MKL Data fitting, log-linear interpolation

July 24, 2013, 1:14 am

Latest and popular articles on Intel Technologies

Hi,

A log-linear interpolation can be computed with the MKL Data Fitting component by applying it to the log-scaled values and then applying the exponential function to the result of dfdInterpolate1D.

What about the integration? One has to apply the exponential function on each integration segment. My first idea was to use the callback mechanism of dfdIntegrateEx1D.

If I understand correctly, I have to implement the integration myself. That is not a big deal, but the performance of the integration part then seems to drop to that of managed code (I use a .NET wrapper for the MKL functionality, and the callback method is managed code as well).

Kind regards

Markus Wendt
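
A minimal sketch of a closed-form alternative to the callback, assuming strictly positive values: if log y is linear on a segment [x0, x1], then y(x) = y0*exp(b*(x - x0)) with b = ln(y1/y0)/(x1 - x0), so the segment integral is (y1 - y0)*(x1 - x0)/ln(y1/y0). The function names below are illustrative and not part of the MKL API; Python is used only to keep the sketch short.

```python
import math

def loglinear_interp(xs, ys, x):
    """Interpolate y(x), assuming log(y) is linear between breakpoints."""
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            # Linear interpolation in log space, then back with exp.
            return math.exp((1.0 - t) * math.log(y0) + t * math.log(y1))
    raise ValueError("x outside the breakpoint range")

def loglinear_integral(xs, ys):
    """Integrate the log-linear interpolant over [xs[0], xs[-1]] in closed form."""
    total = 0.0
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        log_ratio = math.log(y1 / y0)
        if abs(log_ratio) < 1e-12:
            # Flat segment: the exponential degenerates to a constant.
            total += y0 * (x1 - x0)
        else:
            total += (y1 - y0) * (x1 - x0) / log_ratio
    return total
```

Since each segment needs one logarithm and a few multiplications, the per-segment work is small even in managed code.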

↧

bug in MKL 11.0update5 DGESVG

July 24, 2013, 10:14 am

The same code works fine in 11.0 update 1 but returns an error (result = -13) in update 5.

Linux version CentOS 6.4
$ uname -a
Linux 2.6.32-358.6.2.el6.x86_64 #1 SMP Thu May 16 20:59:36 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

code:

dgesvd(v_job, u_job, &M, &N, data_, &LDA, s->Data(), V->Data(), &V_stride, U->Data(), &U_stride, p_work, &l_work, &result);

data:

u_job char * 0x4d4fea "N"
v_job char * 0x4d4fea "N"
M KaldiBlasInt 1
N KaldiBlasInt 10
LDA KaldiBlasInt 2
V_stride KaldiBlasInt 2
U_stride KaldiBlasInt 2

data_ double [20] 0x741300
data_[0] double 0.78239572048187256
data_[1] double 1.0829823019173015e-312
data_[2] double -0.50321561098098755
data_[3] double 1.5021374629402668
data_[4] double 2.4055604934692383
data_[5] double 1.0829822489929895e-312
data_[6] double -0.79770100116729736
data_[7] double 0
data_[8] double 0.25807684659957886
data_[9] double 0
data_[10] double 1.0628244876861572
data_[11] double 0
data_[12] double 0.30530369281768799
data_[13] double 0
data_[14] double 0.82724034786224365
data_[15] double 0
data_[16] double -0.49196150898933411
data_[17] double 0
data_[18] double -0.21408705413341522
data_[19] double -0.10297946631908417

l_work KaldiBlasInt 5

p_work double * 0x741460

result KaldiBlasInt -13

↧

weird results from MKL FFT

July 24, 2013, 10:58 am

I am trying to perform a 1D in-place complex-to-complex Fourier transform (forward and backward). In my code I added

Status = DftiCreateDescriptor(desc1,DFTI_DOUBLE,DFTI_COMPLEX,1, N)

Status = DftiCommitDescriptor(desc1)
Status = DftiComputeForward(desc1,coef)

               do j=1,n
                coefout(j)=coef(j)*(2.d0*pi*cmplx(0.d0,1.d0)*dble(j)/tlength)**2
               enddo

Status = DftiComputeBackward(desc1,coefout)
Status = DftiFreeDescriptor(desc1)

The program finishes without any error, but the results look very weird.

The input data looks like

       0.145213596D-05   0.000000000D+00
      0.235852281D-05   0.000000000D+00
      0.379253834D-05   0.000000000D+00
       0.603777471D-05   0.000000000D+00
       0.951657967D-05   0.000000000D+00
      0.148505296D-04   0.000000000D+00
       0.229435209D-04   0.000000000D+00
       0.350941882D-04   0.000000000D+00
       0.531456135D-04   0.000000000D+00
      0.796813547D-04   0.000000000D+00

However after performing forward FFT I obtained

      -0.280573615D-21   0.154142831D-43
      -0.153507094D-21   0.154142831D-43
      -0.342774500D+06   0.980908925D-44
       0.226293969D+06 -0.980908925D-44
       0.466064128D+10   0.000000000D+00
       0.830659660D+20   0.000000000D+00
       0.279607085D+32   0.280259693D-44
      0.803789090D-34 -0.280259693D-44
      0.933711541D-28   0.000000000D+00
      0.741872405D-23   0.000000000D+00

While after backward FFT I got

       0.176396978D-37 -0.267003409D-40
      -0.786617726D-27 -0.303521247D-41
      -0.209912859D-17 -0.277344992D-40
      -0.898662859D+12 -0.461027195D-41
       0.113104136D+37 -0.283342549D-40
      -0.465754617D+22 -0.627921842D-41
      -0.118460876D+32 -0.286397380D-40
       0.209648241D+35 -0.788790906D-41
       0.274214524D-33 -0.286523497D-40
      0.218714569D+03 -0.941112049D-41

The results look very weird to me. Do you have any idea how I can check whether the FFT code works properly?
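
One way to check such a pipeline independently of the library is to compare against a naive DFT on a small input. The Python sketch below is not MKL code; it assumes the standard DFT bin ordering (bin 0 is the zero frequency, the upper half of the bins holds negative frequencies) and an unnormalized backward transform, so a forward/backward round trip returns the input multiplied by n. The helper names are illustrative.

```python
import cmath
import math

def dft(x, sign):
    """Naive O(n^2) DFT: sign=-1 is forward, sign=+1 is the unnormalized backward."""
    n = len(x)
    return [sum(x[m] * cmath.exp(sign * 2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]

def wavenumber(k, n):
    """Signed frequency index for 0-based bin k of an n-point transform."""
    return k if k < (n + 1) // 2 else k - n

def spectral_second_derivative(x, length):
    """Second derivative of a periodic signal sampled on [0, length)."""
    n = len(x)
    coef = dft(x, -1)
    scaled = [c * (2j * cmath.pi * wavenumber(k, n) / length) ** 2
              for k, c in enumerate(coef)]
    # Divide by n because the backward transform is unnormalized.
    return [v / n for v in dft(scaled, +1)]
```

Two quick tests follow from this: the round trip divided by n should reproduce the input, and a sampled cosine with one period over `length` should return itself multiplied by -(2*pi/length)**2.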

↧

crashes of gemm on new Windows machine

July 24, 2013, 5:56 pm

I have just set up a new Dell XPS 8700 with an Intel(R) Core(TM) i7-4770 running Windows 7 Pro. I also have a Dell XPS 8100 with Intel's i7-860 running Windows 7 Ultimate, and a Dell Inspiron Notebook with Intel's i3-2350M running Windows 7 Home Premium.

All three computers have MS Visual Studio 2010 and 2012, and Intel® Visual Fortran Composer XE for Windows* 2013 w/MKL 11 (installed on top of the 2011 version with MKL 10). I've been running these on linear algebra applications. All of my apps compile and execute fine on both of the older machines, but I'm having problems on the new one (XPS 8700, i7-4770). These problems occur only after installation of XE 2013/MKL 11, and not with XE 2011/MKL 10 using VS 2010.

Most simply, just doing a simple matrix multiplication using gemm in debug gives access violation code -1073741819 (0xc0000005). I see the same behavior on both VS 2010 and 2012.

Exactly the same program, with exactly the same Project Properties, runs without a problem on both older machines.

I've compared the contents of Program Files (x86)>Intel>Composer XE 2013>mkl>lib>intel64 on the new XPS 8700 with those on the XPS 8100 and see no differences.

Any suggestions will be appreciated.

↧

Incorrect workspace size returned by query to DGESVD in MKL 11.0.5

July 25, 2013, 4:17 am

There are discrepancies between the required minimum value of the argument lwork (i) as stated in the documentation, (ii) as returned by a workspace query with lwork=-1 and (iii) the value required to pass the input argument check in routine DGESVD, for the case JOBU='N', JOBVT='N' -- the case where only the singular values are desired. Here is a reproducer, based on the details provided in another thread ("bug in MKL 11.0update5 DGESVG", http://software.intel.com/en-us/forums/topic/402436). The outputs were obtained using IFort 13.1.3.198/32-bit on Win-8-Pro-64.

      program dgesvdx
c     Program to demonstrate incorrect estimate of lwork in call to ?gesvd
c     in MKL 11.0.5
c
c     use mkl_lapack
      implicit none
      character*1 :: jobu, jobvt
      integer :: m,n,lda,ldu,ldvt,lwork,info
      double precision :: A(1,10)
      double precision :: work(13),S(1),U(1),VT(1)
      character*200 buf
      data A/2d0, 7d0, 5d0, 9d0, 3d0, 6d0, 2d0, 5d0, 4d0, 8d0/
c
      call mkl_get_version_string( buf )
      write(*,'(A)')buf
      m=1
      n=10
      jobu='N'
      jobvt='N'
      lda=1
      ldu=1
      ldvt=1
c
c     First call to find lwork needed
c
      lwork=-1
      call dgesvd(jobu, jobvt, m, n, a, lda, s, u, ldu, vt, ldvt,
     +            work, lwork, info)
      write(*,*)'info = ',info,' lwork asked for by MKL =',work(1)
      lwork=nint(work(1))
c
c     starting with size returned, try increasing values until DGESVD
c     accepts the value as sufficient
c
      info=-1
      do while(info .lt. 0)
         call dgesvd(jobu, jobvt, m, n, a, lda, s, u, ldu, vt, ldvt,
     +               work, lwork, info)
         write(*,*)'lwork = ',lwork,' info = ',info
         lwork=lwork+1
      end do
      write(*,*)'s = ',s(1)
      end program dgesvdx

Output with version 11.0.1 of MKL

Intel(R) Math Kernel Library Version 11.0.1 Product Build 20121016 for 32-bit applications
 info = 0 lwork asked for by MKL = 5.00000000000000
 lwork = 5 info = 0
 s = 17.6918060129541

Output with version 11.0.5 of MKL

Intel(R) Math Kernel Library Version 11.0.5 Product Build 20130612 for 32-bit applications
 info = 0 lwork asked for by MKL = 5.00000000000000
MKL ERROR: Parameter 13 was incorrect on entry to DGESVD.
 lwork = 5 info = -13
MKL ERROR: Parameter 13 was incorrect on entry to DGESVD.
 lwork = 6 info = -13
MKL ERROR: Parameter 13 was incorrect on entry to DGESVD.
 lwork = 7 info = -13
MKL ERROR: Parameter 13 was incorrect on entry to DGESVD.
 lwork = 8 info = -13
MKL ERROR: Parameter 13 was incorrect on entry to DGESVD.
 lwork = 9 info = -13
MKL ERROR: Parameter 13 was incorrect on entry to DGESVD.
 lwork = 10 info = -13
MKL ERROR: Parameter 13 was incorrect on entry to DGESVD.
 lwork = 11 info = -13
MKL ERROR: Parameter 13 was incorrect on entry to DGESVD.
 lwork = 12 info = -13
 lwork = 13 info = 0
 s = 17.6918060129541
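
For reference, here is a small sketch of the two lower bounds on lwork that the header comments of the reference LAPACK DGESVD give, as far as I am aware: max(1, 5*min(m,n)) for the fast paths where one dimension is much larger and the corresponding singular vectors are not requested, and max(1, 3*min(m,n)+max(m,n), 5*min(m,n)) otherwise. These formulas are an assumption to be checked against the MKL manual for your version, and the function names are illustrative. For m=1, n=10 they evaluate to 5 and 13, the two values that appear in the output above.

```python
def dgesvd_fast_path_min_lwork(m, n):
    """Bound quoted for the paths where one dimension dominates (assumed formula)."""
    return max(1, 5 * min(m, n))

def dgesvd_general_min_lwork(m, n):
    """Bound quoted for all other paths (assumed formula)."""
    small, large = min(m, n), max(m, n)
    return max(1, 3 * small + large, 5 * small)

if __name__ == "__main__":
    m, n = 1, 10
    print("fast-path bound:", dgesvd_fast_path_min_lwork(m, n))
    print("general bound:  ", dgesvd_general_min_lwork(m, n))
```

Allocating the larger of the query result and the general bound is a defensive choice that would have avoided the -13 error in this particular case.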

↧

error using mkl as default blas implementation on linux

July 25, 2013, 2:58 pm

Hi,

I don't know if this makes much sense, but I tried to use update-alternatives on Linux to make all the symlinks for libblas.so, libblas.so.3gf, liblapack.so and liblapack.so.3gf point to libmkl_rt.so. I was hoping that this way numpy would use the MKL implementation of the BLAS operations without needing to be recompiled. But, as it turns out, the loader has trouble with the threading library, which I expected to be libmkl_gnu_thread.so instead of libmkl_intel_thread.so.

python -c "from numpy import zeros; a = zeros((17,17)); b = a.dot(a)"

python: symbol lookup error: /opt/intel/mkl/lib/intel64/libmkl_intel_thread.so: undefined symbol: omp_get_num_procs

I am using MKL 11.0 update 5, which I upgraded from MKL 11.0 update 2, which didn't work either. Everything is 64-bit. I also noticed that for matrices smaller than (17, 17) the error doesn't occur.

Thanks,

Bogdan

↧

Stability of trusted region algorithm in MKL 10.3

July 26, 2013, 2:13 am

Hi all,

I am using the trust-region algorithm (from MKL 10.3) to solve a nonlinear least-squares problem with boundary constraints. I followed the documentation and the provided examples to implement it in my code, and I see some stability issues with this method. Every time I launch my optimization, I get slightly different results, with different termination criteria and a different number of iterations. It seems to me that the algorithm takes a different path on every trial. As far as I know, the TR algorithm is deterministic, and when starting from the same initial guess it should always converge to the same solution. This is at least what I observe in the MATLAB implementation of this algorithm.

I should add that I am quite confident in my implementation, because the residuals that I obtain with the MKL TR algorithm are consistent with those from MATLAB.

Does anyone have an explanation for this puzzling behavior?

Best regards,

T.Boutelier
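
One possible contributor, not confirmed in this thread: floating-point addition is not associative, so a threaded reduction that combines partial sums in a different order from run to run can produce slightly different results, which an iterative optimizer may then amplify into a different path. The Python sketch below only illustrates the non-associativity itself; whether it applies to the TR solver here is an assumption.

```python
values = [0.1, 0.2, 0.3]

# Same three numbers, two different grouping orders.
left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])

if __name__ == "__main__":
    print(repr(left_to_right), repr(right_to_left))
    print("identical:", left_to_right == right_to_left)
```

If this is the cause, running with a single thread would be expected to make the results repeatable, which is a cheap experiment to try.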

↧

Use one LU factorization in several instances of mkl_dss_solve

July 29, 2013, 2:55 pm

I am using Intel MKL library to solve a system of linear equations (A*x = b) with multiple right-hand side (rhs) vectors. The rhs vectors are generated asynchronously and through a separate routine and therefore, it is not possible to solve them all at once.

To speed up the program, it is multi-threaded, with each thread responsible for solving a single rhs vector. Since the matrix A is always constant, the LU factorization should be performed once and the factors then reused in all threads. So, I factor A using the following command

dss_factor_real(handle, opt, data);

and pass the handle to the threads to solve the problems using the following command:

dss_solve_real(handle, opt, rhs, nRhs, sol);

However, I found out that it is not thread-safe to use the same handle in several instances of dss_solve_real. Apparently, for some reason, the MKL library changes the handle in each instance, which creates a race condition. I read the MKL manual but could not find anything relevant. Since it does not make sense to factorize A in each thread, I am wondering if there is any way to overcome this problem and use the same handle everywhere.

Thanks in advance for your help
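
A minimal sketch of one generic workaround, assuming the handle really cannot be used concurrently: guard every solve that touches the shared factorization with a mutex. `SharedFactorization` is a hypothetical stand-in (a diagonal solve), not the DSS API, and the lock serializes the solves, so it trades parallelism for safety.

```python
import threading

class SharedFactorization:
    """Hypothetical stand-in for a factored matrix whose handle is not thread-safe."""

    def __init__(self, diagonal):
        self._diagonal = list(diagonal)   # pretend this is the stored factor
        self._lock = threading.Lock()     # serializes access to the shared handle

    def solve(self, rhs):
        # Only one thread at a time may use the shared factorization.
        with self._lock:
            return [b / d for b, d in zip(rhs, self._diagonal)]

def solve_all(factor, right_hand_sides):
    """Solve each right-hand side in its own thread and collect results in order."""
    results = [None] * len(right_hand_sides)

    def worker(index, rhs):
        results[index] = factor.solve(rhs)

    threads = [threading.Thread(target=worker, args=(i, rhs))
               for i, rhs in enumerate(right_hand_sides)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

The threads still overlap in whatever work they do outside the solve (for example, generating the next rhs), which is where this pattern can still pay off.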

↧

Some documentation problems

July 31, 2013, 10:15 am

Hi!

As of MKL Version 11 Update 3 and Update 5 for Linux, the Dynamic Libraries section of Appendix C (Directory Structure In Detail) in the User's Guide is missing descriptions of libmkl_avx2.so and libmkl_vml_avx2.so for both the 64-bit and 32-bit versions, and of libmkl_avx.so for the 32-bit version.

Sure, the purpose of these libraries can easily be deduced, but it would be a good idea to keep the documentation up to date.

It would also be a good idea to check the sections about static libraries, as well as the documentation for other platforms.

Eugene.

↧

combining direct pardiso and Fgmres?

July 31, 2013, 11:38 am

I am currently writing a nonlinear finite element code, so I have to solve a nonlinear system.
My matrix changes at each iteration, and I use a direct solve for each iteration. The sparsity pattern of my matrix may also change between iterations, so I can't use the PARDISO iterative solver at phase 13 and phase 23 as suggested here:

http://software.intel.com/en-us/forums/topic/326721

In this thread:

http://software.intel.com/en-us/forums/topic/389822

Alexander Kalinkin suggested that the PARDISO LU decomposition of the first matrix can be used as a preconditioner for the FGMRES iterative solver.

I would like to know how to do that.
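
A minimal sketch of the idea, under loudly stated assumptions: the factorization of the first matrix is kept and reused as the preconditioner while the system matrix changes slightly. To stay short, the sketch swaps in a plain preconditioned Richardson iteration instead of FGMRES and uses an explicit 2x2 inverse (`solve_2x2`, a hypothetical stand-in) in place of a PARDISO solve phase; all names and matrices are illustrative.

```python
def solve_2x2(m, r):
    """Stand-in for a solve with a stored factorization (explicit 2x2 inverse)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [(d * r[0] - b * r[1]) / det, (a * r[1] - c * r[0]) / det]

def matvec(m, x):
    """Dense 2x2 matrix-vector product."""
    return [m[0][0] * x[0] + m[0][1] * x[1], m[1][0] * x[0] + m[1][1] * x[1]]

def preconditioned_richardson(a_new, a_first, rhs, tol=1e-12, max_iter=100):
    """Solve a_new * x = rhs, using a solve with a_first as the preconditioner."""
    x = [0.0, 0.0]
    for iteration in range(max_iter):
        ax = matvec(a_new, x)
        residual = [rhs[0] - ax[0], rhs[1] - ax[1]]
        if max(abs(residual[0]), abs(residual[1])) < tol:
            return x, iteration
        # Preconditioner step: reuse the "old" factorization on the residual.
        correction = solve_2x2(a_first, residual)
        x = [x[0] + correction[0], x[1] + correction[1]]
    return x, max_iter

if __name__ == "__main__":
    a_first = [[4.0, 1.0], [1.0, 3.0]]   # matrix that was factored once
    a_new = [[4.2, 1.1], [0.9, 3.1]]     # slightly changed matrix
    x, iterations = preconditioned_richardson(a_new, a_first, [1.0, 2.0])
    print(x, iterations)
```

In an RCI-style FGMRES loop, the same stored-factor solve would be performed whenever the solver asks for the preconditioner to be applied. The approach only pays off while the new matrix stays close to the factored one.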

↧

How to set affinity while using MKL in sequential mode

July 31, 2013, 3:33 pm

I have written a multi-threaded code using pthreads. Each thread calls an instance of dss_solve_real separately. I compile the code with the following libraries to make sure that MKL works in sequential mode:

$(MKLROOT)/lib/intel64/libmkl_intel_ilp64.a $(MKLROOT)/lib/intel64/libmkl_sequential.a $(MKLROOT)/lib/intel64/libmkl_core.a -lm -lpthread

Also, I have disabled KMP_AFFINITY using:

env KMP_AFFINITY=disabled

The number of threads for MKL is also manually determined in the code using:

mkl_set_num_threads(1);

I use the following code to set the affinity for each thread. This piece of code is executed at the beginning of each thread's function:

pthread_t curThread = pthread_self();
cpu_set_t cpuset;
CPU_ZERO(&cpuset);
CPU_SET(threadCPUNum[threadData->numOfCurThread], &cpuset);
sched_setaffinity(curThread, sizeof(cpuset), &cpuset);

In this code, threadCPUNum[threadData->numOfCurThread] is the number of the CPU to which the current thread will be bound.

In order to make sure that MKL respects my CPU affinity settings, I initially bind all the threads to CPU0 by setting all elements of threadCPUNum array to zero. However, monitoring CPU utilization reveals that MKL does not pay attention to sched_setaffinity and uses different processors.

I would like to know what I am missing here and how I can force MKL function (dss_solve_real) to bind to a specific CPU.

Thanks in advance for your help.
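
One thing worth checking: sched_setaffinity takes a process or thread ID (pid_t), whereas pthread_self() returns a pthread_t, and the pthread-level call is pthread_setaffinity_np. Whichever call is used, reading the mask back (and checking the return value) is a quick way to confirm that the setting was actually applied. A minimal Python sketch of that read-back check, Linux only and with an illustrative function name:

```python
import os

def pin_and_verify():
    """Pin the caller to one allowed CPU, read the mask back, then restore it."""
    if not hasattr(os, "sched_setaffinity"):
        return None  # affinity calls are not available on this platform
    original = os.sched_getaffinity(0)   # 0 refers to the caller
    target = {min(original)}             # pick one CPU we are allowed to use
    try:
        os.sched_setaffinity(0, target)
        applied = os.sched_getaffinity(0) == target
    finally:
        os.sched_setaffinity(0, original)
    return applied

if __name__ == "__main__":
    print("affinity applied:", pin_and_verify())
```

The same set-then-get pattern exists at the C level, where a failed call reports an error that is easy to miss if the return value is ignored.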

↧

Threaded iterative sparse solver

August 1, 2013, 8:19 am

The Intel MKL 11 library offers an optimized set of threaded functions, but in the case of the iterative sparse solver (ISS), the preconditioned conjugate gradient method does not seem straightforward to thread.

To be more precise, preconditioning techniques such as incomplete Cholesky factorization or ILU require sparse triangular solves at some point, but the corresponding MKL function for triangular solves, mkl_cspblas_?csrtrsv, is not threaded.

I'd like to know whether dcg is threaded, and whether there is any workaround to achieve better performance with iterative solvers on multiprocessor systems. Should I expect a threaded ISS in the near future?

↧

Grouped Means only (no covariance matrix) in SS in Intel MKL?

August 1, 2013, 6:01 pm

Hello,

I've successfully used the SS capabilities of Intel MKL to compute the grouped/pooled covariance/correlation matrices. Now I face a situation where only the grouped means are required. There is an estimate option "VSL_SS_GROUP_COV", which computes both the grouped means and the grouped covariance matrices. I couldn't find a way to skip the computation of the covariance matrix (i.e. there is no option "VSL_SS_GROUP_MEAN"). Is there a solution to this problem?

Thank you very much.
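
As a fallback outside the library, grouped means are cheap to accumulate directly in one pass. The Python sketch below is a plain illustration with hypothetical names; it does not use the Summary Statistics API and says nothing about whether MKL offers such an option.

```python
def grouped_means(observations, group_ids, num_groups):
    """Mean vector per group; groups with no observations yield None."""
    dim = len(observations[0])
    sums = [[0.0] * dim for _ in range(num_groups)]
    counts = [0] * num_groups
    # Single pass: accumulate per-group sums and counts.
    for row, g in zip(observations, group_ids):
        counts[g] += 1
        for i, value in enumerate(row):
            sums[g][i] += value
    return [[s / counts[g] for s in sums[g]] if counts[g] else None
            for g in range(num_groups)]

if __name__ == "__main__":
    data = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(grouped_means(data, [0, 0, 1], 2))
```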

↧

Where are all the .dll files located?

August 5, 2013, 7:39 am

I just bought Intel Composer XE 2013, which contains the MKL libraries, and installed it on a Windows Server 2008 machine.

I'm writing a program that loads the MKL libraries dynamically (with LoadLibrary()), and to do so I need the mkl_rt.dll file. I can locate mkl_rt.lib, but no .dll. In fact, there is not a single .dll file in the MKL_HOME/lib/intel64 directory, and I have also searched the whole MKL_HOME directory for .dll files without success.

Are the .dll files being installed in some other directory?

Here is the list of the files inside MKL_HOME/lib/intel64

[C:\Program Files (x86)\Intel\Composer XE 2013\mkl\lib\intel64]ls
mkl_blacs_ilp64_dll.lib mkl_cdft_core.lib mkl_intel_thread_dll.lib mkl_scalapack_lp64_dll.lib
mkl_blacs_intelmpi_ilp64.lib mkl_cdft_core_dll.lib mkl_lapack95_ilp64.lib mkl_sequential.lib
mkl_blacs_intelmpi_lp64.lib mkl_core.lib mkl_lapack95_lp64.lib mkl_sequential_dll.lib
mkl_blacs_lp64_dll.lib mkl_core_dll.lib mkl_pgi_thread.lib mydll.c
mkl_blacs_mpich2_ilp64.lib mkl_intel_ilp64.lib mkl_pgi_thread_dll.lib
mkl_blacs_mpich2_lp64.lib mkl_intel_ilp64_dll.lib
mkl_blacs_msmpi_ilp64.lib mkl_intel_lp64.lib mkl_rt.lib
mkl_blacs_msmpi_lp64.lib mkl_intel_lp64_dll.lib mkl_scalapack_ilp64.lib
mkl_blas95_ilp64.lib mkl_intel_sp2dp.lib mkl_scalapack_ilp64_dll.lib
mkl_blas95_lp64.lib mkl_intel_thread.lib mkl_scalapack_lp64.lib

↧

MKL with legacy app (Visual Studio 6)

August 5, 2013, 9:10 am

I have a legacy app written in C++ with Visual Studio 6. It uses an older version of the MKL library.

The app is currently getting memory allocation errors from the MKL code when using FFT and IFFT (error code 1) (The input and output buffers are set to be the same to reduce the memory footprint). The data is around 344 MB. The app has other demands on memory.

I was hoping to test a newer version of the MKL to see whether it would work with this dataset.

Unfortunately, the new MKL library (Release 11 Update 5) produces a load of linker errors (I'm linking against mkl_intel_c, mkl_sequential and mkl_core), such as:

mkl_core.lib(psdftsfactca_w7---ownscDftOutOrdInv_Prime13_32fc_20121126.obj) : error LNK2001: unresolved external symbol ___security_cookie

This is with Visual studio 6.0 so there's no /GS switch or BufferOverflowU.lib

↧

pardiso

August 5, 2013, 10:43 am

Hi all,

I want to use PARDISO to solve a system with a complex symmetric matrix, but I get the error "The type of the actual argument differs from the type of the dummy argument." When the matrix is real, it works, and I can't understand why. I checked the 'mkl_pardiso.f90' file and found no complex declaration, only a real one, which would explain why it works when the matrix is real. How can I fix this? Do I need to rewrite mkl_pardiso.f90? I hope someone can point me in the right direction or help me out. Thanks!

↧

MKL DFT descriptor generation question

August 6, 2013, 10:05 am

Hi there,

I have a question about the DFTI descriptor.

The problem is a 1K x 1K array of complex numbers in row-major order. For each row of 1K elements, I would like to compute size-16 FFTs with stride 64. That is, I do not want to compute a size-1024 FFT, only size-16 FFTs.

For example, one size-16 FFT uses elements 0, 64, 128, 192, ..., 960, another uses elements 1, 65, 129, ..., 961, and so on.

And the same computation is applied on all the 1K rows.

I had a look at the reference manual but am not sure if the descriptor could generate that.

Specifically, I don't know how to set arguments such as:

1) num_of_transforms 2) stride, 3) dist.

Thanks!

Jing
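
To make the layout concrete, the Python sketch below enumerates the element indices of each size-16 transform. It only does index arithmetic and calls no FFT library. Reading it as descriptor settings (64 transforms per row, stride 64 between the elements of one transform, distance 1 between the first elements of consecutive transforms, and a separate pass or offset per row) is an assumption to be checked against the DFTI parameters DFTI_NUMBER_OF_TRANSFORMS, DFTI_INPUT_STRIDES and DFTI_INPUT_DISTANCE in the reference manual.

```python
ROW_LENGTH = 1024                        # elements per row
FFT_SIZE = 16                            # length of each transform
STRIDE = ROW_LENGTH // FFT_SIZE          # 64: gap between elements of one transform
TRANSFORMS_PER_ROW = STRIDE              # 64 interleaved transforms cover one row
DISTANCE = 1                             # gap between first elements of transforms

def transform_indices(row, transform):
    """Flat row-major indices used by one size-16 transform in the given row."""
    base = row * ROW_LENGTH + transform * DISTANCE
    return [base + k * STRIDE for k in range(FFT_SIZE)]

if __name__ == "__main__":
    print(transform_indices(0, 0))
    print(transform_indices(0, 1))
```

Because consecutive rows start 1024 elements apart while consecutive transforms within a row start 1 element apart, a single distance value covers one row at a time in this reading.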

↧

Difference between libmkl_mc.so and libmkl_mc3.so

August 6, 2013, 3:26 pm

What is the actual difference between libmkl_mc.so and libmkl_mc3.so? Moreover, how can I find out which SSE version libmkl_mc3.so is tied to?

When I force the code to use SSE4.1 with mkl_cbwr_set(MKL_CBWR_SSE4_1), libmkl_mc.so is used and the performance is better. If I don't force it, the code simply uses libmkl_mc3.so and the performance is bad.

Can anyone explain what the difference actually is?

↧

DGEMM with pgithread is giving segmentation fault

August 6, 2013, 4:44 pm

Hello,

My code uses multithreaded MKL dgemm. I use the following to link the code:

-L$(MKL_LIBDIR) -lmkl_intel_lp64 -lmkl_pgi_thread -lmkl_core -L/usr/common/usg/intel/lib/intel64 -lpthread -lm -pgf90libs

For some reason the code gives a segmentation fault at some calls to DGEMM. I ran gdb; this is the output of the backtrace:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x43c08940 (LWP 30503)]
0x00002aaaf7ee7df5 in mkl_blas_mc3_dgemm_copyan ()
from /usr/common/usg/intel/13.0.028/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_mc3.so
(gdb) bt
#0 0x00002aaaf7ee7df5 in mkl_blas_mc3_dgemm_copyan ()
from /usr/common/usg/intel/13.0.028/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_mc3.so
#1 0x00002aaaab4ccef2 in mkl_blas_dgemm_2d_acopy_n ()
from /usr/common/usg/intel/13.0.028/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_pgi_thread.so
#2 0x00002aaaab4c9a75 in gemm_host ()
from /usr/common/usg/intel/13.0.028/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_pgi_thread.so
#3 0x00002aaaab4c8951 in mkl_blas_dgemm ()
from /usr/common/usg/intel/13.0.028/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_pgi_thread.so
#4 0x00002aaaaada7799 in dgemm_ ()
from /usr/common/usg/intel/13.0.028/composer_xe_2013.1.117/mkl/lib/intel64/libmkl_intel_lp64.so
#5 0x000000000045c6ef in my_dgemm_ (a=0x519863 "N", b=0x519863 "N", c=0x7fffffffad64,
d=0x7fffffffad48, e=0x7fffffffad98, f=0x7fffffffaeb0, g=0x59b5fd0, h=0x7fffffffad40,
i=0x2aaaf6db5010, j=0x7fffffffad98, k=0x7fffffffaea8, l=0x2aaac72c4010, m=0x7fffffffad64, n=1,
o=1) at ./pdgstrf.c:165

I ran this code on a dual-socket Xeon 5550 2.67GHz system. The code does not segfault when MKL_NUM_THREADS is set to 1, 2, 3, ..., 7, but it does for 8.

↧
