Channel: Intel® Software - Intel® oneAPI Math Kernel Library & Intel® Math Kernel Library
Viewing all 3005 articles

BACON outlier detection


I'm trying to run the following code:

#include <iostream>

#include "mkl.h"

int main () {

    /* Define vector of observations (5 2D observation points) */

    float pObservations [] = {1., 2., 3., 4.2, 5, 9., 10., 8., 7., 6., 5., 9.};

    /* Creates and initializes a new summary statistics task descriptor */

    VSLSSTaskPtr task;
    const int p = 2;
    const int n = 5;
    const int xstorage = VSL_SS_MATRIX_STORAGE_ROWS;
    int status = 0;
    status = vslsSSNewTask (&task, &p, &n, &xstorage, pObservations, NULL, NULL);
    if (status != VSL_STATUS_OK) {
        std::cout << "Failed to create a new summary statistics task descriptor"<< std::endl;
        throw false;
    }

    /* Modifies array pointers related to multivariate mean calculation */

    float* pMean = new float [p];
    status = vslsSSEditTask (task, VSL_SS_ED_MEAN, pMean);
    if (status != VSL_STATUS_OK) {
        std::cout << "Failed to modifies array pointers related to multivariate mean calculation"<< std::endl;
        throw false;
    }

    /* Computes Summary Statistics estimates - mean calculation */

    status = vslsSSCompute(task, VSL_SS_MEAN, VSL_SS_METHOD_FAST);
    if (status != VSL_STATUS_OK) {
        std::cout << "Failed to compute summary statistics estimates with error code "<< status << std::endl;
        throw false;
    }

    // Print mean values
    for (int ip = 0; ip < p; ip++)
        std::cout << pMean [ip] << std::endl;

    /* Modifies array pointers related to multivariate outliers detection */

    const int nParams = 0;
    float* pWeights = new float [n];
    status = vslsSSEditOutliersDetection (task, &nParams, NULL, pWeights);
    if (status != VSL_STATUS_OK) {
        std::cout << "Failed to modifies array pointers related to multivariate outliers detection"<< std::endl;
        throw false;
    }

    /* Computes Summary Statistics estimates - outlier detection */

    status = vslsSSCompute(task, VSL_SS_OUTLIERS, VSL_SS_METHOD_BACON);
    if (status != VSL_STATUS_OK) {
        std::cout << "Failed to compute summary statistics estimates with error code "<< status << std::endl;
        throw false;
    }

    return (0);
}

for a 2D data set consisting of 5 pairs of observations. The output of the program reads:

3.04
8
Failed to compute summary statistics estimates with error code -4002
terminate called after throwing an instance of 'bool'
Abort (core dumped)

The first two numbers are the mean values of the observations in each of the two dimensions, and they are correct.
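As a cross-check of those two means, here is a minimal plain C++ sketch (no MKL calls) that assumes VSL_SS_MATRIX_STORAGE_ROWS means all n observations of the first variable come first, followed by all n observations of the second. The helper name row_storage_means is hypothetical. With p = 2 and n = 5 only the first 10 of the 12 values in pObservations are read.

```cpp
#include <cassert>
#include <vector>

// Mean of each variable for data stored "by rows": the n observations of
// variable 0 come first, then the n observations of variable 1, and so on.
// Only the first p*n entries of x are read.
std::vector<float> row_storage_means(const float* x, int p, int n) {
    assert(x != nullptr && p > 0 && n > 0);
    std::vector<float> mean(p, 0.0f);
    for (int ip = 0; ip < p; ++ip) {
        float sum = 0.0f;
        for (int i = 0; i < n; ++i)
            sum += x[ip * n + i];  // i-th observation of variable ip
        mean[ip] = sum / static_cast<float>(n);
    }
    return mean;
}
```

Called with the posted array, p = 2 and n = 5, this gives approximately 3.04 and 8, which matches the program output under that layout assumption.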

I'm using the same dataset for the BACON outlier detection algorithm, but I get error -4002, which means that the number of input observations (5 in my case) is either 0 or negative.

 

Is this a bug in MKL, or is something wrong on my side?

 

Thanks,

Yaniv

 

 

 

 


pardiso using a lot of memory


I'm using PARDISO to solve a large number of linear systems, and it's using quite a lot of memory. My program takes the following steps:

  1. Make a large, sparse matrix
  2. Call PARDISO with phase=11 on that matrix
  3. Call PARDISO with phase=22 on that matrix
  4. Call PARDISO many times with phase=33, with various right-hand sides
  5. Make a new matrix (with the same sparse structure) and go to step 2

The amount of memory used by this function steadily increases. By the end of the run the program is using a lot of memory, almost all of which comes from calls to PARDISO with phase=33. I don't see why PARDISO needs so much memory, so I assume I am failing to deallocate something. I don't call PARDISO with phase=0 until the end of the program, because I think PARDISO needs the memory where the LU decomposition is stored until then. (I tried calling PARDISO with phase=0 at the start of step 5, but this segfaults.) I call mkl_free_buffers_() at the start of step 5, but this does not solve the problem.

Any help would be greatly appreciated. I'm using the version of MKL which ships with Intel Composer 2013 SP1.

 

pardiso and pardiso_64 internal addressing


Hi,

 

Using the LP64 interface, will the internal addressing of PARDISO be 64-bit, or do I have to use the pardiso_64 version for that? I am using Intel Parallel Studio 2016 with MKL 11.3 Update 3.

My problem now is that I use pardiso_64, but when I then call routines like mkl_dcsrsymv I have to convert all indices from int64_t to int32_t.

 

Jens

 

extended eigensolver routines


Hello,

I am trying to diagonalize a matrix using the FEAST algorithm routines. I have written this small program to test zfeast_heev. The following code tries to diagonalize a 4x4 complex Hermitian matrix:

 

program test



    implicit none
    integer, parameter :: n=4      ! h0 nxn matrix

    complex(16) :: h0(n,n), zero

    character(len=1), parameter :: UPLO='F'
    integer :: fpm(128), loop, M0,M,info
    real(8) ::  epsout, E(n), res(n), Pi, alfa, Mk

    integer, parameter :: L=4
    complex(16) :: X(n,n)
    integer :: i,j,k,i1,j1

    real(8), parameter :: Emin=-5.0d0, Emax=5.0d0
   complex(16), parameter :: ii=(0.0d0,1.0d0)

    open (1, file='eigenvalues.dat')
    open (2, file='check.dat')

     zero=dcmplx(0.d0, 0.d0)

     h0=zero

     h0(1,2)= 2.0d0 + 2.0d0*ii
     h0(1,3)= 3.0d0 - 2.0d0*ii
     h0(2,1)= 2.0d0 - 2.0d0*ii
     h0(2,4)= 3.0d0 - 2.0d0*ii
     h0(3,1)= 3.0d0 + 2.0d0*ii
     h0(3,4)= -2.0d0 - 2.0d0*ii
     h0(4,2)= 3.0d0 + 2.0d0*ii
     h0(4,3)= -2.0d0 - 2.0d0*ii


    M0=L
    M=M0
      print *,'Search interval ', Emin,'', Emax

      call feastinit(fpm)
      fpm(1)=1
      print *, ' Testing zfeast_heev '


     call zfeast_heev(UPLO,n,h0,n, fpm, epsout, loop,Emin,Emax,M0,E,X,M,res,info)
    print  *,' FEAST OUTPUT INFO ',info
    if(info/=0) stop 1

    print *, 'Number of eigenvalues found ', M


    do i1=1,M
    write(1,*) E(i1)
    end do



    end program test

The eigenvalues of the matrix h0 are 4.58258, 4.58258, -4.58258, -4.58258 but this program gives me different results.

I hope someone can explain to me what I am missing in order to make this code work properly.
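One thing that may be worth checking before looking at the FEAST call itself is whether h0, as assigned above, is actually Hermitian. The following plain C++ sketch (no MKL; the names posted_h0 and is_hermitian are hypothetical, and indices are 0-based) copies the eight assignments from the program and tests h(i,j) against the conjugate of h(j,i).

```cpp
#include <array>
#include <cassert>
#include <complex>
#include <cstddef>

using cd = std::complex<double>;
using Mat4 = std::array<std::array<cd, 4>, 4>;

// The matrix exactly as assigned in the posted program (0-based indices).
Mat4 posted_h0() {
    Mat4 h{};  // all entries start at zero
    h[0][1] = cd(2, 2);   h[0][2] = cd(3, -2);
    h[1][0] = cd(2, -2);  h[1][3] = cd(3, -2);
    h[2][0] = cd(3, 2);   h[2][3] = cd(-2, -2);
    h[3][1] = cd(3, 2);   h[3][2] = cd(-2, -2);
    return h;
}

// True if h(i,j) equals conj(h(j,i)) for every pair, within tol.
bool is_hermitian(const Mat4& h, double tol = 1e-12) {
    assert(tol >= 0.0);
    for (std::size_t i = 0; i < 4; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            if (std::abs(h[i][j] - std::conj(h[j][i])) > tol)
                return false;
    return true;
}
```

For the posted values the check fails on the pair h0(3,4) = -2 - 2i and h0(4,3) = -2 - 2i, which are equal rather than complex conjugates. With h0(4,3) = -2 + 2i the matrix is Hermitian and its square is 21 times the identity, consistent with the expected eigenvalues of about ±4.58258.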

Routines mkl_?tppack and mkl_?tpunpack


The documentation gives the following names for these routines (C interface): LAPACKE_mkl_?tppack and LAPACKE_mkl_?tpunpack. However, these names are not declared in the mkl_lapacke.h header file. Instead, LAPACKE_?tppack and LAPACKE_?tpunpack (without "mkl") are declared. If I use these names I get an "unresolved external" error when linking. Manually modifying the header file solves the problem. I suggest that this header file be fixed accordingly.

LAPACKE_dsyev on Xeon Phi - eigenvalues on xeon phi


Dear MKL Forum,

I am testing a Xeon Phi x100 family coprocessor with 5GB of memory using automatic offload, with MKL composer_xe_2013.1.117 and icc 13.0.1. A function such as cblas_dgemm works well with the Xeon Phi, but LAPACKE_dsyev does not use it. The documentation says that ?syev should work for N larger than 8000, and my matrix has N = 10000. Is a different configuration needed to calculate eigenvalues on the Xeon Phi?

 Best regards

Complexity of functions ?potrs, ?potrf and cblas_dgemm


Dear MKL forum,

I'm using the function "?potrf" for the Cholesky factorization of a matrix and "?potrs" for solving a system of linear equations. Additionally, I need the function "cblas_dgemm" (matrix multiplication) for further calculations. These functions are used in a distributed system with multiple servers, and I need the exact complexity of each of these algorithms (in big O notation) for optimal load balancing. I would rather not use the complexities given in the common literature, because the MKL functions are optimized and may not match them.

Can you help me out?

Best regards
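As a starting point for load balancing, the textbook leading-order operation counts can be written down directly. The sketch below is an assumption-laden estimate and not something taken from MKL documentation or measurements: it uses the standard counts for real arithmetic (about n^3/3 for a Cholesky factorization, about 2*n^2 per right-hand side for the two triangular solves, and 2*m*n*k for a general matrix product), and the function names are hypothetical.

```cpp
#include <cassert>

// Leading-order floating-point operation counts for real matrices.
// These are the usual textbook estimates; lower-order terms are dropped.

// Cholesky factorization of an n x n matrix (?potrf): about n^3 / 3.
double potrf_flops(double n) {
    assert(n >= 0.0);
    return n * n * n / 3.0;
}

// Two triangular solves per right-hand side (?potrs): about 2 * n^2 * nrhs.
double potrs_flops(double n, double nrhs) {
    assert(n >= 0.0 && nrhs >= 0.0);
    return 2.0 * n * n * nrhs;
}

// C = A * B with A of size m x k and B of size k x n (?gemm): 2 * m * n * k.
double gemm_flops(double m, double n, double k) {
    assert(m >= 0.0 && n >= 0.0 && k >= 0.0);
    return 2.0 * m * n * k;
}
```

Dividing such a count by a measured rate for each server gives a rough time estimate. The achieved rate depends on problem size and threading, so it is best calibrated by timing a few representative calls.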

cgelss: MKL 11.2.2 -> 11.3


Hi,

Recently, the project I am working on switched from MKL version 11.2.2 to 11.3. After the switch, cgelss() started reporting an error: "Intel MKL ERROR: Parameter 4 was incorrect on entry to CGELSS." According to the documentation the cgelss() interface has not changed, and I am passing the same data into the solver, so I am confused as to why this problem has appeared. The project also updated the Intel C++ Compiler from XE 15.0.2 to 16.0.

Does anyone have any ideas?

Thanks,

Treph

Thread Topic: Help Me

Compiling R with MKL


Hi,

I'm trying to compile R 3.3.1 on Ubuntu Server 16.04 with Intel MKL using the GNU Fortran and C compilers. Unfortunately, I'm unable to get gcc to locate BLAS. I've tried several combinations of the suggestions on the Intel website, including "Build R-3.0.1 with Intel MKL", "Using Intel MKL with R", and the Intel Link Advisor. I've also found some other bloggers who have successfully compiled R with MKL. Unfortunately, no one else's solutions have worked for me thus far.

I first ran the provided shell script to set the environment variables. I've noticed that the INCLUDE variable is empty. Is this a problem, or is it by design? All the others are set appropriately.

I've also been specifying LDFLAGS="-L$MKLROOT/lib/intel64" or LDFLAGS = "-L$MKLROOT/include" in order to point gcc to the directories. When I check config.log after the configuration, however, I see:

  • configure: 31659: checking for dgemm_ in -lblas
  • configure: 31692: gcc -o conftest -march=native -O3 -I/usr/local/include -L/opt/intel/compilers_and_libraries_2016.3.210/linux/mkl/include conftest.c -lblas -lgfortran -lm -lquadmath -lrt -ldl -l > &5
  • /usr/bin/ld: cannot find -lblas
  • collect2: error: ld returned 1 exit status

for both settings of the LDFLAGS. 

Any suggestions would be tremendously appreciated! Thank you for your time!

vector array in mkl


Hi, does anyone know what the best equivalent in MKL is for ippmMul_vaca_64f?

Thanks, 

Saar.

Thread Topic: Question

Multi-thread MKL cblas_sgemm with g++ problem


Here's an example sgemm program.

#include <mkl.h>
#include <iostream>
#include <cstdlib>
#define ITERATION 1

int main()
{
  int ra = 128;
  int lda = 75;
  int ldb = 55;
  float* left = (float*)calloc(ra * lda, sizeof(float));
  float* right = (float*)calloc(ldb * lda, sizeof(float));
  float* ans = (float*)calloc(ra * ldb, sizeof(float));
  std::cout << "left "<< std::endl;
  for (int i = 0; i < ra; ++i) {
    for (int j = 0; j < lda; ++j) {
      left[i * lda + j] = static_cast <float> (rand()) / static_cast <float> (RAND_MAX);
      std::cout << left[i * lda + j] << " ";
    }
    std::cout << std::endl;
  }

  std::cout << "right "<< std::endl;
  for (int i = 0; i < lda; ++i) {
    for (int j = 0; j < ldb; ++j) {
      right[i * ldb + j] = static_cast <float> (rand()) / static_cast <float> (RAND_MAX);
      std::cout << right[i * ldb + j] << " ";
    }
    std::cout << std::endl;
  }

  for (int i = 0; i < ITERATION; ++i) {
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, ra, ldb, lda, 1.0f, left, lda,
      right, ldb, 0.0f, ans, ldb);
  }

  std::cout << "ans "<< std::endl;
  for (int i = 0; i < ra; ++i) {
    for (int j = 0; j < ldb; ++j) {
      std::cout << ans[i * ldb + j] << " ";
    }
    std::cout << std::endl;
  }

  return 0;
}

I compile this program with g++ using the options `-fopenmp -lmkl_rt`, with `OMP_NUM_THREADS` set to 16.

After running the program, I found that the answer is completely wrong compared to the MATLAB result; I wouldn't call it wrong if there were only small accuracy differences. Further, I observe that the program gives correct results under these conditions:

  1. Use icc instead of g++,
  2. Remove -fopenmp flag,
  3. Use g++&atlas instead of icc&mkl
  4. Set OMP_NUM_THREADS=1

Therefore, I guess the problem may lie with the `-fopenmp` flag. Can you help me figure out the problem? Thank you!

g++ (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16)

icc (ICC) 16.0.3 20160415

Linux core 2.6.32-279.el6.x86_64
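To check the result without MATLAB, the output of cblas_sgemm could be compared against a simple triple loop. The sketch below is only a reference implementation in plain C++ (the names naive_sgemm and max_abs_diff are hypothetical); it assumes row-major, densely packed matrices, which is how the program above lays out left, right and ans (m = ra, n = ldb, k = lda).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Reference product C (m x n) = A (m x k) * B (k x n), all row-major.
std::vector<float> naive_sgemm(const std::vector<float>& a,
                               const std::vector<float>& b,
                               int m, int n, int k) {
    assert(a.size() == static_cast<std::size_t>(m) * k);
    assert(b.size() == static_cast<std::size_t>(k) * n);
    std::vector<float> c(static_cast<std::size_t>(m) * n, 0.0f);
    for (int i = 0; i < m; ++i)
        for (int p = 0; p < k; ++p) {
            const float aip = a[i * k + p];
            for (int j = 0; j < n; ++j)
                c[i * n + j] += aip * b[p * n + j];
        }
    return c;
}

// Largest element-wise absolute difference between two equally sized arrays.
float max_abs_diff(const std::vector<float>& x, const std::vector<float>& y) {
    assert(x.size() == y.size());
    float d = 0.0f;
    for (std::size_t i = 0; i < x.size(); ++i)
        d = std::max(d, std::fabs(x[i] - y[i]));
    return d;
}
```

A difference on the order of single-precision rounding would point to ordinary accuracy effects, while a large difference would confirm that the threaded call itself returns wrong values.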


Best way to use MKL in Python


Hi,

I want to install the Math Kernel Library for use with Python. Could you please advise me which one to install, Intel® System Studio 2016 or Intel® Parallel Studio XE?

Thank you in advance and I look forward to hearing from you.

Sincerely,

Carlos Torres

Thread Topic: Question

ask questions about mkl_csrmm function


Hi everyone,

I used the mkl_<>csrmm function in deep learning, but I ran into a really strange problem. One parameter of mkl_<>csrmm is called pntrb (the row pointer in compressed sparse row format), and its definition is:

pntrb

INTEGER. Array of length m.

For one-based indexing this array contains row indices, such that pntrb(I) - pntrb(1) + 1 is the first index of row I in the arrays val and indx.

For zero-based indexing this array contains row indices, such that pntrb(I) - pntrb(0) is the first index of row I in the arrays val and indx.

Refer to pointerb array description in CSR Format for more details.

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

 

As you can see, pntrb is only used through differences relative to pntrb(0) rather than through absolute values. In my program, if pntrb(0) is small, it works well. But if pntrb(0) is large, the program just hangs and gives no error information. I am sure the value of pntrb(0) does not exceed the maximum value (2^32-1). Could someone tell me what's wrong? Thanks.
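To make the quoted definition concrete, here is a plain C++ sketch of a zero-based CSR matrix-vector product that reads the row pointers the way the definition describes. It is an illustration only, not MKL's implementation: the function name csr_matvec is hypothetical, and it assumes that the companion array pntre marks one past the last entry of each row, measured from the same base.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// y = A * x for a zero-based CSR matrix given as (val, indx, pntrb, pntre).
// Row i occupies positions [pntrb[i] - pntrb[0], pntre[i] - pntrb[0]) of
// val and indx, so only offsets relative to pntrb[0] are used.
std::vector<double> csr_matvec(const std::vector<double>& val,
                               const std::vector<int>& indx,
                               const std::vector<int>& pntrb,
                               const std::vector<int>& pntre,
                               const std::vector<double>& x) {
    assert(!pntrb.empty() && pntrb.size() == pntre.size());
    assert(val.size() == indx.size());
    const int base = pntrb[0];
    std::vector<double> y(pntrb.size(), 0.0);
    for (std::size_t i = 0; i < pntrb.size(); ++i)
        for (int p = pntrb[i] - base; p < pntre[i] - base; ++p)
            y[i] += val[p] * x[indx[p]];
    return y;
}
```

In this sketch, shifting every entry of pntrb and pntre by the same constant leaves the result unchanged, as long as the shifted values still fit in the integer type.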

Thread Topic: Question

Compiling Applications with MKL


Hi all,

I have MKL installed on my Ubuntu Server 16.04. I've used the Intel-provided mklvars.sh script to set up the directories containing the MKL libraries. Whenever I try to use these libraries with a compiler (such as gcc or gfortran), I'm never able to reference them without providing the -L option and the directory manually. I thought the script would set the environment variables so that I wouldn't have to provide these entries when linking. The script sets MKLROOT, LD_LIBRARY_PATH, and LIBRARY_PATH successfully, but when I run the compilers without specifying the directory, I'm told the files cannot be found.

Any suggestions are tremendously appreciated!

Thread Topic: Help Me

Efficient storage of 3D field data


Hi,

I'm working on a numerical code which solves a discretised equation on a three-dimensional grid. I have multiple fields (around 80) that I need to store on this grid and that are needed to compute my results. I want to perform my computations, which consist of rather simple operations (such as products and dot products), to set up a sparse matrix and then solve the resulting system using MKL. My questions are:

1) What is the most efficient way to store the data: a 1D array or a multidimensional array? At the moment I'm using a 1D array and accessing it in my innermost loop using

for kk...
for jj...
for ii...
for (int k = 0 ; k < 8 ; k++){
for (int j = 0 ; j < 8 ; j++){
for (int i = 0 ; i < 8 ; i++){
   field1[offset + i] += field2[offset + i];
}}}
}}}

 

in order to get good spatial and temporal data locality. The data is laid out in such a way that offset is always a multiple of 8. To achieve this, as far as I understand, my data should be aligned to a 64-byte boundary. Since I have many blocks consisting of ni x nj x nk cells, this alignment leads to quite a high overhead (the factor between aligned and unaligned data is between 1.5 and 2). Is there a more efficient way to store my data?

Thanks,

Sebastian
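One layout that might reduce the overhead is to pad only the fastest-varying dimension of each block, so that every row of ni cells starts at a multiple of 8 elements. The sketch below illustrates just the index arithmetic; it assumes 8-byte doubles (so 8 elements span 64 bytes), a base pointer that is itself 64-byte aligned, and hypothetical names (round_up, Block).

```cpp
#include <cassert>
#include <cstddef>

// Round n up to the next multiple of lanes (8 doubles = 64 bytes).
std::size_t round_up(std::size_t n, std::size_t lanes = 8) {
    assert(lanes > 0);
    return (n + lanes - 1) / lanes * lanes;
}

// One ni x nj x nk block stored in a flat array, with i varying fastest
// and only the i-direction padded to a multiple of 8 elements.
struct Block {
    std::size_t ni, nj, nk, ni_pad;

    Block(std::size_t ni_, std::size_t nj_, std::size_t nk_)
        : ni(ni_), nj(nj_), nk(nk_), ni_pad(round_up(ni_)) {}

    // Flat offset of cell (i, j, k); the offset of (0, j, k) is always
    // a multiple of 8.
    std::size_t index(std::size_t i, std::size_t j, std::size_t k) const {
        assert(i < ni && j < nj && k < nk);
        return (k * nj + j) * ni_pad + i;
    }

    // Number of elements to allocate for the padded block.
    std::size_t size() const { return ni_pad * nj * nk; }

    // Storage overhead factor relative to the unpadded block.
    double overhead() const {
        return static_cast<double>(ni_pad) / static_cast<double>(ni);
    }
};
```

With this scheme the overhead factor is ni_pad / ni, so it shrinks as ni grows; for ni = 10 it is 1.6, and for ni = 60 it is about 1.07.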

Thread Topic: How-To

Better way to compute phi0 + sigma*vector?


Hello!

I want to compute the quantity prob = phi0 + sigma*atilde, where phi0 and sigma are scalars and atilde is a 1 x ind vector. I have computed it like this:

for(i=0;i<ind;i++){ones[i] = 1.0;}

 cblas_dcopy(ind, ones, 1, B, 1);
 cblas_dscal(ind, phi0, B, 1);
 cblas_dcopy(ind, atilde, 1, Bcan, 1);
 cblas_dscal(ind, sqrt(sigma2), Bcan, 1);
 vdAdd(ind, B, Bcan, prob);
     

I would like to ask if there is a better way to do it.

Thank you very much.
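For comparison, the same quantity can be formed in a single pass without the temporary arrays ones, B and Bcan. This is a minimal plain C++ sketch rather than an MKL-specific recommendation; the function name affine is hypothetical, and sigma stands for whatever scale factor is wanted (the posted code passes sqrt(sigma2)).

```cpp
#include <cassert>
#include <cstddef>

// prob[i] = phi0 + sigma * atilde[i] for i = 0 .. n-1, in one pass.
void affine(std::size_t n, double phi0, double sigma,
            const double* atilde, double* prob) {
    assert(n == 0 || (atilde != nullptr && prob != nullptr));
    for (std::size_t i = 0; i < n; ++i)
        prob[i] = phi0 + sigma * atilde[i];
}
```

A loop of this shape is usually easy for a compiler to vectorize. If a BLAS-only formulation is preferred, one option would be to fill prob with phi0 and then call cblas_daxpy(ind, sigma, atilde, 1, prob, 1), which adds sigma*atilde in place.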

Troubles with undefined _MKLMPI_Get_wrappers


Hello all,

 

recently, I successfully compiled and linked the MKL Pardiso solver within a Fortran 2008 code by including mkl_pardiso.f90 into the code and using the following compiler and linker flags:

 

-m64 -O2 -I$(MKLROOT)/include -static-intel -L$(MKLROOT)/lib -mkl -qopenmp -qopenmp-link static

 

MKLROOT is set to /opt/intel//compilers_and_libraries_2016.1.111/mac/mkl

 

Now I have tried to build with mkl_cluster_sparse_solver.f90 instead, using the same makefile settings, and ran into the following link error:

 

Undefined symbols for architecture x86_64:

  "_MKLMPI_Get_wrappers", referenced from:

      _mkl_serv_get_mpi_wrappers in libmkl_core.a(mkl_get_mpi_wrappers_static.o)

ld: symbol(s) not found for architecture x86_64

make: *** [mpi_mkl_db] Error 1

 

 

I searched the internet but couldn't find anything helpful. I also experimented with different and additional linker options (e.g. linking against different MKL libraries explicitly), but without success.

 

 

When using the following command:

 

nm /opt/intel/compilers_and_libraries_2016.1.111/mac/mkl/lib/libmkl_core.a | grep wrappers

 

I get this:

 

/opt/intel/compilers_and_libraries_2016.1.111/mac/mkl/lib/libmkl_core.a(mkl_get_mpi_wrappers_static.o):

                 U _MKLMPI_Get_wrappers

 

 

So it seems to me that this routine is available, but cannot be found anyway for some reason. Probably, I am wrong here.

 

I also tried to link dynamically (omitting the 'static' options above). In this case the build completed without error, but when running the job I get the following error:

 

Intel MKL FATAL ERROR: Cannot load symbol MKLMPI_Get_wrappers.

 

I would appreciate any help to resolve the problem.

 

Many thanks in advance,

 

Susan

 

Thread Topic: Help Me

mkl_scscmm performance problem


Hello,

I am building the attached program on OLCF's RHEA supercomputer (https://www.olcf.ornl.gov/computing-resources/rhea/) with the Intel compiler icc (ICC) 14.0.4 20140805. Armadillo has a naive implementation of cscmm. I ran the program with MKL_NUM_THREADS=1 on RHEA to multiply a sparse matrix of size 83328x124992 by a dense matrix of size 124992x50. The following is the output.

 ./a.out 83328 124992 50 0.00001

The output of the test code

::A::83328x124992
nnz::104153
::B::124992x50
mkl cscmm::162.13
::C::50x83328
arma ::0.06

I am seeing that MKL cscmm is much slower than Armadillo's naive implementation. Kindly let me know what I am doing wrong here.

Ramki
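For reference, a naive CSC-times-dense product can be sketched in a few lines of plain C++. This is not Armadillo's or MKL's actual implementation; the function name csc_times_dense and the row-major layout of the dense matrices are assumptions made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// C (m x n, row-major) = A (m x k, zero-based CSC) * B (k x n, row-major).
// Column j of A occupies positions [col_ptr[j], col_ptr[j+1]) of row_idx/val.
std::vector<double> csc_times_dense(int m, int k, int n,
                                    const std::vector<int>& col_ptr,
                                    const std::vector<int>& row_idx,
                                    const std::vector<double>& val,
                                    const std::vector<double>& b) {
    assert(col_ptr.size() == static_cast<std::size_t>(k) + 1);
    assert(row_idx.size() == val.size());
    assert(b.size() == static_cast<std::size_t>(k) * n);
    std::vector<double> c(static_cast<std::size_t>(m) * n, 0.0);
    for (int j = 0; j < k; ++j)
        for (int p = col_ptr[j]; p < col_ptr[j + 1]; ++p) {
            const int i = row_idx[p];  // row of this nonzero
            assert(0 <= i && i < m);
            for (int col = 0; col < n; ++col)
                c[static_cast<std::size_t>(i) * n + col] +=
                    val[p] * b[static_cast<std::size_t>(j) * n + col];
        }
    return c;
}
```

The work in this loop nest is proportional to nnz times the number of dense columns, which for the sizes above is 104153 x 50, or roughly 5.2 million multiply-adds. That is the scale of work any implementation needs for this product.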

Attachment: testcscmm.cpp (text/x-c++src, 2.17 KB)

Thread Topic: Question

Inspector-Executor API for Sparse BLAS


Hi, 

I am trying to use the Inspector-Executor API for SpMV, but it seems like the mkl_sparse_optimize() routine is not performing any optimizations, as I am not seeing any performance difference for matrices shown here http://www.inteldevconference.com/wp-content/uploads/2015/12/Intel-DevCon-London-2015-Fedorov-MKL.pdf. I am using the following code:

 

sparse_status_t err;
sparse_matrix_t A;
struct matrix_descr matdescr;
matdescr.type = SPARSE_MATRIX_TYPE_GENERAL;
err = mkl_sparse_d_create_csr(&A, SPARSE_INDEX_BASE_ZERO, nrows, ncols, pointerB, pointerE, colind, values);
err = mkl_sparse_set_mv_hint(A, SPARSE_OPERATION_NON_TRANSPOSE, SPARSE_FULL, 50000000);
err = mkl_sparse_set_memory_hint(A, SPARSE_MEMORY_AGGRESSIVE);
err = mkl_sparse_optimize(A);

for (int l = 0; l < LOOPS; l++)
    mkl_sparse_d_mv(SPARSE_OPERATION_NON_TRANSPOSE, ALPHA, A, matdescr, x, BETA, y);

 

Am I using the API incorrectly?

Thanks

 

 

Thread Topic: Help Me

Should VSL Leap Frog work with Wichmann Hill?


I tried vslLeapfrogStream and compared the numbers produced with a simple block of randoms.

This worked for the congruential generators MCG31 and 59, but it did not put the numbers in the right places with WH.

If it is not supported, I expected a non-zero status from calling the vslLeapfrogStream method, but it was zero.

My guess is that WH shouldn't be supported and the status is wrong.

In the program below, the check assert(fabs(a - b) < 1e-12); fails, suggesting inconsistent values.

It works if I change the gen to an MCG near the top.

This is with compiler version 16.0 x64 in VS2015 / Windows 10.

int main()
{
	constexpr int nFrogs = 7;
	constexpr int nSims = 101;
	VSLStreamStatePtr streams[nFrogs];

	int gen = VSL_BRNG_WH;
	int seed = 1234567890;

	// Creating first stream
	int status = vslNewStream(&streams[0], gen, seed);
	assert(status == 0);

	// Copy first stream to others
	for (int i = 1; i < nFrogs; ++i)
	{
		status = vslCopyStream(&streams[i], streams[0]);
		assert(status == 0);
	}

	// Leapfrogging the streams
	for (int i = 0; i < nFrogs; ++i)
	{
		status = vslLeapfrogStream(streams[i], i, nFrogs);
		assert(status == 0); // Unacceptable generator gives status -1002
	}

	// Generating base case random numbers without leap frog for comparison
	// Same generator and seed
	VSLStreamStatePtr baseStream;
	status = vslNewStream(&baseStream, gen, seed);
	assert(status == 0);
	double y[nSims*nFrogs];
	status = vdRngUniform(VSL_RNG_METHOD_UNIFORM_STD, baseStream, nSims*nFrogs, y, 0.0, 1.0);
	assert(status == 0);

	// Generate randoms for each of the leapfrog streams and compare output
	double x[nSims];
	for (int i = 1; i < nFrogs; ++i)
	{
		status = vdRngUniform(VSL_RNG_METHOD_UNIFORM_STD, streams[i], nSims, x, 0.0, 1.0);
		assert(status == 0);
		for (int j = 0; j < nSims; ++j)
		{
			double a = x[j];
			double b = y[j*nFrogs + i];
			assert(fabs(a - b) < 1e-12);
		}

	}

	// Deleting the streams
	for (int i = 1; i < nFrogs; ++i)
	{
		status = vslDeleteStream(&streams[i]);
		assert(status == 0);
	}

	vslDeleteStream(&baseStream);
}
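The property that the assert checks can be illustrated without VSL: leapfrog stream k of n should return elements k, k+n, k+2n, ... of the base sequence. The sketch below reproduces that with the standard library's std::minstd_rand as a stand-in generator. It says nothing about how VSL implements leapfrogging for any particular BRNG, and the function name leapfrog_draws is hypothetical.

```cpp
#include <cassert>
#include <random>
#include <vector>

using draw_t = std::minstd_rand::result_type;

// Returns `count` draws of leapfrog stream k out of nstreams:
// elements k, k + nstreams, k + 2*nstreams, ... of the base sequence.
std::vector<draw_t> leapfrog_draws(unsigned seed, int k, int nstreams,
                                   int count) {
    assert(0 <= k && k < nstreams && count >= 0);
    std::minstd_rand gen(seed);
    gen.discard(k);  // skip ahead to element k of the base sequence
    std::vector<draw_t> out;
    out.reserve(count);
    for (int j = 0; j < count; ++j) {
        out.push_back(gen());       // take one element
        gen.discard(nstreams - 1);  // skip the other streams' elements
    }
    return out;
}
```

Comparing leapfrog_draws(seed, i, nFrogs, nSims)[j] with element j*nFrogs + i of a plain run of the same generator is the same check as the comparison of x[j] with y[j*nFrogs + i] in the program above.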

 

 

Thread Topic: Bug Report