Hi,
Once again I am having some trouble with the Direct Sparse Solver for Clusters. I get the following error when running on a single process:
entering matrix solver
*** Error in PARDISO ( insufficient_memory) error_num= 1
*** Error in PARDISO memory allocation: MATCHING_REORDERING_DATA, allocation of 1 bytes failed
total memory wanted here: 142 kbyte
=== PARDISO: solving a real structurally symmetric system ===
1-based array indexing is turned ON
PARDISO double precision computation is turned ON
METIS algorithm at reorder step is turned ON
Summary: ( reordering phase )
================
Times:
======
Time spent in calculations of symmetric matrix portrait (fulladj): 0.000005 s
Time spent in reordering of the initial matrix (reorder) : 0.000000 s
Time spent in symbolic factorization (symbfct) : 0.000000 s
Time spent in allocation of internal data structures (malloc) : 0.000465 s
Time spent in additional calculations : 0.000080 s
Total time spent : 0.000550 s
Statistics:
===========
Parallel Direct Factorization is running on 1 OpenMP
< Linear system Ax = b >
number of equations: 6
number of non-zeros in A: 8
number of non-zeros in A (%): 22.222222
number of right-hand sides: 1
< Factors L and U >
number of columns for each panel: 128
number of independent subgraphs: 0
< Preprocessing with state of the art partitioning metis >
number of supernodes: 0
size of largest supernode: 0
number of non-zeros in L: 0
number of non-zeros in U: 0
number of non-zeros in L+U: 0
ERROR during solution: 4294967294
It just hangs when running on more than one process. Below is the CSR format of my matrix and the provided RHS to solve for.
CSR row values (ia): 0 2 6 9 12 16 18
CSR col values (ja): 0 1 0 1 2 3 1 2 4 1 3 4 2 3 4 5 4 5
Rank 0 rhs vector: 1 0 0 0 0 1
Now my calling file looks like:
void SolveMatrixEquations(MKL_INT numRows, MatrixPointerStruct &cArrayStruct, const std::pair<MKL_INT, MKL_INT> &rowExtents)
{
    double pressureSolveTime = -omp_get_wtime();

    MKL_INT mtype = 1;        /* Real structurally symmetric matrix */
    MKL_INT nrhs = 1;         /* Number of right-hand sides */
    void *pt[64] = { 0 };     /* Internal solver memory pointer */

    /* Cluster Sparse Solver control parameters. */
    MKL_INT iparm[64] = { 0 };
    MKL_INT maxfct, mnum, phase = 13, msglvl, error;

    /* Auxiliary variables. */
    double ddum;              /* Double dummy (the system is solved in double precision) */
    MKL_INT idum;             /* Integer dummy */
    MKL_INT i, j;

    /* -------------------------------------------------------------------- */
    /* .. Init MPI. */
    /* -------------------------------------------------------------------- */
    int mpi_stat = 0;
    int comm, rank;
    mpi_stat = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    comm = MPI_Comm_c2f(MPI_COMM_WORLD);

    /* -------------------------------------------------------------------- */
    /* .. Set up Cluster Sparse Solver control parameters. */
    /* -------------------------------------------------------------------- */
    iparm[0] = 0;     /* Solver defaults overridden by the iparm values below */
    iparm[1] = 3;     /* Parallel (OpenMP) nested dissection for fill-in reordering */
    //iparm[1] = 10;  /* Use ParMETIS for fill-in reordering */
    iparm[5] = 0;     /* Write solution into x */
    iparm[7] = 2;     /* Max number of iterative refinement steps */
    iparm[9] = 8;     /* Perturb the pivot elements with 1E-8 */
    iparm[10] = 0;    /* Disable non-symmetric permutation and scaling MPS */
    iparm[12] = 0;    /* Disable maximum weighted matching algorithm */
    iparm[17] = 0;    /* Output: number of non-zeros in the factor LU */
    iparm[18] = 0;    /* Output: Mflops for LU factorization */
    iparm[20] = 0;    /* Pivoting for symmetric indefinite matrices */
    iparm[26] = 1;    /* Check the input matrix for errors */
    iparm[27] = 0;    /* Double precision mode of Cluster Sparse Solver */
    iparm[34] = 1;    /* C-style (zero-based) indexing for ia and ja arrays */
    iparm[39] = 2;    /* Input: matrix/rhs/solution distributed between MPI processes */
    iparm[40] = rowExtents.first + 1;
    iparm[41] = rowExtents.second + 1;

    maxfct = 3;       /* Maximum number of numerical factorizations */
    mnum = 1;         /* Which factorization to use */
    msglvl = 1;       /* Print statistical information */
    error = 0;        /* Initialize error flag */
    //cout << "Rank " << rank << ": " << iparm[40] << " " << iparm[41] << endl;

#ifdef UNIT_TESTS
    //msglvl = 0;
#endif

    phase = 11;
#ifndef UNIT_TESTS
    if (rank == 0) printf("Restructuring system...\n");
#endif
    cluster_sparse_solver(pt, &maxfct, &mnum, &mtype, &phase, &numRows, &ddum,
                          cArrayStruct.rowIndexArray, cArrayStruct.colIndexArray,
                          &idum, &nrhs, iparm, &msglvl, &ddum, &ddum, &comm, &error);
    if (error != 0)
    {
        cout << "\nERROR during solution: " << error << endl;
        exit(error);
    }

    phase = 23;
#ifndef UNIT_TESTS
    if (rank == 0) printf("\nSolving system...\n");
#endif
    cluster_sparse_solver_64(pt, &maxfct, &mnum, &mtype, &phase, &numRows,
                             cArrayStruct.valArray, cArrayStruct.rowIndexArray,
                             cArrayStruct.colIndexArray, &idum, &nrhs, iparm, &msglvl,
                             cArrayStruct.rhsVector, cArrayStruct.pressureSolutionVector,
                             &comm, &error);
    if (error != 0)
    {
        cout << "\nERROR during solution: " << error << endl;
        exit(error);
    }

    phase = -1;       /* Release internal memory. */
    cluster_sparse_solver_64(pt, &maxfct, &mnum, &mtype, &phase, &numRows, &ddum,
                             cArrayStruct.rowIndexArray, cArrayStruct.colIndexArray,
                             &idum, &nrhs, iparm, &msglvl, &ddum, &ddum, &comm, &error);
    if (error != 0)
    {
        cout << "\nERROR during release memory: " << error << endl;
        exit(error);
    }

    /* Check residual */
    pressureSolveTime += omp_get_wtime();
#ifndef UNIT_TESTS
    //cout << "Pressure Solve Time: " << pressureSolveTime << endl;
#endif
    //TestPrintCsrMatrix(cArrayStruct, rowExtents.second - rowExtents.first + 1);
}
This is based on the format of one of the provided examples. I am trying to use the ILP64 interface because my real system is very large (16 billion non-zeros). I am using the Intel C++ compiler 2017 as part of the Intel Composer XE Cluster Edition Update 1, with the following lines in my CMake files:
TARGET_COMPILE_OPTIONS(${MY_TARGET_NAME} PUBLIC "-mkl:cluster" "-DMKL_ILP64" "-I$ENV{MKLROOT}/include")
TARGET_LINK_LIBRARIES(${MY_TARGET_NAME} "-Wl,--start-group $ENV{MKLROOT}/lib/intel64/libmkl_intel_ilp64.a $ENV{MKLROOT}/lib/intel64/libmkl_intel_thread.a $ENV{MKLROOT}/lib/intel64/libmkl_core.a $ENV{MKLROOT}/lib/intel64/libmkl_blacs_intelmpi_ilp64.a -Wl,--end-group -liomp5 -lpthread -lm -ldl")
What is interesting is that this same code runs perfectly fine on my Windows development machine; porting it to my Linux cluster is what causes the issues. Any ideas?
I am currently waiting on the terribly long download of the Update 4 Composer XE package, but I don't have much hope of that fixing it, because this code used to run fine on this system.