I tested PARDISO and found that IPARM[3] seems to be capped at 8. I set it to 16 threads, but the summary reports that only 8 are used.
What should I change?
iparm[FORT(1)] = 1; // 0 - all default, 1 - MUST supply all
iparm[FORT(2)] = 2; // 0 - MMD, 2 - ND METIS
iparm[FORT(3)] = get_num_threads(); // num of threads
iparm[FORT(4)] = 0; // No iterative-direct algorithm
iparm[FORT(5)] = 0; // No user fill-in reducing permutation
iparm[FORT(6)] = 0; // Write solution into x
iparm[FORT(7)] = 0; // Not in use
iparm[FORT(8)] = 0; // num of iterative refinement
iparm[FORT(9)] = 0; // Not in use
iparm[FORT(10)] = 13; // Pivot perturbation: 13 - non sym, 8 - sym indefinite
iparm[FORT(11)] = 1; // Scaling 1 - non sym, 0 - sym
iparm[FORT(12)] = 0; // Conjugate transposed/transpose solve
iparm[FORT(13)] = 1; // 1 - normal matching, 2 - advanced
iparm[FORT(14)] = 0; // Output: Number of perturbed pivots
iparm[FORT(15)] = 0; // Not in use
iparm[FORT(16)] = 0; // Not in use
iparm[FORT(17)] = 0; // Not in use
iparm[FORT(18)] = -1; // -1 - report nz in factors
iparm[FORT(19)] = -1; // -1 - report MFlops
iparm[FORT(20)] = 0; // Output: Numbers of CG Iterations
iparm[FORT(21)] = 1; // 0 - 1x1 pivot, 1 - 2x2 bunch-kaufman pivot (default)
iparm[FORT(24)] = 1; // 0 - one level parallel, 1 - two level (default)
iparm[FORT(25)] = 1; // 0 - seq solve, 1 - par solve (default)
iparm[FORT(27)] = 1; // 0 - no check, 1 - matrix check
iparm[FORT(52)] = 1; // num of dist solver : multi thread only should be 1
cout << "IPARM[FORT(3)] = "<< iparm[FORT(3)] << endl;
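One way to narrow this down is to compare the requested thread count with what the environment and the machine report. The sketch below is only a diagnostic aid under stated assumptions: `parse_thread_count` and `describe_thread_limits` are hypothetical helpers (not part of PARDISO), and it assumes that the solver's thread count follows the OpenMP setting `OMP_NUM_THREADS`, which, as far as I understand the PARDISO manual, is expected to equal IPARM(3). It uses only the C++ standard library.

```cpp
#include <cassert>
#include <cstdlib>
#include <sstream>
#include <string>
#include <thread>

// Hypothetical helper (not a PARDISO function): parse the leading integer
// of an OMP_NUM_THREADS-style string. Returns 0 if the string is missing,
// not numeric, non-positive, or implausibly large.
int parse_thread_count(const char* text) {
    if (text == nullptr) return 0;
    char* end = nullptr;
    const long value = std::strtol(text, &end, 10);
    if (end == text || value <= 0 || value > 1000000) return 0;
    return static_cast<int>(value);
}

// Hypothetical helper: build a short report comparing the thread count
// requested through iparm[FORT(3)] with OMP_NUM_THREADS and with the
// number of hardware threads the C++ runtime detects.
std::string describe_thread_limits(int requested) {
    assert(requested >= 1);  // the value passed to PARDISO should be positive
    const int env = parse_thread_count(std::getenv("OMP_NUM_THREADS"));
    const unsigned hw = std::thread::hardware_concurrency();  // 0 if unknown

    std::ostringstream out;
    out << "requested via iparm[FORT(3)]: " << requested << '\n';
    out << "OMP_NUM_THREADS: ";
    if (env > 0) out << env; else out << "(unset or invalid)";
    out << '\n';
    out << "hardware threads: ";
    if (hw > 0) out << hw; else out << "(unknown)";
    out << '\n';
    return out.str();
}
```

Printing `describe_thread_limits(iparm[FORT(3)])` right before the PARDISO call would show whether `OMP_NUM_THREADS` is set to 8 or the machine only exposes 8 hardware threads. Either could plausibly explain the 8 in the statistics, but that is an assumption to verify, not a confirmed cause.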
PARDISO prints the following:
Statistics:
===========
< Parallel Direct Factorization with number of processors: > 8
< Numerical Factorization with BLAS3 and O(n) synchronization >
< Linear system Ax = b >
number of equations: 357889
number of non-zeros in A: 26539009
number of non-zeros in A (%): 0.020720