Hello,
I am working with a sparse matrix A and trying to solve Ax = b, where b is a double-precision vector with 1 million entries. The matrix is very sparse (number of non-zeros in A (%): 0.000694). I am using mkl_pardiso.f90 on 4 nodes. The factorization took about 16 minutes (954 s according to the log below), and I expected the solution phase to be no slower than that, but it has been running for more than an hour and has still not finished. Is this normal? Below is the output so far (everything before the solution phase). Does anyone have ideas on how to improve this? Any help will be much appreciated.
=== PARDISO: solving a real nonsymmetric system ===
1-based array indexing is turned ON
PARDISO double precision computation is turned ON
Parallel METIS algorithm at reorder step is turned ON
Scaling is turned ON
Matching is turned ON

Summary: ( reordering phase )
================

Times:
======
Time spent in calculations of symmetric matrix portrait (fulladj): 0.081618 s
Time spent in reordering of the initial matrix (reorder)         : 1.912543 s
Time spent in symbolic factorization (symbfct)                   : 1.559847 s
Time spent in data preparations for factorization (parlist)      : 0.075198 s
Time spent in allocation of internal data structures (malloc)    : 0.340503 s
Time spent in additional calculations                            : 0.220093 s
Total time spent                                                 : 4.189802 s

Statistics:
===========
Parallel Direct Factorization is running on 4 OpenMP

< Linear system Ax = b >
number of equations: 1000000
number of non-zeros in A: 6940000
number of non-zeros in A (%): 0.000694
number of right-hand sides: 1

< Factors L and U >
number of columns for each panel: 96
number of independent subgraphs: 0
number of supernodes: 654600
size of largest supernode: 11181
number of non-zeros in L: 782333465
number of non-zeros in U: 766664027
number of non-zeros in L+U: 1548997492

Reordering completed ...
Number of nonzeros in factors = 1548997492
Number of factorization MFLOPS = 10596105

=== PARDISO is running in In-Core mode, because iparam(60)=0 ===

Percentage of computed non-zeros for LL^T factorization
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100

=== PARDISO: solving a real nonsymmetric system ===
Single-level factorization algorithm is turned ON

Summary: ( factorization phase )
================

Times:
======
Time spent in copying matrix to internal data structure (A to LU): 0.000000 s
Time spent in factorization step (numfct)                        : 953.925513 s
Time spent in allocation of internal data structures (malloc)    : 0.170726 s
Time spent in additional calculations                            : 0.089973 s
Total time spent                                                 : 954.186212 s

Statistics:
===========
Parallel Direct Factorization is running on 4 OpenMP

< Linear system Ax = b >
number of equations: 1000000
number of non-zeros in A: 6940000
number of non-zeros in A (%): 0.000694
number of right-hand sides: 1

< Factors L and U >
number of columns for each panel: 96
number of independent subgraphs: 0
number of supernodes: 654600
size of largest supernode: 11181
number of non-zeros in L: 782333465
number of non-zeros in U: 766664027
number of non-zeros in L+U: 1548997492
gflop for the numerical factorization: 10596.105218
gflop/s for the numerical factorization: 11.107896

Factorization completed ...
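
As a rough sanity check on the numbers in this output, the Python sketch below recomputes the fill-in ratio, the approximate memory needed for the factor values, and a back-of-the-envelope operation count for one forward/backward solve. The figures are copied from the log. The function name `summarize`, the assumption of 8 bytes per stored factor entry (values only, no index arrays), and the estimate of about 2 floating-point operations per factor non-zero for a triangular solve are illustrative assumptions, not something PARDISO reports.

```python
# Figures copied from the PARDISO statistics printed above.
N_EQUATIONS = 1_000_000
NNZ_A = 6_940_000
NNZ_LU = 1_548_997_492
FACTOR_GFLOP = 10596.105218
NUMFCT_SECONDS = 953.925513

# Assumption: each factor entry is stored as one 8-byte double
# (index arrays and workspace are not counted here).
BYTES_PER_DOUBLE = 8


def summarize():
    """Derive a few rough indicators from the logged statistics."""
    density_pct = 100.0 * NNZ_A / (N_EQUATIONS * N_EQUATIONS)
    fill_ratio = NNZ_LU / NNZ_A
    factor_gib = NNZ_LU * BYTES_PER_DOUBLE / 2**30
    factor_gflops = FACTOR_GFLOP / NUMFCT_SECONDS
    # Assumption: forward plus backward substitution costs roughly one
    # multiply and one add per non-zero of L+U, per right-hand side.
    solve_gflop = 2 * NNZ_LU / 1e9
    return {
        "density_pct": density_pct,
        "fill_ratio": fill_ratio,
        "factor_gib": factor_gib,
        "factor_gflops": factor_gflops,
        "solve_gflop": solve_gflop,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name:>14}: {value:.6g}")
```

Under these assumptions the factors hold a bit over 200 times as many non-zeros as A and need on the order of 11 to 12 GiB for the values alone, while a single solve involves only a few gflop. If that estimate is in the right range, a solve phase lasting over an hour is more likely to come from something other than arithmetic cost, for example memory pressure on the machine, though the log by itself does not show the cause.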
Thanks,
Dhiraj