Channel: Intel® Software - Intel® oneAPI Math Kernel Library & Intel® Math Kernel Library

mkl_pardiso time consumption during solution phase

Hello,

I am working with a sparse matrix A and trying to solve Ax = b, where b is a double-precision array with 1 million entries. The matrix is very sparse (number of non-zeros in A (%): 0.000694). I am using mkl_pardiso.f90 on 4 nodes. The factorization takes about 16 min to complete (954 s in the log below). I expected the solution phase to be much shorter, but it has been running for more than one hour and has still not finished. Is this normal? The output so far (everything before the solution phase) is below. Can anybody share ideas on how to improve this? Any help will be much appreciated.
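
As a quick sanity check on the sparsity figure, the percentage reported by PARDISO can be reproduced from the equation count and the non-zero count that appear in the log below. This is only a small illustrative sketch; the variable names are made up for the example.

```python
# Figures copied from the PARDISO log in this post.
n = 1_000_000        # number of equations
nnz_a = 6_940_000    # number of non-zeros in A

# Share of matrix entries that are non-zero, as a percentage.
density_percent = 100.0 * nnz_a / (n * n)

# Average number of stored entries per row.
avg_nnz_per_row = nnz_a / n

print(f"density: {density_percent:.6f} %")
print(f"average non-zeros per row: {avg_nnz_per_row:.2f}")
```

So the 0.000694 value is a percentage, and it corresponds to roughly 7 non-zeros per row on average.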

=== PARDISO: solving a real nonsymmetric system ===
1-based array indexing is turned ON
PARDISO double precision computation is turned ON
Parallel METIS algorithm at reorder step is turned ON
Scaling is turned ON
Matching is turned ON


Summary: ( reordering phase )
================

Times:
======
Time spent in calculations of symmetric matrix portrait (fulladj): 0.081618 s
Time spent in reordering of the initial matrix (reorder)         : 1.912543 s
Time spent in symbolic factorization (symbfct)                   : 1.559847 s
Time spent in data preparations for factorization (parlist)      : 0.075198 s
Time spent in allocation of internal data structures (malloc)    : 0.340503 s
Time spent in additional calculations                            : 0.220093 s
Total time spent                                                 : 4.189802 s

Statistics:
===========
Parallel Direct Factorization is running on 4 OpenMP

< Linear system Ax = b >
             number of equations:           1000000
             number of non-zeros in A:      6940000
             number of non-zeros in A (%): 0.000694

             number of right-hand sides:    1

< Factors L and U >
             number of columns for each panel: 96
             number of independent subgraphs:  0
             number of supernodes:                    654600
             size of largest supernode:               11181
             number of non-zeros in L:                782333465
             number of non-zeros in U:                766664027
             number of non-zeros in L+U:              1548997492
 Reordering completed ...
 Number of nonzeros in factors =  1548997492
 Number of factorization MFLOPS =  10596105
=== PARDISO is running in In-Core mode, because iparam(60)=0 ===
Percentage of computed non-zeros for LL^T factorization
 0  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95  96  97  98  99  100

=== PARDISO: solving a real nonsymmetric system ===
Single-level factorization algorithm is turned ON


Summary: ( factorization phase )
================

Times:
======
Time spent in copying matrix to internal data structure (A to LU): 0.000000 s
Time spent in factorization step (numfct)                        : 953.925513 s
Time spent in allocation of internal data structures (malloc)    : 0.170726 s
Time spent in additional calculations                            : 0.089973 s
Total time spent                                                 : 954.186212 s

Statistics:
===========
Parallel Direct Factorization is running on 4 OpenMP

< Linear system Ax = b >
             number of equations:           1000000
             number of non-zeros in A:      6940000
             number of non-zeros in A (%): 0.000694

             number of right-hand sides:    1

< Factors L and U >
             number of columns for each panel: 96
             number of independent subgraphs:  0
             number of supernodes:                    654600
             size of largest supernode:               11181
             number of non-zeros in L:                782333465
             number of non-zeros in U:                766664027
             number of non-zeros in L+U:              1548997492
             gflop   for the numerical factorization: 10596.105218

             gflop/s for the numerical factorization: 11.107896

 Factorization completed ...
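
For context, here is a rough back-of-the-envelope estimate built only from the numbers in the log above. It assumes 8 bytes per stored factor value (double precision, ignoring index arrays) and the usual estimate of about 2 floating-point operations per stored factor entry for one forward and one backward substitution. Both are assumptions for the sketch, not figures reported by PARDISO.

```python
# Figures copied from the PARDISO log in this post.
nnz_a = 6_940_000            # non-zeros in A
nnz_lu = 1_548_997_492       # non-zeros in L+U
fact_gflop = 10596.105218    # gflop for the numerical factorization
fact_seconds = 953.925513    # time spent in numfct

# Fill-in: how many times more entries the factors hold than A.
fill_ratio = nnz_lu / nnz_a

# Memory for the factor values alone (assumption: 8 bytes each, no indices).
factor_gib = nnz_lu * 8 / 2**30

# Achieved factorization rate; should match the gflop/s line in the log.
fact_rate = fact_gflop / fact_seconds

# Assumption: about 2 flops per factor entry for forward + backward solve.
solve_gflop = 2 * nnz_lu / 1e9

# Solve time if it ran at the factorization rate. Triangular solves are
# memory-bound, so a real solve will be slower than this lower bound.
solve_seconds_at_fact_rate = solve_gflop / fact_rate

print(f"fill-in ratio:        {fill_ratio:.1f}x")
print(f"factor values:        {factor_gib:.2f} GiB")
print(f"factorization rate:   {fact_rate:.3f} gflop/s")
print(f"solve work (1 rhs):   {solve_gflop:.2f} gflop")
print(f"solve time at that rate: {solve_seconds_at_fact_rate:.2f} s")
```

Under these assumptions the factors hold about 223 times as many entries as A and need roughly 11.5 GiB for their values, while one solve is only about 3 gflop of work. That suggests the arithmetic alone would not account for an hour-long solution phase, although the estimate does not by itself show where the time goes.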

Thanks,

Dhiraj