Hi there,
We are encountering inconsistent behaviour with the sparse PARDISO_64 solver.
We are trying to solve a linear system, which corresponds to the stationary distribution of a discrete-time Markov chain. The transition matrix can be generated for small or large state spaces. The application requires a solution for a large state space (very sparse transition matrix); I include code for a small state space (not very sparse) that reproduces the problem we're encountering. Oddly, when the transition matrix is < 101 in size (ie., 101 x 101), this code produces correct results. However, with matrix size >=101, it gives non-sense results including negative values.
I've run this same matrix in R, using its default dense linear solver, and it solves the system just fine and returns a vector that sums to one with all entries positive.
Can someone let us know if this is a bug? Any suggestions?
Thanks very much,
- Jason
To compile the example:
icc -o test ./tester.c -DMKL_ILP64 -std=c99 -g -m64 -fopenmp -lmkl_intel_ilp64 -lmkl_core -liomp5 -lmkl_intel_thread -lpthread -lmkl_avx -lmkl_vml_avx -lm -liomp5 -ldl -lrt
Output:
$ ./test
=== PARDISO: solving a real nonsymmetric system ===
0-based array is turned ON
PARDISO double precision computation is turned ON
Parallel METIS algorithm at reorder step is turned ON
Scaling is turned ON
Matching is turned ON
Summary: ( reordering phase )
================
Times:
======
Time spent in calculations of symmetric matrix portrait (fulladj): 0.000282 s
Time spent in reordering of the initial matrix (reorder) : 0.002224 s
Time spent in symbolic factorization (symbfct) : 0.000373 s
Time spent in data preparations for factorization (parlist) : 0.000009 s
Time spent in allocation of internal data structures (malloc) : 0.001029 s
Time spent in additional calculations : 0.000441 s
Total time spent : 0.004357 s
Statistics:
===========
Parallel Direct Factorization is running on 1 OpenMP
< Linear system Ax = b >
number of equations: 101
number of non-zeros in A: 7649
number of non-zeros in A (%): 74.982845
number of right-hand sides: 1
< Factors L and U >
number of columns for each panel: 128
number of independent subgraphs: 0
number of supernodes: 9
size of largest supernode: 73
number of non-zeros in L: 7376
number of non-zeros in U: 1599
number of non-zeros in L+U: 8975
ooc_max_core_size got from config file=100000
ooc_max_swap_size got from config file=10000
ooc_path got from config file=/tmp
ooc_keep_file got from config file=1
=== PARDISO is running in In-Core mode, because iparam(60)=1 and there is enough RAM for In-Core ===
Percentage of computed non-zeros for LL^T factorization
1 % 2 % 3 % 4 % 5 % 27 % 100 %
=== PARDISO: solving a real nonsymmetric system ===
Single-level factorization algorithm is turned ON
Summary: ( factorization phase )
================
Times:
======
Time spent in copying matrix to internal data structure (A to LU): 0.000000 s
Time spent in factorization step (numfct) : 0.002051 s
Time spent in allocation of internal data structures (malloc) : 0.000871 s
Time spent in additional calculations : 0.000008 s
Total time spent : 0.002930 s
Statistics:
===========
Parallel Direct Factorization is running on 1 OpenMP
< Linear system Ax = b >
number of equations: 101
number of non-zeros in A: 7649
number of non-zeros in A (%): 74.982845
number of right-hand sides: 1
< Factors L and U >
number of columns for each panel: 128
number of independent subgraphs: 0
number of supernodes: 9
size of largest supernode: 73
number of non-zeros in L: 7376
number of non-zeros in U: 1599
number of non-zeros in L+U: 8975
gflop for the numerical factorization: 0.000497
gflop/s for the numerical factorization: 0.242160
Percentage of computed non-zeros for LL^T factorization
1 % 2 % 3 % 4 % 5 % 27 % 100 %
=== PARDISO: solving a real nonsymmetric system ===
Single-level factorization algorithm is turned ON
Summary: ( starting phase is factorization, ending phase is solution )
================
Times:
======
Time spent in copying matrix to internal data structure (A to LU): 0.000000 s
Time spent in factorization step (numfct) : 0.000978 s
Time spent in direct solver at solve step (solve) : 0.000152 s
Time spent in allocation of internal data structures (malloc) : 0.000225 s
Time spent in additional calculations : 0.000010 s
Total time spent : 0.001365 s
Statistics:
===========
Parallel Direct Factorization is running on 1 OpenMP
< Linear system Ax = b > < transpose >
number of equations: 101
number of non-zeros in A: 7649
number of non-zeros in A (%): 74.982845
number of right-hand sides: 1
< Factors L and U >
number of columns for each panel: 128
number of independent subgraphs: 0
number of supernodes: 9
size of largest supernode: 73
number of non-zeros in L: 7376
number of non-zeros in U: 1599
number of non-zeros in L+U: 8975
gflop for the numerical factorization: 0.000497
gflop/s for the numerical factorization: 0.507876
Solution complete
0 0.999995193211361765861511230469
1 0.000032578641480540682096034288
2 -0.001657514998954390250673895935
3 0.033771150451229914324358105659
4 -0.461805824206749093718826770782
5 4.749899498253853380447253584862
6 -38.771420601724457810632884502411
7 259.248515126662823604419827461243
8 -1449.667206926258586463518440723419
9 6874.129777303653099806979298591614
10 -27898.278384158944390946999192237854
11 97440.861928413447458297014236450195
12 -293504.794066837057471275329589843750
13 761189.228222305420786142349243164062
14 -1688760.591960122343152761459350585938
15 3160374.134989297483116388320922851562
16 -4847723.865611193701624870300292968750
17 5708551.324229651130735874176025390625
18 -4176587.558768271468579769134521484375
19 -647335.411046688444912433624267578125
20 7123983.262770898640155792236328125000
21 -10372849.399064248427748680114746093750
22 5097756.375352034345269203186035156250
23 8368298.486144787631928920745849609375
24 -20094661.334219779819250106811523437500
25 15530267.332568019628524780273437500000
26 10869635.995811181142926216125488281250
27 -45071205.791867606341838836669921875000
28 59302210.260278038680553436279296875000
29 -32906353.391984485089778900146484375000
30 -29776674.515638992190361022949218750000
31 100146018.075395077466964721679687500000
32 -142138732.192305266857147216796875000000
33 132670794.247007787227630615234375000000
34 -77887861.591384351253509521484375000000
35 16047679.964387292042374610900878906250
36 8486771.549898184835910797119140625000
37 8432116.450060281902551651000976562500
38 -19044070.376644946634769439697265625000
39 -14876318.802595624700188636779785156250
40 53651210.152546934783458709716796875000
41 -18599756.443840526044368743896484375000
42 -83578205.019382655620574951171875000000
43 132441139.780909150838851928710937500000
44 -44723190.570743650197982788085937500000
45 -90119098.758960306644439697265625000000
46 112308026.259973496198654174804687500000
47 -15883799.202348483726382255554199218750
48 -37476735.814083844423294067382812500000
49 -42720512.310896500945091247558593750000
50 136331725.799720644950866699218750000000
51 -94694026.197748795151710510253906250000
52 -24969756.509624961763620376586914062500
53 36123495.129054218530654907226562500000
54 95371312.336257725954055786132812500000
55 -187649724.744519829750061035156250000000
56 99580546.284361138939857482910156250000
57 61281447.953638210892677307128906250000
58 -75682784.307738944888114929199218750000
59 -94990590.347380638122558593750000000000
60 249469815.636476099491119384765625000000
61 -185170645.040623068809509277343750000000
62 -69561311.618086069822311401367187500000
63 291469651.987545907497406005859375000000
64 -296767916.465866982936859130859375000000
65 107424872.212943851947784423828125000000
66 100278309.586056604981422424316406250000
67 -174662996.371612936258316040039062500000
68 105230287.342239096760749816894531250000
69 10815686.119901048019528388977050781250
70 -81002433.999335929751396179199218750000
71 86416125.554535329341888427734375000000
72 -62056087.853065535426139831542968750000
73 40542349.147022008895874023437500000000
74 -26163544.408309187740087509155273437500
75 10472964.159998646005988121032714843750
76 5665027.077539607882499694824218750000
77 -11247268.616148518398404121398925781250
78 -535759.321254136040806770324707031250
79 23669191.133846260607242584228515625000
80 -44661930.400523439049720764160156250000
81 53449912.937561854720115661621093750000
82 -48916840.807791873812675476074218750000
83 36568077.467748269438743591308593750000
84 -23051005.968265801668167114257812500000
85 12468528.466729646548628807067871093750
86 -5845339.524531649425625801086425781250
87 2387800.565653024241328239440917968750
88 -851709.236374233034439384937286376953
89 265137.756044438341632485389709472656
90 -71829.217305896498146466910839080811
91 16846.539484900742536410689353942871
92 -3393.713058615743648260831832885742
93 580.705784613051491760415956377983
94 -83.109681426136376103386282920837
95 9.736892611661156493596536165569
96 -0.905525373004574585245052276150
97 0.063834158911049598827958106995
98 -0.003165205991651388536811673191
99 0.000096134640968314257489582553
100 -0.000008968170732259750366210938