
Pardiso example terminates when using 72 or more CPUs


I studied the example cl_solver_sym_sp_0_based_c.c in cluster_sparse_solverc/source. I compiled it using:

make libintel64 example=cl_solver_sym_sp_0_based_c

It runs fine. However, the matrix is too small to evaluate performance, so I modified the example to read a 3,000,000 x 3,000,000 matrix from text files (a rough sketch of the read code is included below, after the download links). When I run it with 24 CPUs (1 host), it factors the matrix in 30 seconds. When I run it with 48 CPUs (2 hosts), it factors it in 20 seconds. This is great! But when I run it with 72 or more CPUs, I keep getting this after the reordering stage:

Reordering completed ...
===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 106418 RUNNING AT cforge201
=   EXIT CODE: 11
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
==================================================================================

The command I am using is:

mpirun -np 3 -machinefile ./hostfile ./cl_solver_sym_sp_0_based_c.exe

Where hostfile contains:

cforge200:1
cforge201:1
cforge202:1
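Note that I don't set the thread count explicitly in that command; each MPI rank just runs MKL with all 24 cores of its host via OpenMP. If you want to pin that down when reproducing (this assumes a Hydra-style mpirun that accepts -genv, e.g. Intel MPI), something like this should be equivalent:

mpirun -np 3 -machinefile ./hostfile -genv OMP_NUM_THREADS 24 ./cl_solver_sym_sp_0_based_c.exe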

Here are my example files so you can check whether the issue is reproducible:

cl_solver_sym_sp_0_based_c.c - edit all the occurrences of *.txt to point to where the files are on your system

https://www.dropbox.com/s/ndkzi9zojxuh1xo/cl_solver_sym_sp_0_based_c.c?dl=0

ia, ja, a, and b data in text files:

https://www.dropbox.com/s/3dkhbillyso03kc/ia_ja_a_b_data.tar.gz?dl=0
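To make the modification easier to follow, here is a minimal sketch of the read code, not the exact contents of the file above. It assumes file names ia.txt, ja.txt, a.txt and b.txt with one value per line, and double precision values (switch to float and %f if your build of the example is the single-precision variant):

#include <stdio.h>
#include <stdlib.h>
#include "mkl_types.h"

/* Read 'count' integer entries (row pointers or column indices) from a text file. */
static MKL_INT *read_index_file(const char *path, MKL_INT count)
{
    FILE *fp = fopen(path, "r");
    MKL_INT *buf = (MKL_INT *) malloc((size_t) count * sizeof(MKL_INT));
    if (fp == NULL || buf == NULL) { perror(path); exit(1); }
    for (MKL_INT i = 0; i < count; i++) {
        long long tmp;
        if (fscanf(fp, "%lld", &tmp) != 1) {
            fprintf(stderr, "short read in %s\n", path);
            exit(1);
        }
        buf[i] = (MKL_INT) tmp;   /* cast keeps this working for LP64 and ILP64 builds */
    }
    fclose(fp);
    return buf;
}

/* Read 'count' floating-point entries (matrix values or right-hand side) from a text file. */
static double *read_value_file(const char *path, MKL_INT count)
{
    FILE *fp = fopen(path, "r");
    double *buf = (double *) malloc((size_t) count * sizeof(double));
    if (fp == NULL || buf == NULL) { perror(path); exit(1); }
    for (MKL_INT i = 0; i < count; i++) {
        if (fscanf(fp, "%lf", &buf[i]) != 1) {
            fprintf(stderr, "short read in %s\n", path);
            exit(1);
        }
    }
    fclose(fp);
    return buf;
}

In main(), the hard-coded test matrix of the original example is then replaced with something along these lines:

MKL_INT n   = 3000000;                            /* matrix dimension          */
MKL_INT *ia = read_index_file("ia.txt", n + 1);   /* 0-based row pointers      */
MKL_INT nnz = ia[n];                              /* number of stored entries  */
MKL_INT *ja = read_index_file("ja.txt", nnz);
double  *a  = read_value_file("a.txt", nnz);
double  *b  = read_value_file("b.txt", n);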

I'm curious what kind of performance improvement you get when running with MPI on 12, 24, 48, and 72 CPUs!