Hello,
I implemented the Conjugate Gradient (CG) algorithm using MKL library functions on an Intel Xeon family processor.
The CG code runs fine on the Intel Xeon processor (without offloading); the problem arises when I
try to run it while offloading some operations (the sparse matrix-vector products) to the Intel Xeon Phi 7120P
coprocessor.
On line 209 of cg_mkl_csr_intel.c (attached) I initiate an asynchronous transfer of the matrix's arrays and
perform other operations until line 237 of the same file, where execution waits for the data to arrive so that
the A * x product can be performed. The attached cg_execution.txt file contains the output of a run of the
cg_mkl_csr_intel executable. From it I observe that the asynchronous data transfer starts without problems, but
when the data is needed to perform the product on line 238 of cg_mkl_csr_intel.c, the following error is
generated: "offload error: process on the device 0 was terminated by signal 11 (SIGSEGV)".
I have been unable to identify the cause of this error, hence this post.
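For readers without the attachment, the transfer-then-wait pattern described above can be sketched roughly as follows. This is not the code from cg_mkl_csr_intel.c: the function name spmv_async_offload, the array names, and the hand-written product loop (in place of the MKL call) are all hypothetical, and zero-based full CSR storage is assumed. It uses the Intel compiler's offload_transfer/offload pragmas with signal and wait clauses; a compiler without offload support ignores the pragmas and runs everything on the host.

```c
#include <assert.h>

/* Hypothetical sketch: y = A * x, with the CSR arrays of A sent to
   coprocessor 0 asynchronously and consumed later in an offload region.
   Assumes zero-based CSR holding the full (not triangular) matrix. */
void spmv_async_offload(int n, int nnz, int *row_ptr, int *col_idx,
                        double *val, double *x, double *y)
{
    assert(n > 0 && nnz > 0);

    /* Start the asynchronous transfer. The device buffers are allocated
       here and kept alive (free_if(0)); `val` serves as the signal tag. */
    #pragma offload_transfer target(mic:0) \
        in(row_ptr : length(n + 1) alloc_if(1) free_if(0)) \
        in(col_idx : length(nnz)   alloc_if(1) free_if(0)) \
        in(val     : length(nnz)   alloc_if(1) free_if(0)) \
        signal(val)

    /* ... host work that does not depend on the transfer goes here ... */

    /* Wait for the transfer tagged `val`, reuse the device buffers
       without copying them again, and release them afterwards. */
    #pragma offload target(mic:0) wait(val) \
        nocopy(row_ptr : length(n + 1) alloc_if(0) free_if(1)) \
        nocopy(col_idx : length(nnz)   alloc_if(0) free_if(1)) \
        nocopy(val     : length(nnz)   alloc_if(0) free_if(1)) \
        in(x : length(n)) out(y : length(n))
    {
        for (int i = 0; i < n; ++i) {
            double sum = 0.0;
            for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
                sum += val[k] * x[col_idx[k]];
            y[i] = sum;
        }
    }
}
```

In a pattern like this, every pointer used inside the offload region has to refer to a buffer that was allocated on the device with the length the loop actually reads, which is one thing worth checking against the real code.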
I compile cg_mkl_csr_intel.c with the following command line:
icc -O3 -qopenmp cg_mkl_csr_intel.c -lm -mkl -o cg_mkl_csr_intel
I run the executable with:
./cg_mkl_csr_intel msym8.txt 8 1e-12
where msym8.csr is a text file containing a sparse symmetric matrix in CSR format (which I am also attaching
to this post).
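As a point of reference for the CSR layout, here is a minimal plain-C sketch of the A * x product with bounds checks on the index arrays. It is not the MKL routine and not the attached code: the name csr_matvec_checked and the array names are hypothetical, and it assumes zero-based indices with both triangles of the symmetric matrix stored. Running the input through a check like this on the host can help rule out out-of-range indices in the matrix file.

```c
#include <assert.h>

/* Hypothetical host-side reference: y = A * x for an n-by-n matrix in
   zero-based CSR (row_ptr has n + 1 entries; col_idx and val have
   row_ptr[n] entries). Asserts fire if an index is out of range. */
void csr_matvec_checked(int n, const int *row_ptr, const int *col_idx,
                        const double *val, const double *x, double *y)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        /* Row pointers must be non-decreasing. */
        assert(row_ptr[i] <= row_ptr[i + 1]);
        double sum = 0.0;
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
            /* Each column index must address a valid entry of x. */
            assert(col_idx[k] >= 0 && col_idx[k] < n);
            sum += val[k] * x[col_idx[k]];
        }
        y[i] = sum;
    }
}
```

For the 3-by-3 matrix with rows (4 1 0), (1 3 1), (0 1 2), the arrays would be row_ptr = {0, 2, 5, 7}, col_idx = {0, 1, 0, 1, 2, 1, 2} and val = {4, 1, 1, 3, 1, 1, 2}. If the file uses one-based indices or stores only one triangle, the loop and the checks would need to be adjusted accordingly.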
I would appreciate any help you can provide to solve this issue.
Kind regards,
Edoardo