Hello,
I have some questions about cblas_gemm_s8u8s32.
1. What is the reasoning behind requiring one side to be signed and the other unsigned?
2. When I do matrix multiplication with the cblas_gemm_s8u8s32 function, I find that with column-major layout, if values in the second operand (the unsigned int8 matrix) exceed 128, the result is wrong. What is the reason? And how do I multiply two signed int8 matrices?
3. I tried to use dnnl_gemm_s8s8s32 from MKL-DNN (DNNL), but unfortunately it was much slower than MKL's cblas_sgemm function for some matrix sizes.
4. I tested the performance of int8 GEMM (using cblas_gemm_s8u8s32) and float GEMM on my machine and found that int8 GEMM is only about as fast as float GEMM. Why? Do you have benchmark results for the two interfaces?
Thanks,
Jingjing Wang