In this paper, we propose and implement a mixed-precision Block-ISAI preconditioner for solving linear systems arising from multiphysics applications. By leveraging FP32 computing, our approach accelerates the sparse matrix-vector product kernel while maintaining satisfactory accuracy. We also propose an efficient, warp-based GPU implementation of the Block-ISAI preconditioner with Tensor Core acceleration: its matrix-multiplication portion is accelerated using the double-precision Tensor Cores of the NVIDIA A100 GPU. To demonstrate the effectiveness of our method, we present detailed comparisons showing notable speedups: 6x over cuSPARSE and 11.2x over PETSc's built-in preconditioner.
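The core mixed-precision idea in the abstract can be sketched as follows: the matrix is stored and applied in FP32 (the fast path on the GPU), while the rest of the solver works in FP64. This is a minimal NumPy illustration of that trade-off, not the paper's CUDA implementation; the function name `spmv_mixed` and the test matrix are illustrative assumptions.

```python
import numpy as np

def spmv_mixed(A32, x):
    """Apply the matrix in FP32, then promote the result back to FP64.

    Mimics the paper's idea of running the sparse matrix-vector
    product in single precision inside a double-precision solver.
    """
    return (A32 @ x.astype(np.float32)).astype(np.float64)

rng = np.random.default_rng(0)
n = 64
A = rng.standard_normal((n, n))
A = A @ A.T + n * np.eye(n)      # well-conditioned SPD test matrix (FP64)
A32 = A.astype(np.float32)       # FP32 copy used by the fast kernel
x = rng.standard_normal(n)

y64 = A @ x                      # full-precision reference product
y32 = spmv_mixed(A32, x)         # mixed-precision product
rel_err = np.linalg.norm(y64 - y32) / np.linalg.norm(y64)
print(f"relative error of FP32 product: {rel_err:.2e}")
```

For a well-conditioned matrix the relative error stays near FP32 machine epsilon (~1e-7), which is why the FP32 kernel can preserve "satisfactory accuracy" in the preconditioned iteration while roughly halving memory traffic.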
Funding: this work was funded by the Key Technologies Research and Development Program (No. 2020YFB1709500).