Mixed-precision block incomplete sparse approximate preconditioner on Tensor core
Authors: Haoyuan Zhang, Wenpeng Ma, Wu Yuan, Jian Zhang, Zhonghua Lu. CCF Transactions on High Performance Computing, 2024, Issue 1, pp. 54-67 (14 pages)
In this paper, we propose and implement a mixed-precision Block-ISAI preconditioner for solving linear systems from multiphysics areas. By leveraging FP32 computing, our approach accelerates the sparse matrix-vector product kernel while maintaining satisfactory accuracy. Meanwhile, an efficient, warp-based GPU implementation of the Block-ISAI preconditioner with Tensor core acceleration is proposed. For its matrix-multiplication portion, we use the double-precision Tensor cores of the NVIDIA A100 GPU. To showcase the effectiveness of our method, detailed comparisons are made, which show a noteworthy speedup: specifically, it is 6x faster than cuSPARSE and 11.2x faster than PETSc's built-in preconditioner.
Keywords: Block-ISAI; GPU; mixed-precision; Tensor core; preconditioner
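The abstract's claim that an FP32 sparse matrix-vector product can stay acceptably accurate can be illustrated with a small sketch. This is not the authors' GPU implementation: it is a plain-Python CPU emulation, and the helper names `to_fp32` and `spmv_csr`, the CSR layout, and the 3x3 test matrix are all assumptions chosen for illustration. FP32 arithmetic is emulated by rounding to single precision after every operation.

```python
import struct

def to_fp32(x):
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def spmv_csr(indptr, indices, data, x, rnd=lambda v: v):
    """Compute y = A @ x for a CSR matrix.

    rnd is applied to every operand and intermediate result, so passing
    to_fp32 emulates a single-precision kernel; the default is full FP64.
    """
    y = []
    for row in range(len(indptr) - 1):
        acc = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            prod = rnd(rnd(data[k]) * rnd(x[indices[k]]))
            acc = rnd(acc + prod)
        y.append(acc)
    return y

# Hypothetical tridiagonal test matrix in CSR form:
# [[ 4, -1,  0],
#  [-1,  4, -1],
#  [ 0, -1,  4]]
indptr = [0, 2, 5, 7]
indices = [0, 1, 0, 1, 2, 1, 2]
data = [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0]
x = [0.1, 0.2, 0.3]

y64 = spmv_csr(indptr, indices, data, x)               # FP64 reference
y32 = spmv_csr(indptr, indices, data, x, rnd=to_fp32)  # emulated FP32

# Relative error of the FP32 result, measured in the max norm.
rel_err = max(abs(a - b) for a, b in zip(y64, y32)) / max(abs(a) for a in y64)
print(rel_err)
```

For a well-conditioned matrix like this one, the FP32 result differs from the FP64 reference only at roughly single-precision rounding level, which is often tolerable inside a preconditioner because the outer iterative solver can still work in higher precision. How much accuracy loss is acceptable for a given problem is something the paper evaluates; this sketch makes no claim about those results.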