Funding: supported by the National Natural Science Foundation of China (Grant No. 62402525) and the Fundamental Research Funds for the Central Universities (Grant No. 2462023YJRC023).
Abstract: With the support of more precision formats in emerging hardware architectures, mixed precision has become a popular approach to accelerating deep learning (DL) training. Applying low-precision formats such as FP16 and BF16 to neural operators saves GPU memory while improving bandwidth utilization. However, DL frameworks rely on fixed black and white lists as the default mixed-precision selection and cannot flexibly adapt to the variety of neural networks. In addition, existing work on automatic precision adjustment does not consider model convergence, and the decision cost of precision selection is high. To address these problems, this paper proposes CoMP, a non-intrusive framework for Convergence-aware operator-wise Mixed-Precision training. CoMP uses a two-stage precision adjustment based on epochs and batches to ensure convergence and performance, respectively. CoMP then performs subsequent training according to the searched optimal operator-wise mixed-precision plan. Experimental results on an A100 GPU show that CoMP achieves a maximum speedup of 1.15× over the PyTorch AMP implementation while saving up to 29.81% of GPU memory.
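The two-stage adjustment can be sketched as a search over per-operator precision choices. The following is an illustrative sketch only, not CoMP's actual algorithm: stage 1 (epoch granularity) rejects low-precision candidates that hurt convergence, and stage 2 (batch granularity) keeps only those that actually speed up execution. All operator names, timings, and the loss model here are hypothetical.

```python
# Hypothetical two-stage operator-wise precision search (not CoMP's
# real implementation): filter by convergence first, then by speed.

def search_precision_plan(operators, fp16_loss_delta, fp16_speedup,
                          loss_tolerance=0.01):
    """Return the set of operators to run in FP16.

    operators        -- list of operator names
    fp16_loss_delta  -- per-op increase in validation loss when cast to FP16
    fp16_speedup     -- per-op batch-time speedup factor in FP16
    """
    # Stage 1: convergence-aware filter (epoch granularity).
    convergent = [op for op in operators
                  if fp16_loss_delta[op] <= loss_tolerance]
    # Stage 2: performance filter (batch granularity).
    return {op for op in convergent if fp16_speedup[op] > 1.0}

ops = ["conv1", "attention", "layernorm", "softmax"]
loss_delta = {"conv1": 0.001, "attention": 0.002,
              "layernorm": 0.05, "softmax": 0.0}
speedup = {"conv1": 1.4, "attention": 1.2,
           "layernorm": 1.1, "softmax": 0.9}

# layernorm fails the convergence filter; softmax fails the speed filter.
print(sorted(search_precision_plan(ops, loss_delta, speedup)))
```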
Funding: funded by the Key Technologies Research and Development Program (No. 2020YFB1709500).
Abstract: In this paper, we propose and implement a mixed-precision Block-ISAI preconditioner for solving linear systems from multiphysics areas. By leveraging FP32 computing, our approach accelerates the sparse matrix-vector product kernel while maintaining satisfactory accuracy. Meanwhile, an efficient, warp-based GPU implementation of the Block-ISAI preconditioner with Tensor Core acceleration is proposed: the matrix-multiplication portion is accelerated using the double-precision Tensor Cores on the NVIDIA A100 GPU. Detailed comparisons showcase the effectiveness of our method, with noteworthy speedups: it is 6× faster than cuSPARSE and 11.2× faster than PETSc's built-in preconditioner.
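The core mixed-precision idea behind the SpMV acceleration, independent of Block-ISAI itself, is to store matrix entries in FP32 (halving memory traffic) while accumulating in FP64. A minimal behavioural sketch, with FP32 rounding emulated via `struct` since pure-Python floats are FP64; the matrix layout here is a toy CSR-like format, not the paper's warp-based GPU kernel:

```python
# Minimal mixed-precision SpMV sketch: FP32 storage, FP64 accumulation.
import struct

def to_fp32(x):
    """Round a Python float to the nearest FP32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def spmv_mixed(rows, x):
    """Toy sparse matrix-vector product.

    rows -- list of rows, each a list of (column_index, value) pairs
    x    -- dense vector (FP64)
    """
    y = []
    for row in rows:
        acc = 0.0                      # FP64 accumulator
        for j, v in row:
            acc += to_fp32(v) * x[j]   # entries rounded to FP32
        y.append(acc)
    return y

A = [[(0, 2.0), (1, 1.0)],
     [(1, 3.0)]]
print(spmv_mixed(A, [1.0, 1.0]))  # → [3.0, 3.0]
```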
Funding: supported by the National Key Research and Development Program of China under Grant 2023YFB2806000, the Postdoctoral Fellowship Program of CPSF under Grant GZC20241305, and the Proof of Concept Foundation of Xidian University Hangzhou Institute of Technology under Grant GNYZ2024JC004.
Abstract: Large language models (LLMs) have exhibited remarkable performance across a broad spectrum of tasks, yet their extensive computational and memory requirements present substantial challenges for deployment in resource-constrained scenarios. To address these challenges, this work introduces software and hardware co-optimization strategies aimed at enhancing the inference performance of LLMs on ARM CPU-based platforms. A mixed-precision quantization technique is employed, preserving the precision of critical weights to maintain model accuracy while quantizing non-essential weights to INT8, thereby reducing the model's memory footprint. This work also capitalizes on the SIMD instruction set of ARM CPUs to efficiently process model data. Furthermore, the inference framework is optimized by fusing components of the attention computation and streamlining the dequantization process through modifications to the scaling factor. These enhancements significantly reduce model memory usage and improve throughput during the prefill and decode stages. The efficacy of the proposed approach is demonstrated by optimizing the Qwen-1.8B model on Armv9: with only a 0.66% decrease in accuracy, memory usage drops to 58.8% of the baseline, while inference performance increases by 4.09× and 15.23× over the baseline for the prefill and decode stages, respectively.
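The mixed-precision quantization idea can be sketched as follows: weights judged critical stay in full precision, the rest map to INT8 through a scaling factor and back on dequantization. This is a hedged illustration only; the magnitude-threshold criterion for "critical" used here is an assumption, and the paper's actual selection rule may differ.

```python
# Hypothetical mixed-precision INT8 quantization sketch. Weights with
# |w| >= critical_threshold are kept in full precision; the rest are
# quantized to signed INT8 with one per-tensor scaling factor.

def quantize_mixed(weights, critical_threshold=1.0):
    critical = {i: w for i, w in enumerate(weights)
                if abs(w) >= critical_threshold}
    rest = [w for i, w in enumerate(weights) if i not in critical]
    scale = max((abs(w) for w in rest), default=1.0) / 127.0
    q = {i: round(w / scale) for i, w in enumerate(weights)
         if i not in critical}
    return critical, q, scale

def dequantize_mixed(critical, q, scale, n):
    return [critical[i] if i in critical else q[i] * scale
            for i in range(n)]

w = [0.1, -0.05, 2.5, 0.02]
crit, q, s = quantize_mixed(w)
restored = dequantize_mixed(crit, q, s, len(w))
print(crit)  # only the large weight (index 2) stays in full precision
print(max(abs(a - b) for a, b in zip(w, restored)))  # small INT8 error
```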
Funding: partially supported by the PACMAN (Parallel Architecture and Compiler technology of Mobile, Accelerated, and Networked systems) Laboratory of Tsinghua University.
Abstract: To meet the demand for large computing power to train complex deep neural networks (DNNs), we establish an AI ecosystem on the Sunway platform to utilize the Sunway series of high-performance computers (HPC). We provide a specially optimized accelerating library for DNN operators on Sunway, namely SWDNNv2, supporting both single precision and half precision. Based on this highly efficient library, we refactor the PyTorch framework to fit the Sunway platform by adopting hardware-specific acceleration and MPI backend support. A Python-interface-based lightweight framework named SWMind is also developed from scratch to provide higher performance for some domain models. Techniques for training large models, including mixed precision and hybrid parallelism, are also discussed. The toolkits in the AI ecosystem have been applied to actual projects, such as training a large-scale multi-modality model: we have managed to train a 1-billion-parameter model and achieve performance relatively close to that of the NVIDIA Tesla V100. The high efficiency of SWDNNv2 is demonstrated by the performance of the GEMM operator, which achieves 88.23% and 84.5% of the FP32 and FP16 theoretical peak FLOPS on the SW many-core CPU. The evaluation also shows the scalability of the AI framework by training a ResNet-50 model, with a parallel efficiency of 91.51% when scaling to 1024 CPUs.
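Peak-efficiency figures like the 88.23% quoted above follow from the standard GEMM flop count of 2·M·N·K. A small helper showing the arithmetic; the matrix size, runtime, and peak rate below are made-up illustration values, not Sunway measurements:

```python
# Compute achieved GEMM FLOPS and fraction of theoretical peak.

def gemm_efficiency(m, n, k, seconds, peak_flops):
    """A GEMM of shape (m x k) @ (k x n) costs 2*m*n*k flops."""
    flops = 2.0 * m * n * k / seconds
    return flops, flops / peak_flops

# Hypothetical example: a 4096^3 GEMM in 15 ms on a 14 TFLOPS device.
achieved, frac = gemm_efficiency(4096, 4096, 4096, 0.015, 14e12)
print(f"{achieved / 1e12:.2f} TFLOPS, {frac:.1%} of peak")
```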
Funding: supported by the NuSCAP (ANR-20-CE48-0014) project of the French National Agency for Research (ANR), the 173 program (2020-JCJQ-ZD-029), and the Science Challenge Project (TZ2016002).
Abstract: With the rapid development of supercomputers, large-scale computing has become increasingly widespread in various scientific research and engineering fields. Meanwhile, the precision and efficiency of large-scale floating-point arithmetic have always been a research hotspot in high-performance computing. This paper studies numerical methods for solving large-scale sparse linear equations, where the accumulation of rounding errors during the solution process leads to inaccurate results and large-scale data gives the solver a long running time. To address these issues, we use error-free transformation technology and mixed-precision ideas to construct XHYPRE, a reliable parallel numerical algorithm framework based on HYPRE, which solves large-scale sparse linear equations with improved accuracy and accelerated numerical calculation. We illustrate the implementation details of our technique with two cases. In the first, we use error-free transformation technology to design high-precision iterative algorithms, such as GMRES, PCG, and BiCGSTAB, which reduce rounding errors in the calculation process and make the results more accurate. In the second, we propose a mixed-precision iterative algorithm that utilizes low-precision formats to achieve higher computing throughput and reduce computing time. Experimental results demonstrate that XHYPRE has higher reliability and effectiveness: it is on average 1.3× faster than HYPRE and reduces the number of iterations to 87.1% of HYPRE's on average.
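The "error-free transformation" the abstract refers to is typified by Knuth's classic TwoSum: a floating-point addition a + b is split into the rounded sum s and its exact rounding error e, so that s + e equals a + b exactly. Compensated summation built on TwoSum is the standard way such transformations feed into high-precision iterative solvers; a minimal sketch (not XHYPRE's code):

```python
# Error-free transformation: TwoSum and a compensated summation on top.

def two_sum(a, b):
    """Knuth's TwoSum: return (s, e) with s = fl(a + b) and a + b = s + e."""
    s = a + b
    bv = s - a          # the part of b actually absorbed into s
    av = s - bv         # the part of a actually absorbed into s
    return s, (a - av) + (b - bv)

def compensated_sum(xs):
    """Sum xs while carrying the exact accumulated rounding error."""
    s = 0.0
    err = 0.0
    for x in xs:
        s, e = two_sum(s, x)
        err += e
    return s + err

xs = [1e16, 1.0, -1e16]
print(sum(xs))              # naive summation loses the 1.0 to rounding
print(compensated_sum(xs))  # the error-free transformation recovers it
```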
Funding: funded by the National Key R&D Program of China under Grant 2018YFB2202800 and the National Natural Science Foundation of China under Grant 61904028.
Abstract: To realize highly efficient magnetization switching in magnetic tunnel junctions (MTJs), several potential mechanisms have been exploited as interplay effects in MTJ devices, such as the interaction between spin-orbit torque and spin-transfer torque (STT), and between voltage-controlled magnetic anisotropy (VCMA) and STT. These interplay mechanisms have been experimentally explored with improved switching energy efficiency compared with the traditional STT method. Considering the requirements of mixed-precision memory, we propose a novel write-only in-memory computing paradigm based on interplay bitwise operations in two-terminal or three-terminal MRAM bit-cells, which aims to reduce the layout overhead of peripheral computing circuits as well as to eliminate read-decision failures during in-memory computing. Specifically, the proposed write-only bitwise in-memory computing is demonstrated with OR, AND, XOR, and full-adder operations. Four nonvolatile approximate full adders (AxFAs) are proposed and implemented in different MRAM bit-cells. The AxFAs can be easily reconfigured into memory units with simple connections. Image processing applications, including full-adder and XOR operations, are used to demonstrate the in-memory computing. Compared with the traditional sensing-based approach, more than 80% energy reduction is obtained using the proposed interplay write-only in-memory computing with the approximation setup, and a 61.4% energy reduction is achieved using the VCMA-interaction-based XOR functions.
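The bitwise operations listed above compose into a full adder: sum = a XOR b XOR cin, carry-out = majority(a, b, cin). Below is a purely behavioural model (no MRAM physics) of the exact full adder next to one common approximation that drops the XOR with the carry-in, illustrating the accuracy/energy trade-off that AxFAs exploit; this particular approximation is a generic textbook one, not necessarily any of the paper's four designs.

```python
# Behavioural full adder vs. a generic approximate full adder.

def full_adder(a, b, cin):
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)   # majority function
    return s, cout

def approx_full_adder(a, b, cin):
    # Approximation: the sum bit ignores cin, so it is wrong
    # whenever cin = 1 (4 of the 8 input patterns).
    return a ^ b, (a & b) | (a & cin) | (b & cin)

errors = sum(full_adder(a, b, c)[0] != approx_full_adder(a, b, c)[0]
             for a in (0, 1) for b in (0, 1) for c in (0, 1))
print(errors)  # → 4
```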