2 articles found
A case study of 3D RTM-TTI algorithm on multicore and many-core platforms
1
Authors: 张秀霞 (Zhang Xiuxia), Tan Guangming, Chen Mingyu, Yao Erlin. High Technology Letters (EI, CAS), 2017, Issue 2, pp. 185-190 (6 pages)
3D reverse time migration in tilted transversely isotropic media (3D RTM-TTI) is the most precise model for complex seismic imaging. However, the vast computing time of 3D RTM-TTI prevents it from being widely used. This is addressed by providing parallel solutions for 3D RTM-TTI on multicore and many-core platforms. After data parallelism and memory optimization, the hot-spot function of 3D RTM-TTI gains a 35.99× speedup on two Intel Xeon CPUs, an 89.75× speedup on one Intel Xeon Phi, and an 89.92× speedup on one NVIDIA K20 GPU, compared with the serial CPU baseline. This study makes RTM-TTI practical in industry. Since the computation pattern in RTM is a stencil, the approaches also benefit a wide range of stencil-based applications.
Keywords: 3D RTM-TTI, Intel Xeon Phi, NVIDIA K20 GPU, stencil computing, manycore, multicore, seismic imaging
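The abstract above notes that the computation pattern in RTM is a stencil: each output grid point is computed from a fixed neighborhood of input points. As a rough illustration only, the sketch below applies a 7-point Laplacian stencil to a flat 3D grid. The function name `laplacian_7pt` and the grid layout are assumptions made for this example; the RTM-TTI kernel in the paper is not shown in this listing and is likely to use a wider, higher-order stencil.

```python
def laplacian_7pt(u, nx, ny, nz):
    """Apply a 7-point Laplacian stencil to the interior of a flat 3D grid.

    `u` is a list of nx*ny*nz floats stored with k as the fastest index.
    Boundary points are left at 0.0. This is an illustrative stencil,
    not the kernel used in the paper.
    """
    def idx(i, j, k):
        # Map 3D coordinates to the flat index.
        return (i * ny + j) * nz + k

    out = [0.0] * (nx * ny * nz)
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            for k in range(1, nz - 1):
                # Sum of the six face neighbors minus six times the center.
                out[idx(i, j, k)] = (
                    u[idx(i - 1, j, k)] + u[idx(i + 1, j, k)]
                    + u[idx(i, j - 1, k)] + u[idx(i, j + 1, k)]
                    + u[idx(i, j, k - 1)] + u[idx(i, j, k + 1)]
                    - 6.0 * u[idx(i, j, k)]
                )
    return out
```

Every interior point is independent of the others within one sweep, which is what makes this pattern a natural fit for the data parallelism the abstract mentions.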
PsmArena: Partitioned Shared Memory for NUMA-Awareness in Multithreaded Scientific Applications
2
Authors: Zhang Yang, Aiqing Zhang, Zeyao Mo. Tsinghua Science and Technology (SCIE, EI, CAS, CSCD), 2021, Issue 3, pp. 287-295 (9 pages)
The Distributed Shared Memory (DSM) architecture is widely used in today's computer design to mitigate the ever-widening processor-memory gap, and it inevitably exposes Non-Uniform Memory Access (NUMA) to shared-memory parallel applications. Failure to adapt to the NUMA effect can significantly degrade application performance, especially on today's manycore platforms with tens to hundreds of cores. However, traditional approaches such as first-touch and memory policy fall short with respect to false page-sharing, fragmentation, or ease of use. In this paper, we propose a partitioned shared-memory approach that allows multithreaded applications to achieve full NUMA-awareness with only minor code changes, and we develop an accompanying NUMA-aware heap manager that eliminates false page-sharing and minimizes fragmentation. Experiments on a 256-core cc-NUMA computing node show that the proposed approach helps applications adapt to NUMA with only minor code changes and improves the performance of typical multithreaded scientific applications by up to 4.3 times as the number of cores in use increases.
Keywords: partitioned shared memory, Non-Uniform Memory Access (NUMA), heap manager, multithread, manycore
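The false page-sharing mentioned in the abstract can be pictured with a small calculation. Under a first-touch policy, a memory page is placed on the NUMA node of the thread that first writes to it, so if two threads' data fall in the same page, one of them ends up accessing remote memory. The sketch below splits an array into per-thread ranges that end on page boundaries, so that no page is shared between threads. It is an illustration only and not PsmArena's interface: the function name `page_aligned_partitions` is hypothetical, and it assumes the array starts on a page boundary and that the element size divides the page size.

```python
def page_aligned_partitions(n_elems, elem_size, n_threads, page_size=4096):
    """Split [0, n_elems) into contiguous per-thread ranges on page boundaries.

    Assumptions (for illustration): the array is page-aligned and
    elem_size divides page_size. Returns a list of (start, end) pairs,
    one per thread; a range may be empty if there are fewer pages
    than threads.
    """
    elems_per_page = page_size // elem_size
    n_pages = -(-n_elems // elems_per_page)  # ceiling division
    base, extra = divmod(n_pages, n_threads)

    ranges = []
    start_page = 0
    for t in range(n_threads):
        # The first `extra` threads take one additional page each.
        end_page = start_page + base + (1 if t < extra else 0)
        lo = min(start_page * elems_per_page, n_elems)
        hi = min(end_page * elems_per_page, n_elems)
        ranges.append((lo, hi))
        start_page = end_page
    return ranges
```

For example, 2048 eight-byte elements on 4 threads with 4 KiB pages give one 512-element page per thread, so each thread can initialize its own range without touching a page that belongs to another thread.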