Accelerating Calculation of Cholesky Factorisation of Matrix with GPU
(Original title: 使用GPU加速计算矩阵的Cholesky分解) Cited by: 3
Abstract: This article describes a concrete implementation of Cholesky factorisation on a graphics processing unit (GPU) for large real symmetric positive definite matrices. The hybrid parallel algorithm presented by Volkov for computing the Cholesky factorisation is analysed in detail. On that basis, and according to the computational performance of the CPU and GPU of the authors' own computer, a more reasonable three-phase hybrid scheduling scheme is proposed, which further reduces CPU idle time and avoids GPU idle periods. Numerical experiments show that, when the matrix order exceeds 7000, the new hybrid scheduling algorithm achieves a speedup of more than 5 times over the standard MKL algorithm, and that it also performs significantly better than the original Volkov hybrid algorithm.
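To make the structure behind a hybrid CPU/GPU Cholesky factorisation concrete, the following is a minimal NumPy sketch of a generic blocked (right-looking) factorisation. It is not the paper's three-phase schedule, which the abstract does not spell out. The function name `blocked_cholesky`, the block size, and the comments on which step might run on which device are illustrative assumptions. Everything here runs on the CPU.

```python
import numpy as np


def blocked_cholesky(a, block=64):
    """Return lower-triangular L with L @ L.T == a for a symmetric positive definite a.

    Generic blocked right-looking variant, shown only to illustrate the three
    kinds of work a hybrid CPU/GPU scheme has to schedule (assumed split, not
    the paper's actual schedule).
    """
    l = np.array(a, dtype=float)  # work on a copy of the input
    n = l.shape[0]
    for k in range(0, n, block):
        e = min(k + block, n)
        # Step 1: factor the small diagonal block (little parallelism;
        # a natural candidate for the CPU in a hybrid scheme).
        l[k:e, k:e] = np.linalg.cholesky(l[k:e, k:e])
        if e < n:
            # Step 2: panel solve, L21 = A21 * L11^{-T}.
            l[e:, k:e] = np.linalg.solve(l[k:e, k:e], l[e:, k:e].T).T
            # Step 3: trailing update, A22 -= L21 * L21^T (large matrix
            # product; a natural candidate for the GPU in a hybrid scheme).
            l[e:, e:] -= l[e:, k:e] @ l[e:, k:e].T
    return np.tril(l)  # discard the stale upper triangle
```

In a sketch like this the trailing update in step 3 dominates the arithmetic for large matrices, which is why hybrid schemes concentrate on keeping that step busy on the faster device while the small diagonal factorisation proceeds elsewhere.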
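Authors: Shen Cong (沈聪), Gao Huotao (高火涛)
Source: Computer Applications and Software (《计算机应用与软件》), CSCD, 2016, No. 9, pp. 284-287, 305 (5 pages)
Funding: Key Project of the Natural Science Foundation of Hubei Province (ZRZ2014000286)
Keywords: graphics processing unit (GPU); Cholesky factorisation; speedup; hybrid algorithm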

References (10)

  • 1 Chandrasekar J, Kim I S, Bernstein D S, et al. Reduced-Rank Unscented Kalman filtering using Cholesky-based decomposition[C]//American Control Conference, June, 2008: 1274-1279.
  • 2 Yu H, Chung C Y, Wong K P, et al. Probabilistic Load Flow Evaluation With Hybrid Latin Hypercube Sampling and Cholesky Decomposition[J]. IEEE Transactions on Power Systems, 2009, 24(2): 661-667.
  • 3 David S. Watkins. Fundamentals of Matrix Computations[M]. New York: John Wiley and Sons, 2013.
  • 4 Gene H. Golub, Charles F. Van Loan. Matrix Computations[M]. Baltimore: Johns Hopkins University Press, 2013.
  • 5 Volkov V, Demmel J W. Benchmarking GPUs to tune dense linear algebra[C]//Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, Nov, 2008: 1-11.
  • 6 Hu Pengfei, Yuan Zhiyong, Liao Xiangyun, Zheng Qi, Chen Erhu. SPH fluid simulation method based on CPU-GPU hybrid acceleration[J]. Computer Engineering and Science (计算机工程与科学), 2014, 36(7): 1231-1237. Cited by: 3
  • 7 Zhang Jian, Jiao Liangbao, Chen Rui. Research on ray tracing of dynamic scenes on a CPU-GPU hybrid platform[J]. Computer Engineering and Applications (计算机工程与应用), 2012, 48(21): 151-154. Cited by: 5
  • 8 Yaohung M Tsai, Weichung Wang, Ray-Bing Chen. Tuning Block Size for QR Factorization on CPU-GPU Hybrid Systems[C]//Proceedings of the IEEE 6th International Symposium on Embedded Multicore SoCs, Sept, 2012: 205-211.
  • 9 John Cheng, Max Grossman, Ty McKercher. CUDA C Programming[M]. Indianapolis: John Wiley & Sons, 2014.
  • 10 Liu Jinshuo, Deng Juan, Zhou Zheng, et al. Parallel Programming Based on CUDA (基于CUDA的并行程序设计)[M]. Beijing: Science Press, 2014.
