
Implementation of sparse LU decomposition using FPGAs (稀疏矩阵LU分解的FPGA实现)
Abstract: This work studies the numerical computation of sparse LU decomposition, the most time-consuming step in solving sparse linear systems with direct methods, and presents a parallel sparse LU decomposition algorithm that exposes more parallelism through dynamic dependence detection. A hardware structure that implements this parallel algorithm on field programmable gate arrays (FPGAs) is also proposed. The structure does not depend on the sparsity structure information of the decomposition factors; the data structures of the factors are generated dynamically. Compared with related work, the new hardware structure is more general. Experimental results show that the new structure outperforms software implementations on general-purpose processors.
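To make the idea of dynamically generated factor structure concrete, the following is a minimal Python sketch, not the authors' algorithm or hardware design. It assumes a right-looking LU decomposition without pivoting on a dict-of-dicts matrix; the name `sparse_lu` and the storage layout are hypothetical choices made for illustration. Fill-in entries are created on demand, and whether a row depends on elimination step k is decided at run time by checking for a nonzero in column k.

```python
def sparse_lu(rows):
    """In-place sparse LU without pivoting (illustrative sketch only).

    rows[i][j] holds the nonzero a_ij. After the call, entries with j < i
    are the multipliers of L (unit diagonal implied) and entries with
    j >= i form U.
    """
    n = len(rows)
    for k in range(n):
        pivot = rows[k][k]  # assumption: pivot is nonzero, no pivoting here
        for i in range(k + 1, n):
            # Run-time dependence check: row i is touched by step k only
            # if it currently has a nonzero in column k.
            if k not in rows[i]:
                continue
            m = rows[i][k] / pivot
            rows[i][k] = m  # store the L multiplier in place
            for j, u in rows[k].items():
                if j > k:
                    # Fill-in: the entry is created here if it was absent,
                    # so the factor structure is never precomputed.
                    rows[i][j] = rows[i].get(j, 0.0) - m * u
    return rows


# Small example: A = [[2, 1, 1], [4, 3, 0], [2, 0, 4]]
A = {0: {0: 2.0, 1: 1.0, 2: 1.0},
     1: {0: 4.0, 1: 3.0},
     2: {0: 2.0, 2: 4.0}}
LU = sparse_lu(A)
```

In this sketch the updates of different rows i within one step k touch disjoint rows, so they are independent tasks; a parallel implementation could dispatch them concurrently once the dependence check passes.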
Source: Chinese High Technology Letters (《高技术通讯》), indexed in CAS, CSCD, and PKU Core; 2013, No. 8, pp. 789-796 (8 pages)
Funding: Supported by the National Natural Science Foundation of China (61125201)
Keywords: sparse matrix; LU decomposition; parallel algorithm; field programmable gate array (FPGA); task parallelism

