Abstract
This paper studies the numerical computation of sparse LU decomposition, the most time-consuming step in solving sparse linear systems with direct methods, and presents a parallel sparse LU decomposition algorithm that exploits more parallelism through dynamic dependence detection. A hardware architecture that implements this parallel algorithm on field programmable gate arrays (FPGAs) is also proposed. The architecture does not rely on the sparsity structure of the decomposition factors; the data structures of the factors are generated dynamically. Compared with related work, the new architecture is more general. Experimental results show that it outperforms software implementations on general-purpose processors.
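To make the idea of dynamic dependence detection concrete, the following is a minimal software sketch in Python. It is not the algorithm or the FPGA architecture from the paper. It assumes a row-oriented (up-looking) LU factorization without pivoting and with nonzero pivots. The names `sparse_lu_rows` and `dependency_levels` are hypothetical. Dependences between rows, including those caused by fill-in, are discovered while the factors are computed rather than from a precomputed symbolic structure, and the factor rows are built dynamically as dictionaries.

```python
from typing import Dict, List, Set, Tuple

Row = Dict[int, float]  # column index -> nonzero value


def sparse_lu_rows(rows: List[Row]) -> Tuple[List[Row], List[Row], List[Set[int]]]:
    """Factor A = L * U row by row (assumption: no pivoting, nonzero pivots).

    Returns L, U (as lists of sparse rows) and deps, where deps[i] is the set
    of earlier rows that row i turned out to depend on during factorization.
    """
    n = len(rows)
    L: List[Row] = [dict() for _ in range(n)]
    U: List[Row] = [dict() for _ in range(n)]
    deps: List[Set[int]] = [set() for _ in range(n)]
    for i in range(n):
        work = dict(rows[i])  # working copy of row i, grows with fill-in
        while True:
            # Columns left of the diagonal that still hold an entry.
            pending = [k for k in work if k < i]
            if not pending:
                break
            k = min(pending)  # eliminate from left to right
            factor = work.pop(k) / U[k][k]
            L[i][k] = factor
            deps[i].add(k)  # dependence detected at run time
            for j, u in U[k].items():
                if j > k:
                    # May create a new entry (fill-in) in the working row.
                    work[j] = work.get(j, 0.0) - factor * u
        L[i][i] = 1.0
        U[i] = work  # remaining entries lie on or right of the diagonal
    return L, U, deps


def dependency_levels(deps: List[Set[int]]) -> List[int]:
    """Rows with the same level have no dependence on each other."""
    level: List[int] = []
    for d in deps:
        level.append(1 + max((level[k] for k in d), default=-1))
    return level


# Small example: row 2 has no entry in column 1 initially.
A = [{0: 2.0, 1: 1.0}, {1: 3.0}, {0: 4.0, 2: 5.0}]
L, U, deps = sparse_lu_rows(A)
levels = dependency_levels(deps)
```

In this example rows 0 and 1 land in level 0 and could be factored concurrently, while row 2 lands in level 1. Its dependence on row 1 does not appear in the original matrix; it arises from fill-in created while eliminating column 0, which is why it can only be found dynamically in this sketch.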
Source
《高技术通讯》
CAS
CSCD
PKU Core Journals (北大核心)
2013, No. 8, pp. 789-796 (8 pages)
Chinese High Technology Letters
Funding
Supported by the National Natural Science Foundation of China (61125201)