In this paper, a novel parallel structured divide-and-conquer (DC) algorithm, denoted PBSDC, is proposed for symmetric banded eigenvalue problems. It modifies the classical parallel banded DC (PBDC) algorithm to reduce its computational cost. The main tool that PBSDC uses is a parallel structured matrix multiplication algorithm (PSMMA), which can be much faster than PDGEMM, the general dense matrix multiplication routine in ScaLAPACK. Numerous experiments have been performed on the Tianhe-2 supercomputer to compare PBSDC with PBDC and ELPA. For matrices with few deflations, PBSDC can be much faster than PBDC because it performs fewer computations. For matrices with many deflations and/or small bandwidths, PBSDC can be faster than the tridiagonalization-based DC implemented in LAPACK and ELPA. However, PBSDC becomes slower than ELPA for matrices with relatively large bandwidths.
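The abstract does not spell out the merge step or the structure that PSMMA exploits. As a rough illustration only, and assuming the textbook DC formulation in which each merge reduces to eigenproblems of the form diag(d) + rho * z z^T, the sketch below solves one such rank-one update serially in plain Python. The function names (`secular`, `rank_one_eigh`, `residual`) are hypothetical and are not taken from PBSDC, PBDC, ScaLAPACK, or ELPA. It also assumes the case without deflation: rho > 0, strictly increasing d, and every z_i nonzero.

```python
import math


def secular(lam, d, z, rho):
    """Secular function f(lam) = 1 + rho * sum_i z_i^2 / (d_i - lam)."""
    return 1.0 + rho * sum(zi * zi / (di - lam) for di, zi in zip(d, z))


def rank_one_eigh(d, z, rho, max_iter=200):
    """Eigenpairs of diag(d) + rho * z z^T by bisection on the secular equation.

    Assumptions (illustrative, no deflation handling): rho > 0,
    d strictly increasing, all z_i nonzero.
    """
    n = len(d)
    # For rho > 0 the largest eigenvalue is at most d[-1] + rho * ||z||^2.
    upper = d[-1] + rho * sum(zi * zi for zi in z)
    eigvals, eigvecs = [], []
    for i in range(n):
        # Interlacing: the i-th eigenvalue lies between d[i] and d[i+1].
        lo = d[i]
        hi = d[i + 1] if i + 1 < n else upper
        for _ in range(max_iter):
            mid = 0.5 * (lo + hi)
            if mid <= lo or mid >= hi:
                break  # interval has shrunk to adjacent floats
            # f is increasing on the interval, so f(mid) < 0 means the
            # root is to the right of mid.
            if secular(mid, d, z, rho) < 0.0:
                lo = mid
            else:
                hi = mid
        lam = 0.5 * (lo + hi)
        # Eigenvector: v is proportional to (D - lam I)^{-1} z.
        v = [zj / (dj - lam) for dj, zj in zip(d, z)]
        norm = math.sqrt(sum(vj * vj for vj in v))
        eigvals.append(lam)
        eigvecs.append([vj / norm for vj in v])
    return eigvals, eigvecs


def residual(d, z, rho, lam, v):
    """Euclidean norm of (diag(d) + rho * z z^T) v - lam * v."""
    zv = sum(zj * vj for zj, vj in zip(z, v))
    r = [dj * vj + rho * zj * zv - lam * vj for dj, zj, vj in zip(d, z, v)]
    return math.sqrt(sum(rj * rj for rj in r))


if __name__ == "__main__":
    d = [1.0, 2.0, 4.0]
    z = [0.5, 0.5, 0.5]
    rho = 1.0
    vals, vecs = rank_one_eigh(d, z, rho)
    for lam, v in zip(vals, vecs):
        print(lam, residual(d, z, rho, lam, v))
```

Each eigenvector has entries z_j / (d_j - lambda_i) up to scaling, so under this assumed formulation the eigenvector matrix is determined by the three vectors d, z, and lambda rather than by n^2 independent entries. That kind of structure is what a structured multiplication could exploit in place of a dense product; whether PSMMA does exactly this is not stated in the abstract.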
Funding: Supported in part by NSFC (No. 2021YFB0300101, 62073333, 61902411, 62032023, 12002382, 11275269, 42104078); the 173 Program of China (2020-JCJQ-ZD-029); the Open Research Fund from the State Key Laboratory of High Performance Computing of China (HPCL) (No. 202101-01); the Guangdong Natural Science Foundation (2018B030312002); and the Program for Guangdong Introducing Innovative and Entrepreneurial Teams under Grant (No. 2016ZT06D211). Supported by the Spanish Agencia Estatal de Investigación (AEI) under project SLEPc-DA (PID2019-107379RB-I00).