In this paper, we consider using Schur complements to design preconditioners for twofold and block tridiagonal saddle point problems. One type of preconditioner is based on the nested (or recursive) Schur complement; the other is based on an additive-type Schur complement obtained after permuting the original saddle point system. We analyze different preconditioners that incorporate the exact Schur complements. We show that some of them lead to positively stable preconditioned systems if proper signs are selected in front of the Schur complements. These positively stable preconditioners outperform the other preconditioners when the Schur complements are further approximated inexactly. Numerical experiments for a 3-field formulation of the Biot model are provided to verify our predictions.
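For orientation, one common way to write a block tridiagonal saddle point matrix and its nested Schur complements is sketched below. The block names $A_i$, $B_i$ and the sign convention are assumptions made for illustration and are not taken from the abstract; $A_1$ and $S_2$ are assumed invertible.

```latex
\mathcal{A} =
\begin{pmatrix}
A_1 & B_1^{T} & 0 \\
B_1 & -A_2 & B_2^{T} \\
0 & B_2 & A_3
\end{pmatrix},
\qquad
S_1 = A_1, \quad
S_2 = A_2 + B_1 S_1^{-1} B_1^{T}, \quad
S_3 = A_3 + B_2 S_2^{-1} B_2^{T}.
```

Under this convention, block Gaussian elimination produces the pivots $S_1$, $-S_2$, $S_3$, which is one way to see why the sign placed in front of each Schur complement in a preconditioner is a genuine choice.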
This paper proposes a parallel algorithm, called KDOP (K-Dimensional Optimal Parallel algorithm), to solve a general class of recurrence equations efficiently. The KDOP algorithm partitions the computation into a series of subcomputations; in each one, all processors work simultaneously, each executing an optimal sequential algorithm on its own subcomputation task. The algorithm solves the equations in O(N/p) steps in the EREW PRAM model (Exclusive Read Exclusive Write Parallel Random Access Machine model) using p < N^(1-e) processors, where N is the size of the problem and e is a given constant. This is an optimal algorithm (its speedup is O(p)) in the case of p < N^(1-e). Such an optimal speedup for this problem was previously achieved only in the case of p < N^0.5. The algorithm can be implemented on machines with multiple processing elements or on pipelined vector machines with parallel memory systems.
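The abstract does not spell out KDOP itself, so the following is only a minimal sketch of the general idea of splitting a recurrence into blocks that each run a sequential algorithm. It is not the KDOP algorithm. It assumes a first-order linear recurrence x_i = a_i * x_(i-1) + b_i, the function names are hypothetical, and the loops over blocks are written serially even though each block is independent and could run on its own processor.

```python
from typing import List, Sequence, Tuple


def solve_sequential(a: Sequence[int], b: Sequence[int], x0: int) -> List[int]:
    """Plain sequential evaluation of x_i = a_i * x_(i-1) + b_i."""
    xs = []
    x = x0
    for ai, bi in zip(a, b):
        x = ai * x + bi
        xs.append(x)
    return xs


def solve_blocked(a: Sequence[int], b: Sequence[int], x0: int, p: int) -> List[int]:
    """Two-level blocked evaluation with at most p blocks (illustrative only)."""
    n = len(a)
    if n == 0:
        return []
    size = -(-n // p)  # ceiling of n / p
    bounds: List[Tuple[int, int]] = [
        (lo, min(lo + size, n)) for lo in range(0, n, size)
    ]

    # Phase 1 (independent per block): compose the block's steps into a
    # single affine map x -> A * x + B using a sequential sweep.
    summaries = []
    for lo, hi in bounds:
        A, B = 1, 0
        for i in range(lo, hi):
            A, B = a[i] * A, a[i] * B + b[i]
        summaries.append((A, B))

    # Phase 2 (sequential over at most p summaries): find the value
    # entering each block.
    starts = []
    x = x0
    for A, B in summaries:
        starts.append(x)
        x = A * x + B

    # Phase 3 (independent per block): replay each block from its start value.
    xs: List[int] = []
    for (lo, hi), start in zip(bounds, starts):
        xs.extend(solve_sequential(a[lo:hi], b[lo:hi], start))
    return xs
```

With one processor per block, phases 1 and 3 take O(N/p) steps each and phase 2 takes O(p) steps, so this simple scheme runs in O(N/p + p) steps and is only optimal while p stays at or below roughly N^0.5.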
Funding: the NIH-RCMI (Grant No. 347U54MD013376); the affiliated project award from the Center for Equitable Artificial Intelligence and Machine Learning Systems at Morgan State University (Project ID 02232301); the National Science Foundation awards (Grant No. 1831950). The work of G. Ju is supported in part by the National Key R&D Program of China (Grant No. 2017YFB1001604), the National Natural Science Foundation of China (Grant No. 11971221), the Shenzhen Sci-Tech Fund (Grant Nos. RCJC20200714114556020, JCYJ20170818153840322, JCYJ20190809150413261), and the Guangdong Provincial Key Laboratory of Computational Science and Material Design (Grant No. 2019B030301001).