Distributed learning is a well-established method for estimation tasks over extensively distributed datasets. However, non-randomly stored data can introduce bias into local parameter estimates, leading to significant performance degradation in classical distributed algorithms. In this paper, the authors propose a novel Distributed Quasi-Newton Pilot (DQNP) method for distributed learning with non-randomly distributed data. The proposed approach accommodates both randomly and non-randomly distributed data settings and imposes no constraints on the uniformity of local sample sizes. Additionally, it avoids the need to transfer the Hessian matrix or compute its inverse, thereby greatly reducing computational and communication complexity. The authors theoretically demonstrate that the resulting estimator achieves statistical efficiency under mild conditions. Extensive numerical experiments on synthetic and real-world data validate the theoretical findings and illustrate the effectiveness of the proposed method.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 12271034 and the Open Fund Project of Key Laboratory of Market Regulation under Grant No. 2023SYSKF02003.