Abstract
Deep neural networks combine strong expressive power with excellent generalization performance. This conflicts with the classical statistical learning tenet that "model complexity harms generalization" and leaves the analysis of deep generalization mechanisms under traditional frameworks at an impasse. Classical uniform convergence bounds depend on the dimensionality of the parameter space and ignore the implicit bias of the training algorithm, so they do not directly fit the core characteristics of deep networks. To address this theoretical gap, this paper constructs a new statistical learning framework that incorporates key features of deep models and reworks the way uniform convergence theory explains their generalization. By building a surrogate linear model that preserves the overparameterized structure of deep networks and their high-dimensional noise perturbations, it derives, for the first time, an effective uniform convergence bound and reveals the benign effect of noise perturbation in high-dimensional feature spaces on generalization, going beyond the limits of classical low-dimensional learning theory. Building on this generalization mechanism, it constructs a regularized training procedure that is sensitive to data scale and shows that the uniform convergence bound and the generalization error decay in step as sample complexity increases, confirming that uniform convergence theory can explain the generalization of deep models. Supported by both theoretical and experimental evidence, the work overcomes the applicability bottleneck of uniform convergence generalization bounds and reopens a door that was about to close: the use of uniform convergence theory to analyze the generalization of deep models.
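To make the quantities in the abstract concrete, the following is a minimal sketch of an overparameterized linear regression with label noise. It is not the paper's surrogate linear model, its pruned hypothesis space, or its bound; it only fits a minimum-norm interpolator and reports the empirical generalization gap (test error minus training error) at several sample sizes. The function names (`min_norm_fit`, `generalization_gap`), the Gaussian features, the sparse ground truth, the dimension, and the noise level are all assumptions chosen for illustration.

```python
import numpy as np


def min_norm_fit(X, y):
    """Minimum-l2-norm least-squares solution.

    When d > n and X has full row rank, this solution interpolates
    the training labels, noise included.
    """
    return np.linalg.pinv(X) @ y


def mse(X, y, w):
    """Mean squared error of the linear predictor w on (X, y)."""
    return float(np.mean((X @ w - y) ** 2))


def generalization_gap(n, d=500, noise_std=0.5, n_test=2000, seed=0):
    """Return (train_mse, test_mse, gap) for one random draw.

    Assumed toy setup: isotropic Gaussian features, a hypothetical
    sparse ground-truth vector, and additive Gaussian label noise.
    """
    rng = np.random.default_rng(seed)
    w_star = np.zeros(d)
    w_star[:5] = 1.0  # hypothetical ground truth, not from the paper

    X_train = rng.standard_normal((n, d))
    y_train = X_train @ w_star + noise_std * rng.standard_normal(n)
    X_test = rng.standard_normal((n_test, d))
    y_test = X_test @ w_star + noise_std * rng.standard_normal(n_test)

    w_hat = min_norm_fit(X_train, y_train)
    train = mse(X_train, y_train, w_hat)
    test = mse(X_test, y_test, w_hat)
    return train, test, test - train


if __name__ == "__main__":
    # Overparameterized regime: every n below is smaller than d = 500.
    for n in (25, 50, 100, 200):
        train, test, gap = generalization_gap(n)
        print(f"n={n:4d} train_mse={train:.2e} test_mse={test:.3f} gap={gap:.3f}")
```

A uniform convergence bound would upper-bound this gap simultaneously for every predictor in a hypothesis class. The toy above only measures the gap for a single fitted predictor, so it illustrates the quantity being bounded and not the bound itself.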
Authors
LI Pengqi (李鹏奇)
DING Lizhong (丁立中)
ZHANG Chunhui (张春晖)
FU Jiarun (傅稼润)
School of Computer Science, Beijing Institute of Technology, Beijing 100081, China
Source
Computer Science (《计算机科学》)
Indexed in the PKU Core Journals list (北大核心)
2026, No. 4, pp. 33-39 (7 pages)
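Funding
National Key Research and Development Program of China (2022YFB2703100)
National Natural Science Foundation of China (62376028, U22A2099)
National Natural Science Foundation of China, Excellent Young Scientists Fund (Overseas)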
Keywords
Generalization error
Uniform convergence bound
Pruned hypothesis space
High-dimensional probability
Generalization mechanism