Funding: the National Key Research and Development Program of China (No. 2022YFB4501601), the National Natural Science Foundation of China (Nos. 62102398, U20A20227, 62222214, 62002338, U22A2028, U19B2019), the Chinese Academy of Sciences Project for Young Scientists in Basic Research (YSBR-029), and the Youth Innovation Promotion Association, Chinese Academy of Sciences.
Abstract: Quantized training has proven to be a prominent method for training deep neural networks under limited computational resources. It uses low bit-width arithmetic with a proper scaling factor to achieve negligible accuracy loss. Cambricon-Q is an ASIC design proposed to support quantized training efficiently, and it achieves significant performance improvement. However, two caveats remain in the design. First, Cambricon-Q instances with different hardware specifications may produce different numerical errors, resulting in non-reproducible behaviors that can become a major concern in critical applications. Second, Cambricon-Q cannot leverage data sparsity, from which considerable cycles could still be squeezed out. To address these caveats, the acceleration core of Cambricon-Q is redesigned to support fine-grained irregular data processing. The new design not only enables acceleration on sparse data, but also performs local dynamic quantization over contiguous value ranges (which is hardware independent) instead of contiguous addresses (which depends on hardware factors). Experimental results show that the accuracy loss of the method remains negligible, and the accelerator achieves a 1.61× performance improvement over Cambricon-Q with about a 10% energy increase.
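The abstract contrasts forming quantization groups by contiguous addresses (dependent on buffer or tile size, hence on the hardware) with forming them by contiguous value ranges (dependent only on the data). The NumPy sketch below illustrates that distinction for symmetric dynamic quantization; it is a rough simulation of the idea, not Cambricon-Q's actual datapath, and all function names and parameters here are hypothetical.

```python
import numpy as np

def quantize_group(values, bits=8):
    """Symmetric dynamic quantization of one group: scale by the group's max magnitude."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(values)) / qmax if np.any(values) else 1.0
    q = np.round(values / scale).astype(np.int8)
    return q, scale

def quantize_by_address(x, group_size=64, bits=8):
    """Group elements by contiguous addresses; results change with group_size (a hardware factor)."""
    out = np.empty_like(x)
    for start in range(0, x.size, group_size):
        q, s = quantize_group(x[start:start + group_size], bits)
        out[start:start + group_size] = q.astype(x.dtype) * s  # dequantize back
    return out

def quantize_by_value_range(x, num_groups=4, bits=8):
    """Group elements by contiguous value ranges; depends only on the data itself."""
    order = np.argsort(x)  # indices sorted by value
    out = np.empty_like(x)
    for bucket in np.array_split(order, num_groups):
        q, s = quantize_group(x[bucket], bits)
        out[bucket] = q.astype(x.dtype) * s  # dequantize back in place
    return out

# Compare round-trip errors of the two grouping policies on random data.
x = np.random.default_rng(0).standard_normal(256).astype(np.float32)
print(np.abs(quantize_by_address(x) - x).max(),
      np.abs(quantize_by_value_range(x) - x).max())
```

The point of the contrast: `quantize_by_address` gives different numerical results whenever `group_size` changes with the hardware configuration, whereas `quantize_by_value_range` is determined entirely by the input values, which is why value-range grouping yields reproducible behavior across different hardware specifications.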
Funding: Supported by the Fujian Natural Science Foundation (2016J01005) and the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB18010202).
Abstract: In this paper, we propose and analyze an accelerated augmented Lagrangian method (denoted by AALM) for solving linearly constrained convex programming. We show that the convergence rate of AALM is O(1/k^2), while the convergence rate of the classical augmented Lagrangian method (ALM) is O(1/k). Numerical experiments on the linearly constrained ℓ1-ℓ2 minimization problem are presented to demonstrate the effectiveness of AALM.
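The abstract does not spell out the AALM update. Since classical ALM is the proximal point method applied to the dual problem, one standard route to an O(1/k^2) rate is Nesterov/FISTA-style extrapolation of the multiplier; the sketch below illustrates that idea on a quadratic program min 0.5·xᵀQx + cᵀx s.t. Ax = b, where the x-subproblem has a closed form. The function `accelerated_alm` and the momentum rule are illustrative assumptions, not necessarily the paper's exact scheme.

```python
import numpy as np

def accelerated_alm(Q, c, A, b, beta=1.0, iters=200):
    """Nesterov-accelerated ALM sketch for min 0.5*x'Qx + c'x  s.t. Ax = b.

    The inner minimization is solved exactly via its normal equations,
    which is only possible here because the objective is quadratic.
    """
    n, m = Q.shape[0], A.shape[0]
    lam = np.zeros(m)          # multiplier lambda^k
    lam_hat = lam.copy()       # extrapolated multiplier
    t = 1.0                    # FISTA-style momentum parameter
    K = Q + beta * (A.T @ A)   # system matrix of the x-subproblem
    for _ in range(iters):
        # x^{k+1} = argmin_x L_beta(x, lam_hat): solve (Q + beta*A'A)x = -c - A'lam_hat + beta*A'b
        x = np.linalg.solve(K, -c - A.T @ lam_hat + beta * (A.T @ b))
        lam_new = lam_hat + beta * (A @ x - b)   # multiplier ascent step
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        lam_hat = lam_new + ((t - 1.0) / t_new) * (lam_new - lam)  # extrapolation
        lam, t = lam_new, t_new
    return x, lam

# Small demo: minimum-norm solution of an underdetermined system (Q = I, c = 0).
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 8))
b = rng.standard_normal(3)
x, lam = accelerated_alm(np.eye(8), np.zeros(8), A, b)
print(np.linalg.norm(A @ x - b))  # feasibility residual -> ~0
```

Dropping the extrapolation (setting `lam_hat = lam_new` each iteration) recovers classical ALM with its O(1/k) rate, which makes the comparison in the abstract easy to reproduce numerically.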