Funding: Sponsored by the National Key R&D Program of China (Grant No. 2022YFA1008200); the National Natural Science Foundation of China (Grant Nos. 92270001, 12371511, 12422119); the Shanghai Municipal of Science and Technology Major Project (Grant No. 2021SHZDZX0102); the HPC of School of Mathematical Sciences and the Student Innovation Center; the Siyuan-1 cluster supported by the Center for High Performance Computing at Shanghai Jiao Tong University; Key Laboratory of Marine Intelligent Equipment and System, Ministry of Education, P.R. China; supported by the SJTU Kunpeng & Ascend Center of Excellence.
Abstract: In this work, we investigate the mechanism underlying loss spikes observed during neural network training. When training enters a region with a lower-loss-as-sharper structure, it becomes unstable, and the loss increases exponentially once the loss landscape is too sharp, producing the rapid ascent of the loss spike. Training stabilizes when it finds a flat region. From a frequency perspective, we explain the rapid descent in loss as being primarily influenced by low-frequency components. We observe a deviation in the first eigendirection, which can be reasonably explained by the frequency principle: low-frequency information is captured rapidly, leading to the rapid descent. Inspired by our analysis of loss spikes, we revisit the link between the maximum eigenvalue of the loss Hessian (λ_max), flatness, and generalization. We suggest that λ_max is a good measure of sharpness but not a good measure of generalization. Furthermore, we experimentally observe that loss spikes can facilitate condensation, causing input weights to evolve towards the same direction. Our experiments also show a correlation (similar trend) between λ_max and condensation. This observation may provide valuable insights for further theoretical research on the relationship between loss spikes, λ_max, and generalization.
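The link between sharpness and the exponential rise of the loss can be illustrated with a one-dimensional toy problem. The sketch below is not taken from the paper; it relies on the standard fact that gradient descent on f(x) = 0.5 * curvature * x^2 multiplies x by (1 - lr * curvature) at every step, so the loss grows geometrically once lr * curvature exceeds 2. The function name `gd_losses` and all numeric values are illustrative assumptions, with the scalar curvature standing in for λ_max.

```python
def gd_losses(curvature, lr, steps=20, x0=1.0):
    """Run gradient descent on f(x) = 0.5 * curvature * x**2.

    Returns the loss recorded before each update. The scalar `curvature`
    plays the role of the largest Hessian eigenvalue in this toy setting
    (an illustrative assumption, not the paper's experimental setup).
    """
    x = x0
    losses = []
    for _ in range(steps):
        losses.append(0.5 * curvature * x * x)
        x -= lr * curvature * x  # the gradient of f at x is curvature * x
    return losses


lr = 0.1
# lr * curvature = 0.5 < 2: each step scales x by 0.5, so the loss decays.
flat = gd_losses(curvature=5.0, lr=lr)
# lr * curvature = 2.5 > 2: each step scales x by -1.5, so the loss grows.
sharp = gd_losses(curvature=25.0, lr=lr)

print("flat region :", flat[0], "->", flat[-1])
print("sharp region:", sharp[0], "->", sharp[-1])
```

With the learning rate held fixed, only the curvature decides whether the iterates contract or blow up, which matches the abstract's picture of a spike that starts when the landscape becomes too sharp and ends when training reaches a flatter region.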
Funding: This work was supported in part by the National Natural Science Foundation of China (Key Project Number: 51437003).
Abstract: Fault detection and location are critically significant applications of a supervisory control system in a smart grid. Methods based on random matrix theory (RMT) have been applied to measurement data to detect short-circuit faults occurring on transmission lines. However, their diagnostic accuracy is influenced by noise in the measurements. This paper identifies the relationship between the mean eigenvalue of a random matrix and noise, and theoretically determines the defects of the Mean Spectral Radius (MSR) as a fault-detection indicator. It also proposes a novel indicator, the shifting degree of the maximum eigenvalue, together with its threshold. By comparing the indicator with the threshold, the occurrence of a fault can be assessed. Finally, an augmented matrix is constructed to locate the fault area. The proposed method can effectively achieve fault detection via RMT without being affected by noise, and it does not depend on system models. The experimental results are based on the IEEE 39-bus system. Actual provincial grid data are also used to validate the effectiveness of the proposed method.
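The idea of comparing a maximum-eigenvalue quantity against a threshold can be sketched in a few lines. The abstract does not define the paper's indicator or its threshold, so the example below substitutes a stand-in: it estimates the largest eigenvalue of a sample covariance matrix by power iteration and compares it with the Marchenko-Pastur upper edge (1 + sqrt(N/T))^2 for unit-variance noise. The helper names, the synthetic "fault" (a common step added to every channel), and the matrix sizes are all hypothetical.

```python
import math
import random


def sample_covariance(rows):
    """Return (1/T) * X X^T for an N x T data matrix given as a list of rows."""
    n, t = len(rows), len(rows[0])
    return [[sum(rows[i][k] * rows[j][k] for k in range(t)) / t
             for j in range(n)] for i in range(n)]


def largest_eigenvalue(matrix, iterations=300):
    """Estimate the largest eigenvalue of a symmetric PSD matrix by power iteration."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    # Rayleigh quotient v^T A v for the final unit vector v.
    av = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * av[i] for i in range(n))


rng = random.Random(0)  # fixed seed so the run is reproducible
N, T = 10, 200          # hypothetical: 10 measurement channels, 200 samples
noise = [[rng.gauss(0.0, 1.0) for _ in range(T)] for _ in range(N)]
# Hypothetical fault: a common step of size 3 on every channel in the second half.
faulty = [[x + (3.0 if k >= T // 2 else 0.0) for k, x in enumerate(row)]
          for row in noise]

# Marchenko-Pastur upper edge for unit-variance noise; used here as a
# stand-in threshold, not the threshold derived in the paper.
mp_edge = (1.0 + math.sqrt(N / T)) ** 2
lam_clean = largest_eigenvalue(sample_covariance(noise))
lam_fault = largest_eigenvalue(sample_covariance(faulty))

print("threshold (MP edge):", mp_edge)
print("max eigenvalue, noise only:", lam_clean)
print("max eigenvalue, with fault:", lam_fault)
print("fault flagged:", lam_fault > mp_edge)
```

A correlated disturbance across channels adds a dominant rank-one component to the covariance matrix, so the largest eigenvalue moves far beyond the range expected from noise alone. At finite matrix sizes the noise-only eigenvalue can fluctuate around the edge, which is one reason a practical threshold needs more care than this sketch provides.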