Abstract: Training efficiency and test accuracy are important factors in judging the scalability of distributed deep learning. In this dissertation, we explore the impact of noise introduced into the Modified National Institute of Standards and Technology database (MNIST) and the CIFAR-10 dataset, which are selected as benchmarks in distributed deep learning. The noise in the training set is manually divided into cross-noise and random noise, and each type of noise is injected at a different ratio of the dataset. To minimize the influence of parameter interactions in distributed deep learning, we choose a compressed model (SqueezeNet) together with the proposed flexible communication method, which reduces the communication frequency, and we evaluate the influence of noise on distributed training under both the synchronous and asynchronous stochastic gradient descent algorithms. On the TensorFlowOnSpark experimental platform, we obtain the training accuracy at different noise ratios and the training time for different numbers of nodes. Cross-noise in the training set not only decreases the test accuracy but also increases the time needed for distributed training; in effect, the noise degrades the scalability of distributed deep learning.
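The abstract does not spell out how the two noise types are injected; as a rough illustration, here is a minimal NumPy sketch under the assumption that random noise relabels a chosen fraction of samples uniformly over the other classes, while cross-noise systematically shifts labels toward a fixed partner class (the pairing rule here is hypothetical):

```python
import numpy as np

def inject_label_noise(labels, ratio, kind="random", num_classes=10, seed=0):
    """Corrupt a fraction `ratio` of integer labels.

    kind="random": each corrupted label is replaced by a uniformly random
    *different* class. kind="cross": corrupted labels are moved to a fixed
    partner class (hypothetically class c -> (c + 1) % num_classes),
    mimicking systematic confusion between similar categories.
    """
    rng = np.random.default_rng(seed)
    noisy = labels.copy()
    n_corrupt = int(ratio * len(labels))
    idx = rng.choice(len(labels), size=n_corrupt, replace=False)
    if kind == "random":
        offsets = rng.integers(1, num_classes, size=n_corrupt)
        noisy[idx] = (noisy[idx] + offsets) % num_classes  # never the true class
    elif kind == "cross":
        noisy[idx] = (noisy[idx] + 1) % num_classes        # deterministic swap
    else:
        raise ValueError(f"unknown noise kind: {kind}")
    return noisy

# Example: corrupt 20% of MNIST-style labels with cross-noise.
y = np.random.default_rng(1).integers(0, 10, size=60_000)
y_noisy = inject_label_noise(y, ratio=0.2, kind="cross")
print((y != y_noisy).mean())  # ~0.2
```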
Funding: Supported by the Science and Technology Project of State Grid Jiangsu Electric Power Co., Ltd. under Grant No. J2023153.
Abstract: To achieve better performance, researchers have recently focused on building larger deep learning models, substantially increasing training costs and prompting the development of distributed training within GPU clusters. However, conventional distributed training approaches suffer from limitations: data parallelism is hindered by excessive memory demands and communication overhead during gradient synchronization, while model parallelism fails to achieve optimal device utilization due to strict computational dependencies. To overcome these challenges, researchers have proposed hybrid parallelism: the model is segmented into multiple stages, each of which may internally use data parallelism, and split training data is processed sequentially across the stages in a pipeline-like manner, which speeds up model training. However, the freezing mechanisms widely used in model fine-tuning, which cancel gradient computation and weight updates for converged parameters to reduce computational overhead, have yet to be efficiently integrated into hybrid parallel training; existing approaches fail to balance training speed against accuracy and to further reduce the time required for the model to converge. In this paper, we propose Reinforcement Learning Freeze (RLFreeze), a freezing strategy for distributed DNN training in heterogeneous GPU clusters, especially under hybrid parallelism. We first introduce a mixed freezing criterion based on gradients and gradient variation, which accurately freezes converged parameters while minimizing the freezing of unconverged ones. RLFreeze then selects the parameters to be frozen according to this criterion and dynamically adjusts the thresholds for freezing decisions during training using reinforcement learning, achieving a balance between accuracy and accelerated training. Experimental results demonstrate that RLFreeze improves training efficiency in both data parallelism and hybrid parallelism while maintaining model accuracy.
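The abstract gives the mixed freezing criterion only in words; the following is a hypothetical PyTorch sketch of one way to realize it, with `grad_thresh` and `delta_thresh` standing in for the thresholds that RLFreeze would tune via reinforcement learning:

```python
import torch

def should_freeze(grad, prev_grad, grad_thresh, delta_thresh):
    """Mixed criterion: a tensor counts as converged when both its gradient
    magnitude and its step-to-step gradient variation are small."""
    g = grad.abs().mean().item()                  # gradient magnitude
    dg = (grad - prev_grad).abs().mean().item()   # gradient variation
    return g < grad_thresh and dg < delta_thresh

def apply_freezing(model, prev_grads, grad_thresh=1e-4, delta_thresh=1e-5):
    """Freeze converged parameters: cancel their gradient computation and
    (by dropping .grad) their weight updates. Call after loss.backward()."""
    for name, p in model.named_parameters():
        if p.grad is None:                        # already frozen or unused
            continue
        if name in prev_grads and should_freeze(
                p.grad, prev_grads[name], grad_thresh, delta_thresh):
            p.requires_grad_(False)               # no more gradients computed
            p.grad = None                         # no update this step either
            prev_grads.pop(name)
        else:
            prev_grads[name] = p.grad.detach().clone()
```

In such a scheme the routine would run between `loss.backward()` and `optimizer.step()`; with stateful optimizers like Adam, the per-parameter optimizer state of frozen tensors would also need clearing.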
Funding: Supported by the National Key R&D Program of China under Grant No. 2018AAA0101502.
Abstract: This article is the second part of Active Power Correction Strategies Based on Deep Reinforcement Learning. In Part II, we consider scenarios in which renewable energy is plugged into the large-scale power grid and provide an adaptive algorithmic implementation to maintain power grid stability. Building on the robustness method of Part I, a distributed deep reinforcement learning method is proposed to overcome the influence of increasing renewable energy penetration. A multi-agent system is implemented across multiple control areas of the power system, which together conduct a fully cooperative stochastic game. Using the Monte Carlo tree search introduced in Part I, we select practical actions in each sub-control area to search for the Nash equilibrium of the game. Based on the QMIX method, a structure of offline centralized training and online distributed execution is proposed to employ better practical actions in active power correction control. Our proposed method is evaluated on the modified global competition scenario cases of "2020 Learning to Run a Power Network, NeurIPS Track 2".
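The QMIX structure behind "offline centralized training, online distributed execution" is left implicit in the abstract; below is a minimal NumPy sketch of the monotonic value mixing at its core. In QMIX the mixing weights come from hypernetworks conditioned on the global state; random placeholders stand in for them here:

```python
import numpy as np

def qmix_qtot(agent_qs, w1, b1, w2, b2):
    """Monotonic mixing of per-agent Q-values, the core QMIX constraint.

    Taking abs() of the mixing weights keeps dQ_tot/dQ_i >= 0, so each
    agent greedily maximizing its own Q also maximizes the joint Q_tot.
    This is what permits distributed execution after centralized training.
    """
    h = np.maximum(np.abs(w1) @ agent_qs + b1, 0.0)  # hidden mixing layer
    return (np.abs(w2) @ h + b2).item()

# Three control-area agents each pick the greedy action from their own Q-values.
per_agent = [np.array([0.1, 0.7]), np.array([0.4, 0.2]), np.array([0.3, 0.9])]
agent_qs = np.array([q.max() for q in per_agent])
rng = np.random.default_rng(0)  # placeholder hypernetwork outputs
print(qmix_qtot(agent_qs, rng.normal(size=(8, 3)), rng.normal(size=8),
                rng.normal(size=(1, 8)), rng.normal(size=1)))
```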
Abstract: Pneumonia is an acute lung infection that has caused many fatalities globally. Radiologists often employ chest X-rays to identify pneumonia, since these are presently the most effective imaging method for this purpose. Computer-aided diagnosis of pneumonia using deep learning techniques is widely used due to its effectiveness and performance. In the proposed method, the Synthetic Minority Oversampling Technique (SMOTE) is used to eliminate the class imbalance in the X-ray dataset. To compensate for the paucity of accessible data, pre-trained transfer learning is used, and an ensemble Convolutional Neural Network (CNN) model is developed. The ensemble model covers all possible combinations of the MobileNetV2, Visual Geometry Group (VGG16), and DenseNet169 models. MobileNetV2 and DenseNet169 performed well as single classifiers, each with an accuracy of 94%, while the ensemble model (MobileNetV2 + DenseNet169) achieved an accuracy of 96.9%. Using the synchronous data-parallel model in distributed TensorFlow, the training process was accelerated by 98.6% and outperformed other conventional approaches.
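As an illustration of the best-performing pairing, here is a minimal Keras sketch that averages the softmax outputs of ImageNet-pretrained MobileNetV2 and DenseNet169 backbones; the classifier head, frozen backbones, and input size are assumptions, since the abstract does not specify them:

```python
import tensorflow as tf

NUM_CLASSES = 2  # pneumonia vs. normal

def transfer_branch(backbone):
    """Wrap a frozen ImageNet-pretrained backbone with a new softmax head."""
    backbone.trainable = False  # transfer learning: reuse pretrained features
    return tf.keras.Sequential([
        backbone,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])

inputs = tf.keras.Input(shape=(224, 224, 3))  # per-backbone preprocessing omitted
branches = [
    transfer_branch(tf.keras.applications.MobileNetV2(
        include_top=False, weights="imagenet", input_shape=(224, 224, 3))),
    transfer_branch(tf.keras.applications.DenseNet169(
        include_top=False, weights="imagenet", input_shape=(224, 224, 3))),
]
# Ensemble by averaging the two branches' softmax outputs.
outputs = tf.keras.layers.Average()([b(inputs) for b in branches])
ensemble = tf.keras.Model(inputs, outputs)
ensemble.compile(optimizer="adam", loss="categorical_crossentropy",
                 metrics=["accuracy"])
```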