Federated learning is a machine learning framework designed to protect privacy by keeping training data on clients' devices rather than sharing it. It trains a global model through collaboration between clients and the server. However, data heterogeneity can make model training inefficient and can even reduce the final model's accuracy and generalization capability. Meanwhile, data scarcity can result in suboptimal cluster distributions for few-shot clients in centralized clustering tasks, and standalone personalization tasks may cause severe overfitting. To address these limitations, we introduce a federated learning dual optimization model based on a clustering and personalization strategy (FedCPS). FedCPS adopts a decentralized approach in which clients identify their cluster membership locally without relying on a centralized clustering algorithm. Building on this, FedCPS introduces personalized training tasks locally, adding a regularization term to control deviations between local and cluster models. This improves the generalization ability of the final model while mitigating overfitting. The use of weight-sharing techniques also reduces the computational cost of central machines. Experimental results on the MNIST, FMNIST, CIFAR10, and CIFAR100 datasets demonstrate that our method achieves better personalization than other personalized federated learning methods, with an average test accuracy improvement of 0.81%–2.96%. We also varied the proportion of few-shot clients to evaluate the impact on accuracy across different methods. The experiments show that the accuracy of FedCPS drops by only 0.2%–3.7%, compared to 2.1%–10% for existing methods. Our method demonstrates its advantages across diverse data environments.
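The abstract above does not give the exact form of the regularization term. As a minimal sketch, assuming a squared L2 (proximal-style) penalty on the distance between local and cluster weights, the local objective might look like the following. The names `regularized_loss`, `regularizer_grad`, and `lam` are hypothetical and are not taken from the paper.

```python
def regularized_loss(task_loss, local_w, cluster_w, lam):
    """Task loss plus (lam / 2) * squared L2 distance to the cluster model.

    Assumption: the penalty is a squared L2 norm; the abstract only says
    that a regularization term controls the local/cluster deviation.
    """
    sq_dist = sum((w - c) ** 2 for w, c in zip(local_w, cluster_w))
    return task_loss + 0.5 * lam * sq_dist


def regularizer_grad(local_w, cluster_w, lam):
    """Gradient of the penalty with respect to the local weights."""
    return [lam * (w - c) for w, c in zip(local_w, cluster_w)]


# Example: a local model that has drifted away from its cluster model.
loss = regularized_loss(task_loss=1.0, local_w=[1.0, 2.0],
                        cluster_w=[0.0, 0.0], lam=0.1)
grad = regularizer_grad([1.0, 2.0], [0.0, 0.0], lam=0.1)
```

A larger `lam` pulls the personalized model closer to the cluster model, which is one way to trade personalization against overfitting on few-shot clients.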
An intelligent endo-atmospheric penetration strategy based on generative adversarial reinforcement learning is proposed in this manuscript. Firstly, attack and defense adversarial models are established, and the missile maneuver penetration problem is transformed into an optimal control problem, considering penetration, handover position and mid-terminal guidance velocity constraints. Then, the Radau pseudospectral method is adopted to generate data samples considering random perturbations. Furthermore, a Generative Adversarial Imitation Learning combined with Deep Deterministic Policy Gradient method (GAIL-DDPG) is designed, with internal process reward signals constructed to tackle the long-term sparse reward in the missile maneuver penetration problem. Finally, the penetration strategy is trained and verified. Simulation shows that by using generative adversarial reinforcement learning with a sample library to learn expert experience in the early stage of training, the proposed method converges quickly. Performance is further optimized with the reinforcement learning exploration strategy in the later stage of training. Simulation also shows that the proposed method has better engineering application ability than traditional reinforcement learning methods.
In the Internet of Vehicles (IoV), Vehicle-Infrastructure-Cloud cooperation supports diverse intelligent driving and intelligent transportation applications. Federated Learning (FL) is an emerging computation paradigm that provides efficient and privacy-preserving collaborative learning. However, in the IoV environment, federated learning faces challenges introduced by the high mobility of vehicles and by non-Independent and Identically Distributed (non-IID) data. High mobility causes FL clients to drop out and communication to go offline. Non-IID data leads to slow and unstable convergence of the global model and to a single global model's weak adaptability to clients with different localization characteristics. Accordingly, this paper proposes a personalized aggregation strategy for hierarchical Federated Learning in the IoV environment, including FedSA (Special Asynchronous Federated Learning with Self-adaptive Aggregation) for low-level FL between a Road Side Unit (RSU) and the vehicles within its coverage, and FedAtt (Federated Learning with Attention Mechanism) for high-level FL between a cloud server and multiple RSUs. Agents self-adaptively obtain model aggregation weights based on the Advantage Actor-Critic (A2C) algorithm. Experiments show that the proposed strategy encourages vehicles to participate in global aggregation and outperforms existing methods in training performance.
In practical combat scenarios, Hypersonic Glide Vehicles (HGV) face the challenge of evading Successive Pursuers from the Same Direction while satisfying the Homing Constraint (SPSDHC). To address this problem, this paper proposes a parameterized evasion guidance algorithm based on reinforcement learning. The three-player optimal evasion strategy is first analyzed and approximated by parametrization. The switching acceleration command of the HGV optimal evasion strategy, considering the upper limit of the missile acceleration command, is analyzed based on optimal control theory. The terminal miss of the HGV when evading two missiles is analyzed, which shows that the three-player optimal evasion strategy is a linear combination of two one-to-one strategies. Then, a velocity control algorithm is proposed to increase the terminal miss by actively controlling the flight speed of the HGV based on the parametrized evasion strategy. The reinforcement learning method is used to implement the strategy in real time, and a reward function is designed by deducing a homing strategy for the HGV to approach the target, which ensures that the HGV satisfies the homing constraint. Experimental results demonstrate the feasibility and robustness of the proposed parameterized evasion strategy, which enables the HGV to generate the maximum terminal miss and satisfy the homing constraint when facing one or two missiles.
In recent years, robotic arm grasping has become a pivotal task in the field of robotics, with applications spanning from industrial automation to healthcare. The optimization of grasping strategies plays a crucial role in enhancing the effectiveness, efficiency, and reliability of robotic systems. This paper presents a novel approach to optimizing robotic arm grasping strategies based on deep reinforcement learning (DRL). Through the utilization of advanced DRL algorithms, such as Q-Learning, Deep Q-Networks (DQN), Policy Gradient Methods, and Proximal Policy Optimization (PPO), the study aims to improve the performance of robotic arms in grasping objects with varying shapes, sizes, and environmental conditions. The paper provides a detailed analysis of the various deep reinforcement learning methods used for grasping strategy optimization, emphasizing the strengths and weaknesses of each algorithm. It also presents a comprehensive framework for training the DRL models, including simulation environment setup, the optimization process, and the evaluation metrics for grasping success. The results demonstrate that the proposed approach significantly enhances the accuracy and stability of the robotic arm in performing grasping tasks. The study further explores the challenges in training deep reinforcement learning models for real-time robotic applications and offers solutions for improving the efficiency and reliability of grasping strategies.
The purpose of this research is to analyze the causal mechanisms behind the learning difficulties of middle school students and to use them to propose support strategies. This research is particularly valuable for its focus on middle school students, since research on this critical transition period is often lacking compared with primary and high school. To this end, the research establishes a structural equation model and analyzes the survey data using the partial least squares method. The data were obtained from a questionnaire completed by 13,900 students in Wenzhou City, China. The research found that learning strategies had the most significant influence on learning effectiveness, followed by learning motivation and learning relationships. Meanwhile, learning relationships had a significant impact on learning pressure. On this basis, the research proposes targeted support strategies. They aim to enhance learning motivation (set achievable learning goals for each student with learning difficulties based on their actual situation), optimize learning strategies (encourage students with learning difficulties to learn self-regulatory strategies such as goal setting, time management, and self-reflection), and improve learning relationships (establish a good social network to promote positive interaction between students with learning difficulties and their peers). At the same time, they reduce students' learning pressure. Ultimately, the learning effectiveness of students with learning difficulties is improved.
Multi-constrained pipes conveying fluid, such as aircraft hydraulic control pipes, are susceptible to resonance fatigue in harsh vibration environments, which may lead to system failure and even catastrophic accidents. In this study, a machine learning (ML)-assisted weak vibration design method under harsh environmental excitations is proposed. The dynamic model of a typical pipe is developed using the absolute nodal coordinate formulation (ANCF) to determine its vibrational characteristics. With the harsh vibration environments as the preserved frequency band (PFB), the safety design is defined by comparing the natural frequency with the PFB. By analyzing the safety design of pipes with different constraint parameters, the dataset of the absolute safety length and the absolute resonance length of the pipe is obtained. This dataset is then utilized to develop genetic programming (GP) algorithm-based ML models capable of producing explicit mathematical expressions of the pipe's absolute safety length and absolute resonance length with the location, stiffness, and total number of retaining clips as design variables. The proposed ML models effectively bridge the dataset with the prediction results. Thus, the ML model is utilized to stagger the natural frequency, and the PFB is utilized to achieve the weak vibration design. The findings of the present study provide valuable insights into the practical application of weak vibration design.
Anxiety, motivation, and strategy have long been seen as critical in second language acquisition. This study presents a systematic review of the literature on these variables in terms of their relationship with one another, their effects on learning outcomes, and how they are affected by technology-assisted tools in the teaching of Chinese as a second language. The review includes 24 articles selected according to the criteria and process of the Preferred Reporting Items for Systematic Review and Meta-Analysis Protocol (PRISMA-P) and the clustering techniques of VOSviewer. It is found that 1) anxiety, motivation, and strategy were interrelated, that is, motivation was negatively associated with anxiety but positively related to strategy, while strategy could positively predict anxiety; 2) anxiety could both positively and negatively affect learning outcomes, while motivation and strategy could both positively and insignificantly influence learning outcomes; 3) the technology-assisted tools used in the classroom could both positively and negatively affect the levels of these variables and learning outcomes in the L2 Chinese context. The need to explore more complicated relationships between language-specific individual variables themselves and other possible factors that affect these variables, such as cultural ones, is also discussed for future research.
Autonomous driving technology is constantly developing toward more complex scenes, and there is a growing demand for end-to-end data-driven control. However, the end-to-end path tracking process often encounters challenges in learning efficiency and generalization. To address this issue, this paper designs a deep deterministic policy gradient (DDPG)-based reinforcement learning strategy that integrates imitation learning and feedforward exploration in the path following process. In imitation learning, the path tracking control data generated by the model predictive control (MPC) method is used to train an end-to-end steering control model of a deep neural network. In addition, a feedforward exploration behavior is predicted from road curvature and vehicle speed; it is added to DDPG reinforcement learning together with imitation learning to obtain decision-making experience and action prediction behavior for the path tracking process. In the reinforcement learning process, imitation learning is used to update the pre-training parameters of the actor network, and a feedforward steering technique with random noise is adopted for strategy exploration. In the reward function, a hierarchical progressive reward form and a constrained objective reward function referring to MPC are designed, and the actor-critic network architecture is determined. Finally, the path tracking performance of the designed method is verified by comparing various training results, simulations, and HIL tests. The results show that the designed method can effectively utilize pre-training and feedforward prior experience to obtain optimal path tracking performance of an autonomous vehicle, and has better generalization ability than other methods. This study provides an efficient control scheme for improving the end-to-end control performance of autonomous vehicles.
The development of fungicides is time-consuming and costly. Introducing a fungicide-likeness assessment strategy at the early screening stage can help reduce development risks and improve the success rate. However, existing assessment methods are often plagued by low accuracy and poor generalization, while fragment-based design strategies commonly fail to account for synergistic effects between structural units. Therefore, based on a small-scale sample set, this study developed a more efficient global predictive model for fungicidal activity, named APPf, by integrating multi-scale feature screening methods and machine learning algorithms, which also accounts for synergistic effects among different structural fragments. We utilized three independent external test sets for model validation: External Test Set 1 for general validation, External Test Set 2 for comparison with existing models, and External Test Set 3 for disease-specific fungicide evaluation. On External Test Set 1, the APPf model achieved a precision of 0.6454, a recall of 0.8535, and an F1 score of 0.7350, demonstrating its robust predictive performance. It also exhibited strong enrichment capability for positive samples in External Test Set 2. For External Test Set 3, APPf achieved a prediction accuracy exceeding 80% for each disease, suggesting its promising potential in practical fungicide development. Furthermore, we quantified the contribution of molecular descriptors to the model predictions using SHAP value analysis and identified nHdNH and NssssNp as strong indicative features for predicting fungicidal activity, thereby enhancing the interpretability of the model. APPf has been deployed on a public web server (http://pesticides.cau.edu.cn/APPf), providing a user-friendly online prediction service to support the discovery of novel fungicides. Meanwhile, we employed a molecular fragmentation strategy to analyze the co-occurrence relationships between fragments in fungicides and constructed a network map of fragment co-occurrence associated with fungicidal activity. This study provides both an active fragment library and a global fungicide-likeness assessment tool for AI-based de novo molecular generation aimed at discovering novel fungicidal leads, which is expected to enhance the efficiency of developing new fungicides.
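The three metrics reported for External Test Set 1 are related by the standard definition of F1 as the harmonic mean of precision and recall. A short check, using only the values quoted in the abstract (the function name `f1_score` is a local helper for illustration, not a library call):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract for External Test Set 1.
f1 = f1_score(0.6454, 0.8535)
print(f"{f1:.4f}")
```

The computed value agrees with the reported 0.7350 to four decimal places, so the three figures are mutually consistent.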
Background Breed identification plays an important role in conserving indigenous breeds, managing genetic resources, and developing effective breeding strategies. However, research on breed identification in livestock has mainly focused on purebreds and has yielded lower prediction accuracy for hybrids. In this study, we present a Multi-Layer Perceptron (MLP) model with a multi-output regression framework specifically designed for genomic breed composition prediction of purebred and hybrid pigs. Results We utilized a total of 8,199 pigs from breeding farms in eight provinces in China, comprising Yorkshire, Landrace, Duroc and Yorkshire × Landrace hybrids. All the animals were genotyped with 1K, 50K and 100K SNP chips. In comparison with random forest (RF), support vector regression (SVR) and Admixture, our results from five replicates of fivefold cross-validation demonstrated that MLP achieved a breed identification accuracy of 100% for both hybrids and purebreds with the 50K and 100K SNP chips; SVR performed comparably to MLP, and both outperformed RF and Admixture. In independent testing, MLP yielded an accuracy of 100% for all three pure breeds and the hybrid across all SNP chips and panel, while SVR yielded 0.026%–0.121% lower accuracy than MLP. Compared with a classification-based framework, the multi-output regression framework in this study helped improve prediction accuracy: MLP, RF and SVR achieved consistent improvements across all six SNP chips/panel, especially in hybrid identification. Our results showed that the determination threshold for purebreds affected the methods differently: SVR, RF and Admixture were very sensitive to threshold values, and their optimal thresholds fluctuated across scenarios, while MLP kept an optimal threshold of 0.75 in all cases. A threshold of 0.65–0.75 is ideal for accurate breed identification. Among SNP chips of different densities, the 1K SNP chip was the most cost-effective, as it yielded 100% accuracy when the training set was enlarged. Hybrid individuals in the training set were useful for both purebred and hybrid identification. Conclusions Our new MLP strategy demonstrated high accuracy and robust applicability across low-, medium-, and high-density SNP chips. The multi-output regression framework could universally enhance prediction accuracy for ML methods. Our new strategy is also helpful for breed identification in other livestock.
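The abstract describes a multi-output regression that predicts breed composition, followed by a determination threshold for calling an animal purebred. The exact decision rule is not stated, so the following is only a sketch under the assumption that an animal is called purebred when its largest predicted breed proportion reaches the threshold, and hybrid otherwise. The function name `assign_breed` and the example proportions are hypothetical.

```python
def assign_breed(composition, threshold=0.75):
    """Map predicted breed proportions to a label.

    Assumed rule (not confirmed by the abstract): purebred if the
    dominant breed's predicted proportion is at least `threshold`.
    """
    breed, share = max(composition.items(), key=lambda kv: kv[1])
    return breed if share >= threshold else "hybrid"


# Illustrative regression outputs for two animals (made-up values).
pure = {"Yorkshire": 0.97, "Landrace": 0.02, "Duroc": 0.01}
cross = {"Yorkshire": 0.52, "Landrace": 0.47, "Duroc": 0.01}
labels = [assign_breed(pure), assign_breed(cross)]
```

Under this reading, a threshold that is too high would misclassify purebreds with noisy predictions as hybrids, while one that is too low would do the reverse, which is consistent with the sensitivity to threshold values reported for some methods.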
The complex compositions of high-entropy alloys (HEAs) enable a variety of phase structures, such as FCC single phase, BCC single phase, or duplex FCC+BCC phase. Accurate and efficient prediction of phase structure is crucial for accelerating the discovery of new components and designing HEAs with a desired phase structure. In this work, five machine learning strategies were utilized to predict the phase structures of HEAs with a dataset of 296 entries. Specifically, a two-step feature selection strategy was proposed, which markedly improved computational efficiency by reducing the number of iterations for each model from 2047 to 12, while ensuring fewer input features and higher prediction accuracy. Compared with the traditional valence electron concentration criterion, the prediction accuracy on the collected dataset improved from 0.79 to 0.98 for random forest. Furthermore, HEAs with compositions of Al_(x)CoCu_(6)Ni_(6)Fe_(6) (x = 1, 3, 6) were developed to validate the prediction results of the machine learning models, and their mechanical properties as well as corrosion resistance were investigated. It is found that a higher Al content enhances the yield strength but deteriorates corrosion resistance. The present two-step feature selection strategy provides an alternative method that is feasible for predicting the phase structure of HEAs with high efficiency and accuracy.
The Cybertwin-enabled 6th Generation (6G) network is envisioned to support artificial intelligence-native management to meet the changing demands of 6G applications. Multi-Agent Deep Reinforcement Learning (MADRL) technologies driven by Cybertwins have been proposed for adaptive task offloading strategies. However, related works do not consider the random transmission delay between Cybertwin-driven agents and underlying networks, which destroys the standard Markov property and increases the decision reaction time, thereby degrading the performance of the task offloading strategy. To address this problem, we propose a pipelining task offloading method to lower the decision reaction time and model it as a delay-aware Markov Decision Process (MDP). Then, we design a delay-aware MADRL algorithm to minimize the weighted sum of task execution latency and energy consumption. Firstly, the state space is augmented using the last-received state and historical actions to rebuild the Markov property. Secondly, Gate Transformer-XL is introduced to capture the importance of historical actions and to maintain a consistent input dimension, which would otherwise change dynamically due to random transmission delays. Thirdly, a sampling method and a new loss function, which combines the difference between the current and target state values with the difference between the real and augmented state-action values, are designed to obtain state transition trajectories close to the real ones. Numerical results demonstrate that the proposed methods are effective in reducing reaction time and improving the task offloading performance in random-delay Cybertwin-enabled 6G networks.
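The state augmentation step above can be illustrated with a small data structure: the agent keeps the most recent state it has received together with the actions it has issued since that state was generated. This is a minimal sketch of the general idea for delayed MDPs, not the paper's implementation. The class `AugmentedState` and its method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AugmentedState:
    """Last-received state plus the actions issued since it was generated."""
    last_state: tuple
    pending_actions: list = field(default_factory=list)

    def act(self, action):
        # Every action taken before the next observation arrives is recorded.
        self.pending_actions.append(action)

    def observe(self, state, delay_steps):
        # A delayed observation arrives; it was generated `delay_steps`
        # decisions ago, so only the actions issued after it still matter.
        self.last_state = state
        self.pending_actions = (
            self.pending_actions[-delay_steps:] if delay_steps > 0 else []
        )


s = AugmentedState(last_state=(0.0,))
for a in ("a1", "a2", "a3"):
    s.act(a)
s.observe((1.0,), delay_steps=2)  # state generated two decisions ago
```

Because the delay is random, the length of `pending_actions` varies from step to step, which is the variable input dimension that the abstract says a sequence model is used to handle.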
In parallel steering coordination control strategies for path tracking, a driver steering model with fixed parameters is difficult to match with the actual driver, and a steering coordination control strategy designed for a single objective and simple conditions is difficult to adapt to multi-dimensional state variable inputs. In this paper, we propose a deep reinforcement learning algorithm-based multi-objective parallel human-machine steering coordination strategy for path tracking that considers driver misoperation and external disturbance. Firstly, the driver steering mathematical model is constructed based on the driver preview characteristics and steering delay response, and the driver characteristic parameters are fitted after collecting actual driver driving data. Secondly, considering that the vehicle is susceptible to external disturbances during driving, a Tube MPC (Tube Model Predictive Control)-based path tracking steering controller is designed based on the vehicle system dynamics error model. After verifying that the driver steering model matches the driver's steering operation characteristics, the DQN (Deep Q-network), DDPG (Deep Deterministic Policy Gradient) and TD3 (Twin Delayed Deep Deterministic Policy Gradient) deep reinforcement learning algorithms are utilized to design a multi-objective parallel steering coordination strategy that accommodates the multi-dimensional state variable inputs of the vehicle. Finally, evaluation indices for tracking accuracy, lateral safety, human-machine conflict and driver steering load are designed for different driver operation states and different road environments, and the performance of the parallel steering coordination control strategies with different deep reinforcement learning algorithms and fuzzy algorithms is compared by simulations and hardware-in-the-loop experiments. The results show that the parallel steering collaborative strategy based on a deep reinforcement learning algorithm can more effectively assist the driver in tracking the target path under lateral wind interference and driver misoperation, and the TD3-based coordination control strategy has better overall performance.
This study demonstrates the complexity and importance of water quality as a measure of the health and sustainability of ecosystems that directly influence biodiversity, human health, and the world economy. The predictability of water quality thus plays a crucial role in managing ecosystems, supporting informed decisions and proper environmental management. This study addresses these challenges by proposing an effective machine learning methodology applied to the "Water Quality" public dataset. The methodology models the dataset so that classification achieves high values of evaluation metrics such as accuracy, sensitivity, and specificity. The proposed methodology is based on two novel approaches: (a) the SMOTE method to deal with unbalanced data and (b) a careful selection of classical machine learning models. This paper uses Random Forests, Decision Trees, XGBoost, and Support Vector Machines because they can handle large datasets, can be trained on skewed datasets, and provide high accuracy in water quality classification. A key contribution of this work is the use of custom sampling strategies within the SMOTE approach, which significantly enhanced performance metrics and improved class imbalance handling. The results demonstrate significant improvements in predictive performance, achieving the highest reported metrics: accuracy (98.92% vs. 96.06%), sensitivity (98.3% vs. 71.26%), and F1 score (98.37% vs. 79.74%) using the XGBoost model. These improvements underscore the effectiveness of our custom SMOTE sampling strategies in addressing class imbalance. The findings contribute to environmental management by enabling ecology specialists to develop more accurate strategies for monitoring, assessing, and managing drinking water quality, ensuring better ecosystem and public health outcomes.
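The core step of SMOTE is to create a synthetic minority-class sample by interpolating between an existing minority sample and one of its nearest minority-class neighbours. The sketch below shows only that basic step in plain Python. It does not reproduce the custom sampling strategies mentioned in the abstract, which are not described there, and the helper names (`smote_sample`, `nearest_neighbors`, `oversample`) are hypothetical.

```python
import random


def nearest_neighbors(x, pool, k):
    """Return the k minority samples closest to x (excluding x itself)."""
    others = [p for p in pool if p is not x]
    return sorted(
        others, key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p))
    )[:k]


def smote_sample(x, neighbor, rng):
    """Pick a random point on the segment between x and its neighbour."""
    gap = rng.random()  # uniform in [0, 1)
    return [xi + gap * (ni - xi) for xi, ni in zip(x, neighbor)]


def oversample(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbor = rng.choice(nearest_neighbors(x, minority, k))
        synthetic.append(smote_sample(x, neighbor, rng))
    return synthetic


minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_points = oversample(minority, n_new=5)
```

In practice a maintained implementation would normally be used instead of hand-written code; the sketch is only meant to show why synthetic points stay inside the region spanned by existing minority samples.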
To cope with the surge of mobile data traffic in 5G networks, this paper proposes a solution that combines massive multiple-input multiple-output techniques with the Ultra-Dense Network (UDN) and focuses on the resulting challenge of increased energy consumption. A base station control algorithm based on Multi-Agent Proximal Policy Optimization (MAPPO) is designed. In the constructed 5G UDN model, each base station is considered an agent, and the MAPPO algorithm enables inter-base station collaboration and interference management to optimize network performance. To reduce the extra power consumption caused by frequent sleep mode switching of base stations, a sleep mode switching decision algorithm is proposed. The algorithm reduces unnecessary power consumption by evaluating the network state similarity and intelligently adjusting the agent's action strategy. Simulation results show that, while guaranteeing users' quality of service, the proposed algorithm reduces power consumption by 24.61% compared to the no-sleep strategy and by a further 5.36% compared to the traditional MAPPO algorithm.
The application of multiple unmanned aerial vehicles (UAVs) for the pursuit and capture of unauthorized UAVs has emerged as a novel approach to ensuring the safety of urban airspace. However, pursuit UAVs necessitate the utilization of their own sensors to proactively gather information from the unauthorized UAV. Considering the restricted sensing range of sensors, this paper proposes a multi-UAV with limited visual field pursuit-evasion (MUV-PE) problem. Each pursuer has a visual field characterized by limited perception distance and viewing angle, potentially obstructed by buildings. Only when the unauthorized UAV, i.e., the evader, enters the visual field of any pursuer can its position be acquired. The objective of the pursuers is to capture the evader as soon as possible without collision. To address this problem, we propose the normalizing flow actor with graph attention critic (NAGC) algorithm, a multi-agent reinforcement learning (MARL) approach. NAGC executes normalizing flows to augment the flexibility of the policy network, enabling the agent to sample actions from more intricate distributions rather than common distributions. To enhance the capability of simultaneously comprehending spatial relationships among multiple UAVs and environmental obstacles, NAGC integrates "obstacle-target" graph attention networks, significantly aiding pursuers in supporting search or pursuit activities. Extensive experiments conducted in a high-precision simulator validate the promising performance of the NAGC algorithm.
The integration of large-scale distributed new energy resources has led to heightened source‒load uncertainty. As energy prosumers, microgrids urgently require enhanced real-time regulation capabilities over controllable resources amid uncertain environments, rendering real-time and rapid decision-making a critical issue. This paper proposes a tailored twin delayed deep deterministic policy gradient (TD3) reinforcement learning algorithm that explicitly accounts for source‒load uncertainty. First, following an expert experience-based methodology, Gaussian process regression was implemented using the radial basis function covariance with historical source and load data. The parameters were adaptively adjusted by maximum likelihood estimation to generate the expected curves of demand and wind‒solar power generation, along with their 95% confidence regions, which were treated as representative uncertainty scenarios. Second, the traditional scheduling model was transformed into a deep reinforcement learning (DRL) environment through a Markov process. To minimize the total operational cost of the microgrid, the tailored TD3 algorithm was applied to formulate rapid intraday scheduling decisions. Finally, simulations were conducted using real historical data from an actual region in Zhejiang province, China, to verify the efficacy of the proposed method. The results demonstrate the potential of the algorithm for achieving economic scheduling for microgrids.
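For a Gaussian predictive distribution, such as the one Gaussian process regression produces at each time step, a 95% confidence region is the predicted mean plus or minus about 1.96 predicted standard deviations. The sketch below shows only this last step and assumes the per-step means and standard deviations are already available. It does not fit a Gaussian process, and the load values are made up for illustration.

```python
from statistics import NormalDist

# Two-sided 95% interval: the 0.975 quantile of the standard normal (~1.96).
Z95 = NormalDist().inv_cdf(0.975)


def confidence_band(mean, std, z=Z95):
    """Return (lower, upper) bounds for each time step."""
    return [(m - z * s, m + z * s) for m, s in zip(mean, std)]


# Hypothetical predicted load (arbitrary units) and its standard deviation.
predicted_mean = [10.0, 12.5, 11.0]
predicted_std = [1.0, 0.5, 2.0]
band = confidence_band(predicted_mean, predicted_std)
```

The expected curve and the two band edges can then serve as representative scenarios, in the spirit of the uncertainty scenarios described in the abstract.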
In the wake of major natural or human-made disasters, the communication infrastructure within disaster-stricken areas is frequently damaged. Unmanned aerial vehicles (UAVs), thanks to merits such as rapid deployment and high mobility, are commonly regarded as an ideal option for constructing temporary communication networks. Considering the limited computing capability and battery power of UAVs, this paper proposes a two-layer UAV cooperative computing offloading strategy for emergency disaster relief scenarios. The multi-agent twin delayed deep deterministic policy gradient (MATD3) algorithm integrated with prioritized experience replay (PER) is used to jointly optimize the scheduling strategies of UAVs, task offloading ratios, and UAV mobility, with the aim of minimizing the energy consumption and delay of the system. To address this non-convex optimization problem, a Markov decision process (MDP) is established. Simulation results demonstrate that, compared with four baseline algorithms, the proposed algorithm exhibits better convergence performance, verifying its feasibility and efficacy.
BACKGROUND The accurate prediction of lymph node metastasis (LNM) is crucial for managing locally advanced (T3/T4) colorectal cancer (CRC). However, both traditional histopathology and standard slide-level deep learning often fail to capture the sparse and diagnostically critical features of metastatic potential. AIM To develop and validate a case-level multiple-instance learning (MIL) framework mimicking a pathologist's comprehensive review and improve T3/T4 CRC LNM prediction. METHODS The whole-slide images of 130 patients with T3/T4 CRC were retrospectively collected. A case-level MIL framework utilising the CONCH v1.5 and UNI2-h deep learning models was trained on features from all haematoxylin and eosin-stained primary tumour slides for each patient. These pathological features were subsequently integrated with clinical data, and model performance was evaluated using the area under the curve (AUC). RESULTS The case-level framework demonstrated superior LNM prediction over slide-level training, with the CONCH v1.5 model achieving a mean AUC (±SD) of 0.899±0.033 vs 0.814±0.083, respectively. Integrating pathology features with clinical data further enhanced performance, yielding a top model with a mean AUC of 0.904±0.047, in sharp contrast to a clinical-only model (mean AUC 0.584±0.084). Crucially, a pathologist's review confirmed that the model-identified high-attention regions correspond to known high-risk histopathological features. CONCLUSION A case-level MIL framework provides a superior approach for predicting LNM in advanced CRC. This method shows promise for risk stratification and therapy decisions, requiring further validation.
Funding: Supported by the Foundation of President of Hebei University (XZJJ202303).
Abstract: Federated learning is a machine learning framework designed to protect privacy by keeping training data on clients' devices without sharing private data. It trains a global model through collaboration between clients and the server. However, data heterogeneity can lead to inefficient model training and even reduce the final model's accuracy and generalization capability. Meanwhile, data scarcity can result in suboptimal cluster distributions for few-shot clients in centralized clustering tasks, and standalone personalization tasks may cause severe overfitting. To address these limitations, we introduce a federated learning dual optimization model based on clustering and personalization strategy (FedCPS). FedCPS adopts a decentralized approach, where clients identify their cluster membership locally without relying on a centralized clustering algorithm. Building on this, FedCPS introduces personalized training tasks locally, adding a regularization term to control deviations between local and cluster models. This improves the generalization ability of the final model while mitigating overfitting. The use of weight-sharing techniques also reduces the computational cost of central machines. Experimental results on the MNIST, FMNIST, CIFAR10, and CIFAR100 datasets demonstrate that our method achieves better personalization than other personalized federated learning methods, with an average test accuracy improvement of 0.81%–2.96%. We also adjusted the proportion of few-shot clients to evaluate the impact on accuracy across different methods. The experiments show that accuracy under FedCPS drops by only 0.2%–3.7%, compared with 2.1%–10% for existing methods. Our method demonstrates its advantages across diverse data environments.
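Editorial illustration: the abstract describes a regularization term that limits how far a client's local model drifts from its cluster model, but it does not give the formula. A common form of such a penalty is the proximal term (lam / 2) * ||w - w_cluster||^2 added to the task loss. The sketch below assumes that form; the function name, the learning rate `lr`, and the weight `lam` are hypothetical and are not taken from FedCPS.

```python
def proximal_local_step(local_w, cluster_w, grad, lr=0.1, lam=0.5):
    """One gradient step on: task_loss(w) + (lam / 2) * ||w - cluster_w||^2.

    `grad` is the gradient of the task loss at `local_w`. The penalty adds
    lam * (w - cluster_w) to that gradient, pulling the local weights back
    toward the cluster model. This is an assumed form, not the paper's.
    """
    return [
        w - lr * (g + lam * (w - c))
        for w, c, g in zip(local_w, cluster_w, grad)
    ]


# With a zero task gradient, only the penalty acts: each weight moves a
# fraction lr * lam of the way toward the cluster model.
local = [1.0, -2.0]
cluster = [0.0, 0.0]
stepped = proximal_local_step(local, cluster, grad=[0.0, 0.0])
```

A larger `lam` keeps personalized models closer to the cluster model (less overfitting on few-shot clients), while a smaller `lam` allows more personalization.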
Abstract: An intelligent endo-atmospheric penetration strategy based on generative adversarial reinforcement learning is proposed in this manuscript. Firstly, attack and defense adversarial models are established, and the missile maneuver penetration problem is transformed into an optimal control problem, considering penetration, handover position and mid-terminal guidance velocity constraints. Then, the Radau Pseudospectral method is adopted to generate data samples considering random perturbations. Furthermore, the Generative Adversarial Imitation Learning Combined with Deep Deterministic Policy Gradient method (GAIL-DDPG) is designed, with internal process reward signals constructed to tackle the long-term sparse reward in the missile maneuver penetration problem. Finally, the penetration strategy is trained and verified. Simulation shows that, by using generative adversarial reinforcement learning with a sample library to learn expert experience in the early stage of training, the proposed method can converge quickly. Performance is further optimized with the reinforcement learning exploration strategy in the later stage of training. Simulation also shows that the proposed method has better engineering application ability than the traditional reinforcement learning method.
Funding: Supported by the National Natural Science Foundation of China under Grant 61931005, the Beijing Natural Science Foundation under Grant L202018, and the Key Laboratory of Internet of Vehicle Technical Innovation and Testing (CAICT), Ministry of Industry and Information Technology, under Grant No. KL-2023-001.
Abstract: In the Internet of Vehicles (IoV), Vehicle-Infrastructure-Cloud cooperation supports diverse intelligent driving and intelligent transportation applications. Federated Learning (FL) is an emerging computation paradigm that provides efficient and privacy-preserving collaborative learning. However, in the IoV environment, federated learning faces challenges introduced by the high mobility of vehicles and non-independent and identically distributed (non-IID) data. High mobility causes FL clients to drop out and communication to go offline. Non-IID data leads to slow and unstable convergence of the global model, and a single global model adapts poorly to clients with different localized characteristics. Accordingly, this paper proposes a personalized aggregation strategy for hierarchical Federated Learning in the IoV environment, including FedSA (Special Asynchronous Federated Learning with Self-adaptive Aggregation) for low-level FL between a Road Side Unit (RSU) and the vehicles within its coverage, and FedAtt (Federated Learning with Attention Mechanism) for high-level FL between a cloud server and multiple RSUs. Agents self-adaptively obtain model aggregation weights based on the Advantage Actor-Critic (A2C) algorithm. Experiments show that the proposed strategy encourages vehicles to participate in global aggregation and outperforms existing methods in training performance.
Funding: Supported by the National Natural Science Foundation of China (No. 62103014).
Abstract: In practical combat scenarios, Hypersonic Glide Vehicles (HGV) face the challenge of evading Successive Pursuers from the Same Direction while satisfying the Homing Constraint (SPSDHC). To address this problem, this paper proposes a parameterized evasion guidance algorithm based on reinforcement learning. The three-player optimal evasion strategy is first analyzed and approximated by parametrization. The switching acceleration command of the HGV optimal evasion strategy, considering the upper limit of the missile acceleration command, is analyzed based on optimal control theory. The terminal miss of the HGV when evading two missiles is analyzed, showing that the three-player optimal evasion strategy is a linear combination of two one-to-one strategies. Then, a velocity control algorithm is proposed to increase the terminal miss by actively controlling the flight speed of the HGV based on the parametrized evasion strategy. Reinforcement learning is used to implement the strategy in real time, and a reward function is designed by deducing a homing strategy for the HGV to approach the target, which ensures that the HGV satisfies the homing constraint. Experimental results demonstrate the feasibility and robustness of the proposed parameterized evasion strategy, which enables the HGV to generate maximum terminal miss and satisfy the homing constraint when facing single or double missiles.
Abstract: In recent years, robotic arm grasping has become a pivotal task in the field of robotics, with applications spanning from industrial automation to healthcare. The optimization of grasping strategies plays a crucial role in enhancing the effectiveness, efficiency, and reliability of robotic systems. This paper presents a novel approach to optimizing robotic arm grasping strategies based on deep reinforcement learning (DRL). Using reinforcement learning algorithms such as Q-Learning, Deep Q-Networks (DQN), Policy Gradient Methods, and Proximal Policy Optimization (PPO), the study aims to improve the performance of robotic arms in grasping objects with varying shapes, sizes, and environmental conditions. The paper provides a detailed analysis of the deep reinforcement learning methods used for grasping strategy optimization, emphasizing the strengths and weaknesses of each algorithm. It also presents a comprehensive framework for training the DRL models, including simulation environment setup, the optimization process, and the evaluation metrics for grasping success. The results demonstrate that the proposed approach significantly enhances the accuracy and stability of the robotic arm in performing grasping tasks. The study further explores the challenges in training deep reinforcement learning models for real-time robotic applications and offers solutions for improving the efficiency and reliability of grasping strategies.
Funding: 2025 Wenzhou Key Research Base of Philosophy and Social Science (Wenzhou University Learning Science and Technology Research Centre) Research Project: Investigation and Strategy Research on the Causes of Middle School Students' Learning Difficulties in the Context of the Leading Country in Education.
Abstract: The purpose of this research is to analyze the causal mechanisms of middle school students' learning difficulties and use them to propose support strategies. The research is particularly valuable for its focus on middle school students, since this critical transition period is studied less often than primary and high school. The research establishes a structural equation model and analyzes the survey data using the partial least squares method. The data were obtained from a questionnaire survey of 13,900 students in Wenzhou City, China. The research found that learning strategies were the most significant influence on learning effectiveness, followed by learning motivation and learning relationships. Meanwhile, learning relationships had a significant impact on learning pressure. Accordingly, this research proposes targeted support strategies. These aim to enhance learning motivation (set achievable learning goals for each student with learning difficulties based on their actual situation), optimize learning strategies (encourage students with learning difficulties to learn self-regulatory strategies such as goal setting, time management, and self-reflection), and improve learning relationships (establish a good social network to promote positive interaction between students with learning difficulties and their peers). At the same time, the strategies reduce students' learning pressure. Ultimately, the learning effectiveness of students with learning difficulties is improved.
Funding: Project supported by the Foundation for Innovative Research Groups of the National Natural Science Foundation of China (No. 12421002), the National Science Funds for Distinguished Young Scholars of China (No. 12025204), the National Natural Science Foundation of China (No. 12372015), and the China Scholarship Council (No. 202206890065).
Abstract: Multi-constrained pipes conveying fluid, such as aircraft hydraulic control pipes, are susceptible to resonance fatigue in harsh vibration environments, which may lead to system failure and even catastrophic accidents. In this study, a machine learning (ML)-assisted weak vibration design method under harsh environmental excitations is proposed. The dynamic model of a typical pipe is developed using the absolute nodal coordinate formulation (ANCF) to determine its vibrational characteristics. With the harsh vibration environments as the preserved frequency band (PFB), the safety design is defined by comparing the natural frequency with the PFB. By analyzing the safety design of pipes with different constraint parameters, the dataset of the absolute safety length and the absolute resonance length of the pipe is obtained. This dataset is then utilized to develop genetic programming (GP) algorithm-based ML models capable of producing explicit mathematical expressions of the pipe's absolute safety length and absolute resonance length, with the location, stiffness, and total number of retaining clips as design variables. The proposed ML models effectively bridge the dataset with the prediction results. Thus, the ML model is utilized to stagger the natural frequency, and the PFB is utilized to achieve the weak vibration design. The findings of the present study provide valuable insights into the practical application of weak vibration design.
Abstract: Anxiety, motivation, and strategy have long been seen as critical in second language acquisition. This study presents a systematic review of the literature on these variables in terms of their relationship with one another, their effects on learning outcomes, and how they are affected by technology-assisted tools in the teaching of Chinese as a second language. The review includes 24 articles selected according to the criteria and process of the Preferred Reporting Items for Systematic Review and Meta-Analysis Protocol (PRISMA-P) and the clustering techniques of VOSviewer. It is found that 1) anxiety, motivation, and strategy were interrelated, that is, motivation was negatively associated with anxiety but positively related to strategy, while strategy could positively predict anxiety; 2) anxiety could both positively and negatively affect learning outcomes, while motivation and strategy could both positively and insignificantly influence learning outcomes; 3) the technology-assisted tools used in the classroom could both positively and negatively affect the levels of these variables and learning outcomes in the L2 Chinese context. The need to explore more complicated relationships among language-specific individual variables, and other possible factors that affect these variables, such as cultural ones, is also discussed for future research.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52405104), the Jiangxi Provincial Natural Science Foundation (Grant Nos. 20242BAB20249 and 20232BAB204041), and the Science and Technology Project of the Department of Transportation of Jiangxi Province (Grant No. 2025QN003).
Abstract: Autonomous driving technology is constantly developing toward more complex scenes, and there is a growing demand for end-to-end data-driven control. However, the end-to-end path tracking process often encounters challenges in learning efficiency and generalization. To address this issue, this paper designs a deep deterministic policy gradient (DDPG)-based reinforcement learning strategy that integrates imitation learning and feedforward exploration in the path following process. In imitation learning, the path tracking control data generated by the model predictive control (MPC) method is used to train an end-to-end steering control model of a deep neural network. A feedforward exploration behavior is predicted from road curvature and vehicle speed, and both it and imitation learning are added to the DDPG reinforcement learning to obtain decision-making experience and action prediction behavior for the path tracking process. In the reinforcement learning process, imitation learning is used to update the pre-training parameters of the actor network, and a feedforward steering technique with random noise is adopted for strategy exploration. In the reward function, a hierarchical progressive reward form and a constrained objective reward function referring to MPC are designed, and the actor-critic network architecture is determined. Finally, the path tracking performance of the designed method is verified by comparing various training results, simulations, and HIL tests. The results show that the designed method can effectively utilize pre-training and feedforward prior experience to obtain optimal path tracking performance of an autonomous vehicle, and has better generalization ability than other methods. This study provides an efficient control scheme for improving the end-to-end control performance of autonomous vehicles.
Funding: The National Key R&D Program of China (No. 2022YFD1700200).
Abstract: The development of fungicides is time-consuming and costly. Introducing a fungicide-likeness assessment strategy at the early screening stage can help reduce development risks and improve the success rate. However, existing assessment methods are often plagued by low accuracy and poor generalization, while fragment-based design strategies commonly fail to account for synergistic effects between structural units. Therefore, based on a small-scale sample set, this study developed a more efficient global predictive model for fungicidal activity, named APPf, by integrating multi-scale feature screening methods and machine learning algorithms; the model also accounts for synergistic effects among different structural fragments. We utilized three independent external test sets for model validation: External Test Set 1 for general validation, External Test Set 2 for comparison with existing models, and External Test Set 3 for disease-specific fungicide evaluation. On External Test Set 1, the APPf model achieved a precision of 0.6454, a recall of 0.8535, and an F1 score of 0.7350, demonstrating its robust predictive performance. It also exhibited strong enrichment capability for positive samples in External Test Set 2. For External Test Set 3, APPf achieved a prediction accuracy exceeding 80% for each disease, suggesting its promising potential in practical fungicide development. Furthermore, we quantified the contribution of molecular descriptors to the model predictions using SHAP value analysis and identified nHdNH and NssssNp as strong indicative features for predicting fungicidal activity, thereby enhancing the interpretability of the model. APPf has been deployed on a public web server (http://pesticides.cau.edu.cn/APPf), providing a user-friendly online prediction service to support the discovery of novel fungicides. Meanwhile, we employed a molecular fragmentation strategy to analyze the co-occurrence relationships between fragments in fungicides and constructed a network map of fragment co-occurrence associated with fungicidal activity. This study provides both an active fragment library and a global fungicide-likeness assessment tool for AI-based de novo molecular generation aimed at discovering novel fungicidal leads, which is expected to enhance the efficiency of developing new fungicides.
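Editorial illustration: the three figures reported for External Test Set 1 are mutually consistent, because the F1 score is the harmonic mean of precision and recall. The short check below uses only the standard definition and the two values quoted in the abstract; the function name is an illustrative choice.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract for External Test Set 1.
f1 = f1_score(0.6454, 0.8535)

# The result agrees with the reported F1 of 0.7350 to four decimal places.
matches_reported = abs(f1 - 0.7350) < 5e-5
```

The gap between recall (0.8535) and precision (0.6454) indicates that the model recovers most active compounds at the cost of a noticeable share of false positives, which suits an early screening stage.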
Funding: Supported by grants from the Key R&D Program of Shandong Province (2022LZGC003), the China Agriculture Research System of MOF and MARA, the National Key Research and Development Project (2023YFD1300200 and 2023YFF1001104), the Science and Technology Program of Sichuan Province (2024ZHCG0109), and the 2115 Talent Development Program of China Agricultural University.
Abstract: Background Breed identification plays an important role in conserving indigenous breeds, managing genetic resources, and developing effective breeding strategies. However, research on breed identification in livestock has mainly focused on purebreds and has yielded lower prediction accuracy in hybrids. In this study, we present a Multi-Layer Perceptron (MLP) model with a multi-output regression framework specifically designed for genomic breed composition prediction of purebred and hybrid pigs. Results We utilized a total of 8,199 pigs from breeding farms in eight provinces in China, comprising Yorkshire, Landrace, Duroc and Yorkshire×Landrace hybrids. All the animals were genotyped with 1K, 50K and 100K SNP chips. In comparison with random forest (RF), support vector regression (SVR) and Admixture, our results from five replicates of fivefold cross validation demonstrated that MLP achieved a breed identification accuracy of 100% for both hybrids and purebreds with the 50K and 100K SNP chips; SVR performed comparably to MLP, and both outperformed RF and Admixture. In the independent testing, MLP yielded an accuracy of 100% for all three pure breeds and the hybrid across all SNP chips and panel, while SVR yielded 0.026%–0.121% lower accuracy than MLP. Compared with a classification-based framework, the new multi-output regression framework in this study helped improve prediction accuracy. MLP, RF and SVR achieved consistent improvements across all six SNP chips/panel, especially in hybrid identification. Our results showed that the determination threshold for purebreds had different effects: SVR, RF and Admixture were very sensitive to threshold values, and their optimal thresholds fluctuated in different scenarios, while MLP kept an optimal threshold of 0.75 in all cases. A threshold of 0.65–0.75 is ideal for accurate breed identification. Among the different densities of SNP chips, the 1K SNP chip was the most cost-effective, yielding 100% accuracy as the training set was enlarged. Hybrid individuals in the training set were useful for both purebred and hybrid identification. Conclusions Our new MLP strategy demonstrated high accuracy and robust applicability across low-, medium-, and high-density SNP chips. The multi-output regression framework could universally enhance prediction accuracy for ML methods. Our new strategy is also helpful for breed identification in other livestock.
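Editorial illustration: the abstract refers to a determination threshold applied to predicted breed composition, with 0.75 reported as optimal for MLP. The sketch below shows one plausible reading of that rule, in which an animal is called purebred when the largest predicted breed proportion reaches the threshold and hybrid otherwise. The function name, the input layout, and the exact decision rule are assumptions; the abstract does not spell them out.

```python
def classify_breed(composition, threshold=0.75):
    """Turn predicted breed proportions into a purebred/hybrid call.

    `composition` maps breed name -> predicted proportion, e.g. the output
    of a multi-output regressor. Assumed rule: if the dominant breed's
    share is at least `threshold`, report that breed; otherwise "hybrid".
    """
    breed, share = max(composition.items(), key=lambda item: item[1])
    return breed if share >= threshold else "hybrid"


# Hypothetical predictions for two animals.
pure_call = classify_breed({"Yorkshire": 0.90, "Landrace": 0.08, "Duroc": 0.02})
cross_call = classify_breed({"Yorkshire": 0.52, "Landrace": 0.46, "Duroc": 0.02})
```

Under this reading, lowering the threshold toward 0.65 labels more animals as purebred, which is why methods whose predicted proportions are noisier are more sensitive to the chosen value.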
Funding: The Shenzhen Fundamental Research Fund (No. JCYJ20210324122801005) and the Fundamental Research Funds for the Central Universities (No. HIT.OCEF.2023022).
Abstract: The complex compositions of high-entropy alloys (HEAs) enable a variety of phase structures, such as FCC single phase, BCC single phase, or duplex FCC+BCC phase. Accurate and efficient prediction of phase structure is crucial for accelerating the discovery of new components and designing HEAs with a desired phase structure. In this work, five machine learning strategies were utilized to predict the phase structures of HEAs with a dataset of 296. Specifically, a two-step feature selection strategy was proposed, reducing the computational cost from 2047 to 12 iterations for each model while ensuring fewer input features and higher prediction accuracy. Compared with the traditional valence electron concentration criterion, the prediction accuracy on the collected dataset improved from 0.79 to 0.98 for random forest. Furthermore, HEAs with compositions of Al_(x)CoCu_(6)Ni_(6)Fe_(6) (x=1,3,6) were developed to validate the prediction results of the machine learning models, and their mechanical properties and corrosion resistance were investigated. It is found that a higher Al content enhances the yield strength but deteriorates corrosion resistance. The present two-step feature selection strategy provides an alternative method that is feasible for predicting the phase structure of HEAs with high efficiency and accuracy.
Funding: Funded by the National Key Research and Development Program of China under Grant 2019YFB1803301 and the Beijing Natural Science Foundation (L202002).
Abstract: The Cybertwin-enabled 6th Generation (6G) network is envisioned to support artificial intelligence-native management to meet the changing demands of 6G applications. Multi-Agent Deep Reinforcement Learning (MADRL) technologies driven by Cybertwins have been proposed for adaptive task offloading strategies. However, related works do not consider the random transmission delay between Cybertwin-driven agents and underlying networks, which destroys the standard Markov property and increases the decision reaction time, reducing task offloading performance. To address this problem, we propose a pipelining task offloading method to lower the decision reaction time and model it as a delay-aware Markov Decision Process (MDP). Then, we design a delay-aware MADRL algorithm to minimize the weighted sum of task execution latency and energy consumption. Firstly, the state space is augmented using the last-received state and historical actions to rebuild the Markov property. Secondly, Gate Transformer-XL is introduced to capture the importance of historical actions and to maintain a consistent input dimension, which would otherwise change dynamically because of random transmission delays. Thirdly, a sampling method and a new loss function, which includes the difference between the current and target state value and the difference between the real state-action value and the augmented state-action value, are designed to obtain state transition trajectories close to the real ones. Numerical results demonstrate that the proposed methods are effective in reducing reaction time and improving task offloading performance in random-delay Cybertwin-enabled 6G networks.
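Editorial illustration: the state augmentation step in the abstract follows a well-known idea for delayed decision processes, in which the agent acts on the most recently received state together with the actions it has issued since that state was produced. The sketch below shows that idea together with a weighted-sum cost. All names, the weight `w`, and the flat tuple layout are hypothetical; the paper's actual encoding (which feeds a Transformer-XL variant) is not described in the abstract.

```python
def augment_state(last_received_state, action_history, delay_steps):
    """Build an augmented state from a stale observation.

    Appends the `delay_steps` most recent actions to the last state that
    actually arrived. Those actions are the ones whose effects are not yet
    visible in the stale state. Assumed layout, not the paper's.
    """
    recent = action_history[-delay_steps:] if delay_steps > 0 else []
    return tuple(last_received_state) + tuple(recent)


def weighted_cost(latency, energy, w=0.5):
    """Weighted sum of latency and energy; `w` is a hypothetical trade-off."""
    return w * latency + (1.0 - w) * energy


# A state delayed by two steps is paired with the last two actions.
augmented = augment_state([0.2, 0.5], action_history=[1, 0, 2, 1], delay_steps=2)
no_delay = augment_state([0.2, 0.5], action_history=[1, 0, 2, 1], delay_steps=0)
```

The augmented tuple grows with the delay, so its length varies when delays are random. That variable length is the input-dimension problem the abstract says the sequence model is introduced to handle.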
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. U22A20246, 52372382), the Hefei Municipal Natural Science Foundation (Grant No. 2022008), the Open Fund of State Key Laboratory of Mechanical Behavior and System Safety of Traffic Engineering Structures (Grant No. KF2023-06), and the S&T Program of Hebei (Grant No. 225676162GH).
Abstract: In parallel steering coordination control strategies for path tracking, it is difficult to match a driver steering model that uses fixed parameters with the actual driver, and a steering coordination control strategy designed for a single objective and simple conditions is difficult to adapt to multi-dimensional state variable inputs. In this paper, we propose a deep reinforcement learning algorithm-based multi-objective parallel human-machine steering coordination strategy for path tracking that considers driver misoperation and external disturbance. Firstly, the driver steering mathematical model is constructed based on the driver preview characteristics and steering delay response, and the driver characteristic parameters are fitted after collecting actual driver driving data. Secondly, considering that the vehicle is susceptible to external disturbances during driving, a Tube MPC (Tube Model Predictive Control)-based path tracking steering controller is designed based on the vehicle system dynamics error model. After verifying that the driver steering model meets the driver steering operation characteristics, the DQN (Deep Q-network), DDPG (Deep Deterministic Policy Gradient) and TD3 (Twin Delayed Deep Deterministic Policy Gradient) deep reinforcement learning algorithms are utilized to design a multi-objective parallel steering coordination strategy that satisfies the multi-dimensional state variable inputs of the vehicle. Finally, evaluation indices for tracking accuracy, lateral safety, human-machine conflict and driver steering load are designed for different driver operation states and different road environments, and the performance of the parallel steering coordination control strategies with different deep reinforcement learning algorithms and fuzzy algorithms is compared by simulations and hardware-in-the-loop experiments. The results show that the parallel steering collaborative strategy based on a deep reinforcement learning algorithm can more effectively assist the driver in tracking the target path under lateral wind interference and driver misoperation, and the TD3-based coordination control strategy has better overall performance.
Abstract: This study demonstrates the complexity and importance of water quality as a measure of the health and sustainability of ecosystems that directly influence biodiversity, human health, and the world economy. The predictability of water quality thus plays a crucial role in managing ecosystems, supporting informed decisions and proper environmental management. This study addresses these challenges by proposing an effective machine learning methodology applied to the "Water Quality" public dataset. The methodology models the dataset for classification and achieves high values of evaluation metrics such as accuracy, sensitivity, and specificity. The proposed methodology is based on two approaches: (a) the SMOTE method to deal with unbalanced data and (b) carefully applied classical machine learning models. This paper uses Random Forests, Decision Trees, XGBoost, and Support Vector Machines because they can handle large datasets, can be trained on skewed datasets, and provide high accuracy in water quality classification. A key contribution of this work is the use of custom sampling strategies within the SMOTE approach, which significantly enhanced performance metrics and improved class imbalance handling. The results demonstrate significant improvements in predictive performance, achieving the highest reported metrics: accuracy (98.92% vs. 96.06%), sensitivity (98.3% vs. 71.26%), and F1 score (98.37% vs. 79.74%) using the XGBoost model. These improvements underscore the effectiveness of our custom SMOTE sampling strategies in addressing class imbalance. The findings contribute to environmental management by enabling ecology specialists to develop more accurate strategies for monitoring, assessing, and managing drinking water quality, ensuring better ecosystem and public health outcomes.
Funding: Supported by the National Natural Science Foundation of China (62271096, U20A20157), the Natural Science Foundation of Chongqing, China (CSTB2023NSCQ-LZX0134), the University Innovation Research Group of Chongqing (CXQT20017), the Youth Innovation Group Support Program of ICE Discipline of CQUPT (SCIE-QN-2022-04), the Science and Technology Research Program of Chongqing Municipal Education Commission (KJQN202300632), and the Chongqing Postdoctoral Special Funding Project (2022CQBSHTB2057).
Abstract: To address the surge of mobile data traffic in 5G networks, this paper proposes a solution that combines massive multiple-input multiple-output techniques with the Ultra-Dense Network (UDN) and focuses on the resulting challenge of increased energy consumption. A base station control algorithm based on Multi-Agent Proximal Policy Optimization (MAPPO) is designed. In the constructed 5G UDN model, each base station is considered as an agent, and the MAPPO algorithm enables inter-base station collaboration and interference management to optimize network performance. To reduce the extra power consumption caused by frequent sleep mode switching of base stations, a sleep mode switching decision algorithm is proposed. The algorithm reduces unnecessary power consumption by evaluating the network state similarity and intelligently adjusting the agent's action strategy. Simulation results show that the proposed algorithm reduces power consumption by 24.61% compared with the no-sleep strategy and by a further 5.36% compared with the traditional MAPPO algorithm, while guaranteeing the quality of service of users.
Funding: Supported in part by the National Natural Science Foundation of China (62373380).
Abstract: The application of multiple unmanned aerial vehicles (UAVs) for the pursuit and capture of unauthorized UAVs has emerged as a novel approach to ensuring the safety of urban airspace. However, pursuit UAVs must use their own sensors to proactively gather information about the unauthorized UAV. Considering the restricted sensing range of sensors, this paper proposes a multi-UAV pursuit-evasion problem with limited visual field (MUV-PE). Each pursuer has a visual field characterized by a limited perception distance and viewing angle, which may be obstructed by buildings. The position of the unauthorized UAV, i.e., the evader, can be acquired only when it enters the visual field of some pursuer. The objective of the pursuers is to capture the evader as soon as possible without collision. To address this problem, we propose the normalizing flow actor with graph attention critic (NAGC) algorithm, a multi-agent reinforcement learning (MARL) approach. NAGC uses normalizing flows to increase the flexibility of the policy network, enabling the agent to sample actions from more intricate distributions rather than common ones. To better capture the spatial relationships among multiple UAVs and environmental obstacles simultaneously, NAGC integrates “obstacle-target” graph attention networks, which significantly aid pursuers in search and pursuit. Extensive experiments conducted in a high-precision simulator validate the promising performance of the NAGC algorithm.
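The limited visual field described above can be illustrated with a planar sector test. This is a simplified sketch: it works in 2D, ignores occlusion by buildings, and the function names and parameters (`max_range`, `fov_rad`) are assumptions rather than the paper's notation.

```python
import math


def in_visual_field(pursuer_xy, heading_rad, evader_xy, max_range, fov_rad):
    """Return True if the evader lies inside the pursuer's sensing sector.

    Occlusion by buildings is deliberately ignored in this sketch.
    """
    dx = evader_xy[0] - pursuer_xy[0]
    dy = evader_xy[1] - pursuer_xy[1]
    if math.hypot(dx, dy) > max_range:
        return False  # beyond the perception distance
    bearing = math.atan2(dy, dx)
    # Wrap the angular offset into [-pi, pi) before comparing with the half-angle.
    offset = (bearing - heading_rad + math.pi) % (2 * math.pi) - math.pi
    return abs(offset) <= fov_rad / 2


def evader_observed(pursuers, evader_xy, max_range, fov_rad):
    """The evader's position is available if any pursuer sees it."""
    return any(in_visual_field(p, h, evader_xy, max_range, fov_rad) for p, h in pursuers)


# Two pursuers given as ((x, y), heading); 90-degree field of view, range 5.
team = [((0.0, 0.0), 0.0), ((10.0, 0.0), math.pi)]
print(evader_observed(team, (7.0, 0.5), max_range=5.0, fov_rad=math.pi / 2))
```

A fuller model would add a line-of-sight test against building footprints before reporting the evader as visible.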
Funding: Supported in part by the Science and Technology Project of State Grid Corporation of China (No. 5400-202319829A-4-1-KJ).
Abstract: The integration of large-scale distributed new energy resources has led to heightened source‒load uncertainty. As energy prosumers, microgrids urgently require enhanced real-time regulation capabilities over controllable resources in uncertain environments, making rapid real-time decision-making a critical issue. This paper proposes a tailored twin delayed deep deterministic policy gradient (TD3) reinforcement learning algorithm that explicitly accounts for source‒load uncertainty. First, following an expert experience-based methodology, Gaussian process regression was implemented using the radial basis function covariance with historical source and load data. The parameters were adaptively adjusted by maximum likelihood estimation to generate the expected curves of demand and wind‒solar power generation, along with their 95% confidence regions, which were treated as representative uncertainty scenarios. Second, the traditional scheduling model was transformed into a deep reinforcement learning (DRL) environment through a Markov process. To minimize the total operational cost of the microgrid, the tailored TD3 algorithm was applied to formulate rapid intraday scheduling decisions. Finally, simulations were conducted using real historical data from an actual region in Zhejiang province, China, to verify the efficacy of the proposed method. The results demonstrate the potential of the algorithm for achieving economic scheduling for microgrids.
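For reference, the standard Gaussian process regression quantities behind the expected curves and 95% confidence regions can be written as follows. This assumes a zero-mean prior and Gaussian observation noise; the symbols ($\sigma_f$, $\ell$, $\sigma_n$, $K_y$) are conventional notation rather than the paper's own, and the abstract does not state whether the region covers the latent curve or noisy observations.

```latex
% RBF (squared-exponential) covariance between time points t and t'
k(t, t') = \sigma_f^{2} \exp\!\left( -\frac{(t - t')^{2}}{2\ell^{2}} \right)

% Noisy training covariance for n historical samples (t_i, y_i)
K_y = K + \sigma_n^{2} I, \qquad K_{ij} = k(t_i, t_j)

% Hyperparameters \theta = (\sigma_f, \ell, \sigma_n) from maximum likelihood
\log p(\mathbf{y} \mid \theta)
  = -\tfrac{1}{2}\, \mathbf{y}^{\top} K_y^{-1} \mathbf{y}
    - \tfrac{1}{2} \log \lvert K_y \rvert
    - \tfrac{n}{2} \log 2\pi

% Posterior mean and variance at a query time t_*, with (\mathbf{k}_*)_i = k(t_i, t_*)
\mu_* = \mathbf{k}_*^{\top} K_y^{-1} \mathbf{y}, \qquad
\sigma_*^{2} = k(t_*, t_*) - \mathbf{k}_*^{\top} K_y^{-1} \mathbf{k}_*

% 95% confidence region for the latent curve
\mu_* \pm 1.96\, \sigma_*
```

If the region is meant to cover noisy observations rather than the latent curve, $\sigma_n^{2}$ is added to $\sigma_*^{2}$ before taking the square root.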
Funding: Supported by the Basic Scientific Research Business Fund Project of Higher Education Institutions in Heilongjiang Province (145409601) and the First Batch of Experimental Teaching and Teaching Laboratory Construction Research Projects in Heilongjiang Province (SJGZ20240038).
Abstract: In the wake of major natural or human-made disasters, the communication infrastructure within disaster-stricken areas is frequently damaged. Unmanned aerial vehicles (UAVs), thanks to merits such as rapid deployment and high mobility, are commonly regarded as an ideal option for constructing temporary communication networks. Considering the limited computing capability and battery power of UAVs, this paper proposes a two-layer UAV cooperative computing offloading strategy for emergency disaster relief scenarios. The multi-agent twin delayed deep deterministic policy gradient (MATD3) algorithm integrated with prioritized experience replay (PER) is used to jointly optimize the scheduling strategies of UAVs, task offloading ratios, and their mobility, aiming to minimize the energy consumption and delay of the system. To address this non-convex optimization problem, a Markov decision process (MDP) has been established. Simulation results demonstrate that, compared with four baseline algorithms, the proposed algorithm exhibits better convergence performance, verifying its feasibility and efficacy.
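The abstract does not describe how PER is integrated with MATD3. The sketch below shows only the commonly used proportional form of prioritized experience replay, in which sampling probability grows with the magnitude of the temporal-difference error. The values of `alpha` and `beta`, the example TD errors, and the choice to normalize weights within the batch are assumptions.

```python
import random


def per_probabilities(priorities, alpha=0.6):
    """Sampling probability of each transition: p_i^alpha / sum_k p_k^alpha."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def per_sample(priorities, batch_size, alpha=0.6, beta=0.4, seed=0):
    """Draw a batch of buffer indices and their importance-sampling weights."""
    rng = random.Random(seed)
    probs = per_probabilities(priorities, alpha)
    n = len(priorities)
    indices = rng.choices(range(n), weights=probs, k=batch_size)
    # Importance-sampling weights correct the bias from non-uniform sampling;
    # dividing by the largest weight in the batch keeps them at most 1.
    raw = [(n * probs[i]) ** (-beta) for i in indices]
    max_w = max(raw)
    return indices, [w / max_w for w in raw]


# Hypothetical TD errors for four stored transitions; a small constant keeps
# every priority strictly positive so no transition becomes unsampleable.
td_errors = [0.1, -2.0, 0.5, 0.05]
priorities = [abs(e) + 1e-3 for e in td_errors]
probs = per_probabilities(priorities)
indices, weights = per_sample(priorities, batch_size=3)
print(probs, indices, weights)
```

A production buffer would typically store priorities in a sum tree so that sampling costs O(log n) rather than O(n).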
Funding: Supported by the Chongqing Medical Scientific Research Project (Joint Project of Chongqing Health Commission and Science and Technology Bureau), No. 2023MSXM060.
Abstract: BACKGROUND: The accurate prediction of lymph node metastasis (LNM) is crucial for managing locally advanced (T3/T4) colorectal cancer (CRC). However, both traditional histopathology and standard slide-level deep learning often fail to capture the sparse and diagnostically critical features of metastatic potential. AIM: To develop and validate a case-level multiple-instance learning (MIL) framework that mimics a pathologist’s comprehensive review and improves T3/T4 CRC LNM prediction. METHODS: The whole-slide images of 130 patients with T3/T4 CRC were retrospectively collected. A case-level MIL framework utilising the CONCH v1.5 and UNI2-h deep learning models was trained on features from all haematoxylin and eosin-stained primary tumour slides for each patient. These pathological features were subsequently integrated with clinical data, and model performance was evaluated using the area under the curve (AUC). RESULTS: The case-level framework demonstrated superior LNM prediction over slide-level training, with the CONCH v1.5 model achieving a mean AUC (±SD) of 0.899±0.033 vs 0.814±0.083, respectively. Integrating pathology features with clinical data further enhanced performance, yielding a top model with a mean AUC of 0.904±0.047, in sharp contrast to a clinical-only model (mean AUC 0.584±0.084). Crucially, a pathologist’s review confirmed that the model-identified high-attention regions correspond to known high-risk histopathological features. CONCLUSION: A case-level MIL framework provides a superior approach for predicting LNM in advanced CRC. This method shows promise for risk stratification and therapy decisions but requires further validation.
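The difference between slide-level and case-level training comes down to how the bag of instances is formed. The sketch below pools patch features from all slides of one patient into a single bag and aggregates them with attention weights. It is illustrative only: the abstract mentions high-attention regions but does not specify the aggregator, so the linear scoring vector `w`, the function names, and the toy 2-D features (standing in for encoder outputs) are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def case_level_bag(slides):
    """Merge the patch features of all slides from one patient into a single bag."""
    return [patch for slide in slides for patch in slide]


def attention_pool(bag, w):
    """Attention-weighted mean of patch features; w is a scoring vector."""
    scores = [sum(wi * xi for wi, xi in zip(w, patch)) for patch in bag]
    attn = softmax(scores)
    dim = len(bag[0])
    pooled = [sum(a * patch[d] for a, patch in zip(attn, bag)) for d in range(dim)]
    return pooled, attn


# Toy patient with two slides; each patch is a 2-D feature vector.
slides = [[[0.1, 0.0], [0.2, 0.1]], [[0.9, 0.8]]]
bag = case_level_bag(slides)
pooled, attn = attention_pool(bag, w=[5.0, 5.0])
print(pooled, attn)
```

In a trained model the scoring function would be learned, and the attention weights are what allow high-scoring patches to be mapped back to tissue regions for pathologist review.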