Backdoor attacks have become an important research topic in federated learning, which is widely applied to processing sensitive datasets. Because federated learning detects or modifies local models through defense mechanisms during aggregation, it is difficult to conduct effective backdoor attacks. In addition, existing backdoor attack methods face challenges such as low backdoor accuracy, poor ability to evade anomaly detection, and unstable model training. To address these challenges, a method called adaptive simulation backdoor attack (ASBA) is proposed. Specifically, ASBA improves the stability of model training by manipulating the local training process and using an adaptive mechanism; it improves the ability of the malicious model to evade anomaly detection by combining large simulation training and clipping; and it improves backdoor accuracy by introducing a stimulus model to amplify the impact of the backdoor in the global model. Extensive comparative experiments under five advanced defense scenarios show that ASBA can effectively evade anomaly detection and achieve high backdoor accuracy in the global model. Furthermore, it exhibits excellent stability and effectiveness after multiple rounds of attacks, outperforming state-of-the-art backdoor attack methods.
Federated Learning (FL) protects data privacy through a distributed training mechanism, yet its decentralized nature also introduces new security vulnerabilities. Backdoor attacks inject malicious triggers into the global model through compromised updates, posing significant threats to model integrity and becoming a key focus in FL security. Existing backdoor attack methods typically embed triggers directly into original images and consider only data heterogeneity, resulting in limited stealth and adaptability. To address the heterogeneity of malicious client devices, this paper proposes a novel backdoor attack method named Capability-Adaptive Shadow Backdoor Attack (CASBA). By incorporating measurements of clients' computational and communication capabilities, CASBA employs a dynamic hierarchical attack strategy that adaptively aligns attack intensity with available resources. Furthermore, an improved deep convolutional generative adversarial network (DCGAN) is integrated into the attack pipeline to embed triggers without modifying original data, significantly enhancing stealthiness. Comparative experiments with Shadow Backdoor Attack (SBA) across multiple scenarios demonstrate that CASBA dynamically adjusts resource consumption based on device capabilities, reducing average memory usage per iteration by 5.8%. CASBA improves resource efficiency while keeping the drop in attack success rate within 3%. Additionally, the effectiveness of CASBA against three robust FL algorithms is validated.
While reinforcement learning-based underwater acoustic adaptive modulation shows promise for enabling environment-adaptive communication, as supported by extensive simulation-based research, its practical performance remains underexplored in field investigations. To evaluate the practical applicability of this emerging technique in adverse shallow sea channels, a field experiment was conducted using three communication modes for reinforcement learning-driven adaptive modulation: orthogonal frequency division multiplexing (OFDM), M-ary frequency-shift keying (MFSK), and direct sequence spread spectrum (DSSS). Specifically, a Q-learning method is used to select the optimal modulation mode according to the channel quality quantified by signal-to-noise ratio, multipath spread length, and Doppler frequency offset. Experimental results demonstrate that the reinforcement learning-based adaptive modulation scheme outperformed fixed threshold detection in terms of total throughput and average bit error rate, surpassing conventional adaptive modulation strategies.
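The mode-selection idea in this abstract can be illustrated with a minimal tabular Q-learning sketch. This is not the authors' implementation: the discretization thresholds, the hyperparameters, and the names `ModulationSelector` and `discretize` are assumptions introduced here for illustration, and the reward is assumed to be supplied by the caller (for example, derived from measured throughput and bit error rate).

```python
import random

MODES = ["OFDM", "MFSK", "DSSS"]  # the three modes named in the abstract


def discretize(snr_db, multipath_ms, doppler_hz):
    """Map channel measurements to a coarse state.

    The thresholds (10 dB, 5 ms, 2 Hz) are hypothetical placeholders.
    """
    return (
        int(snr_db >= 10.0),
        int(multipath_ms >= 5.0),
        int(abs(doppler_hz) >= 2.0),
    )


class ModulationSelector:
    """Tabular Q-learning over (channel state, modulation mode) pairs."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.q = {}             # state -> list of Q-values, one per mode
        self.rng = random.Random(seed)

    def _row(self, state):
        return self.q.setdefault(state, [0.0] * len(MODES))

    def select(self, state, greedy=False):
        """Return the index of the mode to use (epsilon-greedy by default)."""
        row = self._row(state)
        if not greedy and self.rng.random() < self.epsilon:
            return self.rng.randrange(len(MODES))
        return max(range(len(MODES)), key=row.__getitem__)

    def update(self, state, action, reward, next_state):
        """Standard Q-learning update toward reward + gamma * max Q(next)."""
        row = self._row(state)
        target = reward + self.gamma * max(self._row(next_state))
        row[action] += self.alpha * (target - row[action])
```

In use, each transmission would call `select` on the current state, transmit with the chosen mode, then call `update` with the observed reward and the newly measured state.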
Knowledge distillation has become a standard technique for compressing large language models into efficient student models, but existing methods often struggle to balance prediction accuracy with explanation quality. Recent approaches such as Distilling Step-by-Step (DSbS) introduce explanation supervision, yet they apply it in a uniform manner that may not fully exploit the different learning dynamics of prediction and explanation. In this work, we propose a task-structured curriculum learning (TSCL) framework that structures training into three sequential phases: (i) prediction-only, to establish stable feature representations; (ii) joint prediction-explanation, to align task outputs with rationale generation; and (iii) explanation-only, to refine the quality of rationales. This design provides a simple but effective modification to DSbS, requiring no architectural changes and adding negligible training cost. We justify the phase scheduling with ablation studies and convergence analysis, showing that an initial prediction-heavy stage followed by a balanced joint phase improves both stability and explanation alignment. Extensive experiments on five datasets (e-SNLI, ANLI, CommonsenseQA, SVAMP, and MedNLI) demonstrate that TSCL consistently outperforms strong baselines, achieving gains of 1.7 to 2.6 points in accuracy and 0.8 to 1.2 in ROUGE-L, corresponding to relative error reductions of up to 21%. Beyond lexical metrics, human evaluation and ERASER-style faithfulness diagnostics confirm that TSCL produces more faithful and informative explanations. Comparative training curves further reveal faster convergence and lower variance across seeds. Efficiency analysis shows less than 3% overhead in wall-clock training time and no additional inference cost, making the approach practical for real-world deployment. This study demonstrates that a simple task-structured curriculum can significantly improve the effectiveness of knowledge distillation. By separating and sequencing objectives, TSCL achieves a better balance between accuracy, stability, and explanation quality. The framework generalizes across domains, including medical NLI, and offers a principled recipe for future applications in multimodal reasoning and reinforcement learning.
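The three-phase schedule described in this abstract can be sketched as a step-dependent weighting of the two loss terms. The abstract gives neither the phase lengths nor the loss weights, so the defaults below (`phase_steps=(300, 500, 200)` and unit weights in the joint phase) and the function names are assumptions for illustration only.

```python
def tscl_loss_weights(step, phase_steps=(300, 500, 200)):
    """Return (prediction_weight, explanation_weight) for a training step.

    phase_steps holds hypothetical lengths, in steps, of the
    prediction-only, joint, and explanation-only phases.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    pred_only, joint, _expl_only = phase_steps
    if step < pred_only:
        return (1.0, 0.0)  # phase (i): prediction-only
    if step < pred_only + joint:
        return (1.0, 1.0)  # phase (ii): joint prediction-explanation
    return (0.0, 1.0)      # phase (iii): explanation-only


def tscl_loss(pred_loss, expl_loss, step, phase_steps=(300, 500, 200)):
    """Combine the two task losses according to the current phase."""
    w_pred, w_expl = tscl_loss_weights(step, phase_steps)
    return w_pred * pred_loss + w_expl * expl_loss
```

Because only the loss weighting changes, a schedule of this kind needs no architectural modification, which is consistent with the abstract's claim of negligible added training cost.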
Underground engineering projects such as deep tunnel excavation often encounter rockburst disasters accompanied by numerous microseismic events. Rapid interpretation of microseismic signals is crucial for the timely identification of rockbursts. However, conventional processing encompasses multi-step workflows, including classification, denoising, picking, locating, and computational analysis, coupled with manual intervention, which collectively compromise the reliability of early warnings. To address these challenges, this study proposes the "microseismic stethoscope", a multi-task machine learning and deep learning model designed for the automated processing of massive microseismic signals. This model efficiently extracts three key parameters that are necessary for recognizing rockburst disasters: rupture location, microseismic energy, and moment magnitude. Specifically, the model extracts raw waveform features through three dedicated sub-networks: a classifier for source zone classification, and two regressors for microseismic energy and moment magnitude estimation. This model demonstrates superior efficiency compared to traditional processing and semi-automated processing, reducing per-event processing time from 0.71 s and 0.49 s, respectively, to merely 0.036 s. It concurrently achieves 98% accuracy in source zone classification, with microseismic energy and moment magnitude estimation errors of 0.13 and 0.05, respectively. This model has been applied and validated in the Daxiagu Tunnel case in Sichuan, China. The application results indicate that the model is as accurate as traditional methods in determining source parameters, and thus can be used to identify potential geomechanical processes of rockburst disasters. By enhancing the signal processing reliability of microseismic events, the proposed model presents a significant advancement in the identification of rockburst disasters.
Surface properties of crystals are critical in many fields, including electrochemistry and photoelectronics, the efficient prediction of which can expedite the design and optimization of catalysts, batteries, alloys, etc. However, we are still far from realizing this vision due to the rarity of surface property-related databases, especially for multicomponent compounds, due to the large sample spaces and limited computing resources. In this work, we present a surface emphasized multi-task crystal graph convolutional neural network (SEM-CGCNN) to predict multiple surface properties simultaneously from crystal structures. The model is evaluated on a dataset of 3526 surface energies and work functions of binary magnesium intermetallics obtained through first-principles calculations, and obvious improvements are observed both in efficiency and accuracy over the original CGCNN model. By transferring the pre-trained model to the datasets of pure metals and other intermetallics, the fine-tuned SEM-CGCNN outperforms learning from scratch and can be further applied to other surface properties and materials systems. This study could be a paradigm for the end-to-end mapping of atomic structures to anisotropic surface properties of crystals, which provides an efficient framework to understand and screen materials with desired surface characteristics.
Reconfigurable intelligent surfaces (RISs) have been cast as a promising alternative to alleviate blockage vulnerability and enhance coverage capability for terahertz (THz) communications. Owing to large-scale array elements at transceivers and RISs, codebook-based beamforming can be utilized in a computationally efficient manner. However, the codeword selection for analog beamforming is an intractable combinatorial optimization (CO) problem. To this end, by taking the CO problem as a classification problem, a multi-task learning based analog beam selection (MTL-ABS) framework is developed to implement cooperative beam selection concurrently at transceivers and RISs. In addition, a residual network and a self-attention mechanism are used to combat network degradation and mine intrinsic THz channel features. Finally, the network convergence is analyzed from a blockwise perspective, and numerical results demonstrate that the MTL-ABS framework greatly decreases the beam selection overhead and achieves a near-optimal sum-rate compared with heuristic search based counterparts.
To address the issue of scarce labeled samples and operational condition variations that degrade the accuracy of fault diagnosis models in variable-condition gearbox fault diagnosis, this paper proposes a semi-supervised masked contrastive learning and domain adaptation (SSMCL-DA) method for gearbox fault diagnosis under variable conditions. Initially, during the unsupervised pre-training phase, a dual signal augmentation strategy is devised, which simultaneously applies random masking in the time domain and random scaling in the frequency domain to unlabeled samples, thereby constructing more challenging positive sample pairs to guide the encoder in learning intrinsic features robust to condition variations. Subsequently, a ConvNeXt-Transformer hybrid architecture is employed, integrating the superior local detail modeling capacity of ConvNeXt with the robust global perception capability of Transformer to enhance feature extraction in complex scenarios. Thereafter, a contrastive learning model is constructed with the optimization objective of maximizing feature similarity across different masked instances of the same sample, enabling the extraction of consistent features from multiple masked perspectives and reducing reliance on labeled data. In the final supervised fine-tuning phase, a multi-scale attention mechanism is incorporated for feature rectification, and a domain adaptation module combining Local Maximum Mean Discrepancy (LMMD) with adversarial learning is proposed. This module embodies a dual mechanism: LMMD facilitates fine-grained class-conditional alignment, compelling features of identical fault classes to converge across varying conditions, while the domain discriminator utilizes adversarial training to guide the feature extractor toward learning domain-invariant features. Working in concert, they markedly diminish feature distribution discrepancies induced by changes in load, rotational speed, and other factors, thereby boosting the model's adaptability to cross-condition scenarios. Experimental evaluations on the WT planetary gearbox dataset and the Case Western Reserve University (CWRU) bearing dataset demonstrate that the SSMCL-DA model effectively identifies multiple fault classes in gearboxes, with diagnostic performance substantially surpassing that of conventional methods. Under cross-condition scenarios, the model attains fault diagnosis accuracies of 99.21% for the WT planetary gearbox and 99.86% for the bearings, respectively. Furthermore, the model exhibits stable generalization capability in cross-device settings.
Existing traditional ocean vertical-mixing schemes are empirically developed without a thorough understanding of the physical processes involved, resulting in a discrepancy between the parameterization and forecast results. The uncertainty in ocean-mixing parameterization is primarily responsible for the bias in ocean models. Benefiting from deep-learning technology, we design the Adaptive Fully Connected Module with an Inception module as the baseline to minimize bias. It adaptively extracts the best features through fully connected layers with different widths, and better learns the nonlinear relationship between input variables and parameterization fields. Moreover, to obtain more accurate results, we impose the KPP (K-Profile Parameterization) and PP (Pacanowski–Philander) schemes as physical constraints so that the network parameterization process follows the basic physical laws more closely. Because model data are calculated from human experience and lack some unknown physical processes, they may differ from the actual data; we therefore use a decade-long record of hydrological and turbulence observations in the tropical Pacific Ocean as training data. Combining physical constraints and a nonlinear activation function, our method captures nonlinear changes and better adapts to the ocean-mixing parameterization process. The use of physical constraints can improve the final results.
Federated learning combined with edge computing has greatly facilitated transportation in real-time applications such as intelligent traffic systems. However, synchronous federated learning is inefficient in terms of time and convergence speed, making it unsuitable for high real-time requirements. To address these issues, this paper proposes an Adaptive Waiting time Asynchronous Federated Learning (AWTAFL) based on Dueling Double Deep Q-Network (D3QN). The server dynamically adjusts the waiting time using the D3QN algorithm based on the current task progress and energy consumption, aiming to accelerate convergence and save energy. Additionally, this paper presents a new federated learning global aggregation scheme, where the central server performs weighted aggregation based on the freshness and contribution of client parameters. Experimental simulations demonstrate that the proposed algorithm significantly reduces the convergence time while ensuring model quality and effectively reducing energy consumption in asynchronous federated learning. Furthermore, the improved global aggregation update method enhances training stability and reduces oscillations in the global model convergence.
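To make the aggregation idea concrete, the following is a minimal sketch of a weighted average in which each client's weight depends on freshness and contribution. The abstract does not give the weighting formula, so the exponential staleness decay, the `decay` parameter, and the function name `aggregate` are assumptions, not the AWTAFL scheme itself.

```python
def aggregate(updates, current_round, decay=0.5):
    """Freshness- and contribution-weighted average of client parameters.

    updates: list of (params, round_sent, contribution) tuples, where
    params is a list of floats, round_sent is the round in which the
    client produced the parameters, and contribution is a non-negative
    score (for example, a data-size or quality measure).
    The weight contribution * decay ** staleness is a hypothetical choice.
    """
    if not updates:
        raise ValueError("no client updates to aggregate")
    weights = [
        contribution * decay ** (current_round - round_sent)
        for _params, round_sent, contribution in updates
    ]
    total = sum(weights)
    if total <= 0:
        raise ValueError("all aggregation weights are zero")
    dim = len(updates[0][0])
    return [
        sum(w * params[i] for w, (params, _r, _c) in zip(weights, updates)) / total
        for i in range(dim)
    ]
```

With `decay=0.5`, an update that is one round stale counts half as much as a fresh update with the same contribution score, which damps the influence of outdated parameters on the global model.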
Accurate and reliable photovoltaic (PV) modeling is crucial for the performance evaluation, control, and optimization of PV systems. However, existing methods for PV parameter identification often suffer from limitations in accuracy and efficiency. To address these challenges, we propose an adaptive multi-learning cooperation search algorithm (AMLCSA) for efficient identification of unknown parameters in PV models. AMLCSA is a novel algorithm inspired by teamwork behaviors in modern enterprises. It enhances the original cooperation search algorithm in two key aspects: (i) an adaptive multi-learning strategy that dynamically adjusts search ranges using adaptive weights, allowing better individuals to focus on local exploitation while guiding poorer individuals toward global exploration; and (ii) a chaotic grouping reflection strategy that introduces chaotic sequences to enhance population diversity and improve search performance. The effectiveness of AMLCSA is demonstrated on single-diode, double-diode, and three PV-module models. Simulation results show that AMLCSA offers significant advantages in convergence, accuracy, and stability compared to existing state-of-the-art algorithms.
An adaptive, robust, and secure framework plays a vital role in implementing the intelligent automation and decentralized decision-making of Industry 5.0. Latency, privacy risks, and the complexity of industrial networks have hindered traditional cloud-based learning systems. To overcome these challenges, we propose EdgeGuard-IoT, a 6G edge intelligence framework that integrates Secure Federated Learning (SFL) and Adaptive Anomaly Detection (AAD) at the edge to enhance the cybersecurity and operational resilience of the smart grid. With the ultra-reliable low latency communication (URLLC) of 6G, artificial intelligence-based network orchestration, and massive machine type communication (mMTC), EdgeGuard-IoT brings real-time, distributed intelligence to the edge, mitigates risks in data transmission, and enhances privacy. Using a hierarchical federated learning framework, EdgeGuard-IoT helps edge devices collaboratively train models without revealing sensitive grid data, which is crucial in the smart grid, where real-time power anomaly detection and decentralized energy management are essential. The adaptive anomaly detection mechanism, driven by hybrid AI models, raises an alert immediately when grid stability and strength are negatively affected by cyber threats, faults, or energy distribution issues, thereby keeping the grid stable and resilient. The proposed framework also adopts security measures based on blockchain and zero-trust authentication to reduce the risks of adversarial attacks and model poisoning during federated learning. Extensive simulations and deployment in real-world smart grid case studies show that EdgeGuard-IoT achieves superior detection accuracy, response time, and scalability at a much reduced communication overhead. This research pioneers a 6G-driven federated intelligence model designed for secure, self-optimizing, and resilient Industry 5.0 ecosystems, paving the way for next-generation autonomous smart grids and industrial cyber-physical systems.
The rapid growth of Internet of Things devices and the emergence of rapidly evolving network threats have made traditional security assessment methods inadequate. Federated learning offers a promising solution to expedite the training of security assessment models. However, ensuring the trustworthiness and robustness of federated learning under multi-party collaboration scenarios remains a challenge. To address these issues, this study proposes a shard aggregation network structure and a malicious node detection mechanism, along with improvements to the federated learning training process. First, we extract the data features of the participants by using spectral clustering methods combined with a Gaussian kernel function. Then, we introduce a multi-objective decision-making approach that combines data distribution consistency, consensus communication overhead, and consensus result reliability in order to determine the final network sharing scheme. Finally, by integrating the federated learning aggregation process with the malicious node detection mechanism, we improve the traditional decentralized learning process. Our proposed ShardFed algorithm outperforms conventional classification algorithms and state-of-the-art machine learning methods like FedProx and FedCurv in convergence speed, robustness against data interference, and adaptability across multiple scenarios. Experimental results demonstrate that the proposed approach improves model accuracy by up to 2.33% under non-independent and identically distributed data conditions, maintains higher performance with malicious nodes containing poisoned data ratios of 20%–50%, and significantly enhances model resistance to low-quality data.
In this paper, the containment control problem in nonlinear multi-agent systems (NMASs) under denial-of-service (DoS) attacks is addressed. Firstly, a prediction model is obtained using the broad learning technique to train historical data generated by the system offline without DoS attacks. Secondly, the dynamic linearization method is used to obtain the equivalent linearization model of NMASs. Then, a novel model-free adaptive predictive control (MFAPC) framework based on historical and online data generated by the system is proposed, which combines the trained prediction model with the model-free adaptive control method. The development of the MFAPC method motivates a much simpler robust predictive control solution that is convenient to use in the case of DoS attacks. Meanwhile, the MFAPC algorithm provides a unified predictive framework for solving consensus tracking and containment control problems. The boundedness of the containment error can be proven by using the contraction mapping principle and the mathematical induction method. Finally, the proposed MFAPC is assessed through comparative experiments.
An adaptive topology learning approach is proposed to learn the topology of a practical camera network in an unsupervised way. The nodes are modeled by the Gaussian mixture model. The connectivity between nodes is judged by their cross-correlation function, which is also used to calculate their transition time distribution. The mutual information of the connected node pair is employed for transition probability calculation. A false link eliminating approach is proposed, along with a topology updating strategy to improve the learned topology. A real monitoring system with five disjoint cameras is built for experiments. Comparative results with traditional methods show that the proposed method is more accurate in topology learning and is more robust to environmental changes.
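As a rough illustration of the cross-correlation step, the sketch below correlates a series of departure counts at one camera node with arrival counts at another over a range of time lags, and reads the dominant lag as the typical transition time. The connectivity test (peak at least `min_peak_ratio` times the mean score) and all function names are assumptions made here; the abstract does not specify the decision rule.

```python
def cross_correlation(departures, arrivals, max_lag):
    """Unnormalized cross-correlation for lags 0..max_lag.

    departures[t] and arrivals[t] are object counts per time bin.
    """
    n = min(len(departures), len(arrivals))
    return [
        sum(departures[t] * arrivals[t + lag] for t in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def estimate_transition(departures, arrivals, max_lag, min_peak_ratio=2.0):
    """Return (connected, peak_lag) for a pair of camera nodes.

    A pair is treated as connected when the correlation peak stands out
    from the mean score by the hypothetical factor min_peak_ratio.
    """
    scores = cross_correlation(departures, arrivals, max_lag)
    peak_lag = max(range(len(scores)), key=scores.__getitem__)
    mean = sum(scores) / len(scores)
    connected = mean > 0 and scores[peak_lag] >= min_peak_ratio * mean
    return connected, peak_lag
```

The full list of scores, suitably normalized, could also serve as an empirical transition time distribution rather than a single peak.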
Deep neural networks (DNNs) are effective in solving both forward and inverse problems for nonlinear partial differential equations (PDEs). However, conventional DNNs are not effective in handling problems such as delay differential equations (DDEs) and delay integrodifferential equations (DIDEs) with constant delays, primarily due to their low regularity at delay-induced breaking points. In this paper, a DNN method combined with multi-task learning (MTL) is proposed to solve both the forward and inverse problems of DIDEs. The core idea of this approach is to divide the original equation into multiple tasks based on the delay, using auxiliary outputs to represent the integral terms, followed by the use of MTL to seamlessly incorporate the properties at the breaking points into the loss function. Furthermore, given the increased training difficulty associated with multiple tasks and outputs, we employ a sequential training scheme to reduce training complexity and provide reference solutions for subsequent tasks. This approach significantly enhances the approximation accuracy of solving DIDEs with DNNs, as demonstrated by comparisons with traditional DNN methods. We validate the effectiveness of this method through several numerical experiments, test various parameter sharing structures in MTL, and compare the testing results of these structures. Finally, this method is implemented to solve the inverse problem of a nonlinear DIDE, and the results show that the unknown parameters of the DIDE can be discovered with sparse or noisy data.
The rapidly evolving cybersecurity threat landscape exposes a critical flaw in traditional educational programs, where static curricula cannot adapt swiftly to novel attack vectors. This creates a significant gap between theoretical knowledge and the practical defensive capabilities needed in the field. To address this, we propose TeachSecure-CTI, a novel framework for adaptive cybersecurity curriculum generation that integrates real-time Cyber Threat Intelligence (CTI) with AI-driven personalization. Our framework employs a layered architecture featuring a CTI ingestion and clustering module, natural language processing for semantic concept extraction, and a reinforcement learning agent for adaptive content sequencing. By dynamically aligning learning materials with both the evolving threat environment and individual learner profiles, TeachSecure-CTI ensures content remains current, relevant, and tailored. A 12-week study with 150 students across three institutions demonstrated that the framework improves learning gains by 34%, significantly exceeding the 12%–21% reported in recent literature. The system achieved 84.8% personalization accuracy, 85.9% recognition accuracy for MITRE ATT&CK tactics, and a 31% faster competency development rate compared to static curricula. These findings have implications beyond academia, extending to workforce development, cyber range training, and certification programs. By bridging the gap between dynamic threats and static educational materials, TeachSecure-CTI offers an empirically validated, scalable solution for cultivating cybersecurity professionals capable of responding to modern threats.
To enhance speech emotion recognition capability, this study constructs a speech emotion recognition model integrating the adaptive acoustic mixup (AAM) and improved coordinate and shuffle attention (ICASA) methods. The AAM method optimizes data augmentation by combining a sample selection strategy and dynamic interpolation coefficients, thus enabling information fusion of speech data with different emotions at the acoustic level. The ICASA method enhances feature extraction capability through dynamic fusion of the improved coordinate attention (ICA) and shuffle attention (SA) techniques. The ICA technique reduces computational overhead by employing depthwise separable convolution and an h-swish activation function, and captures long-range dependencies of multi-scale time-frequency features using the attention weights. The SA technique promotes feature interaction through channel shuffling, which helps the model learn richer and more discriminative emotional features. Experimental results demonstrate that, compared to the baseline model, the proposed model improves the weighted accuracy by 5.42% and 4.54%, and the unweighted accuracy by 3.37% and 3.85%, on the IEMOCAP and RAVDESS datasets, respectively. These improvements were confirmed to be statistically significant by independent samples t-tests, further supporting the practical reliability and applicability of the proposed model in real-world emotion-aware speech systems.
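For readers unfamiliar with mixup, the sketch below shows the standard interpolation that AAM builds on: two feature vectors and their one-hot labels are blended with a coefficient `lam`. It does not reproduce AAM's sample selection strategy or its dynamic coefficients, which the abstract does not detail; drawing `lam` from a Beta distribution follows the original mixup formulation, and `alpha=0.4` is an arbitrary assumed value.

```python
import random


def mixup(x_a, y_a, x_b, y_b, lam):
    """Blend two samples: lam * a + (1 - lam) * b, for features and labels."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


def sample_lambda(alpha=0.4, rng=random):
    """Draw an interpolation coefficient from Beta(alpha, alpha)."""
    return rng.betavariate(alpha, alpha)
```

An adaptive variant would replace `sample_lambda` with a rule that depends on the selected sample pair, which is where a method such as AAM would differ from this baseline.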
This paper presents a hierarchical formation control strategy to address the challenges of formation control for multiple Unmanned Aerial Vehicles (UAVs) within a cooperative consensus framework. The proposed strategy incorporates a reference command generation layer, which derives UAV attitude commands based on formation requirements, and a tracking control layer to ensure accurate execution. Collaborative variables, including trajectory position and flight speed, are defined using a three-dimensional track particle and autopilot model, enabling the development of a consensus-based formation control law. Desired attitude angles are computed through altitude-hold and coordinated-turn strategies. A sliding surface is designed based on reference models derived from flight quality metrics, while an adaptive controller compensates for aerodynamic model uncertainties. To enhance learning capabilities, a prediction error mechanism based on a series-parallel estimation model is introduced, enabling collaborative learning and the sharing of network weight estimation parameters within the multi-agent system. This facilitates the design of a distributed composite learning law. Lyapunov stability analysis confirms the local exponential stability of the tracking error. Simulations of a twelve-UAV formation, along with a comparative analysis of two algorithms, demonstrate the system's capability for formation maintenance and high-precision tracking control.
Cooperative multi-agent reinforcement learning(MARL)is a key technology for enabling cooperation in complex multi-agent systems.It has achieved remarkable progress in areas such as gaming,autonomous driving,and multi-...Cooperative multi-agent reinforcement learning(MARL)is a key technology for enabling cooperation in complex multi-agent systems.It has achieved remarkable progress in areas such as gaming,autonomous driving,and multi-robot control.Empowering cooperative MARL with multi-task decision-making capabilities is expected to further broaden its application scope.In multi-task scenarios,cooperative MARL algorithms need to address 3 types of multi-task problems:reward-related multi-task,arising from different reward functions;multi-domain multi-task,caused by differences in state and action spaces,state transition functions;and scalability-related multi-task,resulting from the dynamic variation in the number of agents.Most existing studies focus on scalability-related multitask problems.However,with the increasing integration between large language models(LLMs)and multi-agent systems,a growing number of LLM-based multi-agent systems have emerged,enabling more complex multi-task cooperation.This paper provides a comprehensive review of the latest advances in this field.By combining multi-task reinforcement learning with cooperative MARL,we categorize and analyze the 3 major types of multi-task problems under multi-agent settings,offering more fine-grained classifications and summarizing key insights for each.In addition,we summarize commonly used benchmarks and discuss future directions of research in this area,which hold promise for further enhancing the multi-task cooperation capabilities of multi-agent systems and expanding their practical applications in the real world.展开更多
Abstract: In federated learning, backdoor attacks have become an important research topic given the wide application of federated learning to sensitive datasets. Because federated learning detects or modifies local models through defense mechanisms during aggregation, it is difficult to conduct effective backdoor attacks. In addition, existing backdoor attack methods face challenges such as low backdoor accuracy, poor ability to evade anomaly detection, and unstable model training. To address these challenges, a method called adaptive simulation backdoor attack (ASBA) is proposed. Specifically, ASBA improves the stability of model training by manipulating the local training process and using an adaptive mechanism, improves the ability of the malicious model to evade anomaly detection by combining large simulation training and clipping, and improves the backdoor accuracy by introducing a stimulus model to amplify the impact of the backdoor in the global model. Extensive comparative experiments under five advanced defense scenarios show that ASBA can effectively evade anomaly detection and achieve high backdoor accuracy in the global model. Furthermore, it exhibits excellent stability and effectiveness after multiple rounds of attacks, outperforming state-of-the-art backdoor attack methods.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 62172123) and the Key Research and Development Program of Heilongjiang Province, China (Grant No. 2022ZX01A36).
Abstract: Federated Learning (FL) protects data privacy through a distributed training mechanism, yet its decentralized nature also introduces new security vulnerabilities. Backdoor attacks inject malicious triggers into the global model through compromised updates, posing significant threats to model integrity and becoming a key focus in FL security. Existing backdoor attack methods typically embed triggers directly into original images and consider only data heterogeneity, resulting in limited stealth and adaptability. To address the heterogeneity of malicious client devices, this paper proposes a novel backdoor attack method named Capability-Adaptive Shadow Backdoor Attack (CASBA). By incorporating measurements of clients’ computational and communication capabilities, CASBA employs a dynamic hierarchical attack strategy that adaptively aligns attack intensity with available resources. Furthermore, an improved deep convolutional generative adversarial network (DCGAN) is integrated into the attack pipeline to embed triggers without modifying original data, significantly enhancing stealthiness. Comparative experiments with Shadow Backdoor Attack (SBA) across multiple scenarios demonstrate that CASBA dynamically adjusts resource consumption based on device capabilities, reducing average memory usage per iteration by 5.8%. CASBA improves resource efficiency while keeping the drop in attack success rate within 3%. Additionally, the effectiveness of CASBA against three robust FL algorithms is validated.
Funding: Funding from the National Key Research and Development Program of China (No. 2018YFE0110000), the National Natural Science Foundation of China (No. 11274259, No. 11574258), and the Science and Technology Commission Foundation of Shanghai (21DZ1205500) in support of the present research.
Abstract: While reinforcement learning-based underwater acoustic adaptive modulation shows promise for enabling environment-adaptive communication, as supported by extensive simulation-based research, its practical performance remains underexplored in field investigations. To evaluate the practical applicability of this emerging technique in adverse shallow sea channels, a field experiment was conducted using three communication modes for reinforcement learning-driven adaptive modulation: orthogonal frequency division multiplexing (OFDM), M-ary frequency-shift keying (MFSK), and direct sequence spread spectrum (DSSS). Specifically, a Q-learning method is used to select the optimal modulation mode according to the channel quality quantified by signal-to-noise ratio, multipath spread length, and Doppler frequency offset. Experimental results demonstrate that the reinforcement learning-based adaptive modulation scheme outperformed fixed threshold detection in terms of total throughput and average bit error rate, surpassing conventional adaptive modulation strategies.
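The mode-selection step described in this abstract can be illustrated with a minimal tabular Q-learning sketch. It is not the experiment's implementation: the discretization thresholds, the hyperparameters, and the names `discretize` and `ModeSelector` are all hypothetical, and the reward (for example, throughput penalized by bit errors) is left to the caller.

```python
import random

MODES = ["OFDM", "MFSK", "DSSS"]  # the three modes named in the abstract


def discretize(snr_db, multipath_ms, doppler_hz):
    """Map channel measurements to a coarse state (thresholds are illustrative assumptions)."""
    return (
        int(snr_db >= 10.0),
        int(multipath_ms >= 5.0),
        int(abs(doppler_hz) >= 2.0),
    )


class ModeSelector:
    """Tabular Q-learning over (channel state, modulation mode) pairs."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = {}  # state -> list of Q-values, one per mode
        self.rng = random.Random(seed)

    def _row(self, state):
        return self.q.setdefault(state, [0.0] * len(MODES))

    def choose(self, state):
        """Epsilon-greedy choice; returns an index into MODES."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(MODES))
        row = self._row(state)
        return row.index(max(row))

    def update(self, state, action, reward, next_state):
        """Standard Q-learning update toward reward + gamma * max Q(next_state, .)."""
        row = self._row(state)
        target = reward + self.gamma * max(self._row(next_state))
        row[action] += self.alpha * (target - row[action])
```

In use, each transmitted packet would yield one `(state, action, reward, next_state)` tuple, so the table gradually favors the mode that performs best in each channel condition.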
Abstract: Knowledge distillation has become a standard technique for compressing large language models into efficient student models, but existing methods often struggle to balance prediction accuracy with explanation quality. Recent approaches such as Distilling Step-by-Step (DSbS) introduce explanation supervision, yet they apply it in a uniform manner that may not fully exploit the different learning dynamics of prediction and explanation. In this work, we propose a task-structured curriculum learning (TSCL) framework that structures training into three sequential phases: (i) prediction-only, to establish stable feature representations; (ii) joint prediction-explanation, to align task outputs with rationale generation; and (iii) explanation-only, to refine the quality of rationales. This design provides a simple but effective modification to DSbS, requiring no architectural changes and adding negligible training cost. We justify the phase scheduling with ablation studies and convergence analysis, showing that an initial prediction-heavy stage followed by a balanced joint phase improves both stability and explanation alignment. Extensive experiments on five datasets (e-SNLI, ANLI, CommonsenseQA, SVAMP, and MedNLI) demonstrate that TSCL consistently outperforms strong baselines, achieving gains of +1.7-2.6 points in accuracy and 0.8-1.2 in ROUGE-L, corresponding to relative error reductions of up to 21%. Beyond lexical metrics, human evaluation and ERASER-style faithfulness diagnostics confirm that TSCL produces more faithful and informative explanations. Comparative training curves further reveal faster convergence and lower variance across seeds. Efficiency analysis shows less than 3% overhead in wall-clock training time and no additional inference cost, making the approach practical for real-world deployment. This study demonstrates that a simple task-structured curriculum can significantly improve the effectiveness of knowledge distillation. By separating and sequencing objectives, TSCL achieves a better balance between accuracy, stability, and explanation quality. The framework generalizes across domains, including medical NLI, and offers a principled recipe for future applications in multimodal reasoning and reinforcement learning.
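The three-phase schedule in this abstract can be pictured as a step-dependent weighting of two losses. The sketch below is illustrative only: the abstract gives no phase boundaries or joint-phase weights, so the `boundaries=(0.3, 0.8)` split, the equal 0.5/0.5 weighting, and the function names are assumptions.

```python
def phase_weights(step, total_steps, boundaries=(0.3, 0.8)):
    """Return (prediction_weight, explanation_weight) for a training step.

    `boundaries` are assumed fractions of training at which the schedule
    moves from phase (i) to (ii) and from phase (ii) to (iii).
    """
    if not 0 <= step < total_steps:
        raise ValueError("step must be in [0, total_steps)")
    progress = step / total_steps
    if progress < boundaries[0]:
        return 1.0, 0.0  # phase (i): prediction-only
    if progress < boundaries[1]:
        return 0.5, 0.5  # phase (ii): joint prediction-explanation
    return 0.0, 1.0      # phase (iii): explanation-only


def combined_loss(pred_loss, expl_loss, step, total_steps):
    """Weight the two task losses according to the current phase."""
    w_pred, w_expl = phase_weights(step, total_steps)
    return w_pred * pred_loss + w_expl * expl_loss
```

Because only the loss weights change between phases, a schedule of this kind needs no architectural changes, which is consistent with the abstract's description of the method as a modification of training alone.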
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 42130719 and 42177173) and the Doctoral Direct Train Project of Chongqing Natural Science Foundation (Grant No. CSTB2023NSCQ-BSX0029).
Abstract: Underground engineering projects such as deep tunnel excavation often encounter rockburst disasters accompanied by numerous microseismic events. Rapid interpretation of microseismic signals is crucial for the timely identification of rockbursts. However, conventional processing encompasses multi-step workflows, including classification, denoising, picking, locating, and computational analysis, coupled with manual intervention, which collectively compromise the reliability of early warnings. To address these challenges, this study proposes the "microseismic stethoscope", a multi-task machine learning and deep learning model designed for the automated processing of massive microseismic signals. This model efficiently extracts three key parameters that are necessary for recognizing rockburst disasters: rupture location, microseismic energy, and moment magnitude. Specifically, the model extracts raw waveform features through three dedicated sub-networks: a classifier for source zone classification, and two regressors for microseismic energy and moment magnitude estimation. This model demonstrates superior efficiency compared to traditional processing and semi-automated processing, reducing per-event processing time from 0.71 s and 0.49 s, respectively, to merely 0.036 s. It concurrently achieves 98% accuracy in source zone classification, with microseismic energy and moment magnitude estimation errors of 0.13 and 0.05, respectively. This model has been applied and validated in the Daxiagu Tunnel case in Sichuan, China. The application results indicate that the model is as accurate as traditional methods in determining source parameters, and thus can be used to identify potential geomechanical processes of rockburst disasters. By enhancing the signal processing reliability of microseismic events, the proposed model presents a significant advancement in the identification of rockburst disasters.
Funding: Supported by the National Key R&D Program (No. 2021YFB3501002) of the Ministry of Science and Technology of China and the National Natural Science Foundation of China (No. 51825101, 52127801).
Abstract: Surface properties of crystals are critical in many fields, including electrochemistry and photoelectronics, and their efficient prediction can expedite the design and optimization of catalysts, batteries, alloys, etc. However, we are still far from realizing this vision because surface property-related databases are rare, especially for multicomponent compounds, owing to the large sample spaces and limited computing resources. In this work, we present a surface emphasized multi-task crystal graph convolutional neural network (SEM-CGCNN) to predict multiple surface properties simultaneously from crystal structures. The model is evaluated on a dataset of 3526 surface energies and work functions of binary magnesium intermetallics obtained through first-principles calculations, and clear improvements are observed in both efficiency and accuracy over the original CGCNN model. By transferring the pre-trained model to the datasets of pure metals and other intermetallics, the fine-tuned SEM-CGCNN outperforms learning from scratch and can be further applied to other surface properties and materials systems. This study could be a paradigm for the end-to-end mapping of atomic structures to anisotropic surface properties of crystals, which provides an efficient framework to understand and screen materials with desired surface characteristics.
Abstract: Reconfigurable intelligent surfaces (RISs) have been cast as a promising alternative to alleviate blockage vulnerability and enhance coverage capability for terahertz (THz) communications. Owing to large-scale array elements at transceivers and RIS, codebook-based beamforming can be utilized in a computationally efficient manner. However, the codeword selection for analog beamforming is an intractable combinatorial optimization (CO) problem. To this end, by treating the CO problem as a classification problem, a multi-task learning based analog beam selection (MTL-ABS) framework is developed to implement cooperative beam selection concurrently at transceivers and RIS. In addition, a residual network and a self-attention mechanism are used to combat network degradation and mine intrinsic THz channel features. Finally, the network convergence is analyzed from a blockwise perspective, and numerical results demonstrate that the MTL-ABS framework greatly decreases the beam selection overhead and achieves a near-optimal sum-rate compared with heuristic search based counterparts.
Funding: Supported by the National Natural Science Foundation of China Funded Project (Project Name: Research on Robust Adaptive Allocation Mechanism of Human Machine Co-Driving System Based on NMS Features, Project Approval Number: 52172381).
Abstract: To address the issue of scarce labeled samples and operational condition variations that degrade the accuracy of fault diagnosis models in variable-condition gearbox fault diagnosis, this paper proposes a semi-supervised masked contrastive learning and domain adaptation (SSMCL-DA) method for gearbox fault diagnosis under variable conditions. Initially, during the unsupervised pre-training phase, a dual signal augmentation strategy is devised, which simultaneously applies random masking in the time domain and random scaling in the frequency domain to unlabeled samples, thereby constructing more challenging positive sample pairs to guide the encoder in learning intrinsic features robust to condition variations. Subsequently, a ConvNeXt-Transformer hybrid architecture is employed, integrating the superior local detail modeling capacity of ConvNeXt with the robust global perception capability of Transformer to enhance feature extraction in complex scenarios. Thereafter, a contrastive learning model is constructed with the optimization objective of maximizing feature similarity across different masked instances of the same sample, enabling the extraction of consistent features from multiple masked perspectives and reducing reliance on labeled data. In the final supervised fine-tuning phase, a multi-scale attention mechanism is incorporated for feature rectification, and a domain adaptation module combining Local Maximum Mean Discrepancy (LMMD) with adversarial learning is proposed. This module embodies a dual mechanism: LMMD facilitates fine-grained class-conditional alignment, compelling features of identical fault classes to converge across varying conditions, while the domain discriminator utilizes adversarial training to guide the feature extractor toward learning domain-invariant features. Working in concert, they markedly diminish feature distribution discrepancies induced by changes in load, rotational speed, and other factors, thereby boosting the model’s adaptability to cross-condition scenarios. Experimental evaluations on the WT planetary gearbox dataset and the Case Western Reserve University (CWRU) bearing dataset demonstrate that the SSMCL-DA model effectively identifies multiple fault classes in gearboxes, with diagnostic performance substantially surpassing that of conventional methods. Under cross-condition scenarios, the model attains fault diagnosis accuracies of 99.21% for the WT planetary gearbox and 99.86% for the bearings, respectively. Furthermore, the model exhibits stable generalization capability in cross-device settings.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 42130608 and 42075142), the National Key Research and Development Program of China (Grant No. 2020YFA0608000), and the CUIT Science and Technology Innovation Capacity Enhancement Program Project (Grant No. KYTD202330).
Abstract: Existing traditional ocean vertical-mixing schemes are empirically developed without a thorough understanding of the physical processes involved, resulting in a discrepancy between the parameterization and forecast results. The uncertainty in ocean-mixing parameterization is primarily responsible for the bias in ocean models. Benefiting from deep-learning technology, we design the Adaptive Fully Connected Module with an Inception module as the baseline to minimize bias. It adaptively extracts the best features through fully connected layers with different widths, and better learns the nonlinear relationship between input variables and parameterization fields. Moreover, to obtain more accurate results, we impose the KPP (K-Profile Parameterization) and PP (Pacanowski–Philander) schemes as physical constraints so that the network parameterization process follows the basic physical laws more closely. Because model data are computed from human experience and lack some unknown physical processes, and may therefore differ from the actual data, we use a decade-long time record of hydrological and turbulence observations in the tropical Pacific Ocean as training data. Combining physical constraints and a nonlinear activation function, our method captures the nonlinear variation and better adapts to the ocean-mixing parameterization process. The use of physical constraints can improve the final results.
Funding: Supported by the National Natural Science Foundation of China (62371082), the Guangxi Science and Technology Project (AB24010317), the Science and Technology Project of Chongqing Education Commission (KJZD-K202400606), and the Natural Science Foundation of Chongqing (CSTB2023NSCQ-MSX0726, CSTB2023NSCQ-LZX0014).
Abstract: Federated learning combined with edge computing has greatly facilitated transportation in real-time applications such as intelligent traffic systems. However, synchronous federated learning is inefficient in terms of time and convergence speed, making it unsuitable for high real-time requirements. To address these issues, this paper proposes an Adaptive Waiting time Asynchronous Federated Learning (AWTAFL) scheme based on the Dueling Double Deep Q-Network (D3QN). The server dynamically adjusts the waiting time using the D3QN algorithm based on the current task progress and energy consumption, aiming to accelerate convergence and save energy. Additionally, this paper presents a new federated learning global aggregation scheme, where the central server performs weighted aggregation based on the freshness and contribution of client parameters. Experimental simulations demonstrate that the proposed algorithm significantly reduces the convergence time while ensuring model quality and effectively reducing energy consumption in asynchronous federated learning. Furthermore, the improved global aggregation update method enhances training stability and reduces oscillations in the global model convergence.
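To make the idea of aggregation weighted by freshness and contribution concrete, here is a minimal sketch. The abstract does not give the weighting formula, so the staleness discount `1 / (1 + decay * staleness)`, the multiplicative combination with a scalar contribution score, and the name `aggregate` are all assumptions chosen for illustration.

```python
def aggregate(global_params, updates, current_round, decay=0.5):
    """Weighted average of client parameter vectors.

    updates: list of (params, base_round, contribution), where `base_round`
    is the global round the client started from and `contribution` is a
    non-negative score (how it is measured is left open here).
    """
    weights = []
    for _, base_round, contribution in updates:
        staleness = current_round - base_round
        freshness = 1.0 / (1.0 + decay * staleness)  # assumed discount for stale updates
        weights.append(freshness * contribution)
    total = sum(weights)
    if total <= 0:
        return list(global_params)  # nothing usable arrived; keep the current model
    return [
        sum(w * params[i] for w, (params, _, _) in zip(weights, updates)) / total
        for i in range(len(global_params))
    ]
```

With this form, an update computed from an older global model still counts, but less than a fresh one with the same contribution score, which is one simple way to damp oscillations caused by stale clients.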
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62303197, 62273214) and the Natural Science Foundation of Shandong Province (ZR2024MFO18).
Abstract: Accurate and reliable photovoltaic (PV) modeling is crucial for the performance evaluation, control, and optimization of PV systems. However, existing methods for PV parameter identification often suffer from limitations in accuracy and efficiency. To address these challenges, we propose an adaptive multi-learning cooperation search algorithm (AMLCSA) for efficient identification of unknown parameters in PV models. AMLCSA is a novel algorithm inspired by teamwork behaviors in modern enterprises. It enhances the original cooperation search algorithm in two key aspects: (i) an adaptive multi-learning strategy that dynamically adjusts search ranges using adaptive weights, allowing better individuals to focus on local exploitation while guiding poorer individuals toward global exploration; and (ii) a chaotic grouping reflection strategy that introduces chaotic sequences to enhance population diversity and improve search performance. The effectiveness of AMLCSA is demonstrated on single-diode, double-diode, and three PV-module models. Simulation results show that AMLCSA offers significant advantages in convergence, accuracy, and stability compared to existing state-of-the-art algorithms.
Funding: Supported by the Department of Information Technology, University of Tabuk, Tabuk 71491, Saudi Arabia.
Abstract: An adaptive, robust, and secure framework plays a vital role in implementing the intelligent automation and decentralized decision making of Industry 5.0. Latency, privacy risks, and the complexity of industrial networks have hindered traditional cloud-based learning systems. To overcome these challenges, we present EdgeGuard-IoT, a 6G edge intelligence framework that integrates Secure Federated Learning (SFL) and Adaptive Anomaly Detection (AAD) at the edge to enhance the cybersecurity and operational resilience of the smart grid. With the ultra-reliable low latency communication (URLLC) of 6G, artificial intelligence-based network orchestration, and massive machine type communication (mMTC), EdgeGuard-IoT brings real-time, distributed intelligence to the edge, mitigates risks in data transmission, and enhances privacy. With a hierarchical federated learning framework, EdgeGuard-IoT helps edge devices collaboratively train models without revealing sensitive grid data, which is crucial in the smart grid, where real-time power anomaly detection and decentralized energy management are critical. The adaptive anomaly detection mechanism, driven by hybrid AI models, immediately raises an alert if grid stability and strength are negatively affected by cyber threats, faults, or energy distribution, thereby keeping the grid stable and resilient. The proposed framework also adopts security measures based on blockchain and zero-trust authentication techniques to reduce the risks of adversarial attacks and model poisoning during federated learning. In extensive simulations and deployment in real-world smart grid case studies, EdgeGuard-IoT shows superior detection accuracy, response time, and scalability at a much reduced communication overhead. This research pioneers a 6G-driven federated intelligence model designed for secure, self-optimizing, and resilient Industry 5.0 ecosystems, paving the way for next-generation autonomous smart grids and industrial cyber-physical systems.
Funding: Supported by the State Grid Hebei Electric Power Co., Ltd. Science and Technology Project, Research on Security Protection of Power Services Carried by 4G/5G Networks (Grant No. KJ2024-127).
Abstract: The rapid growth of Internet of Things devices and the emergence of rapidly evolving network threats have made traditional security assessment methods inadequate. Federated learning offers a promising solution to expedite the training of security assessment models. However, ensuring the trustworthiness and robustness of federated learning under multi-party collaboration scenarios remains a challenge. To address these issues, this study proposes a shard aggregation network structure and a malicious node detection mechanism, along with improvements to the federated learning training process. First, we extract the data features of the participants by using spectral clustering methods combined with a Gaussian kernel function. Then, we introduce a multi-objective decision-making approach that combines data distribution consistency, consensus communication overhead, and consensus result reliability in order to determine the final network sharing scheme. Finally, by integrating the federated learning aggregation process with the malicious node detection mechanism, we improve the traditional decentralized learning process. Our proposed ShardFed algorithm outperforms conventional classification algorithms and state-of-the-art machine learning methods like FedProx and FedCurv in convergence speed, robustness against data interference, and adaptability across multiple scenarios. Experimental results demonstrate that the proposed approach improves model accuracy by up to 2.33% under non-independent and identically distributed data conditions, maintains higher performance with malicious nodes containing poisoned data ratios of 20%–50%, and significantly enhances model resistance to low-quality data.
Funding: Supported in part by the National Natural Science Foundation of China (62403396, 62433018, 62373113), the Guangdong Basic and Applied Basic Research Foundation (2023A1515011527, 2023B1515120010), and the Postdoctoral Fellowship Program of CPSF (GZB20240621).
Abstract: In this paper, the containment control problem in nonlinear multi-agent systems (NMASs) under denial-of-service (DoS) attacks is addressed. First, a prediction model is obtained by using the broad learning technique to train offline on historical data generated by the system without DoS attacks. Second, the dynamic linearization method is used to obtain the equivalent linearization model of NMASs. Then, a novel model-free adaptive predictive control (MFAPC) framework based on historical and online data generated by the system is proposed, which combines the trained prediction model with the model-free adaptive control method. The development of the MFAPC method motivates a much simpler robust predictive control solution that is convenient to use in the case of DoS attacks. Meanwhile, the MFAPC algorithm provides a unified predictive framework for solving consensus tracking and containment control problems. The boundedness of the containment error is proven by using the contraction mapping principle and mathematical induction. Finally, the proposed MFAPC is assessed through comparative experiments.
Funding: The National Natural Science Foundation of China (No. 60972001) and the Science and Technology Plan of Suzhou City (No. SS201223).
Abstract: An adaptive topology learning approach is proposed to learn the topology of a practical camera network in an unsupervised way. The nodes are modeled by the Gaussian mixture model. The connectivity between nodes is judged by their cross-correlation function, which is also used to calculate their transition time distribution. The mutual information of the connected node pair is employed for transition probability calculation. A false link eliminating approach is proposed, along with a topology updating strategy to improve the learned topology. A real monitoring system with five disjoint cameras is built for experiments. Comparative results with traditional methods show that the proposed method is more accurate in topology learning and is more robust to environmental changes.
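The use of cross-correlation to judge connectivity and estimate transition time can be sketched as follows. This is a simplified illustration, not the paper's procedure: it works on per-time-bin counts of departures from one camera and arrivals at another, and the peak-to-mean `threshold` test for declaring a link is an assumption.

```python
def cross_correlation(departures, arrivals, max_lag):
    """Correlate departure counts with arrival counts shifted by 0..max_lag bins."""
    n = min(len(departures), len(arrivals))
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the series length")
    return [
        sum(departures[t] * arrivals[t + lag] for t in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def estimate_transition(departures, arrivals, max_lag, threshold=2.0):
    """Return (linked, lag): linked if the correlation peak stands out from the mean.

    `lag` is the most likely transition time in bins, or None if no link is found.
    """
    corr = cross_correlation(departures, arrivals, max_lag)
    mean = sum(corr) / len(corr)
    peak = max(corr)
    linked = mean > 0 and peak >= threshold * mean
    return linked, (corr.index(peak) if linked else None)
```

A pronounced peak suggests that objects leaving the first camera tend to reappear in the second after a consistent delay; a flat correlation suggests the two nodes are not directly connected.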
Abstract: Deep neural networks (DNNs) are effective in solving both forward and inverse problems for nonlinear partial differential equations (PDEs). However, conventional DNNs are not effective in handling problems such as delay differential equations (DDEs) and delay integro-differential equations (DIDEs) with constant delays, primarily due to their low regularity at delay-induced breaking points. In this paper, a DNN method that incorporates multi-task learning (MTL) is proposed to solve both the forward and inverse problems of DIDEs. The core idea of this approach is to divide the original equation into multiple tasks based on the delay, using auxiliary outputs to represent the integral terms, followed by the use of MTL to seamlessly incorporate the properties at the breaking points into the loss function. Furthermore, given the increased training difficulty associated with multiple tasks and outputs, we employ a sequential training scheme to reduce training complexity and provide reference solutions for subsequent tasks. This approach significantly enhances the approximation accuracy of solving DIDEs with DNNs, as demonstrated by comparisons with traditional DNN methods. We validate the effectiveness of this method through several numerical experiments, test various parameter sharing structures in MTL, and compare the testing results of these structures. Finally, this method is applied to the inverse problem of a nonlinear DIDE, and the results show that the unknown parameters of the DIDE can be discovered with sparse or noisy data.
Abstract: The rapidly evolving cybersecurity threat landscape exposes a critical flaw in traditional educational programs, where static curricula cannot adapt swiftly to novel attack vectors. This creates a significant gap between theoretical knowledge and the practical defensive capabilities needed in the field. To address this, we propose TeachSecure-CTI, a novel framework for adaptive cybersecurity curriculum generation that integrates real-time Cyber Threat Intelligence (CTI) with AI-driven personalization. Our framework employs a layered architecture featuring a CTI ingestion and clustering module, natural language processing for semantic concept extraction, and a reinforcement learning agent for adaptive content sequencing. By dynamically aligning learning materials with both the evolving threat environment and individual learner profiles, TeachSecure-CTI ensures content remains current, relevant, and tailored. A 12-week study with 150 students across three institutions demonstrated that the framework improves learning gains by 34%, significantly exceeding the 12%–21% reported in recent literature. The system achieved 84.8% personalization accuracy, 85.9% recognition accuracy for MITRE ATT&CK tactics, and a 31% faster competency development rate compared to static curricula. These findings have implications beyond academia, extending to workforce development, cyber range training, and certification programs. By bridging the gap between dynamic threats and static educational materials, TeachSecure-CTI offers an empirically validated, scalable solution for cultivating cybersecurity professionals capable of responding to modern threats.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 12204062 and the Natural Science Foundation of Shandong Province under Grant No. ZR2022MF330.
Abstract: To enhance speech emotion recognition capability, this study constructs a speech emotion recognition model integrating the adaptive acoustic mixup (AAM) and improved coordinate and shuffle attention (ICASA) methods. The AAM method optimizes data augmentation by combining a sample selection strategy and dynamic interpolation coefficients, thus enabling information fusion of speech data with different emotions at the acoustic level. The ICASA method enhances feature extraction capability through dynamic fusion of the improved coordinate attention (ICA) and shuffle attention (SA) techniques. The ICA technique reduces computational overhead by employing depthwise separable convolution and an h-swish activation function and captures long-range dependencies of multi-scale time-frequency features using the attention weights. The SA technique promotes feature interaction through channel shuffling, which helps the model learn richer and more discriminative emotional features. Experimental results demonstrate that, compared to the baseline model, the proposed model improves the weighted accuracy by 5.42% and 4.54%, and the unweighted accuracy by 3.37% and 3.85% on the IEMOCAP and RAVDESS datasets, respectively. These improvements were confirmed to be statistically significant by independent samples t-tests, further supporting the practical reliability and applicability of the proposed model in real-world emotion-aware speech systems.
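For readers unfamiliar with mixup, the interpolation underlying the AAM method can be sketched in a few lines. This shows plain mixup only: the sample selection strategy and the dynamic coefficient rule described in the abstract are not specified there and are not reproduced; drawing the coefficient from a Beta distribution with `alpha=0.4` is the common default and an assumption here.

```python
import random


def mixup(x_a, y_a, x_b, y_b, lam):
    """Interpolate two feature vectors and their one-hot labels with coefficient lam."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


def sample_lambda(alpha=0.4, rng=random):
    """Draw the interpolation coefficient from Beta(alpha, alpha), as in standard mixup."""
    return rng.betavariate(alpha, alpha)
```

Mixing an utterance labeled with one emotion and an utterance labeled with another yields a soft label, so the classifier is trained on intermediate points between emotion classes rather than only on the original samples.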
Funding: Co-supported in part by the National Natural Science Foundation of China (No. 62403131), in part by the Jiangsu Funding Program for Excellent Postdoctoral Talent, China (No. 2024ZB267), and in part by the Shenzhen Science and Technology Program, China (No. JCYJ20230807145500002).
Abstract: This paper presents a hierarchical formation control strategy to address the challenges of formation control for multiple Unmanned Aerial Vehicles (UAVs) within a cooperative consensus framework. The proposed strategy incorporates a reference command generation layer, which derives UAV attitude commands based on formation requirements, and a tracking control layer to ensure accurate execution. Collaborative variables, including trajectory position and flight speed, are defined using a three-dimensional track particle and autopilot model, enabling the development of a consensus-based formation control law. Desired attitude angles are computed through altitude-hold and coordinated-turn strategies. A sliding surface is designed based on reference models derived from flight quality metrics, while an adaptive controller compensates for aerodynamic model uncertainties. To enhance learning capabilities, a prediction error mechanism based on a series-parallel estimation model is introduced, enabling collaborative learning and the sharing of network weight estimation parameters within the multi-agent system. This facilitates the design of a distributed composite learning law. Lyapunov stability analysis confirms the local exponential stability of the tracking error. Simulations of a twelve-UAV formation, along with a comparative analysis of two algorithms, demonstrate the system’s capability for formation maintenance and high-precision tracking control.
Funding: The National Natural Science Foundation of China (62136008, 62293541), the Beijing Natural Science Foundation (4232056), and the Beijing Nova Program (20240484514).
Abstract: Cooperative multi-agent reinforcement learning (MARL) is a key technology for enabling cooperation in complex multi-agent systems. It has achieved remarkable progress in areas such as gaming, autonomous driving, and multi-robot control. Empowering cooperative MARL with multi-task decision-making capabilities is expected to further broaden its application scope. In multi-task scenarios, cooperative MARL algorithms need to address 3 types of multi-task problems: reward-related multi-task, arising from different reward functions; multi-domain multi-task, caused by differences in state and action spaces and state transition functions; and scalability-related multi-task, resulting from the dynamic variation in the number of agents. Most existing studies focus on scalability-related multi-task problems. However, with the increasing integration between large language models (LLMs) and multi-agent systems, a growing number of LLM-based multi-agent systems have emerged, enabling more complex multi-task cooperation. This paper provides a comprehensive review of the latest advances in this field. By combining multi-task reinforcement learning with cooperative MARL, we categorize and analyze the 3 major types of multi-task problems under multi-agent settings, offering more fine-grained classifications and summarizing key insights for each. In addition, we summarize commonly used benchmarks and discuss future directions of research in this area, which hold promise for further enhancing the multi-task cooperation capabilities of multi-agent systems and expanding their practical applications in the real world.