2,202 articles found
Hybrid quantum–classical multi-agent decision-making framework based on hierarchical Bayesian networks in the noisy intermediate-scale quantum era
1
Authors: Hao Shi, Chenghao Han, Peng Wang, Ming Zhang. Chinese Physics B, 2025, Issue 12, pp. 61-74 (14 pages)
Although quantum Bayesian networks provide a promising paradigm for multi-agent decision-making, their practical application faces two challenges in the noisy intermediate-scale quantum (NISQ) era. First, limited qubit resources restrict direct application to large-scale inference tasks. Second, no quantum methods are currently available for multi-agent collaborative decision-making. To address these challenges, we propose a hybrid quantum–classical multi-agent decision-making framework based on hierarchical Bayesian networks, comprising two novel methods. The first is a hybrid quantum–classical inference method based on hierarchical Bayesian networks. It decomposes large-scale hierarchical Bayesian networks into modular subnetworks. Inference for each subnetwork can be performed on NISQ devices, and the intermediate results are converted into classical messages for cross-layer transmission. The second is a multi-agent decision-making method that uses the variational quantum eigensolver (VQE) in the influence diagram. This method models collaborative decision-making with the influence diagram, encodes the expected utility of diverse actions into a Hamiltonian, and then determines the intra-group optimal action efficiently. Experimental validation on the IonQ quantum simulator demonstrates that the hierarchical method outperforms the non-hierarchical method at the functional inference level, and that the VQE method can obtain the optimal strategy exactly at the collaborative decision-making level. Our research not only extends the application of quantum computing to multi-agent decision-making but also provides a practical solution for the NISQ era.
Keywords: quantum Bayesian networks; multi-agent decision-making; hybrid quantum–classical algorithms; hierarchical Bayesian networks
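As a rough illustration of the second method described in the abstract above, the expected utility of each joint action can be placed on the diagonal of a Hamiltonian with its sign flipped, so that the lowest-energy basis state corresponds to the best action. The sketch below is not the authors' implementation: the utility values and the names `utility_hamiltonian` and `best_action` are hypothetical, and exact diagonalization with NumPy stands in for the variational search that a VQE would run on quantum hardware.

```python
import numpy as np

def utility_hamiltonian(expected_utilities):
    """Build a diagonal Hamiltonian H = -diag(EU); its ground state maximizes utility."""
    eu = np.asarray(expected_utilities, dtype=float)
    return -np.diag(eu)

def best_action(expected_utilities):
    """Return the basis-state index with the lowest energy (classical stand-in for VQE)."""
    h = utility_hamiltonian(expected_utilities)
    eigenvalues, eigenvectors = np.linalg.eigh(h)  # eigenvalues come back in ascending order
    ground_state = eigenvectors[:, 0]
    # For a diagonal H with distinct entries, the ground state is a single basis vector.
    return int(np.argmax(np.abs(ground_state)))

# Hypothetical expected utilities for four joint actions encoded on two qubits.
utilities = [0.2, 0.9, 0.5, 0.1]
action = best_action(utilities)
```

With distinct utilities the ground state is a single computational basis state, so reading out its index is enough; ties would need an explicit tie-breaking rule.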
Multi-Agent Reinforcement Learning for Moving Target Defense Temporal Decision-Making Approach Based on Stackelberg-FlipIt Games
2
Authors: Rongbo Sun, Jinlong Fei, Yuefei Zhu, Zhongyu Guo. Computers, Materials & Continua, 2025, Issue 8, pp. 3765-3786 (22 pages)
Moving Target Defense (MTD) necessitates scientifically effective decision-making methodologies for defensive technology implementation. While most MTD decision studies focus on accurately identifying optimal strategies, the issue of optimal defense timing remains underexplored. Current default approaches (periodic or overly frequent MTD triggers) lead to suboptimal trade-offs among system security, performance, and cost. The timing of MTD strategy activation critically impacts both defensive efficacy and operational overhead, yet existing frameworks inadequately address this temporal dimension. To bridge this gap, this paper proposes a Stackelberg-FlipIt game model that formalizes asymmetric cyber conflicts as alternating control over attack surfaces, thereby capturing the dynamic security state evolution of MTD systems. We introduce a belief factor to quantify information asymmetry during adversarial interactions, enhancing the precision of MTD trigger timing. Leveraging this game-theoretic foundation, we employ Multi-Agent Reinforcement Learning (MARL) to derive adaptive temporal strategies, optimized via a novel four-dimensional reward function that holistically balances security, performance, cost, and timing. Experimental validation using IP address mutation against scanning attacks demonstrates stable strategy convergence and accelerated defense response, significantly improving cybersecurity affordability and effectiveness.
Keywords: Cyber security; moving target defense; multi-agent reinforcement learning; security metrics; game theory
Decision-making performance of large language models vs. human physicians in challenging lung cancer cases: A real-world case-based study
3
Authors: Ning Yang, Kailai Li, Baiyang Liu, Xiting Chen, Aimin Jiang, Chang Qi, Wenyi Gan, Lingxuan Zhu, Weiming Mou, Dongqiang Zeng, Mingjia Xiao, Guangdi Chu, Shengkun Peng, Hank ZHWong, Lin Zhang, Hengguo Zhang, Xinpei Deng, Quan Cheng, Bufu Tang, Anqi Lin, Juan Zhou, Peng Luo. Intelligent Oncology, 2026, Issue 1, pp. 15-24 (10 pages)
Background: Despite the promise shown by large language models (LLMs) for standardized tasks, their multidimensional performance in real-world oncology decision-making remains unevaluated. This study aims to introduce a framework for evaluating LLM and physician decisions in challenging lung cancer cases. Methods: We curated 50 challenging lung cancer cases (25 local and 25 published) classified as complex, rare, or refractory. Blinded three-dimensional, five-point Likert evaluations (1–5 for comprehensiveness, specificity, and readability) compared standalone LLMs (DeepSeek R1, Claude 3.5, Gemini 1.5, and GPT-4o), physicians by experience level (junior, intermediate, and senior), and AI-assisted juniors; intergroup differences and augmentation effects were analyzed statistically. Results: Across the 50 challenging cases (18 complex, 17 rare, and 15 refractory) rated by three experts, DeepSeek R1 achieved scores of 3.95±0.33, 3.71±0.53, and 4.26±0.18 for comprehensiveness, specificity, and readability, respectively, positioning it between intermediate (3.68, 3.68, 3.75) and senior (4.50, 4.64, 4.53) physicians. GPT-4o and Claude 3.5 reached intermediate physician–level comprehensiveness (3.76±0.39, 3.60±0.39) but junior-to-intermediate physician–level specificity (3.39±0.39, 3.39±0.49). All LLMs scored higher on rare cases than intermediate physicians but fell below junior physicians in refractory-case specificity. AI-assisted junior physicians showed marked gains in rare cases, with comprehensiveness rising from 2.32 to 4.29 (84.8%), specificity from 2.24 to 4.26 (90.8%), and readability from 2.76 to 4.59 (66.0%), while specificity declined by 3.2% (3.17 to 3.07) in refractory cases. Error analysis showed complementary strengths, with physicians demonstrating reasoning stability and LLMs excelling in knowledge updating and risk management. Conclusions: LLMs performed variably in clinical decision-making tasks depending on case type, performing better in rare cases and worse in refractory cases requiring longitudinal reasoning. Complementary strengths between LLMs and physicians support case- and task-tailored human–AI collaboration.
Keywords: Large language models; Clinical evaluation; decision-making; Lung cancer
GRA: Graph-based reward aggregation for cooperative multi-agent reinforcement learning
4
Authors: Jingcheng Tang, Peng Zhou, He Bai, Gangshan Jing. Journal of Automation and Intelligence, 2026, Issue 1, pp. 46-56 (11 pages)
Multi-agent reinforcement learning (MARL) has proven its effectiveness in cooperative multi-agent systems (MASs) but still suffers from the curse of dimensionality and low learning efficiency. The main difficulty is caused by the strong inter-agent coupling embedded in an MARL problem, which has yet to be fully exploited in existing algorithms. In this work, we identify a learning graph that characterizes the dependence between individual rewards and individual policies. We then propose a graph-based reward aggregation (GRA) method, which uses the inherent coupling relationship among agents to eliminate redundant information. Specifically, GRA passes information among cooperating agents through graph attention networks to obtain aggregated rewards that contribute to the fitting of the value function, so that each agent learns a cooperation policy that can be executed in a decentralized manner. In addition, we propose a variant of GRA, named GRA-decen, which achieves decentralized training and decentralized execution (DTDE) when each agent has access only to the information of a subset of agents during learning. We conduct experiments in different environments and demonstrate the practicality and scalability of our algorithms.
Keywords: Networked system; multi-agent reinforcement learning; Graph-based RL
Fixed-Time Zeroing Neural Dynamics for Adaptive Coordination of Multi-Agent Systems
5
Authors: Cheng Hua, Xinwei Cao, Jianfeng Li, Shuai Li. CAAI Transactions on Intelligence Technology, 2026, Issue 1, pp. 267-278 (12 pages)
This paper presents an adaptive multi-agent coordination (AMAC) strategy suitable for complex scenarios, which only requires information exchange between neighbouring robots. Unlike traditional multi-agent coordination methods that are solved by neural dynamics, the proposed strategy displays greater flexibility, adaptability and scalability. Furthermore, the proposed AMAC strategy is reconstructed as a time-varying complex-valued matrix equation. By introducing a dynamic error function, a fixed-time convergent zeroing neural network (FTCZNN) model is designed for the online solution of the AMAC strategy, with the upper bound of its convergence time derived theoretically. Finally, the effectiveness and applicability of the coordination control method are demonstrated by numerical simulations and physical experiments. Numerical results indicate that this method can reduce the formation error to the order of 10^(-6) within 1.8 s.
Keywords: fixed-time convergence; multi-agent coordination; robotics; zeroing neural dynamics
Hybrid Pythagorean Fuzzy Decision-Making Framework for Sustainable Urban Planning under Uncertainty
6
Authors: Sana Shahab, Vladimir Simic, Ashit Kumar Dutta, Mohd Anjum, Dragan Pamucar. Computer Modeling in Engineering & Sciences, 2026, Issue 1, pp. 892-925 (34 pages)
Environmental problems are intensifying due to the rapid growth of the population, industry, and urban infrastructure. This expansion has resulted in increased air and water pollution, intensified urban heat island effects, and greater runoff from parks and other green spaces. Addressing these challenges requires prioritizing green infrastructure and other sustainable urban development strategies. This study introduces a novel Integrated Decision Support System that combines Pythagorean Fuzzy Sets with the Advanced Alternative Ranking Order Method allowing for Two-Step Normalization (AAROM-TN), enhanced by a dual weighting strategy. The weighting approach integrates the Criteria Importance Through Intercriteria Correlation (CRITIC) method with the Criteria Importance through Means and Standard Deviation (CIMAS) technique. The originality of the proposed framework lies in its ability to objectively quantify criteria importance using CRITIC, incorporate decision-makers' preferences through CIMAS, and capture the uncertainty and hesitation inherent in human judgment via Pythagorean Fuzzy Sets. A case study evaluating green infrastructure alternatives in metropolitan regions demonstrates the applicability and effectiveness of the framework. A sensitivity analysis is conducted to examine how variations in criteria weights affect the rankings and to evaluate the robustness of the results. Furthermore, a comparative analysis highlights the practical and financial implications of each alternative by assessing their respective strengths and weaknesses.
Keywords: Sustainable urban planning; criterion importance assessment; two-step normalization; environmental impact; decision-making
Output feedback prescribed performance state synchronization for leader-following high-order uncertain nonlinear multi-agent systems
7
Authors: Ilias Katsoukis, George A. Rovithakis. Journal of Automation and Intelligence, 2026, Issue 1, pp. 35-45 (11 pages)
This paper addresses the synchronization of follower agents' state vectors with that of a leader in high-order nonlinear multi-agent systems. The proposed low-complexity control scheme employs high-gain observers to estimate higher-order synchronization errors, enabling the controller to rely solely on relative output measurements. This approach significantly reduces the dependence on full-state information, which is often infeasible or costly to obtain in practical engineering applications. An output feedback control strategy is developed to overcome these limitations while ensuring robust and effective synchronization. Simulation results are provided to demonstrate the effectiveness of the proposed approach and validate the theoretical findings.
Keywords: Synchronization problem; Leader-following; High-order nonlinear systems; multi-agent systems; High-gain observer
Command-agent: Reconstructing warfare simulation and command decision-making using large language models
8
Authors: Mengwei Zhang, Minchi Kuang, Heng Shi, Jihong Zhu, Jingyu Zhu, Xiao Jiang. Defence Technology (防务技术), 2026, Issue 2, pp. 294-313 (20 pages)
War rehearsals have become increasingly important in national security due to the growing complexity of international affairs. However, traditional rehearsal methods, such as military chess simulations, are inefficient and inflexible, with particularly pronounced limitations in command and decision-making. The overwhelming volume of information and high decision complexity hinder the realization of autonomous and agile command and control. To address this challenge, an intelligent warfare simulation framework named Command-Agent is proposed, which deeply integrates large language models (LLMs) with digital twin battlefields. By constructing a highly realistic battlefield environment through real-time simulation and multi-source data fusion, the natural language interaction capabilities of LLMs are leveraged to lower the command threshold and to enable autonomous command through the Observe-Orient-Decide-Act (OODA) feedback loop. Within the Command-Agent framework, a multi-model collaborative architecture is further adopted to decouple the decision-generation and command-execution functions of LLMs. By combining specialized models such as DeepSeek-R1 and MCTool, the limitations of single-model capabilities are overcome. MCTool is a lightweight execution model fine-tuned for military Function Calling tasks. The framework also introduces a Vector Knowledge Base to mitigate hallucinations commonly exhibited by LLMs. Experimental results demonstrate that Command-Agent not only enables natural language-driven simulation and control but also deeply understands commander intent. With the multi-model collaborative architecture, during red-blue UAV confrontations involving 2 to 8 UAVs, the integrated score improves by an average of 41.8% compared to the single-agent system (MCTool), accompanied by a 161.8% improvement in the battle loss ratio. Furthermore, compared with multi-agent systems lacking the knowledge base, the inclusion of the Vector Knowledge Base improves overall performance by a further 16.8%. Compared with the general model (Qwen2.5-7B), the fine-tuned MCTool leads by 5% in execution efficiency. Therefore, the proposed Command-Agent introduces a novel perspective to the military command system and offers a feasible solution for intelligent battlefield decision-making.
Keywords: Digital twin battlefield; Large language models; multi-agent system; Military command
Distributed unsupervised meta-learning algorithm over multi-agent systems
9
Authors: Zhenzhen Wang, Bing He, Zixin Jiang, Xianyang Zhang, Haidi Dong, Di Ye. Digital Communications and Networks, 2026, Issue 1, pp. 134-142 (9 pages)
Multi-Agent Systems (MAS), which consist of multiple interacting agents, are crucial in Cyber-Physical Systems (CPS) because they improve system adaptability, efficiency, and robustness through parallel processing and collaboration. However, most existing unsupervised meta-learning methods are centralized and not suitable for multi-agent systems, where data are stored in a distributed manner and are not accessible to every agent. Meta-GMVAE, based on the Variational Autoencoder (VAE) and set-level variational inference, is a sophisticated unsupervised meta-learning model that improves generative performance by efficiently learning data representations across various tasks, increasing adaptability and reducing sample requirements. Inspired by these advancements, we propose a novel Distributed Unsupervised Meta-Learning (DUML) framework based on Meta-GMVAE and a fusion strategy. Furthermore, we present a DUML algorithm based on the Gaussian Mixture Model (DUMLGMM), in which the parameters of the Gaussian mixture are solved by an Expectation-Maximization algorithm. Simulations on the Omniglot and Mini-ImageNet datasets show that DUMLGMM can match the performance of the corresponding centralized algorithm and outperform the non-cooperative algorithm.
Keywords: Unsupervised meta-learning; multi-agent systems; Variational autoencoder; Gaussian mixture model
Toward Collaborative and Adaptive Learning: A Survey of Multi-agent Reinforcement Learning in Education
10
Authors: Sirine Bouguettaya, Ouarda Zedadra, Francesco Pupo, Giancarlo Fortino. Artificial Intelligence Science and Engineering, 2026, Issue 1, pp. 1-19 (19 pages)
In recent years, researchers have leveraged single-agent reinforcement learning to boost educational outcomes and deliver personalized interventions; yet this paradigm provides no capacity for inter-agent interaction. Multi-agent reinforcement learning (MARL) overcomes this limitation by allowing several agents to learn simultaneously within a shared environment, each choosing actions that maximize its own or the group's rewards. By explicitly modeling and exploiting agent-to-agent dynamics, MARL can align those interactions with pedagogical goals such as peer tutoring, collaborative problem-solving, or gamified competition, thus opening richer avenues for adaptive and socially informed learning experiences. This survey investigates the impact of MARL on educational outcomes by examining evidence of its effectiveness in enhancing learner performance, engagement, and equity and in reducing teacher workload, compared to single-agent or traditional approaches. It explores the educational domains and pedagogical problems addressed by MARL, identifies the algorithmic families used, and analyzes their influence on learning. The review also assesses experimental settings and evaluation metrics to determine ecological validity, and outlines current challenges and future research directions in applying MARL to education.
Keywords: reinforcement learning; multi-agent reinforcement learning; Agentic AI; education; generative AI
Leader-following positive consensus of heterogeneous switched multi-agent systems with average dwell time switching
11
Authors: Kaiming Li, Wei Xing, Haoyue Yang, Junfeng Zhang. Control Theory and Technology, 2026, Issue 1, pp. 66-81 (16 pages)
This paper focuses on the leader-following positive consensus problem for heterogeneous switched multi-agent systems. First, a state-feedback controller with dynamic compensation is introduced to achieve positive consensus under average dwell time switching. Sufficient conditions are then derived to guarantee positive consensus. The gain matrices of the control protocol are described using a matrix decomposition approach, and the corresponding computational complexity is reduced by resorting to linear programming and copositive Lyapunov functions. Finally, two numerical examples are provided to illustrate the results obtained.
Keywords: Heterogeneous switched multi-agent systems; Positive consensus; Linear programming
Research on UAV-MEC Cooperative Scheduling Algorithms Based on Multi-Agent Deep Reinforcement Learning
12
Authors: Yonghua Huo, Ying Liu, Anni Jiang, Yang Yang. Computers, Materials & Continua, 2026, Issue 3, pp. 1823-1850 (28 pages)
With the advent of sixth-generation mobile communications (6G), space-air-ground integrated networks have become mainstream. This paper focuses on collaborative scheduling for mobile edge computing (MEC) under a three-tier heterogeneous architecture composed of mobile devices, unmanned aerial vehicles (UAVs), and macro base stations (BSs). This scenario typically faces fast channel fading, dynamic computational loads, and energy constraints, whereas classical queuing-theoretic or convex-optimization approaches struggle to yield robust solutions in highly dynamic settings. To address this issue, we formulate a multi-agent Markov decision process (MDP) for an air-ground-fused MEC system, unify link selection, bandwidth/power allocation, and task offloading into a continuous action space, and propose a joint scheduling strategy based on an improved MATD3 algorithm. The improvements include Alternating Layer Normalization (ALN) in the actor to suppress gradient variance, Residual Orthogonalization (RO) in the critic to reduce the correlation between the twin Q-value estimates, and a dynamic-temperature reward to enable adaptive trade-offs during training. On a multi-user, dual-link simulation platform, we conduct ablation and baseline comparisons. The results show that the proposed method has better convergence and stability. Compared with MADDPG, TD3, and DSAC, our algorithm achieves more robust performance across key metrics.
Keywords: UAV-MEC networks; multi-agent deep reinforcement learning; MATD3; task offloading
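The twin Q-value estimates mentioned in the abstract above come from the TD3 family of algorithms, in which the bootstrapped target uses the smaller of two critic outputs to limit overestimation. The sketch below shows only that standard target computation. The function name `clipped_double_q_target` and all numeric values are hypothetical, and the paper's Alternating Layer Normalization, Residual Orthogonalization, and dynamic-temperature reward are not reproduced here.

```python
import numpy as np

def clipped_double_q_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Standard TD3-style target: y = r + gamma * (1 - done) * min(Q1', Q2')."""
    reward = np.asarray(reward, dtype=float)
    done = np.asarray(done, dtype=float)
    # Taking the element-wise minimum of the two critics curbs overestimation bias.
    min_q = np.minimum(np.asarray(q1_next, dtype=float),
                       np.asarray(q2_next, dtype=float))
    return reward + gamma * (1.0 - done) * min_q

# Hypothetical batch of two transitions; the second one is terminal.
targets = clipped_double_q_target(
    reward=[1.0, 0.5],
    done=[0.0, 1.0],
    q1_next=[2.0, 3.0],
    q2_next=[1.5, 4.0],
    gamma=0.9,
)
```

In a multi-agent variant, each agent would typically compute such a target from its own pair of centralized critics; that wiring is left out to keep the sketch minimal.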
Finite-time fault-tolerant tracking control for multi-agent systems based on neural observer
13
Authors: Junzhe Cheng, Shitong Zhang, Qing Wang, Bin Xin. Control Theory and Technology, 2026, Issue 1, pp. 10-23 (14 pages)
This paper investigates the consensus tracking control problem for high-order nonlinear multi-agent systems subject to non-affine faults, partially measurable states, uncertain control coefficients, and unknown external disturbances. Under directed topology conditions, an observer-based finite-time control strategy based on adaptive backstepping is proposed, in which a neural network-based state observer is employed to approximate the unmeasurable system state variables. To address the explosion of complexity associated with the backstepping method, a finite-time command filter is incorporated, with error compensation signals designed to mitigate the filter-induced errors. Additionally, a Butterworth low-pass filter is introduced to avoid the algebraic loop problem in the design of the controller. The finite-time stability of the closed-loop system is rigorously analyzed with the finite-time Lyapunov stability criterion, establishing that all closed-loop signals of the system remain bounded within a finite time. Finally, the effectiveness of the proposed control strategy is verified through a simulation example.
Keywords: multi-agent systems; Command filtered backstepping; Finite-time control; Neural observer; Non-affine faults
Multi-agent reinforcement learning with layered autonomy and collaboration for enhanced collaborative confrontation
14
Authors: Xiaoyu XING, Haoxiang XIA. Chinese Journal of Aeronautics, 2026, Issue 2, pp. 370-388 (19 pages)
Addressing optimal confrontation methods in multi-agent attack-defense scenarios is a complex challenge. Multi-Agent Reinforcement Learning (MARL) provides an effective framework for tackling sequential decision-making problems, significantly enhancing swarm intelligence in maneuvering. However, applying MARL to unmanned swarms presents two primary challenges. First, defensive agents must balance autonomy with collaboration under limited perception while coordinating against adversaries. Second, current algorithms aim to maximize global or individual rewards, making them sensitive to fluctuations in enemy strategies and environmental changes, especially when rewards are sparse. To tackle these issues, we propose an algorithm, Multi-Agent Reinforcement Learning with Layered Autonomy and Collaboration (MARL-LAC), for collaborative confrontations. This algorithm integrates dual twin critics to mitigate the high variance associated with policy gradients. Furthermore, MARL-LAC employs layered autonomy and collaboration to address multi-objective problems, specifically learning a global reward function for the swarm alongside local reward functions for individual defensive agents. Experimental results demonstrate that MARL-LAC enhances decision-making and collaborative behaviors among agents, outperforming existing algorithms and emphasizing the importance of layered autonomy and collaboration in multi-agent systems. The observed adversarial behaviors show that agents using MARL-LAC maintain cohesive formations that conceal their intentions by confusing the offensive agent while successfully encircling the target.
Keywords: Attack-defense confrontation; Collaborative confrontation; Autonomous agents; multi-agent systems; Reinforcement learning; Maneuvering decision-making
Within-visual-range air combat maneuver decision-making in obstructed environments via a curriculum self-play soft actor-critic with an attention mechanism
15
Authors: Longjie Zheng, Xin Li, Xichao Su, Bai Li, Lei Wang, Junlin Zhou, Haijun Peng, Wei Tian, Xinwei Wang. Defence Technology (防务技术), 2026, Issue 3, pp. 122-137 (16 pages)
With the rapid development of artificial intelligence, intelligent air combat maneuver decision-making (ACMD) has garnered global attention. Although deep reinforcement learning provides a promising approach to ACMD, existing methods often suffer from rigid reward functions and limited adaptability to evolving adversarial strategies. Moreover, most research assumes open airspace, overlooking the influence of potential obstacles. In this paper, we address one-on-one within-visual-range ACMD in obstructed environments and propose an improved Soft Actor-Critic (SAC) algorithm trained under a curriculum self-play framework. A maneuver-strategy mirroring inference module is integrated so that each side can estimate the other's likely position when visual obstruction occurs. By leveraging curriculum learning to guide progressive experience accumulation and self-play for adversarial evolution, our method enhances both training efficiency and tactical diversity. We further integrate an attention mechanism that dynamically adjusts the weights of sub-rewards, enabling the learned policy to adapt to rapidly changing air combat situations. Numerical simulations demonstrate that our enhanced SAC converges more quickly and achieves higher win rates than the baseline methods. An animation is available at bilibili.com/video/BV1BHVszHE98 for better illustration.
Keywords: Air combat maneuver decision-making; Soft actor-critic; Curriculum self-play training; Attention mechanism; Obstructed environment
An Integrated Approach to Condition-Based Maintenance Decision-Making of Planetary Gearboxes: Combining Temporal Convolutional Network Auto Encoders with Wiener Process
16
Authors: Bo Zhu, Enzhi Dong, Zhonghua Cheng, Xianbiao Zhan, Kexin Jiang, Rongcai Wang. Computers, Materials & Continua, 2026, Issue 1, pp. 661-686 (26 pages)
With the increasing complexity of industrial automation, planetary gearboxes play a vital role in large-scale equipment transmission systems, directly impacting operational efficiency and safety. Traditional maintenance strategies often struggle to accurately predict the degradation process of equipment, leading to excessive maintenance costs or potential failure risks. Moreover, existing prediction methods based on statistical models are difficult to adapt to nonlinear degradation processes. To address these challenges, this study proposes a novel condition-based maintenance framework for planetary gearboxes. A comprehensive full-lifecycle degradation experiment was conducted to collect raw vibration signals, which were then processed using a temporal convolutional network autoencoder with multi-scale perception capability to extract deep temporal degradation features, enabling the collaborative extraction of long-period meshing frequencies and short-term impact features from the vibration signals. Kernel principal component analysis was employed to fuse and normalize these features, enhancing the characterization of degradation progression. A nonlinear Wiener process was used to model the degradation trajectory, with a threshold decay function introduced to dynamically adjust maintenance strategies, and the model parameters were optimized through maximum likelihood estimation. Meanwhile, the maintenance strategy was optimized to minimize the cost per unit time, determining the optimal maintenance timing and preventive maintenance threshold. The comprehensive indicator of degradation trends extracted by this method reaches 0.756, which is 41.2% higher than that of traditional time-domain features; the dynamic threshold strategy reduces the maintenance cost per unit time to 55.56, which is 8.9% better than that of the static threshold optimization. Experimental results demonstrate significant reductions in maintenance costs while enhancing system reliability and safety. This study integrates deep learning and reliability theory in the maintenance of planetary gearboxes, provides an interpretable solution for the predictive maintenance of complex mechanical systems, and promotes the development of condition-based maintenance strategies for planetary gearboxes.
Keywords: temporal convolutional network autoencoder full lifecycle degradation experiment nonlinear Wiener process condition-based maintenance decision-making fault monitoring
MultiAgent-CoT: A Multi-Agent Chain-of-Thought Reasoning Model for Robust Multimodal Dialogue Understanding
17
Author: Ans D. Alghamdi. Computers, Materials & Continua, 2026, No. 2, pp. 1395-1429 (35 pages)
Multimodal dialogue systems often fail to maintain coherent reasoning over extended conversations and suffer from hallucination due to limited context modeling capabilities. Current approaches struggle with cross-modal alignment, temporal consistency, and robust handling of noisy or incomplete inputs across multiple modalities. We propose MultiAgent-CoT, a novel multi-agent chain-of-thought (CoT) reasoning framework where specialized agents for text, vision, and speech modalities collaboratively construct shared reasoning traces through inter-agent message passing and consensus voting mechanisms. Our architecture incorporates self-reflection modules, conflict resolution protocols, and dynamic rationale alignment to enhance consistency, factual accuracy, and user engagement. The framework employs a hierarchical attention mechanism with cross-modal fusion and implements adaptive reasoning depth based on dialogue complexity. Comprehensive evaluations on Situated Interactive Multi-Modal Conversations (SIMMC) 2.0, VisDial v1.0, and newly introduced challenging scenarios demonstrate statistically significant improvements in grounding accuracy (p < 0.01), chain-of-thought interpretability, and robustness to adversarial inputs compared to state-of-the-art monolithic transformer baselines and existing multi-agent approaches.
Keywords: multi-agent systems chain-of-thought reasoning multimodal dialogue conversational artificial intelligence (AI) cross-modal fusion reasoning interpretability
A Distributed Dual-Network Meta-Adaptive Framework for Scalable and Privacy-Aware Multi-Agent Coordination
18
Authors: Atef Gharbi, Mohamed Ayari, Nasser Albalawi, Ahmad Alshammari, Nadhir Ben Halima, Zeineb Klai. Computers, Materials & Continua, 2026, No. 5, pp. 1456-1476 (21 pages)
This paper presents Dual Adaptive Neural Topology (Dual ANT), a distributed dual-network meta-adaptive framework that enhances ant-colony-based multi-agent coordination with online introspection, adaptive parameter control, and privacy-preserving interactions. This approach improves standard Ant Colony Optimization (ACO) with two lightweight neural components: a forward network that estimates swarm efficiency in real time and an inverse network that converts these descriptors into parameter adaptations. To preserve the privacy of individual trajectories in shared pheromone maps, we introduce a locally differentially private pheromone update mechanism that adds calibrated noise to each agent's pheromone deposit while preserving the efficacy of the global pheromone signal. The resulting system enables agents to dynamically and autonomously adapt their coordination strategies under challenging and dynamic conditions, including varying obstacle layouts, uncertain target locations, and time-varying disturbances. Extensive simulations of large grid-based search tasks demonstrated that Dual ANT achieved faster convergence, higher robustness, and improved scalability compared to advanced baselines such as Multi-Strategy ACO and Hierarchical ACO. The meta-adaptive feedback loop compensates for the performance degradation caused by privacy noise and prevents premature stagnation by triggering Levy flight exploration only when necessary.
Keywords: ant colony optimization multi-agent systems deep neural networks meta-adaptive learning Levy flight differential privacy swarm intelligence
Control-Communication Co-Optimization for Wireless Cloud Robotic System via Multi-Agent Transfer Reinforcement Learning
19
Authors: Chi Xu, Junyuan Zhang, Haibin Yu. IEEE/CAA Journal of Automatica Sinica, 2026, No. 2, pp. 311-326 (16 pages)
The wireless cloud robotic system (WCRS), which fully integrates sensing, communication, computing, and control capabilities as an intelligent agent, is a promising way to achieve intelligent manufacturing due to easy deployment and flexible expansion. However, the high-precision control of WCRS requires deterministic wireless communication, which is always challenging in the complex and dynamic radio space. This paper employs the reconfigurable intelligent surface (RIS) to establish a novel RIS-assisted WCRS architecture, where the radio channel is controlled to achieve ultra-reliable, low-delay, and low-jitter communication for high-precision closed-loop motion control. However, control and communication are strongly coupled and should be co-optimized. Fully considering the constraints of control input threshold, control delay deadline, beam phase, antenna power, and information distortion, we establish a stability maximization problem to jointly optimize control input compensation, RIS phase shift, and beamforming. Herein, a new jitter-oriented system stability objective with respect to control error and communication jitter is defined, and the closed-form expression of the control delay deadline is derived based on the Jensen inequality and a Lyapunov-Krasovskii functional. Because the channel and robot states are time-varying and only partially observable, we model the problem as a partially observable Markov decision process (POMDP). To solve this complex problem, we propose a multi-agent transfer reinforcement learning algorithm named LSTM-PPO-MATRL, where the LSTM-enhanced proximal policy optimization (PPO) is designed to approximate an optimal solution and the option-guided policy transfer learning is proposed to facilitate the learning process. By centralized training and decentralized execution, LSTM-PPO-MATRL is validated by extensive experiments on MuJoCo tasks for both low-mobility and high-mobility robotic control scenarios. The results demonstrate that LSTM-PPO-MATRL not only realizes high learning efficiency, but also supports low-delay, low-jitter communication for low-error control, where a 71.9% control accuracy improvement and a 68.7% delay jitter reduction are achieved compared to the PPO-MADRL baseline.
Keywords: multi-agent transfer reinforcement learning (MATRL) partially observable Markov decision process (POMDP) reconfigurable intelligent surface (RIS) system stability wireless cloud robotic system (WCRS)
Research on Maneuver Decision-Making of Multi-Agent Adversarial Game in a Random Interference Environment (cited by: 1)
20
Authors: Shiguang Hu, Le Ru, Bo Lu, Zhenhua Wang, Xiaolin Zhao, Wenfei Wang, Hailong Xi. Computers, Materials & Continua (SCIE, EI), 2024, No. 10, pp. 1879-1903 (25 pages)
The strategy evolution process of game players is highly uncertain due to random emergent situations and other external disturbances. This paper investigates the issue of strategy interaction and behavioral decision-making among game players in simulated confrontation scenarios within a random interference environment. It considers the possible risks that random disturbances may pose to the autonomous decision-making of game players, as well as the impact of participants' manipulative behaviors on the state changes of the players. A nonlinear mathematical model is established to describe the strategy decision-making process of the participants in this scenario. Subsequently, the strategy selection interaction relationship, strategy evolution stability, and dynamic decision-making process of the game players are investigated and verified by simulation experiments. The results show that maneuver-related parameters and random environmental interference factors have different effects on the selection and evolutionary speed of the agent's strategies. Especially in a highly uncertain environment, even small information asymmetry or miscalculation may have a significant impact on decision-making. This also confirms the feasibility and effectiveness of the method proposed in the paper, which can better explain the behavioral decision-making process of the agent in the interaction process. This study provides feasibility analysis ideas and theoretical references for improving multi-agent interactive decision-making and the interpretability of the game system model.
Keywords: behavior decision-making stochastic evolutionary game nonlinear mathematical modeling multi-agent maneuver