Borehole reinforcement based on polymer materials induced by liquid-gas phase transition in simulating lunar coring
1
Authors: Dingqiang Mo, Tao Liu, Zhiyu Zhao, Liangyu Zhu, Dongsheng Yang, Yifan Wu, Cheng Lan, Wenchuan Jiang, Heping Xie. International Journal of Mining Science and Technology, 2025, Issue 3, pp. 383-398 (16 pages)
Lunar core samples are the key materials for accurately assessing and developing lunar resources. However, the difficulty of maintaining borehole stability in the lunar coring process limits the depth of lunar coring. Here, a strategy of using a reinforcement fluid that undergoes a phase transition spontaneously in a vacuum environment to reinforce the borehole is proposed. Based on this strategy, a reinforcement liquid suitable for a wide temperature range and a high vacuum environment was developed. A feasibility study on reinforcing the borehole with the reinforcement liquid was carried out, and it was found that the cohesion of the simulated lunar soil can be increased from 2 to 800 kPa after using the reinforcement liquid. Further, a series of coring experiments were conducted using a self-developed high-vacuum (vacuum degree of 5 Pa) and low-temperature (between -30 and 50 ℃) simulation platform. It is confirmed that the high-boiling-point reinforcement liquid pre-placed in the drill pipe can be released spontaneously during the drilling process and finally complete the reinforcement of the borehole. The reinforcement effect of the borehole is better when the solute concentration is between 0.15 and 0.25 g/mL.
Keywords: Lunar coring; Reinforcement fluid; Borehole reinforcement; Drill bit cooling
Graph-based multi-agent reinforcement learning for collaborative search and tracking of multiple UAVs (cited by: 2)
2
Authors: Bocheng ZHAO, Mingying HUO, Zheng LI, Wenyu FENG, Ze YU, Naiming QI, Shaohai WANG. Chinese Journal of Aeronautics, 2025, Issue 3, pp. 109-123 (15 pages)
This paper investigates the challenges associated with Unmanned Aerial Vehicle (UAV) collaborative search and target tracking in dynamic and unknown environments characterized by a limited field of view. The primary objective is to explore the unknown environments to locate and track targets effectively. To address this problem, we propose a novel Multi-Agent Reinforcement Learning (MARL) method based on a Graph Neural Network (GNN). Firstly, a method is introduced for encoding continuous-space multi-UAV problem data into spatial graphs which establish essential relationships among agents, obstacles, and targets. Secondly, a Graph AttenTion network (GAT) model is presented, which focuses exclusively on adjacent nodes, learns attention weights adaptively, and allows agents to better process information in dynamic environments. Reward functions are specifically designed to tackle exploration challenges in environments with sparse rewards. By introducing a framework that integrates centralized training and distributed execution, the advancement of models is facilitated. Simulation results show that the proposed method outperforms the existing MARL method in search rate and tracking performance with fewer collisions. The experiments show that the proposed method can be extended to applications with a larger number of agents, which provides a potential solution to the challenging problem of multi-UAV autonomous tracking in dynamic unknown environments.
Keywords: Unmanned aerial vehicle (UAV); Multi-agent reinforcement learning (MARL); Graph attention network (GAT); Tracking; Dynamic and unknown environment
Rule-Guidance Reinforcement Learning for Lane Change Decision-making: A Risk Assessment Approach (cited by: 1)
3
Authors: Lu Xiong, Zhuoren Li, Danyang Zhong, Puhang Xu, Chen Tang. Chinese Journal of Mechanical Engineering, 2025, Issue 2, pp. 344-359 (16 pages)
To solve problems of poor security guarantee and insufficient training efficiency in the conventional reinforcement learning methods for decision-making, this study proposes a hybrid framework to combine deep reinforcement learning with rule-based decision-making methods. A risk assessment model for lane-change maneuvers considering uncertain predictions of surrounding vehicles is established as a safety filter to improve learning efficiency while correcting dangerous actions for safety enhancement. On this basis, a Risk-fused DDQN is constructed utilizing the model-based risk assessment and supervision mechanism. The proposed reinforcement learning algorithm sets up a separate experience buffer for dangerous trials and punishes such actions, which is shown to improve the sampling efficiency and training outcomes. Compared with conventional DDQN methods, the proposed algorithm improves the convergence value of cumulated reward by 7.6% and 2.2% in the two constructed scenarios in the simulation study and reduces the number of training episodes by 52.2% and 66.8%, respectively. The success rate of lane change is improved by 57.3% while the time headway is increased by at least 16.5% in real vehicle tests, which confirms the higher training efficiency, scenario adaptability, and security of the proposed Risk-fused DDQN.
Keywords: Autonomous driving; Reinforcement learning; Decision-making; Risk assessment; Safety filter
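The abstract above mentions a separate experience buffer for dangerous trials whose actions are punished. The following is only a minimal sketch of that general idea, not the authors' implementation: the class name `DualReplayBuffer`, the `danger_fraction` mixing ratio, the fixed `penalty`, and the boolean `risky` flag (standing in for the output of a risk assessment model) are all assumptions made for illustration.

```python
import random
from collections import deque


class DualReplayBuffer:
    """Stores ordinary and risky transitions in two separate buffers (illustrative)."""

    def __init__(self, capacity=10000, danger_fraction=0.25, seed=0):
        self.safe = deque(maxlen=capacity)      # ordinary transitions
        self.danger = deque(maxlen=capacity)    # transitions flagged as risky
        self.danger_fraction = danger_fraction  # assumed share of each batch
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done, risky, penalty=1.0):
        """Route a transition to a buffer; risky ones get a reward penalty."""
        if risky:
            self.danger.append((state, action, reward - penalty, next_state, done))
        else:
            self.safe.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Draw a mixed batch, capped by what each buffer currently holds."""
        n_danger = min(len(self.danger), int(batch_size * self.danger_fraction))
        n_safe = min(len(self.safe), batch_size - n_danger)
        batch = self.rng.sample(list(self.danger), n_danger)
        batch += self.rng.sample(list(self.safe), n_safe)
        self.rng.shuffle(batch)
        return batch


if __name__ == "__main__":
    buf = DualReplayBuffer(capacity=100)
    for i in range(20):
        buf.add(i, 0, 0.0, i + 1, False, risky=False)
    for i in range(10):
        buf.add(i, 1, 0.0, i + 1, True, risky=True)
    print(len(buf.sample(8)))
```

Keeping risky transitions in their own buffer lets a learner see them at a controlled rate even when they are rare; how the paper actually mixes or weights the two buffers is not stated in the abstract.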
Optimized reinforcement of granite residual soil using a cement and alkaline solution: A coupling effect (cited by: 1)
4
Authors: Bingxiang Yuan, Jingkang Liang, Baifa Zhang, Weijie Chen, Xianlun Huang, Qingyu Huang, Yun Li, Peng Yuan. Journal of Rock Mechanics and Geotechnical Engineering, 2025, Issue 1, pp. 509-523 (15 pages)
Granite residual soil (GRS) is a type of weathering soil that can decompose upon contact with water, potentially causing geological hazards. In this study, cement, an alkaline solution, and glass fiber were used to reinforce GRS. The effects of cement content and SiO2/Na2O ratio of the alkaline solution on the static and dynamic strengths of GRS were discussed. Microscopically, the reinforcement mechanism and coupling effect were examined using X-ray diffraction (XRD), micro-computed tomography (micro-CT), and scanning electron microscopy (SEM). The results indicated that the addition of 2% cement and an alkaline solution with an SiO2/Na2O ratio of 0.5 led to the densest matrix, lowest porosity, and highest static compressive strength, which was 4994 kPa with a dynamic impact resistance of 75.4 kN after adding glass fiber. The compressive strength and dynamic impact resistance were a result of the coupling effect of cement hydration, a pozzolanic reaction of clay minerals in the GRS, and the alkali activation of clay minerals. Excessive cement addition or an excessively high SiO2/Na2O ratio in the alkaline solution can have negative effects, such as the destruction of C-(A)-S-H gels by the alkaline solution and hindering the production of N-A-S-H gels. This can result in damage to the matrix of reinforced GRS, leading to a decrease in both static and dynamic strengths. This study suggests that further research is required to gain a more precise understanding of the effects of this mixture in terms of reducing our carbon footprint and optimizing its properties. The findings indicate that cement and alkaline solution are appropriate for GRS and that the reinforced GRS can be used for high-strength foundation and embankment construction. The study provides an analysis of strategies for mitigating and managing GRS slope failures, as well as enhancing roadbed performance.
Keywords: Granite residual soil (GRS); Reinforcement; Coupling effect; Alkali activation; Mechanical properties
Deep reinforcement learning based integrated evasion and impact hierarchical intelligent policy of exo-atmospheric vehicles (cited by: 1)
5
Authors: Leliang REN, Weilin GUO, Yong XIAN, Zhenyu LIU, Daqiao ZHANG, Shaopeng LI. Chinese Journal of Aeronautics, 2025, Issue 1, pp. 409-426 (18 pages)
Exo-atmospheric vehicles are constrained by limited maneuverability, which leads to the contradiction between evasive maneuver and precision strike. To address the problem of Integrated Evasion and Impact (IEI) decision under multi-constraint conditions, a hierarchical intelligent decision-making method based on Deep Reinforcement Learning (DRL) was proposed. First, an intelligent decision-making framework of "DRL evasion decision" + "impact prediction guidance decision" was established: it takes the impact point deviation correction ability as the constraint and the maximum miss distance as the objective, and effectively solves the problem of poor decision-making effect caused by the large IEI decision space. Second, to solve the sparse reward problem faced by evasion decision-making, a hierarchical decision-making method consisting of maneuver timing decision and maneuver duration decision was proposed, and the corresponding Markov Decision Process (MDP) was designed. A detailed simulation experiment was designed to analyze the advantages and computational complexity of the proposed method. Simulation results show that the proposed model has good performance and low computational resource requirement. The minimum miss distance is 21.3 m under the condition of guaranteeing the impact point accuracy, and the single decision-making time is 4.086 ms on an STM32F407 single-chip microcomputer, which has engineering application value.
Keywords: Exo-atmospheric vehicle; Integrated evasion and impact; Deep reinforcement learning; Hierarchical intelligent policy; Single-chip microcomputer; Miss distance
An extended discontinuous deformation analysis for simulation of grouting reinforcement in a water-rich fractured rock tunnel (cited by: 1)
6
Authors: Jingyao Gao, Siyu Peng, Guangqi Chen, Hongyun Fan. Journal of Rock Mechanics and Geotechnical Engineering, 2025, Issue 1, pp. 168-186 (19 pages)
Grouting has been the most effective approach to mitigate water inrush disasters in underground engineering due to its ability to plug groundwater and enhance rock strength. Nevertheless, there is a lack of potent numerical tools for assessing the grouting effectiveness in water-rich fractured strata. In this study, the hydro-mechanical coupled discontinuous deformation analysis (HM-DDA) is extended for the first time to simulate the grouting process in a water-rich discrete fracture network (DFN), including the slurry migration, fracture dilation, water plugging in a seepage field, and joint reinforcement after coagulation. To validate the capabilities of the developed method, several numerical examples are conducted incorporating the Newtonian fluid and Bingham slurry. The simulation results closely align with the analytical solutions. Additionally, a set of compression tests is conducted on the fresh and grouted rock specimens to verify the reinforcement method and calibrate the rational properties of reinforced joints. An engineering-scale model based on a real water inrush case of the Yonglian tunnel in a water-rich fractured zone has been established. The model demonstrates the effectiveness of grouting reinforcement in mitigating water inrush disaster. The results indicate that increased grouting pressure greatly affects the regulation of water outflow from the tunnel face and the prevention of rock detachment from the face after excavation.
Keywords: Discontinuous deformation analysis (DDA); Water-rich fractured rock tunnel; Grouting reinforcement; Water inrush disaster
A Survey of Cooperative Multi-agent Reinforcement Learning for Multi-task Scenarios (cited by: 1)
7
Authors: Jiajun CHAI, Zijie ZHAO, Yuanheng ZHU, Dongbin ZHAO. Artificial Intelligence Science and Engineering, 2025, Issue 2, pp. 98-121 (24 pages)
Cooperative multi-agent reinforcement learning (MARL) is a key technology for enabling cooperation in complex multi-agent systems. It has achieved remarkable progress in areas such as gaming, autonomous driving, and multi-robot control. Empowering cooperative MARL with multi-task decision-making capabilities is expected to further broaden its application scope. In multi-task scenarios, cooperative MARL algorithms need to address 3 types of multi-task problems: reward-related multi-task, arising from different reward functions; multi-domain multi-task, caused by differences in state and action spaces and state transition functions; and scalability-related multi-task, resulting from the dynamic variation in the number of agents. Most existing studies focus on scalability-related multi-task problems. However, with the increasing integration between large language models (LLMs) and multi-agent systems, a growing number of LLM-based multi-agent systems have emerged, enabling more complex multi-task cooperation. This paper provides a comprehensive review of the latest advances in this field. By combining multi-task reinforcement learning with cooperative MARL, we categorize and analyze the 3 major types of multi-task problems under multi-agent settings, offering more fine-grained classifications and summarizing key insights for each. In addition, we summarize commonly used benchmarks and discuss future directions of research in this area, which hold promise for further enhancing the multi-task cooperation capabilities of multi-agent systems and expanding their practical applications in the real world.
Keywords: Multi-task; Multi-agent reinforcement learning; Large language models
Multi-QoS routing algorithm based on reinforcement learning for LEO satellite networks (cited by: 1)
8
Authors: ZHANG Yifan, DONG Tao, LIU Zhihui, JIN Shichao. Journal of Systems Engineering and Electronics, 2025, Issue 1, pp. 37-47 (11 pages)
Low Earth orbit (LEO) satellite networks exhibit distinct characteristics, e.g., limited resources of individual satellite nodes and dynamic network topology, which have brought many challenges for routing algorithms. To satisfy quality of service (QoS) requirements of various users, it is critical to research efficient routing strategies to fully utilize satellite resources. This paper proposes a multi-QoS information optimized routing algorithm based on reinforcement learning for LEO satellite networks, which guarantees high level assurance demand services to be prioritized under limited satellite resources while considering the load balancing performance of the satellite networks for low level assurance demand services to ensure the full and effective utilization of satellite resources. An auxiliary path search algorithm is proposed to accelerate the convergence of the satellite routing algorithm. Simulation results show that the generated routing strategy can timely process and fully meet the QoS demands of high assurance services while effectively improving the load balancing performance of the link.
Keywords: Low Earth orbit (LEO) satellite network; Reinforcement learning; Multi-quality of service (QoS); Routing algorithm
Intelligent path planning for small modular reactors based on improved reinforcement learning
9
Authors: DONG Yun-Feng, ZHOU Wei-Zheng, WANG Zhe-Zheng, ZHANG Xiao. Journal of Sichuan University (Natural Science Edition) (PKU Core journal), 2025, Issue 4, pp. 1006-1014 (9 pages)
The small modular reactor (SMR) is at the research forefront of nuclear reactor technology. The advancement of intelligent control technologies now opens a new way to design and build unmanned SMRs. The autonomous control process of an SMR can be divided into three stages, namely state diagnosis, autonomous decision-making, and coordinated control. In this paper, the autonomous state recognition and task planning of unmanned SMRs are investigated. An operating condition recognition method based on the knowledge base of SMR operation is proposed using artificial neural network (ANN) technology, which provides a basis for the state judgment of intelligent reactor control path planning. An improved reinforcement learning path planning algorithm is utilized to implement the path transfer decision-making. This algorithm performs condition transitions with minimal cost under specified modes. In summary, the full-range control path intelligent decision-planning technology of SMR is realized, thus providing some theoretical basis for the design and build of unmanned SMRs in the future.
Keywords: Small modular reactor; Operating condition recognition; Path planning; Reinforcement learning
Refractive status and histological changes after posterior scleral reinforcement in guinea pig
10
Authors: Yu-Yan Huang, Li-Yang Zhou, Guo-Fu Chen, Duo Peng, Miao-Zhen Pan, Ji-Bo Zhou, Jia Qu. International Journal of Ophthalmology (English edition), 2025, Issue 3, pp. 375-382 (8 pages)
AIM: To investigate the refractive and histological changes in guinea pig eyes after posterior scleral reinforcement with scleral allografts. METHODS: Four-week-old guinea pigs were implanted with scleral allografts, and the changes in refraction, corneal curvature, and axial length were monitored for 51 d. The effects of methylprednisolone (MPS) on refraction parameters were also evaluated. The microstructure and ultra-microstructure of the eyes were observed on days 9 and 51 after the operation. Repeated-measures analysis of variance and one-way analysis of variance were used. RESULTS: The refraction outcome of the implanted eye decreased after the operation, and the refraction change of the 3 mm scleral allografts group was significantly different from that of the control group (P=0.005) and the sham surgical group (P=0.004). After the application of MPS solution, the reduction of refraction outcome was significantly suppressed (P=0.008). The inflammatory encapsulation appeared 9 d after surgery. At 51 d after the operation, the loose implanted materials were absorbed, while the adherent implanted materials in the MPS group were still tightly attached to the recipient's eyeball. CONCLUSION: After implantation of scleral allografts, the refraction of guinea pig eyes fluctuated from a decrease to an increase. The outcome of the scleral allografts is affected by implantation methods and the inflammatory response. Stability of the material can be improved by MPS.
Keywords: Posterior scleral reinforcement; Methylprednisolone; Inflammation; Myopia; Guinea pig
Enhanced deep reinforcement learning for integrated navigation in multi-UAV systems
11
Authors: Zhengyang CAO, Gang CHEN. Chinese Journal of Aeronautics, 2025, Issue 8, pp. 119-138 (20 pages)
In multiple Unmanned Aerial Vehicle (UAV) systems, achieving efficient navigation is essential for executing complex tasks and enhancing autonomy. Traditional navigation methods depend on predefined control strategies and trajectory planning and often perform poorly in complex environments. To improve the UAV-environment interaction efficiency, this study proposes a multi-UAV integrated navigation algorithm based on Deep Reinforcement Learning (DRL). This algorithm integrates the Inertial Navigation System (INS), Global Navigation Satellite System (GNSS), and Visual Navigation System (VNS) for comprehensive information fusion. Specifically, an improved multi-UAV integrated navigation algorithm called Information Fusion with Multi-Agent Deep Deterministic Policy Gradient (IF-MADDPG) was developed. This algorithm enables UAVs to learn collaboratively and optimize their flight trajectories in real time. Through simulations and experiments, test scenarios in GNSS-denied environments were constructed to evaluate the effectiveness of the algorithm. The experimental results demonstrate that the IF-MADDPG algorithm significantly enhances the collaborative navigation capabilities of multiple UAVs in formation maintenance and GNSS-denied environments. Additionally, it has advantages in terms of mission completion time. This study provides a novel approach for efficient collaboration in multi-UAV systems, which significantly improves the robustness and adaptability of navigation systems.
Keywords: Multi-UAV system; Reinforcement learning; Integrated navigation; MADDPG; Information fusion
Pathfinder: Deep Reinforcement Learning-Based Scheduling for Multi-Robot Systems in Smart Factories with Mass Customization
12
Authors: Chenxi Lyu, Chen Dong, Qiancheng Xiong, Yuzhong Chen, Qian Weng, Zhenyi Chen. Computers, Materials & Continua, 2025, Issue 8, pp. 3371-3391 (21 pages)
The rapid advancement of Industry 4.0 has revolutionized manufacturing, shifting production from centralized control to decentralized, intelligent systems. Smart factories are now expected to achieve high adaptability and resource efficiency, particularly in mass customization scenarios where production schedules must accommodate dynamic and personalized demands. To address the challenges of dynamic task allocation, uncertainty, and real-time decision-making, this paper proposes Pathfinder, a deep reinforcement learning-based scheduling framework. Pathfinder models scheduling data through three key matrices: execution time (the time required for a job to complete), completion time (the actual time at which a job is finished), and efficiency (the performance of executing a single job). By leveraging neural networks, Pathfinder extracts essential features from these matrices, enabling intelligent decision-making in dynamic production environments. Unlike traditional approaches with fixed scheduling rules, Pathfinder dynamically selects from ten diverse scheduling rules, optimizing decisions based on real-time environmental conditions. To further enhance scheduling efficiency, a specialized reward function is designed to support dynamic task allocation and real-time adjustments. This function helps Pathfinder continuously refine its scheduling strategy, improving machine utilization and minimizing job completion times. Through reinforcement learning, Pathfinder adapts to evolving production demands, ensuring robust performance in real-world applications. Experimental results demonstrate that Pathfinder outperforms traditional scheduling approaches, offering improved coordination and efficiency in smart factories. By integrating deep reinforcement learning, adaptable scheduling strategies, and an innovative reward function, Pathfinder provides an effective solution to the growing challenges of multi-robot job scheduling in mass customization environments.
Keywords: Smart factory; Customization; Deep reinforcement learning; Production scheduling; Multi-robot system; Task allocation
Reinforcement Learning for Solving the Knapsack Problem
13
Authors: Zhenfu Zhang, Haiyan Yin, Liudong Zuo, Pan Lai. Computers, Materials & Continua, 2025, Issue 7, pp. 919-936 (18 pages)
The knapsack problem is a classical combinatorial optimization problem widely encountered in areas such as logistics, resource allocation, and portfolio optimization. Traditional methods, including dynamic programming (DP) and greedy algorithms, have been effective in solving small problem instances but often struggle with scalability and efficiency as the problem size increases. DP, for instance, has exponential time complexity and can become computationally prohibitive for large problem instances. On the other hand, greedy algorithms offer faster solutions but may not always yield the optimal results, especially when the problem involves complex constraints or large numbers of items. This paper introduces a novel reinforcement learning (RL) approach to solve the knapsack problem by enhancing the state representation within the learning environment. We propose a representation where item weights and volumes are expressed as ratios relative to the knapsack's capacity, and item values are normalized to represent their percentage of the total value across all items. This novel state modification leads to a 5% improvement in accuracy compared to the state-of-the-art RL-based algorithms, while significantly reducing execution time. Our RL-based method outperforms DP by over 9000 times in terms of speed, making it highly scalable for larger problem instances. Furthermore, we improve the performance of the RL model by incorporating Noisy layers into the neural network architecture. The addition of Noisy layers enhances the exploration capabilities of the agent, resulting in an additional accuracy boost of 0.2%–0.5%. The results demonstrate that our approach not only outperforms existing RL techniques, such as the Transformer model, in terms of accuracy, but also provides a substantial improvement over DP in computational efficiency. This combination of enhanced accuracy and speed presents a promising solution for tackling large-scale optimization problems in real-world applications, where both precision and time are critical factors.
Keywords: Knapsack problem; Reinforcement learning; State modification; Noisy layers; Neural networks; Accuracy improvement; Efficiency enhancement
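The state representation described in the abstract above (weights and volumes as ratios of the knapsack's capacity, values as shares of the total value) can be sketched as follows. This is an illustrative reading of that one sentence, not the authors' code: the function name `normalize_state`, the separate `weight_cap` and `volume_cap` arguments, and the tuple layout are assumptions.

```python
def normalize_state(weights, volumes, values, weight_cap, volume_cap):
    """Return one (weight_ratio, volume_ratio, value_share) tuple per item.

    weight_ratio and volume_ratio are relative to the knapsack capacities;
    value_share is the item's fraction of the total value of all items.
    """
    if not (len(weights) == len(volumes) == len(values)):
        raise ValueError("weights, volumes and values must have equal length")
    total_value = sum(values)
    if weight_cap <= 0 or volume_cap <= 0 or total_value <= 0:
        raise ValueError("capacities and total value must be positive")
    return [
        (w / weight_cap, v / volume_cap, p / total_value)
        for w, v, p in zip(weights, volumes, values)
    ]


if __name__ == "__main__":
    # Two hypothetical items in a knapsack with weight capacity 10 and volume capacity 8.
    for row in normalize_state([2, 5], [1, 4], [30, 70], 10, 8):
        print(row)
```

A representation like this is scale-free: two instances that differ only by a constant factor in capacity and item sizes map to the same state, which is one plausible reason such a normalization could help a learned policy transfer across instance sizes.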
Decision-making and confrontation in close-range air combat based on reinforcement learning
14
Authors: Mengchao YANG, Shengzhe SHAN, Weiwei ZHANG. Chinese Journal of Aeronautics, 2025, Issue 9, pp. 401-420 (20 pages)
The high maneuverability of modern fighters in close air combat imposes significant cognitive demands on pilots, making rapid, accurate decision-making challenging. While reinforcement learning (RL) has shown promise in this domain, the existing methods often lack strategic depth and generalization in complex, high-dimensional environments. To address these limitations, this paper proposes an optimized self-play method enhanced by advancements in fighter modeling, neural network design, and algorithmic frameworks. This study employs a six-degree-of-freedom (6-DOF) F-16 fighter model based on open-source aerodynamic data, featuring airborne equipment and a realistic visual simulation platform, unlike traditional 3-DOF models. To capture temporal dynamics, Long Short-Term Memory (LSTM) layers are integrated into the neural network, complemented by delayed input stacking. The RL environment incorporates expert strategies, curiosity-driven rewards, and curriculum learning to improve adaptability and strategic decision-making. Experimental results demonstrate that the proposed approach achieves a winning rate exceeding 90% against classical single-agent methods. Additionally, through enhanced 3D visual platforms, we conducted human-agent confrontation experiments, where the agent attained an average winning rate of over 75%. The agent's maneuver trajectories closely align with human pilot strategies, showcasing its potential in decision-making and pilot training applications. This study highlights the effectiveness of integrating advanced modeling and self-play techniques in developing robust air combat decision-making systems.
Keywords: Air combat; Decision making; Flight simulation; Reinforcement learning; Self-play
An Automatic Damage Detection Method Based on Adaptive Theory-Assisted Reinforcement Learning
15
Authors: Chengwen Zhang, Qing Chun, Yijie Lin. Engineering, 2025, Issue 7, pp. 188-202 (15 pages)
Current damage detection methods based on model updating and sensitivity Jacobian matrices show a low convergence ratio and computational efficiency for online calculations. The aim of this paper is to construct a real-time automated damage detection method by developing a theory-assisted adaptive multiagent twin delayed deep deterministic (TA2-MATD3) policy gradient algorithm. First, the theoretical framework of reinforcement-learning-driven damage detection is established. To address the disadvantages of the traditional multiagent twin delayed deep deterministic (MATD3) method, the theory-assisted mechanism and the adaptive experience playback mechanism are introduced. Moreover, a historical residential house built in 1889 was taken as an example, using its 12-month structural health monitoring data. TA2-MATD3 was compared with existing damage detection methods in terms of the convergence ratio, online computing efficiency, and damage detection accuracy. The results show that the computational efficiency of TA2-MATD3 is approximately 117–160 times that of the traditional methods. The convergence ratio of damage detection on the training set is approximately 97%, and that on the test set is in the range of 86.2%–91.9%. In addition, the main apparent damages found in the field survey were identified by TA2-MATD3. The results indicate that the proposed method can significantly improve the online computing efficiency and damage detection accuracy. This research can provide novel perspectives for the use of reinforcement learning methods to conduct damage detection in online structural health monitoring.
Keywords: Reinforcement learning; Theory-assisted; Damage detection; Newton's method; Model updating; Architectural heritage
Privacy Preserving Federated Anomaly Detection in IoT Edge Computing Using Bayesian Game Reinforcement Learning
16
Authors: Fatima Asiri, Wajdan Al Malwi, Fahad Masood, Mohammed S. Alshehri, Tamara Zhukabayeva, Syed Aziz Shah, Jawad Ahmad. Computers, Materials & Continua, 2025, Issue 8, pp. 3943-3960 (18 pages)
Edge computing (EC) combined with the Internet of Things (IoT) provides a scalable and efficient solution for smart homes. The rapid proliferation of IoT devices poses real-time data processing and security challenges. EC has become a transformative paradigm for addressing these challenges, particularly in intrusion detection and anomaly mitigation. The widespread connectivity of IoT edge networks has exposed them to various security threats, necessitating robust strategies to detect malicious activities. This research presents a privacy-preserving federated anomaly detection framework combined with Bayesian game theory (BGT) and double deep Q-learning (DDQL). The proposed framework integrates BGT to model attacker and defender interactions for dynamic threat level adaptation and resource availability. It also models a strategic layout between attackers and defenders that takes into account uncertainty. DDQL is incorporated to optimize decision-making and aids in learning optimal defense policies at the edge, thereby ensuring policy and decision optimization. Federated learning (FL) enables decentralized and unshared anomaly detection for sensitive data between devices. Data collection has been performed from various sensors in a real-time EC-IoT network to identify irregularities that occurred due to different attacks. The results reveal that the proposed model achieves high detection accuracy of up to 98% while maintaining low resource consumption. This study demonstrates the synergy between game theory and FL to strengthen anomaly detection in EC-IoT networks.
Keywords: IoT; edge computing; smart homes; anomaly detection; Bayesian game theory; reinforcement learning
Semantic Knowledge Based Reinforcement Learning Formalism for Smart Learning Environments
17
Authors: Taimoor Hassan, Ibrar Hussain, Hafiz Mahfooz Ul Haque, Hamid Turab Mirza, Muhammad Nadeem Ali, Byung-Seo Kim. Computers, Materials & Continua, 2025, No. 10, pp. 2071-2094 (24 pages)
Smart learning environments have been considered as vital sources and essential needs in modern digital education systems. With the rapid proliferation of smart and assistive technologies, smart learning processes have become quite convenient, comfortable, and financially affordable. This shift has led to the emergence of pervasive computing environments, where the user’s intelligent behavior is supported by smart gadgets; however, it is becoming more challenging due to the inconsistent behavior of artificial intelligence (AI) assistive technologies in terms of networking issues, slow user responses to technologies, and limited computational resources. This paper presents a context-aware predictive reasoning based formalism for smart learning environments that helps students manage their academic as well as extra-curricular activities autonomously with limited human intervention. The system consists of a three-tier architecture: acquiring contextualized information from the environment autonomously, modeling the system using Web Ontology Rule Language (OWL 2 RL) and Semantic Web Rule Language (SWRL), and performing reasoning to infer the desired goals whenever and wherever needed. For contextual reasoning, we develop a non-monotonic reasoning based formalism to reason with contextual information using rule-based reasoning. The focus is on distributed problem solving, where context-aware agents exchange information using rule-based reasoning and specify constraints to accomplish desired goals. To formally model-check and simulate the system behavior, we model the case study of a smart learning environment in the UPPAAL model checker and verify the desired properties in the model, such as safety, liveness, and robustness properties, to reflect the overall correctness of the system's behavior, achieving a minimum analysis time of 0.002 s and 34,712 KB memory utilization.
Keywords: context-awareness; reinforcement learning; multi-agent systems; non-monotonic reasoning; formal verification
Ground reaction curves for strain-softening rock masses with ground reinforcement based on unified strength criterion
18
Authors: CHEN Xuan-hao, ZHANG Ding-li, SUN Zhen-yu, CHEN Wen-bo. Journal of Central South University, 2025, No. 9, pp. 3383-3404 (22 pages)
Ground reinforcement is crucial for tunnel construction, especially in soft rock tunnels. Existing analytical models are inadequate for predicting the ground reaction curves (GRCs) for reinforced tunnels in strain-softening (SS) rock masses. This study proposes a novel analytical model to determine the GRCs of SS rock masses, incorporating ground reinforcement and intermediate principal stress (IPS). The SS constitutive model captures the progressive post-peak failure, while the elastic-brittle model simulates reinforced rock masses. Nine combined states are investigated to analyze plastic zone development in natural and reinforced regions. Each region is analyzed separately, and the regions are coupled through boundary conditions at the interface. Comparison with three types of existing models indicates that these models overestimate reinforcement effects. The deformation prediction errors of single geological material models may exceed 75%. Furthermore, neglecting softening and residual zones in natural regions could lead to errors over 50%. Considering the IPS can effectively utilize the rock strength to reduce tunnel deformation by at least 30%, thereby saving on reinforcement and support costs. The computational results show a satisfactory agreement with the monitoring data from a model test and two tunnel projects. The proposed model may offer valuable insights into the design and construction of reinforced tunnel engineering.
Keywords: ground reinforcement; strain-softening; unified strength criterion; tunnel responses; analytical model
C-SPPO:A deep reinforcement learning framework for large-scale dynamic logistics UAV routing problem
19
Authors: Fei WANG, Honghai ZHANG, Sen DU, Mingzhuang HUA, Gang ZHONG. Chinese Journal of Aeronautics, 2025, No. 5, pp. 296-316 (21 pages)
The Unmanned Aerial Vehicle (UAV) stands as a burgeoning electric transportation carrier, holding substantial promise for the logistics sector. A reinforcement learning framework, Centralized-S Proximal Policy Optimization (C-SPPO), based on a centralized decision process and considering policy entropy (S), is proposed. The proposed framework aims to plan the best scheduling scheme with the objective of minimizing both the timeout of order requests and the flight impact of UAVs that may lead to conflicts. In this framework, the intents of matching actions are generated through the observations of UAV agents, and the ultimate conflict-free matching results are output under the guidance of a centralized decision maker. Concurrently, a pre-activation operation is introduced to further enhance the cooperation among UAV agents. Simulation experiments based on real-world data from New York City are conducted. The results indicate that the proposed C-SPPO outperforms the baseline algorithms in the Average Delay Time (ADT), the Maximum Delay Time (MDT), the Order Delay Rate (ODR), the Average Flight Distance (AFD), and the Flight Impact Ratio (FIR). Furthermore, the framework demonstrates scalability to scenarios of different sizes without requiring additional training.
Keywords: Unmanned aerial vehicle; Vehicle routing problem; Order delivery; Reinforcement learning; Multi-agent; Proximal policy optimization
Achievement of Fish School Milling Motion Based on Distributed Multi-agent Reinforcement Learning
20
Authors: Jincun Liu, Yinjie Ren, Yang Liu, Yan Meng, Dong An, Yaoguang Wei. Journal of Bionic Engineering, 2025, No. 4, pp. 1683-1701 (19 pages)
In recent years, significant research attention has been directed towards swarm intelligence. The milling behavior of fish schools, a prime example of swarm intelligence, shows how simple rules followed by individual agents lead to complex collective behaviors. This paper studies Multi-Agent Reinforcement Learning to simulate fish schooling behavior, overcoming the challenges of tuning parameters in traditional models and addressing the limitations of single-agent methods in multi-agent environments. Based on this foundation, a novel Graph Convolutional Networks (GCN)-Critic MADDPG algorithm leveraging GCN is proposed to enhance cooperation among agents in a multi-agent system. Simulation experiments demonstrate that, compared to traditional single-agent algorithms, the proposed method not only exhibits significant advantages in terms of convergence speed and stability but also achieves tighter group formations and more naturally aligned milling behavior. Additionally, a fish school self-organizing behavior research platform based on an event-triggered mechanism has been developed, providing a robust tool for exploring dynamic behavioral changes under various conditions.
Keywords: Collective motion; Collective behavior; Self-organization; Fish school; Multi-agent reinforcement learning