Journal Articles
314 articles found
1. Recent Progress in Reinforcement Learning and Adaptive Dynamic Programming for Advanced Control Applications (cited 14 times)
Authors: Ding Wang, Ning Gao, Derong Liu, Jinna Li, Frank L. Lewis. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, No. 1, pp. 18-36 (19 pages)
Reinforcement learning (RL) has roots in dynamic programming and is called adaptive/approximate dynamic programming (ADP) within the control community. This paper reviews recent developments in ADP along with RL and its applications to various advanced control fields. First, the background of the development of ADP is described, emphasizing the significance of regulation and tracking control problems. Some effective offline and online algorithms for ADP/adaptive critic control are displayed, where the main results for discrete-time systems and continuous-time systems are surveyed, respectively. Then, the research progress on adaptive critic control based on the event-triggered framework and under uncertain environments is discussed, where event-based design, robust stabilization, and game design are reviewed. Moreover, extensions of ADP for addressing control problems under complex environments have attracted enormous attention. The ADP architecture is revisited from the perspective of data-driven and RL frameworks, showing how they significantly advance the ADP formulation. Finally, several typical control applications of RL and ADP are summarized, particularly in the fields of wastewater treatment processes and power systems, followed by some general prospects for future research. Overall, this comprehensive survey of ADP and RL for advanced control applications demonstrates their remarkable potential in the artificial intelligence era, as well as their vital role in promoting environmental protection and industrial intelligence.
Keywords: adaptive dynamic programming (ADP), advanced control, complex environment, data-driven control, event-triggered design, intelligent control, neural networks, nonlinear systems, optimal control, reinforcement learning (RL)
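The survey's starting point, that RL and ADP grow out of dynamic programming, reduces to the Bellman optimality backup. As a minimal illustration (the three-state MDP below, with its transition matrices, rewards and discount factor, is invented purely for this sketch), value iteration in Python looks like:

```python
import numpy as np

# Value iteration on an invented toy MDP with 3 states and 2 actions.
# P[a][s, s2] is the transition probability, R[s, a] the expected reward.
P = [np.array([[0.9, 0.1, 0.0],
               [0.0, 0.9, 0.1],
               [0.1, 0.0, 0.9]]),
     np.array([[0.1, 0.9, 0.0],
               [0.0, 0.1, 0.9],
               [0.9, 0.0, 0.1]])]
R = np.array([[0.0, 1.0],
              [0.0, 2.0],
              [5.0, 0.0]])
gamma = 0.9

V = np.zeros(3)
for _ in range(500):
    # Bellman optimality backup: Q(s,a) = R(s,a) + gamma * E[V(s')]
    Q = np.stack([R[:, a] + gamma * P[a] @ V for a in range(2)], axis=1)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:  # numerical fixed point reached
        V = V_new
        break
    V = V_new

policy = Q.argmax(axis=1)  # greedy policy w.r.t. the converged values
```

ADP/adaptive critic methods replace the exact table `V` with a trained approximator and the known `P`, `R` with data, but the backup being approximated is the same.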
2. Data-based neural controls for an unknown continuous-time multi-input system with integral reinforcement
Authors: Yongfeng Lv, Jun Zhao, Wan Zhang, Huimin Chang. Control Theory and Technology, 2025, No. 1, pp. 118-130 (13 pages)
Integral reinforcement learning (IRL) is an effective tool for solving optimal control problems of nonlinear systems, and it has been widely utilized in optimal controller design for discrete-time nonlinear systems. However, solving the Hamilton-Jacobi-Bellman (HJB) equations for nonlinear systems requires precise and complicated dynamics. Moreover, the research and application of IRL in continuous-time (CT) systems needs further development. To develop IRL for CT nonlinear systems, a data-based adaptive neural dynamic programming (ANDP) method is proposed to investigate the optimal control problem of uncertain CT multi-input systems, such that knowledge of the dynamics in the HJB equation is unnecessary. First, the multi-input model is approximated using a neural network (NN), which can be utilized to design an integral reinforcement signal. Subsequently, two criterion networks and one action network are constructed based on the integral reinforcement signal. A nonzero-sum Nash equilibrium can be reached by learning the optimal strategies of the multi-input model. In this scheme, the NN weights are constantly updated using an adaptive algorithm. The weight convergence and the system stability are analyzed in detail. The optimal control problem of a multi-input nonlinear CT system is effectively solved using the ANDP scheme, and the results are verified by a simulation study.
Keywords: adaptive dynamic programming, integral reinforcement, neural networks, heuristic dynamic programming, multi-input system
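The integral reinforcement signal can be made concrete on a scalar linear system. The sketch below is a toy policy-evaluation step only, not the full ANDP scheme; the plant parameters `a`, `b`, the fixed gain `k` and the interval `T` are all invented. The point is that the value of the current policy is identified purely from measured windows of the running cost, with no model knowledge used at evaluation time:

```python
import numpy as np

# IRL policy evaluation for the scalar system xdot = a*x + b*u under the
# fixed feedback u = -k*x, with running cost x^2 + u^2. All numbers invented.
a, b, k = 1.0, 1.0, 2.0          # closed loop xdot = (a - b*k)*x is stable
T, dt = 0.5, 1e-4                # reinforcement interval and simulation step

def rollout(x0):
    """Simulate one interval; return (integral reinforcement, x(T))."""
    x, J = x0, 0.0
    for _ in range(int(T / dt)):
        u = -k * x
        J += (x * x + u * u) * dt    # accumulate the integral reinforcement
        x += (a * x + b * u) * dt    # Euler step (fine step for accuracy)
    return J, x

# IRL Bellman equation for V(x) = p*x^2:  p*x0^2 = J + p*x(T)^2,
# i.e. p * (x0^2 - xT^2) = J, fitted over several measured windows.
lhs, rhs = [], []
for x0 in (0.5, 1.0, 2.0):
    J, xT = rollout(x0)
    lhs.append(x0 ** 2 - xT ** 2)
    rhs.append(J)
p = np.dot(lhs, rhs) / np.dot(lhs, lhs)    # least-squares fit of p

p_true = (1 + k ** 2) / (2 * (b * k - a))  # analytic value for this toy system
```

For the quadratic cost and stable gain chosen here, the identified `p` matches the analytic value; the dynamics coefficient `a` enters only the simulator, never the evaluation step.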
3. A Survey on Reinforcement Learning for Optimal Decision-Making and Control of Intelligent Vehicles
Authors: Yixing Lan, Xin Xu, Jiahang Liu, Xinglong Zhang, Yang Lu, Long Cheng. CAAI Transactions on Intelligence Technology, 2025, No. 6, pp. 1593-1615 (23 pages)
Reinforcement learning (RL) has been widely studied as an efficient class of machine learning methods for adaptive optimal control under uncertainties. In recent years, the applications of RL in optimised decision-making and motion control of intelligent vehicles have received increasing attention. Due to the complex and dynamic operating environments of intelligent vehicles, it is necessary to improve the learning efficiency and generalisation ability of RL-based decision and control algorithms under different conditions. This survey systematically examines the theoretical foundations, algorithmic advancements and practical challenges of applying RL to intelligent vehicle systems operating in complex and dynamic environments. The major algorithmic frameworks of RL are first introduced, and recent advances in RL-based decision-making and control of intelligent vehicles are overviewed. In addition to self-learning decision and control approaches using state measurements, developments in deep reinforcement learning (DRL) methods for end-to-end driving control of intelligent vehicles are summarised. Open problems and directions for further research are also discussed.
Keywords: adaptive dynamic programming, intelligent vehicles, learning control, optimal decision-making, reinforcement learning
4. Combining reinforcement learning with mathematical programming: An approach for optimal design of heat exchanger networks
Authors: Hui Tan, Xiaodong Hong, Zuwei Liao, Jingyuan Sun, Yao Yang, Jingdai Wang, Yongrong Yang. Chinese Journal of Chemical Engineering (SCIE, EI, CAS, CSCD), 2024, No. 5, pp. 63-71 (9 pages)
Heat integration is important for energy saving in the process industry. It is linked to the persistently challenging task of optimal design of heat exchanger networks (HEN). Due to the inherently nonconvex, nonlinear and combinatorial nature of the HEN problem, it is not easy to find high-quality solutions for large-scale problems. The reinforcement learning (RL) method, which learns strategies through ongoing exploration and exploitation, shows advantages in this area. However, due to the complexity of the HEN design problem, a dedicated RL method must be designed for HEN. A hybrid strategy combining RL with mathematical programming is proposed to take better advantage of both methods. An insightful state representation of the HEN structure as well as a customized reward function is introduced. A Q-learning algorithm is applied to update the HEN structure using the ε-greedy strategy. Better results are obtained on three literature cases of different scales.
Keywords: heat exchanger network, reinforcement learning, mathematical programming, process design
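The Q-learning update with ε-greedy exploration named in the abstract is the standard one; only the paper's HEN state encoding and reward are bespoke. A hedged toy sketch (the 5-state chain environment, learning rates and episode counts below are invented; the real method acts on HEN structures, not a chain):

```python
import random

# Tabular Q-learning with an epsilon-greedy policy on an invented 5-state
# chain: action 1 moves right, action 0 moves left; reaching the right end
# pays reward 1 and ends the episode.
N_STATES, ACTIONS = 5, (0, 1)
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    return s2, float(s2 == N_STATES - 1), s2 == N_STATES - 1  # (s', r, done)

def greedy(q):
    m = max(q)
    return random.choice([a for a in ACTIONS if q[a] == m])   # random tie-break

random.seed(0)
Q = [[0.0, 0.0] for _ in range(N_STATES)]
for _ in range(500):                 # episodes
    s = 0
    for _ in range(50):              # step cap per episode
        a = random.choice(ACTIONS) if random.random() < EPS else greedy(Q[s])
        s2, r, done = step(s, a)
        # Q-learning: move Q(s,a) toward r + gamma * max_a' Q(s',a')
        Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
        s = s2
        if done:
            break

policy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES)]
```

The ε-greedy behavior policy keeps exploring while the learned greedy policy (move right everywhere) emerges from the table; the hybrid strategy of the paper embeds this loop inside a mathematical-programming evaluation of each candidate network.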
5. Call for papers: Journal of Control Theory and Applications special issue on approximate dynamic programming and reinforcement learning
Journal of Control Theory and Applications (English edition, EI), 2010, No. 2, p. 257 (1 page)
Approximate dynamic programming (ADP) is a general and effective approach for solving optimal control and estimation problems by adapting to uncertain and nonconvex environments over time.
Keywords: call for papers, Journal of Control Theory and Applications, special issue, approximate dynamic programming, reinforcement learning
6. Feature-Based Aggregation and Deep Reinforcement Learning: A Survey and Some New Implementations (cited 15 times)
Author: Dimitri P. Bertsekas. IEEE/CAA Journal of Automatica Sinica (EI, CSCD), 2019, No. 1, pp. 1-31 (31 pages)
In this paper we discuss policy iteration methods for approximate solution of a finite-state discounted Markov decision problem, with a focus on feature-based aggregation methods and their connection with deep reinforcement learning schemes. We introduce features of the states of the original problem, and we formulate a smaller "aggregate" Markov decision problem, whose states relate to the features. We discuss properties and possible implementations of this type of aggregation, including a new approach to approximate policy iteration. In this approach the policy improvement operation combines feature-based aggregation with feature construction using deep neural networks or other calculations. We argue that the cost function of a policy may be approximated much more accurately by the nonlinear function of the features provided by aggregation than by the linear function of the features provided by neural-network-based reinforcement learning, thereby potentially leading to more effective policy improvement.
Keywords: reinforcement learning, dynamic programming, Markovian decision problems, aggregation, feature-based architectures, policy iteration, deep neural networks, rollout algorithms
7. Multiagent Reinforcement Learning: Rollout and Policy Iteration (cited 3 times)
Author: Dimitri Bertsekas. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2021, No. 2, pp. 249-272 (24 pages)
We discuss the solution of complex multistage decision problems using methods that are based on the idea of policy iteration (PI), i.e., start from some base policy and generate an improved policy. Rollout is the simplest method of this type, where just one improved policy is generated. We can view PI as repeated application of rollout, where the rollout policy at each iteration serves as the base policy for the next iteration. In contrast with PI, rollout has a robustness property: it can be applied on-line and is suitable for on-line replanning. Moreover, rollout can use as base policy one of the policies produced by PI, thereby improving on that policy. This is the type of scheme underlying the prominently successful Alpha Zero chess program. In this paper we focus on rollout and PI-like methods for problems where the control consists of multiple components, each selected (conceptually) by a separate agent. This is the class of multiagent problems where the agents have a shared objective function, and shared and perfect state information. Based on a problem reformulation that trades off control space complexity with state space complexity, we develop an approach whereby at every stage the agents sequentially (one-at-a-time) execute a local rollout algorithm that uses a base policy, together with some coordinating information from the other agents. The amount of total computation required at every stage grows linearly with the number of agents. By contrast, in the standard rollout algorithm, the amount of total computation grows exponentially with the number of agents. Despite the dramatic reduction in required computation, we show that our multiagent rollout algorithm has the fundamental cost improvement property of standard rollout: it guarantees improved performance relative to the base policy. We also discuss autonomous multiagent rollout schemes that allow the agents to make decisions autonomously through the use of precomputed signaling information, which is sufficient to maintain the cost improvement property without any on-line coordination of control selection between the agents. For discounted and other infinite horizon problems, we also consider exact and approximate PI algorithms involving a new type of one-agent-at-a-time policy improvement operation. For one of our PI algorithms, we prove convergence to an agent-by-agent optimal policy, thus establishing a connection with the theory of teams. For another PI algorithm, which is executed over a more complex state space, we prove convergence to an optimal policy. Approximate forms of these algorithms are also given, based on the use of policy and value neural networks. These PI algorithms, in both their exact and their approximate form, are strictly off-line methods, but they can be used to provide a base policy for use in an on-line multiagent rollout scheme.
Keywords: dynamic programming, multiagent problems, neuro-dynamic programming, policy iteration, reinforcement learning, rollout
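The linear-vs-exponential computation claim can be checked on a toy one-stage problem. In the sketch below (the cost function, action set, agent count and all-zeros base policy are invented), standard rollout scores every joint action, while agent-by-agent rollout lets agent i optimize only its own component, with earlier agents fixed to their new choices and later agents following the base policy:

```python
from itertools import product

# m agents each pick an action from a small set; a known cost scores the
# joint choice. Cost and base policy are invented toy examples.
ACTIONS = [0, 1, 2]
M = 4                               # number of agents
base = [0] * M                      # base policy: every agent plays 0

def cost(joint):                    # separable cost plus a coupling term
    return sum((a - 1) ** 2 for a in joint) + 0.5 * sum(
        abs(joint[i] - joint[i - 1]) for i in range(1, M))

# Standard rollout: exhaustively score all |A|^m joint actions.
evals_joint, best_joint, best_cost = 0, None, float("inf")
for joint in product(ACTIONS, repeat=M):
    evals_joint += 1
    c = cost(joint)
    if c < best_cost:
        best_joint, best_cost = joint, c

# Agent-by-agent rollout: agent i optimizes its own component, with agents
# < i fixed to their already-chosen actions and agents > i following base.
evals_seq, chosen = 0, list(base)
for i in range(M):
    best_a, best_c = None, float("inf")
    for a in ACTIONS:
        evals_seq += 1
        c = cost(tuple(chosen[:i] + [a] + base[i + 1:]))
        if c < best_c:
            best_a, best_c = a, c
    chosen[i] = best_a
```

Here the sequential scheme needs |A|·m = 12 evaluations instead of |A|^m = 81 and still recovers the joint optimum; in general it only guarantees improvement over the base policy, not joint optimality.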
8. Feature Selection and Feature Learning for High-dimensional Batch Reinforcement Learning: A Survey (cited 2 times)
Authors: De-Rong Liu, Hong-Liang Li, Ding Wang. International Journal of Automation and Computing (EI, CSCD), 2015, No. 3, pp. 229-242 (14 pages)
Tremendous amounts of data are being generated and saved in many complex engineering and social systems every day. It is significant and feasible to utilize this big data to make better decisions by machine learning techniques. In this paper, we focus on batch reinforcement learning (RL) algorithms for discounted Markov decision processes (MDPs) with large discrete or continuous state spaces, aiming to learn the best possible policy given a fixed amount of training data. Batch RL algorithms with handcrafted feature representations work well for low-dimensional MDPs. However, for many real-world RL tasks, which often involve high-dimensional state spaces, it is difficult and even infeasible to use feature engineering methods to design features for value function approximation. To cope with high-dimensional RL problems, the desire to obtain data-driven features has led to a lot of work on incorporating feature selection and feature learning into traditional batch RL algorithms. In this paper, we provide a comprehensive survey on automatic feature selection and unsupervised feature learning for high-dimensional batch RL. Moreover, we present recent theoretical developments on applying statistical learning to establish finite-sample error bounds for batch RL algorithms based on weighted Lp-norms. Finally, we outline some future directions in the research of RL algorithms, theories and applications.
Keywords: intelligent control, reinforcement learning, adaptive dynamic programming, feature selection, feature learning, big data
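The survey's setting, learning the best policy from a fixed batch with handcrafted features, is captured by fitted Q-iteration. A hedged sketch (the chain environment, one-hot feature map and batch size are invented; real batch RL replaces them with domain features and logged data):

```python
import numpy as np

# Fitted Q-iteration from a FIXED batch of transitions (s, a, r, s'),
# with a handcrafted one-hot feature map and linear value approximation.
rng = np.random.default_rng(0)
N_S, N_A, GAMMA = 5, 2, 0.9

def phi(s, a):                 # handcrafted features: one-hot over (s, a)
    f = np.zeros(N_S * N_A)
    f[s * N_A + a] = 1.0
    return f

def true_step(s, a):           # invented chain dynamics, used only to log data
    s2 = max(0, s - 1) if a == 0 else min(N_S - 1, s + 1)
    return s2, float(s2 == N_S - 1)

# Log a fixed batch under random exploration; no further interaction afterwards.
batch = []
for _ in range(2000):
    s, a = int(rng.integers(N_S)), int(rng.integers(N_A))
    s2, r = true_step(s, a)
    batch.append((s, a, r, s2))

X = np.array([phi(s, a) for s, a, _, _ in batch])
w = np.zeros(N_S * N_A)
for _ in range(150):           # FQI: repeatedly regress onto bootstrapped targets
    y = np.array([r + GAMMA * max(phi(s2, b) @ w for b in range(N_A))
                  for _, _, r, s2 in batch])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)

policy = [int(np.argmax([phi(s, a) @ w for a in range(N_A)])) for s in range(N_S)]
```

With one-hot features this regression is exact, so FQI reduces to value iteration on the logged data; the high-dimensional methods the survey covers swap `phi` for selected or learned features.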
9. Locally generalised multi-agent reinforcement learning for demand and capacity balancing with customised neural networks (cited 2 times)
Authors: Yutong Chen, Minghua Hu, Yan Xu, Lei Yang. Chinese Journal of Aeronautics (SCIE, EI, CAS, CSCD), 2023, No. 4, pp. 338-353 (16 pages)
Reinforcement learning (RL) techniques are being studied to solve Demand and Capacity Balancing (DCB) problems to fully exploit their computational performance. A locally generalised Multi-Agent Reinforcement Learning (MARL) method for real-world DCB problems is proposed. The proposed method can deploy trained agents directly to unseen scenarios in a specific Air Traffic Flow Management (ATFM) region to quickly obtain a satisfactory solution. In this method, agents of all flights in a scenario form a multi-agent decision-making system based on partial observation. The trained agent with the customised neural network can be deployed directly on the corresponding flight, allowing it to solve the DCB problem jointly. A cooperation coefficient is introduced in the reward function, which is used to adjust the agent's cooperation preference in a multi-agent system, thereby controlling the distribution of flight delay time allocation. A multi-iteration mechanism is designed for the DCB decision-making framework to deal with problems arising from non-stationarity in MARL and to ensure that all hotspots are eliminated. Experiments based on large-scale, high-complexity real-world scenarios are conducted to verify the effectiveness and efficiency of the method. From a statistical point of view, it is proven that the proposed method generalises within the scope of the flights and sectors of interest, and its optimisation performance outperforms standard computer-assisted slot allocation and state-of-the-art RL-based DCB methods. A sensitivity analysis preliminarily reveals the effect of the cooperation coefficient on delay time allocation.
Keywords: air traffic flow management, demand and capacity balancing, deep Q-learning network, flight delays, generalisation, ground delay program, multi-agent reinforcement learning
10. Robotic Knee Tracking Control to Mimic the Intact Human Knee Profile Based on Actor-Critic Reinforcement Learning (cited 2 times)
Authors: Ruofan Wu, Zhikai Yao, Jennie Si, He (Helen) Huang. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2022, No. 1, pp. 19-30 (12 pages)
We address a state-of-the-art reinforcement learning (RL) control approach to automatically configure robotic prosthesis impedance parameters to enable end-to-end, continuous locomotion intended for transfemoral amputee subjects. Specifically, our actor-critic based RL provides tracking control of a robotic knee prosthesis to mimic the intact knee profile. This is a significant advance from our previous RL-based automatic tuning of prosthesis control parameters, which centered on regulation control with a designer-prescribed robotic knee profile as the target. In addition to presenting the tracking control algorithm based on direct heuristic dynamic programming (dHDP), we provide a control performance guarantee including the case of constrained inputs. We show that our proposed tracking control possesses several important properties, such as weight convergence of the learning networks, Bellman (sub)optimality of the cost-to-go value function and control input, and practical stability of the human-robot system. We further provide a systematic simulation of the proposed tracking control using a realistic human-robot system simulator, OpenSim, to emulate how the dHDP enables level-ground walking, walking on different terrains and at different paces. These results show that our proposed dHDP-based tracking control is not only theoretically suitable, but also practically useful.
Keywords: automatic tracking of intact knee, configuration of robotic knee prosthesis, direct heuristic dynamic programming (dHDP), reinforcement learning control
11. Experimental study and theoretical analysis of the flexural behaviour of inverted-interlocked reinforced corrugated steel members
Authors: Wu Fei, Liu Baodong, Zhang Yu, Zhang Jilei, Kong Xiao, Wu Yibin, Wang Zhihong. Tunnel Construction (Peking University Core), 2026, No. 1, pp. 113-123 (11 pages)
To address the insufficient bearing capacity and stiffness of buried corrugated steel structures, an inverted-interlocked reinforced corrugated steel member is proposed. Four-point bending tests covering different reinforcing-rib widths and cases with and without concrete infill are conducted to study the flexural behaviour of the members and the interaction between their components. A theoretical model is constructed based on partial-interaction theory, and a flexural-behaviour solver is programmed in Python using the shooting method and a forward-iteration finite-difference method. The results show that: 1) compared with a plain corrugated steel member of the same wave form, the ultimate flexural capacity of the inverted-interlocked reinforced member increases by a factor of 1.0 to 2.5 and the flexural stiffness by a factor of 1.3 to 6.2, and the composite action makes the gains in capacity and stiffness exceed the increase in steel consumption; 2) filling the cavity with concrete further raises the capacity and stiffness, since the support and confinement provided by the corrugated steel allow the concrete to take effect, while adding stud connectors between the corrugated steel and the concrete has no obvious effect on the flexural capacity; 3) based on partial-interaction theory, the equilibrium equations, deformation-compatibility equations, and the relation between interface shear stress and internal forces are derived, and comparison of calculated and test results shows that the solver reproduces the stiffness-degradation process of the members with good robustness and convergence.
Keywords: tunnel engineering, inverted-interlocked reinforced corrugated steel member, flexural behaviour, partial-interaction theory, solution program
12. Improvement and optimisation of the bearing capacity of strengthened members of in-service transmission towers
Authors: Zhang Liang, Niu Kai, Xu Weihao, Zhai Mengqi, Jin Qingtong, Liu Juncai, Tian Li. Journal of Shandong University (Engineering Science) (Peking University Core), 2026, No. 1, pp. 122-132 (11 pages)
Since member buckling failure is the main factor in the progressive collapse of transmission towers, a cruciform non-destructive strengthening measure is proposed. The failure modes, bearing behaviour and stress distribution of the strengthened members are compared and analysed through static loading tests and finite element simulation. The influence of the member slenderness ratio, the width-to-thickness ratio, the number of clamps, and the section size and steel grade of the strengthening material on the ultimate compressive capacity of the strengthened members is studied. The results show that the proposed strengthening scheme works well, raising the ultimate bearing capacity by more than 14%; the larger the slenderness ratio, the smaller the ultimate compressive capacity of the strengthened member but the more pronounced the strengthening effect; as the width-to-thickness ratio of the strengthened main member decreases, the gain in ultimate capacity declines; once the number of clamps reaches a certain level, the ultimate compressive capacity of the strengthened member levels off; and the section size and steel grade of the strengthening material have little influence.
Keywords: transmission tower, angle steel member, non-destructive strengthening scheme, ultimate bearing capacity, parametric analysis
13. PDP: Parallel Dynamic Programming (cited 15 times)
Authors: Fei-Yue Wang, Jie Zhang, Qinglai Wei, Xinhu Zheng, Li Li. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2017, No. 1, pp. 1-5 (5 pages)
Deep reinforcement learning is a focus research area in artificial intelligence. The principle of optimality in dynamic programming is a key to the success of reinforcement learning methods. The principle of adaptive dynamic programming (ADP) is first presented instead of direct dynamic programming (DP), and the inherent relationship between ADP and deep reinforcement learning is developed. Next, analytics intelligence, as the necessary requirement for real reinforcement learning, is discussed. Finally, the principle of parallel dynamic programming, which integrates dynamic programming and analytics intelligence, is presented as the future of computational intelligence.
Keywords: parallel dynamic programming, dynamic programming, adaptive dynamic programming, reinforcement learning, deep learning, neural networks, artificial intelligence
14. A Mix-integer Programming Based Deep Reinforcement Learning Framework for Optimal Dispatch of Energy Storage System in Distribution Networks
Authors: Shengren Hou, Edgar Mauricio Salazar, Peter Palensky, Qixin Chen, Pedro P. Vergara. Journal of Modern Power Systems and Clean Energy, 2025, No. 2, pp. 597-608 (12 pages)
The optimal dispatch of energy storage systems (ESSs) in distribution networks poses significant challenges, primarily due to uncertainties in dynamic pricing, fluctuating demand, and the variability inherent in renewable energy sources. By exploiting the generalization capabilities of deep neural networks (DNNs), deep reinforcement learning (DRL) algorithms can learn good-quality control models that adapt to the stochastic nature of distribution networks. Nevertheless, the practical deployment of DRL algorithms is often hampered by their limited capacity for satisfying operational constraints in real time, which is a crucial requirement for ensuring the reliability and feasibility of control actions during online operation. This paper introduces an innovative framework, named mixed-integer programming based deep reinforcement learning (MIP-DRL), to overcome these limitations. The proposed MIP-DRL framework can rigorously enforce operational constraints for the optimal dispatch of ESSs during online execution. The framework involves training a Q-function with DNNs, which is subsequently represented in a mixed-integer programming (MIP) formulation. This unique combination allows for the seamless integration of operational constraints into the decision-making process. The effectiveness of the proposed MIP-DRL framework is validated through numerical simulations, demonstrating its superior capability to enforce all operational constraints, achieve high-quality dispatch decisions, and outperform existing DRL algorithms.
Keywords: voltage regulation, optimal dispatch, distribution network, mixed-integer programming, deep reinforcement learning (DRL), energy management
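The core trick, representing a trained ReLU Q-network inside a MIP, rests on the standard big-M encoding of each ReLU unit. A hedged sketch (no solver involved; the single random layer below merely stands in for a trained network, and the bound M = 100 is an assumed box on the pre-activations): it builds the encoding and checks that the network's true forward pass is a feasible point of it:

```python
import numpy as np

# Big-M encoding of one ReLU layer h = max(W x + b, 0), unit by unit:
#   h >= z,  h >= 0,  h <= z + M*(1 - d),  h <= M*d,  d in {0, 1}
# where z is the pre-activation and d the unit's binary on/off indicator.
rng = np.random.default_rng(1)
W, b = rng.normal(size=(3, 2)), rng.normal(size=3)  # stand-in "trained" layer
M = 100.0                                           # assumed bound on |z|

x = np.array([0.4, -0.7])                           # a fixed network input
z = W @ x + b                                       # pre-activation
h = np.maximum(z, 0.0)                              # true ReLU output
d = (z > 0).astype(float)                           # binary indicator per unit

# Feasibility of (h, d) at the true forward pass certifies the encoding.
feasible = all([
    np.all(h >= z - 1e-9),
    np.all(h >= -1e-9),
    np.all(h <= z + M * (1 - d) + 1e-9),
    np.all(h <= M * d + 1e-9),
    set(np.unique(d)) <= {0.0, 1.0},
])
```

In the full MIP-DRL pipeline these linear constraints for every layer, together with the network's operational constraints, go to an off-the-shelf MIP solver, which then maximizes the encoded Q-value over the feasible actions.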
15. An SC-SAC-based optimal dispatch strategy for REHMIS-IES
Authors: Pan Lei, Ding Yunfei, Pang Yi, Wang Yuxuan, Chen Jianwei, Gao Rui, Zhang Liyang. Integrated Intelligent Energy, 2026, No. 1, pp. 43-58 (16 pages)
A renewable energy-hydrogen-methanol integrated station (REHMIS) uses renewable power to produce green hydrogen and further synthesizes methanol from the green hydrogen and carbon dioxide, thereby replacing hydrogen production from conventional fossil energy. To satisfy both the methanol load of the REHMIS and the multi-energy demands of its associated buildings, a new integrated energy system (IES) topology, REHMIS-IES, is designed. To obtain an efficient operating strategy for the REHMIS-IES, an execution framework based on a strictly constrained soft actor-critic (SC-SAC) algorithm is proposed. The mathematical model is transformed into a Markov decision process, and a state constraint mechanism (SCM) is introduced to avoid sharp fluctuations in the state of the energy storage system. In the execution stage of the SC-SAC algorithm, the trained Q-network and the action constraints are converted into a mixed-integer linear programming (MILP) model so that the dispatch decisions satisfy all operating constraints. Multi-scenario simulation results show that the proposed system can effectively reduce operating costs while meeting multi-energy demands; compared with other deep reinforcement learning algorithms, the SC-SAC algorithm reduces the system's energy imbalance by about 16.2% and cuts operating costs by at least 11.7%.
Keywords: renewable energy-hydrogen-methanol integrated station, green hydrogen, energy storage, integrated energy system, deep reinforcement learning, state constraint mechanism, soft actor-critic algorithm, mixed-integer linear programming
16. Prediction of the load-carrying capacity of reinforced concrete connections under post-earthquake fire (cited 1 time)
Authors: Aydin Shishegaran, Mehdi Moradi, Mohammad Ali Naghsh, Behnam Karami, Arshia Shishegaran. Journal of Zhejiang University-Science A (Applied Physics & Engineering) (SCIE, EI, CAS, CSCD), 2021, No. 6, pp. 441-466 (26 pages)
Identifying the most effective parameters for the resistance of reinforced concrete connections (RCCs) is an important topic in structural engineering. In this study, first, a finite element (FE) model is developed for simulating the performance of RCCs under post-earthquake fire (PEF). Then surrogate models, including multiple linear regression (MLR), multiple natural logarithm (Ln) equation regression (MLnER), gene expression programming (GEP), and an ensemble model, are used to predict the remaining load-carrying capacity of an RCC under PEF. Statistical parameters, error terms, and a novel statistical table are used to evaluate and compare the accuracy of each surrogate model. According to the results, the ratio of the longitudinal reinforcement bars of the column (RLC) has a significant effect on the resistance of an RCC under PEF. Increasing this parameter from 1% to 8% can increase the residual load-carrying capacity of an RCC under PEF by 492.2% when the RCC is exposed to fire at a temperature of 1000 °C. Moreover, based on the results, the ensemble model can predict the residual load-carrying capacity with suitable accuracy. A safety factor of 1.55 should be applied to results obtained from the ensemble model.
Keywords: reinforced concrete connection (RCC), post-earthquake fire (PEF), surrogate models, load-carrying capacity, gene expression programming (GEP), ensemble model
17. A Compact and Handheld Surface Penetrating Radar for the Detection of Reinforced Concrete Structures
Authors: Zhou Bin, Ye Shengbo, Xia Xinfan, Shao Jinjin, Liu Lihua, Fang Guangyou. Journal of Electronics (China), 2013, No. 4, pp. 384-390 (7 pages)
Surface Penetrating Radar (SPR) is a recently developed technology for non-destructive testing. It can be used to image and interpret the inner structure of reinforced concrete. This paper gives the details of a compact and handheld SPR developed recently for reinforced concrete structure detection. The center operating frequency of the radar is 1.6 GHz. Not only does it have fast acquisition ability, but it can also display the testing result on the LCD screen in real time. The testing results show that the radar has a penetrating range of more than 30 cm and a lateral resolution better than 5 cm. This performance validates that the radar can meet the application requirements for reinforced concrete structure detection.
Keywords: Surface Penetrating Radar (SPR), non-destructive testing, reinforced concrete, 3D imaging
18. Reinforcement-learning-based optimal control for nonlinear input-constrained systems (cited 1 time)
Authors: Gao Xiaoge, Han Shuyun. Computer Applications and Software (Peking University Core), 2025, No. 2, pp. 287-291, 298 (6 pages)
For the optimal tracking control problem of a class of input-constrained nonlinear systems, a control strategy based on reinforcement-learning adaptive dynamic programming is proposed. A suitable performance index function is designed to handle the input constraints of the control system. A critic neural network is designed to estimate the optimal performance index function of the system, thereby solving the Hamilton-Jacobi-Bellman (HJB) equation of the control system and obtaining the optimal control input. The weight update law of the critic network is obtained using the Lyapunov method, and the tracking error of the system and the weight estimation error of the critic network are proven to be uniformly ultimately bounded (UUB). Numerical simulations verify the effectiveness of the proposed control strategy.
Keywords: nonlinear systems, input constraints, reinforcement learning, adaptive dynamic programming
19. Dual heuristic programming vehicle path-tracking control with receding-horizon optimisation
Authors: Guo Hongyan, Li Guangyao, Liu Jun, Guo Jingzheng, Tan Zhongqiu, Lyu Ying. Control Theory & Applications (Peking University Core), 2025, No. 9, pp. 1746-1756 (11 pages)
To improve the path-tracking accuracy of intelligent vehicles and reduce the influence of vehicle-model uncertainty on tracking performance under high-speed, large-curvature conditions, this paper proposes an intelligent-vehicle path-tracking control strategy based on receding-horizon dual heuristic programming (RHDHP). First, a vehicle system model that captures the nonlinear characteristics of lateral tire forces is established using the Magic Formula. Then, an optimal control method based on dual heuristic programming (DHP) under the receding-horizon idea is designed: the DHP structure ensures a near-optimal solution under the vehicle's nonlinear characteristics, while the introduction of receding-horizon optimisation improves the vehicle system's adaptability to environmental changes. The convergence of the RHDHP method and the stability of the closed-loop system are analysed theoretically. Finally, simulations verify the effectiveness of the proposed method.
Keywords: vehicle path tracking, dual heuristic programming, model predictive control, reinforcement learning
20. Optimal operation of user-side energy storage systems based on MILP-TD3
Authors: Chen Jingwen, Shan Qian. Proceedings of the CSEE (Peking University Core), 2025, No. 13, pp. 5119-5129, I0015 (12 pages)
Deep reinforcement learning (DRL) is widely used to dispatch user-side energy storage so as to absorb photovoltaic generation and meet user demand, but in practice a DRL agent has difficulty strictly enforcing operating constraints, which can lead to unreliable actions that threaten the safe operation of the storage system. This paper therefore proposes a MILP-TD3-based optimal operation strategy for user-side energy storage systems. First, taking the minimum operating cost over the dispatch horizon as the objective, a real-time operation optimisation model that accounts for battery degradation cost is established, and a Markov decision process (MDP) including power-balance constraints is introduced to convert the storage operation problem into an agent optimisation problem. Second, a MILP-TD3 algorithm is proposed that converts the action-value function of the twin delayed deep deterministic policy gradient (TD3) algorithm into a mixed-integer linear programming (MILP) formulation, so that the agent strictly enforces the operating constraints of the storage system. Finally, comparative case studies show that the proposed model and algorithm determine the optimal operating strategy while strictly enforcing the power-balance constraint; the average daily operating cost is 25.34% lower than with the conventional TD3 algorithm, and the average optimisation time per interval is 0.024 s, meeting real-time dispatch requirements and ensuring safe operation of the user-side storage system.
Keywords: user-side energy storage, deep reinforcement learning, mixed-integer linear programming, optimal operation