Journal Articles
3 articles found
1. Learning the continuous-time optimal decision law from discrete-time rewards (Cited: 1)
Authors: Ci Chen, Lihua Xie, Kan Xie, Frank Leroy Lewis, Yilu Liu, Shengli Xie. National Science Open, 2024, Issue 5, pp. 130-147 (18 pages).
The concept of reward is fundamental in reinforcement learning, with a wide range of applications in the natural and social sciences. Seeking an interpretable reward for decision-making that largely shapes the system's behavior has always been a challenge in reinforcement learning. In this work, we explore a discrete-time reward for reinforcement learning in continuous time and action spaces, which represent many phenomena governed by physical laws. We find that the discrete-time reward leads to the extraction of the unique continuous-time decision law and improves computational efficiency by dropping the integral operator that appears in classical results with integral rewards. We apply this finding to solve output-feedback design problems in power systems. The results reveal that our approach removes an intermediate stage of identifying dynamical models. Our work suggests that the discrete-time reward is efficient in the search for the desired decision law, providing a computational tool for understanding and modifying the behavior of large-scale engineering systems using the learned optimal decision.
Keywords: continuous-time state and action; decision law learning; discrete-time reward; dynamical systems; reinforcement learning
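The abstract's central claim, that sampling a quadratic reward only at discrete instants suffices to recover a continuous-time optimal feedback law, can be illustrated on a scalar linear system. The sketch below is not the authors' algorithm: it is a minimal grid search over feedback gains driven purely by discrete-time reward samples from Euler-simulated rollouts. All names and parameter values are illustrative assumptions.

```python
def discrete_time_return(k, a=1.0, b=1.0, q=1.0, r=1.0,
                         dt=0.01, sample_every=2, horizon=5.0, x0=1.0):
    """Euler-simulate xdot = a*x + b*u with the feedback u = -k*x,
    summing the quadratic reward only at discrete sampling instants."""
    x, total = x0, 0.0
    for step in range(int(horizon / dt)):
        u = -k * x
        if step % sample_every == 0:      # discrete-time reward samples
            total += -(q * x * x + r * u * u)
        x += dt * (a * x + b * u)         # continuous-time dynamics
    return total

# Search the decision law (here just a scalar gain) using sampled rewards.
gains = [i * 0.1 for i in range(11, 51)]  # stabilizing candidates k > a/b
best_k = max(gains, key=discrete_time_return)
print(round(best_k, 1))  # near the analytic LQR gain 1 + sqrt(2) ≈ 2.41
```

With dense enough sampling, the gain maximizing the sampled return approaches the classical LQR solution; the point mirrored from the abstract is that no explicit integral of the reward over time is ever computed.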
2. DDQNC-P: A framework for civil aircraft tactical synergetic trajectory planning under adverse weather conditions (Cited: 1)
Authors: Honghai ZHANG, Jinlun ZHOU, Zongbei SHI, Yike LI, Jinpeng ZHANG. Chinese Journal of Aeronautics (SCIE/EI/CAS/CSCD), 2024, Issue 12, pp. 434-457 (24 pages).
Adverse weather during aircraft operation generates more complex scenarios for tactical trajectory planning, which demand superior real-time performance and conflict-free reliability from solving methods. Multi-aircraft real-time 4D trajectory planning under adverse weather is an essential problem in Air Traffic Control (ATC), and it is challenging for existing methods to address effectively. A framework of Double Deep Q-value Network under Critic guidance with heuristic Pairing (DDQNC-P) is proposed to solve this problem. An agent for two-aircraft synergetic trajectory planning is trained with the Deep Reinforcement Learning (DRL) model DDQNC, which preliminarily completes two-aircraft 4D trajectory planning tasks under dynamic weather conditions. A heuristic pairing algorithm is then designed to convert multi-aircraft synergetic trajectory planning into multi-time pairwise synergetic trajectory planning, making the multi-aircraft problem tractable for the trained agent. This framework compresses the input dimensions of the DRL model while significantly improving its generalization ability. Extensive simulations with various aircraft numbers, weather conditions, and airspace structures were conducted for performance verification and comparison. The success rate of conflict-free trajectory resolution reached 96.56% with an average calculation time of 0.41 s for 350 4D trajectory points per aircraft, confirming the framework's applicability for real-time decision support to controllers in real-world ATC systems.
Keywords: air traffic control; trajectory-based operation; 4D trajectory planning; reinforcement learning; decision support systems
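The structural idea in this abstract, decomposing multi-aircraft planning into a sequence of pairwise problems that a two-aircraft agent can handle, can be sketched independently of the DRL model. The greedy heuristic below is an illustration under assumed inputs, not the paper's exact pairing algorithm: it orders predicted conflicts by time, pairs each aircraft at most once per round, and defers the rest; the trained two-aircraft planner would then be invoked once per pair.

```python
def pair_conflicts(conflicts):
    """Greedily pair aircraft by earliest predicted conflict time.

    conflicts: list of (time_s, aircraft_i, aircraft_j) tuples.
    Returns (pairs, deferred), where deferred conflicts involve an
    aircraft already paired this round and wait for a later round.
    """
    paired, pairs, deferred = set(), [], []
    for t, i, j in sorted(conflicts):
        if i in paired or j in paired:
            deferred.append((t, i, j))
        else:
            pairs.append((i, j))
            paired.update((i, j))
    return pairs, deferred

# Four aircraft, three predicted conflicts: A-B is earliest, so A-C
# must wait for a later round while C-D can be handled in parallel.
conflicts = [(15.0, "A", "C"), (12.0, "A", "B"), (20.0, "C", "D")]
pairs, deferred = pair_conflicts(conflicts)
print(pairs)     # [('A', 'B'), ('C', 'D')]
print(deferred)  # [(15.0, 'A', 'C')]
```

Repeating the round on the deferred list converts an N-aircraft problem into a sequence of two-aircraft episodes, which is what keeps the DRL model's input dimension fixed regardless of traffic volume.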
3. Mining and Integrating Reliable Decision Rules for Imbalanced Cancer Gene Expression Data Sets (Cited: 5)
Authors: Hualong Yu, Jun Ni, Yuanyuan Dan, Sen Xu. Tsinghua Science and Technology (SCIE/EI/CAS), 2012, Issue 6, pp. 666-673 (8 pages).
Many skewed cancer gene expression datasets have appeared in the post-genomic era. Extracting differentially expressed genes or constructing decision rules from these skewed datasets with traditional algorithms seriously underestimates performance on the minority class, leading to inaccurate diagnosis in clinical trials. This paper presents a skewed gene selection algorithm that introduces a weighted metric into the gene selection procedure. The extracted genes are paired as decision rules to distinguish the two classes, and these rules are then integrated into an ensemble learning framework by majority voting to classify test examples, avoiding tedious data normalization and classifier construction. Mining and integrating a few reliable decision rules gave higher, or at least comparable, classification performance than many traditional class-imbalance learning algorithms on four benchmark imbalanced cancer gene expression datasets.
Keywords: cancer gene expression data; class imbalance; paired differential expression genes; decision rule; ensemble learning; majority voting
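The classifier this abstract describes, paired genes used as relational decision rules combined by majority vote, needs no data normalization because each rule only compares two expression values within the same sample. The sketch below is a hedged illustration with made-up gene indices and a toy expression profile, not the authors' mined rule set.

```python
def rule_vote(sample, gene_pair):
    """One paired-gene decision rule: predict the minority class (1)
    when the first gene of the pair is expressed more strongly."""
    i, j = gene_pair
    return 1 if sample[i] > sample[j] else 0

def ensemble_predict(sample, rules):
    """Majority vote over all paired-gene decision rules."""
    votes = sum(rule_vote(sample, pair) for pair in rules)
    return 1 if votes > len(rules) / 2 else 0

# Three illustrative rules over a toy expression profile of five genes.
rules = [(0, 1), (2, 3), (0, 4)]
profile = [5.2, 1.1, 2.0, 4.3, 0.7]   # expression of genes 0..4
print(ensemble_predict(profile, rules))  # → 1 (two of three rules fire)
```

Because each rule is a within-sample comparison, the ensemble is invariant to any monotone rescaling of a sample's expression values, which is what lets the approach skip the normalization step.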