Abstract
Motivated by the optimization and control needs of practical engineering systems with uncertain parameters, we use Markov performance potential theory to study robust control for a class of ergodic continuous-time Markov control processes (CTMCPs) whose transition rates are uncertain but constrained to compact sets. Because the processes are ergodic, the solution of the average-cost Poisson equation can be taken as a definition of the performance potential. Under the average-cost criterion, the goal is to choose a stationary policy that attains the minimal infinite-horizon average cost under the worst-case values of the system parameters. To this end, we give a policy iteration (PI) algorithm for computing an optimal robust control policy and discuss its convergence in detail.
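For orientation, the average-cost Poisson equation and the worst-case objective described above can be written roughly as follows. This is a sketch in assumed notation, not the paper's own: $A^{v}(\theta)$ stands for the transition rate matrix under stationary policy $v$ and uncertain parameter $\theta$ in a compact set $\Theta$, $f^{v}$ for the cost-rate vector (assumed here not to depend on $\theta$), $\pi^{v}(\theta)$ for the stationary distribution, $e$ for the all-ones column vector, $\eta^{v}(\theta)$ for the average cost, and $g^{v}(\theta)$ for the performance potential.

```latex
% Assumed notation, introduced only for illustration (not taken from the paper).
\begin{align}
  % Average-cost Poisson equation; its solution g is the performance potential
  A^{v}(\theta)\, g^{v}(\theta) + f^{v} &= \eta^{v}(\theta)\, e, \\
  % Average cost expressed through the stationary distribution
  \eta^{v}(\theta) &= \pi^{v}(\theta)\, f^{v},
  \qquad \pi^{v}(\theta)\, A^{v}(\theta) = 0,
  \quad \pi^{v}(\theta)\, e = 1, \\
  % Robust (min-max) objective over stationary policies
  v^{*} &\in \arg\min_{v}\, \max_{\theta \in \Theta}\, \eta^{v}(\theta).
\end{align}
```

Left-multiplying the first line by $\pi^{v}(\theta)$ recovers the second, so the two are consistent. The potential is determined only up to an additive constant, and the normalization that fixes it is a matter of convention. Writing $\max$ rather than $\sup$ assumes that the worst case is attained, which holds when $\eta^{v}(\theta)$ is continuous in $\theta$ on the compact set $\Theta$.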
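Funding
Science and Technology Innovation Group Program for Young and Middle-Aged Researchers, Hefei University of Technology
Anhui Provincial Science and Technology Fund for Outstanding Young Scholars (04042044)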