Journal Articles
300 articles found
Value Iteration-Based Distributed Adaptive Dynamic Programming for Multi-Player Differential Game With Incomplete Information
1
Authors: Yun Zhang, Yuqi Wang, Yunze Cai. IEEE/CAA Journal of Automatica Sinica, 2025, No. 2, pp. 436-447 (12 pages)
In this paper, a distributed adaptive dynamic programming (ADP) framework based on value iteration is proposed for multi-player differential games. In the game setting, players have no access to the information of others' system parameters or control laws. Each player adopts an on-policy value iteration algorithm as the basic learning framework. To deal with the incomplete information structure, players collect a period of system trajectory data to compensate for the lack of information. The policy updating step is implemented by a nonlinear optimization problem aiming to search for the proximal admissible policy. Theoretical analysis shows that by adopting proximal policy searching rules, the approximated policies can converge to a neighborhood of the equilibrium policies. The efficacy of our method is illustrated by three examples, which also demonstrate that the proposed method can accelerate the learning process compared with the centralized learning framework.
Keywords: distributed adaptive dynamic programming, incomplete information, multi-player differential game (MPDG), value iteration
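The value-iteration backbone behind such ADP schemes can be illustrated, in a far simpler setting than the paper's distributed multi-player game, by the model-based Riccati recursion for a single discrete-time linear-quadratic player; the matrices below are made-up toy values, not taken from the article.

```python
import numpy as np

def lqr_value_iteration(A, B, Q, R, iters=200):
    """Value iteration for one discrete-time LQ player:
    P_{k+1} = Q + A'P_k A - A'P_k B (R + B'P_k B)^{-1} B'P_k A, with P_0 = 0."""
    P = np.zeros_like(A)
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # greedy gain update
        P = Q + A.T @ P @ (A - B @ K)                       # value (cost matrix) update
    return P, K

# toy double-integrator-like system (illustrative numbers only)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
P, K = lqr_value_iteration(A, B, Q=np.eye(2), R=np.eye(1))
print(K)  # converged feedback gain for u = -K x
```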
Residential Energy Scheduling With Solar Energy Based on Dyna Adaptive Dynamic Programming
2
Authors: Kang Xiong, Qinglai Wei, Hongyang Li. IEEE/CAA Journal of Automatica Sinica, 2025, No. 2, pp. 403-413 (11 pages)
Learning-based methods have become mainstream for solving residential energy scheduling problems. In order to improve the learning efficiency of existing methods and increase the utilization of renewable energy, we propose the Dyna action-dependent heuristic dynamic programming (Dyna-ADHDP) method, which incorporates the ideas of learning and planning from the Dyna framework into action-dependent heuristic dynamic programming. This method defines a continuous action space for precise control of an energy storage system and allows online optimization of algorithm performance during the real-time operation of the residential energy model. Meanwhile, a target network is introduced during the training process to make training smoother and more efficient. We conducted experimental comparisons with the benchmark method using simulated and real data to verify its applicability and performance. The results confirm the method's excellent performance and generalization capabilities, as well as its excellence in increasing renewable energy utilization and extending equipment life.
Keywords: adaptive dynamic programming (ADP), dynamic residential scenarios, optimal residential energy management, smart grid
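The learn-and-plan idea that the article borrows from the Dyna framework can be sketched with the classic tabular Dyna-Q loop below; this is a generic illustration with an assumed env_step(state, action) -> (reward, next_state, done) interface, not the paper's continuous-action ADHDP controller.

```python
import random
from collections import defaultdict

def dyna_q(env_step, states, actions, episodes=50, planning_steps=20,
           alpha=0.1, gamma=0.95, eps=0.1):
    """Tabular Dyna-Q: every real transition also updates a learned model,
    which is then replayed `planning_steps` times as simulated experience."""
    Q = defaultdict(float)
    model = {}  # (state, action) -> (reward, next_state)
    for _ in range(episodes):
        s, done = random.choice(states), False
        while not done:
            a = (random.choice(actions) if random.random() < eps
                 else max(actions, key=lambda a_: Q[(s, a_)]))
            r, s2, done = env_step(s, a)                 # learning from real experience
            target = r + (0.0 if done else gamma * max(Q[(s2, b)] for b in actions))
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            model[(s, a)] = (r, s2)
            for _ in range(planning_steps):              # planning from the learned model
                ps, pa = random.choice(list(model))
                pr, ps2 = model[(ps, pa)]
                plan_target = pr + gamma * max(Q[(ps2, b)] for b in actions)
                Q[(ps, pa)] += alpha * (plan_target - Q[(ps, pa)])
            s = s2
    return Q
```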
EDISON-WMW: Exact Dynamic Programing Solution of the Wilcoxon–Mann–Whitney Test (Cited by 2)
3
Authors: Alexander Marx, Christina Backes, Eckart Meese, Hans-Peter Lenhof, Andreas Keller. Genomics, Proteomics & Bioinformatics (SCIE, CAS, CSCD), 2016, No. 1, pp. 55-61 (7 pages)
In many research disciplines, hypothesis tests are applied to evaluate whether findings are statistically significant or could be explained by chance. The Wilcoxon-Mann-Whitney (WMW) test is among the most popular hypothesis tests in medicine and life science to analyze if two groups of samples are equally distributed. This nonparametric statistical homogeneity test is commonly applied in molecular diagnosis. Generally, the solution of the WMW test requires a high combinatorial effort for large sample cohorts containing a significant number of ties. Hence, the P value is frequently approximated by a normal distribution. We developed EDISON-WMW, a new approach to calculate the exact permutation of the two-tailed unpaired WMW test without any corrections required and allowing for ties. The method relies on dynamic programing to solve the combinatorial problem of the WMW test efficiently. Beyond a straightforward implementation of the algorithm, we presented different optimization strategies and developed a parallel solution. Using our program, the exact P value for large cohorts containing more than 1000 samples with ties can be calculated within minutes. We demonstrate the performance of this novel approach on randomly generated data, benchmark it against 13 other commonly applied approaches, and moreover evaluate molecular biomarkers for lung carcinoma and chronic obstructive pulmonary disease (COPD). We found that approximated P values were generally higher than the exact solution provided by EDISON-WMW. Importantly, the algorithm can also be applied to high-throughput omics datasets, where hundreds or thousands of features are included. To provide easy access to the multi-threaded version of EDISON-WMW, a web-based solution of our algorithm is freely available at http://www.ccb.uni-saarland.de/software/wtest/.
Keywords: Wilcoxon-Mann-Whitney test, Wilcoxon rank-sum test, dynamic programing, exact permutation, parallel optimization
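The combinatorial core of such an exact test can be sketched with a small counting DP over rank sums. The version below assumes no ties and uses the distance-from-the-mean convention for the two-sided tail, so it is only a simplified illustration of the idea, not the EDISON-WMW algorithm itself.

```python
from fractions import Fraction
from math import comb

def wmw_exact_pvalue(n, m, w_obs):
    """Exact two-sided p-value of the rank-sum statistic W for group sizes
    n and m, assuming no ties.  count[j][s] = number of ways to pick j of
    the ranks processed so far whose sum is s."""
    N = n + m
    max_sum = N * (N + 1) // 2
    count = [[0] * (max_sum + 1) for _ in range(n + 1)]
    count[0][0] = 1
    for i in range(1, N + 1):                  # add rank i, 0/1-knapsack style
        for j in range(min(i, n), 0, -1):
            row, prev = count[j], count[j - 1]
            for s in range(max_sum, i - 1, -1):
                row[s] += prev[s - i]
    mean = n * (N + 1) / 2.0
    dev = abs(w_obs - mean)
    tail = sum(c for w, c in enumerate(count[n]) if abs(w - mean) >= dev - 1e-9)
    return Fraction(tail, comb(N, n))

print(float(wmw_exact_pvalue(3, 4, 6)))        # toy example: exactly 2/35
```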
Recent Progress in Reinforcement Learning and Adaptive Dynamic Programming for Advanced Control Applications (Cited by 11)
4
Authors: Ding Wang, Ning Gao, Derong Liu, Jinna Li, Frank L. Lewis. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, No. 1, pp. 18-36 (19 pages)
Reinforcement learning (RL) has roots in dynamic programming and it is called adaptive/approximate dynamic programming (ADP) within the control community. This paper reviews recent developments in ADP along with RL and its applications to various advanced control fields. First, the background of the development of ADP is described, emphasizing the significance of regulation and tracking control problems. Some effective offline and online algorithms for ADP/adaptive critic control are displayed, where the main results towards discrete-time systems and continuous-time systems are surveyed, respectively. Then, the research progress on adaptive critic control based on the event-triggered framework and under uncertain environments is discussed, where event-based design, robust stabilization, and game design are reviewed. Moreover, the extensions of ADP for addressing control problems under complex environments attract enormous attention. The ADP architecture is revisited from the perspective of data-driven and RL frameworks, showing how they promote the ADP formulation significantly. Finally, several typical control applications with respect to RL and ADP are summarized, particularly in the fields of wastewater treatment processes and power systems, followed by some general prospects for future research. Overall, this comprehensive survey on ADP and RL for advanced control applications demonstrates their remarkable potential within the artificial intelligence era. In addition, it also plays a vital role in promoting environmental protection and industrial intelligence.
Keywords: adaptive dynamic programming (ADP), advanced control, complex environment, data-driven control, event-triggered design, intelligent control, neural networks, nonlinear systems, optimal control, reinforcement learning (RL)
Adaptive fault-tolerant control for non-minimum phase hypersonic vehicles based on adaptive dynamic programming (Cited by 3)
5
Authors: Le Wang, Ruiyun Qi, Bin Jiang. Chinese Journal of Aeronautics (SCIE, EI, CAS, CSCD), 2024, No. 3, pp. 290-311 (22 pages)
In this paper, a novel adaptive Fault-Tolerant Control (FTC) strategy is proposed for non-minimum phase Hypersonic Vehicles (HSVs) that are affected by actuator faults and parameter uncertainties. The strategy is based on the output redefinition method and Adaptive Dynamic Programming (ADP). The intelligent FTC scheme consists of two main parts: a basic fault-tolerant and stable controller and an ADP-based supplementary controller. In the basic FTC part, an output redefinition approach is designed to make the zero dynamics stable with respect to the new output. Then, the Ideal Internal Dynamic (IID) is obtained using an optimal bounded inversion approach, and a tracking controller is designed for the new output to realize output tracking of the non-minimum phase HSV system. For the ADP-based compensation control part, an Action-Dependent Heuristic Dynamic Programming (ADHDP) scheme adopting an actor-critic learning structure is utilized to further optimize the tracking performance of the HSV control system. Finally, simulation results are provided to verify the effectiveness and efficiency of the proposed FTC algorithm.
Keywords: hypersonic vehicle, fault-tolerant control, non-minimum phase system, adaptive control, nonlinear control, adaptive dynamic programming
Adaptive Optimal Discrete-Time Output-Feedback Using an Internal Model Principle and Adaptive Dynamic Programming (Cited by 1)
6
Authors: Zhongyang Wang, Youqing Wang, Zdzisław Kowalczuk. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, No. 1, pp. 131-140 (10 pages)
In order to address the output feedback issue for linear discrete-time systems, this work suggests a brand-new adaptive dynamic programming (ADP) technique based on the internal model principle (IMP). The proposed method, termed IMP-ADP, does not require complete state feedback, merely the measurement of input and output data. More specifically, based on the IMP, the output control problem can first be converted into a stabilization problem. We then design an observer to reproduce the full state of the system by measuring the inputs and outputs. Moreover, this technique includes both a policy iteration algorithm and a value iteration algorithm to determine the optimal feedback gain without using a dynamic system model. It is important that with this concept one does not need to solve the regulator equation. Finally, this control method was tested on a grid-connected LCL inverter system to demonstrate that the proposed method provides the desired performance in terms of both tracking and disturbance rejection.
Keywords: adaptive dynamic programming (ADP), internal model principle (IMP), output feedback problem, policy iteration (PI), value iteration (VI)
Bayesian network structure learning by dynamic programming algorithm based on node block sequence constraints
7
Authors: Chuchao He, Ruohai Di, Bo Li, Evgeny Neretin. CAAI Transactions on Intelligence Technology, 2024, No. 6, pp. 1605-1622 (18 pages)
The use of dynamic programming (DP) algorithms to learn Bayesian network structures is limited by their high space complexity and difficulty in learning the structure of large-scale networks. Therefore, this study proposes a DP algorithm based on node block sequence constraints. The proposed algorithm constrains the traversal of the parent graph using the M-sequence matrix, and considerably reduces time consumption and space complexity by pruning the traversal of the order graph using the node block sequence. Experimental results show that, compared with existing DP algorithms, the proposed algorithm obtains learning results more efficiently with less than 1% loss of accuracy, and can be used for learning larger-scale networks.
Keywords: Bayesian network (BN), dynamic programming (DP), node block sequence, strongly connected component (SCC), structure learning
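For reference, the unconstrained order-graph recursion that such algorithms prune can be sketched as below; local_score(v, parents) is an assumed decomposable scoring function (higher is better), and the node-block pruning described in the article is not implemented here.

```python
from itertools import combinations

def exact_bn_structure(variables, local_score):
    """Exact structure search by DP over variable subsets (order-graph
    recursion): best(S) = max over sink v in S of
    best(S minus v) + best parent-set score of v chosen from S minus v."""
    def best_parents(v, candidates):
        best = (local_score(v, ()), ())
        for k in range(1, len(candidates) + 1):
            for ps in combinations(candidates, k):
                s = local_score(v, ps)
                if s > best[0]:
                    best = (s, ps)
        return best

    best_net = {frozenset(): (0.0, {})}        # subset -> (score, parent map)
    for size in range(1, len(variables) + 1):
        for subset in map(frozenset, combinations(variables, size)):
            best = None
            for sink in subset:                # sink = last variable in the ordering
                rest = subset - {sink}
                sub_score, sub_parents = best_net[rest]
                s, ps = best_parents(sink, tuple(rest))
                cand = (sub_score + s, {**sub_parents, sink: ps})
                if best is None or cand[0] > best[0]:
                    best = cand
            best_net[subset] = best
    return best_net[frozenset(variables)]
```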
Transferable incremental heuristic dynamic programming and its application to wastewater treatment
8
Authors: Ding Wang, Xin Li. Journal of Beijing University of Technology (Peking University Core Journals), 2025, No. 3, pp. 277-283 (7 pages)
To address the dissolved oxygen (DO) concentration control problem in wastewater treatment systems, a transferable incremental heuristic dynamic programming (TI-HDP) algorithm is proposed. Tailored to the characteristics of the wastewater treatment process, the algorithm changes the update of the control variable to an incremental form, which improves its disturbance rejection and reduces the structural difference from the incremental proportional-integral-derivative (PID) algorithm. Following a data-driven idea, historical data generated by the PID algorithm are used to transfer expert experience from classical control into the TI-HDP framework, ensuring the stability of the TI-HDP control policy in the early learning stage. Simulation results show that, compared with the PID algorithm and conventional heuristic dynamic programming, the proposed algorithm achieves higher control accuracy for the DO concentration.
Keywords: heuristic dynamic programming (HDP), intelligent control, knowledge transfer, nonlinear systems, neural networks, wastewater treatment
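The incremental (velocity-form) PID update that the TI-HDP control law is said to mirror can be sketched as follows; the gains and error sequence are illustrative values only, not parameters from the article.

```python
def incremental_pid(kp, ki, kd):
    """Velocity-form PID: returns the control increment
    du(k) = kp*[e(k)-e(k-1)] + ki*e(k) + kd*[e(k)-2e(k-1)+e(k-2)]."""
    e1 = e2 = 0.0                      # e(k-1), e(k-2)
    def step(error):
        nonlocal e1, e2
        du = kp * (error - e1) + ki * error + kd * (error - 2 * e1 + e2)
        e2, e1 = e1, error
        return du
    return step

pid = incremental_pid(kp=2.0, ki=0.5, kd=0.1)
u = 0.0
for e in [1.0, 0.8, 0.5, 0.2]:         # toy dissolved-oxygen tracking errors
    u += pid(e)                        # the controller adjusts u by increments
```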
Human-AI interactive optimized shared control
9
Authors: Junkai Tan, Shuangsi Xue, Hui Cao, Shuzhi Sam Ge. Journal of Automation and Intelligence, 2025, No. 3, pp. 163-176 (14 pages)
This paper presents an optimized shared control algorithm for human-AI interaction, implemented through a digital twin framework where the physical system and human operator act as the real agent while an AI-driven digital system functions as the virtual agent. In this digital twin architecture, the real agent acquires an optimal control strategy through observed actions, while the AI virtual agent mirrors the real agent to establish a digital replica system and corresponding control policy. Both the real and virtual optimal controllers are approximated using reinforcement learning (RL) techniques. Specifically, critic neural networks (NNs) are employed to learn the virtual and real optimal value functions, while actor NNs are trained to derive their respective optimal controllers. A novel shared mechanism is introduced to integrate both virtual and real value functions into a unified learning framework, yielding an optimal shared controller. This controller adaptively adjusts the confidence ratio between virtual and real agents, enhancing the system's efficiency and flexibility in handling complex control tasks. The stability of the closed-loop system is rigorously analyzed using the Lyapunov method. The effectiveness of the proposed AI-human interactive system is validated through two numerical examples: a representative nonlinear system and an unmanned aerial vehicle (UAV) control system.
Keywords: human-AI interaction, digital-twin system, adaptive dynamic programming (ADP), data-driven, optimal shared control
Learning-based tracking control of AUV: Mixed policy improvement and game-based disturbance rejection
10
Authors: Jun Ye, Hongbo Gao, Manjiang Hu, Yougang Bian, Qingjia Cui, Xiaohui Qin, Rongjun Ding. CAAI Transactions on Intelligence Technology, 2025, No. 2, pp. 510-528 (19 pages)
A mixed adaptive dynamic programming (ADP) scheme based on zero-sum game theory is developed to address optimal control problems of autonomous underwater vehicle (AUV) systems subject to disturbances and safety constraints. By combining prior dynamic knowledge and actual sampled data, the proposed approach effectively mitigates the defect caused by an inaccurate dynamic model and significantly improves the training speed of the ADP algorithm. Initially, the dataset is enriched with sufficient reference data collected from a nominal model without considering modelling bias. Also, the control object interacts with the real environment and continuously gathers adequate sampled data into the dataset. To comprehensively leverage the advantages of model-based and model-free methods during training, an adaptive tuning factor is introduced based on the dataset, which possesses model-referenced information and conforms to the distribution of the real-world environment; this factor balances the influence of the model-based control law and the data-driven policy gradient on the direction of policy improvement. As a result, the proposed approach accelerates the learning speed compared to data-driven methods, while also enhancing the tracking performance in comparison to model-based control methods. Moreover, the optimal control problem under disturbances is formulated as a zero-sum game, and an actor-critic-disturbance framework is introduced to approximate the optimal control input, cost function, and disturbance policy, respectively. Furthermore, the convergence property of the proposed algorithm based on the value iteration method is analysed. Finally, an example of AUV path following based on improved line-of-sight guidance is presented to demonstrate the effectiveness of the proposed method.
Keywords: adaptive dynamic programming, autonomous underwater vehicle, game theory, optimal control, reinforcement learning
Data-based neural controls for an unknown continuous-time multi-input system with integral reinforcement
11
Authors: Yongfeng Lv, Jun Zhao, Wan Zhang, Huimin Chang. Control Theory and Technology, 2025, No. 1, pp. 118-130 (13 pages)
Integral reinforcement learning (IRL) is an effective tool for solving optimal control problems of nonlinear systems, and it has been widely utilized in optimal controller design for discrete-time nonlinear systems. However, solving the Hamilton-Jacobi-Bellman (HJB) equations for nonlinear systems requires precise and complicated dynamics. Moreover, the research and application of IRL in continuous-time (CT) systems must be further improved. To develop the IRL of a CT nonlinear system, a data-based adaptive neural dynamic programming (ANDP) method is proposed to investigate the optimal control problem of uncertain CT multi-input systems, such that knowledge of the dynamics in the HJB equation is unnecessary. First, the multi-input model is approximated using a neural network (NN), which can be utilized to design an integral reinforcement signal. Subsequently, two criterion networks and one action network are constructed based on the integral reinforcement signal. A nonzero-sum Nash equilibrium can be reached by learning the optimal strategies of the multi-input model. In this scheme, the NN weights are constantly updated using an adaptive algorithm. The weight convergence and the system stability are analyzed in detail. The optimal control problem of a multi-input nonlinear CT system is effectively solved using the ANDP scheme, and the results are verified by a simulation study.
Keywords: adaptive dynamic programming, integral reinforcement, neural networks, heuristic dynamic programming, multi-input system
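The defining identity of integral reinforcement learning, which lets the value function be updated from measured trajectory data over an interval of length T without knowledge of the drift dynamics, can be written in its standard single-cost form (not the paper's multi-input formulation) as:

```latex
V\big(x(t)\big) \;=\; \int_{t}^{t+T} r\big(x(\tau),u(\tau)\big)\,\mathrm{d}\tau
\;+\; V\big(x(t+T)\big),
\qquad \dot{x} = f(x) + g(x)\,u .
```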
An ADP-based robust control scheme for nonaffine nonlinear systems with uncertainties and input constraints
12
Authors: Shijie Luo, Kun Zhang, Wenchao Xue. Chinese Physics B, 2025, No. 6, pp. 251-260 (10 pages)
The paper develops a robust control approach for nonaffine nonlinear continuous systems with input constraints and unknown uncertainties. Firstly, this paper constructs an affine augmented system (AAS) within a pre-compensation technique for converting the original nonaffine dynamics into affine dynamics. Secondly, the paper derives a stability criterion linking the original nonaffine system and the auxiliary system, demonstrating that the optimal policies obtained from the auxiliary system can achieve a robust controller for the nonaffine system. Thirdly, an online adaptive dynamic programming (ADP) algorithm is designed for approximating the optimal solution of the Hamilton-Jacobi-Bellman (HJB) equation. Moreover, the gradient descent approach and the projection approach are employed for updating the actor-critic neural network (NN) weights, with the algorithm's convergence being proven. Then, the uniformly ultimately bounded stability of the state is guaranteed. Finally, some simulation examples are offered for validating the effectiveness of the presented approach.
Keywords: adaptive dynamic programming, robust control, nonaffine nonlinear system, neural network
Decentralised adaptive learning-based control of robot manipulators with unknown parameters
13
Authors: Emil Mühlbradt Sveen, Jing Zhou, Morten Kjeld Ebbesen, Mohammad Poursina. Journal of Automation and Intelligence, 2025, No. 2, pp. 136-144 (9 pages)
This paper studies motor joint control of a 4-degree-of-freedom (DoF) robotic manipulator using a learning-based Adaptive Dynamic Programming (ADP) approach. The manipulator's dynamics are modelled as an open-loop 4-link serial kinematic chain with 4 degrees of freedom. Decentralised optimal controllers are designed for each link using the ADP approach based on a set of cost matrices and data collected from exploration trajectories. The proposed control strategy employs an off-line, off-policy iterative approach to derive four optimal control policies, one for each joint, under exploration strategies. The objective of the controller is to control the position of each joint. Simulation and experimental results show that four independent optimal controllers are found, each under similar exploration strategies, and that the proposed ADP approach successfully yields optimal linear control policies despite significant complexities. The experimental results conducted on the Quanser QArm robotic platform demonstrate the effectiveness of the proposed ADP controllers in handling significant dynamic nonlinearities, such as actuation limitations, output saturation, and filter delays.
Keywords: adaptive dynamic programming, optimal control, robot manipulator, 4-DoF, unknown dynamics
Event-Triggered Robust Parallel Optimal Consensus Control for Multiagent Systems
14
Authors: Qinglai Wei, Shanshan Jiao, Qi Dong, Fei-Yue Wang. IEEE/CAA Journal of Automatica Sinica, 2025, No. 1, pp. 40-53 (14 pages)
This paper highlights the utilization of parallel control and adaptive dynamic programming (ADP) for event-triggered robust parallel optimal consensus control (ETRPOC) of uncertain nonlinear continuous-time multiagent systems (MASs). First, the parallel control system, which consists of a virtual control variable and a specific auxiliary variable obtained from the coupled Hamiltonian, allows general systems to be transformed into affine systems. Of interest is the fact that the introduction of the parallel control technique provides an unprecedented perspective on eliminating the negative effects of disturbance. Then, an event-triggered mechanism is adopted to save communication resources while ensuring the system's stability. The solution of the coupled Hamilton-Jacobi (HJ) equation is approximated using a critic neural network (NN), whose weights are updated in response to events. Furthermore, theoretical analysis reveals that the weight estimation error is uniformly ultimately bounded (UUB). Finally, numerical simulations demonstrate the effectiveness of the developed ETRPOC method.
Keywords: adaptive dynamic programming (ADP), critic neural network (NN), event-triggered control, optimal consensus control, robust control
Optimal Impulse Control and Impulse Game for Continuous-Time Deterministic Systems: A Review
15
Authors: Chuandong Li, Wenxuan Wang. Artificial Intelligence Science and Engineering, 2025, No. 3, pp. 208-219 (12 pages)
Optimal impulse control and impulse games provide cutting-edge frameworks for modeling systems in which control actions occur at discrete time points and objectives are optimized under discontinuous interventions. This review synthesizes the theoretical advancements, computational approaches, emerging challenges, and possible research directions in the field. Firstly, we briefly review the fundamental theory of continuous-time optimal control, including Pontryagin's maximum principle (PMP) and the dynamic programming principle (DPP). Secondly, we present the foundational results in optimal impulse control, including necessary conditions and sufficient conditions. Thirdly, we systematize impulse game methodologies, from Nash equilibrium existence theory to the connection between Nash equilibria and system stability. Fourthly, we summarize the numerical algorithms, including intelligent computation approaches. Finally, we examine new trends and challenges in theory and applications as well as computational considerations.
Keywords: optimal impulse control, impulse game, Pontryagin's maximum principle, dynamic programming principle
Multiple fixed-wing UAVs collaborative coverage 3D path planning method for complex areas
16
Authors: Mengyang Wang, Dong Zhang, Chaoyue Li, Zhaohua Zhang. Defence Technology, 2025, No. 5, pp. 197-215 (19 pages)
Complex multi-area collaborative coverage path planning in dynamic environments poses a significant challenge for multiple fixed-wing UAVs (multi-UAV). This study establishes a comprehensive framework that incorporates UAV capabilities, terrain, complex areas, and mission dynamics. A novel dynamic collaborative path planning algorithm is introduced, designed to ensure complete coverage of designated areas. This algorithm meticulously optimizes the operation, entry, and transition paths for each UAV, while also establishing evaluation metrics to refine the coverage sequence for each area. Additionally, a three-dimensional path is computed utilizing an altitude descent method, effectively integrating two-dimensional coverage paths with altitude constraints. The efficacy of the proposed approach is validated through digital simulations and mixed-reality semi-physical experiments across a variety of dynamic scenarios, including both single-area and multi-area coverage by multi-UAV. Results show that the coverage paths generated by this method significantly reduce both computation time and path length, providing a reliable solution for dynamic multi-UAV mission planning in semi-physical environments.
Keywords: multi-fixed-wing UAVs (multi-UAV), minimum-time cooperative coverage, dynamic complete coverage path planning (DCCPP), Dubins curves, improved dynamic programming algorithm (IDP)
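One way to picture the coverage-sequence optimization that the keywords attribute to an improved dynamic programming algorithm is the classic Held-Karp subset DP below; the cost matrix is a made-up placeholder for inter-area transition costs, and the sketch ignores the entry points, UAV kinematics, and dynamic re-planning treated in the article.

```python
from itertools import combinations

def best_visit_order(cost):
    """Held-Karp DP for an open tour over coverage areas starting at area 0:
    dp[(S, last)] = cheapest path that starts at 0, visits exactly the areas
    in S, and ends at `last`.  Returns (total cost, visiting order)."""
    n = len(cost)
    dp = {(frozenset([0, j]), j): (cost[0][j], [0, j]) for j in range(1, n)}
    for size in range(3, n + 1):
        for subset in combinations(range(1, n), size - 1):
            S = frozenset(subset) | {0}
            for last in subset:
                prev_set = S - {last}
                dp[(S, last)] = min(
                    (dp[(prev_set, p)][0] + cost[p][last],
                     dp[(prev_set, p)][1] + [last])
                    for p in subset if p != last
                )
    full = frozenset(range(n))
    return min(dp[(full, last)] for last in range(1, n))

# toy symmetric transition costs between four areas (illustrative numbers)
c = [[0, 4, 9, 5],
     [4, 0, 3, 7],
     [9, 3, 0, 2],
     [5, 7, 2, 0]]
print(best_visit_order(c))   # cheapest cost and the corresponding visiting order
```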
Optimal investment-consumption problem with discontinuous prices and random horizon
17
Authors: CHEN Tian, HUANG Zong-yuan, WU Zhen. Applied Mathematics (A Journal of Chinese Universities), 2025, No. 2, pp. 359-374 (16 pages)
This paper investigates an international optimal investment-consumption problem under a random time horizon. The investor may allocate wealth between a domestic bond and an international real project with production output, whose price may exhibit discontinuities. The model incorporates the effects of taxation and exchange rate dynamics, where the exchange rate follows a stochastic differential equation with jump-diffusion. The investor's objective is to maximize the utility of consumption and terminal wealth over an uncertain investment horizon. It is worth noting that, under our framework, the exit time is not assumed to be a stopping time. In particular, for the case of constant relative risk aversion (CRRA), we derive the optimal investment and consumption strategies by applying the separation method to solve the associated Hamilton-Jacobi-Bellman (HJB) equation. Moreover, several numerical examples are provided to illustrate the practical applicability of the proposed results.
Keywords: corporate international investment, random time horizon, dynamic programming principle
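As context for the separation method mentioned above, the classical diffusion-only CRRA consumption-investment HJB (without the jumps, taxation, exchange rate, and random horizon treated in the paper) reads, with wealth w, portfolio fraction pi, consumption c, and discount rate rho:

```latex
V_t + \sup_{\pi,\; c \ge 0}\Big\{ \big[\, r w + \pi w(\mu - r) - c \,\big] V_w
      + \tfrac{1}{2}\sigma^2 \pi^2 w^2 V_{ww}
      + \frac{c^{1-\gamma}}{1-\gamma} \Big\} - \rho V = 0 ,
\qquad V(t,w) = h(t)\,\frac{w^{1-\gamma}}{1-\gamma} .
```

The power-form ansatz on the right is the separation step: substituting it reduces the PDE to an ordinary differential equation for h(t).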
Residential Energy Scheduling for Variable Weather Solar Energy Based on Adaptive Dynamic Programming (Cited by 18)
18
Authors: Derong Liu, Yancai Xu, Qinglai Wei, Xinliang Liu. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2018, No. 1, pp. 36-46 (11 pages)
The residential energy scheduling of solar energy is an important research area of the smart grid. On the demand side, factors such as household loads, storage batteries, the outside public utility grid, and renewable energy resources are combined together as a nonlinear, time-varying, indefinite, and complex system, which is difficult to manage or optimize. Many nations have already applied residential real-time pricing to balance the burden on their grid. In order to enhance the electricity efficiency of the residential microgrid, this paper presents an action-dependent heuristic dynamic programming (ADHDP) method to solve the residential energy scheduling problem. The highlights of this paper are listed below. First, weather-type classification is adopted to establish three types of programming models based on the features of the solar energy. In addition, the priorities of different energy resources are set to reduce the loss of electrical energy transmissions. Second, three ADHDP-based neural networks, which can update themselves during applications, are designed to manage the flows of electricity. Third, simulation results show that the proposed scheduling method effectively reduces the total electricity cost and improves the load balancing process. The comparison with the particle swarm optimization algorithm further proves that the present method has a promising effect on energy management to save cost.
Keywords: action-dependent heuristic dynamic programming, adaptive dynamic programming, control strategy, residential energy management, smart grid
PDP: Parallel Dynamic Programming (Cited by 15)
19
Authors: Fei-Yue Wang, Jie Zhang, Qinglai Wei, Xinhu Zheng, Li Li. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2017, No. 1, pp. 1-5 (5 pages)
Deep reinforcement learning is a focal research area in artificial intelligence. The principle of optimality in dynamic programming is a key to the success of reinforcement learning methods. The principle of adaptive dynamic programming (ADP) is first presented instead of direct dynamic programming (DP), and the inherent relationship between ADP and deep reinforcement learning is developed. Next, analytics intelligence, as a necessary requirement for real reinforcement learning, is discussed. Finally, the principle of parallel dynamic programming, which integrates dynamic programming and analytics intelligence, is presented as the future of computational intelligence.
Keywords: parallel dynamic programming, dynamic programming, adaptive dynamic programming, reinforcement learning, deep learning, neural networks, artificial intelligence
Parallel Control for Optimal Tracking via Adaptive Dynamic Programming (Cited by 25)
20
Authors: Jingwei Lu, Qinglai Wei, Fei-Yue Wang. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2020, No. 6, pp. 1662-1674 (13 pages)
This paper studies the problem of optimal parallel tracking control for continuous-time general nonlinear systems. Unlike existing optimal state feedback control, the control input of the optimal parallel control is introduced into the feedback system. However, due to the introduction of the control input into the feedback system, the optimal state feedback control methods cannot be applied directly. To address this problem, an augmented system and an augmented performance index function are proposed first. Thus, the general nonlinear system is transformed into an affine nonlinear system. The difference between the optimal parallel control and the optimal state feedback control is analyzed theoretically. It is proven that the optimal parallel control with the augmented performance index function can be seen as the suboptimal state feedback control with the traditional performance index function. Moreover, an adaptive dynamic programming (ADP) technique is utilized to implement the optimal parallel tracking control, using a critic neural network (NN) to approximate the value function online. The stability analysis of the closed-loop system is performed using Lyapunov theory, and the tracking error and NN weight errors are uniformly ultimately bounded (UUB). Also, the optimal parallel controller guarantees the continuity of the control input under the circumstance that there are finite jump discontinuities in the reference signals. Finally, the effectiveness of the developed optimal parallel control method is verified in two cases.
Keywords: adaptive dynamic programming (ADP), nonlinear optimal control, parallel controller, parallel control theory, parallel system, tracking control, neural network (NN)
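One common way to realize the kind of augmentation described above, consistent with the abstract although the exact construction in the paper may differ, is to treat the control input as an additional state and let its time derivative be the new decision variable, which makes the augmented dynamics affine:

```latex
\dot{x} = f(x, u), \qquad
X = \begin{bmatrix} x \\ u \end{bmatrix}, \qquad
\dot{X} = \begin{bmatrix} f(x, u) \\ 0 \end{bmatrix}
        + \begin{bmatrix} 0 \\ I \end{bmatrix} v ,
```

where v is the new (virtual) input and the augmented performance index would additionally penalize v.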