Abstract: The evolution of enabling technologies in wireless communications has paved the way for supporting novel applications with more demanding QoS requirements, but at the cost of increasing the complexity of optimizing the digital communication chain. In particular, millimeter wave (mmWave) communications provide an abundance of bandwidth, and energy harvesting supplies the network with a continual source of energy to facilitate self-sustainability; however, harnessing these technologies is challenging due to the stochastic dynamics of the mmWave channel as well as the random, sporadic nature of the harvested energy. In this paper, we aim at the dynamic optimization of update transmissions in mmWave energy harvesting systems in terms of Age of Information (AoI). AoI has recently been introduced to quantify information freshness and is a more stringent QoS metric than conventional delay and throughput. However, most prior art has only addressed average-based AoI metrics, which can be insufficient to capture the occurrence of rare but high-impact freshness violation events in time-critical scenarios. We formulate a control problem that aims to minimize the long-term entropic risk measure of AoI samples by configuring the "sense & transmit" of updates. Due to the high complexity of the exponential cost function, we reformulate the problem with an approximated mean-variance risk measure as the new objective. Under unknown system statistics, we propose a two-timescale model-free risk-sensitive reinforcement learning algorithm to compute a control policy that adapts to the trio of channel, energy, and AoI states. We evaluate the efficiency of the proposed scheme through extensive simulations.
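The replacement of the entropic risk measure by a mean-variance objective can be illustrated numerically. The sketch below is not the paper's algorithm; it only compares the two risk measures on a list of samples, assuming the common definition (1/theta) * log E[exp(theta * X)] for the entropic risk and its second-order approximation E[X] + (theta/2) * Var(X). The function names and the symbol `theta` are hypothetical choices made for this illustration.

```python
import math


def entropic_risk(samples, theta):
    """Entropic risk (1/theta) * log(mean(exp(theta * x))).

    Uses the log-sum-exp trick so large theta * x does not overflow.
    `theta` is an assumed risk-sensitivity parameter (nonzero).
    """
    scaled = [theta * x for x in samples]
    m = max(scaled)
    mean_exp = sum(math.exp(s - m) for s in scaled) / len(scaled)
    return (m + math.log(mean_exp)) / theta


def mean_variance_risk(samples, theta):
    """Second-order approximation: mean + (theta / 2) * variance."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return mean + 0.5 * theta * var


if __name__ == "__main__":
    # Hypothetical AoI samples, for illustration only.
    aoi_samples = [1.0, 2.0, 3.0, 4.0]
    for theta in (0.01, 0.1, 1.0):
        print(theta,
              entropic_risk(aoi_samples, theta),
              mean_variance_risk(aoi_samples, theta))
```

For small `theta` the two values are close, while for larger `theta` the entropic risk weights the largest samples more heavily than the approximation does, which is the trade-off accepted when the exponential cost is replaced.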
Funding: Supported by the National Basic Research Program of China (973 Program, 2007CB814904); the National Natural Science Foundations of China (10921101); Shandong Province (2008BS01024, ZR2010AQ004); the Science Funds for Distinguished Young Scholars of Shandong Province (JQ200801); Shandong University (2009JQ004); and the Independent Innovation Foundations of Shandong University (IIFSDU, 2009TS036, 2010TS060).
Abstract: A stochastic maximum principle for the risk-sensitive optimal control problem of jump diffusion processes with an exponential-of-integral cost functional is derived under the assumption that the value function is smooth, where the diffusion and jump terms may both depend on the control. The form of the maximum principle is similar to that of its risk-neutral counterpart, but the adjoint equations and the maximum condition depend heavily on the risk-sensitive parameter. As an application, a linear-quadratic risk-sensitive control problem is solved using the derived maximum principle, and an explicit optimal control is obtained.
Funding: Project supported by the National Natural Science Foundation of China (Nos. 10471088 and 60572126).
Abstract: A new algorithm is proposed that potentially sacrifices the optimality of control policies in order to obtain robust solutions. Robustness may become a very important property for a learning system when the theoretical model does not match the practical physical system, when the practical system is not static, or when the availability of a control action changes over time. The main contribution is a set of approximation algorithms together with their convergence results. A generalized average operator, in place of the usual optimal operator max (or min), is applied to study an important class of learning algorithms, namely dynamic programming algorithms, and their convergence is discussed from a theoretical point of view. The purpose of this research is to improve the robustness of reinforcement learning algorithms theoretically.
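The idea of replacing max with a generalized average in dynamic programming can be sketched as follows. The abstract does not say which operator the authors use, so this example assumes one common choice, the log-mean-exp average, which lies between the arithmetic mean and the maximum. The toy deterministic MDP, the function names, and the parameters `beta` and `gamma` are all hypothetical.

```python
import math


def soft_average(values, beta):
    """Log-mean-exp average (1/beta) * log(mean(exp(beta * v))), beta > 0.

    Tends to the arithmetic mean as beta -> 0 and to max(values) as
    beta -> infinity. This is an assumed example of a generalized
    average operator, not necessarily the one used in the paper.
    """
    m = max(values)
    mean_exp = sum(math.exp(beta * (v - m)) for v in values) / len(values)
    return m + math.log(mean_exp) / beta


def value_iteration(rewards, next_state, gamma, beta, iters=200):
    """Value iteration with soft_average in place of max.

    rewards[s][a] is the reward and next_state[s][a] the deterministic
    successor for action a in state s (a toy model for illustration).
    """
    values = [0.0] * len(rewards)
    for _ in range(iters):
        values = [
            soft_average(
                [rewards[s][a] + gamma * values[next_state[s][a]]
                 for a in range(len(rewards[s]))],
                beta,
            )
            for s in range(len(rewards))
        ]
    return values


if __name__ == "__main__":
    rewards = [[0.0, 1.0], [2.0, 0.0]]
    next_state = [[0, 1], [0, 1]]
    for beta in (0.5, 5.0, 50.0):
        print(beta, value_iteration(rewards, next_state, gamma=0.9, beta=beta))
```

Because this operator is a non-expansion in the sup norm, the iteration remains a contraction for `gamma < 1`. Smaller values of `beta` give more weight to non-greedy actions, so the resulting values are less tied to a single action remaining available.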
Abstract: The risk-sensitive filtering design problem with respect to the exponential mean-square cost criterion is considered for stochastic Gaussian systems with polynomial drift terms of second and third degree and intensity parameters multiplying the diffusion terms in the state and observation equations. The closed-form optimal filtering equations are obtained using quadratic value functions as solutions to the corresponding Fokker-Planck-Kolmogorov equation. The performance of the obtained risk-sensitive filtering equations for stochastic polynomial systems of second and third degree is verified in a numerical example against the optimal polynomial filtering equations (and the extended Kalman-Bucy filter for the polynomial system of second degree) by comparing the values of the exponential mean-square cost criterion. The simulation results reveal strong advantages in favor of the designed risk-sensitive equations for some values of the intensity parameters.
Funding: Supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada.
Abstract: This paper considers risk-sensitive linear-quadratic mean-field games. By the so-called direct approach via dynamic programming, the authors determine the feedback Nash equilibrium in an N-player game. Subsequently, the authors design a set of decentralized strategies by passing to the mean-field limit. The authors prove that the set of decentralized strategies constitutes an O(1/N)-Nash equilibrium when applied by the N players, and hence obtain the tightest equilibrium error bounds so far for this class of models.
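For reference, an O(1/N)-Nash property of this kind is usually stated as an ε-Nash condition. The notation below is assumed for illustration and is not taken from the paper: J_i is the cost of player i, u_i^* are the decentralized strategies, U_i is the admissible strategy set of player i, and C is a constant independent of N.

```latex
J_i\bigl(u_i^{*}, u_{-i}^{*}\bigr)
  \;\le\; \inf_{u_i \in \mathcal{U}_i} J_i\bigl(u_i, u_{-i}^{*}\bigr) + \varepsilon_N,
\qquad
\varepsilon_N \le \frac{C}{N},
\qquad i = 1, \dots, N.
```

In words, no single player can lower its own cost by more than a term of order 1/N by deviating unilaterally from the decentralized strategy.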
Funding: This work was supported by the National Natural Science Foundation of China.
Abstract: In this paper, the design problem of satisfaction output feedback controls for stochastic nonlinear systems in strict feedback form under a long-term tracking risk-sensitive index is investigated. The index function adopted here is of the quadratic form usually encountered in practice, rather than of the quartic form used to sidestep the essential difficulty in controller design and performance analysis of the closed-loop systems. For any given risk-sensitive parameter and desired index value, an output feedback control is constructively designed by the integrator backstepping method so that the closed-loop system is bounded in probability and the risk-sensitive index is upper bounded by the desired value.
Funding: Supported by the National Natural Science Foundations of China under Grant Nos. 61273124 and 61174141; China Postdoctoral Science Foundation under Grant No. 2011M501132; Special Funds for Postdoctoral Innovative Projects of Shandong Province under Grant No. 201103043; Doctoral Foundation of Taishan University under Grant No. Y11-2-02; and A Project of Shandong Province Higher Education Science and Technology Program under Grant No. J12LN90.
Abstract: This paper investigates risk-sensitive fixed-point smoothing estimation for linear discrete-time systems with multiple time-delay measurements. The problem considered can be converted into an optimization problem in an indefinite space. The risk-sensitive fixed-point smoother is then obtained by solving the optimization problem via innovation analysis theory in the indefinite space. Necessary and sufficient conditions guaranteeing the existence of the risk-sensitive smoother are also given for the case where the risk-sensitive parameter is negative. Compared with the conventional approach, a significant advantage of the presented approach is its lower computational cost.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 11801154 and 11901112.
Abstract: The paper considers partially observed optimal control problems for risk-sensitive stochastic systems, where the control domain is non-convex and the diffusion term contains the control v. Utilizing Girsanov's theorem, the spike variational technique, and the duality method, the authors obtain four adjoint equations and establish a maximum principle under partial information. As an application, an example is presented to demonstrate the result.
Funding: Supported in part by the National Natural Science Foundation of China under Grant No. 62261136550 and in part by the Basic Research Project of Shanghai Science and Technology Commission under Grant No. 20JC1414000.
Abstract: The authors propose a data-driven direct adaptive control law based on the adaptive dynamic programming (ADP) algorithm for continuous-time stochastic linear systems with partially unknown system dynamics and infinite-horizon quadratic risk-sensitive indices. The authors use online data of the system to iteratively solve the generalized algebraic Riccati equation (GARE) and to learn the optimal control law directly. For the case with measurable system noises, the authors show that the adaptive control law approximates the optimal control law as time goes on. For the case with unmeasurable system noises, the authors use the least-squares solution calculated only from the measurable data, instead of the real solution of the regression equation, to iteratively solve the GARE. The authors also study the influence of the intensity of the system noises, the intensity of the exploration noises, the initial iterative matrix, and the sampling period on the convergence of the ADP algorithm. Finally, the authors present two numerical simulation examples to demonstrate the effectiveness of the proposed algorithms.
Funding: Supported by PRFU project N (Grant No. C00L03UN070120220004).
Abstract: This study advances the G-stochastic maximum principle (G-SMP) from a risk-neutral framework to a risk-sensitive one. A salient feature of this advancement is its applicability to systems governed by stochastic differential equations under G-Brownian motion (G-SDEs), where the control variable may influence all terms. We aim to generalize our findings from a risk-neutral context to a risk-sensitive performance cost. First, we introduce an auxiliary process to address risk-sensitive performance costs within the G-expectation framework. We then establish and validate the relationship between the G-expected exponential utility and the G-quadratic backward stochastic differential equation. Furthermore, we simplify the G-adjoint process from a two-component structure to a single component. Moreover, we derive the necessary optimality conditions for this model by considering a convex set of admissible controls. To illustrate the main findings, we present two examples: the first addresses the linear-quadratic problem and the second examines a Merton-type problem characterized by power utility.
Abstract: The two-player nonzero-sum linear-exponential-quadratic stochastic differential game is studied. The game takes into account the players' attitudes to risk. Nonlinear transformations and change-of-probability-measure techniques are used to study the existence of both open-loop and closed-loop Nash equilibria for the game. Some examples are constructed to illustrate their differences. Furthermore, the theoretical results are applied to solve the risk-sensitive portfolio game problem in the financial market and to show the effects of risk attitudes and economic performance on the equilibria.
Abstract: Tail risk is a classic topic in stressed portfolio optimization for treating unprecedented risks, where the traditional mean-variance approach may fail to perform well. This study proposes an innovative semiparametric method consisting of two modeling components: nonparametric estimation for each marginal distribution of the portfolio and the copula method for their joint distribution. We then focus on the optimal weights of the stressed portfolio and its optimal scale beyond the Gaussian restriction. Empirical studies include statistical estimation for the semiparametric method, risk measure minimization for the optimal weights, and value measure maximization for the optimal scale to enlarge the investment. In the results of short-term and long-term data analysis, optimal stressed portfolios demonstrate the advantage of model flexibility in accounting for tail risk over the traditional mean-variance method.
Funding: Supported by the Air Force Office of Scientific Research (AFOSR) under Grant No. FA9550-19-1-0353 and the Army Research Office MURI under Grant No. AG285.
Abstract: This is an overview paper on the relationship between risk-averse designs based on exponential loss functions, with or without an additional unknown (adversarial) term, and some classes of stochastic games. In particular, the paper discusses the equivalences between risk-averse controller and filter designs and saddle-point solutions of some corresponding risk-neutral stochastic differential games with different information structures for the players. One of the by-products of these analyses is that risk-averse controllers and filters (or estimators) for control and signal-measurement models are robust, through stochastic dissipation inequalities, to unmodeled perturbations in the controlled system dynamics as well as in the signal and measurement processes. The paper also discusses equivalences between risk-sensitive stochastic zero-sum differential games and some corresponding risk-neutral three-player stochastic zero-sum differential games, as well as robustness issues in stochastic nonzero-sum differential games with finite and infinite populations of players, the latter belonging to the domain of mean-field games.