Abstract: At first sight, advanced operations research appears to be used less in continuous production systems than in mass production, batch production, and job shop systems. A comprehensive evaluation, however, shows that advanced operations research techniques can be applied very widely to continuous production systems in developing countries, because of initially inadequate plant layout, stage-by-stage development of production lines, the purchase of second-hand machinery from various countries, and a plurality of customers. A production system planning case is presented for a chemical company in which almost all of these conditions are present. The goals and constraints are as follows: (1) minimizing deviation from customers' requirements; (2) maximizing profit; (3) minimizing the frequency of changes in formula production; (4) minimizing the inventory of final products; (5) balancing the production sections with regard to production rate; (6) limiting the inventory of raw material. Under these conditions, various techniques such as goal programming, linear programming, and dynamic programming can be used. Dynamic production programming problems fall into two categories: those with limited production capacity and those with unlimited production capacity. For the first category, no systematic and acceptable solution has yet been presented. An innovative method is therefore used to convert the dynamic problem into a zero-one model. Finally, the problem is reformulated as a goal programming model with nonlinear constraints and solved with the GRG (generalized reduced gradient) algorithm.
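For the second category named in the abstract (unlimited production capacity), a classical dynamic programming recursion exists. The following is a minimal sketch of that textbook case, not the paper's zero-one method. It assumes a single product, a fixed setup cost per production run, and a linear holding cost; the function name `plan_uncapacitated` and the sample data are hypothetical.

```python
from typing import List, Tuple


def plan_uncapacitated(demand: List[int], setup_cost: float,
                       holding_cost: float) -> Tuple[float, List[int]]:
    """Wagner-Whitin style DP for lot sizing with unlimited capacity.

    best[t] is the minimum cost of meeting demand for periods 0..t-1.
    Illustrative assumptions: one product, a fixed setup cost per run,
    holding cost charged per unit per period carried.
    """
    n = len(demand)
    best = [0.0] + [float("inf")] * n
    last_run = [0] * (n + 1)  # period of the final run in the best plan for 0..t-1

    for t in range(1, n + 1):
        for j in range(t):  # the last production run happens in period j
            carry = sum((k - j) * demand[k] for k in range(j, t))
            cost = best[j] + setup_cost + holding_cost * carry
            if cost < best[t]:
                best[t], last_run[t] = cost, j

    # Walk back through the stored decisions to recover quantities per period.
    plan = [0] * n
    t = n
    while t > 0:
        j = last_run[t]
        plan[j] = sum(demand[j:t])
        t = j
    return best[n], plan


total_cost, production = plan_uncapacitated([20, 50, 10, 50], 100.0, 1.0)
print(total_cost, production)
```

The capacitated case that the abstract targets cannot rely on this recursion, because a run can no longer cover an arbitrary number of future periods.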
Funding: This work was supported by the Tianjin Natural Science Foundation under Grant 20JCYBJC00880, the Beijing Key Laboratory Open Fund of Long-Life Technology of Precise Rotation and Transmission Mechanisms, and the Guangdong Provincial Key Laboratory of Intelligent Decision and Cooperative Control.
Abstract: Owing to its extensive applications in many fields, the synchronization problem has been widely investigated in multi-agent systems. Synchronization is a pivotal issue for multi-agent systems: under the designed control policy, the system output or the state of each agent becomes consistent with the leader. The purpose of this paper is to investigate a heuristic dynamic programming (HDP)-based learning tracking control for discrete-time multi-agent systems that achieves synchronization while accounting for disturbances in the systems. In addition, because the coupled Hamilton–Jacobi–Bellman equation is difficult to solve analytically, an improved HDP learning control algorithm is proposed to synchronize the leader and all following agents; it is executed by an action-critic neural network. With the introduction of an auxiliary action network, the action and critic neural networks are used to learn the optimal control policy and the cost function, respectively. Finally, two numerical examples and a practical application to mobile robots are presented to demonstrate the control performance of the HDP-based learning control algorithm.
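The value-iteration idea behind HDP can be illustrated on a much simpler problem than the one in the abstract. The sketch below is an assumption-laden toy, not the paper's algorithm: it uses a single scalar linear system x' = a x + b u with stage cost q x^2 + r u^2, no disturbances and no other agents, and it replaces the critic and action neural networks with a quadratic value function p x^2 and a linear gain k. All names and parameter values are hypothetical.

```python
def hdp_value_iteration(a: float, b: float, q: float, r: float,
                        iterations: int = 100) -> tuple:
    """Value iteration V_{i+1}(x) = min_u [q x^2 + r u^2 + V_i(a x + b u)].

    With V_i(x) = p_i * x^2 the minimisation has a closed form, so the
    'critic' reduces to the scalar p and the 'action' to the gain k.
    """
    p = 0.0  # V_0 = 0, a common HDP initialisation
    k = 0.0
    for _ in range(iterations):
        k = a * b * p / (r + b * b * p)           # greedy action u = -k x
        p = q + r * k * k + p * (a - b * k) ** 2  # critic update with that action
    return p, k


p_star, k_star = hdp_value_iteration(a=1.0, b=1.0, q=1.0, r=1.0)
print(p_star, k_star)
```

For a = b = q = r = 1 the fixed point satisfies p^2 = p + 1, so the iteration should settle at the golden ratio. In the general nonlinear, multi-agent setting no such closed form exists, which is why function approximators are used instead.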
Funding: Supported in part by the National Natural Science Foundation of China (61533017, U1501251, 61374105, 61722312).
Abstract: Residential energy scheduling of solar energy is an important research area in smart grids. On the demand side, factors such as household loads, storage batteries, the outside public utility grid, and renewable energy resources combine into a nonlinear, time-varying, indefinite, and complex system that is difficult to manage or optimize. Many nations have already applied residential real-time pricing to balance the burden on their grids. To improve the electricity efficiency of the residential microgrid, this paper presents an action-dependent heuristic dynamic programming (ADHDP) method to solve the residential energy scheduling problem. The highlights of the paper are as follows. First, weather-type classification is adopted to establish three types of programming models based on the features of solar energy. In addition, priorities are assigned to the different energy resources to reduce losses in electrical energy transmission. Second, three ADHDP-based neural networks, which can update themselves during operation, are designed to manage the flows of electricity. Third, simulation results show that the proposed scheduling method effectively reduces the total electricity cost and improves load balancing. A comparison with the particle swarm optimization algorithm further shows that the method is promising for cost-saving energy management.
Funding: Supported by the National Basic Research Program of China (2012CB720500) and the National High Technology Research and Development Program of China (2013AA040702).
Abstract: This paper introduces a practical scheme for solving grade transition trajectory optimization (GTTO) problems under a typical certificate-checking–updating framework. Because of the complicated kinetics of polymerization, differential/algebraic equations (DAEs) impose a heavy computational burden, and system nonlinearity usually makes GTTO non-convex with multiple optima. Therefore, coupled with a three-stage decomposition model, a three-section dynamic programming algorithm (TSDP) is proposed. It is based on the general iteration mechanism of iterative dynamic programming (IDP) and incorporates an adaptive grid allocation scheme and heuristic modifications. The algorithm iteratively performs dynamic programming with heuristic modifications under a constant computational load, and, guided by local error estimates, adaptively allocates computational resources to the regions where optimality can be further improved. Finally, TSDP is compared with IDP and the interior point (IP) method to verify its computational efficiency.
Funding: Supported by the China Postdoctoral Science Foundation (Grant No. 2020M681750).
Abstract: Intelligent confrontation has become a vital technology for future air combat. Confrontation games between a penetrating aircraft and an intercepting aircraft are essential to modern air combat, and the performance indexes of both the interceptor and the penetrator must be considered. Traditional methods solve only one side's guidance problem without considering the intelligence of the opponent. In this paper, an adaptive heuristic dynamic programming-based algorithm is proposed for aircraft confrontation games. The algorithm constructs a heuristic dynamic programming model for both confronting aircraft and then updates the critic and action network parameters using the dynamic confrontation state information. Numerical simulations indicate that the proposed algorithm can optimize the guidance law for both the interceptor and the penetrator and is therefore superior to traditional proportional navigation methods.
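For context, the proportional navigation baseline mentioned in the abstract can be sketched in a few lines. This is the generic planar textbook law (commanded lateral acceleration equals navigation gain times closing speed times line-of-sight rate), not the paper's algorithm. The function name `pn_acceleration`, the default gain of 3, and the sample geometry are illustrative assumptions.

```python
import math


def pn_acceleration(rel_pos, rel_vel, nav_gain=3.0):
    """Planar proportional navigation: a = N * Vc * (line-of-sight rate).

    rel_pos, rel_vel: target minus pursuer position and velocity as (x, y).
    """
    x, y = rel_pos
    vx, vy = rel_vel
    range_sq = x * x + y * y
    if range_sq == 0.0:
        raise ValueError("pursuer and target coincide")
    los_rate = (x * vy - y * vx) / range_sq  # time derivative of atan2(y, x)
    closing_speed = -(x * vx + y * vy) / math.sqrt(range_sq)
    return nav_gain * closing_speed * los_rate


# Hypothetical geometry: target 1000 m ahead, closing at 100 m/s, drifting at 10 m/s.
command = pn_acceleration((1000.0, 0.0), (-100.0, 10.0))
print(command)
```

The law reacts only to the current line-of-sight geometry and carries no model of the opponent, which is the limitation the abstract attributes to traditional methods.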
Funding: Supported by the National Natural Science Foundation of China (61602181, 61876025), the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (2017ZT07X183), the Guangdong Natural Science Foundation Research Team (2018B030312003), the Guangdong–Hong Kong Joint Innovation Platform (2018B050502006), and the Fundamental Research Funds for the Central Universities (D2191200).
Abstract: Maximizing the lifetime of wireless sensor networks (WSNs) is an important and challenging research problem. Properly scheduling the movements of mobile sinks to balance the energy consumption of the network is one of the most effective ways to prolong its lifetime. However, existing mobile sink scheduling methods either require a great deal of computational time or are ineffective at finding high-quality scheduling solutions. To address these issues, this paper proposes a novel hyper-heuristic framework that automatically constructs high-level heuristics to schedule the sink movements and prolong the network lifetime. In the proposed framework, a set of low-level heuristics is defined as building blocks for constructing high-level heuristics, and a set of random networks with different features is designed for training. A genetic programming algorithm is then adopted to automatically evolve promising high-level heuristics from the building blocks and the training networks. By using genetic programming to evolve more effective heuristics and applying these heuristics in a greedy scheme, the proposed hyper-heuristic framework prolongs the network lifetime competitively with other methods while consuming little time. A series of comprehensive experiments covering both static and dynamic networks is designed. The simulation results demonstrate that the proposed method offers very promising performance in terms of network lifetime and response time.
Abstract: Integral reinforcement learning (IRL) is an effective tool for solving optimal control problems of nonlinear systems, and it has been widely used in optimal controller design for discrete-time nonlinear systems. However, solving the Hamilton-Jacobi-Bellman (HJB) equations for nonlinear systems requires precise and complicated dynamics. Moreover, the research and application of IRL in continuous-time (CT) systems must be further improved. To develop IRL for a CT nonlinear system, a data-based adaptive neural dynamic programming (ANDP) method is proposed to investigate the optimal control problem of uncertain CT multi-input systems, such that knowledge of the dynamics in the HJB equation is unnecessary. First, the multi-input model is approximated using a neural network (NN), which can be used to design an integral reinforcement signal. Subsequently, two criterion networks and one action network are constructed based on the integral reinforcement signal. A nonzero-sum Nash equilibrium can be reached by learning the optimal strategies of the multi-input model. In this scheme, the NN weights are constantly updated using an adaptive algorithm. The weight convergence and the system stability are analyzed in detail. The optimal control problem of a multi-input nonlinear CT system is effectively solved using the ANDP scheme, and the results are verified by a simulation study.