90 articles found
A novel trajectories optimizing method for dynamic soaring based on deep reinforcement learning
1
Authors: Wanyong Zou, Ni Li, Fengcheng An, Kaibo Wang, Changyin Dong. Defence Technology (防务技术), 2025, No. 4, pp. 99-108 (10 pages)
Dynamic soaring, inspired by the wind-riding flight of birds such as albatrosses, is a biomimetic technique that leverages wind fields to enhance the endurance of unmanned aerial vehicles (UAVs). Achieving a precise soaring trajectory is crucial for maximizing energy efficiency during flight. Existing nonlinear programming methods depend heavily on the choice of initial values, which are hard to determine. Therefore, this paper introduces a deep reinforcement learning method based on a differentially flat model for dynamic soaring trajectory planning and optimization. Initially, the gliding trajectory is parameterized using Fourier basis functions, achieving a flexible trajectory representation with a minimal number of hyperparameters. Subsequently, the trajectory optimization problem is formulated as a dynamic interactive process of Markov decision-making. The hyperparameters of the trajectory are optimized using the Proximal Policy Optimization (PPO2) algorithm from deep reinforcement learning (DRL), reducing the strong reliance on initial value settings in the optimization process. Finally, a comparison between the proposed method and the nonlinear programming method reveals that the trajectory generated by the proposed approach is smoother while meeting the same performance requirements. Specifically, the proposed method achieves a 34% reduction in maximum thrust, a 39.4% decrease in maximum thrust difference, and a 33% reduction in maximum airspeed difference.
Keywords: Dynamic soaring; Differential flatness; Trajectory optimization; Proximal policy optimization
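The abstract above mentions parameterizing the gliding trajectory with Fourier basis functions. The following is only a minimal sketch of that general idea, not the authors' model: the function name, the coefficient layout, and the altitude numbers are all hypothetical, and a single coordinate is represented as a truncated Fourier series whose coefficients would be the quantities an optimizer tunes.

```python
import math

def fourier_trajectory(t, period, a0, cos_coeffs, sin_coeffs):
    """Evaluate a truncated Fourier series at time t.

    x(t) = a0 + sum_k [a_k cos(2*pi*k*t/T) + b_k sin(2*pi*k*t/T)]
    The coefficient lists are hypothetical tunable parameters.
    """
    omega = 2.0 * math.pi / period
    value = a0
    for k, (a_k, b_k) in enumerate(zip(cos_coeffs, sin_coeffs), start=1):
        value += a_k * math.cos(k * omega * t) + b_k * math.sin(k * omega * t)
    return value

# Assumed example: an altitude profile with mean 50 and one harmonic of
# amplitude 20, sampled at the start, quarter, and half of a 10 s cycle.
altitude = [fourier_trajectory(t, 10.0, 50.0, [20.0], [0.0]) for t in (0.0, 2.5, 5.0)]
```

Because every basis function is periodic, the represented trajectory closes on itself after one period by construction, which is one reason such a basis is convenient for cyclic flight patterns.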
Optimal quasi-periodic maintenance policies for two-unit series system (Cited by: 2)
2
Authors: 高文科, 张志胜, 周一帆, 甘淑媛. Journal of Southeast University (English Edition), indexed in EI, CAS, 2013, No. 4, pp. 450-455 (6 pages)
To investigate the effects of various random factors on the preventive maintenance (PM) decision-making of one type of two-unit series system, an optimal quasi-periodic PM policy is introduced. The model assumes that PM is perfect for unit 1 and consists only of mechanical service for unit 2. PM activity is randomly performed according to a dynamic PM plan distributed in each implementation period. A replacement is determined based on the competing results of unplanned and planned replacements. The unplanned replacement is triggered by a catastrophic failure of unit 2, and the planned replacement is executed when the PM number reaches the threshold N. Through modeling and analysis, a solution algorithm for an optimal implementation period and the PM number is given, and the optimization process and parametric sensitivity are illustrated by a numerical example. Results show that the implementation period should be shortened as far as practical requirements allow, which can increase the mean operating time and decrease the long-run cost rate.
Keywords: maintenance policy optimization quasi-periodic preventive maintenance two-unit series system
China’s National Carbon Price Trends and Outlook for 2025 (Cited by: 1)
3
Authors: Xu Dong, Zhou Xinyuan. China Oil & Gas, 2025, No. 4, pp. 25-32 (8 pages)
At the beginning of 2025, the carbon price in China’s national carbon market fell continuously and unilaterally, a departure from the overall steady upward trend in carbon prices since the carbon market launched in 2021. The analysis suggests that the primary reason for the recent decline in carbon prices is the reversal of supply and demand dynamics in the carbon market, with increased quota supply amid a sluggish economy. Downward pressure on carbon prices is expected to persist in the short term, but with more industries being included and continued policy optimization and improvement, a rise in China’s medium- to long-term carbon prices is highly probable. Recommendations for enterprises involved in carbon asset operations and management: first, refining carbon asset reserves and trading strategies; second, accelerating internal CCER project development; third, exploring carbon financial instrument applications; fourth, establishing and improving internal carbon pricing mechanisms; fifth, proactively planning for new industry inclusion.
Keywords: CCER project industrial inclusion reversal supply demand dynamics carbon price policy optimization supply demand dynamics carbon asset management carbon market
Rural Revitalization and the Transformation of Xinhui Chenpi Industry: A Case Study of Policy Implementation and Development Pathways
4
Author: Yuxin Yang. Proceedings of Business and Economic Studies, 2025, No. 5, pp. 132-141 (10 pages)
This paper examines the transformation and development of the Xinhui Chenpi industry under the rural revitalization strategy in China. The study highlights the significant growth of the industry, with the annual production of chenpi reaching approximately 7,000 tons and the total output value surpassing 26 billion yuan in 2024. The paper proposes strategies to foster sustainable growth in industries facing challenges such as inefficient production processes, inconsistent product quality, and a lack of policy awareness among operators. These strategies include optimizing support policies, enhancing regulatory frameworks, and leveraging digital technologies for brand building and market expansion. The research contributes to understanding the development trajectory of the Xinhui Chenpi industry and provides insights for policymakers and industry practitioners.
Keywords: Rural revitalization; Industrial transformation; Policy optimization; Digital marketing
Optimization Scheduling of Hydrogen-Coupled Electro-Heat-Gas Integrated Energy System Based on Generative Adversarial Imitation Learning
5
Authors: Baiyue Song, Chenxi Zhang, Wei Zhang, Leiyu Wan. Energy Engineering, 2025, No. 12, pp. 4919-4945 (27 pages)
Hydrogen energy is a crucial support for China’s low-carbon energy transition. With the large-scale integration of renewable energy, the combination of hydrogen and integrated energy systems has become one of the most promising directions of development. This paper proposes an optimized scheduling model for a hydrogen-coupled electro-heat-gas integrated energy system (HCEHG-IES) using generative adversarial imitation learning (GAIL). The model aims to enhance renewable-energy absorption, reduce carbon emissions, and improve grid-regulation flexibility. First, the optimal scheduling problem of HCEHG-IES under uncertainty is modeled as a Markov decision process (MDP). To overcome the limitations of conventional deep reinforcement learning algorithms, including long optimization time, slow convergence, and subjective reward design, this study augments the PPO algorithm by incorporating a discriminator network and expert data. The newly developed algorithm, termed GAIL, enables the agent to perform imitation learning from expert data. Based on this model, dynamic scheduling decisions are made in continuous state and action spaces, generating optimal energy-allocation and management schemes. Simulation results indicate that, compared with traditional reinforcement-learning algorithms, the proposed algorithm offers better economic performance. Guided by expert data, the agent avoids blind optimization, shortens the offline training time, and improves convergence performance. In the online phase, the algorithm enables flexible energy utilization, thereby promoting renewable-energy absorption and reducing carbon emissions.
Keywords: Hydrogen energy; optimization dispatch; generative adversarial imitation learning; proximal policy optimization; imitation learning; renewable energy
Dynamic hedging of 50ETF options using Proximal Policy Optimization
6
Authors: Lei Liu, Mengmeng Hao, Jinde Cao. Journal of Automation and Intelligence, 2025, No. 3, pp. 198-206 (9 pages)
This paper employs the PPO (Proximal Policy Optimization) algorithm to study the risk hedging problem of the Shanghai Stock Exchange (SSE) 50ETF options. First, the action and state spaces were designed based on the characteristics of the hedging task, and a reward function was developed according to the cost function of the options. Second, combining the concept of curriculum learning, the agent was guided to adopt a simulated-to-real learning approach for dynamic hedging tasks, reducing the learning difficulty and addressing the issue of insufficient option data. A dynamic hedging strategy for 50ETF options was constructed. Finally, numerical experiments demonstrate the superiority of the designed algorithm over traditional hedging strategies in terms of hedging effectiveness.
Keywords: B-S model; Option hedging; Reinforcement learning; 50ETF; Proximal Policy Optimization (PPO)
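Several abstracts in this listing rely on Proximal Policy Optimization. As a reminder of what the method optimizes, the sketch below computes the standard clipped surrogate objective of PPO for a batch of samples. It is the textbook form only; the function name and the sample numbers are illustrative, and none of the listed papers is claimed to use exactly this variant or this clipping range.

```python
def ppo_clip_objective(ratios, advantages, epsilon=0.2):
    """Mean clipped surrogate objective of PPO.

    ratios:     pi_new(a|s) / pi_old(a|s) for each sample
    advantages: advantage estimate for each sample
    epsilon:    clipping range (0.2 is a common default, assumed here)
    """
    terms = []
    for r, adv in zip(ratios, advantages):
        # Restrict the probability ratio to [1 - epsilon, 1 + epsilon].
        clipped_r = max(1.0 - epsilon, min(r, 1.0 + epsilon))
        # Taking the minimum gives a pessimistic bound on the improvement.
        terms.append(min(r * adv, clipped_r * adv))
    return sum(terms) / len(terms)

# Illustrative batch: one sample with a positive advantage whose ratio is
# clipped from above, one with a negative advantage clipped from below.
objective = ppo_clip_objective([1.5, 0.5], [1.0, -1.0])
```

The clipping removes the incentive to move the new policy far from the old one in a single update, which is the property that makes PPO comparatively stable to train.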
Research on Utility Evaluation and Optimization of the Third Pillar Pension in Multi-level Pension Security for Employees in New Business Forms
7
Authors: Ren Feixiao, Wang Wenbo, Zhang Kexin. Journal of Humanities and Nature, 2025, No. 2, pp. 3-19 (17 pages)
Against the backdrop of uneven pressure on the three-pillar pension system and a mismatch between pension funds and the demographic structure, a large number of employees in new forms of employment remain outside the pension security system and face relatively high pension risks. Given their high job mobility, weak long-term planning ability, and large income fluctuations, individual pension schemes may become a breakthrough point for improving their pension situation while maintaining the balance of the three-pillar pension system. In line with the national goal of building a multi-level and multi-pillar old-age insurance system, and to study the supplementary role of the third-pillar individual pension policy for employees in new forms of employment, this article constructs an evaluation system using the analytic hierarchy process and designs a questionnaire. After conducting a questionnaire survey in six cities in Shandong Province, the collected data are analyzed. It is found that the short-term effect of the current policy is that residents' awareness of pension issues is gradually improving and the participation rate is increasing, but the behavior is short-term, and residents generally tend to avoid pension risks. Therefore, regarding the deepening of the individual pension system, the article puts forward the following suggestions: (1) conduct comprehensive publicity through multiple channels, with emphasis on key points; (2) enhance the system's attractiveness according to the characteristics of the target population; (3) improve the public's awareness of pension planning and financial literacy; (4) strengthen the connection and transformation among different pillars of the pension system.
Keywords: New Business Format; Personal Pension System; Analytic Hierarchy Process; Policy Optimization
Gait Learning Reproduction for Quadruped Robots Based on Experience Evolution Proximal Policy Optimization
8
Authors: LI Chunyang, ZHU Xiaoqing, RUAN Xiaogang, LIU Xinyuan, ZHANG Siyuan. Journal of Shanghai Jiaotong University (Science), 2025, No. 6, pp. 1125-1133 (9 pages)
Bionic gait learning of quadruped robots based on reinforcement learning has become a hot research topic. The proximal policy optimization (PPO) algorithm has a low probability of learning a successful gait from scratch due to problems such as reward sparsity. To solve the problem, we propose an experience evolution proximal policy optimization (EEPPO) algorithm which integrates PPO with priori knowledge highlighting by evolutionary strategy. We use the successfully trained samples as priori knowledge to guide the learning direction in order to increase the success probability of the learning algorithm. To verify the effectiveness of the proposed EEPPO algorithm, we have conducted simulation experiments of the quadruped robot gait learning task on Pybullet. Experimental results show that the central pattern generator based radial basis function (CPG-RBF) network and the policy network are simultaneously updated to achieve the quadruped robot’s bionic diagonal trot gait learning task using key information such as the robot’s speed, posture and joint information. Experimental comparison results with the traditional soft actor-critic (SAC) algorithm validate the superiority of the proposed EEPPO algorithm, which can learn a more stable diagonal trot gait in flat terrain.
Keywords: quadruped robot; proximal policy optimization (PPO); priori knowledge; evolutionary strategy; bionic gait learning
Meta Reinforcement Learning for Fast Spectrum Sharing in Vehicular Networks
9
Authors: Huang Kai, Liang Le, Jin Shi, Geoffrey Ye Li. China Communications, 2025, No. 9, pp. 320-332 (13 pages)
In this paper, we investigate the problem of fast spectrum sharing in vehicle-to-everything communication. In order to improve the spectrum efficiency of the whole system, the spectrum of vehicle-to-infrastructure links is reused by vehicle-to-vehicle links. To this end, we model it as a problem of deep reinforcement learning and tackle it with proximal policy optimization. A considerable number of interactions are often required for training an agent with good performance, so simulation-based training is commonly used in communication networks. Nevertheless, severe performance degradation may occur when the agent is directly deployed in the real world, even though it can perform well on the simulator, due to the reality gap between the simulation and the real environments. To address this issue, we make preliminary efforts by proposing an algorithm based on meta reinforcement learning. This algorithm enables the agent to rapidly adapt to a new task with the knowledge extracted from similar tasks, leading to fewer interactions and less training time. Numerical results show that our method achieves near-optimal performance and exhibits rapid convergence.
Keywords: meta reinforcement learning; proximal policy optimization; spectrum sharing; V2X communication
Leverage International Travel Fairs to Facilitate the High-quality Development of Inbound Tourism
10
Author: Zhang Li. China & The World Cultural Exchange, 2025, No. 3, pp. 12-15 (4 pages)
Since last year, China’s inbound tourism market has accelerated its recovery. With the introduction and optimization of various facilitation policies and the development of new products, the inbound tourism market has shown unlimited potential for growth. According to data from the Data Center of the Ministry of Culture and Tourism, the number of inbound tourists reached a new high during the Spring Festival in 2025. The UK became China's third largest source of inbound tourists after the Republic of Korea and Japan.
Keywords: international travel fairs market recovery introduction optimization various facilitation policies data center facilitation policies development new products the inbound tourism inbound tourists
Deep Reinforcement Learning-based Multi-Objective Scheduling for Distributed Heterogeneous Hybrid Flow Shops with Blocking Constraints
11
Authors: Xueyan Sun, Weiming Shen, Jiaxin Fan, Birgit Vogel-Heuser, Fandi Bi, Chunjiang Zhang. Engineering, 2025, No. 3, pp. 278-291 (14 pages)
This paper investigates a distributed heterogeneous hybrid blocking flow-shop scheduling problem (DHHBFSP) designed to minimize the total tardiness and total energy consumption simultaneously, and proposes an improved proximal policy optimization (IPPO) method to make real-time decisions for the DHHBFSP. A multi-objective Markov decision process is modeled for the DHHBFSP, where the reward function is represented by a vector with dynamic weights instead of the common objective-related scalar value. A factory agent (FA) is formulated for each factory to select unscheduled jobs and is trained by the proposed IPPO to improve the decision quality. Multiple FAs work asynchronously to allocate jobs that arrive randomly at the shop. A two-stage training strategy is introduced in the IPPO, which learns from both single- and dual-policy data for better data utilization. The proposed IPPO is tested on randomly generated instances and compared with variants of the basic proximal policy optimization (PPO), dispatch rules, multi-objective metaheuristics, and multi-agent reinforcement learning methods. Extensive experimental results suggest that the proposed strategies offer significant improvements to the basic PPO, and the proposed IPPO outperforms the state-of-the-art scheduling methods in both convergence and solution quality.
Keywords: Multi-objective Markov decision process; Multi-agent deep reinforcement learning; Proximal policy optimization; Distributed hybrid flow-shop scheduling; Blocking constraints
C-SPPO: A deep reinforcement learning framework for large-scale dynamic logistics UAV routing problem
12
Authors: Fei WANG, Honghai ZHANG, Sen DU, Mingzhuang HUA, Gang ZHONG. Chinese Journal of Aeronautics, 2025, No. 5, pp. 296-316 (21 pages)
The Unmanned Aerial Vehicle (UAV) stands as a burgeoning electric transportation carrier, holding substantial promise for the logistics sector. A reinforcement learning framework, Centralized-S Proximal Policy Optimization (C-SPPO), based on a centralized decision process and considering policy entropy (S), is proposed. The proposed framework aims to plan the best scheduling scheme with the objective of minimizing both the timeout of order requests and the flight impact of UAVs that may lead to conflicts. In this framework, the intents of matching acts are generated through the observations of UAV agents, and the ultimate conflict-free matching results are output under the guidance of a centralized decision maker. Concurrently, a pre-activation operation is introduced to further enhance the cooperation among UAV agents. Simulation experiments based on real-world data from New York City are conducted. The results indicate that the proposed C-SPPO outperforms the baseline algorithms in the Average Delay Time (ADT), the Maximum Delay Time (MDT), the Order Delay Rate (ODR), the Average Flight Distance (AFD), and the Flight Impact Ratio (FIR). Furthermore, the framework demonstrates scalability to scenarios of different sizes without requiring additional training.
Keywords: Unmanned aerial vehicle; Vehicle routing problem; Order delivery; Reinforcement learning; Multi-agent; Proximal policy optimization
OPTIMAL HARVESTING POLICY FOR INSHORE-OFFSHORE FISHERY MODEL WITH IMPULSIVE DIFFUSION (Cited by: 7)
13
Authors: 董玲珍, 陈兰荪, 孙丽华. Acta Mathematica Scientia, indexed in SCIE, CSCD, 2007, No. 2, pp. 405-412 (8 pages)
This article studies the inshore-offshore fishery model with impulsive diffusion. The existence and global asymptotic stability of both the trivial periodic solution and the positive periodic solution are obtained. The complexity of this system is also analyzed. Moreover, the optimal harvesting policy is given for the inshore subpopulation, which includes the maximum sustainable yield and the corresponding harvesting effort.
Keywords: Impulsive diffusion; inshore-offshore fishery model; global asymptotic stability; periodic solution; optimal harvesting policy
THE OPTIMAL STRATEGY FOR INSURANCE COMPANY UNDER THE INFLUENCE OF TERMINAL VALUE (Cited by: 3)
14
Authors: 刘伟, 袁海丽, 胡亦钧. Acta Mathematica Scientia, indexed in SCIE, CSCD, 2011, No. 3, pp. 1077-1090 (14 pages)
This paper considers a model of an insurance company which is allowed to invest in a risky asset and to purchase proportional reinsurance. The objective is to find the policy which maximizes the expected total discounted dividend pay-out until the time of bankruptcy and the terminal value of the company under a liquidity constraint. We find the solution of this problem via solving the problem with zero terminal value. We also analyze the influence of the terminal value on the optimal policy.
Keywords: proportional reinsurance; terminal value; optimal policy; HJB equation
Robust analysis of discounted Markov decision processes with uncertain transition probabilities (Cited by: 3)
15
Authors: LOU Zhen-kai, HOU Fu-jun, LOU Xu-ming. Applied Mathematics (A Journal of Chinese Universities), indexed in SCIE, CSCD, 2020, No. 4, pp. 417-436 (20 pages)
Optimal policies in Markov decision problems may be quite sensitive with regard to transition probabilities. In practice, some transition probabilities may be uncertain. The goals of the present study are to find the robust range for a certain optimal policy and to obtain value intervals of exact transition probabilities. Our research yields powerful contributions for Markov decision processes (MDPs) with uncertain transition probabilities. We first propose a method for estimating unknown transition probabilities based on maximum likelihood. Since the estimation may be far from accurate, and the highest expected total reward of the MDP may be sensitive to these transition probabilities, we analyze the robustness of an optimal policy and propose an approach for robust analysis. After giving the definition of a robust optimal policy with uncertain transition probabilities represented as sets of numbers, we formulate a model to obtain the optimal policy. Finally, we define the value intervals of the exact transition probabilities and construct models to determine the lower and upper bounds. Numerical examples are given to show the practicability of our methods.
Keywords: Markov decision processes; uncertain transition probabilities; robustness and sensitivity; robust optimal policy; value interval
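The sensitivity described in this abstract can be seen on a toy problem. The sketch below is not the paper's method: it runs plain value iteration on a hypothetical two-state, two-action discounted MDP (all states, rewards, and the uncertain probability `p` are invented for illustration) and shows that the optimal action in state 0 changes as `p` varies.

```python
def value_iteration(P, R, gamma, tol=1e-10, max_iter=10_000):
    """Plain value iteration for a discounted MDP.

    P[a][s][s2] is the probability of moving from s to s2 under action a;
    R[a][s] is the expected immediate reward. Returns (values, greedy policy).
    """
    n_actions, n_states = len(P), len(P[0])
    V = [0.0] * n_states
    for _ in range(max_iter):
        Q = [[R[a][s] + gamma * sum(P[a][s][s2] * V[s2] for s2 in range(n_states))
              for a in range(n_actions)] for s in range(n_states)]
        V_new = [max(q) for q in Q]
        converged = max(abs(x - y) for x, y in zip(V_new, V)) < tol
        V = V_new
        if converged:
            break
    policy = [max(range(n_actions), key=lambda a: Q[s][a]) for s in range(n_states)]
    return V, policy

def solve_toy_mdp(p, gamma=0.9):
    """Hypothetical MDP: in state 0, action 0 ("safe") stays put for reward 1,
    action 1 ("risky") earns 0 but reaches state 1 with uncertain probability p.
    State 1 is absorbing and pays 3 per step under either action."""
    P = [
        [[1.0, 0.0], [0.0, 1.0]],        # action 0
        [[1.0 - p, p], [0.0, 1.0]],      # action 1
    ]
    R = [[1.0, 3.0], [0.0, 3.0]]
    return value_iteration(P, R, gamma)

values_high, policy_high = solve_toy_mdp(0.5)   # risky action is optimal in state 0
values_low, policy_low = solve_toy_mdp(0.01)    # safe action is optimal in state 0
```

For this toy problem the switch happens at p = 1/18: above it the risky action is optimal in state 0, below it the safe one is. An interval of p values over which the optimal policy stays unchanged is the kind of robust range the abstract refers to.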
UAV Path Planning Based on Multi-Agent Deep Reinforcement Learning (Cited by: 16)
16
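Authors: 司鹏搏, 吴兵, 杨睿哲, 李萌, 孙艳华. 《北京工业大学学报》 (Journal of Beijing University of Technology), indexed in CAS, CSCD, PKU Core, 2023, No. 4, pp. 449-458 (10 pages)
To address the path planning problem of multiple unmanned aerial vehicles (UAVs) in complex environments, a multi-agent deep reinforcement learning framework for UAV path planning is proposed. The framework first models the path planning problem as a partially observable Markov process and extends the proximal policy optimization algorithm to the multi-agent setting; obstacle-free path planning for multiple UAVs is achieved by designing the UAVs' state observation space, action space, and reward function. Second, to suit the limited computing resources carried on board a UAV, a network pruning-based multi-agent proximal policy optimization (NP-MAPPO) algorithm is further proposed, which improves training efficiency. Simulation results verify the effectiveness of the proposed multi-UAV path planning framework under various parameter configurations and the superiority of the NP-MAPPO algorithm in training time.
Keywords: unmanned aerial vehicle (UAV); complex environment; path planning; Markov decision process; multi-agent proximal policy optimization (MAPPO); network pruning (NP)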
Task assignment in ground-to-air confrontation based on multiagent deep reinforcement learning (Cited by: 4)
17
Authors: Jia-yi Liu, Gang Wang, Qiang Fu, Shao-hua Yue, Si-yuan Wang. Defence Technology (防务技术), indexed in SCIE, EI, CAS, CSCD, 2023, No. 1, pp. 210-219 (10 pages)
Ground-to-air confrontation involves large-scale task assignment and must deal with many concurrent task assignments and random events. When existing task assignment methods are applied to ground-to-air confrontation, they handle complex tasks inefficiently and suffer from interactive conflicts in multiagent systems. This study proposes a multiagent architecture based on one general agent with multiple narrow agents (OGMN) to reduce task assignment conflicts. Considering the slow speed of traditional dynamic task assignment algorithms, this paper proposes the proximal policy optimization for task assignment of general and narrow agents (PPO-TAGNA) algorithm. Based on the idea of the optimal assignment strategy algorithm and combined with the training framework of deep reinforcement learning (DRL), the algorithm adds a multihead attention mechanism and a stage reward mechanism to the bilateral band clipping PPO algorithm to solve the problem of low training efficiency. Finally, simulation experiments are carried out in the digital battlefield. The multiagent architecture based on OGMN combined with the PPO-TAGNA algorithm can obtain higher rewards faster and has a higher win ratio. By analyzing agent behavior, the efficiency, superiority and rationality of resource utilization of this method are verified.
Keywords: Ground-to-air confrontation; Task assignment; General and narrow agents; Deep reinforcement learning; Proximal policy optimization (PPO)
Optimal Static Partition Configuration in ARINC653 System (Cited by: 4)
18
Authors: Sheng-Lin Gui, Lei Luo, Sen-Sen Tang, Yang Meng. Journal of Electronic Science and Technology, indexed in CAS, 2011, No. 4, pp. 373-378 (6 pages)
ARINC653 systems, which have been widely used in the avionics industry, are an important class of safety-critical applications. Partitions are the core concept in the ARINC653 system architecture. Due to the existence of partitions, the system designer must allocate adequate time slots statically to each partition in the design phase. Although some time slot allocation policies could be borrowed from task scheduling policies, no existing literature gives an optimal allocation policy. In this paper, we present a partition configuration policy and prove that this policy is optimal in the sense that if this policy fails to configure adequate time slots for each partition, no other policy can either. Then, by simulation, we show the effects of different partition configuration policies on time slot allocation of partitions and task response time, respectively.
Keywords: ARINC653; earliest-next release time first policy; optimal partition configuration policy; real-time systems
Multi-agent reinforcement learning for edge information sharing in vehicular networks (Cited by: 3)
19
Authors: Ruyan Wang, Xue Jiang, Yujie Zhou, Zhidu Li, Dapeng Wu, Tong Tang, Alexander Fedotov, Vladimir Badenko. Digital Communications and Networks, indexed in SCIE, CSCD, 2022, No. 3, pp. 267-277 (11 pages)
To guarantee the heterogeneous delay requirements of the diverse vehicular services, it is necessary to design a full cooperative policy for both Vehicle to Infrastructure (V2I) and Vehicle to Vehicle (V2V) links. This paper investigates the reduction of the delay in edge information sharing for V2V links while satisfying the delay requirements of the V2I links. Specifically, a mean delay minimization problem and a maximum individual delay minimization problem are formulated to improve the global network performance and ensure the fairness of a single user, respectively. A multi-agent reinforcement learning framework is designed to solve these two problems, where a new reward function is proposed to evaluate the utilities of the two optimization objectives in a unified framework. Thereafter, a proximal policy optimization approach is proposed to enable each V2V user to learn its policy using the shared global network reward. The effectiveness of the proposed approach is finally validated by comparing the obtained results with those of the other baseline approaches through extensive simulation experiments.
Keywords: Vehicular networks; Edge information sharing; Delay guarantee; Multi-agent reinforcement learning; Proximal policy optimization
Human Machine Collaborative Support Scheduling System of Intelligence Information from Multiple Unmanned Aerial Vehicles Based on Eye Tracker (Cited by: 2)
20
Authors: 简立轩, 尹栋, 沈林成, 牛轶峰. Journal of Shanghai Jiaotong University (Science), indexed in EI, 2017, No. 3, pp. 322-328 (7 pages)
Many human-machine collaborative support scheduling systems are used to aid human decision making by providing several optimal scheduling algorithms that do not take the operator's attention into consideration. However, the current systems should take advantage of the operator's attention to obtain the optimal solution. In this paper, we propose a human-machine collaborative support scheduling system of intelligence information from multi-UAVs based on an eye tracker. Firstly, the target recognition algorithm is applied to the images from the multiple unmanned aerial vehicles (multi-UAVs) to recognize the targets in the images. Then, the support system utilizes the eye tracker to gain the eye-gaze points, which are used to identify the focused targets in the images. Finally, the heuristic scheduling algorithms take both the attributes of targets and the operator's attention into consideration to obtain the sequence of the images. The processing time of the images collected by the multi-UAVs is uncertain, but its upper and lower bounds are known in advance, so the processing time of each image is modeled as an interval. The objective of the scheduling problem is to minimize the mean weighted completion time. This paper proposes some new polynomial time heuristic scheduling algorithms which first schedule the images that include the focused targets. We conduct the scheduling experiments under six different distributions. The results indicate that the proposed algorithm is not sensitive to the different distributions of the processing time and has a negligible computational time. The absolute error of the best performing heuristic solution is only about 1%. Then, we incorporate the best performing heuristic algorithm into the human-machine collaborative support systems to verify the performance of the system.
Keywords: eye tracker; polynomial time heuristics; human machine interaction; collaborative support scheduling system; receding horizon optimization policy
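The scheduling idea in this abstract (interval processing times, focused images first, mean weighted completion time as the objective) can be illustrated with a small sketch. It is an assumption-laden stand-in rather than the paper's heuristic: the `ImageJob` fields are hypothetical, each interval is reduced to its midpoint, and jobs are ordered by the classical weighted-shortest-processing-time ratio within the focused and unfocused groups.

```python
from dataclasses import dataclass

@dataclass
class ImageJob:
    name: str
    weight: float    # importance of the image (hypothetical attribute)
    p_low: float     # lower bound of the processing time
    p_high: float    # upper bound of the processing time
    focused: bool    # True if the operator's gaze fell on a target in it

def schedule(jobs):
    """Order jobs: focused images first, then by midpoint / weight (WSPT-style)."""
    def key(job):
        midpoint = 0.5 * (job.p_low + job.p_high)
        return (not job.focused, midpoint / job.weight)
    return sorted(jobs, key=key)

def mean_weighted_completion_time(order, processing_times):
    """Mean of weight * completion time for one realization of processing times."""
    clock, total = 0.0, 0.0
    for job in order:
        clock += processing_times[job.name]
        total += job.weight * clock
    return total / len(order)

jobs = [
    ImageJob("A", 1.0, 2.0, 4.0, False),
    ImageJob("B", 3.0, 2.0, 4.0, False),
    ImageJob("C", 1.0, 8.0, 10.0, True),
]
order = schedule(jobs)
# Evaluate the order with every processing time at its interval midpoint.
score = mean_weighted_completion_time(order, {"A": 3.0, "B": 3.0, "C": 9.0})
```

Putting focused images first deliberately trades some objective value for responsiveness to the operator's attention; a pure WSPT order would place the long job C last.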