Funding: Project (KF2029) supported by the State Key Laboratory of Automotive Safety and Energy (Tsinghua University), China; Project (102253) supported partially by Innovate UK.
Abstract: This paper studies a supervisory control system for a hybrid off-highway electric vehicle under the charge-sustaining (CS) condition. A new predictive double Q-learning with backup models (PDQL) scheme is proposed to optimize engine fuel consumption in real-world driving and to improve energy efficiency with a faster and more robust learning process. Unlike existing "model-free" methods, which follow solely on-policy or off-policy learning to update their knowledge bases (Q-tables), the PDQL is designed to merge on-policy and off-policy learning by introducing a backup model (Q-table). Experimental evaluations are conducted on software-in-the-loop (SiL) and hardware-in-the-loop (HiL) test platforms based on real-time modelling of the studied vehicle. Compared to standard double Q-learning (SDQL), the PDQL needs only half of the learning iterations to achieve better energy efficiency than the SDQL reaches at the end of the learning process. In SiL under 35 rounds of learning, the results show that the PDQL improves vehicle energy efficiency by 1.75% compared with the SDQL. When implemented in HiL under four predefined real-world conditions, the PDQL robustly saves more than 5.03% energy compared with the SDQL scheme.
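The on-policy versus off-policy distinction mentioned in this abstract can be illustrated with the two textbook temporal-difference targets. The sketch below is not the PDQL scheme, whose details the abstract does not give; it only contrasts a SARSA-style target with a Q-learning-style target. The function names, the tiny Q-table, and its values are all hypothetical.

```python
import numpy as np

def sarsa_target(q, reward, next_state, next_action, gamma=0.9):
    """On-policy target: bootstraps from the action actually taken next."""
    return reward + gamma * q[next_state, next_action]

def q_learning_target(q, reward, next_state, gamma=0.9):
    """Off-policy target: bootstraps from the greedy action in the next state."""
    return reward + gamma * np.max(q[next_state])

def td_update(q, state, action, target, alpha=0.1):
    """Move Q(state, action) a fraction alpha toward the given target."""
    q[state, action] += alpha * (target - q[state, action])
    return q

# Hypothetical 2-state, 2-action Q-table with made-up values.
q = np.array([[0.0, 0.0],
              [1.0, 3.0]])

# Same transition, two different targets.
on_policy = sarsa_target(q, reward=1.0, next_state=1, next_action=0)
off_policy = q_learning_target(q, reward=1.0, next_state=1)

# Apply the off-policy target to a copy so the original table is untouched.
q_updated = td_update(q.copy(), state=0, action=0, target=off_policy)
```

The two targets differ whenever the action actually taken in the next state is not the greedy one, which is the gap a scheme that merges both kinds of learning has to reconcile.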
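Abstract: To address the shortcomings of traditional network anomaly diagnosis algorithms in unsupervised settings, such as low accuracy in locating anomalous points and in classifying anomalous data, a wireless network anomaly diagnosis method based on an improved Q-learning algorithm is designed. First, the data streams of the wireless network are collected using ADUs (Asynchronous Data Units), and packet features are extracted. Next, a Q-learning model is built to explore the balance point between state values and reward values, and the SA (Simulated Annealing) algorithm is used to accurately identify the next-step state from a global perspective. Finally, the joint distribution probability of the training samples is determined, improving the approximation performance of the output values so as to balance exploration against cost. Test results show that the improved Q-learning algorithm achieves a mean network anomaly localization accuracy of 99.4% and also outperforms three traditional network anomaly diagnosis methods in classification accuracy and classification efficiency across different types of network anomalies.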
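One common way to combine simulated annealing with Q-learning is a Metropolis-style action selection whose temperature is cooled over time. The sketch below shows that generic idea only. The abstract does not specify how the authors apply SA, so the function names, the cooling schedule, and the example Q-values are assumptions made for illustration.

```python
import math
import random

def sa_select_action(q_row, temperature, rng):
    """Metropolis-style selection (illustrative, not the paper's method).

    Propose a uniformly random action and accept it instead of the greedy
    action with probability exp((Q_proposal - Q_greedy) / temperature).
    """
    greedy = max(range(len(q_row)), key=lambda a: q_row[a])
    proposal = rng.randrange(len(q_row))
    delta = q_row[proposal] - q_row[greedy]  # never positive
    if rng.random() < math.exp(delta / temperature):
        return proposal
    return greedy

def cool(temperature, rate=0.95, floor=1e-3):
    """Geometric cooling schedule with a lower bound (assumed values)."""
    return max(floor, temperature * rate)

rng = random.Random(0)
q_row = [0.0, 1.0, 0.5]  # hypothetical Q-values for one state

# Low temperature: selection is effectively greedy.
cold_choices = {sa_select_action(q_row, 1e-3, rng) for _ in range(100)}
# High temperature: almost every random proposal is accepted.
hot_choices = {sa_select_action(q_row, 1e6, rng) for _ in range(200)}
```

At high temperature the agent explores nearly uniformly. As `cool` lowers the temperature, worse actions are accepted less often and the selection approaches the greedy choice.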
Funding: Project supported by the Research Project of Mindu Innovation Laboratory (2021ZZ114), the Natural Science Foundation of Xiamen (3502Z20227255), the Major Research Project of Xiamen (3502Z20191015), and the Science and Technology Major Project of Fujian Province (2021HZ021013).
Abstract: Lanthanide ion (Ln³⁺) doping provides a potential strategy for controlling the luminescent properties of lead-free halide double perovskite nanocrystals (DP NCs). However, because of the low energy transfer efficiency between the self-trapped exciton (STE) and Ln³⁺ ions, the characteristic emissions of Ln³⁺ ions are not prominent. Furthermore, the energy transfer mechanism between STE and Ln³⁺ ions remains elusive and requires in-depth study. We chose trace Bi³⁺-doped Cs₂Ag₀.₆Na₀.₄InCl₆₋ₓBrₓ as a representative DP matrix to demonstrate that the Ln³⁺ emission can be greatly enhanced by tuning the bromide concentration. The enhanced energy transfer between STE and Ln³⁺ ions originates from the high covalency of the Ln-Br bond, which improves the characteristic emission of Ln³⁺ ions. Furthermore, optical spectroscopy reveals that the energy transfer mechanism from DP to Eu³⁺ ions differs from that of all the other doped Ln³⁺ ions. Energy transfer from DP to Eu³⁺ ions proceeds mostly through Eu-Br charge transfer, whereas the other Ln³⁺ ions are excited by energy transfer from STE. This distinct mechanism results from the energy separation between the excited energy level of the Ln³⁺ ions and the bottom of the conduction band of DP. As the energy separation increases, energy transfer from STE to Ln³⁺ ions becomes less efficient because a larger number of phonons must be generated, and it finally becomes impossible for Eu³⁺ ions. Our results provide new insight into tuning the energy transfer of Ln³⁺-doped DP NCs.
Funding: Supported in part by the National Natural Science Foundation of China under Grant 61671086 and Grant 61629101.
Abstract: In this paper, the problem of trajectory design of unmanned aerial vehicles (UAVs) for maximizing the number of satisfied users is studied in a UAV-based cellular network, where the UAV works as a flying base station that serves users and each user indicates its satisfaction in terms of completion of its data request within an allowable maximum waiting time. The trajectory design is formulated as an optimization problem whose goal is to maximize the number of satisfied users. To solve this problem, a machine learning framework based on the double Q-learning algorithm is proposed. The algorithm enables the UAV to find the optimal trajectory that maximizes the number of satisfied users. Compared to traditional learning algorithms such as Q-learning, which selects and evaluates the action using the same Q-table, the proposed algorithm decouples the selection from the evaluation and therefore avoids the overestimation that leads to sub-optimal policies. Simulation results show that the proposed algorithm can achieve gains of up to 19.4% and 14.1% in the number of satisfied users compared to the random algorithm and the Q-learning algorithm, respectively.
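The decoupling of action selection from action evaluation described in this abstract can be sketched with the standard tabular double Q-learning update. This is a minimal illustration rather than the authors' UAV implementation: the state and action encoding, the learning rate, the discount factor, and the example values are all assumed.

```python
import numpy as np

def double_q_update(q_select, q_eval, state, action, reward, next_state,
                    alpha=0.1, gamma=0.9):
    """Update q_select in place, using q_eval to score the chosen next action."""
    # Selection: the table being updated picks the best next action.
    best_next = int(np.argmax(q_select[next_state]))
    # Evaluation: the other table supplies the value of that action.
    target = reward + gamma * q_eval[next_state, best_next]
    q_select[state, action] += alpha * (target - q_select[state, action])
    return target

def double_q_step(q_a, q_b, transition, rng, **kwargs):
    """Randomly choose which table is updated on this step."""
    if rng.random() < 0.5:
        return double_q_update(q_a, q_b, *transition, **kwargs)
    return double_q_update(q_b, q_a, *transition, **kwargs)

# Hypothetical tables for 2 states and 2 actions, with made-up values.
q_a = np.array([[0.0, 0.0], [5.0, 1.0]])
q_b = np.array([[0.0, 0.0], [2.0, 4.0]])

# Update table A for the transition (state=0, action=0, reward=1.0, next_state=1).
target = double_q_update(q_a, q_b, 0, 0, 1.0, 1)

# For comparison: the target a single table would use (taken from table A alone).
single_table_target = 1.0 + 0.9 * np.max(q_a[1])
```

In this made-up example table A rates action 0 in the next state at 5.0 while table B rates it at 2.0, so the double Q-learning target is lower than the single-table target. That gap is the overestimation the decoupling is meant to dampen.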