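Abstract: Objective: To investigate the factors associated with endometrial lesions in perimenopausal women with abnormal uterine bleeding (AUB) and to propose targeted strategies for early diagnosis and treatment. Methods: Clinical data were collected retrospectively for 70 perimenopausal AUB patients admitted to the People's Hospital of Jiujiang Economic and Technological Development Zone between January 2023 and December 2024. Based on endometrial pathology results, patients were divided into a normal endometrium group (normal group) and an endometrial lesion group (lesion group). Univariate analysis and multivariate logistic regression were used to identify independent factors associated with endometrial lesions in perimenopausal AUB. Results: Of the 70 patients included, 22 (31.43%) had a normal endometrium and 48 (68.57%) had endometrial lesions. Univariate analysis showed significant differences between the two groups in age, body mass index (BMI), endometrial thickness, bleeding duration, diabetes, intrauterine device placement, and family history of tumors (P<0.05). Multivariate logistic regression showed that age, endometrial thickness, and diabetes were independent factors associated with endometrial lesions in perimenopausal AUB (P<0.05). Endometrial thickness showed the strongest association, with OR=4.654, 95% CI: 1.976 to 10.960 (P<0.001). Conclusion: Increasing age, endometrial thickening, and diabetes are independent risk factors for endometrial lesions in perimenopausal AUB. Clinical practice should prioritize early screening and intervention for patients who are older, have diabetes, or have a thickened endometrium.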
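The reported odds ratio and confidence interval can be checked for internal consistency with a few lines of arithmetic. The sketch below assumes the interval is a standard Wald interval, symmetric on the log-odds scale, which the abstract does not state. The function name `wald_summary` and its parameters are illustrative only and do not come from the study.

```python
import math


def wald_summary(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover log-odds quantities from a reported OR and 95% CI.

    Assumption (not stated in the abstract): the CI is a Wald interval,
    i.e. exp(beta -/+ z_crit * se), so it is symmetric around beta = ln(OR).
    """
    beta = math.log(odds_ratio)                       # logistic coefficient
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se                                     # Wald z statistic
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))    # equals 2 * (1 - Phi(|z|))
    log_midpoint = 0.5 * (math.log(ci_low) + math.log(ci_high))
    return {
        "beta": beta,
        "se": se,
        "z": z,
        "p": p_two_sided,
        "log_midpoint": log_midpoint,
    }


# Values reported for endometrial thickness in the abstract above.
summary = wald_summary(4.654, 1.976, 10.960)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

If the Wald assumption holds, the midpoint of the log-scale interval should sit close to ln(OR), and the implied two-sided p-value should agree with the reported P<0.001.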
Funding: Supported by the National Natural Science Foundation of China (Grant No. 12072090).
Abstract: This work proposes a recorded recurrent twin delayed deep deterministic (RRTD3) policy gradient algorithm to address the challenge of constructing guidance laws for intercepting endoatmospheric maneuvering missiles under uncertainties and observation noise. The attack-defense engagement scenario is modeled as a partially observable Markov decision process (POMDP). Given the strengths of recurrent neural networks (RNNs) in processing sequence information, an RNN layer is incorporated into the agent's policy network to alleviate the bottleneck that traditional deep reinforcement learning methods face when dealing with POMDPs. Because the detection frequency of an interceptor is usually higher than its guidance frequency, the measurements from the interceptor's seeker during each guidance cycle are combined into one sequence and used as the input to the policy network. During training, the hidden states of the RNN layer in the policy network are recorded to overcome the partial observability that this RNN layer introduces inside the agent. The training curves show that the proposed RRTD3 improves data efficiency, training speed, and training stability. The test results confirm the advantages of the RRTD3-based guidance laws over some conventional guidance laws.
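The two mechanisms described in the abstract, grouping seeker measurements by guidance cycle and recording the RNN hidden state during training, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract gives no architecture details, so the Elman-style cell, the scalar action, the placeholder reward, and every name below (`group_by_guidance_cycle`, `TinyRecurrentPolicy`, `RecordedReplayBuffer`, `rollout`) are assumptions made for illustration. In particular, storing the hidden state next to each transition in the replay buffer is one plausible reading of "recorded".

```python
import math
import random
from collections import deque
from dataclasses import dataclass
from typing import List

Vector = List[float]


def group_by_guidance_cycle(samples, samples_per_cycle):
    """Split a flat stream of seeker samples into one sequence per guidance cycle."""
    usable = len(samples) - len(samples) % samples_per_cycle  # drop a trailing partial cycle
    return [samples[i:i + samples_per_cycle] for i in range(0, usable, samples_per_cycle)]


class TinyRecurrentPolicy:
    """Elman-style RNN cell plus a tanh-bounded scalar action (illustrative only)."""

    def __init__(self, obs_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        self.w_in = [[rng.uniform(-0.5, 0.5) for _ in range(obs_dim)] for _ in range(hidden_dim)]
        self.w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)] for _ in range(hidden_dim)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]

    def initial_hidden(self):
        return [0.0] * self.hidden_dim

    def act(self, obs_seq, hidden):
        # Consume every measurement of the cycle, carrying the hidden state forward.
        for obs in obs_seq:
            hidden = [
                math.tanh(sum(w * x for w, x in zip(row_in, obs))
                          + sum(w * h for w, h in zip(row_rec, hidden)))
                for row_in, row_rec in zip(self.w_in, self.w_rec)
            ]
        action = math.tanh(sum(w * h for w, h in zip(self.w_out, hidden)))
        return action, hidden


@dataclass
class Transition:
    obs_seq: List[Vector]
    hidden_in: Vector        # hidden state before obs_seq was processed
    action: float
    reward: float
    next_obs_seq: List[Vector]
    hidden_out: Vector       # hidden state after obs_seq; input for next_obs_seq
    done: bool


class RecordedReplayBuffer:
    """Replay buffer whose transitions carry the recorded hidden states."""

    def __init__(self, capacity):
        self.storage = deque(maxlen=capacity)

    def add(self, transition):
        self.storage.append(transition)

    def sample(self, batch_size):
        return random.sample(list(self.storage), batch_size)


def rollout(policy, cycles, buffer):
    """Run the policy over consecutive guidance cycles and record each step."""
    hidden = policy.initial_hidden()
    for i in range(len(cycles) - 1):
        action, next_hidden = policy.act(cycles[i], hidden)
        reward = -abs(action)  # placeholder reward, NOT taken from the paper
        buffer.add(Transition(cycles[i], hidden, action, reward,
                              cycles[i + 1], next_hidden, i == len(cycles) - 2))
        hidden = next_hidden
```

Because each sampled transition carries `hidden_in`, a training step can re-run the policy from the same internal state it had during the rollout, rather than restarting the RNN from zeros for an isolated transition.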