Reinforcement learning encounters formidable challenges in intricate decision-making scenarios, primarily due to expansive parameterized action spaces and the vastness of the corresponding policy landscapes. To surmount these difficulties, we devise a practical structured action graph model augmented by guiding policies that incorporate trust region constraints. Building on this, we propose guided proximal policy optimization with structured action graph (GPPO-SAG), which demonstrates pronounced efficacy in refining policy learning and improving performance on sophisticated tasks characterized by parameterized action spaces. Rigorous empirical evaluations were performed on comprehensive gaming platforms, including the full suite of StarCraft II and Hearthstone, yielding highly favorable outcomes. Our source code is at https://github.com/sachiel321/GPPO-SAG.
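The abstract above combines PPO's clipped surrogate objective with a guiding policy under a trust-region-style constraint. The exact form used by GPPO-SAG is not given here, so the following is a minimal sketch of one plausible realization: the standard clipped PPO loss plus a KL-style penalty pulling the learner toward a guiding policy. The function name `gppo_loss` and the coefficient `guide_coef` are illustrative assumptions, not the paper's API.

```python
import numpy as np

def gppo_loss(ratio, advantage, kl_to_guide, clip_eps=0.2, guide_coef=0.1):
    """Clipped PPO surrogate plus a penalty keeping the learner close to a
    guiding policy (a hypothetical form of the paper's trust-region guidance).

    ratio       : pi_new(a|s) / pi_old(a|s) per sample
    advantage   : estimated advantages per sample
    kl_to_guide : per-sample KL divergence from the learner to the guide
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantage
    surrogate = np.minimum(unclipped, clipped)
    # Loss to minimize: negative surrogate plus the guidance penalty.
    return -np.mean(surrogate) + guide_coef * np.mean(kl_to_guide)

# With ratio 1, advantage 1, and zero divergence from the guide,
# the loss reduces to the plain (negated) surrogate.
loss = gppo_loss(np.array([1.0]), np.array([1.0]), np.array([0.0]))
```

In a structured action graph, a loss of this shape would typically be summed over the graph's action heads (e.g., action type, then its parameters), with the guide constraining each head separately.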
Antibodies play a vital role in immune response and disease defense. For current antibody structure prediction methods, predicting the complementarity determining regions (CDRs) remains a challenge. This paper presents a graph-neural-network-based method for optimizing antibody protein structures, GraphIR (optimization model of immune body structure based on graph neural network). Given an initial antibody structure, a pretrained antibody language model extracts CDR sequence representations of the initial structure along with other sequence and structural features. An equivariant graph neural network designed for antibody structure optimization then refines the CDR structure. Experimental results show that on a benchmark set of 46 antibodies, GraphIR achieves an average root mean square deviation (RMSD) of 1.37 Å on the CDR-H3 region, improving prediction accuracy over the baseline methods ABodyBuilder, RepertoireBuilder, RosettaAntibody, and Deep-Ab by 7.45%, 6.11%, 7.65%, and 1.27%, respectively.
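The abstract above reports accuracy as the RMSD between predicted and reference CDR-H3 coordinates. As a small illustration of that metric (not GraphIR's evaluation code, and assuming the two structures have already been superimposed):

```python
import numpy as np

def rmsd(pred, ref):
    """Root mean square deviation between predicted and reference atomic
    coordinates, given as N x 3 arrays assumed to be pre-aligned
    (e.g., by Kabsch superposition)."""
    diff = np.asarray(pred) - np.asarray(ref)
    # Mean of per-atom squared distances, then square root.
    return float(np.sqrt(np.mean(np.sum(diff * diff, axis=1))))

# Four atoms each displaced by 1 Å along x give an RMSD of exactly 1 Å.
value = rmsd(np.zeros((4, 3)), np.tile([1.0, 0.0, 0.0], (4, 1)))
```

In benchmark reporting such as the 1.37 Å figure above, this value is typically computed over the backbone atoms of the CDR loop after aligning the frameworks.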
Funding: supported by the National Natural Science Foundation of China (Nos. 62073324, 6200629, 61771471 and 91748131), and in part by the InnoHK Project, China.