Neural machine translation (NMT) has been widely applied to high-resource language pairs, but its dependence on large-scale data results in poor performance in low-resource scenarios. In this paper, we propose a transfer-learning-based approach called shared space transfer for zero-resource NMT. Our method leverages a pivot pre-trained language model (PLM) to create a shared representation space, which is used in both the auxiliary source→pivot (Ms2p) and pivot→target (Mp2t) translation models. Specifically, we use the pivot PLM to initialize the Ms2p decoder and the Mp2t encoder, and adopt a freezing strategy during training. We further propose a feature converter that mitigates representation space deviations by converting the features from the source encoder into the shared representation space. The converter is trained on a synthetic parallel corpus. The final source→target model (Ms2t) combines the Ms2p encoder, the feature converter, and the Mp2t decoder. We conduct simulation experiments using English as the pivot language for German→French, German→Czech, and Turkish→Hindi translation. Finally, we test our method on a real zero-resource language pair, Mongolian→Vietnamese, with Chinese as the pivot language. Experimental results show that our method achieves high translation quality, with better Translation Error Rate (TER) and BLEU scores than other pivot-based methods. The step-wise pre-training with our feature converter outperforms baseline models in terms of COMET scores.
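The composition of the final model (source encoder, then feature converter, then target decoder) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the abstract does not describe the converter's architecture or training objective, so a plain least-squares linear map stands in for it, random arrays stand in for encoder features, and `fit_linear_converter`, `translate_features`, and `toy_decoder` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (assumptions): rows of src_feats play the role of source-encoder
# features, rows of shared_feats the role of features in the shared (pivot) space
# obtained from synthetic parallel pairs.
n_pairs, d_src, d_shared = 200, 8, 6
true_map = rng.normal(size=(d_src, d_shared))  # hypothetical ground-truth relation
src_feats = rng.normal(size=(n_pairs, d_src))
shared_feats = src_feats @ true_map + 0.01 * rng.normal(size=(n_pairs, d_shared))


def fit_linear_converter(x, z):
    """Return W minimising ||x @ W - z||^2 (a simplified, assumed converter)."""
    w, *_ = np.linalg.lstsq(x, z, rcond=None)
    return w


def translate_features(x, converter_w, decoder):
    """Compose the pipeline: source features -> converter -> target decoder."""
    return decoder(x @ converter_w)


def toy_decoder(z):
    """Placeholder for the pivot->target decoder; it only sums each feature row."""
    return z.sum(axis=1)


converter_w = fit_linear_converter(src_feats, shared_feats)
# Mean absolute gap between converted source features and shared-space features.
residual = np.abs(src_feats @ converter_w - shared_feats).mean()
outputs = translate_features(src_feats, converter_w, toy_decoder)
```

The point of the sketch is only the data flow: the decoder never sees raw source-encoder features, only their converted counterparts in the shared space.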
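Designing patch-level feature representations is a fundamental research topic in computer vision; good patch-level features can effectively improve the performance of related algorithms such as image classification and object recognition. SIFT (scale-invariant feature transform) and HOG (histogram of oriented gradient) are outstanding examples of hand-crafted patch-level features. However, differences between hand-crafted patch features often fail to reflect the similarity between image patches well enough. The kernel descriptor (KD) method offers a new way to generate patch-level features: building on a match kernel between image patches, it applies kernel principal component analysis (KPCA) to obtain the feature representation, and it achieves good performance in image classification. However, the method has to use all joint basis vectors to generate the kernel descriptor features, which leads to high time complexity. To address this problem, an algorithm for generating patch-level feature representations is proposed, called the efficient patch-level descriptor (EPLd). The algorithm is built on incomplete Cholesky decomposition, automatically selects a small number of landmark image patches to improve efficiency, and uses the MMD (maximum mean discrepancy) distance to compute the similarity between images. Experimental results show that the algorithm achieves excellent performance in image/scene classification.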
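The two building blocks named in this abstract, pivoted incomplete Cholesky decomposition and the MMD distance, can be sketched in a few lines of NumPy. This is a generic illustration and not the EPLd algorithm itself: the RBF kernel, the `gamma` value, the random "patch features", and the function names are all assumptions, and the abstract does not say how the selected landmark patches enter the final descriptor.

```python
import numpy as np


def rbf_kernel(a, b, gamma=0.5):
    """RBF kernel matrix between rows of a and rows of b (assumed kernel choice)."""
    sq = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def incomplete_cholesky(K, max_rank, tol=1e-8):
    """Greedy pivoted incomplete Cholesky: K ~= G @ G.T.

    The pivot indices are the rows picked as 'landmarks'.
    """
    n = K.shape[0]
    diag = np.diag(K).astype(float).copy()  # residual diagonal
    G = np.zeros((n, max_rank))
    pivots = []
    for j in range(max_rank):
        i = int(np.argmax(diag))  # point with the largest remaining residual
        if diag[i] <= tol:  # residual is negligible: stop early
            G = G[:, :j]
            break
        pivots.append(i)
        G[:, j] = (K[:, i] - G[:, :j] @ G[i, :j]) / np.sqrt(diag[i])
        diag -= G[:, j] ** 2
    return G, pivots


def mmd2(x, y, gamma=0.5):
    """Biased estimate of the squared maximum mean discrepancy between two sets."""
    kxx = rbf_kernel(x, x, gamma).mean()
    kyy = rbf_kernel(y, y, gamma).mean()
    kxy = rbf_kernel(x, y, gamma).mean()
    return kxx + kyy - 2 * kxy


rng = np.random.default_rng(0)
patches_a = rng.normal(size=(20, 5))        # toy patch features of image A
patches_b = rng.normal(size=(20, 5)) + 3.0  # toy patch features of image B (shifted)

K = rbf_kernel(patches_a, patches_a)
G_full, pivots_full = incomplete_cholesky(K, max_rank=20)  # allow full rank
G_small, landmarks = incomplete_cholesky(K, max_rank=5)    # keep 5 landmarks

same = mmd2(patches_a, patches_a)
different = mmd2(patches_a, patches_b)
```

Here `landmarks` holds the indices of the five patches chosen greedily by the decomposition, and `different` is larger than `same` because the second patch set is drawn from a shifted distribution.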
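Objective: Starting from the theory that "Shaoyang governs the pivot" (少阳主枢), to explore the mechanism of Xiaochaihu Decoction (小柴胡汤) in the treatment of migraine using network pharmacology and molecular docking. Methods: The active components of Xiaochaihu Decoction and their related targets were obtained from the Traditional Chinese Medicine Systems Pharmacology Database and Analysis Platform (TCMSP). Migraine-related targets were obtained from Online Mendelian Inheritance in Man (OMIM) and the human gene database GeneCards. A Venn diagram was drawn to identify the potential targets of Xiaochaihu Decoction in treating migraine. The STRING database and Cytoscape 3.9.1 were used to construct a "drug-component-disease-target" network and a protein-protein interaction (PPI) network. The core target genes were imported into the Database for Annotation, Visualization and Integrated Discovery (DAVID) to perform Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis on the potential targets. Molecular docking was performed with AutoDock and visualized with PyMOL. Results: A total of 214 active components of Xiaochaihu Decoction and 148 potential targets for treating migraine were obtained. Topological analysis in Cytoscape 3.9.1 yielded 10 key components, including quercetin, stigmasterol, and β-sitosterol, and the PPI network yielded 5 core targets, including TNF, JUN, and AKT1. GO enrichment analysis yielded 810 entries: 604 biological processes, 73 cellular components, and 133 molecular functions. KEGG pathway enrichment analysis yielded 172 signaling pathways, mainly including the lipid and atherosclerosis, IL-17, and TNF signaling pathways. Molecular docking showed that the key components bind well to the core targets. Conclusion: Xiaochaihu Decoction exerts its therapeutic effect on migraine by regulating signaling pathways such as IL-17 and TNF and acting on core targets such as TNF, JUN, and AKT1, with the regulation of inflammatory responses in the brain at its core.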
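The Venn-diagram step amounts to intersecting the drug's target set with the disease's target set, and the reported GO breakdown can be checked against its total. The short sketch below uses toy sets: only TNF, JUN, and AKT1 are named in the abstract, and the `GENE_*` entries are placeholders, not data from the study.

```python
# Toy target lists (assumption): the GENE_* names are placeholders for illustration.
drug_targets = {"TNF", "JUN", "AKT1", "GENE_A", "GENE_B"}
disease_targets = {"TNF", "JUN", "AKT1", "GENE_C"}

# The overlap region of the Venn diagram: targets shared by drug and disease.
potential_targets = drug_targets & disease_targets

# GO enrichment counts as reported in the abstract, by category.
go_counts = {
    "biological_process": 604,
    "cellular_component": 73,
    "molecular_function": 133,
}
go_total = sum(go_counts.values())  # should match the reported 810 entries
```

In the study the same intersection is taken over the full TCMSP, OMIM, and GeneCards target lists, which gives the 148 potential targets reported.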
Funding: This work was funded by the National Natural Science Foundation of China (Grant Nos. 62172341 and 12204386), the Sichuan Natural Science Foundation (Grant No. 2024NSFSC1375), the Youth Foundation of Inner Mongolia Natural Science Foundation (Grant No. 2024QN06017), and the Basic Scientific Research Business Fee Project for Universities in Inner Mongolia (Grant No. 0406082215).