Abstract
Objective: West Nile virus (WNV) is one of the most common mosquito-borne zoonotic viruses worldwide, with distinctive transmission dynamics and a wide range of hosts. Many ecological and host factors have been reported to influence the host adaptation and transmission of WNV, and some host-specific genes or gene loci have been described at the molecular level, but the general compositional features of the WNV genome have received little attention. Artificial intelligence methods based on genome composition have been productive in identifying and predicting the host adaptability of other viruses. This study aimed to establish a convolutional neural network (CNN) model that predicts the host adaptability of WNV from general genomic features.
Methods: Available WNV gene sequences were encoded with dinucleotide composition representation (DCR) to capture their genomic features. Unsupervised learning was then used to analyze DCR-based distribution differences among WNV samples from different hosts. A CNN classification model based on genomic DCR was built to predict the adaptation of WNVs from birds, mammals and mosquitoes. In addition, host-specific amino acid residues in WNV proteins were inferred with a Bayesian method.
Results: DCR features effectively distinguished host-specific WNVs. The trained CNN model accurately separated mammal-adapted WNVs from avian-adapted WNVs, but was much less accurate for mosquito versus mammalian WNVs. The predicted host adaptation was consistent with host-specific biases in amino acid distribution at the Bayes-inferred sites in WNV proteins, which suggests that these sites may be important for WNV adaptive phenotypes.
Conclusions: The genomic compositional features of WNV are host-specific, and this genomic bias makes it possible to predict the adaptation of WNVs to avian or mammalian hosts with deep learning methods. DCR-based decomposition helps identify WNVs at high risk of infecting mammals. The present study provides general knowledge of the genomic features that contribute to the host adaptation of WNV.
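As a rough illustration of the kind of genomic feature the abstract refers to, the sketch below computes the normalized frequencies of the 16 dinucleotides in a sequence. It is a simplified stand-in, not the authors' DCR encoding, whose exact definition is not given here; the function name `dinucleotide_frequencies` and the choice to skip ambiguous bases are assumptions made for this example.

```python
from collections import Counter
from itertools import product

# The 16 possible dinucleotides in a fixed order: AA, AC, AG, AT, CA, ...
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def dinucleotide_frequencies(seq):
    """Return a 16-element list of dinucleotide frequencies for `seq`.

    Hypothetical helper for illustration only. Overlapping windows of
    length 2 are counted; windows containing characters other than
    A, C, G, T (for example N) are ignored.
    """
    # WNV is an RNA virus, so accept U and treat it as T.
    seq = seq.upper().replace("U", "T")
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    valid = [counts.get(d, 0) for d in DINUCLEOTIDES]
    total = sum(valid)
    if total == 0:
        # No usable dinucleotide: return an all-zero vector.
        return [0.0] * len(DINUCLEOTIDES)
    return [c / total for c in valid]


if __name__ == "__main__":
    freqs = dinucleotide_frequencies("ACGUACGN")
    for name, value in zip(DINUCLEOTIDES, freqs):
        if value > 0:
            print(name, round(value, 3))
```

A vector like this, computed per gene or per genome, is the sort of input that could be fed to an unsupervised projection or a classifier; how the study arranges such features for its CNN is not specified in the abstract.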
Authors
蔡玉荣
曾丹丹
张森
李靖
福泉
CAI Yu-rong; ZENG Dan-dan; ZHANG Sen; LI Jing; FU Quan (Clinical Laboratory, Affiliated Hospital of Inner Mongolia Medical University, Hohhot 010030, Inner Mongolia, China; State Key Laboratory of Pathogen and Biosecurity, Academy of Military Medical Sciences, Beijing 100071, China)
Source
《寄生虫与医学昆虫学报》
2025, Issue 2, pp. 84-92 (9 pages)
Acta Parasitologica et Medica Entomologica Sinica
Funding
National Key Research and Development Program of China (2024YFC2607500).