Abstract
Structural relations are the concurrent, exclusive, and associated relations that hold among sequences; such relations also occur in biological sequences. Structural relation mining extracts these relations and turns them into usable information. This paper combines data mining with bioinformatics techniques to extract mutation points from biological sequences and to mine the structural relations among them. As a test case, novel coronavirus (SARS-CoV-2) sequences were chosen as the mining target: the S protein sequences of Delta (B.1.617.2) variant strains from five countries were mined. All of these sequences show mutations relative to the B.1 reference sequence, and the mutation points exhibit concurrent, exclusive, and associated relations. A biological analysis of the experimental results, combined with existing literature on viral infectivity and vaccine effectiveness, confirms that the mined structural relations among the mutation points are biologically meaningful.
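The abstract does not spell out the extraction or mining procedure, so the following is only a minimal sketch of how mutation points might be extracted against a reference and how pairs of them might be labelled. Everything in it is an assumption made for illustration: the function names `extract_mutations` and `classify_pairs`, the `min_confidence` threshold, the working definitions of the three relations (concurrent = carried by exactly the same sequences, exclusive = never carried together, associated = partial but frequent co-occurrence), and the toy sequences, which are not real S protein data.

```python
from itertools import combinations


def extract_mutations(reference, sequence):
    """Return substitutions such as 'K2R' (1-based position) between two
    pre-aligned, equal-length sequences. Indels are not handled here."""
    if len(reference) != len(sequence):
        raise ValueError("sequences must be aligned to the same length")
    return {
        f"{ref}{pos}{alt}"
        for pos, (ref, alt) in enumerate(zip(reference, sequence), start=1)
        if ref != alt
    }


def classify_pairs(mutation_sets, min_confidence=0.6):
    """Label each pair of mutations using the assumed working definitions.

    mutation_sets: one set of mutation labels per sample sequence.
    min_confidence: hypothetical threshold for the 'associated' label.
    """
    # Map each mutation to the indices of the samples that carry it.
    carriers = {}
    for idx, muts in enumerate(mutation_sets):
        for m in muts:
            carriers.setdefault(m, set()).add(idx)

    relations = {}
    for a, b in combinations(sorted(carriers), 2):
        sa, sb = carriers[a], carriers[b]
        both = sa & sb
        if sa == sb:
            relations[(a, b)] = "concurrent"   # always appear together
        elif not both:
            relations[(a, b)] = "exclusive"    # never appear together
        elif len(both) / min(len(sa), len(sb)) >= min_confidence:
            relations[(a, b)] = "associated"   # frequent partial overlap
    return relations


# Toy data only (hypothetical six-residue sequences).
reference = "MKTAYL"
samples = ["MRTAYF", "MRTGYF", "MKSAYL"]

mutation_sets = [extract_mutations(reference, s) for s in samples]
relations = classify_pairs(mutation_sets)
for pair, label in sorted(relations.items()):
    print(pair, label)
```

Real sequences would first need to be aligned to the reference, and the thresholds and relation definitions would have to follow the paper's own formulation rather than the simplified ones assumed here.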
Authors
韩静
陈未如
张雪
高胜召
陈章昭
HAN Jing; CHEN Weiru; ZHANG Xue; GAO Shengzhao; CHEN Zhangzhao (College of Computer Science and Technology, Shenyang University of Chemical Technology, Shenyang 110142; Liaoning Key Laboratory of Industrial Intelligence Technology on Chemical Process, Shenyang 110142)
Source
《计算机与数字工程》
2025, No. 9, pp. 2428-2432, 2454 (6 pages in total)
Computer & Digital Engineering
Funding
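Liaoning Province "Hundred, Thousand, Ten Thousand Talents Project" (No. 辽人社[2019]45号)
Natural Science Foundation of Liaoning Province (No. 2022-MS-291)
Scientific Research Project of the Education Department of Liaoning Province (No. LJ2020024, 2022)
Basic Scientific Research Project of the Education Department of Liaoning Province (No. LJKMZ20220781, LJKMZ20220783)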
Keywords
biological sequence
structural relation
concurrent relation
exclusive relation
associated relation