Abstract
[Objective] This study aims to identify potential targets of antineoplastic drugs and to provide a reference for future clinical work and experimental validation. [Methods] We retrieved antineoplastic drug targets from the DrugBank database and combined them with protein-protein interaction (PPI) data from the HPRD database. We built a PPI network of the drug targets in Cytoscape and calculated the topological attributes of its nodes. We selected topological attribute variables using univariate analysis in SPSS and the information gain criterion in Weka, applied the SMOTE algorithm to address class imbalance in the data set, and built a prediction model for antineoplastic drug targets with a decision tree. We then compared its performance with models built with three other common machine learning classification algorithms. [Results] The decision tree model achieved a prediction accuracy of 73.18%. Validation in cBioPortal showed that the 16 predicted targets with a prediction score of at least 0.9 carry mutations and amplifications in multiple tumor types; NR5A1 is analyzed in detail as an example. [Limitations] The model uses only the PPI network attributes of antineoplastic drug targets; functional and sequence attributes of the targets were not included. [Conclusions] Predicting potential antineoplastic drug targets with machine learning methods based on the topological attributes of the PPI network is effective and can provide a reference for antineoplastic drug development and clinical work.
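The preprocessing steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' workflow, which relied on Cytoscape, SPSS, and Weka; the toy edge list, the function names, and the simplified SMOTE-style interpolation are all assumptions made for illustration. It computes two common topological attributes (degree and clustering coefficient), the information gain of a discrete attribute with respect to a target/non-target label, and synthetic minority samples by interpolation between nearest neighbours.

```python
import math
import random
from collections import Counter, defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            adj[a].add(b)
            adj[b].add(a)
    return adj


def degree(adj, node):
    """Number of interaction partners of a node."""
    return len(adj[node])


def clustering_coefficient(adj, node):
    """Fraction of a node's neighbour pairs that are themselves connected."""
    nbrs = sorted(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """H(labels) - H(labels | feature) for a discrete feature."""
    n = len(labels)
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def smote_like(minority, n_new, k=2, seed=0):
    """Simplified SMOTE-style oversampling (an assumption, not Weka's filter):
    each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        t = rng.random()
        synthetic.append(tuple(xi + t * (ni - xi) for xi, ni in zip(x, nb)))
    return synthetic


if __name__ == "__main__":
    # Hypothetical toy network; real input would be HPRD interaction pairs.
    toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
    graph = build_graph(toy_edges)
    for protein in sorted(graph):
        print(protein, degree(graph, protein), clustering_coefficient(graph, protein))
```

In a full pipeline, the attributes with the highest information gain would be kept as features, the minority class (known targets) would be oversampled, and a decision tree classifier would then be trained on the balanced data.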
Authors
范馨月
崔雷
Fan Xinyue; Cui Lei (School of Medical Informatics, China Medical University, Shenyang 110122, China)
Source
《数据分析与知识发现》
CSSCI
CSCD
Peking University Core Journal
2018, Issue 12, pp. 98-108 (11 pages)
Data Analysis and Knowledge Discovery
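Funding
This work is one of the research outcomes of the CERNET Next Generation Internet Technology Innovation Project "Medical Imaging Teaching Platform for Colleges and Universities" (Grant No. NGII20150503).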
Keywords
PPI Network
Machine Learning
Decision Tree
Antineoplastic Drug Target Prediction