Abstract
With the development of single-cell sequencing technology, many clustering algorithms based on single-cell RNA sequencing (scRNA-seq) data have been proposed for classifying single cells, and they have given good results in practice. So far, however, the field lacks both a survey of the clustering models behind these algorithms and a performance evaluation across those models. From the perspective of the clustering model, this paper divides 11 common single-cell clustering algorithms into five types: K-means clustering, hierarchical clustering, graph-based clustering, model-based clustering, and density-based clustering. It summarizes the characteristics and research progress of these algorithms and evaluates all 11 on ten scRNA-seq datasets. The experimental results show that SC3, Seurat, and SIMLR perform comparatively well among the existing methods and that, among the five model types, the density-based algorithms perform best, which indicates good practical value.
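The abstract does not say which metric was used to score the clustering results. As a minimal sketch of how one method's output could be compared with reference cell-type labels, the example below assumes the adjusted Rand index (ARI), a common choice for this kind of benchmark. The function name, the toy labels, and the two "methods" are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(true_labels, pred_labels):
    """Adjusted Rand index between two labelings of the same cells."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label lists must have the same length")
    n = len(true_labels)
    # Count pairs of cells that share a group in both labelings,
    # in the reference labeling, and in the predicted labeling.
    both = sum(comb(c, 2) for c in Counter(zip(true_labels, pred_labels)).values())
    ref = sum(comb(c, 2) for c in Counter(true_labels).values())
    pred = sum(comb(c, 2) for c in Counter(pred_labels).values())
    total = comb(n, 2)
    # Agreement expected by chance, and the maximum attainable agreement.
    expected = ref * pred / total if total else 0.0
    max_index = (ref + pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both labelings are trivial
    return (both - expected) / (max_index - expected)


# Toy data (assumed): reference cell types and two hypothetical method outputs.
reference = ["T", "T", "T", "B", "B", "B"]
method_a = [0, 0, 0, 1, 1, 1]  # same partition, different label names
method_b = [0, 0, 1, 1, 2, 2]  # partially disagrees with the reference

score_a = adjusted_rand_index(reference, method_a)
score_b = adjusted_rand_index(reference, method_b)
print(f"method_a ARI = {score_a:.3f}, method_b ARI = {score_b:.3f}")
```

ARI does not depend on how clusters are named, so a method is not penalized for numbering its clusters differently from the reference annotation. A benchmark like the one described would repeat this comparison for each algorithm on each dataset.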
Authors
HE Rui (何睿)
YU Na (余娜)
LI Miao (李淼)
ZHANG Junwei (张峻巍)
WANG Haojie (王浩杰)
ZHAO Yuming (赵玉茗)
Information and Computer Engineering College, Northeast Forestry University, Harbin 150000, China
Source
Intelligent Computer and Applications (《智能计算机与应用》)
2020, No. 7, pp. 104-108 (5 pages)
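Funding
National Undergraduate Innovation and Entrepreneurship Training Program (201810225173)
National Natural Science Foundation of China (61971119)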
Keywords
Cell classification
Clustering algorithm
Single-cell sequencing