Abstract
Objective To evaluate how principal component analysis (PCA), combined with hierarchical clustering and K-means clustering, affects the classification of tissue samples in gene expression profile data. Methods Samples were clustered either after PCA-based dimension reduction or directly without PCA, at several significance levels used to screen for genes related to tissue sample type, and the clustering results were compared. Results With the Rand index used to evaluate both clustering methods, clustering on the extracted principal components gave different results from direct clustering; the significance level used for gene screening also affected the results. Conclusion When clustering tissue samples, PCA improved clustering quality, and the results also varied with the number of differentially expressed genes retained by screening.
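The abstract names the Rand index as its evaluation measure but does not define it. A minimal sketch of the standard (unadjusted) Rand index follows: the fraction of sample pairs on which two partitions agree. The function name `rand_index` and the example labels are illustrative assumptions, not taken from the paper, and the paper may have used a variant of the index.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Unadjusted Rand index between two partitions of the same samples.

    A pair of samples counts as an agreement when both partitions put the
    pair in the same group, or both put it in different groups.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)
    if n < 2:
        raise ValueError("at least two samples are required")

    agreements = 0
    total_pairs = 0
    for i, j in combinations(range(n), 2):
        same_in_true = labels_true[i] == labels_true[j]
        same_in_pred = labels_pred[i] == labels_pred[j]
        if same_in_true == same_in_pred:
            agreements += 1
        total_pairs += 1
    return agreements / total_pairs


# Hypothetical example: six tissue samples with known types, and the
# cluster labels produced by some clustering run (made-up values).
known_types = [0, 0, 0, 1, 1, 1]
cluster_labels = [0, 0, 1, 1, 1, 1]
score = rand_index(known_types, cluster_labels)
```

The index depends only on which samples are grouped together, so renaming the clusters does not change the score. Under this sketch, comparing PCA-based clustering with direct clustering means computing the score once for each set of cluster labels against the known tissue types.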
Source
《中国卫生统计》
CSCD
Peking University Core Journals (北大核心)
2003, Issue 1, pp. 2-5 (4 pages)
Chinese Journal of Health Statistics