Funding: ... carried out at the LLU Center for Genomics was funded in part by the Ardmore Institute of Health grant 2150141 (CW) and Dr. Charles A. Sims' gift to LLU Center for Genomics ... supported by the National Center for Biotechnology Information of the National Library of Medicine (NLM), National Institutes of Health ...
Abstract
Background: Single-cell RNA-sequencing (scRNA-seq) has emerged as a powerful tool for cancer research, enabling in-depth characterization of tumor heterogeneity at the single-cell level. Recently, several scRNA-seq copy number variation (scCNV) inference methods have been developed, expanding the application of scRNA-seq to the study of genetic heterogeneity in cancer using transcriptomic data. However, the fidelity of these methods has not been investigated systematically.
Methods: We benchmarked five commonly used scCNV inference methods: HoneyBADGER, CopyKAT, CaSpER, inferCNV, and sciCNV. We evaluated their performance across four different scRNA-seq platforms using data from our previous multicenter study. We further evaluated scCNV performance using scRNA-seq datasets derived from mixed samples consisting of five human lung adenocarcinoma cell lines. We also sequenced tissues from a small cell lung cancer patient and used these data to validate our findings with a clinical scRNA-seq dataset.
Results: We found that the sensitivity and specificity of the five scCNV inference methods varied depending on the selection of reference data, sequencing depth, and read length. CopyKAT and CaSpER outperformed the other methods overall, while inferCNV, sciCNV, and CopyKAT performed better than the other methods in subclone identification. We found that batch effects significantly affected the performance of subclone identification in mixed datasets for most of the methods we tested.
Conclusion: Our benchmarking study revealed the strengths and weaknesses of each of these scCNV inference methods and provided guidance for selecting the optimal CNV inference method using scRNA-seq data.
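For readers unfamiliar with the sensitivity and specificity metrics named in the Results, the following is a minimal sketch of how they could be computed for CNV calls. It is not the study's evaluation code. It assumes per-gene (or per-segment) states encoded as -1 (loss), 0 (neutral), and 1 (gain), with a matched ground-truth vector; the encoding and the function name `cnv_sensitivity_specificity` are hypothetical.

```python
from typing import Sequence, Tuple


def cnv_sensitivity_specificity(
    truth: Sequence[int], called: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for CNV calls.

    Assumed encoding (hypothetical): -1 = loss, 0 = neutral, 1 = gain.
    A true positive is an altered region called with the correct direction;
    a true negative is a neutral region called neutral.
    """
    if len(truth) != len(called):
        raise ValueError("truth and called must have the same length")

    tp = fn = tn = fp = 0
    for t, c in zip(truth, called):
        if t != 0:
            # Region is truly altered: correct only if the direction matches.
            if c == t:
                tp += 1
            else:
                fn += 1
        else:
            # Region is truly neutral: any gain/loss call is a false positive.
            if c == 0:
                tn += 1
            else:
                fp += 1

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy example with made-up values, for illustration only.
example_truth = [1, 1, 0, 0, -1, 0]
example_called = [1, 0, 0, 1, -1, 0]
sens, spec = cnv_sensitivity_specificity(example_truth, example_called)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

In this sketch a call with the wrong direction (a gain called where the truth is a loss) counts as a false negative; other conventions are possible and would change the numbers.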