Abstract
Objective · To integrate multiple next-generation sequencing analysis methods into an automated analysis and visualization tool for gene-panel sequencing data that covers quality control and gene mutation detection. Methods · FastQC, PRINSEQ (preprocessing and information of sequences) and other methods were integrated to develop an R package for quality control and visualization of the raw sequencing reads and the final mutation results. The Burrows-Wheeler Alignment Tool (BWA) or the Torrent Mapping Alignment Program (TMAP) was used to map the reads in FASTQ files to the reference genome. LoFreq, VarScan2, the Genome Analysis Toolkit (GATK) and Torrent Variant Caller (TVC) were used to detect gene mutations and produce variant call format (VCF) files. ANNOVAR was used to annotate the mutation sites. Results · Data from 36 patients with acute myeloid leukemia, sequenced on the Ion Torrent Personal Genome Machine (PGM) platform, were analyzed with this tool. In 2 demo samples, 10 high-confidence mutation sites validated by Sanger sequencing were found in DNMT3A, TET2, JAK2, PHF6, ASXL1, NPM1 and CEBPA. Conclusion · The method integrates and develops a series of tools for gene-panel sequencing data analysis and can effectively complete gene mutation detection on such data. It can reduce the false-positive rate and improve the efficiency and sensitivity of mutation detection, providing effective support for the analysis of gene-panel sequencing data.
Source
《上海交通大学学报(医学版)》
CSCD
PKU Core Journals
2017, No. 11, pp. 1574-1580 (7 pages)
Journal of Shanghai Jiao Tong University (Medical Science)
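Funding
National Natural Science Foundation of China (81570122, 81770205)
Shanghai Municipal Education Commission Gaofeng-Gaoyuan Discipline Construction Program (20161303)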
Keywords
next-generation sequencing
gene-panel sequencing
quality control
gene mutation detection
visualization