Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 31730089, 31672391, 31702087, and 31701144), the National Key R&D Program of China (Grant No. 2016YFD0101900), the Fundamental Research Funds for the Central Universities, China (Grant Nos. 2662020DKPY007 and 2662019PY011), the National Science Foundation, USA (Grant No. DBI 1661348), and the National Swine System Industry Technology System, China (Grant No. CARS-35).
Abstract: With the development of high-throughput sequencing technologies, both sample size and SNP number are increasing rapidly in genome-wide association studies (GWAS), and the associated computation is more challenging than ever. Here, we present a memory-efficient, visualization-enhanced, and parallel-accelerated R package called "rMVP" to address the need for improved GWAS computation. rMVP can 1) effectively process large GWAS data, 2) rapidly evaluate population structure, 3) efficiently estimate variance components by the Efficient Mixed-Model Association eXpedited (EMMAX), Factored Spectrally Transformed Linear Mixed Models (FaST-LMM), and Haseman-Elston (HE) regression algorithms, 4) implement parallel-accelerated association tests of markers using the general linear model (GLM), mixed linear model (MLM), and fixed and random model circulating probability unification (FarmCPU) methods, 5) run quickly thanks to a globally efficient design across the GWAS process, and 6) generate various visualizations of GWAS-related information. Accelerated by a block matrix multiplication strategy and multiple threads, the association test methods embedded in rMVP are significantly faster than PLINK, GEMMA, and FarmCPU_pkg. rMVP is freely available at https://github.com/xiaolei-lab/rMVP.
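The abstract attributes part of the speed-up to a block matrix multiplication strategy. As a rough illustration of that general idea, and not rMVP's actual implementation, the Python sketch below tests markers one block at a time, so that the cross-products for a whole block come from a single matrix product instead of a per-marker loop. It assumes a deliberately simplified model (each marker regressed on the phenotype with an intercept only, no covariates or kinship), and the function name `blockwise_marker_scan` and its parameters are hypothetical.

```python
import numpy as np


def blockwise_marker_scan(y, genotypes, block_size=1000):
    """Return slope t-statistics for y ~ 1 + marker, one per marker.

    Simplified stand-in for a GLM scan: intercept only, no covariates.
    Monomorphic markers (zero variance) would produce NaN here.
    """
    y = np.asarray(y, dtype=float)
    n, m = genotypes.shape
    yc = y - y.mean()          # centering absorbs the intercept
    syy = yc @ yc
    t_stats = np.empty(m)
    for start in range(0, m, block_size):
        stop = min(start + block_size, m)
        block = np.asarray(genotypes[:, start:stop], dtype=float)
        block = block - block.mean(axis=0)
        sxy = block.T @ yc                          # one product per block
        sxx = np.einsum("ij,ij->j", block, block)   # per-marker sum of squares
        beta = sxy / sxx
        rss = syy - beta * sxy                      # residual sum of squares
        se = np.sqrt(rss / (n - 2) / sxx)
        t_stats[start:stop] = beta / se
    return t_stats


# Simulated data: 200 individuals, 50 markers, marker 10 has a true effect.
rng = np.random.default_rng(0)
G = rng.binomial(2, 0.3, size=(200, 50))
y = 0.8 * G[:, 10] + rng.normal(size=200)
t = blockwise_marker_scan(y, G, block_size=16)
```

Because each block is processed independently, the blocks could in principle be distributed across threads, which is the second ingredient the abstract mentions. The block size trades memory use against the size of each matrix product.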