Abstract
Bacillus thuringiensis (B. thuringiensis) is a soil-dwelling Gram-positive bacterium whose plasmid-encoded toxins (Cry) are commonly used as biological alternatives to pesticides. In a pangenomic study, we sequenced seven B. thuringiensis isolates at high coverage and high base quality using a next-generation sequencing platform. The B. thuringiensis pangenome was extrapolated to have 4196 core genes, with an asymptotic value of 558 unique genes contributed by each newly added genome. Compared with the pangenomes of closely related species of the same genus, the B. thuringiensis pangenome is open, similar to that of B. cereus but unlike that of B. anthracis, which has a closed pangenome. We also found extensive divergence among the seven B. thuringiensis genome assemblies, which harbor ample repeats and single nucleotide polymorphisms (SNPs). Identities among orthologous genes are greater than 84.5%, and hotspots of genome variation were found in the genomic regions 2.3-2.8 Mb and 5.0-5.6 Mb. We conclude that high-coverage sequence assemblies from multiple strains, even before all gaps are closed, are very useful for pangenomic studies.
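The abstract does not describe how the core-gene and unique-gene figures were extrapolated. As a rough illustration only, the sketch below tallies the two quantities such extrapolations are usually fitted to: the number of genes shared by all genomes so far, and the number of genes first seen in each added genome. The function name `pangenome_curves` and the toy gene identifiers are hypothetical and are not taken from the study.

```python
def pangenome_curves(genomes):
    """Tally core-genome size and newly seen genes as genomes are added in order.

    `genomes` is a list of sets of gene (ortholog cluster) identifiers.
    This is an illustrative sketch, not the method used in the study.
    """
    core = set(genomes[0])          # genes present in every genome so far
    pan = set(genomes[0])           # genes present in at least one genome so far
    core_sizes = [len(core)]
    new_counts = [len(pan)]         # the first genome contributes all of its genes
    for genes in genomes[1:]:
        genes = set(genes)
        core &= genes                        # shrink the core to shared genes
        new_counts.append(len(genes - pan))  # genes not seen in earlier genomes
        pan |= genes                         # grow the pangenome
        core_sizes.append(len(core))
    return core_sizes, new_counts


# Toy data: three made-up genomes with made-up gene identifiers.
toy = [
    {"g1", "g2", "g3", "g4"},
    {"g1", "g2", "g3", "g5"},
    {"g1", "g2", "g6", "g7"},
]
core_sizes, new_counts = pangenome_curves(toy)
print(core_sizes)  # core-genome size after each added genome
print(new_counts)  # new genes contributed by each added genome
```

In an open pangenome the new-gene count stays above zero as genomes are added, whereas in a closed pangenome it approaches zero. In practice such tallies are typically averaged over many genome orderings before a curve is fitted.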
Funding
Supported by a grant from King Abdulaziz City for Science and Technology, Riyadh, Saudi Arabia (No. KACST 428-29)
Institutional grant from the CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences
Supported by grants from the National Basic Research Program (973 Program) (No. 2010CB126604)
The Special Foundation Work Program (No. 2009FY120100)
The Ministry of Science and Technology of the People's Republic of China and the National Science Foundation of China (No. 31071163)