Funding: Supported by the National Basic Research Program of the Ministry of Science and Technology of China (973 Project, Grant No. 2013CB834100), the State Key Laboratory of Applied Surface Physics, and the Department of Physics, Fudan University, China.
Abstract: A faithful phylogeny and an objective taxonomy for prokaryotes should agree with each other and ultimately follow the genome data. With the number of sequenced genomes reaching tens of thousands, both tree inference and detailed comparison with taxonomy are great challenges. We now provide one solution in the latest Release 3.0 of the alignment-free and whole-genome-based web server CVTree3. The server resides in a cluster of 64 cores and is equipped with an interactive, collapsible, and expandable tree display. It is capable of comparing the tree branching order with prokaryotic classification at all taxonomic ranks from domains down to species and strains. CVTree3 allows for inquiry by taxon names and trial lineage modifications. In addition, it reports a summary of monophyletic and non-monophyletic taxa at all ranks and produces print-quality subtree figures. After giving an overview of retrospective verification of the CVTree approach, we describe the power of the new server for the mega-classification of prokaryotes and for determining the taxonomic placement of some newly sequenced genomes. A few discrepancies between CVTree and 16S rRNA analyses are also summarized with regard to possible taxonomic revisions. CVTree3 is freely accessible to all users at http://tlife.fudan.edu.cn/cvtree3/ without login requirements.
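The summary of monophyletic and non-monophyletic taxa mentioned above can be illustrated with a small sketch. This is not the CVTree3 server code; it assumes a rooted tree written as nested tuples of leaf names and a hypothetical `taxon_of` mapping from each leaf to one taxon name at a single rank. A taxon is reported as monophyletic when some clade of the tree contains exactly the leaves assigned to that taxon.

```python
def clade_leaf_sets(tree):
    """Collect the leaf set of every clade in a rooted tree.

    The tree is a nested tuple; a leaf is a plain string (assumed format).
    """
    clades = []

    def walk(node):
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            # Union of the leaf sets of all child subtrees.
            leaves = frozenset().union(*(walk(child) for child in node))
        clades.append(leaves)
        return leaves

    walk(tree)
    return clades


def monophyly_report(tree, taxon_of):
    """Map each taxon name to True if its leaves form exactly one clade."""
    clades = set(clade_leaf_sets(tree))
    members = {}
    for leaf, taxon in taxon_of.items():
        members.setdefault(taxon, set()).add(leaf)
    return {taxon: frozenset(leaves) in clades
            for taxon, leaves in members.items()}


# Hypothetical example: genus A is split by a member of genus B.
example_tree = (("a1", "b1"), "a2")
example_taxa = {"a1": "genusA", "a2": "genusA", "b1": "genusB"}
report = monophyly_report(example_tree, example_taxa)
```

In this toy example `genusA` is reported as non-monophyletic because `b1` sits inside its smallest enclosing clade, while the single-leaf `genusB` is trivially monophyletic. Repeating the check with a different `taxon_of` mapping per rank would give a per-rank summary.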
Funding: Supported by the National Basic Research Program of China (973 Project, Grant No. 2013CB834100), the National Natural Science Foundation of China (Grant No. 11474068), the State Key Laboratory of Applied Surface Physics, and the Department of Physics, Fudan University, China.
Abstract: We report an important but long-overlooked manifestation of the low resolution power of 16S rRNA sequence analysis at the species level: in 16S rRNA-based phylogenetic trees, polyphyletic placements of closely related species are abundant compared to those in genome-based phylogeny. This phenomenon makes the demarcation of genera within many families ambiguous in the 16S rRNA-based taxonomy. In this study, we reconstructed the phylogenetic relationships of more than ten thousand prokaryote genomes using the CVTree method, which is based on whole-genome information. Many such genera, which are polyphyletic in 16S rRNA-based trees, are well resolved as monophyletic clusters by CVTree. We believe that, with genome sequencing of prokaryotes becoming commonplace, genome-based phylogeny is destined to play a definitive role in the construction of a natural and objective taxonomy.
Abstract: Composition Vector Tree (CVTree) is an alignment-free algorithm to infer phylogenetic relationships from genome sequences. It has been successfully applied to study the phylogeny and taxonomy of viruses, prokaryotes, and fungi based on whole genomes, as well as chloroplast genomes, mitochondrial genomes, and metagenomes. Here we present the standalone software for the CVTree algorithm. In the software, an extensible parallel workflow for the CVTree algorithm was designed. Based on the workflow, new alignment-free methods were also implemented. By examining the phylogeny and taxonomy of 13,903 prokaryotes based on 16S rRNA sequences, we showed that the CVTree software is an efficient and effective tool for studying phylogeny and taxonomy based on genome sequences. The code of the CVTree software is available at https://github.com/ghzuo/cvtree.
Abstract: The Composition Vector Tree (CVTree) is a parameter-free and alignment-free method to infer prokaryotic phylogeny from complete genomes. It is distinct from the traditional 16S rRNA analysis in both the input data and the methodology. The prokaryotic phylogenetic trees constructed by using the CVTree method agree well with Bergey's taxonomy in all major groupings and fine branching patterns. Thus, combined use of the CVTree approach and the 16S rRNA analysis may provide an objective and reliable reconstruction of the prokaryotic branch of the Tree of Life.
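As a rough illustration of what "alignment-free" means in practice, the sketch below compares two sequences through the frequencies of their overlapping k-strings rather than through an alignment. It is a simplified stand-in, not the CVTree implementation: the abstracts do not spell out the method, and the published approach involves further steps that are not shown here. The function names, the string length `k`, and the choice of (1 - cosine) / 2 as the dissimilarity are assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def kstring_frequencies(seq, k):
    """Relative frequencies of all overlapping length-k substrings of seq."""
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kstring: n / total for kstring, n in counts.items()}


def dissimilarity(u, v):
    """(1 - cosine similarity) / 2 between two sparse frequency vectors.

    Assumed measure for this sketch; it is 0 for identical vectors and
    0.5 for vectors that share no k-string.
    """
    dot = sum(u[key] * v[key] for key in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return (1.0 - dot / (norm_u * norm_v)) / 2.0


# Toy sequences (hypothetical data), compared without any alignment.
vec_a = kstring_frequencies("ACGTACGT", 2)
vec_b = kstring_frequencies("CCCCCCCC", 2)
d_same = dissimilarity(vec_a, vec_a)
d_diff = dissimilarity(vec_a, vec_b)
```

A matrix of such pairwise dissimilarities could then be passed to a standard distance-based tree-building method such as neighbor joining.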
Funding: Supported by the National Basic Research Program of China (973 Program, Grant No. 2007CB814800) and the Shanghai Leading Academic Discipline Project (Grant No. B111).
Abstract: Composition vector trees (CVTrees) are inferred from whole-genome data by an alignment-free and parameter-free method. The agreement of these trees with the corresponding taxonomy provides an objective justification of the inferred phylogeny. In this work, we show the stability and self-consistency of CVTrees by performing bootstrap and jackknife re-sampling tests adapted to this alignment-free approach. Our ultimate goal is to advocate the viewpoint that time-consuming statistical re-sampling tests can be avoided altogether when using this alignment-free approach. Agreement with taxonomy should be taken as a major criterion for evaluating prokaryotic phylogenetic trees.
Funding: Supported by the National Basic Research Program of China (973 Project, Grant Nos. 2007CB814800 and 2013CB834100), the Shanghai Leading Academic Discipline Project (Grant No. B111), the National Key Laboratory of Applied Surface Physics, and the Department of Physics, Fudan University.
Abstract: Shigella species and Escherichia coli are closely related organisms. Early phenotyping experiments and several recent molecular studies put Shigella within the species E. coli. However, the whole-genome-based, alignment-free and parameter-free CVTree approach shows convincingly that four established Shigella species, Shigella boydii, Shigella sonnei, Shigella flexneri and Shigella dysenteriae, are distinct from E. coli strains and form sister species to E. coli within the genus Escherichia. In view of the overall success and high resolution power of the CVTree approach, this result should be taken seriously. We hope that the present report may promote further in-depth study of the Shigella-E. coli relationship.
Abstract: We perform an exhaustive, taxon-by-taxon comparison of the branchings in the composition vector trees (CVTrees) inferred from 432 prokaryotic genomes available on 31 December 2006 with the bacteriologists' taxonomy, primarily the latest online Outline of the Bergey's Manual of Systematic Bacteriology. The CVTree phylogeny agrees very well with Bergey's taxonomy in the majority of fine branchings and in overall structure. At the same time, most of the differences between the trees and the Manual have been known to biologists to some extent and may hint at taxonomic revisions. Instead of demonstrating the overwhelming agreement, this paper puts emphasis on the biological implications of the differences.
Funding: This work was supported by the National Basic Research Program of China (973 Project, Grant No. 2013CB834100) and the National Natural Science Foundation of China (Grant No. 11474068).
Abstract: A long-standing question about the early evolution of club fungi (phylum Basidiomycota) is the relationship between the three major groups, Pucciniomycotina, Ustilaginomycotina and Agaricomycotina. It is unresolved whether Agaricomycotina are more closely related to Ustilaginomycotina or to Pucciniomycotina. Here we reconstructed the branching order of the three subphyla through two sources of phylogenetic signals, i.e., standard phylogenomic analysis and an alignment-free phylogenetic approach. Overall, beyond congruency within the frame of standard phylogenomic analysis, our results consistently and robustly supported the early divergence of Ustilaginomycotina and a closer relationship between Agaricomycotina and Pucciniomycotina.
Funding: Supported by the National Basic Research Program of China (2007CB814800) and the Shanghai Leading Academic Discipline Project (B111).
Abstract: The newly proposed alignment-free and parameter-free composition vector tree (CVTree) method has been successfully applied to infer the phylogenetic relationships of viruses, chloroplasts, bacteria, and fungi from their whole-genome data. In this study we pay special attention to the phylogenetic positions of 56 Archaea genomes, among which 7 species have not been listed either in Bergey's Manual of Systematic Bacteriology or in the Taxonomic Outline of Bacteria and Archaea (TOBA). By inspecting the stable monophyletic branchings in CVTrees reconstructed from a total of 861 genomes (56 Archaea plus 797 Bacteria, using 8 Eukarya as outgroups), definite taxonomic assignments were proposed for these not-fully-classified species. Further development of Archaea taxonomy may verify the phylogenetic predictions of the CVTree approach.