Recently, a report explaining the construction of a Human Pangenome ANalysis (HUPAN) system attracted wide attention in the biomedicine community. The original article was published in Genome Biology, a leading international journal in genomics [1]. Many researchers, particularly those conducting basic medical research.
Background: Domestic goose breeds are descended from either the Swan goose (Anser cygnoides) or the Greylag goose (Anser anser) and vary in body size, reproductive performance, egg production, feather color, and other phenotypic traits. Constructing a pan-genome allows genetic variations to be identified thoroughly, deepening our understanding of the molecular mechanisms underlying genetic diversity and phenotypic variability. Results: To support population genomic and pan-genomic analyses in geese, we assembled whole-genome resequencing data from 659 geese and compiled a database of 155 RNA-seq samples. By constructing the goose pan-genome, we generated non-reference contigs totaling 612 Mb, revealing 2,813 novel genes and identifying 15,567 core genes, 1,324 softcore genes, 2,734 shell genes, and 878 cloud genes. Furthermore, we detected an 81.97 Mb genomic region showing signs of selection, encompassing the TGFBR2 gene, which is correlated with variation in body weight among geese. Genome-wide association studies using single nucleotide polymorphisms (SNPs) and presence-absence variation revealed significant genomic associations with various meat quality, reproductive, and body composition traits. For instance, a gene encoding the SVEP1 protein was linked to carcass oblique length, and a distinct gene-CDS haplotype of the SVEP1 gene was also associated with this trait. Notably, the pan-genome analysis revealed enrichment of variable genes in the "hair follicle maturation" Gene Ontology term, potentially linked to selection on feather-related traits in geese. A gene presence-absence variation analysis suggested a reduced frequency of genes associated with "regulation of heart contraction" in domesticated geese compared to their wild counterparts. Our study provides novel insights into gene expression features and functions by integrating gene expression patterns across multiple organs and tissues in geese and analyzing population variation. Conclusion: This study identified numerous selection signals and candidate genes associated with a wide array of traits, markedly enhancing our understanding of the processes underlying domestication and breeding in geese. Moreover, the assembled pan-genome provides a comprehensive view of the goose genome and an indispensable resource for future goose breeding initiatives.
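The core/softcore/shell/cloud split reported in the goose abstract can be illustrated with a minimal sketch. The frequency thresholds below (`SOFTCORE_MIN`, `CLOUD_MAX`) and the function names are assumptions chosen for illustration; the abstract does not state which cut-offs the authors used. Only the four category counts are taken from the abstract.

```python
from collections import Counter

# Assumed thresholds for illustration; the abstract does not give the cut-offs.
SOFTCORE_MIN = 0.99  # present in at least 99% of individuals, but not all
CLOUD_MAX = 0.01     # present in fewer than 1% of individuals


def classify_gene(presence_fraction, softcore_min=SOFTCORE_MIN, cloud_max=CLOUD_MAX):
    """Assign a gene to a pan-genome category from its presence frequency."""
    if presence_fraction >= 1.0:
        return "core"
    if presence_fraction >= softcore_min:
        return "softcore"
    if presence_fraction >= cloud_max:
        return "shell"
    return "cloud"


def summarize_pav(pav_matrix):
    """Count genes per category.

    pav_matrix maps a gene name to a list of 0/1 presence calls,
    one call per individual.
    """
    counts = Counter()
    for calls in pav_matrix.values():
        counts[classify_gene(sum(calls) / len(calls))] += 1
    return counts


# Category counts as reported in the abstract.
reported = {"core": 15567, "softcore": 1324, "shell": 2734, "cloud": 878}
total_genes = sum(reported.values())
core_share = reported["core"] / total_genes

if __name__ == "__main__":
    toy = {"geneA": [1, 1, 1, 1], "geneB": [1, 0, 1, 0]}
    print(summarize_pav(toy))
    print(f"core share of reported genes: {core_share:.1%}")
```

With a real presence-absence matrix, `summarize_pav` would take one 0/1 list per gene across all sequenced individuals; the thresholds should be replaced by whatever the study actually defines.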
The Solanaceae family contains many agriculturally important crops, including tomato, potato, pepper, and tobacco, as well as other species with potential for agricultural development, such as the orphan crops groundcherry, wolfberry, and pepino. Research progress varies greatly among these species, with model crops like tomato being far ahead. This disparity limits the broader agricultural application of other Solanaceae species. In this study, we constructed an interspecies pan-genome for the Solanaceae family and identified various gene retention patterns. Our findings reveal that the activity of specific transposable elements is closely associated with gene fractionation and transposition. The pan-genome was further resolved at the level of T subgenomes, which were generated by the Solanaceae-specific paleohexaploidization (T event). We demonstrate substantial gene fractionation (loss) and divergence events following ancient duplications. For example, all class A and E flower model genes in Solanaceae originated from two tandemly duplicated genes, which expanded through the γ and T events before fractionating into 10 genes in tomato, each acquiring distinct functions critical for fruit development. Based on these results, we developed the Solanaceae Pan-Genome Database (SolPGD, http://www.bioinformaticslab.cn/SolPGD), which integrates datasets from both inter- and intra-species pan-genomes of Solanaceae. These findings and resources will facilitate future studies of solanaceous species, including orphan crops.
Cassava is a highly resilient tropical crop that produces large, starchy storage roots and high biomass. However, how cassava's remarkable environmental adaptability and key economic traits evolved from its wild species remains unclear. In this study, we obtained near-complete telomere-to-telomere genome assemblies and their haplotype forms for the cultivar AM560 and the wild ancestors FLA4047 and W14, constructed a graphic pan-genome of 30 representatives with a size of 1.15 Gb, and built a clarified evolutionary tree of 486 accessions. A comparison of structural variations and single-nucleotide variations between the ancestors and cultivated cassavas reveals predominant expansions and contractions in the numbers of genes and gene families, driven mainly by transposons. Significant selective sweeps occurred at 122 genomic footprints and affected 1,519 domesticated genes. We identify selective mutations in MeCSK and MeFNR2 that could promote photoreactions associated with MeNADP-ME in C4 photosynthesis in modern cassava. Coevolution of retarded floral primordia and initiation of storage roots may arise from MeCOL5 variants with altered binding to MeFT1, MeFT2, and MeTFL2. Mutations in MeMATE1 and MeGTR occur in sweet cassava, and MeAHL19 has evolved to regulate the biosynthesis, transport, and endogenous remobilization of cyanogenic glucosides in cassava. The extensive genomic and gene resources provided here, along with the findings on the evolutionary mechanisms responsible for beneficial traits in modern cultivars, lay a strong foundation for future breeding improvement of cassava.
Ash trees (Fraxinus) exhibit rich genetic diversity and wide adaptation to various ecological environments, and several species are highly salt tolerant. Dissecting the genomic basis of salt adaptation in Fraxinus is vital for its resistance breeding. Here, we present 11 high-quality chromosome-level genome assemblies for Fraxinus species, which reveal two unequal subgenome compositions and two recent whole-genome triplication events in their evolutionary history. A Fraxinus pan-genome was constructed on the basis of structural variations and revealed that presence–absence variations (PAVs) of transmembrane transport genes have likely contributed to salt adaptation in Fraxinus. Through whole-genome resequencing of an F1 population from an interspecies cross of F. velutina 'Lula 3' (salt tolerant) with F. pennsylvanica 'Lula 5' (salt sensitive), we mapped salt-tolerance PAV-based quantitative trait loci (QTLs) and pinpointed two PAV-QTLs and candidate genes associated with Fraxinus salt tolerance. Mechanistically, FvbHLH85 enhances salt tolerance by mediating reactive oxygen species and Na⁺/K⁺ homeostasis, whereas FvSWEET5 enhances salt tolerance by mediating osmotic homeostasis. Collectively, these findings provide valuable genomic resources for Fraxinus salt-resistance breeding and the research community.
Chickens are one of the most important domesticated animals and a major protein source. Studying genetic variation in chickens to enhance their production performance is of great potential value. Next-generation sequencing has enabled precise analysis of single nucleotide polymorphisms and insertions/deletions in chickens, while third-generation sequencing enables accurate identification of structural variants. However, the high cost of third-generation sequencing limits its application in population studies. The graph-based pan-genome strategy can overcome this challenge by enabling the detection of structural variations from cost-effective next-generation sequencing data. This study constructed a graph-based pan-genome for chickens using 12 high-quality genomes. The pan-genome used the linear genome GRCg6a as the reference and contains variant information from two commercial and nine native chicken breeds. Compared to the linear genome, the pan-genome significantly improved the efficiency of structural variation identification. On the basis of the graph-based pan-genome, high-frequency structural variations related to high egg production in Leghorn chickens were predicted. Additionally, next-generation sequencing and transcriptomics data indicated that potential structural variations were associated with highland adaptation in Tibetan chickens. Using the pan-genome graph, a new strategy for identifying structural variations related to traits of interest in chickens is presented.
Pigs were domesticated independently in the Near East and China, indicating that a single reference genome from one individual is unable to represent the full spectrum of divergent sequences in pigs worldwide. Therefore, 12 de novo pig assemblies from Eurasia were compared in this study to identify the sequences missing from the reference genome. As a result, 72.5 Mb of nonredundant sequences (~3% of the genome) were found to be absent from the reference genome (Sscrofa11.1) and were defined as pan-sequences. Of the pan-sequences, 9.0 Mb were dominant in Chinese pigs, in contrast with their low frequency in European pigs. One sequence dominant in Chinese pigs contained the complete genic region of tazarotene-induced gene 3 (TIG3), which is involved in fatty acid metabolism. Using flanking sequences and Hi-C based methods, 27.7% of the sequences could be anchored to the reference genome. Supplementing the reference with these sequences could contribute to the accurate interpretation of 3D chromatin structure. A web-based pan-genome database was further provided to serve as a primary resource for exploring genetic diversity and to promote pig breeding and biomedical research.
Post-polyploid diploidization associated with descending dysploidy and interspecific introgression drives plant genome evolution by unclear mechanisms. Raphanus is an economically and ecologically important Brassiceae genus and a model system for studying post-polyploidization genome evolution and introgression. Here, we report the de novo sequence assemblies for 11 genomes covering most of the typical sub-species and varieties of domesticated, wild, and weedy radishes from East Asia, South Asia, Europe, and America. Divergence among the species, sub-species, and South/East Asian types coincided with Quaternary glaciations. A genus-level pan-genome was constructed with family-based, locus-based, and graph-based methods, and whole-genome comparisons revealed genetic variations ranging from single-nucleotide polymorphisms (SNPs) to inversions and translocations of whole ancestral karyotype (AK) blocks. Extensive gene flow occurred between wild, weedy, and domesticated radishes. High frequencies of genome reshuffling, biased retention, and large-fragment translocation have shaped the genomic diversity. Most variety-specific gene-rich blocks showed large structural variations. Extensive translocation and tandem duplication of dispensable genes were revealed in two large rearrangement-rich islands. Disease resistance genes mostly resided on specific and dispensable loci. Variations causing the loss of function of enzymes modulating gibberellin deactivation were identified and could play an important role in phenotype divergence and adaptive evolution. This study provides new insights into the genomic evolution underlying post-polyploid diploidization and lays the foundation for genetic improvement of radish crops, biological control of weeds, and protection of wild species' germplasm.
Structural variations (SVs) have long been described as being involved in the origin, adaptation, and domestication of species. However, the underlying genetic and genomic mechanisms are poorly understood. Here, we report a high-quality genome assembly of Gossypium barbadense acc. Tanguis, a landrace closely related to the formation of extra-long-staple (ELS) cultivated cotton. An SV-based pan-genome (Pan-SV) was then constructed using a total of 182,593 non-redundant SVs, including 2,236 inversions, 97,398 insertions, and 82,959 deletions, from 11 assembled genomes of allopolyploid cotton. The utility of this Pan-SV was then demonstrated through population structure analysis and genome-wide association studies (GWASs). Using segregating mapping populations produced by crossing ELS cotton and the landrace, along with an SV-based GWAS, certain SVs responsible for speciation, domestication, and improvement in tetraploid cottons were identified. Importantly, some of the SVs identified here as associated with yield and fiber quality improvement had not been identified in previous SNP-based GWASs. In particular, a 9-bp insertion or deletion was found to be associated with elimination of the interspecific reproductive isolation between Gossypium hirsutum and G. barbadense. Collectively, this study provides new insights into genome-wide, gene-scale SVs linked to important agronomic traits in a major crop species and highlights the importance of SVs during the speciation, domestication, and improvement of cultivated crop species.
The domestication of Brassica oleracea has resulted in diverse morphological types with distinct patterns of organ development. Here we report a graph-based pan-genome of B. oleracea constructed from high-quality genome assemblies of different morphotypes. The pan-genome harbors over 200 structural variant hotspot regions enriched in auxin- and flowering-related genes. Population genomic analyses revealed that early domestication of B. oleracea focused on leaf or stem development. Gene flows resulting from agricultural practices and variety improvement were detected among different morphotypes. Selective-sweep and pan-genome analyses identified an auxin-responsive small auxin up-regulated RNA gene and a CLAVATA3/ESR-RELATED family gene as crucial players in leaf–stem differentiation during the early stage of B. oleracea domestication, and the BoKAN1 gene as instrumental in shaping the leafy heads of cabbage and Brussels sprouts. Our pan-genome and functional analyses further revealed that variations in the BoFLC2 gene play key roles in the divergence of vernalization and flowering characteristics among different morphotypes, and that variations in the first intron of BoFLC3 are involved in fine-tuning the flowering process in cauliflower. This study provides a comprehensive understanding of the pan-genome of B. oleracea and sheds light on the domestication and differential organ development of this globally important crop species.
Rice (Oryza sativa) is a significant crop worldwide with a genome shaped by various evolutionary factors. Rice centromeres are crucial for chromosome segregation and contain some unreported genes. Because the centromere region is diverse and complex, a comprehensive understanding of rice centromere structure and function at the population level is needed. We constructed a high-quality centromere map based on the rice super pan-genome, consisting of a 251-accession panel comprising both cultivated and wild species of Asian and African rice. We showed that rice centromeres have diverse CentO satellite repeats, which vary across chromosomes and subpopulations, reflecting their distinct evolutionary patterns. We also revealed that long terminal repeats (LTRs), especially young Gypsy-type LTRs, are abundant in the peripheral CentO-enriched regions and drive rice centromere expansion and evolution. Furthermore, high-quality genome assemblies and a complete telomere-to-telomere (T2T) reference genome enable us to obtain more centromeric genome information, even though mapping and cloning of centromere genes remain challenging. We investigated the association between structural variations and gene expression in the rice centromere. A centromere gene, OsMAB, which positively regulates rice tiller number, was further confirmed by expression quantitative trait loci, haplotype analysis, and clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 methods. By revealing new insights into the evolutionary patterns and biological roles of rice centromeres, our findings will facilitate future research on centromere biology and crop improvement.
Mung bean is an economically important legume crop species that is used as a food, consumed as a vegetable, and used as an ingredient and even as a medicine. To explore the genomic diversity of mung bean, we assembled a high-quality reference genome (Vrad_JL7) that was 479.35 Mb in size, with a contig N50 length of 10.34 Mb. A total of 40,125 protein-coding genes were annotated, representing 96.9% of the genetic region. We also sequenced 217 accessions, mainly landraces and cultivars from China, and identified 2,229,343 high-quality single-nucleotide polymorphisms (SNPs). Population structure revealed that the Chinese accessions diverged into two groups and were distinct from non-Chinese lines. Genetic diversity analysis based on genomic data from 750 accessions in 23 countries supported the hypothesis that mung bean was first domesticated in south Asia and introduced to east Asia, probably through the Silk Road. We constructed the first pan-genome of mung bean germplasm and assembled 287.73 Mb of non-reference sequences. Among the genes, 83.1% were core genes and 16.9% were variable. Presence/absence variation (PAV) events of nine genes involved in the regulation of the photoperiodic flowering pathway were identified as being under selection during the adaptation process to promote early flowering in the spring. Genome-wide association studies (GWASs) revealed 2,912 SNPs and 259 gene PAV events associated with 33 agronomic traits, including a SNP in the coding region of the SWEET10 homolog (jg24043) involved in crude starch content and a PAV event in a large fragment containing 11 genes for color-related traits. This high-quality reference genome and pan-genome will provide insights into mung bean breeding.
Pan-genomics can encompass most of the genetic diversity of a species or population and has proved to be a powerful tool for studying genomic evolution and the origin and domestication of species, and for providing information for plant improvement. Plant genomics has greatly progressed because of improvements in sequencing technologies and the rapid reduction of sequencing costs. Nevertheless, pan-genomics still presents many challenges, including computationally intensive assembly methods, high costs with large numbers of samples, ineffective integration of big data, and difficulty in applying it to downstream multi-omics analysis and breeding research. In this review, we summarize the definition and recent achievements of plant pan-genomics, computational technologies used for pan-genome construction, and the applications of pan-genomes in plant genomics and molecular breeding. We also discuss challenges and perspectives for future pan-genomics studies and provide a detailed pipeline for sample selection, genome assembly and annotation, structural variation identification, and construction and application of graph-based pan-genomes. The aim is to provide important guidance for plant pan-genome research and a better understanding of the genetic basis of genome evolution, crop domestication, and phenotypic diversity for future studies.
The wild rice species in the genus Oryza harbor a large amount of genetic diversity that has been untapped for rice improvement. Pan-genomics has revolutionized genomic research in plants. However, rice pan-genomic studies so far have been limited mostly to cultivated accessions, with only a few close wild relatives. Advances in sequencing technologies have permitted the assembly of high-quality rice genome sequences at low cost, making it possible to construct genus-level pan-genomes across all species. In this review, we summarize progress in current research on genetic and genomic resources in Oryza, and in sequencing and computational technologies used for rice genome and pan-genome construction. For future work, we discuss the approaches and challenges in the construction of, and data access to, Oryza pan-genomes based on representative high-quality genome assemblies. The Oryza pan-genomes will provide a basis for the exploration and use of the extensive genetic diversity present in both cultivated and wild rice populations.
Background: Unveiling genetic diversity features and understanding the genetic mechanisms of diverse goat phenotypes are pivotal in facilitating the preservation and utilization of these genetic resources. However, the total genetic diversity within a species cannot be captured by the reference genome of a single individual. The pan-genome is a collection of all the DNA sequences that occur in a species, and it is expected to capture the total genomic diversity of that species. Results: We constructed a goat pan-genome using map-to-pan assembly based on 813 individuals, including 723 domestic goats and 90 samples from their wild relatives, which provided broad regional and global representation. In total, 146 Mb of sequence and 974 genes were identified as absent from the reference genome (ARS1.2; GCF_001704415.2). We identified 3,190 novel single nucleotide polymorphisms (SNPs) using the pan-genome analysis. These novel SNPs could properly reveal the population structure of domestic goats and their wild relatives. Presence/absence variation (PAV) analysis revealed gene loss and intense negative selection during domestication and improvement. Conclusions: Our research highlights the importance of the goat pan-genome in capturing missing genetic variation. It reveals changes in genomic architecture, such as gene loss, during goat domestication and improvement. This improves our understanding of the evolutionary and breeding history of goats.
Background: Rumen microorganisms are key regulators of ruminant growth and production performance. Identifying probiotic candidates through microbial culturomics presents a promising strategy for improving ruminant production performance. Our previous study identified significant differences in the rumen microbial communities of Holstein calves with varying average daily gain (ADG). This study aims to identify a target strain based on the findings from multi-omics analysis and literature review, isolating and evaluating the target microbial strains from both the rumen and hindgut contents for their probiotic potential. Results: Parabacteroides distasonis, a strain closely associated with ADG, was successfully isolated from calf rumen content cultured with Fastidious Anaerobe Agar (FAA) medium and named Parabacteroides distasonis F4. Whole-genome sequencing and pan-genome analysis showed that P. distasonis F4 possesses a core functional potential for carbohydrate and amino acid metabolism, with the ability to produce propionate, acetate, and lactate. The results of targeted and untargeted metabolomics further validated the organic acid production and metabolic pathways of P. distasonis F4. An in vitro simulated rumen fermentation test showed that supplementation with P. distasonis F4 significantly altered rumen microbial community structure and increased the molar proportions of propionate and butyrate in the rumen. Furthermore, an in vivo study demonstrated that dietary supplementation with P. distasonis F4 significantly increased the ADG of pre-weaning calves. Conclusions: This study represents the first isolation of P. distasonis F4 from the rumen, highlighting its potential as a probiotic strain for improving rumen development and growth performance in ruminants.
As a common foodborne pathogen, Salmonella poses risks to public health safety, particularly given the emergence of antimicrobial-resistant strains. However, there is currently a lack of systematic platforms based on large language models (LLMs) for Salmonella resistance prediction, data presentation, and data sharing. To overcome this issue, we first propose a two-step feature-selection process based on the chi-square test and conditional mutual information maximization to find the key Salmonella resistance genes in a pan-genomics analysis, and develop an LLM-based Salmonella antimicrobial-resistance predictive (SARPLLM) algorithm, based on the Qwen2 LLM and low-rank adaptation, to achieve accurate antimicrobial-resistance prediction. Second, we reduce the time complexity of computing the sample distance from linear to logarithmic by constructing a quantum data augmentation algorithm denoted QSMOTEN. Third, we build a user-friendly online platform for Salmonella antimicrobial-resistance prediction based on knowledge graphs, which not only facilitates online resistance prediction for users but also visualizes the pan-genomics analysis results of the Salmonella datasets.
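The first stage of the two-step feature selection described above, the chi-square test, can be sketched as follows. This is a simplified illustration under stated assumptions, not the SARPLLM implementation: it assumes binary gene presence/absence features and a binary resistant/susceptible label, the function names `chi_square_2x2` and `rank_genes` are hypothetical, and the second stage (conditional mutual information maximization) is omitted.

```python
def chi_square_2x2(feature, label):
    """Pearson chi-square statistic for two binary (0/1) vectors.

    Uses the closed form for a 2x2 contingency table:
    chi2 = n * (a*d - b*c)^2 / ((a+b) * (c+d) * (a+c) * (b+d)).
    """
    a = sum(1 for f, l in zip(feature, label) if f == 1 and l == 1)
    b = sum(1 for f, l in zip(feature, label) if f == 1 and l == 0)
    c = sum(1 for f, l in zip(feature, label) if f == 0 and l == 1)
    d = sum(1 for f, l in zip(feature, label) if f == 0 and l == 0)
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A constant feature or constant label carries no association signal.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def rank_genes(pav, labels, top_k=None):
    """Rank genes by chi-square association with the resistance label.

    pav maps a gene name to its 0/1 presence calls across isolates;
    labels holds 1 for resistant and 0 for susceptible isolates.
    """
    ranked = sorted(pav, key=lambda g: chi_square_2x2(pav[g], labels), reverse=True)
    return ranked if top_k is None else ranked[:top_k]


if __name__ == "__main__":
    labels = [1, 1, 1, 1, 0, 0, 0, 0]
    pav = {
        "geneX": [1, 1, 1, 1, 0, 0, 0, 0],  # tracks resistance perfectly
        "geneY": [1, 0, 1, 0, 1, 0, 1, 0],  # unrelated to resistance
    }
    print(rank_genes(pav, labels))
```

In a two-step scheme of this kind, the genes surviving the chi-square filter would then be passed to a second criterion that penalizes redundancy between selected features.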
Many of our major crop species are polyploids, containing more than one genome or set of chromosomes. Polyploid crops present unique challenges, including difficulties in genome assembly, in discriminating between multiple gene and sequence copies, and in genetic mapping, hindering use of genomic data for genetics and breeding. Polyploid genomes may also be more prone to containing structural variation, such as loss of gene copies or sequences (presence–absence variation) and the presence of genes or sequences in multiple copies (copy-number variation). Although the two main types of genomic structural variation commonly identified are presence–absence variation and copy-number variation, we propose that homeologous exchanges constitute a third major form of genomic structural variation in polyploids. Homeologous exchanges involve the replacement of one genomic segment by a similar copy from another genome or ancestrally duplicated region, and are known to be extremely common in polyploids. Detecting all kinds of genomic structural variation is challenging, but recent advances such as optical mapping and long-read sequencing offer potential strategies to help identify structural variants even in complex polyploid genomes. All three major types of genomic structural variation (presence–absence, copy-number, and homeologous exchange) are now known to influence phenotypes in crop plants, with examples of flowering time, frost tolerance, and adaptive and agronomic traits. In this review, we summarize the challenges of genome analysis in polyploid crops, describe the various types of genomic structural variation and the genomics technologies and data that can be used to detect them, and collate information produced to date related to the impact of genomic structural variation on crop phenotypes. We highlight the importance of genomic structural variation for the future genetic improvement of polyploid crops.
Horizontal gene transfer (HGT) plays key roles in the evolution of pathogenic bacteria, especially of pathogenicity-associated genes. In this study, the evolutionary dynamics of Xanthomonas at the species level were determined by comparative analysis of the complete genomes of 15 Xanthomonas strains. A concatenated multiprotein phyletic pattern and a dataset of Xanthomonas clusters of orthologous genes were constructed. Mathematical extrapolation estimates that the core genome will reach a minimum of about 1,547 genes, while the pan-genome will increase up to 22,624 genes when 1,000 genomes are sequenced. The extent of HGT in this genus was assessed using a Markov-based probabilistic method. The reconstructed gene gain/loss history, which contained several features consistent with biological observations, showed that nearly 60% of the Xanthomonas genes were acquired by HGT. A large fraction of the variability was in the clade ancestor nodes and "leaves of the tree". Coexpression analysis suggested that the pathogenic and metabolic variation between Xanthomonas oryzae pv. oryzicola and Xanthomonas oryzae pv. oryzae might be due to recently transferred genes. Our results strongly support the view that gene gain/loss may play an important role in the divergence and pathogenicity variation of Xanthomonas species.
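The kind of extrapolation mentioned in the Xanthomonas abstract can be sketched with a power-law (Heaps' law) fit, a model commonly used for pan-genome growth curves. The abstract does not say which model the authors used, so the model choice, the function names, and the toy numbers below are all assumptions for illustration; none of the values come from the study.

```python
import math


def fit_power_law(genome_counts, pan_sizes):
    """Fit pan(n) = k * n**gamma by least squares in log-log space.

    Returns (k, gamma). All inputs must be positive.
    """
    xs = [math.log(n) for n in genome_counts]
    ys = [math.log(s) for s in pan_sizes]
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gamma = sxy / sxx
    k = math.exp(mean_y - gamma * mean_x)
    return k, gamma


def predict_pan_size(k, gamma, n):
    """Extrapolate the fitted curve to n genomes."""
    return k * n ** gamma


if __name__ == "__main__":
    # Hypothetical pan-genome sizes after adding 1..15 genomes (toy data).
    counts = list(range(1, 16))
    sizes = [4000 * n ** 0.3 for n in counts]
    k, gamma = fit_power_law(counts, sizes)
    print(f"k = {k:.1f}, gamma = {gamma:.3f}")
    print(f"extrapolated size at 1,000 genomes: {predict_pan_size(k, gamma, 1000):.0f}")
```

Under this model, an exponent between 0 and 1 means the pan-genome keeps growing as more genomes are added (an "open" pan-genome), which is the qualitative picture the abstract describes. Core-genome curves are usually fitted separately with a decaying function.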
Funding: Supported by grants from the National Key R&D Program of China (Grant Nos. 2017YFC0908300 and 2016YFC1303200); the National Natural Science Foundation of China (Grant Nos. 81772505 and J1210047); the Shanghai Science and Technology Committee (Grant No. 18411953100); the Cross-Institute Research Fund of Shanghai Jiao Tong University (Grant No. YG2017ZD01); the Innovation Foundation of Translational Medicine of Shanghai Jiao Tong University School of Medicine (Grant Nos. 15ZH4001, TM201617, and TM201702); Neil Shen's SJTU Medical Research Fund; and the SJTU-Yale Collaborative Research Seed Fund.
Abstract: Recently, a report explaining the construction of a Human Pangenome ANalysis (HUPAN) system attracted wide attention in the biomedicine community. The original article was published in Genome Biology, a leading international journal in genomics [1]. Many researchers, particularly those conducting basic medical research.
Funding: Funding from several sources, including the Chongqing Scientific Research Institution Performance Incentive Project (grant number cstc2022jxjl80007); the Earmarked Fund for China Agriculture Research System (grant number CARS-42-51); the Chongqing Scientific Research Institution Performance Incentive Project (grant number 22527 J); the Key R&D Project in Agriculture and Animal Husbandry of Rongchang (grant number 22534C-22); the Natural Science Foundation of Chongqing Project (grant number CSTB2022NSCQ-MSX0434); the Natural Science Foundation of Sichuan Project (grant numbers 2022NSFSC0605 and 2021YFS0379); and the Chongqing Technology Innovation and Application Development Project (grant number cstc2021ycjh-bgzxm0248).
Abstract: Background: Domestic goose breeds are descended from either the Swan goose (Anser cygnoides) or the Greylag goose (Anser anser) and vary in body size, reproductive performance, egg production, feather color, and other phenotypic traits. Constructing a pan-genome facilitates a thorough identification of genetic variations, thereby deepening our comprehension of the molecular mechanisms underlying genetic diversity and phenotypic variability. Results: To support population genomic and pan-genomic analyses in geese, we assembled whole-genome resequencing data from 659 geese and compiled a database of 155 RNA-seq samples. By constructing the goose pan-genome, we generated non-reference contigs totaling 612 Mb, revealing 2,813 novel genes and identifying 15,567 core genes, 1,324 softcore genes, 2,734 shell genes, and 878 cloud genes in goose genomes. Furthermore, we detected an 81.97 Mb genomic region showing signs of selection, encompassing the TGFBR2 gene, which is correlated with variation in body weight among geese. Genome-wide association studies using single nucleotide polymorphisms (SNPs) and presence-absence variation revealed significant genomic associations with various goose meat quality, reproductive, and body composition traits. For instance, a gene encoding the SVEP1 protein was linked to carcass oblique length, and a distinct gene-CDS haplotype of the SVEP1 gene was also associated with carcass oblique length. Notably, the pan-genome analysis revealed enrichment of variable genes in the "hair follicle maturation" Gene Ontology term, potentially linked to the selection of feather-related traits in geese. A gene presence-absence variation analysis suggested a reduced frequency of genes associated with "regulation of heart contraction" in domesticated geese compared to their wild counterparts. Our study provided novel insights into gene expression features and functions by integrating gene expression patterns across multiple organs and tissues in geese and analyzing population variation. Conclusion: By identifying numerous selection signals and candidate genes associated with a wide array of traits, this work markedly enhances our understanding of the processes underlying domestication and breeding in geese. Moreover, assembling the goose pan-genome has yielded a comprehensive understanding of the goose genome, establishing it as an indispensable resource that can offer new perspectives and contribute substantially to future goose breeding initiatives.
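The core, softcore, shell, and cloud categories reported in the goose abstract above are usually assigned from how often each gene is present across the sampled individuals. The following is a minimal sketch of that idea, not the authors' pipeline: the thresholds (`core`, `softcore`, `shell`), the function names, and the toy data are all assumptions chosen for illustration, and published studies use differing cut-offs.

```python
from collections import Counter


def classify_gene(presence_fraction, core=1.0, softcore=0.99, shell=0.01):
    """Assign a pan-genome category from the fraction of individuals carrying a gene.

    The default thresholds are illustrative assumptions, not values from the study.
    """
    if presence_fraction >= core:
        return "core"
    if presence_fraction >= softcore:
        return "softcore"
    if presence_fraction >= shell:
        return "shell"
    return "cloud"


def summarize(pav):
    """Count genes per category; pav maps gene name -> list of 0/1 presence calls."""
    counts = Counter()
    for calls in pav.values():
        counts[classify_gene(sum(calls) / len(calls))] += 1
    return dict(counts)


# Toy presence-absence matrix for 200 hypothetical individuals.
toy_pav = {
    "geneA": [1] * 200,              # present in every individual
    "geneB": [1] * 199 + [0],        # missing from one individual
    "geneC": [1] * 100 + [0] * 100,  # present in half
    "geneD": [1] + [0] * 199,        # present in a single individual
}
print(summarize(toy_pav))
```

With these assumed thresholds, each toy gene falls into a different category; changing the cut-offs shifts genes between the softcore, shell, and cloud bins.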
Funding: Supported by the National Natural Science Foundation of China (NSFC grants U23A20210, 32102382, and 32102386); the China Agricultural Research System (CARS-23-A15); the Central Public-interest Scientific Institution Basal Research Fund (Y2024QC05); the Innovation Program of the Chinese Academy of Agricultural Sciences; and the Key Laboratory of Biology and Genetic Improvement of Horticultural Crops, Ministry of Agriculture and Rural Affairs, China.
Abstract: The Solanaceae family contains many agriculturally important crops, including tomato, potato, pepper, and tobacco, as well as other species with potential for agricultural development, such as the orphan crops groundcherry, wolfberry, and pepino. Research progress varies greatly among these species, with model crops like tomato being far ahead. This disparity limits the broader agricultural application of other Solanaceae species. In this study, we constructed an interspecies pan-genome for the Solanaceae family and identified various gene retention patterns. Our findings reveal that the activity of specific transposable elements is closely associated with gene fractionation and transposition. The pan-genome was further resolved at the level of T subgenomes, which were generated by the Solanaceae-specific paleohexaploidization (T event). We demonstrate substantial gene fractionation (loss) and divergence events following ancient duplications. For example, all class A and E flower model genes in Solanaceae originated from two tandemly duplicated genes, which expanded through the γ and T events before fractionating into 10 genes in tomato, each acquiring distinct functions critical for fruit development. Based on these results, we developed the Solanaceae Pan-Genome Database (SolPGD, http://www.bioinformaticslab.cn/SolPGD), which integrates datasets from both inter- and intra-species pan-genomes of Solanaceae. These findings and resources will facilitate future studies of solanaceous species, including orphan crops.
Funding: Supported by grants from the National Key R&D Program of China (2018YFD1000501); the National Natural Science Foundation of China-CG joint foundation (3181101517); and the startup funds for the double first-class disciplines of crop science in Hainan University (RZ2100003362).
Abstract: Cassava is a highly resilient tropical crop that produces large, starchy storage roots and high biomass. However, how cassava's remarkable environmental adaptability and key economic traits evolved from its wild species remains unclear. In this study, we obtained near-complete telomere-to-telomere genome assemblies and their haplotype forms for the cultivar AM560 and the wild ancestors FLA4047 and W14, constructed a graphic pan-genome of 30 representatives with a size of 1.15 Gb, and built a clarified evolutionary tree of 486 accessions. A comparison of structural variations and single-nucleotide variations between the ancestors and cultivated cassavas reveals predominant expansions and contractions in the numbers of genes and gene families, which are mainly driven by transposons. Significant selective sweeps occurred in 122 genomic footprints and affect 1,519 domesticated genes. We identify selective mutations in MeCSK and MeFNR2 that could promote photoreactions associated with MeNADP-ME in C4 photosynthesis in modern cassava. Coevolution of retarded floral primordia and the initiation of storage roots may arise from MeCOL5 variants with altered binding to MeFT1, MeFT2, and MeTFL2. Mutations in MeMATE1 and MeGTR occur in sweet cassava, and MeAHL19 has evolved to regulate the biosynthesis, transport, and endogenous remobilization of cyanogenic glucosides in cassava. The extensive genomic and gene resources provided here, along with the findings on the evolutionary mechanisms responsible for beneficial traits in modern cultivars, lay a strong foundation for future breeding improvement of cassava.
Funding: Supported by the Agriculture Seed Improvement Project of Shandong Province (2023LZGVC012 and 2019LZGC009).
Abstract: Ash trees (Fraxinus) exhibit rich genetic diversity and wide adaptation to various ecological environments, and several species are highly salt tolerant. Dissecting the genomic basis of salt adaptation in Fraxinus is vital for its resistance breeding. Here, we present 11 high-quality chromosome-level genome assemblies for Fraxinus species, which reveal two unequal subgenome compositions and two recent whole-genome triplication events in their evolutionary history. A Fraxinus pan-genome was constructed on the basis of structural variations and revealed that presence–absence variations (PAVs) of transmembrane transport genes have likely contributed to salt adaptation in Fraxinus. Through whole-genome resequencing of an F1 population from an interspecies cross of F. velutina ‘Lula 3’ (salt tolerant) with F. pennsylvanica ‘Lula 5’ (salt sensitive), we mapped salt-tolerance PAV-based quantitative trait loci (QTLs) and pinpointed two PAV-QTLs and candidate genes associated with Fraxinus salt tolerance. Mechanistically, FvbHLH85 enhances salt tolerance by mediating reactive oxygen species and Na⁺/K⁺ homeostasis, whereas FvSWEET5 enhances salt tolerance by mediating osmotic homeostasis. Collectively, these findings provide valuable genomic resources for Fraxinus salt-resistance breeding and the research community.
Funding: Supported by the National Key Research and Development Program of China (2023YFF1001000).
Abstract: Chickens are one of the most important domesticated animals, serving as an important protein source. Studying genetic variations in chickens to enhance their production performance is of great potential value. The emergence of next-generation sequencing has enabled precise analysis of single nucleotide polymorphisms and insertions/deletions in chickens, while third-generation sequencing achieves accurate structural variant identification. However, the high cost of third-generation sequencing technology limits its application in population studies. The graph-based pan-genome strategy can overcome this challenge by enabling the detection of structural variations using cost-effective next-generation sequencing data. This study constructed a graph-based pan-genome for chickens using 12 high-quality genomes. This pan-genome used the linear genome GRCg6a as the reference genome and contains variant information from two commercial and nine native chicken breeds. Compared to the linear genome, the pan-genome provided significant improvements in the efficiency of structural variation identification. On the basis of the graph-based pan-genome, high-frequency structural variations related to high egg production in Leghorn chickens were predicted. Additionally, potential structural variations were found to be associated with highland adaptation in Tibetan chickens according to next-generation sequencing and transcriptomics data. Using the pan-genome graph, a new strategy to identify structural variations related to traits of interest in chickens is presented.
Funding: Supported by the National Natural Science Foundation of China (31822052 and 31572381) and the Science & Technology Support Program of Sichuan (2016NYZ0042 and 2017NZDZX0002).
Abstract: Pigs were domesticated independently in the Near East and China, indicating that a single reference genome from one individual is unable to represent the full spectrum of divergent sequences in pigs worldwide. Therefore, 12 de novo pig assemblies from Eurasia were compared in this study to identify the sequences missing from the reference genome. As a result, 72.5 Mb of nonredundant sequences (~3% of the genome) were found to be absent from the reference genome (Sscrofa11.1) and were defined as pan-sequences. Of the pan-sequences, 9.0 Mb were dominant in Chinese pigs, in contrast with their low frequency in European pigs. One sequence dominant in Chinese pigs contained the complete genic region of tazarotene-induced gene 3 (TIG3), which is involved in fatty acid metabolism. Using flanking sequences and Hi-C based methods, 27.7% of the sequences could be anchored to the reference genome. The supplementation of these sequences could contribute to the accurate interpretation of the 3D chromatin structure. A web-based pan-genome database was further provided to serve as a primary resource for exploring genetic diversity and to promote pig breeding and biomedical research.
Funding: Supported by the National Key Research and Development Program of China (2016YFD0100204-02, 2013BAD01B04-1); the National Natural Science Foundation of China (31772301, 31772303, and 31801858); and the Technology Innovation Program of the Chinese Academy of Agricultural Sciences (CAAS-ASTIP-2019-IVFCAAS, CAAS-XTCX2016016-4-4, and CAAS-XTCX2016001-5-2).
Abstract: Post-polyploid diploidization associated with descending dysploidy and interspecific introgression drives plant genome evolution by unclear mechanisms. Raphanus is an economically and ecologically important Brassiceae genus and a model system for studying post-polyploidization genome evolution and introgression. Here, we report the de novo sequence assemblies for 11 genomes covering most of the typical sub-species and varieties of domesticated, wild, and weedy radishes from East Asia, South Asia, Europe, and America. Divergence among the species, sub-species, and South/East Asian types coincided with Quaternary glaciations. A genus-level pan-genome was constructed with family-based, locus-based, and graph-based methods, and whole-genome comparisons revealed genetic variations ranging from single-nucleotide polymorphisms (SNPs) to inversions and translocations of whole ancestral karyotype (AK) blocks. Extensive gene flow occurred between wild, weedy, and domesticated radishes. High frequencies of genome reshuffling, biased retention, and large-fragment translocation have shaped the genomic diversity. Most variety-specific gene-rich blocks showed large structural variations. Extensive translocation and tandem duplication of dispensable genes were revealed in two large rearrangement-rich islands. Disease resistance genes mostly resided on specific and dispensable loci. Variations causing the loss of function of enzymes modulating gibberellin deactivation were identified and could play an important role in phenotype divergence and adaptive evolution. This study provides new insights into the genomic evolution underlying post-polyploid diploidization and lays the foundation for genetic improvement of radish crops, biological control of weeds, and protection of wild species' germplasms.
Funding: Supported in part by the 2021 Research Program of Sanya Yazhou Bay Science and Technology City (SKJC-2021-02-001); the Leading Innovative and Entrepreneur Team Introduction Program of Zhejiang (2019R01002); and the Fundamental Research Funds for the Central Universities (226-2022-00100 and 2022QZJH43).
Abstract: Structural variations (SVs) have long been described as being involved in the origin, adaptation, and domestication of species. However, the underlying genetic and genomic mechanisms are poorly understood. Here, we report a high-quality genome assembly of Gossypium barbadense acc. Tanguis, a landrace that is closely related to the formation of extra-long-staple (ELS) cultivated cotton. An SV-based pan-genome (Pan-SV) was then constructed using a total of 182,593 non-redundant SVs, including 2,236 inversions, 97,398 insertions, and 82,959 deletions from 11 assembled genomes of allopolyploid cotton. The utility of this Pan-SV was then demonstrated through population structure analysis and genome-wide association studies (GWASs). Using segregating mapping populations produced by crossing ELS cotton and the landrace, along with an SV-based GWAS, certain SVs responsible for speciation, domestication, and improvement in tetraploid cottons were identified. Importantly, some of the SVs identified here as associated with yield and fiber quality improvement had not been identified in previous SNP-based GWASs. In particular, a 9-bp insertion or deletion was found to be associated with elimination of the interspecific reproductive isolation between Gossypium hirsutum and G. barbadense. Collectively, this study provides new insights into genome-wide, gene-scale SVs linked to important agronomic traits in a major crop species and highlights the importance of SVs during the speciation, domestication, and improvement of cultivated crop species.
Funding: Supported by grants from the National Key Research and Development Program of China (2022YFF1003001); the National Natural Science Foundation of China (32072576); the National Modern Agriculture Industry Technology System (CARS-23-G42); the Jiangsu Provincial Key Research and Development Program (BE2021376); the Innovation Program of the Beijing Academy of Agricultural and Forestry Sciences (KJCX20230121); and the Collaborative Innovation Program for Leafy and Root Vegetables of the Beijing Vegetable Research Center, Beijing Academy of Agricultural and Forestry Sciences (XTCX202302).
Abstract: The domestication of Brassica oleracea has resulted in diverse morphological types with distinct patterns of organ development. Here we report a graph-based pan-genome of B. oleracea constructed from high-quality genome assemblies of different morphotypes. The pan-genome harbors over 200 structural variant hotspot regions enriched in auxin- and flowering-related genes. Population genomic analyses revealed that early domestication of B. oleracea focused on leaf or stem development. Gene flows resulting from agricultural practices and variety improvement were detected among different morphotypes. Selective-sweep and pan-genome analyses identified an auxin-responsive small auxin up-regulated RNA gene and a CLAVATA3/ESR-RELATED family gene as crucial players in leaf–stem differentiation during the early stage of B. oleracea domestication, and the BoKAN1 gene as instrumental in shaping the leafy heads of cabbage and Brussels sprouts. Our pan-genome and functional analyses further revealed that variations in the BoFLC2 gene play key roles in the divergence of vernalization and flowering characteristics among different morphotypes, and that variations in the first intron of BoFLC3 are involved in fine-tuning the flowering process in cauliflower. This study provides a comprehensive understanding of the pan-genome of B. oleracea and sheds light on the domestication and differential organ development of this globally important crop species.
Funding: Supported by the National Natural Science Foundation of China (32188102, 32372148); the Innovation Program of the Chinese Academy of Agricultural Sciences; the Youth Innovation of the Chinese Academy of Agricultural Sciences (Y20230C36); the Guangdong Basic and Applied Basic Research Foundation (2023B1515020053); and the Youth Program of Guangdong Basic and Applied Research (2021A1515111123).
Abstract: Rice (Oryza sativa) is a significant crop worldwide with a genome shaped by various evolutionary factors. Rice centromeres are crucial for chromosome segregation and contain some unreported genes. Because the centromere region is diverse and complex, a comprehensive understanding of rice centromere structure and function at the population level is needed. We constructed a high-quality centromere map based on the rice super pan-genome, consisting of a 251-accession panel comprising both cultivated and wild species of Asian and African rice. We showed that rice centromeres have diverse CentO satellite repeats, which vary across chromosomes and subpopulations, reflecting their distinct evolutionary patterns. We also revealed that long terminal repeats (LTRs), especially young Gypsy-type LTRs, are abundant in the peripheral CentO-enriched regions and drive rice centromere expansion and evolution. Furthermore, high-quality genome assemblies and a complete telomere-to-telomere (T2T) reference genome enable us to obtain more centromeric genome information, even though mapping and cloning of centromere genes remain challenging. We investigated the association between structural variations and gene expression in the rice centromere. A centromere gene, OsMAB, which positively regulates rice tiller number, was further confirmed by expression quantitative trait loci, haplotype analysis, and clustered regularly interspaced palindromic repeats (CRISPR)/CRISPR-associated protein 9 methods. By revealing new insights into the evolutionary patterns and biological roles of rice centromeres, our findings will facilitate future research on centromere biology and crop improvement.
Funding: Supported by the National Key R&D Program of China (2019YFD1000700/2019YFD1000702); the China Agricultural Research System (CARS-08-G3); the Key Research and Development Program of Hebei (21326305D); the Hebei Agriculture Research System (HBCT2018070203); and the Hebei Talent Project.
Abstract: Mung bean is an economically important legume crop species that is used as a food, consumed as a vegetable, and used as an ingredient and even as a medicine. To explore the genomic diversity of mung bean, we assembled a high-quality reference genome (Vrad_JL7) that was 479.35 Mb in size, with a contig N50 length of 10.34 Mb. A total of 40,125 protein-coding genes were annotated, representing 96.9% of the genetic region. We also sequenced 217 accessions, mainly landraces and cultivars from China, and identified 2,229,343 high-quality single-nucleotide polymorphisms (SNPs). Population structure revealed that the Chinese accessions diverged into two groups and were distinct from non-Chinese lines. Genetic diversity analysis based on genomic data from 750 accessions in 23 countries supported the hypothesis that mung bean was first domesticated in south Asia and introduced to east Asia, probably through the Silk Road. We constructed the first pan-genome of mung bean germplasm and assembled 287.73 Mb of non-reference sequences. Among the genes, 83.1% were core genes and 16.9% were variable. Presence/absence variation (PAV) events of nine genes involved in the regulation of the photoperiodic flowering pathway were identified as being under selection during the adaptation process to promote early flowering in the spring. Genome-wide association studies (GWASs) revealed 2,912 SNPs and 259 gene PAV events associated with 33 agronomic traits, including a SNP in the coding region of the SWEET10 homolog (jg24043) involved in crude starch content and a PAV event in a large fragment containing 11 genes for color-related traits. This high-quality reference genome and pan-genome will provide insights into mung bean breeding.
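The "contig N50" quoted for the mung bean assembly above is a standard contiguity statistic: the contig length at which the longest contigs, taken in descending order, first cover at least half of the total assembly. A minimal sketch follows; the function name `n50` and the toy contig lengths are illustrative assumptions, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of the
    contig at which the running total first reaches half of the assembly size.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer comparison avoids float rounding
            return length


# Toy assembly with a total length of 30 (arbitrary units).
toy_contigs = [10, 8, 6, 4, 2]
print(n50(toy_contigs))  # 10 + 8 = 18 covers half of 30, so this prints 8
```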
Funding: Supported by the National Natural Science Foundation of China (32100500); the Natural Science Foundation of Hebei Province (C2021201048); and the Interdisciplinary Research Program of Natural Science of Hebei University.
Abstract: Pan-genomics can encompass most of the genetic diversity of a species or population and has proved to be a powerful tool for studying genomic evolution and the origin and domestication of species, and for providing information for plant improvement. Plant genomics has progressed greatly because of improvements in sequencing technologies and the rapid reduction of sequencing costs. Nevertheless, pan-genomics still presents many challenges, including computationally intensive assembly methods, high costs with large numbers of samples, ineffective integration of big data, and difficulty in applying it to downstream multi-omics analysis and breeding research. In this review, we summarize the definition and recent achievements of plant pan-genomics, computational technologies used for pan-genome construction, and the applications of pan-genomes in plant genomics and molecular breeding. We also discuss challenges and perspectives for future pan-genomics studies and provide a detailed pipeline for sample selection, genome assembly and annotation, structural variation identification, and construction and application of graph-based pan-genomes. The aim is to provide important guidance for plant pan-genome research and a better understanding of the genetic basis of genome evolution, crop domestication, and phenotypic diversity for future studies.
Funding: Supported by the Chinese Academy of Sciences "Strategic Priority Research Program" (XDA24040201); the National Key Research and Development Program of China (2020YFE0202300); and the State Key Laboratory of Plant Genomics.
Abstract: The wild rice species in the genus Oryza harbor a large amount of genetic diversity that has been untapped for rice improvement. Pan-genomics has revolutionized genomic research in plants. However, rice pan-genomic studies so far have been limited mostly to cultivated accessions, with only a few close wild relatives. Advances in sequencing technologies have permitted the assembly of high-quality rice genome sequences at low cost, making it possible to construct genus-level pan-genomes across all species. In this review, we summarize progress in current research on genetic and genomic resources in Oryza, and in the sequencing and computational technologies used for rice genome and pan-genome construction. For future work, we discuss the approaches and challenges in the construction of, and data access to, Oryza pan-genomes based on representative high-quality genome assemblies. The Oryza pan-genomes will provide a basis for the exploration and use of the extensive genetic diversity present in both cultivated and wild rice populations.
Funding: Strategic Priority Research Program of the Chinese Academy of Sciences (No. XDA24030205); National Natural Science Foundation of China (Nos. U21A20246, 32102511); National Key Research and Development Program-Key Projects (2021YFD1200900 and 2021YFD1300904); Second Tibetan Plateau Scientific Expedition and Research Program (STEP) (No. 2019QZKK0501); Biological Breeding-National Science and Technology Major Project (2023ZD0407106); and Chinese Universities Scientific Fund (2024TC162).
Abstract: Background: Unveiling genetic diversity features and understanding the genetic mechanisms of diverse goat phenotypes are pivotal in facilitating the preservation and utilization of these genetic resources. However, the total genetic diversity within a species cannot be captured by the reference genome of a single individual. The pan-genome is a collection of all the DNA sequences that occur in a species, and it is expected to capture the total genomic diversity of that species. Results: We constructed a goat pan-genome using the map-to-pan assembly strategy based on 813 individuals, including 723 domestic goats and 90 samples from their wild relatives, which provided broad regional and global representation. In total, 146 Mb of sequences and 974 genes were identified as absent from the reference genome (ARS1.2; GCF_001704415.2). We identified 3,190 novel single nucleotide polymorphisms (SNPs) using the pan-genome analysis. These novel SNPs could properly reveal the population structure of domestic goats and their wild relatives. Presence/absence variation (PAV) analysis revealed gene loss and intense negative selection during domestication and improvement. Conclusions: Our research highlights the importance of the goat pan-genome in capturing the missing genetic variations. It reveals the changes in genomic architecture during goat domestication and improvement, such as gene loss. This improves our understanding of the evolutionary and breeding history of goats.
Funding: Funded by the National Key Research and Development Program (2022YFA1304200); the Agricultural Science and Technology Innovation Program (CAAS-ASTIP-2023-IFR-04 and CAAS-ZDRW202305); and the Beijing Innovation Consortium of Livestock Research System (BAIC05-2023).
Abstract: Background: Rumen microorganisms are key regulators of ruminant growth and production performance. Identifying probiotic candidates through microbial culturomics presents a promising strategy for improving ruminant production performance. Our previous study identified significant differences in the rumen microbial communities of Holstein calves with varying average daily gain (ADG). This study aims to identify a target strain based on the findings from multi-omics analysis and literature review, isolating and evaluating the target microbial strains from both the rumen and hindgut contents for their probiotic potential. Results: Parabacteroides distasonis, a strain closely associated with ADG, was successfully isolated from calf rumen content cultured with Fastidious Anaerobe Agar (FAA) medium and named Parabacteroides distasonis F4. Whole-genome sequencing and pan-genome analysis showed that P. distasonis F4 possesses a core functional potential for carbohydrate and amino acid metabolism, with the ability to produce propionate, acetate, and lactate. The results of targeted and untargeted metabolomics further validated the organic acid production and metabolic pathways of P. distasonis F4. An in vitro simulated rumen fermentation test showed that supplementation with P. distasonis F4 significantly altered rumen microbial community structure and increased the molar proportions of propionate and butyrate in the rumen. Furthermore, an in vivo study demonstrated that dietary supplementation with P. distasonis F4 significantly increased the ADG of pre-weaning calves. Conclusions: This study represents the first isolation of P. distasonis F4 from the rumen, highlighting its potential as a probiotic strain for improving rumen development and growth performance in ruminants.
Funding: Supported by the National Science and Technology Major Project (2021YFF1201200); the National Natural Science Foundation of China (62372316); and the Sichuan Science and Technology Program key project (2024YFHZ0091).
Abstract: As a common foodborne pathogen, Salmonella poses risks to public health safety, especially given the emergence of antimicrobial-resistant strains. However, there is currently a lack of systematic platforms based on large language models (LLMs) for Salmonella resistance prediction, data presentation, and data sharing. To overcome this issue, we first propose a two-step feature-selection process based on the chi-square test and conditional mutual information maximization to find the key Salmonella resistance genes in a pan-genomics analysis, and develop an LLM-based Salmonella antimicrobial-resistance predictive (SARPLLM) algorithm, based on the Qwen2 LLM and low-rank adaptation, to achieve accurate antimicrobial-resistance prediction. Secondly, we optimize the time complexity of computing the sample distance from the linear to the logarithmic level by constructing a quantum data augmentation algorithm denoted as QSMOTEN. Thirdly, we build a user-friendly Salmonella antimicrobial-resistance predictive online platform based on knowledge graphs, which not only facilitates online resistance prediction for users but also visualizes the pan-genomics analysis results of the Salmonella datasets.
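The first step of the feature selection described in the Salmonella abstract above, a chi-square test between gene presence and resistance, can be illustrated with a small sketch. This is not the SARPLLM implementation: it covers only a chi-square ranking on binary presence-absence calls (the conditional mutual information maximization step is omitted), and the function names, gene names, and toy data are hypothetical.

```python
def chi_square_2x2(gene_present, resistant):
    """Pearson chi-square statistic (no continuity correction) for two binary vectors."""
    a = b = c = d = 0
    for g, r in zip(gene_present, resistant):
        if g and r:
            a += 1  # gene present, isolate resistant
        elif g:
            b += 1  # gene present, isolate susceptible
        elif r:
            c += 1  # gene absent, isolate resistant
        else:
            d += 1  # gene absent, isolate susceptible
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a constant column carries no information
    return n * (a * d - b * c) ** 2 / denom


def rank_genes(pav, resistant, top_k):
    """Return the top_k gene names ordered by decreasing chi-square score."""
    scores = {gene: chi_square_2x2(calls, resistant) for gene, calls in pav.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy data: eight hypothetical isolates, the first four resistant.
resistant = [1, 1, 1, 1, 0, 0, 0, 0]
toy_pav = {
    "geneX": [1, 1, 1, 1, 0, 0, 0, 0],  # tracks resistance exactly
    "geneY": [1, 0, 1, 0, 1, 0, 1, 0],  # independent of resistance
    "geneZ": [1, 1, 1, 0, 0, 0, 0, 1],  # partially associated
}
print(rank_genes(toy_pav, resistant, top_k=2))
```

In practice the retained genes would then be passed to a second filter and to the downstream classifier; the ranking here only shows how a presence-absence column is scored against a binary phenotype.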
Funding: Supported by the Deutsche Forschungsgemeinschaft (MA6473/1-1, MA6473/2-1).
Abstract: Many of our major crop species are polyploids, containing more than one genome or set of chromosomes. Polyploid crops present unique challenges, including difficulties in genome assembly, in discriminating between multiple gene and sequence copies, and in genetic mapping, hindering use of genomic data for genetics and breeding. Polyploid genomes may also be more prone to containing structural variation, such as loss of gene copies or sequences (presence–absence variation) and the presence of genes or sequences in multiple copies (copy-number variation). Although the two main types of genomic structural variation commonly identified are presence–absence variation and copy-number variation, we propose that homeologous exchanges constitute a third major form of genomic structural variation in polyploids. Homeologous exchanges involve the replacement of one genomic segment by a similar copy from another genome or ancestrally duplicated region, and are known to be extremely common in polyploids. Detecting all kinds of genomic structural variation is challenging, but recent advances such as optical mapping and long-read sequencing offer potential strategies to help identify structural variants even in complex polyploid genomes. All three major types of genomic structural variation (presence–absence, copy-number, and homeologous exchange) are now known to influence phenotypes in crop plants, with examples of flowering time, frost tolerance, and adaptive and agronomic traits. In this review, we summarize the challenges of genome analysis in polyploid crops, describe the various types of genomic structural variation and the genomics technologies and data that can be used to detect them, and collate information produced to date related to the impact of genomic structural variation on crop phenotypes. We highlight the importance of genomic structural variation for the future genetic improvement of polyploid crops.
Funding: Supported by the Natural Science Foundation of Zhejiang Province of China (Y3090150); the Fundamental Research Funds for the Central Universities, China; the Zhejiang Provincial Project, China (2010R10091); the Research Project for Commonweal Industry of Agricultural Ministry, China (nyhyzx 201003029 201003066); the Specialized Research Fund for the Doctoral Program of Higher Education, China (20090101120083); and the Key Subject Construction Program for Modern Agricultural Biotechnology and Crop Disease Control of Zhejiang, China.
Abstract: Horizontal gene transfer (HGT) plays key roles in the evolution of pathogenic bacteria, especially for pathogenicity-associated genes. In this study, the evolutionary dynamics of Xanthomonas at the species level were determined by comparative analysis of the complete genomes of 15 Xanthomonas strains. A concatenated multiprotein phyletic pattern and a dataset of Xanthomonas clusters of orthologous genes were constructed. Mathematical extrapolation estimates that the core genome will reach a minimum of about 1,547 genes, while the pan-genome will increase to 22,624 genes when 1,000 genomes have been sequenced. The extent of HGT in this genus was assessed using a Markov-based probabilistic method. The reconstructed gene gain/loss history, which contained several features consistent with biological observations, showed that nearly 60% of the Xanthomonas genes were acquired by HGT. A large fraction of the variability was in the clade ancestor nodes and the "leaves of the tree". Coexpression analysis suggested that the pathogenic and metabolic variation between Xanthomonas oryzae pv. oryzicola and Xanthomonas oryzae pv. oryzae might be due to recently transferred genes. Our results strongly support the view that gene gain/loss may play an important role in the divergence and pathogenicity variation of Xanthomonas species.
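The core-genome and pan-genome estimates in the Xanthomonas abstract above come from extrapolating how the two sizes change as genomes are added. The abstract does not specify the model, so the sketch below shows only the underlying bookkeeping under stated assumptions: genomes are represented as sets of ortholog-cluster identifiers and added in one fixed order. The function name and toy clusters are hypothetical; real analyses typically average over many genome orderings and then fit a curve to extrapolate.

```python
def accumulation_curves(gene_sets):
    """Return (pan_sizes, core_sizes) as genomes are added in the given order.

    gene_sets is a list of sets of gene-cluster identifiers, one per genome.
    The pan-genome is the running union; the core genome is the running intersection.
    """
    pan = set()
    core = None
    pan_sizes, core_sizes = [], []
    for genes in gene_sets:
        pan |= genes
        core = set(genes) if core is None else core & genes
        pan_sizes.append(len(pan))
        core_sizes.append(len(core))
    return pan_sizes, core_sizes


# Toy example with three hypothetical genomes.
toy_genomes = [
    {"a", "b", "c"},
    {"a", "b", "d"},
    {"a", "c", "e", "f"},
]
pan_sizes, core_sizes = accumulation_curves(toy_genomes)
print(pan_sizes, core_sizes)
```

The pan-genome curve can only grow and the core curve can only shrink as genomes are added, which is why the abstract reports a maximum-style estimate for the former and a minimum for the latter.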