Funding: Supported by the China Agriculture Research System (CARS-Citrus), the Fundamental Research Funds for the Central Universities (Grant No. SWU120021), the National Natural Science Foundation of China (Nos. 32060664, 31901995), funds from Fujian Agriculture and Forestry University, and the Fundamental Research Funds for the Central Universities (SWU-XDJH202308).
Abstract: The publication of several high-quality genomes has contributed greatly to clarifying the evolution of citrus. However, due to their complex genetic backgrounds, the origins and evolution of many citrus species remain unclear. We assembled de novo the 294-Mbp chromosome-level genome of a more than 200-year-old primitive papeda (DYC002). Comparison between the two sets of homologous chromosomes of the haplotype-resolved genome revealed 1.2% intragenomic variation, including 1.75 million SNPs, 149,471 insertions and 154,215 deletions. Using this genome as a reference, we resequenced 378 representative citrus accessions and performed population and phylogenetic analyses. Our study confirmed that the primary origin center of core Citrus species is in South China, particularly in the Himalaya-Hengduan Mountains. Papeda species are an ancient Citrus type compared with C. ichangensis. We found that the evolution of the Citrus genus followed two radiations through two routes (to East China and Southeast Asia) along river systems. Evidence for the origin and evolution of some individual citrus species was provided. Papeda probably played an important role in the origins of Australian finger lime, citrons, Honghe papeda and pummelos; Ichang papeda originated from Yuanjiang city of Yunnan Province, China; and C. mangshanensis has a close relationship with kumquat and Ichang papeda. Moreover, the Hunan and Guangdong Provinces of China are predicted to be the origin center of mandarin, sweet orange and sour orange. Additionally, our study revealed that fruit bitterness was significantly selected against during citrus domestication. Taken together, this study provides new insight into the origin and evolution of citrus species and may serve as a valuable genomic resource for citrus breeding and improvement.
Funding: Supported by the Science and Technology Commission of Shanghai Municipality, China (No. 23JS1400400), the National Natural Science Foundation of China (Nos. 32300484, 82171837), the Shanghai Municipal Science and Technology Major Project, China (Nos. 2017SHZDZX01, 2018SHZDZX01), and ZJLab (Shanghai, China).
Abstract: The major histocompatibility complex (MHC) region plays a crucial role in immune function and is implicated in various diseases and cancer immunoediting. However, its high polymorphism poses challenges for accurate genetic profiling using conventional reference genomes. Here, we present high-quality, haplotype-resolved assemblies of the MHC region in five widely used tumor cell lines: A549, HeLa, HepG2, K562, and U2OS. Numerous oncological studies employ these cell lines, ranging from basic molecular research to drug discovery and personalized medicine approaches. By integrating CRISPR-based targeted enrichment with 10x Genomics linked-read and PacBio HiFi long-read sequencing, we constructed MHC haplotypes for each cell line, providing a valuable resource for the research community. Using these assembled haplotypes as references, we characterize the aneuploidy of the MHC region in these cell lines, offering insights into the genetic landscape of this critical immunological locus. Our work addresses the urgent need for accurate MHC profiling in these widely used cell line models, enabling more precise interpretation of existing and future genomic and epigenomic data. This resource is expected to significantly enhance our understanding of tumor biology, immune responses, and the development of targeted therapies.
Funding: Funded by the National Key Research and Development Program of China (2022YFF1003100-02), the National Natural Science Foundation of China (32172511), the Jiangsu Agricultural Science and Technology Innovation Fund (CX(22)2025), the Natural Science Foundation of Jiangsu Province (BK20210397), the Seed Industry Promotion Project of Jiangsu (JBGS(2021)022), the Guidance Foundation of the Hainan Institute of Nanjing Agricultural University (NAUSY-MS08), the Project Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions, the Jiangsu Provincial Key Research and Development Program (BE2023365), and the Earmarked Fund for China Agriculture Research System (CARS-28). This study was supported by the High-Performance Computing Platform of the Bioinformatics Center, Nanjing Agricultural University.
Abstract: Hybrid crops often exhibit increased yield and greater resilience, yet the genomic mechanism(s) underlying hybrid vigor, or heterosis, remain unclear, hindering our ability to predict the expression of phenotypic traits in hybrid breeding. Here, we generated haplotype-resolved T2T genome assemblies of two pear hybrid varieties, ‘Yuluxiang’ (YLX) and ‘Hongxiangsu’ (HXS), which share the same maternal parent but differ in their paternal parents. We then used these assemblies to explore the genome-scale landscape of allele-specific expression (ASE) and to create a pangenome graph for pear. ASE was observed for close to 6000 genes in both hybrid cultivars. A subset of ASE genes related to aspects of fruit quality such as sugars, organic acids, and cuticular wax was identified, suggesting their important contributions to heterosis. Specifically, Ma1, a gene regulating fruit acidity, is absent in the paternal haplotypes of HXS and YLX. A pangenome graph was built based on our assemblies and seven published pear genomes. Resequencing data for 139 cultivated pear genotypes (including 97 genotypes sequenced here) were subsequently aligned to the pangenome graph, revealing numerous structural variant hotspots and selective sweeps during pear diversification. As predicted, the Ma1 allele was found to be absent in varieties with low organic acid content, and this association was functionally validated by Ma1 overexpression in pear fruit and calli. Overall, these results reveal the contributions of ASE to fruit-quality heterosis and provide a robust pangenome reference for high-resolution allele discovery and association mapping.
Funding: Supported by the National Key Research and Development Program of China (2019YFD1000403 and 2021YFD1200205-03), the Scientific Research Foundation for Principle Investigator, Kunpeng Institute of Modern Agriculture at Foshan (KIMA-QD2022004), and the Funding of Major Scientific Research Tasks, Kunpeng Institute of Modern Agriculture at Foshan (KIMA-ZDKY2022004). This work was also supported in part by the Chinese Academy of Agricultural Sciences Elite Youth Program to Zhiqiang Wu.
Abstract: Roses consistently rank at the forefront of cut flower production. Increasing market demand and changing climate conditions have created a need to further improve the diversity and quality of traits. However, frequent hybridization has made rose genomes highly heterozygous, including at the level of allelic variants. The absence of comprehensive genomic information therefore makes molecular breeding challenging. Here, two haplotype-resolved chromosome-level genomes are generated for Rosa chinensis ‘Chilong Hanzhu’ (2n = 14), a highly heterozygous diploid old Chinese rose. Substantial genetic variation (1,605,616 SNPs and 209,575 indels) is identified. A total of 13,971 allelic genes show differential expression patterns between the two haplotypes. Importantly, these differences offer valuable insights into the regulatory mechanisms of traits. RcMYB114b can influence cyanidin-3-glucoside accumulation, and allelic variation in its promoter leads to differences in promoter activity, which is one factor controlling petal color. Moreover, gene family expansion may contribute to the abundance of terpenes in floral scents. Additionally, the RcANT1, RcDA1, RcAG1 and RcSVP1 genes are involved in the regulation of petal number and size under heat stress treatment. This study provides a foundation for molecular breeding to improve important characteristics of roses.
Funding: Supported by the National Key R&D Program of China (2022YFC3400300), the National Natural Science Foundation of China (32125009, 62172325, 32070663), the National Key R&D Program of China (2017YFC0906501), the Key Construction Program of the National ‘985’ Project, and the Fundamental Research Funds for the Central Universities.
Abstract: The combined length and accuracy of high-fidelity (HiFi) reads enable chromosome-scale haplotype-resolved genome assembly. In this study, we used HiFi and Hi-C to sequence a cell line named HJ, established from a Chinese Han male individual. We assembled two high-quality haplotypes of the HJ genome (haplotype 1 (H1): 3.1 Gb; haplotype 2 (H2): 2.9 Gb). The continuity (H1: contig N50 = 28.2 Mb; H2: contig N50 = 25.9 Mb) and completeness (BUSCO: H1 = 94.9%, H2 = 93.5%) are substantially better than those of other Chinese genomes, for example, HX1, NH1.0, and YH2.0. By comparing the HJ genome with GRCh38, we report the mutation landscape of HJ and find that 176 and 213 N-gaps were filled in H1 and H2, respectively. In addition, we detected 12.9 Mb and 13.4 Mb of novel sequences containing 246 and 135 protein-coding genes in H1 and H2, respectively. Our results demonstrate the advantages of HiFi reads in haplotype-resolved genome assembly and provide two high-quality haplotypes of a potential Chinese genome as a reference for the Chinese Han population.
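The contig N50 values quoted in the abstract above are a standard continuity metric: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. A minimal sketch of that calculation follows; the function name `n50` and the toy contig lengths are illustrative assumptions and are not taken from the HJ study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Hypothetical toy assembly (arbitrary units), not real data.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

A higher N50 for the same total length means the assembly is broken into fewer, longer pieces, which is why the abstract reports it alongside BUSCO completeness.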
Funding: Supported by the National Natural Science Foundation of China (Nos. 32130092, 31872072, and 32102350) and the Liaoning Provincial Department of Education Scientific Research Funding Project (No. LJKZ0635), China.
Abstract: Objectives: The basic helix-loop-helix (bHLH) transcription factors (TFs) regulate fruit growth in many plants. However, there is no available study on the bHLH gene family in the haplotype-resolved genome of cultivated strawberry (Fragaria × ananassa). Materials and Methods: The 131 FabHLH genes identified in the haplotype-resolved genome of the strawberry cultivar ‘Yanli’ were classified into 24 subfamilies according to their phylogenetic relationships. Gene structure, conserved motifs, and chromosomal locations were investigated using bioinformatics. Results: In total, 15 FabHLH genes potentially involved in fruit development were screened based on transcriptome analysis of different stages of fruit development. We also identified the cis-regulatory elements of these 15 FabHLH genes, predicted upstream transcription factors, and identified protein-protein interactions. Conclusions: The findings of this study improve our understanding of the regulation mediated by bHLH TFs during strawberry fruit growth and maturation.
Funding: Supported by the Guangdong Major Project of Basic and Applied Basic Research (2021B0301030004) and the Agricultural Science and Technology Innovation Program (CAAS-ZDRW202101) to S.H. This work was also supported by the National Natural Science Foundation of China (grant nos. 31561143006 to G.L. and 32001601 to Y.L.).
Abstract: Potato (Solanum tuberosum) is the most consumed non-cereal food crop. Most commercial potato cultivars are autotetraploids with highly heterozygous genomes, severely hampering genetic analyses and improvement. By leveraging state-of-the-art sequencing technologies and polyploid graph binning, we achieved a chromosome-scale, haplotype-resolved genome assembly of a cultivated potato, Cooperation-88 (C88). Intra-haplotype comparative analyses revealed extensive sequence and expression differences in this tetraploid genome. We identified haplotype-specific pericentromeres on chromosomes, suggesting a distinct evolutionary trajectory of potato homologous centromeres. Furthermore, we detected double reduction events that are unevenly distributed on haplotypes in 1021 of 1034 selfing progeny, a feature of autopolyploid inheritance. By distinguishing maternal and paternal haplotype sets in C88, we simulated the origin of heterosis in cultivated tetraploid potato with a survey of 3110 tetra-allelic loci with deleterious mutations, which were masked in the heterozygous condition by two parents. This study provides insights into the genomic architecture of autopolyploids and will guide their breeding.
Funding: Supported by the Science and Technology Research Project of Henan (Grant No. 232102311003) and the National Natural Science Foundation of China (Grant No. U1804282).
Abstract: Since its initial release in 2001, the human reference genome has undergone continuous improvement in quality, and the recently released telomere-to-telomere (T2T) version, T2T-CHM13, reaches its highest level of continuity and accuracy after 20 years of effort by working on a simplified, nearly homozygous genome of a hydatidiform mole cell line. Here, to provide an authentic complete diploid human genome reference for the Han Chinese, the largest population in the world, we assembled the genome of a male Han Chinese individual, T2T-YAO, which includes T2T assemblies of all the 22+X+M and 22+Y chromosomes in both haploids. The quality of T2T-YAO is much better than that of all currently available diploid assemblies, and its haploid version, T2T-YAO-hp, generated by selecting the better assembly for each autosome, reaches the top quality of fewer than one error per 29.5 Mb, even higher than that of T2T-CHM13. Derived from an individual living in the aboriginal region of the Han population, T2T-YAO shows clear ancestry and potential genetic continuity from the ancient ancestors. Each haplotype of T2T-YAO possesses 330 Mb of exclusive sequences, 3100 unique genes, and tens of thousands of nucleotide and structural variations as compared with CHM13, highlighting the necessity of a population-stratified reference genome. The construction of T2T-YAO, an accurate and authentic representative of the Chinese population, would enable precise delineation of genomic variations and advance our understanding of the heritability of diseases and phenotypes, especially within the context of the unique variations of the Chinese population.
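Assembly accuracy figures such as "fewer than one error per 29.5 Mb" are often restated as a Phred-scaled quality value, QV = -10 · log10(error rate). The sketch below applies that general formula to the number quoted in the abstract; the function name `phred_qv` is an illustrative assumption, and the abstract itself does not state which QV estimator or tool the authors used.

```python
import math


def phred_qv(errors, bases):
    """Phred-scaled quality value for `errors` errors in `bases` bases."""
    if errors <= 0 or bases <= 0:
        raise ValueError("errors and bases must be positive")
    return -10 * math.log10(errors / bases)


# One error per 29.5 Mb, the bound quoted for T2T-YAO-hp.
qv = phred_qv(1, 29.5e6)
print(f"QV is roughly {qv:.1f}")
```

Because the abstract gives an upper bound on the error rate, the value computed this way should be read as a lower bound on the corresponding QV.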
Funding: Supported by the National Natural Science Foundation of China (Grant No. 32222019) and the National Key R&D Program of China (Grant No. 2021YFF1000900).
Abstract: Over the past 20 years, tremendous advances in sequencing technologies and computational algorithms have spurred plant genomic research into a thriving era, with hundreds of genomes already decoded, ranging from those of nonvascular plants to those of flowering plants. However, complex plant genome assembly is still challenging and remains difficult to fully resolve with conventional sequencing and assembly methods due to the high heterozygosity, highly repetitive sequences, or high ploidy of complex genomes. Herein, we summarize the challenges of and advances in complex plant genome assembly, including feasible experimental strategies, upgrades to sequencing technology, existing assembly methods, and different phasing algorithms. Moreover, we list actual cases of complex genome projects for readers to refer to and draw upon to solve future problems related to complex genomes. Finally, we expect that the accurate, gapless, telomere-to-telomere, and fully phased assembly of complex plant genomes could soon become routine.
Funding: Supported by grants from the National Key R&D Program of China (Grant No. 2017YFC0906501).
Abstract: The importance of structural variants (SVs) for human phenotypes and diseases is now recognized. Although a variety of SV detection platforms and strategies that vary in sensitivity and specificity have been developed, few benchmarking procedures are available to confidently assess their performance in biological and clinical research. To facilitate the validation and application of these SV detection approaches, we established an Asian reference material by characterizing the genome of an Epstein-Barr virus (EBV)-immortalized B lymphocyte line along with identified benchmark regions and high-confidence SV calls. We established a high-confidence SV callset with 8938 SVs by integrating four alignment-based SV callers, including 109× Pacific Biosciences (PacBio) continuous long reads (CLRs), 22× PacBio circular consensus sequencing (CCS) reads, 104× Oxford Nanopore Technologies (ONT) long reads, and 114× Bionano optical mapping platform, and one de novo assembly-based SV caller using CCS reads. A total of 544 randomly selected SVs were validated by PCR amplification and Sanger sequencing, demonstrating the robustness of our SV calls. Combining trio-binning-based haplotype assemblies, we established an SV benchmark for identifying false negatives and false positives by constructing the continuous high-confidence regions (CHCRs), which covered 1.46 gigabase pairs (Gb) and 6882 SVs supported by at least one diploid haplotype assembly. Establishing high-confidence SV calls for a benchmark sample that has been characterized by multiple technologies provides a valuable resource for investigating SVs in human biology, disease, and clinical research.