[Objective] The study aimed to investigate the genetic variation characters of entire sequences between two H9N2 subtype avian influenza virus strains and other reference strains.[Method] The entire sequences of 8 gen...[Objective] The study aimed to investigate the genetic variation characters of entire sequences between two H9N2 subtype avian influenza virus strains and other reference strains.[Method] The entire sequences of 8 genes were obtained by using RT-PCR,and these sequences were analyzed with that of six H9N2 subtype avian influenza isolates in homology comparison and genetic evolution relation.[Result] The results showed that the nucleotide sequence of entire gene of the strain shared 91.1%-95.4% homology with other seven reference strains,and PG08 shared the highest homology 91.3% with C/BJ/1/94;ZD06 shared the highest homology 92.3% with D/HK/Y280/97.HA cleavage sites of two H9N2 subtype avian influenza virus isolated strains were PARSSR/GLF,typical of mildly pathogenic avian influenza virus.[Conclusion] Phylogenetic tree for entire gene of eight strains showed that the genetic relationship was the closest between ZD06 and C/Pak/2/99 strains,which belonged to the Eurasian lineage;PG08 shared the highest homology 91.3% with ZD06,it may be the product of gene rearrangements of other sub-lines.展开更多
Chinese tree shrews (Tupaia belangeri chinensis) have become an increasingly important experimental animal in biomedical research due to their close relationship to primates. An accurately sequenced and assembled geno...Chinese tree shrews (Tupaia belangeri chinensis) have become an increasingly important experimental animal in biomedical research due to their close relationship to primates. An accurately sequenced and assembled genome is essential for understanding the genetic features and biology of this animal. In this study, we used long-read single-molecule sequencing and high-throughput chromosome conformation capture (Hi-C) technology to obtain a high-qualitychromosome-scale scaffolding of the Chinese tree shrew genome. The new reference genome (KIZ version 2: TS_2.0) resolved problems in presently available tree shrew genomes and enabled accurate identification of large and complex repeat regions, gene structures, and species-specific genomic structural variants. In addition, by sequencing the genomes of six Chinese tree shrew individuals, we produced a comprehensive map of 12.8 M single nucleotide polymorphisms and confirmed that the major histocompatibility complex (MHC) loci and immunoglobulin gene family exhibited high nucleotide diversity in the tree shrew genome. We updated the tree shrew genome database (TreeshrewDB v2.0: http://www.treeshrewdb.org) to include the genome annotation information and genetic variations. The new high-quality reference genome of the Chinese tree shrew and the updated TreeshrewDB will facilitate the use of this animal in many different fields of research.展开更多
Reliable and accurate pre-implantation genetic diagnosis (PGD) of patient's embryos by next-generation sequencing (NGS) is dependent on efficient whole genome amplification (WGA) of a representative biopsy samp...Reliable and accurate pre-implantation genetic diagnosis (PGD) of patient's embryos by next-generation sequencing (NGS) is dependent on efficient whole genome amplification (WGA) of a representative biopsy sample. However, the performance of the current state of the art WGA methods has not been evaluated for sequencing. Using low template DNA (15 pg) and single cells, we showed that the two PCR-based WGA systems SurePlex and MALBAC are superior to the REPLI-g WGA multiple displacement amplification (MDA) system in terms of consistent and reproducible genome coverage and sequence bias across the 24 chromosomes, allowing better normalization of test to reference sequencing data. When copy number variation sequencing (CNV-Seq) was applied to single cell WGA products derived by either SurePlex or MALBAC amplification, we showed that known disease CNVs in the range of 3-15 Mb could be reliably and accurately detected at the correct genomic positions. These findings indicate that our CNV-Seq pipeline incorporating either SurePlex or MALBAC as the key initial WGA step is a powerful methodology for clinical PGD to identify euploid embryos in a patient's cohort for uterine transplantation,展开更多
Common wheat is an important and widely cultivated food crop throughout the world.Much progress has been made in regard to wheat genome sequencing in the last decade.Starting from the sequencing of single chromosomes/...Common wheat is an important and widely cultivated food crop throughout the world.Much progress has been made in regard to wheat genome sequencing in the last decade.Starting from the sequencing of single chromosomes/chromosome arms whole genome sequences of common wheat and its diploid and tetraploid ancestors have been decoded along with the development of sequencing and assembling technologies. In this review, we give a brief summary on international progress in wheat genome sequencing, and mainly focus on reviewing the effort and contributions made by Chinese scientists.展开更多
Pyrola atropurpurea Franch is an important annual herbaceous plant.Few genomic analyses have been conducted on this plant,and chloroplast genome research will enrich its genomics basis.This study is based on high-thro...Pyrola atropurpurea Franch is an important annual herbaceous plant.Few genomic analyses have been conducted on this plant,and chloroplast genome research will enrich its genomics basis.This study is based on high-throughput sequencing technology and Bioinformatics methods to obtain the sequence,structure,and other characteristics of the P.atropurpurea chloroplast genome.The result showed that the chloroplast genome of P.atropurpurea has a double-stranded circular structure with a total length of 172,535 bp and a typical four-segment structure.The genome has annotated a total of 132 functional genes,including 43 tRNAs,8 rRNAs,76 protein-coding genes,and 5 pseudo-genes.In total,358 SSR loci were checked out,mainly composed of mononucleotide and trinucleotide repeat.There are three types of scattered repetitive sequences,totaling 4223,including 2452 forward repeats,1763 palindrome repeats,and eight reverse repeats.The optimal codon usage frequency is relatively high with AT usage preference in this genome.Chloroplast genome comparative analysis in the family Ericaceae shows that the overall sequence is more complex,and there are more variations in the gene interval region.The collinearity analysis indicated that there is a complex rearrangement of species between different genera in Ericaceae.The selection pressure analysis showed that the protein-encoding genes rpl33 and rps16 were positively selected among the seven medicinal plants in Ericaceae.The maximum likelihood tree shows that the genetic relationship among P.atropurpurea,Pyrola rotundifolia,and Chimaphila japonica is relatively close.Therefore,an important data basis was provided for species identification,genetic diversity,and phylogenetic studies of P.atropurpurea and even this genus of plants.展开更多
Rice(Oryza sativa)is a staple food for more than half of the world's population and a critical crop for global agriculture.Understanding the regulatory mechanisms that control gene expression in the rice genome is...Rice(Oryza sativa)is a staple food for more than half of the world's population and a critical crop for global agriculture.Understanding the regulatory mechanisms that control gene expression in the rice genome is fundamental for advancing agricultural productivity and food security.In mechanism,cis-regulatory elements(including promoters,enhancers,silencers,and insulators)are key DNA sequences whose activities determine the spatial and temporal expression patterns of nearby genes(Yocca and Edger,2022;Schmitz et al.,2022).展开更多
Soybean chlorotic mottle virus(SbCMV)was first detected from soybean plants in Jiangxi Province of China by high throughput sequencing and was confirmed by PCR.The complete nucleotide sequence of NC113 was determined ...Soybean chlorotic mottle virus(SbCMV)was first detected from soybean plants in Jiangxi Province of China by high throughput sequencing and was confirmed by PCR.The complete nucleotide sequence of NC113 was determined to be 8210 nucleotides,and shared the highest similarity(91.7%)with sequences of SbCMV that was only reported in Japan.It encodes nine putative open reading frames(ORFs Ia,Ib and Ⅱ-Ⅷ),and contains a large intergenic region located at nucleotide 5976-6512 between ORFs VI and VII.Sequence analysis and phylogenetic tree indicated that NC113 is an isolate of SbCMV,and is more related to the soymoviruses Blueberry red ringspot virus(BRRSV),Peanut chlorotic streak virus(PCSV)and Cestrum yellow leaf curling virus(CmYLCV)than to other representative members in the Caulimoviridae family.Field survey of 472 legume plants from Jiangxi and Zhejiang provinces showed SbCMV was only detected from soybean in Nanchang City with a low incidence rate.This is the first report of Soybean chlorotic mottle virus identified in China.展开更多
As a part of the Multinational Genome Sequencing Project of Brassica rapa, linkage group R9 and R3 were sequenced using a bacterial artificial chromosome (BAC) by BAC strategy. The current physical contigs are expec...As a part of the Multinational Genome Sequencing Project of Brassica rapa, linkage group R9 and R3 were sequenced using a bacterial artificial chromosome (BAC) by BAC strategy. The current physical contigs are expected to cover approximately 90% euchromatins of both chromosomes. As the project progresses, BAC selection for sequence extension becomes more limited because BAC libraries are restriction enzyme-specific. To support the project, a random sheared fosmid library was constructed. The library consists of 97536 clones with average insert size of approximately 40 kb corresponding to seven genome equivalents, assuming a Chinese cabbage genome size of 550 Mb. The library was screened with primers designed at the end of sequences of nine points of scaffold gaps where BAC clones cannot be selected to extend the physical contigs. The selected positive clones were end-sequenced to check the overlap between the fosmid clones and the adjacent BAC clones. Nine fosmid clones were selected and fully sequenced. The sequences revealed two completed gap filling and seven sequence extensions, which can be used for further selection of BAC clones confirming that the fosmid library will facilitate the sequence completion of B. rapa.展开更多
Objective Knowledge of an enterovirus genome sequence is very important in epidemiological investigation to identify transmission patterns and ascertain the extent of an outbreak. The MinION sequencer is increasingly ...Objective Knowledge of an enterovirus genome sequence is very important in epidemiological investigation to identify transmission patterns and ascertain the extent of an outbreak. The MinION sequencer is increasingly used to sequence various viral pathogens in many clinical situations because of its long reads, portability, real-time accessibility of sequenced data, and very low initial costs. However, information is lacking on MinION sequencing of enterovirus genomes. Methods In this proof-of-concept study using Enterovirus 71 (EV71) and Coxsackievirus A16 (CA16) strains as examples, we established an amplicon-based whole genome sequencing method using MinION. We explored the accuracy, minimum sequencing time, discrimination and high-throughput sequencing ability of MinION, and compared its performance with Sanger sequencing. Results Within the first minute (min) of sequencing, the accuracy of MinION was 98.5% for the single EV71 strain and 94.12%-97.33% for 10 genetically-related CA16 strains. In as little as 14 min, 99% identity was reached for the single EV71 strain, and in 17 min (on average), 99% identity was achieved for 10 CA16 strains in a single run. Conclusion MinION is suitable for whole genome sequencing of enteroviruses with sufficient accuracy and fine discrimination and has the potential as a fast, reliable and convenient method for routine use.展开更多
Bread wheat (Triticum aestivum, AABBDD) is an allohexaploid species derived from two rounds of interspecific hybridizations. A high-quality genome sequence assembly of diploid Aegilops tauschii, the donor of the whe...Bread wheat (Triticum aestivum, AABBDD) is an allohexaploid species derived from two rounds of interspecific hybridizations. A high-quality genome sequence assembly of diploid Aegilops tauschii, the donor of the wheat D genome, will provide a useful platform to study polyploid wheat evolution. A combined approach of BAC pooling and next-generation sequencing technology was employed to sequence the minimum tiling path (MTP) of 3176 BAC clones from the short arm ofAe. tauschii chromosome 3 (At3DS). The final assembly of 135 super-scaffolds with an N50 of 4.2 Mb was used to build a 247-Mb pseudomolecule with a total of 2222 predicted protein-coding genes. Compared with the orthologous regions of rice, Brachypodium, and sorghum, At3DS contains 38.67% more genes. In comparison to At3DS, the short arm sequence of wheat chromosome 3B (Ta3BS) is 95-Mb large in size, which is primarily due to the expansion of the non-centromeric region, suggesting that transposable element (TE) bursts in Ta3B likely occurred there. Also, the size increase is accompanied by a proportional increase in gene number in Ta3BS. We found that in the sequence of short arm of wheat chromosome 3D (Ta3DS), there was only less than 0.27% gene loss compared to At3DS. Our study reveals divergent evolution of grass genomes and provides new insights into sequence changes in the polyploid wheat genome.展开更多
The complete genome of Cnaphalocrocis medinalis granulovirus(CnmeGV) from a serious migratory rice pest, Cnaphalocrocis medinalis(Lepidoptera: Pyralidae), was sequenced using the Roche 454 Genome Sequencer FLX system(...The complete genome of Cnaphalocrocis medinalis granulovirus(CnmeGV) from a serious migratory rice pest, Cnaphalocrocis medinalis(Lepidoptera: Pyralidae), was sequenced using the Roche 454 Genome Sequencer FLX system(GS FLX) with shotgun strategy and assembled by Roche GS De Novo assembler software. Its circular double-stranded genome is 111,246 bp in size with a high A+T content of 64.8% and codes for 118 putative open reading frames(ORFs). It contains 37 conserved baculovirus core ORFs, 13 unique ORFs, 26 ORFs that were found in all Lepidoptera baculoviruses and 42 common ORFs. The analysis of nucleotide sequence repeats revealed that the CnmeGV genome differs from the rest of sequenced GVs by a 23 kb and a 17 kb gene block inversions, and does not contain any typical homologous region(hr) except for a region of non-hr-like sequence. Chitinase and cathepsin genes, which are reported to have major roles in the liquefaction of the hosts, were not found in the CnmeGV genome, which explains why CnmeGV infected insects do not show the phenotype of typical liquefaction. Phylogenetic analysis,based on the 37 core baculovirus genes, indicates that CnmeGV is closely related to Adoxophyes orana granulovirus. The genome analysis would contribute to the functional research of CnmeGV,and would benefit to the utilization of CnmeGV as pest control reagent for rice production.展开更多
Apparently balanced chromosomal structural rearrangements are known to cause male infertility and account for approximately 1%of azoospermia or severe oligospermia.However,the underlying mechanisms of pathogenesis and...Apparently balanced chromosomal structural rearrangements are known to cause male infertility and account for approximately 1%of azoospermia or severe oligospermia.However,the underlying mechanisms of pathogenesis and etiologies are still largely unknown.Herein,we investigated apparently balanced interchromosomal structural rearrangements in six cases with azoospermia/severe oligospermia to comprehensively identify and delineate cryptic structural rearrangements and the related copy number variants.In addition,high read-depth genome sequencing(GS)(30-fold)was performed to investigate point mutations causative of male infertility.Mate-pair GS(4-fold)revealed additional structural rearrangements and/or copy number changes in 5 of 6 cases and detected a total of 48 rearrangements.Overall,the breakpoints caused truncations of 30 RefSeq genes,five of which were associated with spermatogenesis.Furthermore,the breakpoints disrupted 43 topological-associated domains.Direct disruptions or potential dysregulations of genes,which play potential roles in male germ cell development,apoptosis,and spermatogenesis,were found in all cases(n=6).In addition,high read-depth GS detected dual molecular findings in case MI6,involving a complex rearrangement and two point mutations in the gene DNAH1.Overall,our study provided the molecular characteristics of apparently balanced interchromosomal structural rearrangements in patients with male infertility.We demonstrated the complexity of chromosomal structural rearrangements,potential gene disruptions/dysregulation and single-gene mutations could be the contributing mechanisms underlie male infertility.展开更多
Only in recent years, the draft sequences for several agricultural animals have been assembled. Assembling an individual animal's entire genome sequence or specific region(s) of interest is increasingly important f...Only in recent years, the draft sequences for several agricultural animals have been assembled. Assembling an individual animal's entire genome sequence or specific region(s) of interest is increasingly important for agricultura researchers to perform genetic comparisons between animals with different performance. We review the current status for several sequenced agricultural species and suggest that next generation sequencing (NGS) technology with decreased sequencing cost and increased speed of sequencing can benefit agricultural researchers. By taking advantage of advanced NGS technologies, genes and chromosomal regions that are more labile to the influence of environmental factors could be pinpointed. A more long term goal would be addressing the question of how animals respond at the molecular and cellular levels to different environmental models (e.g. nutrition). Upon revealing important genes and gene-environment interactions, the rate of genetic improvement can also be accelerated. It is clear that NGS technologies will be able to assist animal scientists to efficiently raise animals and to better prevent infectious diseases so that overall costs of animal production can be decreased.展开更多
The Tibetan macaque, which is endemic to China, is currently listed as a Near Endangered primate species by the International Union for Conservation of Nature (IUCN)(2017). Short tandem repeats (STRs) refer to r...The Tibetan macaque, which is endemic to China, is currently listed as a Near Endangered primate species by the International Union for Conservation of Nature (IUCN)(2017). Short tandem repeats (STRs) refer to repetitive elements of genome sequence that range in length from 1-6 bp. They are found in many organisms and are widely applied in population genetic studies. To clarify the distribution characteristics of genome-wide STRs and understand their variation among Tibetan macaques, we conducted a genome-wide survey of STRs with next-generation sequencing of five macaque samples. A total of 1 077 790 perfect STRs were mined from our assembly, with an N50 of 4 966 bp. Mono-nucleotide repeats were the most abundant, followed by tetra- and di-nucleotide repeats. Analysis of GC content and repeats showed consistent results with other macaques. Furthermore, using STR analysis software (IobSTR), we found that the proportion of base pair deletions in the STRs was greater than that of insertions in the five Tibetan macaque individuals (P〈0.05, t-test). We also found a greater number of homozygous STRs than heterozygous STRs (P〈0.05, t-test), with the Emei and Jianyang Tibetan macaques showing more heterozygous loci than Huangshan Tibetan macaques. The proportion of insertions and mean variation of alleles in the Emei and Jianyang individuals were slightly higher than those in the Huangshan individuals, thus revealing differences in STR allele size between the two populations The polymorphic STR loci identified based on the reference genome showed good amplification efficiency and could be used to study population genetics in Tibetan macaques. The neighbor-joining tree classified the five macaques into two different branches according to their geographical origin, indicating high genetic differentiation between the Huangshan and Sichuan populations. We elucidated the distribution characteristics of STRs in the Tibetan macaque genome and provided an effective method for screening polymorphic STRs. Our results also lay a foundation for future genetic variation studies of macaques.展开更多
Little is known about the genome of Pacific white shrimp (Litopenaeus vannamei). To address this, we conducted BAC (bacterial artificial chromosome) end sequencing of L. vannamei. We selected and sequenced 7 812 BAC c...Little is known about the genome of Pacific white shrimp (Litopenaeus vannamei). To address this, we conducted BAC (bacterial artificial chromosome) end sequencing of L. vannamei. We selected and sequenced 7 812 BAC clones from the BAC library LvHE from the two ends of the inserts by Sanger sequencing. After trimming and quality filtering, 11 279 BAC end sequences (BESs) including 4 609 paired- ends BESs were obtained. The total length of the BESs was 4 340 753 bp, representing 0.18% of the L. vannamei haploid genome. The lengths of the BESs ranged from 100 bp to 660 bp with an average length of 385 bp. Analysis of the BESs indicated that the L. vannamei genome is AT-rich and that the primary repeats patterns were simple sequence repeats (SSRs) and low complexity sequences. Dinucleotide and hexanucleotide repeats were the most common SSR types in the BESs. The most abundant transposable element was gypsy, which may contribute to the generation of the large genome size of L. vannamei. We successfully annotated 4 519 BESs by BLAST searching, including genes involved in immunity and sex determination. Our results provide an important resource for functional gene studies, map construction and integration, and complete genome assembly for this species.展开更多
Genome sequencing has shown strong capabilities in the initial stages of the COVID-19 pandemic such as pathogen identification and virus preliminary tracing.While the rapid acquisition of SARS-Co V-2 genome from clini...Genome sequencing has shown strong capabilities in the initial stages of the COVID-19 pandemic such as pathogen identification and virus preliminary tracing.While the rapid acquisition of SARS-Co V-2 genome from clinical specimens is limited by their low nucleic acid load and the complexity of the nucleic acid background.To address this issue,we modified and evaluated an approach by utilizing SARS-Co V-2-specific amplicon amplification and Oxford Nanopore Prometh ION platform.This workflow started with the throat swab of the COVID-19 patient,combined reverse transcript PCR,and multi-amplification in one-step to shorten the experiment time,then can quickly and steadily obtain high-quality SARS-Co V-2 genome within 24 h.A comprehensive evaluation of the method was conducted in 42 samples:the sequencing quality of the method was correlated well with the viral load of the samples;high-quality SARS-Co V-2 genome could be obtained stably in the samples with Ct value up to 39.14;data yielding for different Ct values were assessed and the recommended sequencing time was 8 h for samples with Ct value of less than 20;variation analysis indicated that the method can detect the existing and emerging genomic mutations as well;Illumina sequencing verified that ultra-deep sequencing can greatly improve the single read error rate of Nanopore sequencing,making it as low as 0.4/10,000 bp.In summary,high-quality SARS-Co V-2 genome can be acquired by utilizing the amplicon amplification and it is an effective method in accelerating the acquisition of genetic resources and tracking the genome diversity of SARSCo V-2.展开更多
Medicinal plants are renowned for their abundant production of secondary metabolites,which exhibit notable pharmacological activities and great potential for drug development.The biosynthesis of secondary metabolites ...Medicinal plants are renowned for their abundant production of secondary metabolites,which exhibit notable pharmacological activities and great potential for drug development.The biosynthesis of secondary metabolites is highly intricate and influenced by various intrinsic and extrinsic factors,resulting in substantial species diversity and content variation.Consequently,precise regulation of secondary metabolite synthesis is of utmost importance.In recent years,genome sequencing has emerged as a valuable tool for investigating the synthesis and regulation of secondary metabolites in medicinal plants,facilitated by the widespread use of high-throughput sequencing technologies.This review highlights the latest advancements in genome sequencing within this field and presents several strategies for studying secondary metabolites.Specifically,the article elucidates how genome sequencing can unravel the pathways for secondary metabolite synthesis in medicinal plants,offering insights into the functions and regulatory mechanisms of participating enzymes.Comparative analyses of plant genomes allow identification of shared pathways of metabolite synthesis among species,thereby providing novel avenues for obtaining cost-effective biosynthetic intermediates.By examining individual genomic variations,genes or gene clusters associated with the synthesis of specific compounds can be discovered,indicating potential targets and directions for drug development and the exploration of alternative compound sources.Moreover,the advent of gene-editing technology has enabled the precise modifications of medicinal plant genomes.Optimization of specific secondary metabolite synthesis pathways becomes thus feasible,enabling the precise editing of target genes to regulate secondary metabolite production within cells.These findings serve as valuable references and lessons for future drug development endeavors,conservation of rare resources,and the exploration of new resources.展开更多
The microbial potential of Penicillium has received critical attention.The present research aimed to elucidate the efficacy of crude enzyme secreted from Penicillium oxalicum WX-209 in degrading citrus segments and ev...The microbial potential of Penicillium has received critical attention.The present research aimed to elucidate the efficacy of crude enzyme secreted from Penicillium oxalicum WX-209 in degrading citrus segments and evaluate the safety of the process.Results showed that citrus segment membranes gradually dissolved after treatment with the crude enzyme solution,indicating good degradation capability.No significant differences in body weight,food ingestion rate,hematology,blood biochemistry,and weight changes of different organs were found between the enzyme intake and control groups.Serial experiments showed that the crude enzyme had high biological safety.Moreover,the whole genome of P.oxalicum WX-209 was sequenced by PacBio and Illumina platforms.Twenty-five scaffolds were assembled to generate 36 Mbp size of genome sequence comprising 11369 predicted genes modeled with a GC content of 48.33%.A total of 592 genes were annotated to encode enzymes related to carbohydrates,and some degradation enzyme genes were identified in strain P.oxalicum WX-209.展开更多
基金Supported by a Sub-project of 973 Program of China(2005CB523001)~~
文摘[Objective] The study aimed to investigate the genetic variation characters of entire sequences between two H9N2 subtype avian influenza virus strains and other reference strains.[Method] The entire sequences of 8 genes were obtained by using RT-PCR,and these sequences were analyzed with that of six H9N2 subtype avian influenza isolates in homology comparison and genetic evolution relation.[Result] The results showed that the nucleotide sequence of entire gene of the strain shared 91.1%-95.4% homology with other seven reference strains,and PG08 shared the highest homology 91.3% with C/BJ/1/94;ZD06 shared the highest homology 92.3% with D/HK/Y280/97.HA cleavage sites of two H9N2 subtype avian influenza virus isolated strains were PARSSR/GLF,typical of mildly pathogenic avian influenza virus.[Conclusion] Phylogenetic tree for entire gene of eight strains showed that the genetic relationship was the closest between ZD06 and C/Pak/2/99 strains,which belonged to the Eurasian lineage;PG08 shared the highest homology 91.3% with ZD06,it may be the product of gene rearrangements of other sub-lines.
基金supported by the National Natural Science Foundation of China(U1402224,31601010,81571998,and U1702284)Yunnan Province(2015HA038 and 2018FB054)Chinese Academy of Sciences(CAS zsys-02)
文摘Chinese tree shrews (Tupaia belangeri chinensis) have become an increasingly important experimental animal in biomedical research due to their close relationship to primates. An accurately sequenced and assembled genome is essential for understanding the genetic features and biology of this animal. In this study, we used long-read single-molecule sequencing and high-throughput chromosome conformation capture (Hi-C) technology to obtain a high-qualitychromosome-scale scaffolding of the Chinese tree shrew genome. The new reference genome (KIZ version 2: TS_2.0) resolved problems in presently available tree shrew genomes and enabled accurate identification of large and complex repeat regions, gene structures, and species-specific genomic structural variants. In addition, by sequencing the genomes of six Chinese tree shrew individuals, we produced a comprehensive map of 12.8 M single nucleotide polymorphisms and confirmed that the major histocompatibility complex (MHC) loci and immunoglobulin gene family exhibited high nucleotide diversity in the tree shrew genome. We updated the tree shrew genome database (TreeshrewDB v2.0: http://www.treeshrewdb.org) to include the genome annotation information and genetic variations. The new high-quality reference genome of the Chinese tree shrew and the updated TreeshrewDB will facilitate the use of this animal in many different fields of research.
基金supported by grants awarded to Yuanqing Yao by the Key Program of the "Twelfth Five-year plan" of People’s liberation Army(No.BWS11J058)the National High Technology Research and Development Program(SS2015AA020402)
文摘Reliable and accurate pre-implantation genetic diagnosis (PGD) of patient's embryos by next-generation sequencing (NGS) is dependent on efficient whole genome amplification (WGA) of a representative biopsy sample. However, the performance of the current state of the art WGA methods has not been evaluated for sequencing. Using low template DNA (15 pg) and single cells, we showed that the two PCR-based WGA systems SurePlex and MALBAC are superior to the REPLI-g WGA multiple displacement amplification (MDA) system in terms of consistent and reproducible genome coverage and sequence bias across the 24 chromosomes, allowing better normalization of test to reference sequencing data. When copy number variation sequencing (CNV-Seq) was applied to single cell WGA products derived by either SurePlex or MALBAC amplification, we showed that known disease CNVs in the range of 3-15 Mb could be reliably and accurately detected at the correct genomic positions. These findings indicate that our CNV-Seq pipeline incorporating either SurePlex or MALBAC as the key initial WGA step is a powerful methodology for clinical PGD to identify euploid embryos in a patient's cohort for uterine transplantation,
基金supported by the Chinese Academy of Sciences (QYZDJ-SSW-SMC001)the National Key Research and Development Program of China (2016YFD0101004)
文摘Common wheat is an important and widely cultivated food crop throughout the world.Much progress has been made in regard to wheat genome sequencing in the last decade.Starting from the sequencing of single chromosomes/chromosome arms whole genome sequences of common wheat and its diploid and tetraploid ancestors have been decoded along with the development of sequencing and assembling technologies. In this review, we give a brief summary on international progress in wheat genome sequencing, and mainly focus on reviewing the effort and contributions made by Chinese scientists.
基金supported by the Education Reform Program of Jiangxi Provincial Department of Education(JXJG-22-23-3,JXJG-23-23-5)the“Biology and Medicine”Discipline Construction Project of Nanchang NormalUniversity(100/20149)+2 种基金Jiangxi Province Key Laboratory of Oil Crops Biology(YLKFKT202203)the Education Reform Program of Nanchang Normal University(NSJG-21-25)Nanchang Key Laboratory of Comprehensive Research and Development of Brasenia schreberi(32060078).
文摘Pyrola atropurpurea Franch is an important annual herbaceous plant.Few genomic analyses have been conducted on this plant,and chloroplast genome research will enrich its genomics basis.This study is based on high-throughput sequencing technology and Bioinformatics methods to obtain the sequence,structure,and other characteristics of the P.atropurpurea chloroplast genome.The result showed that the chloroplast genome of P.atropurpurea has a double-stranded circular structure with a total length of 172,535 bp and a typical four-segment structure.The genome has annotated a total of 132 functional genes,including 43 tRNAs,8 rRNAs,76 protein-coding genes,and 5 pseudo-genes.In total,358 SSR loci were checked out,mainly composed of mononucleotide and trinucleotide repeat.There are three types of scattered repetitive sequences,totaling 4223,including 2452 forward repeats,1763 palindrome repeats,and eight reverse repeats.The optimal codon usage frequency is relatively high with AT usage preference in this genome.Chloroplast genome comparative analysis in the family Ericaceae shows that the overall sequence is more complex,and there are more variations in the gene interval region.The collinearity analysis indicated that there is a complex rearrangement of species between different genera in Ericaceae.The selection pressure analysis showed that the protein-encoding genes rpl33 and rps16 were positively selected among the seven medicinal plants in Ericaceae.The maximum likelihood tree shows that the genetic relationship among P.atropurpurea,Pyrola rotundifolia,and Chimaphila japonica is relatively close.Therefore,an important data basis was provided for species identification,genetic diversity,and phylogenetic studies of P.atropurpurea and even this genus of plants.
基金supported by the National Natural Science Foundation of China(32070656)。
文摘Rice(Oryza sativa)is a staple food for more than half of the world's population and a critical crop for global agriculture.Understanding the regulatory mechanisms that control gene expression in the rice genome is fundamental for advancing agricultural productivity and food security.In mechanism,cis-regulatory elements(including promoters,enhancers,silencers,and insulators)are key DNA sequences whose activities determine the spatial and temporal expression patterns of nearby genes(Yocca and Edger,2022;Schmitz et al.,2022).
基金supported by the Special Fund for Agro-Scientific Research in the Public Interest, China (201303028)the National Natural Science Foundation of China (31571977)
文摘Soybean chlorotic mottle virus(SbCMV)was first detected from soybean plants in Jiangxi Province of China by high throughput sequencing and was confirmed by PCR.The complete nucleotide sequence of NC113 was determined to be 8210 nucleotides,and shared the highest similarity(91.7%)with sequences of SbCMV that was only reported in Japan.It encodes nine putative open reading frames(ORFs Ia,Ib and Ⅱ-Ⅷ),and contains a large intergenic region located at nucleotide 5976-6512 between ORFs VI and VII.Sequence analysis and phylogenetic tree indicated that NC113 is an isolate of SbCMV,and is more related to the soymoviruses Blueberry red ringspot virus(BRRSV),Peanut chlorotic streak virus(PCSV)and Cestrum yellow leaf curling virus(CmYLCV)than to other representative members in the Caulimoviridae family.Field survey of 472 legume plants from Jiangxi and Zhejiang provinces showed SbCMV was only detected from soybean in Nanchang City with a low incidence rate.This is the first report of Soybean chlorotic mottle virus identified in China.
基金This work was supported by grants from the National Academy of Agricultural Science(Code #200901FHT020508369)the BioGreen21 Program(Code #20050301034438 and Code #20070301034037),Rural Development Administration, Republic of Korea
文摘As a part of the Multinational Genome Sequencing Project of Brassica rapa, linkage group R9 and R3 were sequenced using a bacterial artificial chromosome (BAC) by BAC strategy. The current physical contigs are expected to cover approximately 90% euchromatins of both chromosomes. As the project progresses, BAC selection for sequence extension becomes more limited because BAC libraries are restriction enzyme-specific. To support the project, a random sheared fosmid library was constructed. The library consists of 97536 clones with average insert size of approximately 40 kb corresponding to seven genome equivalents, assuming a Chinese cabbage genome size of 550 Mb. The library was screened with primers designed at the end of sequences of nine points of scaffold gaps where BAC clones cannot be selected to extend the physical contigs. The selected positive clones were end-sequenced to check the overlap between the fosmid clones and the adjacent BAC clones. Nine fosmid clones were selected and fully sequenced. The sequences revealed two completed gap filling and seven sequence extensions, which can be used for further selection of BAC clones confirming that the fosmid library will facilitate the sequence completion of B. rapa.
基金supported by the National key research and development plan(2016TFC1202700,2016YFC1200900)Beijing Municipal Science&Technology Commission project(grant numbers D151100002115003)Guangzhou Municipal Science&Technology Commission project(grant numbers 2015B2150820)
文摘Objective Knowledge of an enterovirus genome sequence is very important in epidemiological investigation to identify transmission patterns and ascertain the extent of an outbreak. The MinION sequencer is increasingly used to sequence various viral pathogens in many clinical situations because of its long reads, portability, real-time accessibility of sequenced data, and very low initial costs. However, information is lacking on MinION sequencing of enterovirus genomes. Methods In this proof-of-concept study using Enterovirus 71 (EV71) and Coxsackievirus A16 (CA16) strains as examples, we established an amplicon-based whole genome sequencing method using MinION. We explored the accuracy, minimum sequencing time, discrimination and high-throughput sequencing ability of MinION, and compared its performance with Sanger sequencing. Results Within the first minute (min) of sequencing, the accuracy of MinION was 98.5% for the single EV71 strain and 94.12%-97.33% for 10 genetically-related CA16 strains. In as little as 14 min, 99% identity was reached for the single EV71 strain, and in 17 min (on average), 99% identity was achieved for 10 CA16 strains in a single run. Conclusion MinION is suitable for whole genome sequencing of enteroviruses with sufficient accuracy and fine discrimination and has the potential as a fast, reliable and convenient method for routine use.
基金supported by funding from the National Natural Science Foundation of China(Nos.31290210,31210103902)the Unites States National Science Foundation grant(No.IOS 1238231)+1 种基金the USDA-Agricultural Research Service CRIS project(No.5325-21000-019)the Ministry of Education of China(111 project)
文摘Bread wheat (Triticum aestivum, AABBDD) is an allohexaploid species derived from two rounds of interspecific hybridizations. A high-quality genome sequence assembly of diploid Aegilops tauschii, the donor of the wheat D genome, will provide a useful platform to study polyploid wheat evolution. A combined approach of BAC pooling and next-generation sequencing technology was employed to sequence the minimum tiling path (MTP) of 3176 BAC clones from the short arm ofAe. tauschii chromosome 3 (At3DS). The final assembly of 135 super-scaffolds with an N50 of 4.2 Mb was used to build a 247-Mb pseudomolecule with a total of 2222 predicted protein-coding genes. Compared with the orthologous regions of rice, Brachypodium, and sorghum, At3DS contains 38.67% more genes. In comparison to At3DS, the short arm sequence of wheat chromosome 3B (Ta3BS) is 95-Mb large in size, which is primarily due to the expansion of the non-centromeric region, suggesting that transposable element (TE) bursts in Ta3B likely occurred there. Also, the size increase is accompanied by a proportional increase in gene number in Ta3BS. We found that in the sequence of short arm of wheat chromosome 3D (Ta3DS), there was only less than 0.27% gene loss compared to At3DS. Our study reveals divergent evolution of grass genomes and provides new insights into sequence changes in the polyploid wheat genome.
基金supported by the Hi-Tech Research and Development Program of China (863 Program grant 2011AA10A204)the State Key Laboratory of Biocontrol (SKLBC13KF01)
文摘The complete genome of Cnaphalocrocis medinalis granulovirus(CnmeGV) from a serious migratory rice pest, Cnaphalocrocis medinalis(Lepidoptera: Pyralidae), was sequenced using the Roche 454 Genome Sequencer FLX system(GS FLX) with shotgun strategy and assembled by Roche GS De Novo assembler software. Its circular double-stranded genome is 111,246 bp in size with a high A+T content of 64.8% and codes for 118 putative open reading frames(ORFs). It contains 37 conserved baculovirus core ORFs, 13 unique ORFs, 26 ORFs that were found in all Lepidoptera baculoviruses and 42 common ORFs. The analysis of nucleotide sequence repeats revealed that the CnmeGV genome differs from the rest of sequenced GVs by a 23 kb and a 17 kb gene block inversions, and does not contain any typical homologous region(hr) except for a region of non-hr-like sequence. Chitinase and cathepsin genes, which are reported to have major roles in the liquefaction of the hosts, were not found in the CnmeGV genome, which explains why CnmeGV infected insects do not show the phenotype of typical liquefaction. Phylogenetic analysis,based on the 37 core baculovirus genes, indicates that CnmeGV is closely related to Adoxophyes orana granulovirus. The genome analysis would contribute to the functional research of CnmeGV,and would benefit to the utilization of CnmeGV as pest control reagent for rice production.
基金supported by the National Natural Science Foundation of China(No.31801042)the Health and Medical Research Fund(No.04152666 and No.07180576)General Research Fund(No.14115418),and Direct Grant(No.2020.052).
文摘Apparently balanced chromosomal structural rearrangements are known to cause male infertility and account for approximately 1%of azoospermia or severe oligospermia.However,the underlying mechanisms of pathogenesis and etiologies are still largely unknown.Herein,we investigated apparently balanced interchromosomal structural rearrangements in six cases with azoospermia/severe oligospermia to comprehensively identify and delineate cryptic structural rearrangements and the related copy number variants.In addition,high read-depth genome sequencing(GS)(30-fold)was performed to investigate point mutations causative of male infertility.Mate-pair GS(4-fold)revealed additional structural rearrangements and/or copy number changes in 5 of 6 cases and detected a total of 48 rearrangements.Overall,the breakpoints caused truncations of 30 RefSeq genes,five of which were associated with spermatogenesis.Furthermore,the breakpoints disrupted 43 topological-associated domains.Direct disruptions or potential dysregulations of genes,which play potential roles in male germ cell development,apoptosis,and spermatogenesis,were found in all cases(n=6).In addition,high read-depth GS detected dual molecular findings in case MI6,involving a complex rearrangement and two point mutations in the gene DNAH1.Overall,our study provided the molecular characteristics of apparently balanced interchromosomal structural rearrangements in patients with male infertility.We demonstrated the complexity of chromosomal structural rearrangements,potential gene disruptions/dysregulation and single-gene mutations could be the contributing mechanisms underlie male infertility.
基金supported by the National Institutes of Health Grant #U54 DA021519
文摘Only in recent years, the draft sequences for several agricultural animals have been assembled. Assembling an individual animal's entire genome sequence or specific region(s) of interest is increasingly important for agricultura researchers to perform genetic comparisons between animals with different performance. We review the current status for several sequenced agricultural species and suggest that next generation sequencing (NGS) technology with decreased sequencing cost and increased speed of sequencing can benefit agricultural researchers. By taking advantage of advanced NGS technologies, genes and chromosomal regions that are more labile to the influence of environmental factors could be pinpointed. A more long term goal would be addressing the question of how animals respond at the molecular and cellular levels to different environmental models (e.g. nutrition). Upon revealing important genes and gene-environment interactions, the rate of genetic improvement can also be accelerated. It is clear that NGS technologies will be able to assist animal scientists to efficiently raise animals and to better prevent infectious diseases so that overall costs of animal production can be decreased.
基金supported by the State Key Program of National Natural Science Foundation of China(31530068)National Natural Science Foundation of China(31770415)Sichuan Application Foundation Project(2015JY0268)
文摘The Tibetan macaque, which is endemic to China, is currently listed as a Near Endangered primate species by the International Union for Conservation of Nature (IUCN)(2017). Short tandem repeats (STRs) refer to repetitive elements of genome sequence that range in length from 1-6 bp. They are found in many organisms and are widely applied in population genetic studies. To clarify the distribution characteristics of genome-wide STRs and understand their variation among Tibetan macaques, we conducted a genome-wide survey of STRs with next-generation sequencing of five macaque samples. A total of 1 077 790 perfect STRs were mined from our assembly, with an N50 of 4 966 bp. Mono-nucleotide repeats were the most abundant, followed by tetra- and di-nucleotide repeats. Analysis of GC content and repeats showed consistent results with other macaques. Furthermore, using STR analysis software (IobSTR), we found that the proportion of base pair deletions in the STRs was greater than that of insertions in the five Tibetan macaque individuals (P〈0.05, t-test). We also found a greater number of homozygous STRs than heterozygous STRs (P〈0.05, t-test), with the Emei and Jianyang Tibetan macaques showing more heterozygous loci than Huangshan Tibetan macaques. The proportion of insertions and mean variation of alleles in the Emei and Jianyang individuals were slightly higher than those in the Huangshan individuals, thus revealing differences in STR allele size between the two populations The polymorphic STR loci identified based on the reference genome showed good amplification efficiency and could be used to study population genetics in Tibetan macaques. The neighbor-joining tree classified the five macaques into two different branches according to their geographical origin, indicating high genetic differentiation between the Huangshan and Sichuan populations. We elucidated the distribution characteristics of STRs in the Tibetan macaque genome and provided an effective method for screening polymorphic STRs. Our results also lay a foundation for future genetic variation studies of macaques.
基金Supported by the National Natural Science Foundation of China (Nos.30972245, 30730071)the National Basic Research Program of China (973 Program) (No. 2012CB114403)
文摘Little is known about the genome of Pacific white shrimp (Litopenaeus vannamei). To address this, we conducted BAC (bacterial artificial chromosome) end sequencing of L. vannamei. We selected and sequenced 7 812 BAC clones from the BAC library LvHE from the two ends of the inserts by Sanger sequencing. After trimming and quality filtering, 11 279 BAC end sequences (BESs) including 4 609 paired- ends BESs were obtained. The total length of the BESs was 4 340 753 bp, representing 0.18% of the L. vannamei haploid genome. The lengths of the BESs ranged from 100 bp to 660 bp with an average length of 385 bp. Analysis of the BESs indicated that the L. vannamei genome is AT-rich and that the primary repeats patterns were simple sequence repeats (SSRs) and low complexity sequences. Dinucleotide and hexanucleotide repeats were the most common SSR types in the BESs. The most abundant transposable element was gypsy, which may contribute to the generation of the large genome size of L. vannamei. We successfully annotated 4 519 BESs by BLAST searching, including genes involved in immunity and sex determination. Our results provide an important resource for functional gene studies, map construction and integration, and complete genome assembly for this species.
基金supported by grants from the Foundation for National Mega Project on Major Infectious Disease Prevention(grant number 2017ZX10103005-005)National Key Research and Development Program of China(2020YFC0845800 and 2020YFC0845600)the National Natural Science Foundation of China(31970548 and 91631110)。
文摘Genome sequencing has shown strong capabilities in the initial stages of the COVID-19 pandemic such as pathogen identification and virus preliminary tracing.While the rapid acquisition of SARS-Co V-2 genome from clinical specimens is limited by their low nucleic acid load and the complexity of the nucleic acid background.To address this issue,we modified and evaluated an approach by utilizing SARS-Co V-2-specific amplicon amplification and Oxford Nanopore Prometh ION platform.This workflow started with the throat swab of the COVID-19 patient,combined reverse transcript PCR,and multi-amplification in one-step to shorten the experiment time,then can quickly and steadily obtain high-quality SARS-Co V-2 genome within 24 h.A comprehensive evaluation of the method was conducted in 42 samples:the sequencing quality of the method was correlated well with the viral load of the samples;high-quality SARS-Co V-2 genome could be obtained stably in the samples with Ct value up to 39.14;data yielding for different Ct values were assessed and the recommended sequencing time was 8 h for samples with Ct value of less than 20;variation analysis indicated that the method can detect the existing and emerging genomic mutations as well;Illumina sequencing verified that ultra-deep sequencing can greatly improve the single read error rate of Nanopore sequencing,making it as low as 0.4/10,000 bp.In summary,high-quality SARS-Co V-2 genome can be acquired by utilizing the amplicon amplification and it is an effective method in accelerating the acquisition of genetic resources and tracking the genome diversity of SARSCo V-2.
基金funded by the National Natural Science Foundation of China,grant number 81603221.
文摘Medicinal plants are renowned for their abundant production of secondary metabolites,which exhibit notable pharmacological activities and great potential for drug development.The biosynthesis of secondary metabolites is highly intricate and influenced by various intrinsic and extrinsic factors,resulting in substantial species diversity and content variation.Consequently,precise regulation of secondary metabolite synthesis is of utmost importance.In recent years,genome sequencing has emerged as a valuable tool for investigating the synthesis and regulation of secondary metabolites in medicinal plants,facilitated by the widespread use of high-throughput sequencing technologies.This review highlights the latest advancements in genome sequencing within this field and presents several strategies for studying secondary metabolites.Specifically,the article elucidates how genome sequencing can unravel the pathways for secondary metabolite synthesis in medicinal plants,offering insights into the functions and regulatory mechanisms of participating enzymes.Comparative analyses of plant genomes allow identification of shared pathways of metabolite synthesis among species,thereby providing novel avenues for obtaining cost-effective biosynthetic intermediates.By examining individual genomic variations,genes or gene clusters associated with the synthesis of specific compounds can be discovered,indicating potential targets and directions for drug development and the exploration of alternative compound sources.Moreover,the advent of gene-editing technology has enabled the precise modifications of medicinal plant genomes.Optimization of specific secondary metabolite synthesis pathways becomes thus feasible,enabling the precise editing of target genes to regulate secondary metabolite production within cells.These findings serve as valuable references and lessons for future drug development endeavors,conservation of rare resources,and the exploration of new resources.
基金the financial support of the National Natural Science Foundation of China[32201960,32073020]Science and Technology Innovation Program of Hunan Province[2022RC1150]+2 种基金Changsha Municipal Natural Science Foundation[kq2202332]Hunan innovative province construction project[2019NK2041]Agricultural Science and Technology Innovation Project of Hunan Province[2021CX05].
文摘The microbial potential of Penicillium has received critical attention.The present research aimed to elucidate the efficacy of crude enzyme secreted from Penicillium oxalicum WX-209 in degrading citrus segments and evaluate the safety of the process.Results showed that citrus segment membranes gradually dissolved after treatment with the crude enzyme solution,indicating good degradation capability.No significant differences in body weight,food ingestion rate,hematology,blood biochemistry,and weight changes of different organs were found between the enzyme intake and control groups.Serial experiments showed that the crude enzyme had high biological safety.Moreover,the whole genome of P.oxalicum WX-209 was sequenced by PacBio and Illumina platforms.Twenty-five scaffolds were assembled to generate 36 Mbp size of genome sequence comprising 11369 predicted genes modeled with a GC content of 48.33%.A total of 592 genes were annotated to encode enzymes related to carbohydrates,and some degradation enzyme genes were identified in strain P.oxalicum WX-209.