Plant genomes possess a large number of microRNAs (miRNAs), mainly 21-24 nucleotides in length, which play a vital role in regulating target gene expression at various stages throughout the plant life cycle. Here we sequenced and analyzed ~10 million non-coding RNAs (ncRNAs) derived from fiber tissue of the allotetraploid cotton (Gossypium hirsutum) at 7 days post-anthesis using ncRNA-seq technology. In terms of distinct reads, 24 nt ncRNAs are by far the dominant species, followed by 21 nt and 23 nt ncRNAs. Using ab initio prediction, we identified and characterized a total of 562 candidate miRNA gene loci on the recently assembled D5 genome of the diploid cotton G. raimondii. Of the 562 predicted miRNAs, 22 were previously discovered in cotton species and 187 showed sequence conservation and homology to miRNAs of other plant species. Nucleotide bias analysis showed that the 9th and 1st positions were significantly conserved among different types of miRNA genes. Among the 463 putative miRNA target genes, the most significant up/down-regulation occurred at 10-20 days post-anthesis, indicating that miRNAs play an important role during the elongation and secondary cell wall synthesis stages of cotton fiber development. The discovery of new miRNA genes will help elucidate the mechanisms of miRNA generation and regulation in cotton.
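The "distinct reads" length profile described in this abstract can be illustrated with a minimal sketch. It assumes reads are already adapter-trimmed strings; the function name and the toy reads are hypothetical and are not data or code from the study.

```python
from collections import Counter


def distinct_length_distribution(reads):
    """Count distinct (unique) sequences per read length.

    Collapsing to a set first means a highly abundant sequence is
    counted once, which is what "in terms of distinct reads" implies.
    """
    distinct = set(reads)
    return Counter(len(seq) for seq in distinct)


# Toy reads (hypothetical): three distinct 24-mers, two 21-mers, one 23-mer.
toy_reads = [
    "A" * 24, "A" * 24, "C" * 24, "G" * 24,
    "A" * 21, "C" * 21,
    "A" * 23,
]

dist = distinct_length_distribution(toy_reads)
for length, count in dist.most_common():
    print(f"{length} nt: {count} distinct reads")
```

Counting total reads instead (a plain `Counter(len(r) for r in reads)`) would weight abundant sequences more heavily, so the two profiles can differ.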
Heterosis, one of the most important biological phenomena, refers to the phenotypic superiority of a hybrid over its genetically diverse parents with respect to many traits such as biomass, growth rate and yield. Despite its successful application in breeding and agronomic production of many crop and animal varieties, the molecular basis of heterosis remains elusive. The classic genetic explanations for heterosis centered on three hypotheses: dominance (Davenport, 1908; Bruce,
The monkeypox virus (MPXV) has triggered an ongoing global outbreak. Genome sequencing of MPXV and rapid tracing of genetic variants will benefit disease diagnosis and control. MPXV is one of the DNA viruses with the largest and most AT-biased genomes and carries a significant number of tandem repeats, so it is challenging but necessary to optimize strategies for rapid full-length genome identification and for tracking MPXV variation in clinical specimens with low viral loads. Here we evaluated the performance of metagenomic and amplicon sequencing techniques and of three sequencing platforms in MPXV genome sequencing, based on multiple clinical specimens from five mpox cases in the Chinese mainland. We rapidly identified the full-length genome of MPXV, with accurate assembly of tandem repeats, in multiple clinical specimens. Amplicon sequencing enables cost-effective and rapid sequencing of clinical specimens to obtain high-quality MPXV genomes. Third-generation sequencing facilitates the assembly of the terminal tandem repeat regions in the MPXV genome and corrects a common misassembly in published sequences. In addition, several intra-host single nucleotide variations were identified in the first imported mpox case. This study offers an evaluation of various strategies for identifying the complete genome of MPXV in clinical specimens, and its findings will significantly enhance the surveillance of MPXV.
Sessile plants encounter various stresses; some are prolonged, whereas others are recurrent. Temperature is crucial for plant growth and development, and plants grown in temperate regions often encounter adverse high-temperature fluctuations (heat stresses) as well as prolonged cold exposure such as seasonal temperature drops in winter. Many plants can remember past temperature stresses and adapt to adverse local temperature changes to ensure survival and/or reproductive success. Here, we summarize chromatin-based mechanisms underlying acquired thermotolerance or thermomemory in plants and review recent progress in the molecular epigenetic understanding of 'remembering of prolonged cold in winter', or vernalization, a process critical for various over-wintering plants to acquire competence to flower in the coming spring. In addition, perspectives on future study of temperature stress memories in economically important crops are discussed.
Mountains are rich in biodiversity, and butterflies are species-rich and have a strong foundation of ecological and evolutionary research. This review addresses the potential and progress of studying mountain biodiversity using butterflies as a model. We discuss the uniqueness of mountain ecosystems, factors influencing the distribution of mountain butterflies, representative genetic and evolutionary models in butterfly research, and evolutionary studies of mountain biodiversity involving butterfly genetics and genomics. Finally, we demonstrate the necessity of studying mountain butterflies and propose future perspectives. This review provides insights for studying the biodiversity of mountain butterflies as well as a summary of research methods for reference.
Gene set enrichment (GSE) analyses play an important role in the interpretation of large-scale transcriptome datasets. Because the plethora of GSE tools and their discrepant performances make it challenging to obtain optimal results, multiple GSE tools can be integrated into a single method. Several existing ensemble methods produce different scores for sorting pathways as integrated results; furthermore, it is difficult for users to choose a single ensemble score to obtain optimal final results. Here, we develop an ensemble method using a machine learning approach, called Combined Gene set analysis incorporating Prioritization and Sensitivity (CGPS), that integrates the results provided by nine prominent GSE tools into a single ensemble score (R score) to sort pathways as integrated results. Moreover, to the best of our knowledge, CGPS is the first GSE ensemble method built on a priori knowledge of pathways and phenotypes. By comparison with 10 widely used individual methods and five types of ensemble scores from two ensemble methods, we demonstrate that sorting pathways based on the R score can better prioritize relevant pathways, as established by an evaluation of 120 simulated datasets and 45 real datasets. Additionally, CGPS is applied to expression data involving the drug panobinostat, which is an anticancer treatment against multiple myeloma. The results identify cell processes associated with cancer, such as the p53 signaling pathway (hsa04115); by contrast, according to two ensemble methods (EnrichmentBrowser and EGSEA), this pathway is ranked lower than 20th, which may cause users to miss the pathway in their analyses. We show that this method, which is based on a priori knowledge, can capture valuable biological information from numerous types of gene set collections, such as KEGG pathways, GO terms, Reactome, and BioCarta. CGPS is publicly available as standalone source code at ftp://ftp.cbi.pku.edu.cn/pub/CGPS_download/cgps-1.0.0.tar.gz.
Male sterile genes and mutants are valuable resources in hybrid seed production for monoclinous crops. In wheat, high genetic redundancy due to allohexaploidy makes it difficult to obtain nuclear recessive male sterile mutants through spontaneous mutation or chemical or physical mutagenesis. The CRISPR/Cas9 system, an emerging and effective genome editing tool, makes it possible to achieve simultaneous mutagenesis in multiple homoeoalleles. To improve the genome modification efficiency of the CRISPR/Cas9 system in wheat, we compared four different RNA polymerase (Pol) III promoters (TaU3p, TaU6p, OsU3p, and OsU6p) and three types of sgRNA scaffold in the protoplast system. We show that the TaU3 promoter-driven optimized sgRNA scaffold was most effective. The optimized CRISPR/Cas9 system was used to edit three TaNP1 homoeoalleles, whose orthologs, OsNP1 in rice and ZmIPE1 in maize, encode a putative glucose-methanol-choline oxidoreductase and are required for male fertility. Triple homozygous mutations in TaNP1 genes result in complete male sterility. We further demonstrated that any one wild-type copy of the three TaNP1 genes is sufficient for maintenance of male fertility. Taken together, this study provides an optimized CRISPR/Cas9 vector for wheat genome editing and a complete male sterile mutant for development of a commercially viable hybrid wheat seed production system.
The tomato genome encodes four functional DCL families, of which DCL2 is poorly studied. Here, we generated loss-of-function mutants for a tomato DCL2 gene, dcl2b, and identified its major role in defending against tomato mosaic virus in both natural and manual infections. Genome-wide small RNA expression profiling revealed that DCL2b was required for the processing of 22-nt small RNAs, including a few species of miRNAs. Interestingly, these DCL2b-dependent 22-nt miRNAs functioned similarly to the DCL1-produced 22-nt miRNAs in Arabidopsis and could serve as triggers to generate a class of secondary siRNAs. In particular, the majority of secondary siRNAs were derived from plant defense genes when the plants were challenged with viruses. We also examined differentially expressed genes in dcl2b through RNA-seq and observed that numerous genes were associated with mitochondrial metabolism and hormone signaling under virus-free conditions. Notably, when the loss-of-function dcl2b mutant was challenged with tomato mosaic virus, a group of defense response genes was activated, whereas genes related to lipid metabolism were suppressed. Together, our findings provide new insights into the roles of tomato DCL2b in small RNA biogenesis and in antiviral defense.
Epithelial-mesenchymal transition (EMT) is a critical cellular process in embryonic development and is also the basis for wound repair, tissue regeneration, and cancer metastasis (Zhao et al., 2015). During cancer migration and invasion, EMT involves comprehensive reprogramming processes related to cytoskeletal remodeling, cell differentiation, epigenetic regulation and metabolism (Plikus et al., 2015). In fact, the understanding of EMT in cancer development is still limited. In 2015.
Understanding the functional effects of genetic variants is crucial in modern genomics and genetics. Transcription factor binding sites (TFBSs) are among the most important cis-regulatory elements. While multiple tools have been developed to assess the functional effects of genetic variants at TFBSs, they usually assume that each variant works in isolation and neglect the potential "interference" among multiple variants within the same TFBS. In this study, we present COPE-TFBS (Context-Oriented Predictor for variant Effect on Transcription Factor Binding Site), a novel method that considers sequence context to accurately predict variant effects on TFBSs. We systematically re-analyzed the sequencing data from both the 1000 Genomes Project and the Genotype-Tissue Expression (GTEx) Project via COPE-TFBS, and identified a number of novel TFBSs, transformed TFBSs and discordantly annotated TFBSs resulting from multiple variants, further highlighting the necessity of sequence context in accurately annotating genetic variants.
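Why scoring variants in isolation can mislead is easiest to see with a toy position weight matrix. The sketch below is not the COPE-TFBS algorithm, whose details the abstract does not give; the motif, background frequency, threshold and variant positions are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical 4-position motif: per-position base probabilities (toy values).
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
]
BACKGROUND = 0.25  # assumed uniform background base frequency
THRESHOLD = 2.0    # assumed log-odds cutoff for calling a site "bound"


def score(seq):
    """Log-odds score of a sequence against the toy motif."""
    return sum(math.log2(PWM[i][base] / BACKGROUND) for i, base in enumerate(seq))


def apply_variants(ref, variants):
    """Return the sequence after applying (position, alt_base) substitutions."""
    seq = list(ref)
    for pos, alt in variants:
        seq[pos] = alt
    return "".join(seq)


def is_bound(seq):
    return score(seq) >= THRESHOLD


ref_site = "TCGA"
v1 = (3, "C")  # weakens the site when scored alone
v2 = (0, "A")  # strengthens the site

alone = apply_variants(ref_site, [v1])
together = apply_variants(ref_site, [v1, v2])

print("reference bound:", is_bound(ref_site))
print("v1 alone bound:", is_bound(alone))
print("v1 + v2 on one haplotype bound:", is_bound(together))
```

Scored alone, the first variant appears to abolish the site; scored on a haplotype that also carries the second variant, the site is retained. Evaluating the full haplotype sequence rather than each substitution separately is the kind of context dependence the abstract refers to.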
Cancer metastasis is the end product of cancer evolution, contributing to the massive mortality of cancer patients (Chaffer and Weinberg, 2011). Different primary cancers have distinct spreading routes via the blood, the lymphatics, or both, which presents a challenge for effective cancer treatment (Qian et al., 2017).
Isoflavones, which are mainly distributed in leguminous plants, have many health benefits. Isoflavone synthase (IFS) is a membrane-associated cytochrome P450 enzyme (CYP450) that carries out the unique aryl-ring migration and hydroxylation. So far, few crystal structures of plant P450s have been obtained. We determined the crystal structure of IFS from Medicago truncatula at 1.9 Å by the MAD method using a selenomethionine-substituted crystal, and conducted molecular docking and mutagenesis studies. The structure of IFS complexed with imidazole exhibits the helix Iα-loop-helix Iβ motif, which corresponds to helix I of other P450s. Compared with structures of common P450s, the IFS/imidazole structure contains an extra domain, i.e., the γ-domain. The structure reveals a homodimer in which the γ-domain of one molecule interacts with the β-domain of another. The plane of the heme group makes an angle of approximately 40° with the helix Iα-loop-helix Iβ motif. Molecular docking combined with mutagenesis suggested that Trp-128 and Asp-300 might play important roles in substrate binding and recognition. Phe-301, Ser-303 and Gly-305 from the helix Iα-loop-helix Iβ motif may play important roles in the aryl-ring migration. These novel structural features provide insights into the unique reaction mechanism of IFS and a basis for engineering IFS in leguminous crops for health purposes.
Understanding the zoonotic origin and evolutionary history of SARS-CoV-2 will provide critical insights for alerting to and preventing future outbreaks. A significant gap remains regarding the possible role of pangolins as a reservoir of SARS-CoV-2 related coronaviruses (SC2r-CoVs). Here, we screened for SC2r-CoVs in 172 samples from 163 pangolin individuals of four species, and detected positive signals in muscles of four Manis javanica and, for the first time, one M. pentadactyla. Phylogeographic analysis of pangolin mitochondrial DNA traced their origins to Southeast Asia. Using in-solution hybridization capture sequencing, we assembled a partial pangolin SC2r-CoV (pangolin-CoV) genome sequence of 22,895 bp (MP20) from the M. pentadactyla sample. Phylogenetic analyses revealed that MP20 was very closely related to pangolin-CoVs identified in M. javanica seized by Guangxi Customs. A genetic contribution of bat coronavirus to pangolin-CoVs via recombination was indicated. Our analysis revealed that the genetic diversity of pangolin-CoVs is substantially higher than previously anticipated. Given the potential infectivity of pangolin-CoVs, their high genetic diversity signals an ecological risk of zoonotic evolution and transmission of pathogenic SC2r-CoVs.
The monkeypox virus (mpox virus, MPXV) epidemic in 2022 posed a significant public health risk. Yet, the evolutionary principles of MPXV remain largely unknown. Here, we examined the evolutionary patterns of protein sequences and codon usage in MPXV. We first demonstrated the signal of positive selection in OPG027, specifically in the Clade I lineage of MPXV. Subsequently, we discovered accelerated protein sequence evolution over time in the variants responsible for the 2022 outbreak. Furthermore, we showed strong epistasis between amino acid substitutions located in different genes. The codon adaptation index (CAI) analysis revealed that MPXV genes tended to use more non-preferred codons compared to human genes, and that the CAI decreased over time and diverged between clades, with Clade I > IIa and IIb-A > IIb-B. While the decrease in fatality rate among the three groups aligned with the CAI pattern, it remains unclear whether this correlation was coincidental or whether the deoptimization of codon usage in MPXV led to a reduction in fatality rates. This study sheds new light on the mechanisms that govern the evolution of MPXV in human populations.
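The codon adaptation index is conventionally defined as the geometric mean of each codon's relative adaptiveness, i.e. its frequency in a reference gene set divided by the frequency of the most used synonymous codon. The sketch below follows that standard definition; the reference counts cover only two amino acids and are hypothetical, and the abstract does not state which reference set or software the study used.

```python
import math

# Hypothetical reference codon counts for two amino acids (toy values only).
REFERENCE_COUNTS = {
    "K": {"AAA": 20, "AAG": 80},
    "F": {"TTT": 40, "TTC": 60},
}


def relative_adaptiveness(reference_counts):
    """w(codon) = count / count of the most frequent synonymous codon."""
    weights = {}
    for codon_counts in reference_counts.values():
        top = max(codon_counts.values())
        for codon, n in codon_counts.items():
            weights[codon] = n / top
    return weights


def cai(cds, weights):
    """Geometric mean of w over the codons of cds that have a weight."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


W = relative_adaptiveness(REFERENCE_COUNTS)
preferred = cai("AAGTTC", W)      # uses only the most frequent codons
non_preferred = cai("AAATTT", W)  # uses only the less frequent codons
print(f"preferred codons: {preferred:.3f}")
print(f"non-preferred codons: {non_preferred:.3f}")
```

A gene built from preferred codons scores 1.0 and the score falls as non-preferred codons accumulate, which is the sense in which a declining CAI indicates deoptimized codon usage relative to the host.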
BARD1 (BRCA1 associated RING domain protein 1), an important animal tumor suppressor gene associated with many kinds of cancers, has been intensively studied for decades. Surprisingly, a homolog of BARD1 was found in plants and was renamed AtROW1 (repressor of Wuschel-1) according to its extremely important function in plant stem cell homeostasis. Although great advances have been made in the study of human BARD1, the function of this animal tumor-suppressor-like gene in plants is not well studied and needs to be further elucidated. Here, we review and summarize past and present work regarding this protein. Apart from its previously proposed role in DNA repair, it has recently been found to be essential for shoot and root stem cell development and differentiation in plants. The study of AtROW1 in plants may provide an ideal model for further elucidating the functional mechanism of BARD1 in mammals.
Due to the economic value of natural textile fiber, cotton has attracted much research attention, which has led to the publication of two diploid genomes and two tetraploid genomes. These big data facilitate functional genomic studies in cotton and allow researchers to investigate cotton genome structure, gene expression, and protein function on a global scale using high-throughput methods. In this review, we summarize recent studies of cotton genomes. Population genomic analyses revealed the domestication history of cultivated upland cotton and the roles of transposable elements in cotton genome evolution. Alternative splicing of cotton transcriptomes was evaluated genome-wide. Several important gene families such as MYC, NAC, Sus and GhPLDal were systematically identified and classified based on genetic structure and biological function. High-throughput proteomics also unraveled the key functional proteins correlated with fiber development. Functional genomic studies have provided unprecedented insights into global-scale methods for cotton research.
Kallima butterflies are famous for their leaf-mimicking wing patterns. Yet the characterization of Kallima species is still under debate owing to their high phenotypic similarity. With the release of the K. inachus reference genome, phylogenetic studies based on genome-wide data have been carried out, improving the understanding of the evolutionary relationships within the genus Kallima. However, we noticed some conflict between genome-based phylogenies and morphological classifications in butterflies. We examined the cause of this conflict by conducting an in-depth study of the relationships among Kallima butterflies to test possible reticulate phylogenetic topologies. We constructed phylogenies based on various datasets (including SNPs in single-copy genes, coding sequences, neutral regions and all remaining sites across the genome) to compare the topologies, revealing the complex evolutionary history of Kallima butterflies. Our results suggest that a reticulate species topology may constitute a pervasive pattern present not only in species with adaptive radiations but also in gradually evolving species, with Kallima butterflies as an example.
Swallowtail butterflies (Papilionidae) are a historically significant butterfly group due to their colorful wing patterns, extensive morphological diversity, and phylogenetically important position as a sister group to all other butterflies, and they have been widely studied with regard to ecological adaptation, phylogeny, genetics, and evolution. Notably, they contain a unique class of pigments, i.e., papiliochromes, which contribute to their color diversity and to various biological functions such as predator avoidance and mate preference. To date, however, the genomic and genetic basis of their color diversity and papiliochrome origin in a phylogenetic and evolutionary context remains largely unknown. Here, we obtained high-quality reference genomes of 11 swallowtail butterfly species covering all tribes of Papilioninae and Parnassiinae using long-read sequencing technology. Combined with previously published butterfly genomes, we obtained robust phylogenetic relationships among tribes, overcoming the challenges of incomplete lineage sorting (ILS) and gene flow. Comprehensive genomic analyses indicated that the evolution of Papilionidae-specific conserved non-exonic elements (PSCNEs) and transcription factor binding sites (TFBSs) of patterning and transporter/cofactor genes, together with the rapid evolution of transporters/cofactors, likely promoted the origin and evolution of papiliochromes. These findings not only provide novel insights into the genomic basis of color diversity, especially papiliochrome origin in swallowtail butterflies, but also provide important data resources for exploring the evolution, ecology, and conservation of butterflies.
Funding: supported by grants from the National Basic Research Program of China (973 Program) (No. 2012CB910900); the National Program on Key Basic Research Project of China (No. 2011CB100101); the National High Technology Research and Development Program of China (863 Program) (No. 2012AA10A304); the National Natural Science Foundation of China (No. U1031001); the Ministry of Agriculture of China (948 Program) (No. 2011-G2B); and the Peking-Tsinghua Center for Life Sciences.
Funding: supported by the National Key Research and Development Program of China (2022YFC2303401, 2022YFC2304100, 2016YFD0500301, 2021YFC0863300); the Beijing Science and Technology Plan (Z211100002521017); and the National Natural Science Foundation of China (82241080).
Funding: supported partly by the National Natural Science Foundation of China (31830049, 31721001, and 31970327) and the Peking-Tsinghua Joint Center for Life Sciences.
Funding: the National Natural Science Foundation of China (32170420 and 31871271); the Beijing Natural Science Foundation (JQ19021); the Peking-Tsinghua Center for Life Science, the State Key Laboratory of Protein and Plant Gene Research, the Qidong-SLS Innovation Fund, and the Benyuan Charity Young Investigator Exploration Fellowship in Life Science to W.Z.; and grants from the China Postdoctoral Science Foundation (2023M730082 and BX20230026) to S.W.
Funding: supported by the National Key Research and Development Program of China (2017YFC1201200, 2017YFC0908404, 2016YFC0901603, 2016YFB0201700); the National High-tech R&D Program of China (863 Program) (2015AA020108); and the State Key Laboratory of Protein and Plant Gene Research.
Funding: Supported by grants from the Ministry of Agriculture of China (2016ZX08010001 and 2016ZX08010002), Peking University Institute of Advanced Agricultural Sciences, and Beijing Natural Science Foundation (19530290014).
Abstract: Male sterile genes and mutants are valuable resources in hybrid seed production for monoclinous crops. In wheat, high genetic redundancy due to allohexaploidy makes it difficult to obtain nuclear recessive male sterile mutants through spontaneous mutation or chemical or physical mutagenesis. The CRISPR/Cas9 system, an emerging and effective genome editing tool, makes it possible to achieve simultaneous mutagenesis in multiple homoeoalleles. To improve the genome modification efficiency of the CRISPR/Cas9 system in wheat, we compared four different RNA polymerase (Pol) III promoters (TaU3p, TaU6p, OsU3p, and OsU6p) and three types of sgRNA scaffold in the protoplast system. We show that the TaU3 promoter-driven optimized sgRNA scaffold was the most effective. The optimized CRISPR/Cas9 system was used to edit three TaNP1 homoeoalleles, whose orthologs, OsNP1 in rice and ZmIPE1 in maize, encode a putative glucose-methanol-choline oxidoreductase and are required for male fertility. Triple homozygous mutations in TaNP1 genes result in complete male sterility. We further demonstrated that any one wild-type copy of the three TaNP1 genes is sufficient for maintenance of male fertility. Taken together, this study provides an optimized CRISPR/Cas9 vector for wheat genome editing and a complete male sterile mutant for development of a commercially viable hybrid wheat seed production system.
Funding: This research was supported by grants from the National Natural Science Foundation of China (31471921, 91540118, 31622050, and 31672208) to H.Z. T.W. was supported by a fellowship from the Chinese Scholarship Council.
Abstract: The tomato genome encodes four functional DCL families, of which DCL2 is poorly studied. Here, we generated loss-of-function mutants for a tomato DCL2 gene, dcl2b, and identified its major role in defending against tomato mosaic virus in both natural and manual infections. Genome-wide small RNA expression profiling revealed that DCL2b was required for the processing of 22-nt small RNAs, including a few species of miRNAs. Interestingly, these DCL2b-dependent 22-nt miRNAs functioned similarly to the DCL1-produced 22-nt miRNAs in Arabidopsis and could serve as triggers to generate a class of secondary siRNAs. In particular, the majority of secondary siRNAs were derived from plant defense genes when the plants were challenged with viruses. We also examined differentially expressed genes in dcl2b through RNA-seq and observed that numerous genes were associated with mitochondrial metabolism and hormone signaling under virus-free conditions. Notably, when the loss-of-function dcl2b mutant was challenged with tomato mosaic virus, a group of defense response genes was activated, whereas the genes related to lipid metabolism were suppressed. Together, our findings provide new insights into the roles of tomato DCL2b in small RNA biogenesis and in antiviral defense.
Funding: Supported by the National Natural Science Foundation of China (Nos. 31671375, 31871339 and 31801120), the National Key Research and Development Program of China (No. 2017YFC1201200), and the research start-up fellowship of University of the Sunshine Coast to MZ.
Abstract: Epithelial-mesenchymal transition (EMT) is a critical cellular process in embryonic development and is also the basis for wound repair, tissue regeneration, and cancer metastasis (Zhao et al., 2015). During cancer migration and invasion, EMT involves comprehensive reprogramming processes related to cytoskeletal remodeling, cell differentiation, epigenetic regulation and metabolism (Plikus et al., 2015). In fact, the understanding of EMT in cancer development is still limited. In 2015.
Funding: Supported by funds from the National Key R&D Program of China (2016YFC0901603), the China 863 Program (2015AA020108), and the State Key Laboratory of Protein and Plant Gene Research; supported in part by the National Program for Support of Top-notch Young Professionals.
Abstract: Understanding the functional effects of genetic variants is crucial in modern genomics and genetics. Transcription factor binding sites (TFBSs) are among the most important cis-regulatory elements. While multiple tools have been developed to assess the functional effects of genetic variants at TFBSs, they usually assume that each variant works in isolation and neglect the potential "interference" among multiple variants within the same TFBS. In this study, we present COPE-TFBS (Context-Oriented Predictor for variant Effect on Transcription Factor Binding Site), a novel method that considers sequence context to accurately predict variant effects on TFBSs. We systematically re-analyzed the sequencing data from both the 1000 Genomes Project and the Genotype-Tissue Expression (GTEx) Project via COPE-TFBS, and identified a number of novel TFBSs, transformed TFBSs and discordantly annotated TFBSs resulting from multiple variants, further highlighting the necessity of sequence context in accurately annotating genetic variants.
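The "interference" among variants described in this abstract can be illustrated with a toy example. The sketch below is not the COPE-TFBS algorithm, whose details the abstract does not give; it assumes a simple additive position weight matrix (PWM) with a fixed binding threshold, and the matrix values, threshold, site, and variants are all hypothetical. It shows only why scoring each variant in isolation can disagree with scoring them together on the same sequence.

```python
# Hypothetical 4-position log-odds PWM; the consensus site is "ACGT".
PWM = [
    {"A": 1.0, "C": -0.5, "G": -0.5, "T": -0.5},
    {"A": -0.5, "C": 1.0, "G": -0.5, "T": -0.5},
    {"A": -0.5, "C": -0.5, "G": 1.0, "T": -0.5},
    {"A": -0.5, "C": -0.5, "G": -0.5, "T": 1.0},
]
THRESHOLD = 2.0  # hypothetical cutoff: a site scoring >= 2.0 counts as bound


def score(site):
    """Sum the per-position PWM scores for a candidate site."""
    return sum(column[base] for column, base in zip(PWM, site))


def apply_variants(site, variants):
    """Return the site after applying (position, alternate base) substitutions."""
    bases = list(site)
    for position, alt in variants:
        bases[position] = alt
    return "".join(bases)


def is_bound(site):
    return score(site) >= THRESHOLD


ref_site = "ACGT"
v1, v2 = (0, "C"), (2, "T")  # two hypothetical SNVs inside the same site

# Each variant evaluated alone against the reference sequence.
isolated = [is_bound(apply_variants(ref_site, [v])) for v in (v1, v2)]
# Both variants evaluated together, i.e. in their actual sequence context.
joint = is_bound(apply_variants(ref_site, [v1, v2]))

print(isolated, joint)
```

In this toy setting each variant alone leaves the site above the threshold, while the two together push it below, so an isolation-based annotation would miss the loss of the site.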
Funding: Supported by the National Natural Science Foundation of China (Nos. 31171270 and 31671375) and the research start-up fellowship of University of the Sunshine Coast to M.Z.
Abstract: Cancer metastasis is the end product of cancer evolution, contributing to the massive mortality of cancer patients (Chaffer and Weinberg, 2011). Different primary cancers have distinct spreading routes via the blood, the lymphatics, or both, which presents a challenge for effective cancer treatment (Qian et al., 2017).
Abstract: Isoflavones, which are mainly distributed in leguminous plants, have many health benefits. Isoflavone synthase (IFS) is a membrane-associated cytochrome P450 enzyme (CYP450) that carries out the unique aryl-ring migration and hydroxylation. So far, few crystal structures of plant P450s have been obtained. We determined the crystal structure of IFS from Medicago truncatula at 1.9 Å by the MAD method using a selenomethionine-substituted crystal, and conducted molecular docking and mutagenesis studies. The structure of IFS complexed with imidazole exhibits the helix Iα-loop-helix Iβ motif, which corresponds to helix I of other P450s. Compared with structures of common P450s, the IFS/imidazole structure contains an extra domain, i.e., the γ-domain. The structure reveals a homodimer in which the γ-domain of one molecule interacts with the β-domain of another. The plane of the heme group makes an angle of approximately 40° with the helix Iα-loop-helix Iβ motif. Molecular docking combined with mutagenesis suggested that Trp-128 and Asp-300 might play important roles in substrate binding and recognition. Phe-301, Ser-303 and Gly-305 from the helix Iα-loop-helix Iβ motif may play important roles in the aryl-ring migration. These novel structural features provide insights into the unique reaction mechanism of IFS and a basis for engineering IFS in leguminous crops for health purposes.
Funding: This work was supported by the National Key Research and Development Projects of the Ministry of Science and Technology of China, National Key Research and Development Program of China (2021YFC0863300), Ministry of Agriculture of China (2016ZX08009003-006), Key Program of Chinese Academy of Sciences (KJZD-SW-L11), and Animal Branch of the Germplasm Bank of Wild Species, Chinese Academy of Sciences (the Large Research Infrastructure Funding).
Abstract: Understanding the zoonotic origin and evolutionary history of SARS-CoV-2 will provide critical insights for alerting to and preventing future outbreaks. A significant gap remains regarding the possible role of pangolins as a reservoir of SARS-CoV-2 related coronaviruses (SC2r-CoVs). Here, we screened for SC2r-CoVs in 172 samples from 163 pangolin individuals of four species, and detected positive signals in muscles of four Manis javanica and, for the first time, one M. pentadactyla. Phylogeographic analysis of pangolin mitochondrial DNA traced their origins to Southeast Asia. Using in-solution hybridization capture sequencing, we assembled a partial pangolin SC2r-CoV (pangolin-CoV) genome sequence of 22895 bp (MP20) from the M. pentadactyla sample. Phylogenetic analyses revealed that MP20 was very closely related to pangolin-CoVs identified in M. javanica seized by Guangxi Customs. A genetic contribution of bat coronavirus to pangolin-CoVs via recombination was indicated. Our analysis revealed that the genetic diversity of pangolin-CoVs is substantially higher than previously anticipated. Given the potential infectivity of pangolin-CoVs, their high genetic diversity signals an ecological risk of zoonotic evolution and transmission of pathogenic SC2r-CoVs.
Funding: We thank the researchers who generated and shared the sequencing data in the NCBI (Table S4) and GISAID (https://www.gisaid.org/) (Table S5), on which this research is based. This work is supported by the National Key R&D Projects of China (Grant Nos. 2021YFC2301300, 2022YFC2304100, and 2022YFC2303401), the National Natural Science Foundation of China (Grant No. 82241080), the Beijing Natural Science Foundation, China (Grant No. L222009), the SLS-Qidong Innovation Fund, China, and the Beijing Postdoctoral Research Foundation, China (Grant No. 2023-ZZ-018).
Abstract: The monkeypox virus (mpox virus, MPXV) epidemic in 2022 has posed a significant public health risk. Yet, the evolutionary principles of MPXV remain largely unknown. Here, we examined the evolutionary patterns of protein sequences and codon usage in MPXV. We first demonstrated the signal of positive selection in OPG027, specifically in the Clade I lineage of MPXV. Subsequently, we discovered accelerated protein sequence evolution over time in the variants responsible for the 2022 outbreak. Furthermore, we showed strong epistasis between amino acid substitutions located in different genes. The codon adaptation index (CAI) analysis revealed that MPXV genes tended to use more non-preferred codons compared to human genes, and the CAI decreased over time and diverged between clades, with Clade I > IIa and IIb-A > IIb-B. While the decrease in fatality rate among the three groups aligned with the CAI pattern, it remains unclear whether this correlation was coincidental or whether the deoptimization of codon usage in MPXV led to a reduction in fatality rates. This study sheds new light on the mechanisms that govern the evolution of MPXV in human populations.
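For readers unfamiliar with the codon adaptation index mentioned in this abstract, the following is a minimal sketch of the standard definition: each codon gets a relative adaptiveness equal to its reference-set frequency divided by that of the most frequent synonymous codon, and the CAI of a gene is the geometric mean of these values over its codons. The three-family codon table, the reference counts, and the example sequences are hypothetical and only illustrate the arithmetic; they are not the data or pipeline used in the study.

```python
import math

# Toy synonymous-codon families (hypothetical subset; a real analysis
# would use the full genetic code).
SYNONYMOUS = {
    "K": ["AAA", "AAG"],
    "E": ["GAA", "GAG"],
    "F": ["TTT", "TTC"],
}


def relative_adaptiveness(ref_counts):
    """w(codon) = count / highest count among its synonymous codons."""
    weights = {}
    for codons in SYNONYMOUS.values():
        top = max(ref_counts.get(c, 0) for c in codons)
        for c in codons:
            if top > 0 and ref_counts.get(c, 0) > 0:
                weights[c] = ref_counts[c] / top
    return weights


def cai(sequence, weights):
    """Geometric mean of w over the codons of `sequence` that have a weight."""
    usable = len(sequence) - len(sequence) % 3
    codons = [sequence[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


# Hypothetical codon counts from a highly expressed host reference set.
ref = {"AAA": 20, "AAG": 80, "GAA": 30, "GAG": 70, "TTT": 40, "TTC": 60}
w = relative_adaptiveness(ref)

preferred = cai("AAGGAGTTC", w)      # only the host-preferred codons
nonpreferred = cai("AAAGAATTT", w)   # only the non-preferred codons
print(preferred, nonpreferred)
```

A gene built from host-preferred codons scores 1.0, and the score falls as non-preferred codons are substituted in, which is the sense in which a declining CAI indicates codon usage drifting away from the host's preferences.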
Funding: Supported by the National Natural Science Foundation of China (90717009).
Abstract: BARD1 (BRCA1-associated RING domain protein 1), an important animal tumor suppressor gene associated with many kinds of cancers, has been intensively studied for decades. Surprisingly, a homolog of BARD1 was found in plants and was renamed AtROW1 (repressor of Wuschel-1) in light of its extremely important function in plant stem cell homeostasis. Although great advances have been made on human BARD1, the function of this animal tumor suppressor-like gene in plants is not well studied and needs to be further elucidated. Here, we review and summarize past and present work regarding this protein. Apart from its previously proposed role in DNA repair, it has recently been found to be essential for shoot and root stem cell development and differentiation in plants. The study of AtROW1 in plants may provide an ideal model for further elucidating the functional mechanism of BARD1 in mammals.
Funding: Supported by the Natural Science Foundation of China (Nos. 21602162 and 31690090), the National Science and Technology Major Project (No. 2016ZX08005003-001), and the Fundamental Research Funds for the Central Universities (No. 104862016).
Abstract: Due to the economic value of natural textile fiber, cotton has attracted much research attention, which has led to the publication of two diploid genomes and two tetraploid genomes. These large datasets facilitate functional genomic study in cotton, and allow researchers to investigate cotton genome structure, gene expression, and protein function on a global scale using high-throughput methods. In this review, we summarize recent studies of cotton genomes. Population genomic analyses revealed the domestication history of cultivated upland cotton and the roles of transposable elements in cotton genome evolution. Alternative splicing of cotton transcriptomes was evaluated genome-wide. Several important gene families like MYC, NAC, Sus and GhPLDal were systematically identified and classified based on genetic structure and biological function. High-throughput proteomics also unraveled the key functional proteins correlated with fiber development. Functional genomic studies have provided unprecedented insights into global-scale methods for cotton research.
Funding: Supported by grants from the National Natural Science Foundation of China (32325009, 32170420), the Peking-Tsinghua Center for Life Sciences, and the State Key Laboratory of Protein and Plant Gene Research to WZ, and grants from the China Postdoctoral Science Foundation (2023M730082, BX20230026) to SW.
Abstract: Kallima butterflies are famous for their leaf-mimicking wing patterns. Yet the characterization of Kallima species is still under debate owing to their high phenotypic similarity. With the release of the K. inachus reference genome, phylogenetic studies based on genome-wide data have been carried out, improving the understanding of the evolutionary relationships within the genus Kallima. However, we noticed some conflict between genome-based phylogenies and morphological classifications in butterflies. We examined the cause of this conflict by conducting an in-depth study of the relationships among Kallima butterflies to test possible reticulate phylogenetic topologies. We constructed phylogenies based on various datasets (including SNPs in single-copy genes, coding sequences, neutral regions and all remaining sites across the genome) to compare the topologies, revealing the complex evolutionary history of Kallima butterflies. Our results suggest that the reticulate species topology may constitute a pervasive pattern present not only in species with adaptive radiations but also in gradually evolving species, with Kallima butterflies as an example.
Funding: Supported by the National Natural Science Foundation of China (31621062 to W.W., 32070482 to X.Y.L.), Chinese Academy of Sciences ("Light of West China" to X.Y.L., XDB13000000 to W.W.), Yunnan Provincial Science and Technology Department (Talent Project of Yunnan: 202105AC160039), and Biodiversity Conservation Program of the Ministry of Ecology and Environment, China (China BON-Butterflies).
Abstract: Swallowtail butterflies (Papilionidae) are a historically significant butterfly group due to their colorful wing patterns, extensive morphological diversity, and phylogenetically important position as a sister group to all other butterflies, and they have been widely studied with regard to ecological adaptation, phylogeny, genetics, and evolution. Notably, they contain a unique class of pigments, i.e., papiliochromes, which contribute to their color diversity and various biological functions such as predator avoidance and mate preference. To date, however, the genomic and genetic basis of their color diversity and papiliochrome origin in a phylogenetic and evolutionary context remains largely unknown. Here, we obtained high-quality reference genomes of 11 swallowtail butterfly species covering all tribes of Papilioninae and Parnassiinae using long-read sequencing technology. Combined with previously published butterfly genomes, we obtained robust phylogenetic relationships among tribes, overcoming the challenges of incomplete lineage sorting (ILS) and gene flow. Comprehensive genomic analyses indicated that the evolution of Papilionidae-specific conserved non-exonic elements (PSCNEs) and transcription factor binding sites (TFBSs) of patterning and transporter/cofactor genes, together with the rapid evolution of transporters/cofactors, likely promoted the origin and evolution of papiliochromes. These findings not only provide novel insights into the genomic basis of color diversity, especially papiliochrome origin in swallowtail butterflies, but also provide important data resources for exploring the evolution, ecology, and conservation of butterflies.