Funding: Supported by the Key Program of the National Natural Science Foundation of China (Grant No. 82130111); the National Natural Science Foundation of China (Grant No. 81803716); the Qi-Huang Chief Scientist Project of the National Administration of Traditional Chinese Medicine, China (2020); and the SIMM-SHUTCM Traditional Chinese Medicine Innovation Joint Research Program, China (Grant No. E2G809H).
Abstract: Pheretima, also called "earthworms", is a well-known animal-derived traditional Chinese medicine that is used in over 50 Chinese patent medicines (CPMs) in the Chinese Pharmacopoeia (2020 edition). However, its zoological origin is unclear, both in the herbal market and in CPMs. In this study, we developed a strategy that integrates in-house annotated protein databases, constructed from publicly archived RNA sequencing data of closely related species, with several sequencing algorithms (restricted search, open search, and de novo) to characterize the phenotype of natural peptides of three major commercial species of Pheretima: Pheretima aspergillum (PA), Pheretima vulgaris (PV), and Metaphire magna (MM). We identified 10,477 natural peptides in the PA samples, 7,451 in the PV samples, and 5,896 in the MM samples. Five specific signature peptides were screened and then validated using synthetic peptides; these demonstrated robust specificity for the authentication of PA, PV, and MM. Finally, all marker peptides were successfully applied to identify the zoological origins of Brain Heart capsules and Xiaohuoluo pills, revealing that inconsistent Pheretima species are used in these CPMs. In conclusion, our integrated strategy could be used for the in-depth characterization of natural peptides of other animal-derived traditional Chinese medicines, especially non-model species with poorly annotated protein databases.
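The screening of species-specific signature peptides described in this abstract can be illustrated with a small set-based sketch. This is not the authors' pipeline: the function name `signature_peptides`, the input layout, and the peptide sequences are all hypothetical, and the sketch assumes that a candidate marker is simply a peptide observed in one species and in none of the others. Validation against synthetic peptides, as reported in the abstract, would still be required.

```python
from typing import Dict, Set


def signature_peptides(peptides_by_species: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each species, return the peptides observed in it and in no other species.

    Hypothetical helper for illustration only; real marker screening would also
    weigh identification confidence, reproducibility, and detectability.
    """
    result: Dict[str, Set[str]] = {}
    for species, peptides in peptides_by_species.items():
        # Pool every peptide observed in the other species.
        seen_elsewhere: Set[str] = set()
        for other, other_peptides in peptides_by_species.items():
            if other != species:
                seen_elsewhere |= other_peptides
        # Candidate markers are those never seen elsewhere.
        result[species] = peptides - seen_elsewhere
    return result


if __name__ == "__main__":
    # Made-up sequences; the species codes follow the abstract (PA, PV, MM).
    observed = {
        "PA": {"LLGPK", "AVDEFR", "GGTSLK"},
        "PV": {"LLGPK", "TTNWER"},
        "MM": {"GGTSLK", "QPHLVK"},
    }
    for species, markers in sorted(signature_peptides(observed).items()):
        print(species, sorted(markers))
```

Pooling the other species' peptides before subtracting keeps the logic unchanged when more species are added.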
Funding: Supported by the National Institutes of Health, USA (NIH, P30 DK063491).
Abstract: Pediatric central nervous system tumors are the most common tumors in children; they constitute 15%–20% of all malignancies in children and are the leading cause of cancer-related deaths in children. Proteogenomics is an emerging field of biological research that combines proteomics, genomics, and transcriptomics to aid in the discovery and identification of biomarkers for diagnostic and therapeutic purposes. Integrative proteogenomic analysis of pediatric tumors has identified underlying biological processes and potential treatments, as well as the functional effects of the somatic mutations and copy number variations that drive tumorigenesis.
Funding: This work was partially supported by research grants from the National Natural Science Foundation of China (Project Nos. 42030404 and 41425021) and the Ministry of Science and Technology of the People's Republic of China (Project No. 2015CB954003). D-ZW was also supported by the Ten Thousand Talents Program for leading talents in science and technological innovation.
Abstract: Diatoms are unicellular eukaryotic phytoplankton that account for approximately 20% of global carbon fixation and 40% of marine primary productivity; thus, they are essential for global carbon biogeochemical cycling and climate. The availability of ten diatom genome sequences has facilitated evolutionary, biological, and ecological research over the past decade; however, a complementary map of the diatom proteome with direct measurements of proteins and peptides is still lacking. Here, we present a proteome map of the model marine diatom Thalassiosira pseudonana using high-resolution mass spectrometry combined with a proteogenomic strategy. In-depth proteomic profiling of three different growth phases and three nutrient-deficient samples identified 9526 proteins, accounting for ~81% of the predicted protein-coding genes. Proteogenomic analysis identified 1235 novel genes, 975 revised genes, 104 splice variants, and 234 single amino acid variants. Furthermore, our quantitative proteomic analysis experimentally demonstrated that a considerable number of novel genes were differentially translated under different nutrient conditions. These findings substantially improve the genome annotation of T. pseudonana and provide insights into new biological functions of diatoms. This relatively comprehensive diatom proteome catalog will complement available diatom genome and transcriptome data to advance biological and ecological research on marine diatoms.
Funding: Supported by the Natural Science Foundation of Jiangsu Province (BK20221334); the Jiangsu Agricultural Science and Technology Innovation Fund (CX(21)2023); the Science Technology and Innovation Committee of Shenzhen (JCYJ20210324115408023); the Major Project of Natural Science Research in Colleges of Jiangsu Province (20KJA220001); and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX23_1115).
Abstract: Jasmonic acid (JA) is a crucial phytohormone that regulates the balance between plant development and resistance. JA biosynthesis, perception, and signal transduction pathways show both analogous and distinctive characteristics in herbaceous and woody plants. However, most research has focused on the function of JA in model or herbaceous plants. Consequently, there is a significant paucity of studies investigating JA regulatory networks in woody plants, particularly concerning post-transcriptional regulatory events such as alternative splicing (AS). This review summarizes advances in understanding how JA signals regulate plant development across various woody species, comparing the analogous features and regulatory differences with their herbaceous counterparts. In addition, we summarize the involvement of AS events, including splicing factors (SFs) and transcripts, in the JA regulatory network, highlighting the effectiveness of high-throughput proteogenomic methods. A better understanding of the JA signaling pathway in woody plants has important implications for forestry production, including optimizing plant management and enhancing secondary metabolite production.
Funding: Supported by the Floyd and Mary Schwall Fellowship in Medical Research and grants NIH R01-GM082802, R01-GM108711, R01-CA189532, and NSF DMS-1148643; partly supported by grant U24 CA 210093 from the National Cancer Institute Clinical Proteomic Tumor Analysis Consortium (CPTAC).
Abstract: We propose a novel conditional graphical model, spaceMap, to construct gene regulatory networks from multiple types of high-dimensional omic profiles. A motivating application is to characterize the perturbation of downstream protein levels by DNA copy number alterations (CNAs) in tumors. Through a penalized multivariate regression framework, spaceMap jointly models high-dimensional protein levels as responses and high-dimensional CNAs as predictors. In this setup, spaceMap infers an undirected network among proteins together with a directed network encoding how CNAs perturb the protein network. spaceMap can be applied to learn other types of regulatory relationships from high-dimensional molecular profiles, especially those exhibiting hub structures. Simulation studies show that spaceMap has greater power to detect regulatory relationships than competing methods. Additionally, spaceMap includes a network analysis toolkit for biological interpretation of inferred networks. We apply spaceMap to the CNA, gene expression, and proteomics data sets from the CPTAC-TCGA breast (n = 77) and ovarian (n = 174) cancer studies. Each cancer exhibits disruption of 'ion transmembrane transport' and 'regulation from RNA polymerase II promoter' by CNA events unique to that cancer. Moreover, using protein levels as the response yields a more functionally enriched network than using RNA expression in both cancer types. The network results also help to pinpoint crucial cancer genes and provide insights into the functional consequences of important CNAs in breast and ovarian cancers.
Funding: Supported by the National Natural Science Foundation of China (81673466, 81821005, 32322048, 22225702, 92153302, 32471497, 82404655, and 82273951); Guangdong High-level New R&D Institute (2019B090904008); Guangdong High-level Innovative Research Institute (2021B0909050003); the Science and Technology Commission of Shanghai Municipality (18431907100 and 19430750100); the National Key Research and Development Program of China (2020YFE0202200); the Program of Shanghai Academic Research Leader (2XD1420900); the Shanghai Rising-Star Program (22QA1411100); the Youth Innovation Promotion Association (CAS2021276); the Sanofi scholarship program; the Shandong Laboratory Program (SYS202205); the Taishan Scholars Program (tstp0648); the Innovative Research Team of High-level Local Universities in Shanghai; and the Shanghai Science and Technology Development Funds (24YF2755300).
Abstract: Acute myeloid leukaemia (AML) is characterized mainly by an increase in the number of myeloid cells in the bone marrow and a decrease in the number of mature cells; AML accounts for 28% of leukaemia cases and has a five-year survival rate of only 30.5% [1]. The prognosis of AML patients is mostly poor, and drug resistance eventually emerges with the long-term use of chemotherapy or targeted therapy; this drug resistance represents a daunting challenge in the management of AML [2,3].
Funding: This work was supported by the National Key Research and Development Program (2016YFA0501304), the National Natural Science Foundation of China (grant no. 31570829), and the Strategic Priority Research Program of the Chinese Academy of Sciences (grant no. XDB14030202).
Abstract: Diatoms comprise a diverse and ecologically important group of eukaryotic phytoplankton that contributes significantly to marine primary production and global carbon cycling. Phaeodactylum tricornutum is commonly used as a model organism for studying diatom biology. Although its genome was sequenced in 2008, a high-quality genome annotation is still not available for this diatom. Here we report the development of an integrated proteogenomic pipeline and its application to improve the annotation of the P. tricornutum genome using mass spectrometry (MS)-based proteomics data. Our proteogenomic analysis unambiguously identified approximately 8300 genes and revealed 606 novel proteins, 506 revised genes, 94 splice variants, 58 single amino acid variants, and a holistic view of post-translational modifications in P. tricornutum. We experimentally confirmed a subset of novel events and obtained MS evidence for more than 200 micropeptides in P. tricornutum. These findings expand the genomic landscape of P. tricornutum and provide a rich resource for the study of diatom biology. The proteogenomic pipeline we developed in this study is applicable to any sequenced eukaryote and thus represents a significant contribution to the toolset for eukaryotic proteogenomic analysis. The pipeline and its source code are freely available at https://sourceforge.net/projects/gapeproteogenomic.
Funding: Funded by the National Key Research and Development Program of China (2022YFF1003100-02, 2020YFE0202900); the National Natural Science Foundation of China (32172543, 31830081, 22274130, 32202411); the Fundamental Research Funds for the Central Universities (JCQY201901, KYZ201888); the Jiangsu Agriculture Science and Technology Innovation Fund (CX(19)2028); the seed industry promotion project of Jiangsu (JBGS(2021)022); the guidance foundation of the Hainan Institute of Nanjing Agricultural University (NAUSY-MS08); the Earmarked Fund for China Agriculture Research System (CARS-28); and the Priority Academic Program Development of Jiangsu Higher Education Institutions.
Abstract: Pear is an important fruit tree that is widely distributed around the world. The first pear genome map was reported by our laboratory approximately 10 years ago. To further study global protein expression patterns in pear, we generated pear proteome data from 24 major tissues. The tissue-resolved profiles provided evidence for the expression of 17953 proteins. We identified 4294 new coding events and improved the pear genome annotation via a proteogenomic strategy based on 18090 peptide spectra with peptide spectrum matches > 1. Among eight randomly selected new short coding open reading frames that were expressed in the style, four promoted and one inhibited the growth of pear pollen tubes. Based on gene coexpression module analysis, we explored the key genes associated with important agronomic traits, such as stone cell formation in fruits. The network regulating the synthesis of lignin, a major component of stone cells, was reconstructed, and receptor-like kinases were implicated as core factors in this regulatory network. Moreover, we constructed the online database PearEXP (http://www.peardb.org.cn) to enable access to the pear proteogenomic resources. This study provides a paradigm for in-depth proteogenomic studies of woody plants.
Abstract: Water pollution is a significant problem in almost all parts of the world. The complexity of anthropogenic activities along a watershed tends to turn the river into a giant disposal container. Rivers are under threat of degradation, mainly due to heavy metal pollution from anthropogenic activities. Heavy metals are harmful when they pollute waters because they are accumulative, toxic, and carcinogenic in water bodies and biota. Various biomarkers for evaluating heavy metal contamination in aquatic organisms have been widely reported. The use of molecular biomarkers has become more popular in recent years and remains promising for future work. Proteomic and genomic approaches supported by bioinformatics have expanded with technological advances in DNA and RNA sequencing and mass spectrometry-based proteomics. Therefore, this article reviews studies that use biomarker approaches in a range of aquatic organisms. This review is expected to serve as a reference for and encourage future biomarker research, especially for monitoring heavy metal pollution in rivers.