High-throughput transcriptomics has evolved from bulk RNA-seq to single-cell and spatial profiling, yet its clinical translation still depends on effective integration across diverse omics and data modalities. Emerging foundation models and multimodal learning frameworks are enabling scalable and transferable representations of cellular states, while advances in interpretability and real-world data integration are bridging the gap between discovery and clinical application. This paper outlines a concise roadmap for AI-driven, transcriptome-centered multi-omics integration in precision medicine (Figure 1).
The journal Genomics, Proteomics & Bioinformatics (GPB) invites leading scholars to contribute high-quality manuscripts for a special issue on “AI+BT for Big Clinical Omics Data”, scheduled for publication in the autumn of 2026. This special issue seeks submissions that focus on integrating artificial intelligence (AI) and biotechnologies (BT) to substantially improve the collection, modelling, analysis, and application of large-scale clinical omics data. The goal is to address the challenges posed by the high-dimensional and dynamic nature of big clinical omics data and to explore their potential to advance the diagnosis and treatment of complex diseases.
The rapid growth of population-scale whole-genome resequencing, RNA sequencing, bisulfite sequencing, and metabolomic and proteomic profiling has led quantitative genetics into the era of big omics data. Association analyses of omics data, such as genome-, transcriptome-, proteome-, and methylome-wide association studies, along with integrative analyses of multiple omics datasets, require various bioinformatics tools, which rely on advanced programming skills and command-line interfaces and thus pose challenges for wet-lab biologists. Here, we present EasyOmics, a stand-alone R Shiny application with a user-friendly interface that enables wet-lab biologists to perform population-scale omics data association, integration, and visualization. The toolkit incorporates multiple functions designed to meet the increasing demand for population-scale omics data analyses, including data quality control, heritability estimation, genome-wide association analysis, conditional association analysis, omics quantitative trait locus mapping, omics-wide association analysis, omics data integration, and visualization. A wide range of publication-quality graphs can be prepared in EasyOmics by pointing and clicking. EasyOmics is platform-independent software that can be run under all operating systems, with a Docker container for quick installation. It is freely available to non-commercial users on Docker Hub at https://hub.docker.com/r/yuhan2000/easyomics.
Biological databases play an increasingly critical role in biological research. Myceliophthora thermophila is an excellent thermophilic fungal chassis for industrial enzyme production and plant biomass-based chemical synthesis. The lack of a dedicated public database has made access to and reanalysis of M. thermophila data difficult. To bridge this gap, we developed MTD (https://mtd.biodesign.ac.cn/), a cloud-based omics database and interactive platform for M. thermophila. MTD integrates comprehensive genome annotations, sequence-based predictions, transcriptome data, curated experimental descriptions, and bioinformatics analysis tools, offering a one-stop solution with a ‘top-down’ search strategy to streamline M. thermophila research. The platform supports data reproduction, rapid querying, and in-depth mining of existing transcriptome datasets. Based on analyses using data and tools in MTD, we identified shifts in metabolic allocation in a glucoamylase hyperproduction strain of M. thermophila, highlighting changes in the fatty acid biosynthesis and amino acid biosynthesis pathways, which provide new insights into the underlying phenotypic alterations. As a pioneering resource, MTD marks a key advancement in M. thermophila research and sets a model for developing similar databases for other species.
Proteins play a pivotal role in coordinating the functions of organisms, essentially governing their traits, as the dynamic arrangement of diverse amino acids leads to a multitude of folded configurations within peptide chains. Despite dynamic changes in the amino acid composition of an individual protein (referred to as AAP) and great variance in protein expression levels under different conditions, our study, utilizing transcriptomics data from four model organisms, uncovers surprising stability in the overall amino acid composition of the total cellular proteins (referred to as AACell). Although this value may vary between species, we observed no significant differences among distinct strains of the same species. This indicates that organisms enforce system-level constraints to maintain a consistent AACell, even amid fluctuations in AAP and protein expression. Further exploration of this phenomenon promises insights into the intricate mechanisms orchestrating cellular protein expression and adaptation to varying environmental challenges.
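The distinction between AAP and AACell can be made concrete with a small sketch. The abstract does not give the exact formula, so the definition used here is an assumption: AACell is the amino acid frequency pooled over all proteins, with each protein's residue counts weighted by its expression level (transcript abundance standing in for protein abundance). The function names `composition` and `cell_composition` and the toy sequences are hypothetical.

```python
from collections import Counter

def composition(seq):
    """Per-protein amino acid composition (AAP): residue fractions of one sequence."""
    total = len(seq)
    return {aa: count / total for aa, count in Counter(seq).items()}

def cell_composition(proteins, expression):
    """Expression-weighted pooled composition (assumed definition of AACell).

    proteins:   dict mapping protein name -> amino acid sequence
    expression: dict mapping protein name -> non-negative expression level
    """
    weighted = Counter()
    for name, seq in proteins.items():
        level = expression.get(name, 0.0)
        for aa, count in Counter(seq).items():
            weighted[aa] += level * count  # residues contributed by this protein
    total = sum(weighted.values())
    if total == 0:
        raise ValueError("no expressed residues to pool")
    return {aa: w / total for aa, w in weighted.items()}

# Toy data: two hypothetical proteins with very different AAP.
proteins = {"p1": "AAK", "p2": "KKKK"}
expression = {"p1": 2.0, "p2": 1.0}

aap = {name: composition(seq) for name, seq in proteins.items()}
aacell = cell_composition(proteins, expression)
print(aap)
print(aacell)
```

Under this definition, AACell depends on both the sequences and the expression profile, which is why stability of AACell across conditions is not automatic when expression levels change.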
Osteoarthritis (OA) is a degenerative joint disease with significant clinical and societal impact. Traditional diagnostic methods, including subjective clinical assessments and imaging techniques such as X-rays and MRIs, are often limited in their ability to detect early-stage OA or capture subtle joint changes. These limitations result in delayed diagnoses and inconsistent outcomes. Additionally, the analysis of omics data is challenged by the complexity and high dimensionality of biological datasets, making it difficult to identify key molecular mechanisms and biomarkers. Recent advancements in artificial intelligence (AI) offer transformative potential to address these challenges. This review systematically explores the integration of AI into OA research, focusing on applications such as AI-driven early screening and risk prediction from electronic health records (EHR), automated grading and morphological analysis of imaging data, and biomarker discovery through multi-omics integration. By consolidating progress across clinical, imaging, and omics domains, this review provides a comprehensive perspective on how AI is reshaping OA research. The findings have the potential to drive innovations in personalized medicine and targeted interventions, addressing longstanding challenges in OA diagnosis and management.
Objective: To use gene chip data from Pseudomonas aeruginosa as research samples and explore them at the omics level, with the aim of elucidating the co-expression network characteristics of the virulence genes exoS and exoU of P. aeruginosa in the lower respiratory tract from the perspective of molecular biology and identifying their key regulatory genes. Methods: From March 2016 to May 2018, 312 patients with P. aeruginosa infection of the lower respiratory tract who were admitted to the Department of Respiratory Medicine of Baogang Hospital and received follow-up treatment in the hospital were selected as subjects by cluster sampling. Alveolar lavage fluid and sputum collected from these patients were used as biological specimens. P. aeruginosa genes were detected with oligonucleotide probes, and the chip data were pre-processed. A total of 8 common antibiotics against Gram-negative bacteria (ceftazidime, gentamicin, piperacillin, amikacin, ciprofloxacin, levofloxacin, doripenem, and ticarcillin) were selected to determine the drug resistance of the biological specimens. The MCODE algorithm was used to construct a co-expression network model of the drug-resistance genes centered on exoS/exoU. Results: The expression level of exoS/exoU in the drug-resistance group was significantly higher than that in the non-resistance group (p<0.05). The top 5 differentially expressed genes in the alveolar lavage fluid specimens from the drug-resistance group were, from high to low, RAC1, ITGB1, ITGB5, CRK, and IGF1R. In the sputum specimens, the top 5 differentially expressed genes were RAC1, CRK, IGF1R, ITGB1, and ITGB5. In the alveolar lavage fluid specimens, only RAC1 was positively correlated with the expression of exoS and exoU (p<0.05). In the sputum specimens, RAC1, ITGB1, ITGB5, CRK, and IGF1R were positively correlated with the expression of exoS and exoU (p<0.05). The genes included in the co-expression network were exoS, exoU, RAC1, ITGB1, ITGB5, CRK, CAMK2D, RHOA, FLNA, IGF1R, TGFBR2, and FOS. Among them, RAC1 had the highest regulatory ability score (72.00) and the largest number of regulated genes (6), followed by ITGB1, ITGB5, and CRK. Conclusions: The high expression of exoS and exoU in the sputum specimens suggests that P. aeruginosa has a higher probability of becoming resistant to antibiotics; RAC1, ITGB1, ITGB5, and CRK may be the key genes that regulate the expression of exoS and exoU.
Gastrointestinal (GI) cancers are a set of diverse diseases affecting many parts/organs. The five most frequent GI cancer types are esophageal cancer, gastric cancer (GC), liver cancer, pancreatic cancer, and colorectal cancer (CRC); together, they give rise to 5 million new cases and cause the death of 3.5 million people annually. We provide information about molecular changes crucial to tumorigenesis, tumor behavior, and prognosis. During the formation of cancer cells, the genomic changes are microsatellite instability with multiple chromosomal arrangements in GC and CRC. The genomically stable subtype is observed in GC and pancreatic cancer. Besides these genomic subtypes, CRC has epigenetic modification (hypermethylation) associated with a poor prognosis. The pathway information highlights the functions shared by GI cancers, such as apoptosis; focal adhesion; and the p21-activated kinase, phosphoinositide 3-kinase/Akt, transforming growth factor beta, and Toll-like receptor signaling pathways. These pathways relate to survival, cell proliferation, and cell motility. In addition, the immune response and inflammation are essential elements of the shared functions. We also retrieved information on protein-protein interactions from the STRING database and found that the proteins Akt1, catenin beta 1 (CTNNB1), E1A binding protein P300 (EP300), tumor protein p53 (TP53), and TP53 binding protein 1 (TP53BP1) are central nodes in the network. The protein expression of these genes is associated with overall survival in some GI cancers. Low TP53BP1 expression in CRC, high EP300 expression in esophageal cancer, and increased expression of Akt1/TP53 or low CTNNB1 expression in GC are associated with a poor prognosis. The Kaplan-Meier plotter database also confirmed the association between expression of the five central genes and GC survival rates. In conclusion, GI cancers are very diverse at the molecular level. However, the shared mutations and protein pathways might be used to better understand these cancers and to reveal diagnostic/prognostic markers or drug targets.
Background: Precision medicine (PM) has taken center stage in healthcare since the completion of the genomic project. Developed countries have gradually integrated PM into mainstream patient management. However, Nigeria still grapples with wide acceptance, key translational research, and implementation of PM. This study sought to explore the knowledge of and attitude toward PM among pharmacists as key stakeholders in the healthcare team. Methods: A cross-sectional study was conducted in selected tertiary hospitals across the country. A 21-item semi-structured questionnaire was administered by hybrid online and physical methods, and the results were analyzed with Statistical Package for the Social Sciences Version 25. Descriptive statistics were used to summarize the data. A chi-square test was employed to determine the association between knowledge of PM and the sociodemographic characteristics of the study population. Results: A total of 167 hospital pharmacists participated in the study. A high proportion of the participants were familiar with artificial intelligence (91.75%), pharmacogenomics (84.5%), and precision medicine (61%). Overall, 38.9% of the pharmacists had good knowledge while 13.2% had poor knowledge of PM and associated terms. The level of knowledge was not significantly associated with gender (χ² = 3.21, p = 0.201), age (χ² = 5, p = 0.27), marital status (χ² = 3.21, p = 0.201), or professional level (χ² = 6.85, p = 0.144). The most important value of precision medicine to hospital pharmacists was the ability to minimize the impact of disease through preventive medicine (49%), while a large portion were pursuing and/or actively planning to pursue additional education in precision medicine. Conclusions: There is a highly positive attitude toward the prospect of PM among hospital pharmacists in Nigeria. Education modules in this field are highly recommended, as most do not have a holistic knowledge of the terms used in PM. More research aimed at translating PM knowledge into clinical practice is also recommended.
Cancer is a complex and heterogeneous disease characterized by various genetic and epigenetic alterations. Early diagnosis, accurate subtyping, and staging are essential for effective, personalized treatment and improved survival rates. Traditional diagnostic methods, such as biopsies, are invasive and carry operational risks that hinder repeated use, underscoring the need for noninvasive and personalized alternatives. In response, this study integrates transcriptomic data into human genome-scale metabolic models (GSMMs) to derive patient-specific flux distributions, which are then combined with genomic, proteomic, and fluxomic (JX) data to develop a robust multi-omic classifier for lung cancer subtyping and early diagnosis. The JX classifier is further enhanced by analyzing heterogeneous datasets from RNA sequencing and microarray analyses derived from both tissue samples and cell culture experiments, thereby enabling the identification of key marker features and enriched pathways such as lipid metabolism and energy production. This integrated approach not only demonstrates high performance in distinguishing lung cancer subtypes and early-stage disease but also proves robust when applied to limited pancreatic cancer data. By linking genotype to phenotype, GSMM-driven flux analysis overcomes challenges related to metabolome data scarcity and platform variability by proposing marker processes and reactions for further investigation, ultimately facilitating noninvasive diagnostics and the identification of actionable biomarkers for targeted therapeutic intervention. These findings offer significant promise for streamlining clinical workflows and enabling personalized therapeutic strategies, and they highlight the potential of our versatile workflow for unveiling novel biomarker landscapes in less studied diseases.
Screening biomolecular markers from high-dimensional biological data is one of the long-standing tasks of biomedical translational research. With its advantages in both feature shrinkage and biological interpretability, the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm is one of the most popular methods for clinical biomarker development. However, in practice, applying LASSO to omics data with high dimensionality and low sample size often yields an excessive number of predictive variables, leading to overfitting of the model. Here, we present VSOLassoBag, a wrapped LASSO approach that integrates an ensemble learning strategy to help select efficient and stable variables with high confidence from omics-based data. Using a bagging strategy in combination with a parametric method or an inflection point search method, VSOLassoBag can integrate and vote on variables generated from multiple LASSO models to determine the optimal candidates. The application of VSOLassoBag to both simulated and real-world datasets shows that the algorithm can effectively identify markers for either case-control binary classification or prognosis prediction. In addition, in comparison with multiple existing algorithms, VSOLassoBag shows comparable performance under different scenarios while selecting fewer features. In summary, VSOLassoBag, which is available at https://seqworld.com/VSOLassoBag/ under the GPL v3 license, provides an alternative strategy for selecting reliable biomarkers from high-dimensional omics data. For users' convenience, we implement VSOLassoBag as an R package that provides multithreading computing configurations.
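The bagging-and-voting idea described in this abstract can be illustrated with a minimal standard-library sketch. This is not the VSOLassoBag R package or its API: the function names are hypothetical, the LASSO solver is a plain coordinate-descent loop that assumes roughly centered features and no intercept, and the fixed 0.5 voting cutoff is a stand-in for the parametric and inflection-point cutoffs mentioned in the abstract.

```python
import random

def soft_threshold(z, lam):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(X, y, lam, sweeps=50):
    """Coordinate descent for (1/2n)*||y - X b||^2 + lam*||b||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    cols = [[row[j] for row in X] for j in range(p)]
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, kept up to date
    for _ in range(sweeps):
        for j in range(p):
            col = cols[j]
            norm = sum(v * v for v in col) / n
            if norm == 0.0:
                continue
            rho = sum(c * r for c, r in zip(col, resid)) / n + norm * beta[j]
            new = soft_threshold(rho, lam) / norm
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - delta * c for r, c in zip(resid, col)]
                beta[j] = new
    return beta

def selection_frequency(X, y, lam, n_bags=30, seed=0):
    """Fit LASSO on bootstrap resamples; return how often each feature is kept."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        beta = lasso_cd([X[i] for i in idx], [y[i] for i in idx], lam)
        for j, b in enumerate(beta):
            if b != 0.0:
                counts[j] += 1
    return [c / n_bags for c in counts]

# Simulated data: only features 0 and 1 carry signal; the rest are noise.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(6)] for _ in range(80)]
y = [3 * row[0] - 2 * row[1] + data_rng.gauss(0, 0.1) for row in X]

freq = selection_frequency(X, y, lam=0.5)
selected = [j for j, f in enumerate(freq) if f >= 0.5]  # assumed voting cutoff
print(freq)
print(selected)
```

Voting across resamples favors variables that are selected consistently, which is the intuition behind using an ensemble to reduce the surplus of predictors a single LASSO fit can return.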
Dear Editor, Multi-omics association analysis is a key method in crop germplasm research, helping to elucidate the regulatory mechanisms of agronomic traits (Liu et al., 2020; Liang et al., 2021). However, most existing multi-omics association studies focus on omics data under a single condition, posing challenges in identifying stress-related agronomically important genes. This difficulty mainly arises from the increased complexity of multi-omics analyses when comparing control and stress conditions.
Explainable artificial intelligence aims to interpret how machine learning models make decisions, and many model explainers have been developed in the computer vision field. However, understanding of the applicability of these model explainers to biological data is still lacking. In this study, we comprehensively evaluated multiple explainers by interpreting pre-trained models for predicting tissue types from transcriptomic data and by identifying the top contributing genes from each sample with the greatest impacts on model prediction. To improve the reproducibility and interpretability of results generated by model explainers, we proposed a series of optimization strategies for each explainer on two different model architectures: multilayer perceptron (MLP) and convolutional neural network (CNN). We observed three groups of explainer and model architecture combinations with high reproducibility. Group II, which contains three model explainers on aggregated MLP models, identified top contributing genes in different tissues that exhibited tissue-specific manifestation and were potential cancer biomarkers. In summary, our work provides novel insights and guidance for exploring biological mechanisms using explainable machine learning models.
Chronic diseases such as heart disease, cancer, and diabetes are leading drivers of mortality worldwide, underscoring the need for improved efforts around early detection and prediction. The pathophysiology and management of chronic diseases have benefitted from emerging fields in molecular biology like genomics, transcriptomics, proteomics, glycomics, and lipidomics. The complex biomarker and mechanistic data from these "omics" studies present analytical and interpretive challenges, especially for traditional statistical methods. Machine learning (ML) techniques offer considerable promise in unlocking new pathways for data-driven chronic disease risk assessment and prognosis. This review provides a comprehensive overview of state-of-the-art applications of ML algorithms for chronic disease detection and prediction across datasets, including medical imaging, genomics, wearables, and electronic health records. Specifically, we review and synthesize key studies leveraging major ML approaches ranging from traditional techniques such as logistic regression and random forests to modern deep learning neural network architectures. We consolidate existing literature to date around ML for chronic disease prediction to synthesize major trends and trajectories that may inform both future research and clinical translation efforts in this growing field. While highlighting the critical innovations and successes emerging in this space, we identify the key challenges and limitations that remain to be addressed. Finally, we discuss pathways forward toward scalable, equitable, and clinically implementable ML solutions for transforming chronic disease screening and prevention.
Natural components, evolved to help organisms adapt and defend against threats, are also vital sources for drug discovery due to their diverse and potent bioactivities. In the present work, we proposed the Gene-encoded Natural Diverse Components Repository (GNDC, https://cbcb.cdutcm.edu.cn/gndc/), a primary and most extensive database dedicated to cataloging diverse natural components. GNDC currently catalogs over 234 million natural components that are organized into four specialized sub-databases: HerbalMDB for 2.32 million secondary metabolites, HerbalPDB for 229 million small peptides, HerbalRDB for 2.38 million small RNAs, and HerbalCDB for 0.26 million carbohydrates. By leveraging customized pipelines for high-throughput multi-omics data and AI technologies, the GNDC enables large-scale discovery and annotation of natural products from nuclear and organellar genomes of species listed in eight global pharmacopoeias and multi-resource data. Compared to existing resources, GNDC achieves a 10-fold increase in component yield and introduces over 200 million previously unreported components. To support this unprecedented data volume and complexity, state-of-the-art AI tools are seamlessly integrated to decipher and annotate vast data collections, such as classification and gene expression signature generation of millions of secondary metabolites. We envision that the GNDC will drive the transformation of drug discovery from an “experience-driven” approach to a “big data-driven” paradigm.
The aBIOTECH journal is pleased to announce that it will publish a Feature Issue on "AI in Crop Breeding". For this issue, we welcome submissions addressing the following research areas: integration of multi-omic big data; functional genomics and gene mining; phenotype prediction; intelligent sensing of crop information; and AI applications in breeding decision support. We welcome the following types of articles: timely reviews, research articles, letters, and step-by-step protocol(s).
The convergence of artificial intelligence (AI) and microbial therapeutics offers promising avenues for novel discoveries and therapeutic interventions. With the exponential growth of omics datasets and rapid advancements in AI technology, the next generation of AI is increasingly prevalent in microbiology research. In microbial research, AI is instrumental in the classification and functional annotation of microorganisms. Machine learning algorithms facilitate efficient and accurate categorization of microbial taxa, enabling the identification of functional traits and metabolic pathways within microbial communities. Additionally, AI-driven protein design strategies hold promise for engineering enzymes with enhanced catalytic activities and stabilities. By predicting protein structures, functions, and interactions, AI algorithms enable the rational design of proteins and enzymes tailored for specific applications. AI systems are already present in clinical microbiology laboratories in the form of expert rules used by some automated susceptibility testing and identification systems. In the future, microbiology technologists will rely more heavily on AI for initial screening, allowing them to focus on diagnostic challenges and complex technical interpretations. AI-driven approaches hold immense promise in advancing our understanding of microbial ecosystems, accelerating drug discovery processes, and fostering the development of groundbreaking therapeutic interventions. This review aims to summarize common algorithms in AI and their applications within microbiology and synthetic biology. We provide a comprehensive evaluation of AI’s utility in microbial research, discussing both its advantages and challenges. Finally, we explore future research directions and the bottlenecks faced by AI in the microbial field.
Funding: Funded by the State Key Research & Development Project-Youth Scientist program (2023YFD1202400); the National Science Foundation of China (32030006 and 32200503); the Taishan Young Scholar Program and Distinguished Overseas Young Talents Program from Shandong province (2024HWYQ-079); and the Agricultural Science and Technology Innovation Program (ASTIP-TRIC01) from the Chinese Academy of Agricultural Sciences.
Funding: Funded by the Strategic Priority Research Program of the Chinese Academy of Sciences (XDC0110300); the National Key R&D Program of China (2023YFC3403602 and 2022YFC2106000); the National Natural Science Foundation of China (32300529, 32270100, and 32271481); the Innovation Fund of Haihe Laboratory of Synthetic Biology (22HHSWSS00014); and the Tianjin Synthetic Biotechnology Innovation Capacity Improvement Project (TSBICIP-PTJJ-007-12).
Abstract: Biological databases play an increasingly critical role in biological research. Myceliophthora thermophila is an excellent thermophilic fungal chassis for industrial enzyme production and plant biomass-based chemical synthesis. The lack of a dedicated public database has made access to and reanalysis of M. thermophila data difficult. To bridge this gap, we developed MTD (https://mtd.biodesign.ac.cn/), a cloud-based omics database and interactive platform for M. thermophila. MTD integrates comprehensive genome annotations, sequence-based predictions, transcriptome data, curated experimental descriptions, and bioinformatics analysis tools, offering a one-stop solution with a ‘top-down’ search strategy to streamline M. thermophila research. The platform supports data reproduction, rapid querying, and in-depth mining of existing transcriptome datasets. Based on analyses using data and tools in MTD, we identified shifts in metabolic allocation in a glucoamylase hyperproduction strain of M. thermophila, highlighting changes in the fatty acid biosynthesis and amino acid biosynthesis pathways, which provide new insights into the underlying phenotypic alterations. As a pioneering resource, MTD marks a key advancement in M. thermophila research and sets a model for developing similar databases for other species.
Funding: This research was funded by the National Key R&D Program of China (2022YFC2106000); the National Natural Science Foundation of China (32300529, 32201242, 12326611); the Tianjin Synthetic Biotechnology Innovation Capacity Improvement Projects (TSBICIP-PTJS-001, TSBICIP-PTJJ-007); the Major Program of Haihe Laboratory of Synthetic Biology (22HHSWSS00021); and the Strategic Priority Research Program of the Chinese Academy of Sciences (XDC0120201).
Abstract: Proteins play a pivotal role in coordinating the functions of organisms, essentially governing their traits, as the dynamic arrangement of diverse amino acids leads to a multitude of folded configurations within peptide chains. Despite dynamic changes in the amino acid composition of an individual protein (referred to as AAP) and great variance in protein expression levels under different conditions, our study, utilizing transcriptomics data from four model organisms, uncovers surprising stability in the overall amino acid composition of the total cellular proteins (referred to as AACell). Although this value may vary between different species, we observed no significant differences among distinct strains of the same species. This indicates that organisms enforce system-level constraints to maintain a consistent AACell, even amid fluctuations in AAP and protein expression. Further exploration of this phenomenon promises insights into the intricate mechanisms orchestrating cellular protein expression and adaptation to varying environmental challenges.
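The distinction between AAP and AACell in the abstract above can be illustrated with a small sketch. The abstract does not give the formula, so the weighting used here (each protein's residue counts multiplied by its transcript abundance) is an assumption, and the function names `aap` and `aacell` as well as the toy sequences and expression values are hypothetical.

```python
from collections import Counter


def aap(sequence):
    """Amino acid composition of one protein, as fractions summing to 1."""
    counts = Counter(sequence)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def aacell(sequences, expression):
    """Composition of the total cellular protein pool.

    Assumption (not stated in the abstract): each residue count is weighted
    by the expression level of the corresponding gene.
    """
    weighted = Counter()
    for gene, seq in sequences.items():
        weight = expression.get(gene, 0.0)  # unmeasured genes contribute nothing
        for aa, n in Counter(seq).items():
            weighted[aa] += weight * n
    total = sum(weighted.values())
    if total == 0:
        raise ValueError("no expressed proteins")
    return {aa: value / total for aa, value in weighted.items()}


# Toy input: two hypothetical genes with made-up expression levels.
sequences = {"geneA": "MKKL", "geneB": "MAAA"}
expression = {"geneA": 1.0, "geneB": 3.0}

per_protein = {gene: aap(seq) for gene, seq in sequences.items()}
cell = aacell(sequences, expression)
```

Under this weighting, a highly expressed protein shifts AACell toward its own composition, which is why a stable AACell across conditions with changing expression is a nontrivial observation.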
Funding: Supported by the National Natural Science Foundation of China (82302757); the Shenzhen Science and Technology Program (JCY20240813145204006, SGDX20201103095600002, JCYJ20220818103417037, KJZD20230923115200002); the Shenzhen Key Laboratory of Digital Surgical Printing Project (ZDSYS201707311542415); and the Shenzhen Development and Reform Program (XMHT20220106001).
Abstract: Osteoarthritis (OA) is a degenerative joint disease with significant clinical and societal impact. Traditional diagnostic methods, including subjective clinical assessments and imaging techniques such as X-rays and MRIs, are often limited in their ability to detect early-stage OA or capture subtle joint changes. These limitations result in delayed diagnoses and inconsistent outcomes. Additionally, the analysis of omics data is challenged by the complexity and high dimensionality of biological datasets, making it difficult to identify key molecular mechanisms and biomarkers. Recent advancements in artificial intelligence (AI) offer transformative potential to address these challenges. This review systematically explores the integration of AI into OA research, focusing on applications such as AI-driven early screening and risk prediction from electronic health records (EHR), automated grading and morphological analysis of imaging data, and biomarker discovery through multi-omics integration. By consolidating progress across clinical, imaging, and omics domains, this review provides a comprehensive perspective on how AI is reshaping OA research. The findings have the potential to drive innovations in personalized medicine and targeted interventions, addressing longstanding challenges in OA diagnosis and management.
Abstract: Objective: To use gene chip data from Pseudomonas aeruginosa as research samples and explore them at the omics level, aiming to elucidate the co-expression network characteristics of the virulence genes exoS and exoU of Pseudomonas aeruginosa in the lower respiratory tract from the perspective of molecular biology and to identify their key regulatory genes. Methods: From March 2016 to May 2018, 312 patients with Pseudomonas aeruginosa infection of the lower respiratory tract who were admitted to the Department of Respiratory Medicine of Baogang Hospital and received follow-up treatment in the hospital were selected as subjects by cluster sampling. Alveolar lavage fluid and sputum collected from those patients were used as biological specimens. The genes of Pseudomonas aeruginosa were detected with oligonucleotide probes, and the chip data were pre-processed. A total of 8 common antibiotics against Gram-negative bacteria (ceftazidime, gentamicin, piperacillin, amikacin, ciprofloxacin, levofloxacin, doripenem and ticarcillin) were selected to determine the drug resistance of the biological specimens. The MCODE algorithm was used to construct a co-expression network model of the drug-resistance genes focused on exoS/exoU. Results: The expression level of exoS/exoU in the drug-resistance group was significantly higher than that in the non-resistance group (p<0.05). The top 5 differentially expressed genes in the alveolar lavage fluid specimens from the drug-resistance group were, from high to low, RAC1, ITGB1, ITGB5, CRK and IGF1R. In the sputum specimens, the top 5 differentially expressed genes were RAC1, CRK, IGF1R, ITGB1 and ITGB5. In the alveolar lavage fluid specimens, only RAC1 was positively correlated with the expression of exoS and exoU (p<0.05). In the sputum specimens, RAC1, ITGB1, ITGB5, CRK and IGF1R were positively correlated with the expression of exoS and exoU (p<0.05). The genes included in the co-expression network were exoS, exoU, RAC1, ITGB1, ITGB5, CRK, CAMK2D, RHOA, FLNA, IGF1R, TGFBR2 and FOS. Among them, RAC1 had the highest regulatory ability score (72.00) and the largest number of regulatory genes (6), followed by ITGB1, ITGB5 and CRK. Conclusions: The high expression of exoS and exoU in the sputum specimens suggests that Pseudomonas aeruginosa is more likely to become resistant to antibiotics; RAC1, ITGB1, ITGB5 and CRK may be the key genes that regulate the expression of exoS and exoU.
Abstract: Gastrointestinal (GI) cancers are a set of diverse diseases affecting many parts/organs. The five most frequent GI cancer types are esophageal cancer, gastric cancer (GC), liver cancer, pancreatic cancer, and colorectal cancer (CRC); together, they give rise to 5 million new cases and cause the death of 3.5 million people annually. We provide information about molecular changes crucial to tumorigenesis, behavior, and prognosis. During the formation of cancer cells, the genomic changes are microsatellite instability with multiple chromosomal arrangements in GC and CRC. The genomically stable subtype is observed in GC and pancreatic cancer. Besides these genomic subtypes, CRC has epigenetic modification (hypermethylation) associated with a poor prognosis. The pathway information highlights the functions shared by GI cancers, such as apoptosis; focal adhesion; and the p21-activated kinase, phosphoinositide 3-kinase/Akt, transforming growth factor beta, and Toll-like receptor signaling pathways. These pathways are involved in survival, cell proliferation, and cell motility. In addition, the immune response and inflammation are also essential elements in the shared functions. We also retrieved information on protein-protein interactions from the STRING database and found that the proteins Akt1, catenin beta 1 (CTNNB1), E1A binding protein P300, tumor protein p53 (TP53), and TP53 binding protein 1 (TP53BP1) are central nodes in the network. The protein expression of these genes is associated with overall survival in some GI cancers. Low TP53BP1 expression in CRC, high EP300 expression in esophageal cancer, and increased expression of Akt1/TP53 or low CTNNB1 expression in GC are associated with a poor prognosis. The Kaplan Meier plotter database also confirmed the association between expression of the five central genes and GC survival rates. In conclusion, GI cancers are very diverse at the molecular level. However, the shared mutations and protein pathways might be used to better understand these cancers and to reveal diagnostic/prognostic or drug targets.
Abstract: Background: Precision medicine (PM) has taken center stage in healthcare since the completion of the genomic project. Developed countries have gradually integrated PM into mainstream patient management. However, Nigeria still grapples with wide acceptance, key translational research, and implementation of PM. This study sought to explore the knowledge of and attitude toward PM among pharmacists as key stakeholders in the healthcare team. Methods: A cross-sectional study was conducted in selected tertiary hospitals across the country. A 21-item semi-structured questionnaire was administered by hybrid online and physical methods, and the results were analyzed with Statistical Package for the Social Sciences Version 25. Descriptive statistics were used to summarize the data. A chi-square test was employed to determine the association between knowledge of PM and the sociodemographic characteristics of the study population. Results: A total of 167 hospital pharmacists participated in the study. A high proportion of the participants were familiar with artificial intelligence (91.75%), pharmacogenomics (84.5%), and precision medicine (61%). Overall, 38.9% of the pharmacists had good knowledge while 13.2% had poor knowledge of PM and associated terms. The level of knowledge was not significantly associated with gender (χ² = 3.21, p = 0.201), age (χ² = 5, p = 0.27), marital status (χ² = 3.21, p = 0.201), or professional level (χ² = 6.85, p = 0.144). The most important value of precision medicine to hospital pharmacists was the ability to minimize the impact of disease through preventive medicine (49%), while a large portion were pursuing and/or actively planning to pursue additional education in precision medicine. Conclusions: There is a highly positive attitude toward the prospect of PM among hospital pharmacists in Nigeria. Education modules in this field are highly recommended, as most do not have a holistic knowledge of the terms used in PM. Also, more research aimed at translating PM knowledge into clinical practice is recommended.
Abstract: Cancer is a complex and heterogeneous disease characterized by various genetic and epigenetic alterations. Early diagnosis, accurate subtyping, and staging are essential for effective, personalized treatment and improved survival rates. Traditional diagnostic methods, such as biopsies, are invasive and carry operational risks that hinder repeated use, underscoring the need for noninvasive and personalized alternatives. In response, this study integrates transcriptomic data into human genome-scale metabolic models (GSMMs) to derive patient-specific flux distributions, which are then combined with genomic, proteomic, and fluxomic (JX) data to develop a robust multi-omic classifier for lung cancer subtyping and early diagnosis. The JX classifier is further enhanced by analyzing heterogeneous datasets from RNA sequencing and microarray analyses derived from both tissue samples and cell culture experiments, thereby enabling the identification of key marker features and enriched pathways such as lipid metabolism and energy production. This integrated approach not only demonstrates high performance in distinguishing lung cancer subtypes and early-stage disease but also proves robust when applied to limited pancreatic cancer data. By linking genotype to phenotype, GSMM-driven flux analysis overcomes challenges related to metabolome data scarcity and platform variability by proposing marker processes and reactions for further investigation, ultimately facilitating noninvasive diagnostics and the identification of actionable biomarkers for targeted therapeutic intervention. These findings offer significant promise for streamlining clinical workflows and enabling personalized therapeutic strategies, and they highlight the potential of our versatile workflow for unveiling novel biomarker landscapes in less studied diseases.
Funding: Supported by the National Key R&D Program of China (2021YFA1302100 to Q.Z); the National Natural Science Foundation of China (82172861 to Q.Z); the Guangdong Basic and Applied Basic Research Foundation (2021A1515011743 to Q.Z); and the National Key Clinical Discipline (to D.Z).
Abstract: Screening biomolecular markers from high-dimensional biological data is one of the long-standing tasks of biomedical translational research. With its advantages in both feature shrinkage and biological interpretability, the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm is one of the most popular methods for clinical biomarker development. However, in practice, applying LASSO to omics-based data with high dimensions and low sample size often yields an excess number of predictive variables, leading to overfitting of the model. Here, we present VSOLassoBag, a wrapped LASSO approach that integrates an ensemble learning strategy to help select efficient and stable variables with high confidence from omics-based data. Using a bagging strategy in combination with a parametric method or an inflection point search method, VSOLassoBag can integrate and vote on variables generated from multiple LASSO models to determine the optimal candidates. The application of VSOLassoBag to both simulation datasets and real-world datasets shows that the algorithm can effectively identify markers for either case-control binary classification or prognosis prediction. In addition, in comparison with multiple existing algorithms, VSOLassoBag shows comparable performance under different scenarios while resulting in fewer features than the others. In summary, VSOLassoBag, which is available at https://seqworld.com/VSOLassoBag/ under the GPL v3 license, provides an alternative strategy for selecting reliable biomarkers from high-dimensional omics data. For users' convenience, we implement VSOLassoBag as an R package that provides multithreading computing configurations.
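The bagging-and-voting idea described in the abstract above can be sketched roughly as follows. This is not the VSOLassoBag R package or its API: the function names `lasso_cd` and `lasso_bag` are hypothetical, the LASSO solver is a plain coordinate descent written for illustration, and the fixed selection-frequency cutoff `min_freq` is an assumed stand-in for the parametric and inflection point search methods the abstract mentions. The data are simulated.

```python
import random


def soft_threshold(z, gamma):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_cd(X, y, lam, n_iter=100, tol=1e-8):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate
    descent (no intercept; illustrative, not optimised)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, with beta starting at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        max_delta = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the partial residual.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
                max_delta = max(max_delta, abs(delta))
        if max_delta < tol:
            break
    return beta


def lasso_bag(X, y, lam, n_bags=30, min_freq=0.8, seed=0):
    """Fit LASSO on bootstrap resamples and vote: keep variables whose
    selection frequency reaches min_freq (an assumed, fixed cutoff)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    votes = [0] * p
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        beta = lasso_cd([X[i] for i in idx], [y[i] for i in idx], lam)
        for j, b in enumerate(beta):
            if b != 0.0:
                votes[j] += 1
    freq = [v / n_bags for v in votes]
    selected = [j for j, f in enumerate(freq) if f >= min_freq]
    return selected, freq


# Simulated data: 40 samples, 10 features, only features 0 and 1 carry signal.
data_rng = random.Random(1)
X = [[data_rng.gauss(0.0, 1.0) for _ in range(10)] for _ in range(40)]
y = [2.0 * row[0] - 3.0 * row[1] + data_rng.gauss(0.0, 0.5) for row in X]

selected, freq = lasso_bag(X, y, lam=0.5)
```

Variables that enter a single LASSO fit only by chance tend to receive few votes across resamples, which is how a voting scheme of this kind can reduce the number of retained features.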
Funding: Supported by the Biological Breeding-Major Projects (2023ZD04076); the Pinduoduo-China Agricultural University Research Fund (PC2023B01012); the 2115 Talent Development Program of China Agricultural University; the National Natural Science Foundation of China (32201718); and the Science and Technology Demonstration Project of Shandong Province (2024SFGC0402).
Abstract: Dear Editor, Multi-omics association analysis is a key method in crop germplasm research, helping to elucidate the regulatory mechanisms of agronomic traits (Liu et al., 2020; Liang et al., 2021). However, most existing multi-omics association studies focus on omics data under a single condition, posing challenges in identifying stress-related agronomically important genes. This difficulty mainly arises from the increased complexity of multi-omics analyses when comparing control and stress conditions.
Abstract: Explainable artificial intelligence aims to interpret how machine learning models make decisions, and many model explainers have been developed in the computer vision field. However, understanding of the applicability of these model explainers to biological data is still lacking. In this study, we comprehensively evaluated multiple explainers by interpreting pre-trained models for predicting tissue types from transcriptomic data and by identifying, for each sample, the top contributing genes with the greatest impact on model prediction. To improve the reproducibility and interpretability of results generated by model explainers, we proposed a series of optimization strategies for each explainer on two different model architectures, multilayer perceptron (MLP) and convolutional neural network (CNN). We observed three groups of explainer and model architecture combinations with high reproducibility. Group II, which contains three model explainers on aggregated MLP models, identified top contributing genes in different tissues that exhibited tissue-specific manifestation and were potential cancer biomarkers. In summary, our work provides novel insights and guidance for exploring biological mechanisms using explainable machine learning models.
Abstract: Chronic diseases such as heart disease, cancer, and diabetes are leading drivers of mortality worldwide, underscoring the need for improved efforts around early detection and prediction. The pathophysiology and management of chronic diseases have benefitted from emerging fields in molecular biology like genomics, transcriptomics, proteomics, glycomics, and lipidomics. The complex biomarker and mechanistic data from these "omics" studies present analytical and interpretive challenges, especially for traditional statistical methods. Machine learning (ML) techniques offer considerable promise in unlocking new pathways for data-driven chronic disease risk assessment and prognosis. This review provides a comprehensive overview of state-of-the-art applications of ML algorithms for chronic disease detection and prediction across datasets, including medical imaging, genomics, wearables, and electronic health records. Specifically, we review and synthesize key studies leveraging major ML approaches, ranging from traditional techniques such as logistic regression and random forests to modern deep learning neural network architectures. We consolidate the existing literature on ML for chronic disease prediction to synthesize major trends and trajectories that may inform both future research and clinical translation efforts in this growing field. While highlighting the critical innovations and successes emerging in this space, we identify the key challenges and limitations that remain to be addressed. Finally, we discuss pathways forward toward scalable, equitable, and clinically implementable ML solutions for transforming chronic disease screening and prevention.
Funding: Supported by the Natural Science Foundation of Sichuan (2024ZDZX0019).
Abstract: Natural components, evolved to help organisms adapt and defend against threats, are also vital sources for drug discovery due to their diverse and potent bioactivities. In the present work, we propose the Gene-encoded Natural Diverse Components Repository (GNDC, https://cbcb.cdutcm.edu.cn/gndc/), a primary and most extensive database dedicated to cataloging diverse natural components. GNDC currently catalogs over 234 million natural components that are organized into four specialized sub-databases: HerbalMDB for 2.32 million secondary metabolites, HerbalPDB for 229 million small peptides, HerbalRDB for 2.38 million small RNAs, and HerbalCDB for 0.26 million carbohydrates. By leveraging customized pipelines for high-throughput multi-omics data and AI technologies, the GNDC enables large-scale discovery and annotation of natural products from nuclear and organellar genomes of species listed in eight global pharmacopoeias and multi-resource data. Compared to existing resources, GNDC achieves a 10-fold increase in component yield and introduces over 200 million previously unreported components. To support this unprecedented data volume and complexity, state-of-the-art AI tools are seamlessly integrated to decipher and annotate vast data collections, such as classification and gene expression signature generation of millions of secondary metabolites. We envision that the GNDC will drive the transformation of drug discovery from an “experience-driven” approach to a “big data-driven” paradigm.
Abstract: The aBIOTECH journal is pleased to announce that it will publish a Feature Issue on "AI in Crop Breeding". In this issue, submissions of articles addressing the following research areas are welcome: integration of multi-omic big data; functional genomics and gene mining; phenotype prediction; intelligent sensing of crop information; and AI applications in breeding decision support. We welcome the following types of articles: timely reviews, research articles, letters, and step-by-step protocol(s).
Abstract: The aBIOTECH journal is pleased to announce that it will publish a Feature Issue on "AI in Crop Breeding". In this issue, submissions of articles addressing the following research areas are welcome: integration of multi-omic big data.
Funding: Supported by the National Natural Science Foundation Projects of China (No. 82350003, No. 92049201).
Abstract: The convergence of artificial intelligence (AI) and microbial therapeutics offers promising avenues for novel discoveries and therapeutic interventions. With the exponential growth of omics datasets and rapid advancements in AI technology, the next generation of AI is increasingly prevalent in microbiology research. In microbial research, AI is instrumental in the classification and functional annotation of microorganisms. Machine learning algorithms facilitate efficient and accurate categorization of microbial taxa, enabling the identification of functional traits and metabolic pathways within microbial communities. Additionally, AI-driven protein design strategies hold promise for engineering enzymes with enhanced catalytic activities and stabilities. By predicting protein structures, functions, and interactions, AI algorithms enable the rational design of proteins and enzymes tailored for specific applications. AI systems are already present in clinical microbiology laboratories in the form of expert rules used by some automated susceptibility testing and identification systems. In the future, microbiology technologists will rely more heavily on AI for initial screening, allowing them to focus on diagnostic challenges and complex technical interpretations. AI-driven approaches hold immense promise in advancing our understanding of microbial ecosystems, accelerating drug discovery processes, and fostering the development of groundbreaking therapeutic interventions. This review aims to summarize common algorithms in AI and their applications within microbiology and synthetic biology. We provide a comprehensive evaluation of AI’s utility in microbial research, discussing both its advantages and challenges. Finally, we explore future research directions and the bottlenecks faced by AI in the microbial field.