Oleaceae is a high-value eudicot family, and the genomes of many of its well-known horticultural crops have been deciphered. However, there is currently no bioinformatics platform focused on empowering genome research in Oleaceae. Herein, we developed the first comprehensive Oleaceae Genome Research Platform (OGRP, https://oleaceae.cgrpoee.top/). OGRP provides 70 genomes of 10 Oleaceae species and 46 eudicots, together with 366 transcriptomes involving 18 Oleaceae plant tissues. We built 34 window-operated bioinformatics tools, collected 38 professional practical software programs, and proposed 3 new pipelines, namely ancient polyploidization identification, ancestral karyotype reconstruction, and gene family evolution. Employing these pipelines to reanalyze the Oleaceae genomes, we clarified the polyploidization events, reconstructed the ancestral karyotypes, and explored the effects of paleogenome evolution on genes with specific biological regulatory roles. Significantly, we generated a series of comparative genomic resources focusing on the Oleaceae, comprising 108 genomic synteny dot plots, 1,952,225 collinear gene pairs, multiple genome alignments, and imprints of paleochromosome rearrangements. Moreover, for the Oleaceae genomes, researchers can efficiently search 1,785,987 functional annotations, 22,584 orthogroups, 29,582 important trait genes from 74 gene families, 12,664 transcription factor-related genes, 9,178,872 transposable elements, and all involved regulatory pathways. In addition, we provide downloads and usage instructions for the tools, a species encyclopedia, ecological resources, relevant literature, and external database links. In short, OGRP integrates rich data resources and powerful analytical tools, is continuously updated, and can efficiently empower genome research and agricultural breeding in Oleaceae and other plants.
Rice is a cereal crop and a model species for monocots. Since the release of the first draft rice genome sequences in 2002, considerable progress has been achieved in rice genomic research, thanks to the rapid development and efficient use of bioinformatics methods and tools. In this review, we summarize the progress of studies on rice genome sequencing and other omics, and introduce the well-maintained bioinformatics databases and tools developed for rice genome resources and breeding. After reviewing the history of rice bioinformatics, we use single-cell sequencing and machine learning as examples to show how bioinformatics integrates emerging technologies and how it continues to develop for future rice research.
Polyploidy is common among agriculturally important crops. Popular genetic methods and their implementations cannot always be applied to polyploid genetic data. We give an overview of the available tools and their limitations in terms of ploidy level and auto- versus allopolyploidy. The main classes of tools are genotype calling, linkage mapping, and haplotyping. The usability of the tools is discussed with a focus on their applicability to data sets produced by state-of-the-art technologies. We show that many challenges remain before the toolset for polyploids provides functionality similar to that already available for diploids. Some tools were developed over a decade ago and are now outdated. In addition, we discuss the steps needed to overcome this shortage in the future.
The massive growth of biological data has created a need for user-friendly bioinformatics tools that can be used for routine biological data manipulation. Bioanalyzer is a simple analytical program that implements a variety of tools for common analyses of different biological data types and databases. It covers general data-analysis tasks such as handling nucleotide data, retrieving information in different data formats, NGS quality control, data visualization, multiple sequence alignment, and sequence BLAST. These tools accept common biological data formats and produce human-readable output files that can be stored on local machines. Bioanalyzer has a user-friendly graphical user interface that simplifies the analysis of massive biological data while consuming less memory and processing power. Its source code is written in Python, which is reported to reduce memory usage and initial startup time. Bioanalyzer is free and open-source software whose code can be modified, extended, or integrated into different bioinformatics pipelines. Bioinformatics produces huge amounts of data in FASTA and GenBank formats, from which a great deal of annotation information can be derived using Python; the flexibility and simplicity of Python in data analysis inspired us to develop a new multi-tool program able to manipulate FASTA and GenBank files. The goal was to develop new software that uses genomic data files to produce annotated data. The software was written in Python using the Biopython package.
In this editorial preface, I briefly review cancer bioinformatics and introduce the four articles in this special issue highlighting important applications of the field: detection of chromatin states; detection of SNP-containing motifs and association with transcription factor-binding sites; improvements in functional enrichment modules; and gene association studies on aging and cancer. We expect this issue to provide bioinformatics scientists, cancer biologists, and clinical doctors with a better understanding of how cancer bioinformatics can be used to identify candidate biomarkers and targets and to conduct functional analysis.
Objective: This study aims to investigate the expression, prognostic value, and function of kinesin superfamily 4A (KIF4A) in cervical cancer. Methods: Cervical cancer cell lines (HeLa and SiHa) and TCGA data were used for experimental and bioinformatic analyses. Overall survival (OS) and progression-free survival (PFS) were compared between patients with high and low KIF4A expression. Copy number variation (CNV) and somatic mutations were visualized, and GISTIC 2.0 was used to identify significantly altered sites. The function of KIF4A was also explored by transcriptome analysis and validated experimentally. Chemotherapeutic and immunotherapeutic benefits were inferred using multiple reference databases and algorithms. Results: Patients with high KIF4A expression had better OS and PFS. KIF4A could inhibit proliferation and migration and induce G1 arrest of cervical cancer cells. Higher CNV load was observed in patients with low KIF4A expression, while the group with low KIF4A expression displayed more significantly altered sites. A total of 13 genes, including NOTCH1 and PUM1, were mutated more often in the low KIF4A expression group. The analysis revealed that low KIF4A expression may indicate an immune escape phenotype, and patients in this group may benefit more from immunotherapy. With respect to chemotherapy, patients with high KIF4A expression may respond better to cisplatin and gemcitabine, while patients with low KIF4A expression may respond better to 5-fluorouracil and other agents. Conclusion: KIF4A is a tumor suppressor gene in cervical cancer, and it can be used as a prognostic and therapeutic biomarker in cervical cancer.
Severe acute respiratory syndrome coronavirus (SARS-CoV) and SARS-CoV-2 are thought to have been transmitted to humans via wild mammals, especially bats. However, evidence for direct bat-to-human transmission is lacking. The involvement of intermediate hosts is considered one reason for SARS-CoV-2 transmission to humans and the emergence of the outbreak. Tropical territories such as Brazil harbor great biodiversity. Along similar lines, this study aimed to predict potential coronavirus hosts among Brazilian wild mammals based on angiotensin-converting enzyme 2 (ACE2) sequences using evolutionary bioinformatics. The cougar, maned wolf, and bush dog were predicted as potential coronavirus hosts. These indigenous carnivores are phylogenetically closer to the known SARS-CoV/SARS-CoV-2 hosts and showed low ACE2 divergence. A new coronavirus transmission chain was proposed in which the white-tailed deer, a susceptible SARS-CoV-2 host, occupies the central position. The cougar plays an important role because of the low divergence of its ACE2 from that of deer and humans. The discovery of these potential coronavirus hosts will be useful for epidemiological surveillance and for identifying interventions that can help break the transmission chain.
Big biological data contains a large amount of life science information, yet extracting meaningful insights from this data remains a complex challenge. The hidden Markov model (HMM), a statistical model widely used in machine learning, has proven effective in addressing various problems in bioinformatics. Despite its broad applicability, a more detailed and comprehensive discussion is needed of the specific ways in which HMMs are employed in this field. This review provides an overview of the HMM, including its fundamental concepts, the three canonical problems associated with it, and the algorithms used to solve them. The discussion emphasizes the model's significant applications in bioinformatics, particularly in areas such as transmembrane protein prediction, gene discovery, sequence alignment, CpG island detection, and copy number variation analysis. Finally, the strengths and limitations of the HMM are discussed, and its prospects in bioinformatics are predicted. HMMs can play a pivotal role in addressing complex biological problems and advancing our understanding of biological sequences and systems. This review can provide bioinformatics researchers with comprehensive information on HMMs and guide their work.
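The decoding problem named in the abstract above (finding the most probable hidden-state path) can be illustrated with a minimal Viterbi sketch applied to a toy CpG-island labeling task. The two states, all probabilities, and the names `viterbi`, `STATES`, `START_P`, `TRANS_P`, and `EMIT_P` are illustrative assumptions, not parameters or code from the review.

```python
import math

# Hypothetical two-state model: all numbers below are made up for illustration.
STATES = ["island", "background"]
START_P = {"island": 0.5, "background": 0.5}
TRANS_P = {
    "island": {"island": 0.9, "background": 0.1},
    "background": {"island": 0.1, "background": 0.9},
}
EMIT_P = {
    "island": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},      # C/G rich
    "background": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},  # A/T rich
}


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path for an observation sequence."""
    # Log probabilities avoid numerical underflow on long sequences.
    scores = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
               for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        scores.append({})
        back.append({})
        for s in states:
            # Best predecessor state for being in state s at position t.
            prev = max(states,
                       key=lambda p: scores[t - 1][p] + math.log(trans_p[p][s]))
            scores[t][s] = (scores[t - 1][prev]
                            + math.log(trans_p[prev][s])
                            + math.log(emit_p[s][observations[t]]))
            back[t][s] = prev
    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    sequence = "ATATAT" + "CGCGCGCG" + "ATATAT"
    print(viterbi(sequence, STATES, START_P, TRANS_P, EMIT_P))
```

With these toy parameters, the C/G-rich stretch in the middle of the example sequence is labeled `island` and the flanking A/T runs are labeled `background`. Real CpG-island models use more states and parameters estimated from annotated data.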
Bioinformatics analysis often requires filtering multiple datasets by frequency of occurrence to decide which entries to retain or delete. Existing tools for this purpose are often difficult to install or require custom coding, which impedes efficient data processing. To address this issue, Filterx, a user-friendly command-line tool written in C, was developed; it supports multi-condition filtering based on frequency or occurrence. The tool lets users complete data processing tasks with a simple command line, greatly reducing both workload and processing time. In addition, future development of the tool could facilitate its integration into various bioinformatics data analysis pipelines.
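The kind of occurrence-based filtering described in the abstract above can be sketched in a few lines. This is only an illustration of the general idea in Python; it is not Filterx's interface or implementation (Filterx is a C command-line tool), and the function name `filter_by_occurrence` and its parameters are hypothetical.

```python
from collections import Counter


def filter_by_occurrence(datasets, min_count=1, max_count=None):
    """Keep items present in at least min_count and at most max_count datasets.

    Hypothetical helper: each dataset is an iterable of item identifiers
    (for example gene IDs). An item is counted once per dataset, however
    many times it appears inside that dataset.
    """
    counts = Counter()
    for items in datasets:
        counts.update(set(items))  # de-duplicate within a dataset
    upper = len(datasets) if max_count is None else max_count
    return sorted(item for item, n in counts.items() if min_count <= n <= upper)


if __name__ == "__main__":
    datasets = [
        ["g1", "g2", "g3"],
        ["g2", "g3", "g3"],
        ["g3", "g4"],
    ]
    # Items found in at least two of the three datasets.
    print(filter_by_occurrence(datasets, min_count=2))
    # Items found in exactly one dataset.
    print(filter_by_occurrence(datasets, min_count=1, max_count=1))
```

Counting each item once per dataset is a design choice made here so that the threshold means "number of datasets containing the item"; a per-record frequency filter would instead update the counter with the raw lists.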
Though a relatively young discipline, translational bioinformatics (TBI) has become a key component of biomedical research in the era of precision medicine. Development of high-throughput technologies and electronic health records has caused a paradigm shift in both healthcare and biomedical research. Novel tools and methods are required to convert increasingly voluminous datasets into information and actionable knowledge. This review provides a definition and contextualization of the term TBI, describes the discipline's brief history and past accomplishments, as well as current foci, and concludes with predictions of future directions in the field.
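Objective: To investigate how dual-task walking of different types (cognitive/motor) and loads (easy/difficult) affects prefrontal cortex activation and walking stability. Methods: Functional near-infrared spectroscopy (fNIRS) and a three-dimensional motion capture system were used to simultaneously measure prefrontal oxyhemoglobin concentration and kinematic parameters in 33 healthy adults during single-task walking, easy/difficult cognitive dual-task walking, and motor dual-task walking; the margin of stability (MOS) was computed indirectly from the kinematic data. Results: Activation of the right dorsolateral prefrontal cortex was higher during difficult cognitive dual-task walking than during difficult motor dual-task walking (F=7.067, P=0.012), as was activation of the left dorsolateral prefrontal cortex (F=4.831, P=0.035). In addition, activation of the right frontopolar area (P=0.029), right orbitofrontal cortex (P=0.046), left ventrolateral prefrontal cortex (P=0.039), and left frontopolar area (P=0.022) was significantly higher during cognitive than during motor dual-task walking. MOS_ap was smaller during easy than during difficult cognitive dual-task walking (F=13.357, P=0.001) and larger during difficult cognitive than during difficult motor dual-task walking (F=8.571, P=0.006); MOS_ml was smaller during easy than during difficult cognitive dual-task walking (F=5.394, P=0.027) and larger during difficult cognitive than during difficult motor dual-task walking (F=4.703, P=0.038). Conclusion: Dual-task execution involves a hierarchical, coordinated regulatory mechanism among prefrontal subregions, in which the dorsolateral prefrontal cortex preferentially coordinates resource allocation for higher-order cognitive tasks. Task type and load interact in their effects on prefrontal activation and walking stability. Neural activation in multiple prefrontal subregions is stronger during cognitive than during motor dual-task walking; difficult cognitive tasks elicit high prefrontal activation accompanied by reduced walking stability. Walking stability is better during easy than during difficult cognitive dual-task walking, and better during difficult motor than during difficult cognitive dual-task walking.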
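The abstract above does not state how MOS was derived from the kinematic data. A commonly used definition, given here only as an assumption about the general approach, is based on the extrapolated center of mass (XcoM) of the inverted-pendulum model. The symbols are illustrative: x_CoM and v_CoM are the position and velocity of the center of mass in the direction considered, l is the equivalent pendulum length, g is gravitational acceleration, and x_BoS is the boundary of the base of support.

```latex
\omega_0 = \sqrt{\frac{g}{l}}, \qquad
\mathrm{XcoM} = x_{\mathrm{CoM}} + \frac{v_{\mathrm{CoM}}}{\omega_0}, \qquad
\mathrm{MOS} = x_{\mathrm{BoS}} - \mathrm{XcoM}
```

The same expression is evaluated separately along each axis of interest, which would correspond to the MOS_ap and MOS_ml values reported above. Sign conventions and the choice of base-of-support boundary differ between studies, so the direction of "more stable" should be taken from the study itself rather than from this sketch.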
Realizing personalized medicine requires integrating diverse data types with bioinformatics. At present, the most vital data are individual genomic information generated by advanced next-generation sequencing (NGS) technologies. These technologies continue to advance in both cost and sequencing speed, with a concomitant increase in the amount and complexity of the data. The prodigious data, together with the requisite computational pipelines for data analysis and interpretation, strain both the IT infrastructure and the scientists conducting the work. Bioinformatics is increasingly becoming the rate-limiting step, with numerous challenges to be overcome in translating NGS data for personalized medicine. We review some key bioinformatics tasks, issues, and challenges in the contexts of IT requirements, data quality, analysis tools and pipelines, and validation of biomarkers.
Natural products are among the most important sources of lead molecules for drug discovery. With the development of affordable whole-genome sequencing technologies and other 'omics tools, the field of natural products research is currently undergoing a paradigm shift. While, for decades, mainly analytical and chemical methods gave access to this group of compounds, nowadays genomics-based methods offer complementary approaches to find, identify, and characterize such molecules. This paradigm shift has also resulted in a high demand for computational tools to assist researchers in their daily work. In this context, this review summarizes the tools and databases currently available to mine, identify, and characterize natural product biosynthesis pathways and their producers based on 'omics data. A web portal called the Secondary Metabolite Bioinformatics Portal (SMBP at http://www.secondarymetabolites.org) is introduced to provide a one-stop catalog and links to these bioinformatics resources. In addition, an outlook is presented on how the existing tools and those to be developed will influence synthetic biology approaches in the natural products field.
In the first issue of 2017 of this journal, Genomes, Proteomes and Bioinformatics, a special database article entitled "GSA: Genome Sequence Archive" is published. This article provides a brief introduction to the platform developed by the authors from the BIG Data Center (BIGD) of Beijing Institute of Genomics (BIG), Chinese Academy of Sciences (CAS). The aim of the GSA project is to collect, integrate, and archive raw sequence data submitted by domestic and international users. It is one of the major activities being carried out by a team of around 50 young bioinformaticians at BIGD. In addition to the GSA system, they are also working on several bioinformatics service-oriented projects, as described in one of their recent publications.
In light of the pressing global challenges of climate change, declining crop resilience, and hidden hunger, it is imperative to overcome the limitations of conventional crop breeding to enhance both the nutritional quality and stress tolerance of crops. Synthetic metabolic engineering presents innovative strategies for the precision modification and de novo design of metabolic pathways. This approach generally encompasses three essential steps: identifying key metabolites through metabolomics, integrating multi-omics technologies to investigate the synthesis and regulation of these metabolites, and using gene editing or de novo design to modify crop metabolic pathways associated with desirable agronomic traits. This review underscores the vital role of plant metabolite diversity in enhancing crop nutritional quality and stress resilience. Integrated multi-omics analyses facilitate metabolic engineering by identifying key genes, transporters, and transcription factors that regulate metabolite biosynthesis. Precision modification strategies employ genome editing tools to reprogram endogenous metabolic networks, while de novo design reconstructs metabolic pathways through the introduction of exogenous biological elements; both approaches thereby enable the targeted enhancement of desired traits. These strategies have been effectively implemented in major food crops. However, simultaneously enhancing nutritional quality and stress resilience remains challenging due to inherent trade-offs and resource competition among distinct metabolic pathways within plants. Future research should integrate AI-driven predictive models with multi-omics datasets to decipher dynamic metabolic homeostasis and engineer climate-smart crops that maximize yield while preserving quality and environmental adaptability.
Funding (Oleaceae Genome Research Platform study): supported by the National Natural Science Foundation of China (32470676 and 32170236), the Central Guidance on Local Science and Technology Development Fund of Hebei Province (246Z2508G), the Hebei Natural Science Foundation (C2020209064), the Tangshan Science and Technology Program Project (21130217C), and the Key Research Project of North China University of Science and Technology (ZD-YG-202313-23).
Funding (rice bioinformatics review): supported by the National Natural Science Foundation of China (31971865), the Zhejiang Natural Science Foundation (LZ17C130001), the Innovation Method Project of China (2018IM0301002), and the Jiangsu Collaborative Innovation Center for Modern Crop Production.
Funding (KIF4A cervical cancer study): supported by grants from the Wuhan University Medical Faculty Innovation Seed Fund Cultivation Project (No. TFZZ2018025), the Xiao-ping CHEN Foundation for the Development of Science and Technology of Hubei Province (No. CXPJJH12000001-2020313), and the National Natural Science Foundation of China (No. 81670123 and No. 81670144).
Funding (hidden Markov model review): supported by the National Natural Science Foundation of China (No. 31970651, 92046018) and the Mathematical Tianyuan Fund of the National Natural Science Foundation of China (No. 12026414).
Funding (Filterx study): supported by grants CNTC-110202101039 (JY-16) and YNTC-2022530000241008.
Funding (translational bioinformatics review): supported in part by the Clinical and Translational Science Award (Grant No. UL1TR001117) to Duke University from the National Institutes of Health (NIH), United States.
文摘Though a relatively young discipline, translational bioinformatics (TBI) has become a key component of biomedical research in the era of precision medicine. Development of high-throughput technologies and electronic health records has caused a paradigm shift in both healthcare and biomedical research. Novel tools and methods are required to convert increasingly voluminous datasets into information and actionable knowledge. This review provides a definition and contex- tualization of the term TBI, describes the discipline's brief history and past accomplishments, as well as current loci, and concludes with predictions of future directions in the field.
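Abstract: Objective: To investigate the effects of dual-task walking of different types (cognitive/motor) and loads (easy/difficult) on prefrontal cortex activation and walking stability. Methods: Functional near-infrared spectroscopy (fNIRS) and a three-dimensional motion capture system were used simultaneously to measure oxyhemoglobin concentration in the prefrontal cortex and kinematic parameters in 33 healthy adults during single-task walking, easy/difficult cognitive dual-task walking, and motor dual-task walking. The margin of stability (MOS) was calculated indirectly from the kinematic data. Results: Activation of the right dorsolateral prefrontal cortex was higher during difficult cognitive dual-task walking than during difficult motor dual-task walking (F=7.067, P=0.012), as was activation of the left dorsolateral prefrontal cortex (F=4.831, P=0.035). In addition, activation of the right frontopolar area (P=0.029), right orbitofrontal cortex (P=0.046), left ventrolateral prefrontal cortex (P=0.039), and left frontopolar area (P=0.022) was significantly higher during cognitive dual-task walking than during motor dual-task walking. MOS_(ap) was smaller during easy cognitive dual-task walking than during difficult cognitive dual-task walking (F=13.357, P=0.001) and larger during difficult cognitive dual-task walking than during difficult motor dual-task walking (F=8.571, P=0.006). MOS_(ml) was smaller during easy cognitive dual-task walking than during difficult cognitive dual-task walking (F=5.394, P=0.027) and larger during difficult cognitive dual-task walking than during difficult motor dual-task walking (F=4.703, P=0.038). Conclusion: Dual-task execution involves a hierarchical, coordinated regulatory mechanism across prefrontal subregions, in which the dorsolateral prefrontal cortex preferentially coordinates resource allocation for higher-order cognitive tasks. Dual-task type and load have interacting effects on prefrontal cortex activation and walking stability. Neural activation in multiple prefrontal subregions is higher during cognitive dual-task walking than during motor dual-task walking; difficult cognitive tasks elicit high prefrontal activation accompanied by reduced walking stability. Walking stability is better during easy cognitive dual-task walking than during difficult cognitive dual-task walking, and better during difficult motor dual-task walking than during difficult cognitive dual-task walking.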
Abstract: Realizing personalized medicine requires integrating diverse data types with bioinformatics. At present, the most vital data are individuals' genomic information generated by advanced next-generation sequencing (NGS) technologies. These technologies continue to advance in both cost and sequencing speed, with a concomitant increase in the amount and complexity of the data. The prodigious data, together with the computational pipelines required for analysis and interpretation, strain both IT infrastructure and the scientists conducting the work. Bioinformatics is increasingly becoming the rate-limiting step, with numerous challenges to overcome in translating NGS data for personalized medicine. We review some key bioinformatics tasks, issues, and challenges in the context of IT requirements, data quality, analysis tools and pipelines, and validation of biomarkers.
Abstract: Natural products are among the most important sources of lead molecules for drug discovery. With the development of affordable whole-genome sequencing technologies and other 'omics tools, the field of natural products research is undergoing a paradigm shift. For decades, mainly analytical and chemical methods gave access to this group of compounds; genomics-based methods now offer complementary approaches to find, identify, and characterize such molecules. This paradigm shift has also created a high demand for computational tools to assist researchers in their daily work. In this context, this review summarizes the tools and databases currently available to mine, identify, and characterize natural product biosynthesis pathways and their producers based on 'omics data. A web portal called the Secondary Metabolite Bioinformatics Portal (SMBP, http://www.secondarymetabolites.org) is introduced to provide a one-stop catalog of, and links to, these bioinformatics resources. In addition, an outlook is presented on how existing tools and those yet to be developed will influence synthetic biology approaches in the natural products field.
Abstract: The first 2017 issue of this journal, Genomics, Proteomics & Bioinformatics, includes a special database article entitled "GSA: Genome Sequence Archive". The article provides a brief introduction to the platform developed by the authors from the BIG Data Center (BIGD) of the Beijing Institute of Genomics (BIG), Chinese Academy of Sciences (CAS). The aim of the GSA project is to collect, integrate, and archive raw sequence data submitted by domestic and international users. It is one of the major activities carried out by a team of around 50 young bioinformaticians at BIGD. In addition to the GSA system, they are also working on several bioinformatics service-oriented projects, as described in one of their recent publications.
Funding: Supported by the Project of Sanya Yazhou Bay Science and Technology City (SKJC-JYRC-2024-26); the National Natural Science Foundation of China (32460072); the Hainan Provincial Natural Science Foundation of China (323RC421); the Hainan Province Science and Technology Special Fund (ZDYF2022XDNY144); the Hainan Provincial Academician Innovation Platform Project (HDYSZX-202004); the Collaborative Innovation Center of Nanfan and High-Efficiency Tropical Agriculture, Hainan University (XTCX2022NYB06); and the Hainan Postdoctoral Research Grant Project.
Abstract: In light of the pressing global challenges of climate change, declining crop resilience, and hidden hunger, it is imperative to overcome the limitations of conventional crop breeding to enhance both the nutritional quality and stress tolerance of crops. Synthetic metabolic engineering offers innovative strategies for the precision modification and de novo design of metabolic pathways. This approach generally encompasses three essential steps: identifying key metabolites through metabolomics, integrating multi-omics technologies to investigate the synthesis and regulation of these metabolites, and using gene editing or de novo design to modify crop metabolic pathways associated with desirable agronomic traits. This review underscores the vital role of plant metabolite diversity in enhancing crop nutritional quality and stress resilience. Integrated multi-omics analyses facilitate metabolic engineering by identifying key genes, transporters, and transcription factors that regulate metabolite biosynthesis. Precision modification strategies employ genome editing tools to reprogram endogenous metabolic networks, while de novo design reconstructs metabolic pathways by introducing exogenous biological elements; both approaches enable the targeted enhancement of desired traits. These strategies have been implemented effectively in major food crops. However, simultaneously enhancing nutritional quality and stress resilience remains challenging because of inherent trade-offs and resource competition among distinct metabolic pathways within plants. Future research should integrate AI-driven predictive models with multi-omics datasets to decipher dynamic metabolic homeostasis and engineer climate-smart crops that maximize yield while preserving quality and environmental adaptability.