700 articles found
Gene Expression Data Analysis Based on Mixed Effects Model
1
Authors: Yuanbo Dai. 《Journal of Computer and Communications》, 2025, No. 2, pp. 223-235 (13 pages)
DNA microarray technology is an extremely effective technique for studying gene expression patterns in cells, and the main challenge it currently faces is how to analyze the large amount of gene expression data generated. To address this, this paper employs a mixed-effects model to analyze gene expression data. For the data, 1176 genes were chosen from a white mouse gene expression dataset under two experimental conditions, pneumococcal infection and no infection, and a mixed-effects model was constructed. After preprocessing the gene chip information, the data were imported into the model, preliminary results were calculated, and permutation tests were performed to biologically validate the preliminary results using GSEA. The final dataset consists of 20 groups of gene expression data from pneumococcal infection, which categorizes functionally related genes based on the similarity of their expression profiles, facilitating the study of genes with unknown functions.
Keywords: Mixed Effects Model; Gene Expression Data Analysis; Gene Analysis; Gene Chip
A Novel Soft Clustering Approach for Gene Expression Data
2
Authors: E. Kavitha, R. Tamilarasan, Arunadevi Baladhandapani, M. K. Jayanthi Kannan. 《Computer Systems Science & Engineering》 (SCIE, EI), 2022, No. 12, pp. 871-886 (16 pages)
Gene expression data represents a condition matrix where each row represents a gene and each column shows a condition. Microarrays are used to detect gene expression in the lab for thousands of genes at a time. Genes encode proteins, which in turn dictate cell function. The production of messenger RNA and its processing are the two main stages of gene expression. The complexity of biological networks, added to the volume of data containing imprecision and outliers, increases the challenges in dealing with them. Clustering methods are hence essential to identify the patterns present in massive gene data. Many techniques, including hierarchical, partitioning, grid-based, density-based, model-based and soft clustering approaches, are used for dealing with gene expression data. Understanding gene regulation and other useful information from this data is possible only through effective clustering algorithms. Though many methods are discussed in the literature, we concentrate on providing a soft clustering approach for analyzing gene expression data. The population elements are grouped based on the fuzziness principle, and a degree of membership is assigned to all the elements. An improved Fuzzy clustering by Local Approximation of Memberships (FLAME) is proposed in this work, which overcomes the limitations of the other approaches in dealing with non-linear relationships and provides better segregation of biological functions.
Keywords: reinforcement; membership; centroid; threshold; statistics; bioinformatics; gene expression data
Deep Learning Enabled Microarray Gene Expression Classification for Data Science Applications
3
Authors: Areej A. Malibari, Reem M. Alshehri, Fahd N. Al-Wesabi, Noha Negm, Mesfer Al Duhayyim, Anwer Mustafa Hilal, Ishfaq Yaseen, Abdelwahed Motwakel. 《Computers, Materials & Continua》 (SCIE, EI), 2022, No. 11, pp. 4277-4290 (14 pages)
In bioinformatics applications, examination of microarray data has received significant interest for diagnosing diseases. Microarray gene expression data can be defined by a massive search space that poses a primary challenge in the appropriate selection of genes. Microarray data classification incorporates multiple disciplines such as bioinformatics, machine learning (ML), data science, and pattern classification. This paper designs an optimal deep neural network based microarray gene expression classification (ODNN-MGEC) model for bioinformatics applications. The proposed ODNN-MGEC technique performs data normalization to bring the data onto a uniform scale. In addition, an improved fruit fly optimization (IFFO) based feature selection technique is used to reduce the high dimensionality of the biomedical data. Moreover, a deep neural network (DNN) model is applied to classify the microarray gene expression data, and the hyperparameters of the DNN model are tuned using the Symbiotic Organisms Search (SOS) algorithm. The use of the IFFO and SOS algorithms paves the way for maximum gene expression classification outcomes. To examine the improved outcomes of the ODNN-MGEC technique, a wide-ranging experimental analysis is made against benchmark datasets. The extensive comparison study with recent approaches demonstrates the enhanced outcomes of the ODNN-MGEC technique in terms of different measures.
Keywords: bioinformatics; data science; microarray; gene expression; data classification; deep learning; metaheuristics
Prediction of Lung Cancer Stage Using Tumor Gene Expression Data
4
Authors: Yadi Gu. 《Journal of Cancer Therapy》, 2024, No. 8, pp. 287-302 (16 pages)
Lung cancer remains a significant global health challenge, and identifying lung cancer at an early stage is essential for enhancing patient outcomes. The study focuses on developing and optimizing gene expression-based models for classifying cancer types using machine learning techniques. By applying Log2 normalization to gene expression data and conducting Wilcoxon rank sum tests, the researchers employed various classifiers and Incremental Feature Selection (IFS) strategies. The study culminated in two optimized models using the XGBoost classifier, comprising 10 and 74 genes respectively. The 10-gene model, due to its simplicity, is proposed for easier clinical implementation, whereas the 74-gene model exhibited superior performance in terms of specificity, AUC (Area Under the Curve), and precision. These models were evaluated based on their sensitivity, AUC, and specificity, aiming to achieve high sensitivity and AUC while maintaining reasonable specificity.
Keywords: Lung Cancer Detection; Stage Prediction; Gene Expression Data; XGBoost; Machine Learning
Analysis of Gene Expression Profiles of Rice Mutant SLR1 Based on Microarray Data
5
Authors: Weihua LIU, Yue CHEN, Lingxian WANG, Ge HUANG, Qian ZOU, Zhenhua ZHU, Mingliang DING. 《Asian Agricultural Research》, 2019, No. 1, pp. 54-55, 59 (3 pages)
Gibberellins are an important class of plant hormones. They play an important regulatory role in all stages of growth and development of higher plants. The use of mutants to study gibberellin metabolism and signal transduction pathways is currently a research hotspot. Taking Affymetrix rice chip data as an example, this article uses bioinformatics methods to study the rice SLR1 mutant and mine differentially expressed wild-type genes, thus exploring the expression regulation network of genes related to the gibberellin signaling pathway.
Keywords: gibberellin; gene chip; data mining
Incorporating heterogeneous biological data sources in clustering gene expression data
6
Authors: Gang-Guo Li, Zheng-Zhi Wang. 《Health》, 2009, No. 1, pp. 17-23 (7 pages)
In this paper, a similarity measure between genes with protein-protein interactions is proposed. The ChIP-chip data are converted into the same form as gene expression data, with Pearson correlation as the similarity measure. On the basis of the similarity measures of protein-protein interaction data and ChIP-chip data, the combined dissimilarity measure is defined. The combined distance measure is introduced into the K-means method, which can be considered an improved K-means method. The improved K-means method and three other clustering methods are evaluated on a real dataset. Performance of these methods is assessed by a prediction accuracy analysis using known gene annotations. Our results show that the improved K-means method outperforms the other clustering methods. The performance of the improved K-means method is also tested by varying the tuning coefficients of the combined dissimilarity measure. The results show that it is very helpful and meaningful to incorporate heterogeneous data sources in clustering gene expression data, and that coefficients for the genome-wide or complete data sources should be given larger values when constructing the combined dissimilarity measure.
Keywords: Statistical Analysis; Similarity/Dissimilarity Measure; Gene Expression Data; Clustering; Data Fusion
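As a rough illustration of the combined dissimilarity idea in the abstract above, the following minimal sketch assumes a weighted sum of a Pearson-based expression dissimilarity and a protein-protein interaction dissimilarity. The abstract does not give the actual formula, so the `1 - r` form, the function names, and the weights `w_expr` and `w_ppi` are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # assumption: treat a constant profile as uncorrelated
    return cov / (sx * sy)

def combined_dissimilarity(expr_a, expr_b, ppi_dissim, w_expr=0.5, w_ppi=0.5):
    """Hypothetical combination: weighted sum of (1 - Pearson r) and a
    PPI-based dissimilarity that the caller supplies in [0, 1]."""
    expr_dissim = 1.0 - pearson(expr_a, expr_b)
    return w_expr * expr_dissim + w_ppi * ppi_dissim
```

Raising `w_ppi` relative to `w_expr` mirrors the abstract's remark that coefficients for the more complete data source should be given larger values; a K-means variant would use this function in place of its usual distance.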
Genome-Wide Identification of Genes Responsive to ABA and Cold/Salt Stresses in Gossypium hirsutum by Data-Mining and Expression Pattern Analysis
7
Authors: ZHU Long-fu, HE Xin, YUAN Dao-jun, XU Lian, XU Li, TU Li-li, SHEN Guo-xin, ZHANG Hong, ZHANG Xian-long. 《Agricultural Sciences in China》 (CAS, CSCD), 2011, No. 4, pp. 499-508 (10 pages)
To make better use of the nucleic acid resources of Gossypium hirsutum, a data-mining method was used to identify putative genes responsive to various abiotic stresses in G. hirsutum. Based on a compiled database of genes involved in abiotic stress response in Arabidopsis thaliana and the comprehensive analysis tool GENEVESTIGATOR v3, 826 genes significantly up-regulated or down-regulated in roots or leaves during salt or cold treatment in Arabidopsis were identified. Compared against these 826 annotated Arabidopsis genes, 38 homologous expressed sequence tags (ESTs) from G. hirsutum were selected randomly, and their expression patterns were studied using quantitative real-time reverse transcription-polymerase chain reaction. Among these 38 ESTs, about 55% of the genes (21 of 38) differed in their response to ABA between cotton and Arabidopsis, whereas 70% of the genes had similar responses to cold and salt treatments, and some that had not been characterized in Arabidopsis are now being investigated in gene function studies. According to these results, this approach of analyzing ESTs appears effective for large-scale identification of cotton genes involved in abiotic stress and might be adopted to determine gene functions in various biological processes in cotton.
Keywords: cold stress; salt stress; data-mining; gene; Gossypium hirsutum
Optimizing Cancer Classification and Gene Discovery with an Adaptive Learning Search Algorithm for Microarray Analysis
8
Authors: Chiwen Qu, Heng Yao, Tingjiang Pan, Zenghui Lu. 《Journal of Bionic Engineering》, 2025, No. 2, pp. 901-930 (30 pages)
DNA microarrays, a cornerstone in biomedicine, measure gene expression across thousands to tens of thousands of genes. Identifying the genes vital for accurate cancer classification is a key challenge. Here, we present Fs-LSA (F-score based Learning Search Algorithm), a novel gene selection algorithm designed to enhance the precision and efficiency of target gene identification from microarray data for cancer classification. This algorithm is divided into two phases: the first leverages F-score values to prioritize and select feature genes with the most significant differential expression; the second phase introduces our Learning Search Algorithm (LSA), which harnesses swarm intelligence to identify the optimal subset among the remaining genes. Inspired by human social learning, LSA integrates historical data and collective intelligence for a thorough search, with a dynamic control mechanism that balances exploration and refinement, thereby enhancing the gene selection process. We conducted a rigorous validation of Fs-LSA’s performance using eight publicly available cancer microarray expression datasets. Fs-LSA achieved accuracy, precision, sensitivity, and F1-score values of 0.9932, 0.9923, 0.9962, and 0.994, respectively. Comparative analyses with state-of-the-art algorithms revealed Fs-LSA’s superior performance in terms of simplicity and efficiency. Additionally, we validated the algorithm’s efficacy independently using glioblastoma data from the GEO and TCGA databases. Its performance was significantly superior to that of the comparison algorithms. Importantly, the driver genes identified by Fs-LSA were instrumental in developing a predictive model as an independent prognostic indicator for glioblastoma, underscoring Fs-LSA’s transformative potential in genomics and personalized medicine.
Keywords: gene selection; learning search algorithm; gene expression data; classification
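The first phase described in the abstract above ranks genes by F-score. The abstract does not state the exact formula, so the sketch below assumes the commonly used two-class F-score (between-class separation over within-class sample variance). The function names and data layout are hypothetical, and the second phase (LSA) is not attempted.

```python
def f_score(pos, neg):
    """Two-class F-score for one gene (assumed standard form).
    Each class needs at least two samples for the sample variance."""
    all_vals = pos + neg
    m_all = sum(all_vals) / len(all_vals)
    m_pos = sum(pos) / len(pos)
    m_neg = sum(neg) / len(neg)
    var_pos = sum((v - m_pos) ** 2 for v in pos) / (len(pos) - 1)
    var_neg = sum((v - m_neg) ** 2 for v in neg) / (len(neg) - 1)
    denom = var_pos + var_neg
    if denom == 0:
        # No within-class spread: perfectly separated or identical classes.
        return float("inf") if m_pos != m_neg else 0.0
    return ((m_pos - m_all) ** 2 + (m_neg - m_all) ** 2) / denom

def top_genes(expr, labels, k):
    """Return the k gene names with the highest F-score.
    expr maps gene name -> list of values; labels holds 0/1 per sample."""
    scores = {}
    for gene, values in expr.items():
        pos = [v for v, y in zip(values, labels) if y == 1]
        neg = [v for v, y in zip(values, labels) if y == 0]
        scores[gene] = f_score(pos, neg)
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

A gene whose class means are far apart relative to its within-class spread scores high, which is one way to read "most significant differential expression" in the abstract.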
Challenges Analyzing RNA-Seq Gene Expression Data
9
Authors: Liliana López-Kleine, Cristian González-Prieto. 《Open Journal of Statistics》, 2016, No. 4, pp. 628-636 (9 pages)
The analysis of messenger ribonucleic acid data obtained through sequencing techniques (RNA-sequencing) is very challenging. Once technical difficulties have been sorted out, an important choice has to be made during pre-processing between two paths: transform the RNA-sequencing count data to a continuous variable, or continue to work with count data. For each data type, analysis tools have been developed and seem appropriate at first sight, but a deeper analysis of data distribution and structure is worth discussing. In this review, open questions regarding the nature of RNA-sequencing data are discussed and highlighted, indicating important future research topics in statistics that should be addressed for a better analysis of existing and newly appearing gene expression data. Moreover, a comparative analysis of RNA-seq count and transformed data is presented. This comparison indicates that transforming RNA-seq count data seems appropriate, at least for differential expression detection.
Keywords: RNA-Seq Analysis; Count Data; Preprocessing; Differential Expression; Gene Co-Expression Network
Application of Gene Ontology in Biological Data Integration (Cited by: 8)
10
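Authors: 夏燕, 张忠平, 曹顺良, 朱扬勇, 李亦学. 《计算机工程》 (EI, CAS, CSCD, PKU Core), 2005, No. 2, pp. 57-58, 76 (3 pages)
Efficient integration of heterogeneous data has important theoretical value and practical significance today, as biological data grows explosively and biological databases become ever more complex. Based on BioDW, an integrated bioinformatics data warehouse platform, this paper uses the unified Gene Ontology semantic model to establish semantic links between heterogeneous databases. This effectively solves the integration problem of heterogeneous biological data at the level of concepts and relationships, enables intelligent multiple, compound, and cross retrieval of biological data, and lays a solid foundation for further bioinformatics research.
Keywords: biology; integration problem; practice; retrieval; data integration; level; relationship; heterogeneous database; semantic model; data warehouse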
GEO (Gene Expression Omnibus): A High-Throughput Gene Expression Database (Cited by: 9)
11
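Authors: 刘华, 马文丽, 郑文岭. 《中国生物化学与分子生物学报》 (CAS, CSCD, PKU Core), 2007, No. 3, pp. 236-244 (9 pages)
The GEO (Gene Expression Omnibus) database covers a broad range of high-throughput experimental data, including single-channel and dual-channel microarray-based measurements of mRNA abundance and experimental data on genomic DNA and protein molecules. Data from non-array-based high-throughput functional genomics and proteomics technologies, such as serial analysis of gene expression (SAGE) and protein identification techniques, are also archived. To date, the GEO database holds data covering 10000 hybridization experiments and SAGE libraries from 30 different organisms. This paper outlines querying and browsing the GEO database, data download and formats, data analysis, and storage and updating, with a focus on the use of controlled vocabulary in the GEO data browser, and discusses data mining of the GEO database and the application prospects of GEO in molecular biology. GEO is publicly accessible at http://www.ncbi.nlm.nih.gov/projects/geo/.
Keywords: gene expression; database; controlled vocabulary; data mining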
Application of GeneSifter in Data Mining of Gene Expression Profile Chip Data (Cited by: 5)
12
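Authors: 廖之君, 马文丽, 梁爽, 郑文岭. 《医学信息(西安上半月)》, 2007, No. 11, pp. 1882-1887 (6 pages)
This paper introduces GeneSifter, a gene chip data analysis tool that is fast, intuitive, and convenient, and is especially suited to data mining of gene expression profiles. Chip data are generally uploaded as formatted text files, and four upload wizards are available depending on the experimental purpose and design. Data analysis is entered through Pairwise or Projects under the Analysis item of the console, and requires setting parameters such as filtering, thresholds, and statistical analysis. Pairwise results include a list of significantly differentially expressed genes, a gene ontology report, and a KEGG pathway report; Projects is more powerful and provides six kinds of results, including clustering. GeneSifter's distinctive one-stop, single-click setup provides up-to-date links to 11 databases for related genes. GeneSifter is suitable for biological researchers doing gene chip data mining.
Keywords: GeneSifter software; data mining; gene ontology terms; KEGG pathway; clustering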
DYRK2: A Novel Therapeutic Target for Rheumatoid Arthritis with Comorbid Osteoporosis Revealed in East Asian and European Populations
13
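Authors: 吴治林, 何秦, 王枰稀, 石现, 袁松, 张骏, 王浩. 《中国组织工程研究》 (PKU Core), 2026, No. 6, pp. 1569-1579 (11 pages)
BACKGROUND: Studies have shown that rheumatoid arthritis and osteoporosis tend to be positively correlated, but the causal relationship and underlying mechanisms remain unconfirmed. With the convergence of computer science and life science, Mendelian randomization and bioinformatics analysis based on genome-wide association study (GWAS) data and transcriptome sequencing data can assess the causal relationship between the two diseases, explore the related mechanisms, and identify therapeutic targets, which will benefit the precise treatment of rheumatoid arthritis with comorbid osteoporosis. OBJECTIVE: To analyze the causal relationship between rheumatoid arthritis and osteoporosis using two-sample Mendelian randomization, and to identify potential comorbidity targets and targeted drugs using summary-data-based Mendelian randomization and bioinformatics analysis, with the aim of providing a theoretical basis for mechanism exploration and precise treatment of rheumatoid arthritis with comorbid osteoporosis. METHODS: (1) GWAS data on rheumatoid arthritis, osteoporosis, and cis-expression quantitative trait loci, based on Asian and European populations, were downloaded from the GWAS Catalog, IEU Open GWAS, FinnGen, and eQTLGen databases for two-sample Mendelian randomization and summary-data-based Mendelian randomization analyses. (2) Transcriptome sequencing data for rheumatoid arthritis (GSE93272 and GSE15573) were downloaded from the GEO database for bioinformatics analysis. (3) With the inverse-variance weighted method as the primary analysis, forward and reverse two-sample Mendelian randomization analyses between rheumatoid arthritis and osteoporosis were performed, and the results were corroborated with the MR Egger, simple mode, weighted median, and weighted mode methods. (4) Summary-data-based Mendelian randomization analysis was used to identify genes associated with rheumatoid arthritis and osteoporosis, and comorbidity targets of the two diseases were identified by intersection analysis. The biological functions of the comorbidity targets were then validated by bioinformatics analysis and cell experiments. (5) In addition, a rheumatoid arthritis risk prediction nomogram was constructed based on DYRK2, and its predictive performance was validated with receiver operating characteristic curves, calibration curves, and decision curves. Finally, potential drugs for the target were mined from the Enrichr database and molecular docking was performed. RESULTS AND CONCLUSION: (1) Forward Mendelian randomization showed that, except for GCST90044540 and GCST90086118, which were not statistically significant, all results indicated a significant, positive causal relationship between rheumatoid arthritis and osteoporosis. (2) Reverse Mendelian randomization found no significant causal relationship between osteoporosis and rheumatoid arthritis. (3) Summary-data-based Mendelian randomization identified 412 and 344 genes positively associated with rheumatoid arthritis and osteoporosis, respectively, and 421 and 347 negatively associated genes. Intersection analysis yielded 26 comorbidity genes. Among them, DYRK2 is a potential therapeutic target, and subsequent bioinformatics analysis and cell experiments confirmed that DYRK2 plays an important role in the progression of rheumatoid arthritis and osteoporosis. (4) In addition, the constructed nomogram showed excellent predictive performance. Finally, four potential DYRK2-targeting drugs were identified (Undecanoic Acid, Metyrapone, JNJ-38877605, and ACA), and molecular docking also demonstrated reliable targeting ability. (5) In summary, GWAS data based on Asian and European populations demonstrated a causal relationship between rheumatoid arthritis and osteoporosis at the genetic level; DYRK2 is a potential therapeutic target, and four small molecules are potential targeted drugs.
Keywords: rheumatoid arthritis; osteoporosis; Mendelian randomization; summary-data-based Mendelian randomization; comorbidity genes; DYRK2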
Parallel Design and Optimization of the Gene Panel Pipeline (Cited by: 1)
14
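Authors: 王元戎, 曾平, 臧大伟, 谭光明, 孙凝晖. 《计算机学报》 (EI, CSCD, PKU Core), 2019, No. 11, pp. 2429-2446 (18 pages)
With the rapid development of second-generation sequencing technology, the cost of gene sequencing has fallen quickly, leading to explosive growth in genetic data, and gene data analysis tools are increasingly unable to meet analysis demands at this scale. On the one hand, most gene data analysis tools still execute serially, so they cannot effectively exploit multi-core architectures to improve performance and they seriously waste computing resources. On the other hand, owing to limitations in earlier design and development, the underlying algorithm libraries these tools depend on cannot offer both high performance and a friendly user interface. Gene Panel is a mainstream gene data analysis pipeline for cancer detection, and it too is composed of multiple gene data analysis tools. Targeting the Gene Panel pipeline, this paper (1) designs and implements a new parallel Gene Panel gene data analysis pipeline that, through data parallelism and task parallelism combined with load balancing and other optimizations, effectively improves resource utilization on multi-core platforms and achieves an overall speedup of 4 to 7 times; and (2) designs and implements HCC, a high-performance underlying library for gene data analysis with a friendly interface. Because of similar algorithmic characteristics, the optimization methods in this paper also apply to sequencing pipelines other than Gene Panel.
Keywords: big data; Gene Panel; parallel optimization; load balancing; underlying library optimization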
Identify the signature genes for diagnose of uveal melanoma by weight gene co-expression network analysis (Cited by: 10)
15
Authors: Kai Shi, Zhi-Tong Bing, Gui-Qun Cao, Ling Guo, Ya-Na Cao, Hai-Ou Jiang, Mei-Xia Zhang. 《International Journal of Ophthalmology(English edition)》 (SCIE, CAS), 2015, No. 2, pp. 269-274 (6 pages)
AIM: To identify and understand the relationship between co-expression patterns and clinical traits in uveal melanoma, weighted gene co-expression network analysis (WGCNA) is applied to investigate gene expression levels and patient clinical features. Uveal melanoma is the most common primary eye tumor in adults. Although many studies have identified important genes and pathways relevant to the progression of uveal melanoma, the relationship between co-expression and clinical traits at the systems level remains unclear. We employ WGCNA to investigate the relationship between molecular features and phenotype in this study. METHODS: Gene expression profiles of uveal melanoma and patient clinical traits were collected from the Gene Expression Omnibus (GEO) database. Gene co-expression was calculated with WGCNA, an R package that analyzes the correlation between pairs of gene expression levels. The functions of the genes were annotated by gene ontology (GO). RESULTS: We identified four co-expression modules significantly correlated with clinical traits. Module blue positively correlates with radiotherapy treatment. Module purple positively correlates with tumor location (sclera) and negatively correlates with patient age. Module red positively correlates with sclera and negatively correlates with tumor thickness. Module black positively correlates with the largest tumor diameter (LTD). Additionally, we identified the hub gene (the gene with top connectivity to other genes) in each module. The hub genes RPS15A, PTGDS, CD53 and MSI2 might play a vital role in the progression of uveal melanoma. CONCLUSION: From WGCNA and hub gene calculation, we identified RPS15A, PTGDS, CD53 and MSI2 as possible targets or diagnostic markers for uveal melanoma.
Keywords: weighted gene co-expression network analysis; microarray data; gene ontology
A Survey on Acute Leukemia Expression Data Classification Using Ensembles
16
Authors: Abdel Nasser H. Zaied, Ehab Rushdy, Mona Gamal. 《Computer Systems Science & Engineering》 (SCIE, EI), 2023, No. 11, pp. 1349-1364 (16 pages)
Acute leukemia is an aggressive disease that has high mortality rates worldwide. The error rate can be as high as 40% when classifying acute leukemia into its subtypes. So, there is an urgent need to support hematologists during the classification process. More than two decades ago, researchers used microarray gene expression data to classify cancer and adopted acute leukemia as a test case. The high classification accuracy they achieved confirmed that it is possible to classify cancer subtypes using microarray gene expression data. Ensemble machine learning is an effective method that combines individual classifiers to classify new samples. Ensemble classifiers are recognized as powerful algorithms with numerous advantages over traditional classifiers. Over the past few decades, researchers have focused a great deal of attention on ensemble classifiers in a wide variety of fields, including but not limited to disease diagnosis, finance, bioinformatics, healthcare, manufacturing, and geography. This paper reviews the recent ensemble classifier approaches utilized for acute leukemia gene expression data classification. Moreover, a framework for classifying acute leukemia gene expression data is proposed. The pairwise correlation gene selection method and the Rotation Forest of Bayesian Networks are both used in this framework. Experimental outcomes show that the classification accuracy achieved by the acute leukemia ensemble classifiers constructed according to the suggested framework is good compared to the classification accuracy achieved in other studies.
Keywords: leukemia; classification; ensemble; rotation forest; pairwise correlation; Bayesian networks; gene expression data; microarray; gene selection
Data Mining Based on Principal Component Analysis Application to the Nitric Oxide Response in Escherichia coli
17
Authors: AiLing Teh, Donovan Layton, Daniel R. Hyduke, Laura R. Jarboe, Derrick K. Rollins Sd. 《Journal of Statistical Science and Application》, 2014, No. 1, pp. 1-18 (18 pages)
This work evaluates a recently developed multivariate statistical method based on the creation of pseudo or latent variables using principal component analysis (PCA). The application is the data mining of gene expression data to find a small subset of the most important genes in a set of thousands or tens of thousands of genes from a relatively small number of experimental runs. The method was previously developed and evaluated on artificially generated data and real data sets. Its evaluations consisted of its ability to rank the genes against known truth in simulated data studies and to identify known important genes in real data studies. The purpose of the work described here is to identify a ranked set of genes in an experimental study and then, for a few of the most highly ranked unverified genes, experimentally verify their importance. This method was evaluated using the transcriptional response of Escherichia coli to treatment with four distinct inhibitory compounds: nitric oxide, S-nitrosoglutathione, serine hydroxamate and potassium cyanide. Our analysis identified genes previously recognized in the response to these compounds and also identified new genes. Three of these new genes, ycbR, yjhA and yahN, were found to significantly (p-values < 0.002) affect the sensitivity of E. coli to nitric oxide-mediated growth inhibition. Given that the three genes were not highly ranked in the selected ranked set (RS), these results support strong sensitivity in the ability of the method to successfully identify genes related to challenge by NO and GSNO. This ability to identify genes related to the response to an inhibitory compound is important for engineering tolerance to inhibitory metabolic products, such as biofuels, and utilization of cheap sugar streams, such as biomass-derived sugars or hydrolysate.
Keywords: data mining; principal component analysis (PCA); gene expression data analysis
Modeling of gene regulatory networks: A review
18
Authors: Nedumparambathmarath Vijesh, Swarup Kumar Chakrabarti, Janardanan Sreekumar. 《Journal of Biomedical Science and Engineering》, 2013, No. 2, pp. 223-231 (9 pages)
Gene regulatory networks play an important role in the molecular mechanisms underlying biological processes. Modeling these networks is an important challenge to be addressed in the post-genomic era. Several methods have been proposed for estimating gene networks from gene expression data. Computational methods for developing network models and analyzing their functionality have proved to be valuable tools in bioinformatics applications. This paper reviews the different methods for reconstructing gene regulatory networks.
Keywords: Gene Network; Gene Expression Data; Gene Regulation
The application of hidden markov model in building genetic regulatory network
19
Authors: Rui-Rui Ji, Ding Liu, Wen Zhang. 《Journal of Biomedical Science and Engineering》, 2010, No. 6, pp. 633-637 (5 pages)
The research focus in the post-genomic era has moved from sequence to function. Building a genetic regulatory network (GRN) can help to understand the regulatory mechanisms between genes and the function of organisms. Probabilistic GRNs have received increasing attention recently. This paper discusses the Hidden Markov Model (HMM) approach as a tool to build GRNs. Different genes with similar expression levels are considered as different states when training the HMM. The probable regulatory genes of target genes can be found through the resulting state transition matrix, and the determinate regulatory functions can be predicted using a nonlinear regression algorithm. Experiments on artificial and real-life datasets show the effectiveness of HMM in building GRNs.
Keywords: Genetic Regulatory Network; Hidden Markov Model; State Transition; Gene Expression Data