777 articles found
Gene Expression Data Analysis Based on Mixed Effects Model
1
Authors: Yuanbo Dai. Journal of Computer and Communications, 2025, Issue 2, pp. 223-235 (13 pages)
DNA microarray technology is an extremely effective technique for studying gene expression patterns in cells, and the main challenge currently faced by this technology is how to analyze the large amount of gene expression data generated. To address this, this paper employs a mixed-effects model to analyze gene expression data. For data selection, 1176 genes from a white mouse gene expression dataset were chosen under two experimental conditions, pneumococcal infection and no infection, and a mixed-effects model was constructed. After preprocessing the gene chip information, the data were imported into the model, preliminary results were calculated, and permutation tests were performed to biologically validate the preliminary results using GSEA. The final dataset consists of 20 groups of gene expression data from pneumococcal infection, which categorizes functionally related genes based on the similarity of their expression profiles, facilitating the study of genes with unknown functions.
Keywords: Mixed Effects Model; Gene Expression Data Analysis; Gene Analysis; Gene Chip
Comparative study of microarray and experimental data on Schwann cells in peripheral nerve degeneration and regeneration: big data analysis (Cited by: 6)
2
Authors: Ulfuara Shefa, Junyang Jung. Neural Regeneration Research [SCIE, CAS, CSCD], 2019, Issue 6, pp. 1099-1104 (6 pages)
A Schwann cell has regenerative capabilities and is an important cell in the peripheral nervous system. This microarray study is part of a bioinformatics study that focuses mainly on Schwann cells. Microarray data provide information on differences between microarray-based and experiment-based gene expression analyses. According to microarray data, several genes exhibit increased expression (fold change) but are weakly expressed in experimental studies (based on morphology, protein and mRNA levels). In contrast, some genes are weakly expressed in microarray data and highly expressed in experimental studies; such genes may represent future target genes in Schwann cell studies. These studies allow us to learn about additional genes that could be used to achieve targeted results from experimental studies. In the current big data study, by retrieving more than 5000 scientific articles from PubMed or NCBI, Google Scholar, and Google, 1016 (up- and downregulated) genes were determined to be related to Schwann cells. However, no experiment was performed in the laboratory; rather, the present study is part of a big data analysis. Our study will contribute to our understanding of Schwann cell biology by aiding in the identification of genes. Based on a comparative analysis of all microarray data, we conclude that the microarray could be a good tool for predicting the expression and intensity of different genes of interest in actual experiments.
Keywords: Schwann cells; big data analysis; peripheral nerve degeneration; peripheral nerve regeneration; microarray; matched genes; promising genes; gene ranking
For robust big data analyses: a collection of 150 important pro-metastatic genes (Cited by: 3)
3
Authors: Yan Mei, Jun-Ping Yang, Chao-Nan Qian. Chinese Journal of Cancer [SCIE, CAS, CSCD], 2017, Issue 3, pp. 112-120 (9 pages)
Metastasis is the greatest contributor to cancer-related death. In the era of precision medicine, it is essential to predict and to prevent the spread of cancer cells to significantly improve patient survival. Thanks to the application of a variety of high-throughput technologies, accumulating big data enables researchers and clinicians to identify aggressive tumors as well as patients with a high risk of cancer metastasis. However, there have been few large-scale gene collection studies to enable metastasis-related analyses. In the last several years, emerging efforts have identified pro-metastatic genes in a variety of cancers, providing us the ability to generate a pro-metastatic gene cluster for big data analyses. We carefully selected 285 genes with in vivo evidence of promoting metastasis reported in the literature. These genes have been investigated in different tumor types. We used two datasets downloaded from The Cancer Genome Atlas database, specifically, datasets of clear cell renal cell carcinoma and hepatocellular carcinoma, for validation tests, and excluded any genes for which elevated expression level correlated with longer overall survival in any of the datasets. Ultimately, 150 pro-metastatic genes remained in our analyses. We believe this collection of pro-metastatic genes will be helpful for big data analyses, and eventually will accelerate anti-metastasis research and clinical intervention.
Keywords: Pro-metastatic gene; Big data analysis; Renal cancer; Liver cancer
Modeling viscosity of methane, nitrogen, and hydrocarbon gas mixtures at ultra-high pressures and temperatures using group method of data handling and gene expression programming techniques (Cited by: 1)
4
Authors: Farzaneh Rezaei, Saeed Jafari, Abdolhossein Hemmati-Sarapardeh, Amir H. Mohammadi. Chinese Journal of Chemical Engineering [SCIE, EI, CAS, CSCD], 2021, Issue 4, pp. 431-445 (15 pages)
Accurate gas viscosity determination is an important issue in the oil and gas industries. Experimental approaches for gas viscosity measurement are time-consuming, expensive and hardly possible at high pressures and high temperatures (HPHT). In this study, a number of correlations were developed to estimate gas viscosity by the use of group method of data handling (GMDH) type neural network and gene expression programming (GEP) techniques using a large data set containing more than 3000 experimental data points for methane, nitrogen, and hydrocarbon gas mixtures. It is worth mentioning that unlike many viscosity correlations, the ones proposed in this study could compute gas viscosity at pressures ranging between 34 and 172 MPa and temperatures between 310 and 1300 K. Also, a comparison was performed between the results of these established models and the results of ten well-known models reported in the literature. The average absolute relative errors of the GMDH models were 4.23%, 0.64%, and 0.61% for hydrocarbon gas mixtures, methane, and nitrogen, respectively. In addition, graphical analyses indicate that the GMDH can predict gas viscosity with higher accuracy than GEP at HPHT conditions. Also, using the leverage technique, valid, suspected and outlier data points were determined. Finally, trends of gas viscosity models at different conditions were evaluated.
Keywords: Gas; Viscosity; High pressure high temperature; Group method of data handling; Gene expression programming
Identification of candidate genes controlling fiber quality traits in upland cotton through integration of meta-QTL, significant SNP and transcriptomic data (Cited by: 1)
5
Authors: XU Shudi, PAN Zhenyuan, YIN Feifan, YANG Qingyong, LIN Zhongxu, WEN Tianwang, ZHU Longfu, ZHANG Dawei, NIE Xinhui. Journal of Cotton Research, 2020, Issue 4, pp. 324-335 (12 pages)
Background: Meta-analysis of quantitative trait locus (QTL) is a computational technique to identify consensus QTL and refine QTL positions on the consensus map from multiple mapping studies. The combination of meta-QTL intervals, significant SNPs and transcriptome analysis has been widely used to identify candidate genes in various plants. Results: In our study, 884 QTLs associated with cotton fiber quality traits from 12 studies were used for meta-QTL analysis based on the reference genome TM-1. As a result, 74 meta-QTLs were identified, including 19 meta-QTLs for fiber length; 18 meta-QTLs for fiber strength; 11 meta-QTLs for fiber uniformity; 11 meta-QTLs for fiber elongation; and 15 meta-QTLs for micronaire. Combined with 8589 significant single nucleotide polymorphisms associated with fiber quality traits collected from 15 studies, 297 candidate genes were identified in the meta-QTL intervals, 20 of which showed high expression levels specifically in the developing fibers. According to the function annotations, some of the 20 key candidate genes are associated with fiber development. Conclusions: This study provides not only stable QTLs for marker-assisted selection, but also candidate genes to uncover the molecular mechanisms of cotton fiber development.
Keywords: Fiber quality traits; Meta-QTL; Significant SNPs; Candidate genes; Transcriptomic data
Incorporating heterogeneous biological data sources in clustering gene expression data
6
Authors: Gang-Guo Li, Zheng-Zhi Wang. Health, 2009, Issue 1, pp. 17-23 (7 pages)
In this paper, a similarity measure between genes with protein-protein interactions is proposed. The ChIP-chip data are converted into the same form as gene expression data, with Pearson correlation as the similarity measure. On the basis of the similarity measures of protein-protein interaction data and ChIP-chip data, the combined dissimilarity measure is defined. The combined distance measure is introduced into the K-means method, which can be considered an improved K-means method. The improved K-means method and three other clustering methods are evaluated on a real dataset. Performance of these methods is assessed by a prediction accuracy analysis through known gene annotations. Our results show that the improved K-means method outperforms the other clustering methods. The performance of the improved K-means method is also tested by varying the tuning coefficients of the combined dissimilarity measure. The results show that it is very helpful and meaningful to incorporate heterogeneous data sources in clustering gene expression data, and that the coefficients for the genome-wide or complete data sources should be given larger values when constructing the combined dissimilarity measure.
Keywords: Statistical Analysis; Similarity/Dissimilarity Measure; Gene Expression Data; Clustering; Data Fusion
Analysis of Gene Expression Profiles of Rice Mutant SLR1 Based on Microarray Data
7
Authors: Weihua LIU, Yue CHEN, Lingxian WANG, Ge HUANG, Qian ZOU, Zhenhua ZHU, Mingliang DING. Asian Agricultural Research, 2019, Issue 1, pp. 54-55, 59 (3 pages)
Gibberellins are an important class of plant hormones. They play an important regulatory role in all stages of growth and development of higher plants. The use of mutants to study gibberellin metabolism and signal transduction pathways is currently a research hotspot. Taking Affymetrix chip data of rice as an example, this article uses bioinformatics methods to study the rice SLR1 mutant and mine genes differentially expressed between the mutant and the wild type, thus exploring the expression regulation network of genes related to the gibberellin signaling pathway.
Keywords: Gibberellin; Gene chip; Data mining
Deep Learning Enabled Microarray Gene Expression Classification for Data Science Applications
8
Authors: Areej A. Malibari, Reem M. Alshehri, Fahd N. Al-Wesabi, Noha Negm, Mesfer Al Duhayyim, Anwer Mustafa Hilal, Ishfaq Yaseen, Abdelwahed Motwakel. Computers, Materials & Continua [SCIE, EI], 2022, Issue 11, pp. 4277-4290 (14 pages)
In bioinformatics applications, examination of microarray data has received significant interest for diagnosing diseases. Microarray gene expression data can be defined by a massive search space, which poses a primary challenge in the appropriate selection of genes. Microarray data classification incorporates multiple disciplines such as bioinformatics, machine learning (ML), data science, and pattern classification. This paper designs an optimal deep neural network based microarray gene expression classification (ODNN-MGEC) model for bioinformatics applications. The proposed ODNN-MGEC technique performs a data normalization process to bring the data onto a uniform scale. In addition, an improved fruit fly optimization (IFFO) based feature selection technique is used to reduce the high dimensionality of the biomedical data. Moreover, a deep neural network (DNN) model is applied for the classification of microarray gene expression data, and the hyperparameter tuning of the DNN model is carried out using the Symbiotic Organisms Search (SOS) algorithm. The utilization of the IFFO and SOS algorithms paves the way for accomplishing maximum gene expression classification outcomes. To examine the improved outcomes of the ODNN-MGEC technique, a wide-ranging experimental analysis is made against benchmark datasets. The extensive comparison study with recent approaches demonstrates the enhanced outcomes of the ODNN-MGEC technique in terms of different measures.
Keywords: Bioinformatics; data science; microarray; gene expression; data classification; deep learning; metaheuristics
A Novel Soft Clustering Approach for Gene Expression Data
9
Authors: E. Kavitha, R. Tamilarasan, Arunadevi Baladhandapani, M. K. Jayanthi Kannan. Computer Systems Science & Engineering [SCIE, EI], 2022, Issue 12, pp. 871-886 (16 pages)
Gene expression data represent a condition matrix in which each row represents a gene and each column a condition. Microarrays are used to detect gene expression in the lab for thousands of genes at a time. Genes encode proteins, which in turn dictate cell function. The production of messenger RNA and its processing are the two main stages involved in gene expression. The complexity of biological networks, together with the volume of data containing imprecision and outliers, increases the challenges in dealing with them. Clustering methods are hence essential to identify the patterns present in massive gene data. Many techniques, including hierarchical, partitioning, grid-based, density-based, model-based and soft clustering approaches, are used for dealing with gene expression data. Understanding gene regulation and other useful information from these data is possible only through effective clustering algorithms. Though many methods are discussed in the literature, we concentrate on providing a soft clustering approach for analyzing gene expression data. The population elements are grouped based on the fuzziness principle, and a degree of membership is assigned to all the elements. An improved Fuzzy clustering by Local Approximation of Memberships (FLAME) is proposed in this work, which overcomes the limitations of the other approaches in dealing with non-linear relationships and provides better segregation of biological functions.
Keywords: Reinforcement; membership; centroid; threshold; statistics; bioinformatics; gene expression data
Challenges Analyzing RNA-Seq Gene Expression Data
10
Authors: Liliana López-Kleine, Cristian González-Prieto. Open Journal of Statistics, 2016, Issue 4, pp. 628-636 (9 pages)
The analysis of messenger ribonucleic acid data obtained through sequencing techniques (RNA-sequencing) is very challenging. Once technical difficulties have been sorted out, an important choice has to be made during pre-processing between two different paths: transform RNA-sequencing count data to a continuous variable, or continue to work with count data. For each data type, analysis tools have been developed and seem appropriate at first sight, but a deeper analysis of data distribution and structure is worth discussing. In this review, open questions regarding the nature of RNA-sequencing data are discussed and highlighted, indicating important future research topics in statistics that should be addressed for a better analysis of already available and newly appearing gene expression data. Moreover, a comparative analysis of RNA-seq count and transformed data is presented. This comparison indicates that transforming RNA-seq count data seems appropriate, at least for differential expression detection.
Keywords: RNA-Seq Analysis; Count Data; Preprocessing; Differential Expression; Gene Co-Expression Network
Prediction of Lung Cancer Stage Using Tumor Gene Expression Data
11
Authors: Yadi Gu. Journal of Cancer Therapy, 2024, Issue 8, pp. 287-302 (16 pages)
Lung cancer remains a significant global health challenge, and identifying lung cancer at an early stage is essential for enhancing patient outcomes. The study focuses on developing and optimizing gene expression-based models for classifying cancer types using machine learning techniques. By applying Log2 normalization to gene expression data and conducting Wilcoxon rank sum tests, the researchers employed various classifiers and Incremental Feature Selection (IFS) strategies. The study culminated in two optimized models using the XGBoost classifier, comprising 10 and 74 genes respectively. The 10-gene model, due to its simplicity, is proposed for easier clinical implementation, whereas the 74-gene model exhibited superior performance in terms of specificity, AUC (Area Under the Curve), and precision. These models were evaluated based on their sensitivity, AUC, and specificity, aiming to achieve high sensitivity and AUC while maintaining reasonable specificity.
Keywords: Lung Cancer Detection; Stage Prediction; Gene Expression Data; XGBoost; Machine Learning
Mining and analysis of intracranial aneurysms formation and fracture-related genes based on Gene Expression Omnibus database
12
Authors: Jing-Bo Bai, Ting Zhang, Yue Tu, Yang Liu. Journal of Hainan Medical University, 2019, Issue 17, pp. 1-6 (6 pages)
Objective: To explore potential genes associated with the formation and rupture of intracranial aneurysms based on the Gene Expression Omnibus (GEO) database. Methods: A total of 133 mRNA microarrays were collected from the GEO database. Differential mRNA gene analysis was performed on the data of each group on the GEO2R platform, the common differential genes were screened, and gene ontology enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis were completed. The screened differential genes were entered into the STRING online database to obtain the interactions between the proteins encoded by the differential genes. Results: Forty-two common differential genes were screened, and the main biological processes involved included the transcriptional regulation of oxidative stress, the positive regulation of chemokine production, and the positive regulation of autophagy of giant cells by RNA polymerase II promoter. Molecular functions included protein binding, RNA polymerase II transcriptional co-repressor activity, transcriptional activator activity, and protein kinase C binding. The main signal pathways covered included the hypoxia-inducible factor-1 signaling pathway, the glucagon signaling pathway, and the metabolic pathway. Conclusions: The formation and rupture of intracranial aneurysms may be initially screened with amidoxime reduction component 1, tumor necrosis factor-α-inducible protein 6, haptoglobin, mast cell membrane-expressing protein 1, zipper containing kinase, phospholipase Cβ4 and blood and nervous system expression factor-1. In addition to the previously known mechanisms of intracranial aneurysms, cellular autophagy and the hypoxia-inducible factor-1 pathway may also be involved in the formation of intracranial aneurysms.
Keywords: Intracranial aneurysm; Data mining; Gene
The expression significance of RACGAP1 gene in hepatocellular carcinoma and its prognostic effect were analyzed based on TCGA database
13
Authors: Xiao-Meng Wang, Yu Chen. Cancer Advances, 2022, Issue 22, pp. 1-6 (6 pages)
Objective: To explore the expression and clinical significance of the RACGAP1 gene in hepatocellular carcinoma. Methods: Data on the RACGAP1 gene and clinicopathological data in liver cancer were retrieved from The Cancer Genome Atlas (TCGA). The relationships between RACGAP1 gene expression and clinicopathological parameters and prognosis were analyzed with R 2.15.3 software. The association between RACGAP1 gene expression and the prognosis of liver cancer patients was analyzed by Kaplan-Meier survival function analysis and Cox regression analysis. Results: The TCGA database was used to collect 235 cases of liver cancer with clinicopathological parameters and their corresponding RACGAP1 expression levels. After incomplete cases and those with no detailed pathological parameters were excluded, it was found that RACGAP1 was highly expressed in liver cancer tissues. The expression of RACGAP1 in patients with liver cancer in the TCGA tumor database was then further analyzed against the matching clinical data parameters. The expression level of RACGAP1 was significantly correlated with the pathological grade and T stage of liver cancer patients (all P<0.05), but was not significantly correlated with American Joint Committee on Cancer (AJCC) pathological stage or gender (P>0.05). There was a significant correlation between RACGAP1 expression level and overall survival (OS) in patients with liver cancer (P<0.05), and the overall survival time of patients with low expression was better than that of patients with high expression (P<0.05). Cox regression was used to analyze the correlation between T stage, M stage, N stage and RACGAP1 expression in patients with hepatocellular carcinoma (HCC), and RACGAP1 was an independent prognostic factor in patients with HCC (P<0.05). Conclusion: Based on the tumor-related gene information in the public TCGA database, the RACGAP1 gene is highly expressed in liver cancer tissues and is an independent prognostic factor for liver cancer, and it is expected to become an important therapeutic target for drug therapy of liver cancer.
Keywords: The Cancer Genome Atlas; liver cancer; RACGAP1; gene; data mining
Genome-Wide Identification of Genes Responsive to ABA and Cold/Salt Stresses in Gossypium hirsutum by Data-Mining and Expression Pattern Analysis
14
Authors: ZHU Long-fu, HE Xin, YUAN Dao-jun, XU Lian, XU Li, TU Li-li, SHEN Guo-xin, ZHANG Hong, ZHANG Xian-long. Agricultural Sciences in China [CAS, CSCD], 2011, Issue 4, pp. 499-508 (10 pages)
To make better use of the nucleic acid resources of Gossypium hirsutum, a data-mining method was used to identify putative genes responsive to various abiotic stresses in G. hirsutum. Based on a compiled database of genes involved in abiotic stress response in Arabidopsis thaliana and the comprehensive analysis tool GENEVESTIGATOR v3, 826 genes significantly up-regulated or down-regulated in roots or leaves during salt or cold treatment in Arabidopsis were identified. By comparison with these 826 annotated Arabidopsis genes, 38 homologous expressed sequence tags (ESTs) from G. hirsutum were selected randomly and their expression patterns were studied using a quantitative real-time reverse transcription-polymerase chain reaction method. Among these 38 ESTs, about 55% of the genes (21 of 38) differed in their response to ABA between cotton and Arabidopsis, whereas 70% of genes had similar responses to cold and salt treatments, and some of them that had not been characterized in Arabidopsis are now being investigated in gene function studies. According to these results, this approach of analyzing ESTs appears effective for large-scale identification of cotton genes involved in abiotic stress and might be adopted to determine gene functions in various biological processes in cotton.
Keywords: cold stress; salt stress; data-mining; gene; Gossypium hirsutum
Combining single-cell RNA-sequencing and bulk data to reveal immunity-related genes expression pattern in the systemic lupus erythematosus and target organ kidney
15
Authors: Ying Zhang, Tong Zhou, Yi-Ting Wang, Xiao-Xian Pei, Zhe Sun, Ming-Cheng Li, Wen-Gang Song. Medical Data Mining, 2023, Issue 1, pp. 1-9 (9 pages)
Background: Systemic lupus erythematosus (SLE) is a complex chronic autoimmune disease with no known cure. However, the regulatory mechanism of immunity-related genes in SLE is not fully understood. In order to explore new therapeutic targets, we used bioinformatics methods to analyze a series of data. Methods: After downloading and processing the data from the Gene Expression Omnibus database, the differentially expressed genes of SLE were analyzed. The CIBERSORT algorithm was used to analyze the immune infiltration of SLE. Based on single-cell RNA-sequencing data, the roles of immune-related genes in SLE and its target organ (kidney) were analyzed. Key transcription factors affecting immune-related genes were identified. Cell-cell communication networks in SLE were analyzed. Results: In total, 15 hub genes and 4 transcription factors were found in the bulk data. Monocytes and macrophages in GSE81622 (SLE) showed more infiltration. Four cell types were annotated in the scRNA sequencing dataset (GSE135779): T cells, monocytes, NK cells and B cells. Immunity-related genes were overexpressed in monocytes. Conclusion: The present study shows that immune-related genes affect SLE through monocytes and play an important role in renal injury of the target organ.
Keywords: systemic lupus erythematosus; single-cell RNA-sequencing data; immunity-related genes; lupus nephritis; monocytes
Optimizing Cancer Classification and Gene Discovery with an Adaptive Learning Search Algorithm for Microarray Analysis
16
Authors: Chiwen Qu, Heng Yao, Tingjiang Pan, Zenghui Lu. Journal of Bionic Engineering, 2025, Issue 2, pp. 901-930 (30 pages)
DNA microarrays, a cornerstone in biomedicine, measure gene expression across thousands to tens of thousands of genes. Identifying the genes vital for accurate cancer classification is a key challenge. Here, we present Fs-LSA (F-score based Learning Search Algorithm), a novel gene selection algorithm designed to enhance the precision and efficiency of target gene identification from microarray data for cancer classification. This algorithm is divided into two phases: the first leverages F-score values to prioritize and select feature genes with the most significant differential expression; the second introduces our Learning Search Algorithm (LSA), which harnesses swarm intelligence to identify the optimal subset among the remaining genes. Inspired by human social learning, LSA integrates historical data and collective intelligence for a thorough search, with a dynamic control mechanism that balances exploration and refinement, thereby enhancing the gene selection process. We conducted a rigorous validation of Fs-LSA’s performance using eight publicly available cancer microarray expression datasets. Fs-LSA achieved accuracy, precision, sensitivity, and F1-score values of 0.9932, 0.9923, 0.9962, and 0.994, respectively. Comparative analyses with state-of-the-art algorithms revealed Fs-LSA’s superior performance in terms of simplicity and efficiency. Additionally, we validated the algorithm’s efficacy independently using glioblastoma data from the GEO and TCGA databases, where its performance was significantly superior to that of the comparison algorithms. Importantly, the driver genes identified by Fs-LSA were instrumental in developing a predictive model as an independent prognostic indicator for glioblastoma, underscoring Fs-LSA’s transformative potential in genomics and personalized medicine.
Keywords: Gene selection; Learning search algorithm; Gene expression data; Classification
MaterialsGalaxy: A platform fusing experimental and theoretical data in condensed matter physics
17
Authors: Tiannian Zhu, Zhong Fang, Quansheng Wu, Hongming Weng. Chinese Physics B, 2025, Issue 12, pp. 208-216 (9 pages)
Modern materials science generates vast and diverse datasets from both experiments and computations, yet these multi-source, heterogeneous data often remain disconnected in isolated "silos". Here, we introduce MaterialsGalaxy, a comprehensive platform that deeply fuses experimental and theoretical data in condensed matter physics. Its core innovation is a structure similarity-driven data fusion mechanism that quantitatively links cross-modal records (spanning diffraction, crystal growth, computations, and literature) based on their underlying atomic structures. The platform integrates artificial intelligence (AI) tools, including large language models (LLMs) for knowledge extraction, generative models for crystal structure prediction, and machine learning property predictors, to enhance data interpretation and accelerate materials discovery. We demonstrate that MaterialsGalaxy effectively integrates these disparate data sources, uncovering hidden correlations and guiding the design of novel materials. By bridging the long-standing gap between experiment and theory, MaterialsGalaxy provides a new paradigm for data-driven materials research and accelerates the discovery of advanced materials.
Keywords: MaterialsGalaxy; data fusion; materials gene; materials database
Expression Profile Changes of Genes Involved in Lipid Metabolism Pathway During Liver Regeneration in Mice (Cited by: 1)
18
Authors: Yuan Yunsheng (袁运生), Zhang Xiyuan (张夕原), Yan Dejun (严德珺), Yang Tingxu (杨婷旭), Gao Jin (郜尽), Yu Yan (俞雁). Agricultural Science & Technology [CAS], 2009, Issue 2, pp. 41-45 (5 pages)
[Objective] The aim of the research was to study the expression profile changes of genes involved in the lipid metabolism pathway during liver regeneration in mice. [Method] A CCl4-induced mouse model of liver regeneration was established and total RNA was isolated from mouse liver tissue. Changes in genes involved in the lipid metabolism pathway at different stages of liver regeneration were then detected with a micro-array gene chip technique, and their specific functions were also analyzed. [Result] During liver regeneration, the expression level of 98 genes involved in the lipid metabolism pathway changed; these genes were divided into eight groups according to their trend of change. Overall, gene expression was inhibited in the early stage and up-regulated in the late phase. Gene expression associated with the fatty acid synthesis pathway was mainly up-regulated, while the catabolic pathway did not change significantly. Most genes involved in the bile acid synthesis pathway were suppressed before 4.5 d and up-regulated after 4.5 d or 7 d. [Conclusion] During liver regeneration, the genes associated with lipid metabolism are expressed in different trends, and these data should provide a specific range of genes for further study of the regulatory effect of lipid metabolism related pathways on liver regeneration.
Keywords: Lipid metabolism; Gene expression profiles; Liver regeneration; Micro-array chip
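Application of Gene Ontology in Biological Data Integration (Cited by: 8)
19
Authors: Xia Yan (夏燕), Zhang Zhongping (张忠平), Cao Shunliang (曹顺良), Zhu Yangyong (朱扬勇), Li Yixue (李亦学). Computer Engineering (《计算机工程》) [EI, CAS, CSCD, PKU Core], 2005, Issue 2, pp. 57-58, 76 (3 pages)
Efficient integration of heterogeneous data has important theoretical value and practical significance today, as biological data grow explosively and biological databases become ever more complex. Based on BioDW, an integrated bioinformatics data warehouse platform, this paper uses the unified Gene Ontology semantic model to establish semantic links between heterogeneous databases. This effectively solves the problem of integrating heterogeneous biological data at the level of concepts and relationships, enables intelligent multiple, compound, and cross retrieval of biological data, and lays a solid foundation for further research in bioinformatics.
Keywords: biology; integration problem; practice; retrieval; data integration; level; relationship; heterogeneous database; semantic model; data warehouse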
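GEO (Gene Expression Omnibus): A High-Throughput Gene Expression Database (Cited by: 9)
20
Authors: Liu Hua (刘华), Ma Wenli (马文丽), Zheng Wenling (郑文岭). Chinese Journal of Biochemistry and Molecular Biology (《中国生物化学与分子生物学报》) [CAS, CSCD, PKU Core], 2007, Issue 3, pp. 236-244 (9 pages)
The GEO (Gene Expression Omnibus) database covers a broad range of high-throughput experimental data, including single-channel and dual-channel microarray-based measurements of mRNA abundance and experimental data on genomic DNA and protein molecules. Data from non-array-based high-throughput functional genomics and proteomics technologies, such as serial analysis of gene expression (SAGE) and protein identification technologies, are also archived. To date, the GEO database contains data covering 10000 hybridization experiments and SAGE libraries from 30 different organisms. This paper gives an overview of querying and browsing the GEO database, data download and formats, data analysis, and storage and updating. It focuses on the use of controlled vocabulary in the GEO data browser, and discusses data mining in GEO and the prospects for applying GEO in molecular biology. GEO is publicly accessible at http://www.ncbi.nlm.nih.gov/projects/geo/.
Keywords: gene expression; database; controlled vocabulary; data mining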