695 articles found
Gene Expression Data Analysis Based on Mixed Effects Model
1
Author: Yuanbo Dai. Journal of Computer and Communications, 2025, Issue 2, pp. 223-235 (13 pages)
DNA microarray technology is an extremely effective technique for studying gene expression patterns in cells, and the main challenge currently faced by this technology is how to analyze the large amount of gene expression data generated. To address this, this paper employs a mixed-effects model to analyze gene expression data. In terms of data selection, 1176 genes from the white mouse gene expression dataset were chosen under two experimental conditions, pneumococcal infection and no infection, and a mixed-effects model was constructed. After preprocessing the gene chip information, the data were imported into the model, preliminary results were calculated, and permutation tests were performed to biologically validate the preliminary results using GSEA. The final dataset consists of 20 groups of gene expression data from pneumococcal infection, which categorizes functionally related genes based on the similarity of their expression profiles, facilitating the study of genes with unknown functions.
Keywords: Mixed Effects Model; Gene Expression Data Analysis; Gene Analysis; Gene Chip
Prediction of Lung Cancer Stage Using Tumor Gene Expression Data
2
Author: Yadi Gu. Journal of Cancer Therapy, 2024, Issue 8, pp. 287-302 (16 pages)
Lung cancer remains a significant global health challenge, and identifying lung cancer at an early stage is essential for enhancing patient outcomes. The study focuses on developing and optimizing gene expression-based models for classifying cancer types using machine learning techniques. By applying Log2 normalization to gene expression data and conducting Wilcoxon rank sum tests, the researchers employed various classifiers and Incremental Feature Selection (IFS) strategies. The study culminated in two optimized models using the XGBoost classifier, comprising 10 and 74 genes respectively. The 10-gene model, due to its simplicity, is proposed for easier clinical implementation, whereas the 74-gene model exhibited superior performance in terms of Specificity, AUC (Area Under the Curve), and Precision. These models were evaluated based on their sensitivity, AUC, and specificity, aiming to achieve high sensitivity and AUC while maintaining reasonable specificity.
Keywords: Lung Cancer Detection; Stage Prediction; Gene Expression Data; XGBoost; Machine Learning
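The abstract above mentions Log2 normalization followed by Wilcoxon rank sum tests. A minimal sketch of those two steps for a single gene is given below, using only the Python standard library. The function names, the pseudocount of 1, and the plain normal approximation (no tie or continuity correction) are assumptions made for illustration; the paper's actual preprocessing and test settings are not stated in the abstract.

```python
import math

def log2_normalize(values, pseudocount=1.0):
    """Return log2(x + pseudocount); the pseudocount (an assumption) keeps zeros finite."""
    return [math.log2(v + pseudocount) for v in values]

def rank_sum_test(group_a, group_b):
    """Two-sided Wilcoxon rank-sum test via the normal approximation.

    Returns (z, p). Ties receive average ranks; no tie correction is
    applied to the variance, so this is only a rough sketch.
    """
    pooled = sorted(
        (v, label)
        for label, grp in (("a", group_a), ("b", group_b))
        for v in grp
    )
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over the run of values tied with pooled[i].
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    w = sum(r for r, (_, label) in zip(ranks, pooled) if label == "a")
    n1, n2 = len(group_a), len(group_b)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return z, p
```

In practice a library routine such as SciPy's `scipy.stats.ranksums` would normally be preferred; the hand-written version only makes the ranking step explicit.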
Optimizing Cancer Classification and Gene Discovery with an Adaptive Learning Search Algorithm for Microarray Analysis
3
Authors: Chiwen Qu, Heng Yao, Tingjiang Pan, Zenghui Lu. Journal of Bionic Engineering, 2025, Issue 2, pp. 901-930 (30 pages)
DNA microarrays, a cornerstone in biomedicine, measure gene expression across thousands to tens of thousands of genes. Identifying the genes vital for accurate cancer classification is a key challenge. Here, we present Fs-LSA (F-score based Learning Search Algorithm), a novel gene selection algorithm designed to enhance the precision and efficiency of target gene identification from microarray data for cancer classification. This algorithm is divided into two phases: the first leverages F-score values to prioritize and select feature genes with the most significant differential expression; the second phase introduces our Learning Search Algorithm (LSA), which harnesses swarm intelligence to identify the optimal subset among the remaining genes. Inspired by human social learning, LSA integrates historical data and collective intelligence for a thorough search, with a dynamic control mechanism that balances exploration and refinement, thereby enhancing the gene selection process. We conducted a rigorous validation of Fs-LSA's performance using eight publicly available cancer microarray expression datasets. Fs-LSA achieved accuracy, precision, sensitivity, and F1-score values of 0.9932, 0.9923, 0.9962, and 0.994, respectively. Comparative analyses with state-of-the-art algorithms revealed Fs-LSA's superior performance in terms of simplicity and efficiency. Additionally, we validated the algorithm's efficacy independently using glioblastoma data from GEO and TCGA databases. It was significantly superior to those of the comparison algorithms. Importantly, the driver genes identified by Fs-LSA were instrumental in developing a predictive model as an independent prognostic indicator for glioblastoma, underscoring Fs-LSA's transformative potential in genomics and personalized medicine.
Keywords: Gene selection; Learning search algorithm; Gene expression data; Classification
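The first phase described above ranks genes by F-score. The abstract does not give the formula, so the sketch below assumes the commonly used two-class F-score (squared deviations of the class means from the overall mean, divided by the sum of the within-class sample variances). The function names and the dictionary layout are hypothetical, and the LSA search phase is not shown.

```python
from statistics import mean, variance

def f_score(pos, neg):
    """Two-class F-score of one gene (assumed definition, see lead-in).

    pos, neg: expression values of the gene in each class (>= 2 samples each).
    """
    overall = mean(pos + neg)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)  # sample variances (n - 1)
    if within == 0:
        # Constant within both classes: perfectly separating or uninformative.
        return float("inf") if between > 0 else 0.0
    return between / within

def top_genes(expr, labels, k):
    """Return the k gene names with the highest F-score.

    expr: dict mapping gene name -> list of values, one per sample.
    labels: list of 0/1 class labels aligned with the samples.
    """
    scores = {}
    for gene, values in expr.items():
        pos = [v for v, y in zip(values, labels) if y == 1]
        neg = [v for v, y in zip(values, labels) if y == 0]
        scores[gene] = f_score(pos, neg)
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

A filter of this kind only shrinks the candidate set; the second phase would then search for a subset among the retained genes.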
A Novel Soft Clustering Approach for Gene Expression Data
4
Authors: E. Kavitha, R. Tamilarasan, Arunadevi Baladhandapani, M. K. Jayanthi Kannan. Computer Systems Science & Engineering (SCIE, EI), 2022, Issue 12, pp. 871-886 (16 pages)
Gene expression data represents a condition matrix where each row represents the gene and the column shows the condition. Microarrays are used to detect gene expression in the lab for thousands of genes at a time. Genes encode proteins, which in turn dictate the cell function. The production of messenger RNA and its processing are the two main stages involved in the process of gene expression. The complexity of biological networks, added to the volume of data containing imprecision and outliers, increases the challenges in dealing with them. Clustering methods are hence essential to identify the patterns present in massive gene data. Many techniques involve hierarchical, partitioning, grid based, density based, model based and soft clustering approaches for dealing with the gene expression data. Understanding the gene regulation and other useful information from this data can be possible only through effective clustering algorithms. Though many methods are discussed in the literature, we concentrate on providing a soft clustering approach for analyzing the gene expression data. The population elements are grouped based on the fuzziness principle and a degree of membership is assigned to all the elements. An improved Fuzzy clustering by Local Approximation of Memberships (FLAME) is proposed in this work, which overcomes the limitations of the other approaches while dealing with the non-linear relationships and provides better segregation of biological functions.
Keywords: Reinforcement; Membership; Centroid; Threshold; Statistics; Bioinformatics; Gene expression data
Deep Learning Enabled Microarray Gene Expression Classification for Data Science Applications
5
Authors: Areej A. Malibari, Reem M. Alshehri, Fahd N. Al-Wesabi, Noha Negm, Mesfer Al Duhayyim, Anwer Mustafa Hilal, Ishfaq Yaseen, Abdelwahed Motwakel. Computers, Materials & Continua (SCIE, EI), 2022, Issue 11, pp. 4277-4290 (14 pages)
In bioinformatics applications, examination of microarray data has received significant interest to diagnose diseases. Microarray gene expression data can be defined by a massive searching space that poses a primary challenge in the appropriate selection of genes. Microarray data classification incorporates multiple disciplines such as bioinformatics, machine learning (ML), data science, and pattern classification. This paper designs an optimal deep neural network based microarray gene expression classification (ODNN-MGEC) model for bioinformatics applications. The proposed ODNN-MGEC technique performs a data normalization process to normalize the data into a uniform scale. Besides, an improved fruit fly optimization (IFFO) based feature selection technique is used to reduce the high dimensionality in the biomedical data. Moreover, a deep neural network (DNN) model is applied for the classification of microarray gene expression data, and the hyperparameter tuning of the DNN model is carried out using the Symbiotic Organisms Search (SOS) algorithm. The utilization of the IFFO and SOS algorithms paves the way for accomplishing maximum gene expression classification outcomes. For examining the improved outcomes of the ODNN-MGEC technique, a wide ranging experimental analysis is made against benchmark datasets. The extensive comparison study with recent approaches demonstrates the enhanced outcomes of the ODNN-MGEC technique in terms of different measures.
Keywords: Bioinformatics; data science; microarray; gene expression; data classification; deep learning; metaheuristics
Genome-Wide Identification of Genes Responsive to ABA and Cold/Salt Stresses in Gossypium hirsutum by Data-Mining and Expression Pattern Analysis
6
Authors: ZHU Long-fu, HE Xin, YUAN Dao-jun, XU Lian, XU Li, TU Li-li, SHEN Guo-xin, ZHANG Hong, ZHANG Xian-long. Agricultural Sciences in China (CAS, CSCD), 2011, Issue 4, pp. 499-508 (10 pages)
For making better use of nucleic acid resources of Gossypium hirsutum, a data-mining method was used to identify putative genes responsive to various abiotic stresses in G. hirsutum. Based on the compiled database including genes involved in abiotic stress response in Arabidopsis thaliana and the comprehensive analysis tool of GENEVESTIGATOR v3, 826 genes up-regulated or down-regulated significantly in roots or leaves during salt or cold treatment in Arabidopsis were identified. As compared to these 826 Arabidopsis genes annotated, 38 homologous expressed sequence tags (ESTs) from G. hirsutum were selected randomly and their expression patterns were studied using a quantitative real-time reverse transcription-polymerase chain reaction method. Among these 38 ESTs, about 55% of the genes (21 of 38) were different in response to ABA between cotton and Arabidopsis, whereas 70% of genes had similar responses to cold and salt treatments, and some of them which had not been characterized in Arabidopsis are now being investigated in gene function studies. According to these results, this approach of analyzing ESTs appears effective in large-scale identification of cotton genes involved in abiotic stress and might be adopted to determine gene functions in various biologic processes in cotton.
Keywords: cold stress; salt stress; data-mining; gene; Gossypium hirsutum
Analysis of Gene Expression Profiles of Rice Mutant SLR1 Based on Microarray Data
7
Authors: Weihua LIU, Yue CHEN, Lingxian WANG, Ge HUANG, Qian ZOU, Zhenhua ZHU, Mingliang DING. Asian Agricultural Research, 2019, Issue 1, pp. 54-55, 59 (3 pages)
Gibberellins are an important class of plant hormones. They play an important regulatory role in all stages of growth and development of higher plants. The use of mutants to study gibberellin metabolism and signal transduction pathways is currently a research hotspot. Taking rice Affymetrix chip data as an example, this article uses bioinformatics methods to study the rice SLR1 mutant and mine differentially expressed wild-type genes, thus exploring the expression regulation network of gibberellin signaling pathway-related genes.
Keywords: Gibberellin; Gene chip; Data mining
Incorporating heterogeneous biological data sources in clustering gene expression data
8
Authors: Gang-Guo Li, Zheng-Zhi Wang. Health, 2009, Issue 1, pp. 17-23 (7 pages)
In this paper, a similarity measure between genes with protein-protein interactions is proposed. The ChIP-chip data are converted into the same form as gene expression data, with Pearson correlation as the similarity measure. On the basis of the similarity measures of protein-protein interaction data and ChIP-chip data, the combined dissimilarity measure is defined. The combined distance measure is introduced into the K-means method, which can be considered an improved K-means method. The improved K-means method and three other clustering methods are evaluated on a real dataset. Performance of these methods is assessed by a prediction accuracy analysis through known gene annotations. Our results show that the improved K-means method outperforms the other clustering methods. The performance of the improved K-means method is also tested by varying the tuning coefficients of the combined dissimilarity measure. The results show that it is very helpful and meaningful to incorporate heterogeneous data sources in clustering gene expression data, and that coefficients for the genome-wide or completed data sources should be given larger values when constructing the combined dissimilarity measure.
Keywords: Statistical Analysis; Similarity/Dissimilarity Measure; Gene Expression Data; Clustering; Data Fusion
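The abstract above combines per-source similarities into one dissimilarity controlled by tuning coefficients, but does not give the formula. The sketch below is therefore only one plausible form, assumed for illustration: each source contributes `1 - similarity`, and the contributions are averaged with the coefficients as weights. `pearson` and `combined_dissimilarity` are hypothetical names, not the authors' code.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def combined_dissimilarity(similarities, weights):
    """Weighted average of (1 - similarity) over the data sources.

    similarities: one similarity per data source for a given gene pair,
                  e.g. [expression correlation, ChIP-chip correlation, PPI score].
    weights: non-negative tuning coefficients, one per source (assumed form).
    """
    total = sum(weights)
    return sum(w * (1.0 - s) for s, w in zip(similarities, weights)) / total
```

With this form, raising the coefficient of a source pulls the combined distance toward that source's view of the gene pair, which matches the abstract's remark that more complete sources should receive larger coefficients.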
Challenges Analyzing RNA-Seq Gene Expression Data
9
Authors: Liliana López-Kleine, Cristian González-Prieto. Open Journal of Statistics, 2016, Issue 4, pp. 628-636 (9 pages)
The analysis of messenger ribonucleic acid data obtained through sequencing techniques (RNA-sequencing) is very challenging. Once technical difficulties have been sorted out, an important choice has to be made during pre-processing. Two different paths can be chosen: transform RNA-sequencing count data to a continuous variable, or continue to work with count data. For each data type, analysis tools have been developed and seem appropriate at first sight, but a deeper analysis of data distribution and structure is worth discussing. In this review, open questions regarding the nature of RNA-sequencing data are discussed and highlighted, indicating important future research topics in statistics that should be addressed for a better analysis of already available and newly appearing gene expression data. Moreover, a comparative analysis of RNA-seq count and transformed data is presented. This comparison indicates that transforming RNA-seq count data seems appropriate, at least for differential expression detection.
Keywords: RNA-Seq Analysis; Count Data; Preprocessing; Differential Expression; Gene Co-Expression Network
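One of the two paths named above is to transform count data into a continuous variable. The abstract does not say which transformation the review compares, so the sketch below uses log2 counts-per-million with a pseudocount purely as a common illustrative choice; the function name and the pseudocount of 1 are assumptions.

```python
import math

def log_cpm(counts, pseudocount=1.0):
    """Transform one sample's raw read counts to log2(CPM + pseudocount).

    counts: raw read counts per gene for a single sample.
    CPM scales each count by the sample's library size (total reads).
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("library size is zero")
    return [math.log2(c * 1e6 / total + pseudocount) for c in counts]
```

The alternative path keeps the integer counts and models them directly with a count distribution instead of transforming them.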
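Screening of key genes and signaling pathways in tuberculosis based on the GEO database
10
Authors: 石洁, 常文静, 郑丹薇, 苏茹月, 马晓光, 朱岩昆, 王少华, 孙建伟, 孙定勇. 《中国防痨杂志》 (PKU Core), 2025, Issue 6, pp. 769-778 (10 pages)
Objective: To identify differentially expressed genes and related signaling pathways in tuberculosis using bioinformatics methods, in order to discover biomarkers that can be used for tuberculosis diagnosis. Methods: Gene expression microarray datasets of tuberculosis patient samples and healthy populations were searched in the Gene Expression Omnibus (GEO) database, and the GSE139825 microarray dataset was downloaded as the analysis dataset. The limma package in R was used to normalize the sequencing data and identify differentially expressed genes (DEGs), and the clusterProfiler package was used for Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. The STRING online database was used for protein-protein interaction (PPI) network analysis of the DEGs, and Cytoscape was used for visualization and for screening hub genes. The GSE19439 microarray dataset was downloaded as the validation dataset for the differentially expressed hub genes; candidate biomarkers were also validated by enzyme-linked immunosorbent assay (ELISA), and the area under the receiver operating characteristic curve (AUC) was used to evaluate their diagnostic ability. Results: Analysis of the GSE139825 dataset identified 206 DEGs, of which 172 were up-regulated and 34 were down-regulated. The genes down-regulated by more than 50% were PDK4 and CABLES1, and those up-regulated more than 8-fold were IL1B, LOC728835, CXCL10, and IL8. GO and KEGG analyses showed that the biological processes of the DEGs were concentrated mainly in the cytokine-mediated signaling pathway, leukocyte cell-cell adhesion, and response to lipopolysaccharide; that their main molecular functions were cytokine receptor binding and cytokine activity; and that they were significantly enriched in pathways such as interactions between cytokines, the TNF signaling pathway, and tuberculosis-related pathways. PPI analysis identified 10 hub genes: IL1B, TNF, IL6, IL1A, CCL20, CXCL1, CXCL10, CXCL8, CCL3, and CCR7. Analysis of the GSE19439 validation dataset showed that, among the 10 hub genes, CXCL10 and IL1B were likewise up-regulated. ELISA validation also found that the mean ELISA values of CXCL10 protein in healthy controls and tuberculosis patients were 0.570 and 0.827, respectively, and those of IL1B protein were 1.245 and 2.067, with statistically significant differences (t=25.353, P<0.001; t=11.840, P=0.002). Logistic regression analysis showed that both CXCL10 and IL1B performed well in distinguishing the healthy group from the tuberculosis group (AUC for CXCL10 = 0.854, AUC for IL1B = 0.818). Conclusion: The study revealed interactions among genes related to tuberculosis pathogenesis and found that both CXCL10 and IL1B distinguish healthy controls from tuberculosis patients well and may serve as novel biomarkers for tuberculosis diagnosis.
Keywords: Tuberculosis; Genomic library; Data mining; Expressed genes; Biological markers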
Comparative study of microarray and experimental data on Schwann cells in peripheral nerve degeneration and regeneration: big data analysis (Cited by: 6)
11
Authors: Ulfuara Shefa, Junyang Jung. Neural Regeneration Research (SCIE, CAS, CSCD), 2019, Issue 6, pp. 1099-1104 (6 pages)
A Schwann cell has regenerative capabilities and is an important cell in the peripheral nervous system. This microarray study is part of a bioinformatics study that focuses mainly on Schwann cells. Microarray data provide information on differences between microarray-based and experiment-based gene expression analyses. According to microarray data, several genes exhibit increased expression (fold change) but they are weakly expressed in experimental studies (based on morphology, protein and mRNA levels). In contrast, some genes are weakly expressed in microarray data and highly expressed in experimental studies; such genes may represent future target genes in Schwann cell studies. These studies allow us to learn about additional genes that could be used to achieve targeted results from experimental studies. In the current big data study, by retrieving more than 5000 scientific articles from PubMed or NCBI, Google Scholar, and Google, 1016 (up- and downregulated) genes were determined to be related to Schwann cells. However, no experiment was performed in the laboratory; rather, the present study is part of a big data analysis. Our study will contribute to our understanding of Schwann cell biology by aiding in the identification of genes. Based on a comparative analysis of all microarray data, we conclude that the microarray could be a good tool for predicting the expression and intensity of different genes of interest in actual experiments.
Keywords: Schwann cells; big data analysis; peripheral nerve degeneration; peripheral nerve regeneration; microarray; matched genes; promising genes; gene ranking
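Digital-intelligence-enabled intelligent design inheritance of the architectural genes of the Cai Family Ancient Dwellings
12
Authors: 张安华, 刘伟林, 吕少卿. 《包装工程》 (PKU Core), 2025, Issue 4, pp. 342-353 (12 pages)
Objective: In the digital-intelligence era, to mine online user reviews of the Cai Family Ancient Dwellings using big data techniques, identify existing problems, propose effective design strategies, and verify their feasibility. Methods: First, user review data about the Cai Family Ancient Dwellings complex were mined from popular travel platforms and social media, and user needs were analyzed and summarized. This led to the decision to develop a cultural-tourism brand IP character that carries forward the architectural genes of the dwellings; the IP character is placed as an attraction in a virtual tour program and in public exhibition spaces to increase the appeal of the scenic area. Second, an architectural gene map of the dwellings was constructed on the basis of architectural meme theory, with separate maps drawn for the form genes, spatial genes, and thinking genes of the buildings. Third, the S-O-R theoretical model was used for experimental evaluation, and the architectural factors to be used in design practice were selected according to user response time. Finally, with the assistance of generative artificial intelligence tools, the extracted architectural factors were applied to the IP character design scheme. Results: The best scheme among the AI-generated IP characters was selected, modeled in 3D by hand, and applied to online and offline tourism promotion scenarios for the dwellings. Conclusion: This study proposes an innovative framework combining big data mining, architectural meme theory, and AI-assisted design, which has significant theoretical and practical value for the inheritance and promotion of cultural heritage genes.
Keywords: Big data; Cai Family Ancient Dwellings; Architectural genes; IP character; Generative artificial intelligence (AIGC)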
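A gene association analysis algorithm for gene regulatory networks
13
Authors: 李志杰, 廖莎, 刘安丰, 李青蓝. 《计算机工程与应用》 (PKU Core), 2025, Issue 3, pp. 155-165 (11 pages)
A gene regulatory network is a simulation or reconstruction, based on microarray gene expression data, of the degree of dependence among the expression relationships of genes. Mining the causal relationships that exist to some degree among genes from gene expression data is of great significance for reconstructing gene regulatory networks. A gene association analysis algorithm based on the association entropy of frequent atomic sequences is proposed. It identifies causal relationships among genes effectively through gene association entropy and uses a heuristic search strategy to build a gene association based Bayesian regulatory (GABR) network. Unlike a gene Bayesian network, which describes dependencies among gene expression level values, GABR is a gene-sequence Bayesian network: the object of gene association analysis is the sequence formed by sorting the gene expression values of a biological tissue sample and replacing them with gene column indices. The advantage of the algorithm is that gene variables take atomic sequences as values, with the gene being the outcome of the atomic sequence, so that the computation of gene association entropy and conditional probability distributions better fits the biological nature of gene expression data analysis. Experiments on simulated data from the ALARM network show that the gene association analysis algorithm clearly outperforms comparable algorithms. Experiments on GEO datasets, including the yeast microarray dataset GDS2267 and the mouse embryo dataset GSE76118, show that the gene regulatory networks reconstructed by the GABR method have high validity and robustness.
Keywords: Gene expression data; Gene regulation; Frequent atomic sequences; Association entropy; Gene-sequence Bayesian network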
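The abstract above says the analysis operates on a sequence obtained by sorting a sample's expression values and replacing them with gene column indices. A minimal sketch of that single conversion step follows. The function name and the default sort direction are assumptions, since the abstract does not state whether the ordering is ascending or descending, and the association-entropy and network-search steps are not shown.

```python
def expression_to_index_sequence(sample, descending=True):
    """Replace a sample's expression values by gene column indices in sorted order.

    sample: expression values of one tissue sample, one entry per gene column.
    Returns the column indices ordered by expression value; ties keep
    their original column order because Python's sort is stable.
    """
    return sorted(range(len(sample)), key=lambda j: sample[j], reverse=descending)
```

For a sample with values `[0.2, 3.1, 1.5]`, the descending sequence lists column 1 first, then column 2, then column 0.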
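Interdisciplinary integration drives the transformation of tumor diagnosis and treatment technologies
14
Authors: 张万广, 陈孝平. 《中国科学基金》 (PKU Core), 2025, Issue 1, pp. 132-143 (12 pages)
Diagnosis and treatment approaches built on multidisciplinary integration have greatly advanced medical diagnostic and therapeutic models. In recent years, CRISPR-Cas9 gene editing, multimodal artificial intelligence diagnostic models, the da Vinci surgical robot, and genetic testing technologies have been widely applied in medicine, and the convergence of these disciplinary strengths is gradually forming a new paradigm of tumor diagnosis and treatment. This review describes the transformation of tumor diagnosis and treatment from the perspectives of genetic diagnosis, gene editing and precision tumor therapy, virtual reality, artificial intelligence and big data, robotic surgery, and gut microbiota and the tumor microenvironment. Relying on disciplinary strengths and combining them with emerging technologies is an important measure in response to the national strategy on interdisciplinarity and is of great significance for building a new interdisciplinary paradigm of tumor diagnosis and treatment.
Keywords: Interdisciplinarity; Tumor gene diagnosis and therapy; Virtual reality; Big data; Gut microbiota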
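Application of GeneSifter in data mining of gene expression profile microarrays (Cited by: 5)
15
Authors: 廖之君, 马文丽, 梁爽, 郑文岭. 《医学信息(西安上半月)》, 2007, Issue 11, pp. 1882-1887 (6 pages)
This paper introduces GeneSifter, a microarray data analysis tool that is fast, intuitive, and convenient, and is especially suited to data mining of gene expression profiles. Microarray data are generally uploaded as formatted text files; depending on the experimental purpose and design, there are four upload wizards in total. Data analysis is entered through Pairwise or Projects under the Analysis item of the console and requires setting parameters such as filtering, thresholds, and statistical analysis. The results available from Pairwise include a list of significantly differential genes, a gene ontology report, and a KEGG pathway report; Projects is more powerful and provides six kinds of results, including clustering. GeneSifter's distinctive one-stop, one-click setup provides up-to-date links to 11 databases for the genes of interest. GeneSifter is suitable for biological researchers engaged in microarray data mining.
Keywords: GeneSifter software; Data mining; Gene ontology terms; KEGG pathway; Clustering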
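Application of Gene Ontology in biological data integration (Cited by: 8)
16
Authors: 夏燕, 张忠平, 曹顺良, 朱扬勇, 李亦学. 《计算机工程》 (EI, CAS, CSCD, PKU Core), 2005, Issue 2, pp. 57-58, 76 (3 pages)
Efficient integration of heterogeneous data has important theoretical and practical value today, as biological data grow explosively and biological databases become ever more complex. Based on BioDW, an integrated bioinformatics data warehouse platform, this paper uses a unified Gene Ontology semantic model to establish semantic links between heterogeneous databases. This effectively solves the integration of heterogeneous biological data at the level of concepts and relationships, enables intelligent multiple, compound, and cross retrieval of biological data, and lays a solid foundation for further research on biological information.
Keywords: Biology; Integration problem; Practice; Retrieval; Data integration; Hierarchy; Relationship; Heterogeneous database; Semantic model; Data warehouse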
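GEO (Gene Expression Omnibus): a high-throughput gene expression database (Cited by: 9)
17
Authors: 刘华, 马文丽, 郑文岭. 《中国生物化学与分子生物学报》 (CAS, CSCD, PKU Core), 2007, Issue 3, pp. 236-244 (9 pages)
The GEO (Gene Expression Omnibus) database covers a broad range of high-throughput experimental data, including single-channel and dual-channel microarray-based measurements of mRNA abundance and experimental data on genomic DNA and protein molecules. Data from non-array-based high-throughput functional genomics and proteomics technologies are also archived, such as serial analysis of gene expression (SAGE) and protein identification techniques. To date, the GEO database holds data covering 10000 hybridization experiments and SAGE libraries from 30 different organisms. This paper gives an overview of querying and browsing GEO, data download and formats, data analysis, and storage and updating. It focuses on the use of controlled vocabularies in the GEO data browser and discusses data mining of GEO and its application prospects in molecular biology. GEO is publicly accessible at http://www.ncbi.nlm.nih.gov/projects/geo/.
Keywords: Gene expression; Database; Controlled vocabulary; Data mining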
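Identification of genes associated with the onset of colorectal adenocarcinoma: summary-data-based Mendelian randomization
18
Authors: 白耀文, 黑志军, 杨海龙, 尹少军, 马锋超, 胡军红, 张志永, 夏坤锟. 《郑州大学学报(医学版)》 (PKU Core), 2025, Issue 3, pp. 348-352 (5 pages)
Objective: To identify genes associated with the onset of colorectal adenocarcinoma (COAD) using summary-data-based Mendelian randomization (SMR). Methods: GWAS data were combined with expression quantitative trait locus (eQTL) and DNA methylation quantitative trait locus (mQTL) data from different studies, and SMR was used to identify genes associated with COAD onset; the findings were verified by single-cell RNA sequencing analysis. The multi-omics evidence was then integrated to determine key and secondary genes. Results: SMR analysis identified six genes significantly associated with COAD risk: ATF1, COLCA2, COLCA1, C11orf53, LINC01270 (RP11-290F20.1), and LINC01271 (RP11-290F20.2). Single-cell RNA sequencing analysis confirmed the effects of COLCA1, COLCA2, C11orf53, and LINC01270 (RP11-290F20.1). ATF1 was designated the key gene, and COLCA1, COLCA2, C11orf53, LINC01270 (RP11-290F20.1), and LINC01271 (RP11-290F20.2) were designated secondary genes. Conclusion: Genes associated with the onset of COAD were identified.
Keywords: Colorectal adenocarcinoma; Summary-data-based Mendelian randomization; Candidate gene identification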
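Identifying potential epithelial cell biomarkers of asthma onset by integrating multi-source transcriptomic data
19
Authors: 谢连华, 卢淑娴, 郭芳阳, 张弋峰, 刘潜. 《细胞与分子免疫学杂志》 (PKU Core), 2025, Issue 8, pp. 695-705 (11 pages)
Objective: To mine potential biomarkers in asthma epithelial cells through bioinformatics analysis of multi-source transcriptomic data, and to verify the expression of potential target genes in lung tissue and its epithelial cells from an asthma model. Methods: Gene expression profile data of epithelial cells from three sets of asthma patients and normal controls in the Gene Expression Omnibus (GEO) database were integrated, and key genes and biological pathways related to asthma were identified by differential gene expression analysis and gene co-expression network analysis. Finally, the main key genes were verified in lung tissue and lung tissue epithelial cells using an asthma animal model. Results: Differential expression analysis found 1121 up-regulated and 1484 down-regulated genes in asthma epithelial cells compared with normal controls. Pathway enrichment analysis showed that the up-regulated genes were mainly involved in glycosylation processes and the down-regulated genes mainly in immune cell differentiation. Gene co-expression network analysis found that module 9, related to glycosylation pathways, was significantly positively correlated with asthma, whereas module 17, involved in insulin and other signaling pathways, was significantly negatively correlated with asthma. Polypeptide N-acetylgalactosaminyltransferase 5 (GALNT5), pyrroline-5-carboxylate reductase 1 (PYCR1), and carcinoembryonic antigen-related cell adhesion molecule 5 (CEACAM5) were found to be key genes of module 9 and significantly up-regulated in asthma. Finally, up-regulated expression of GALNT5, PYCR1, and CEACAM5 in the model group was verified in asthma lung tissue epithelial cells. A rat asthma model further confirmed that the protein levels of these three genes were also significantly up-regulated in lung tissue of the model group. Conclusion: Through data integration and experimental validation, this study identified key genes and biological pathways closely related to the development of asthma, providing a new theoretical basis and potential targets for the diagnosis and treatment of asthma.
Keywords: Multi-source transcriptomic data; Gene co-expression network; Epithelial cells; Biomarkers; Biological pathways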
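Generation and mining of breast cancer gene data based on generative adversarial networks
20
Authors: 杨锦, 边太成, 李晓晖, 焦强, 朱习军. 《计算机测量与控制》, 2025, Issue 11, pp. 219-227 (9 pages)
To address the small sample sizes, high dimensionality, and poor feature generalization encountered in omics data mining, RS-CGAN, a generative model combining residual networks with soft thresholding, is proposed. The model improves feature learning on high-dimensional data through one-dimensional convolutional layers and a residual network structure, and introduces a generator with residual soft thresholding and a discriminator with residual attention to reduce the influence of noise and prevent overfitting. A distance-similarity penalty term guides generator learning and improves training quality. A feature selection module based on a structural causal model is also proposed; by building a causal structure graph, it computes estimates of the population average causal treatment effect and identifies biomarkers that are generalizable and causally related. Experimental results show that the method has advantages in data generation quality and that the accuracy of the prediction model on the dataset after feature selection improved by 30.58%. Ten breast cancer biomarkers were finally identified, four of which have been clinically validated as risk loci, demonstrating the effectiveness of the marker selection method.
Keywords: Generative adversarial network; Data augmentation; Biomarker mining; Causal inference; Breast cancer gene data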