The traditional method for creating a gene score to predict a given outcome is to use the most statistically significant single nucleotide polymorphisms (SNPs) among all SNPs tested. This approach has several disadvantages, such as excluding SNPs that do not have strong effects when tested on their own but do have strong joint effects when tested together with other SNPs. Results from the traditional gene score may also lack biological insight, since the functional unit of interest is often the gene, not the single SNP. In this paper we present a new gene scoring method that overcomes these problems by generating a gene score for each gene and a total gene score for all the genes available. First, we calculate a gene score for each gene; second, we test the association between this gene score and the outcome of interest (i.e. the trait). Only gene scores that are significantly associated with the outcome after multiple testing correction for the number of gene tests (not SNPs) are included in the total gene score. This method controls false positive results caused by multiple tests within genes and between genes separately, and has the advantage of identifying multi-locus genetic effects, compared with the Bonferroni correction, false discovery rate (FDR), and permutation tests applied to all SNPs. Another main feature of this method is that we select the SNPs that have different effects within a gene, using adjustment in multiple regression, and then combine the information from the selected SNPs within a gene to create a gene score. A simulation study was conducted to evaluate the finite sample performance of the proposed method.
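The gene-level selection step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the per-gene scores and per-gene association p-values have already been computed, and it assumes a Bonferroni threshold over the number of genes, since the abstract does not name the correction applied at the gene level. The function name `total_gene_score` and its parameters are hypothetical.

```python
from typing import Dict, List, Tuple


def total_gene_score(
    gene_scores: Dict[str, List[float]],
    gene_pvalues: Dict[str, float],
    alpha: float = 0.05,
) -> Tuple[List[str], List[float]]:
    """Sum the per-individual scores of genes that pass a gene-level threshold.

    gene_scores:  gene name -> one score per individual (assumed precomputed).
    gene_pvalues: gene name -> p-value of the gene-score/trait association test.
    alpha:        family-wise error level; divided by the number of GENES
                  (not SNPs). Bonferroni is an assumption made for this sketch.
    """
    if not gene_pvalues:
        return [], []

    # Correct for the number of gene tests, not the number of SNP tests.
    threshold = alpha / len(gene_pvalues)
    selected = sorted(g for g, p in gene_pvalues.items() if p < threshold)

    n_individuals = len(next(iter(gene_scores.values())))
    total = [0.0] * n_individuals
    for gene in selected:
        for i, score in enumerate(gene_scores[gene]):
            total[i] += score
    return selected, total


# Toy data (invented for illustration): three genes, two individuals.
scores = {"G1": [1.0, 2.0], "G2": [0.5, 0.5], "G3": [3.0, 1.0]}
pvalues = {"G1": 0.001, "G2": 0.04, "G3": 0.01}
selected_genes, total = total_gene_score(scores, pvalues)
```

With three genes the threshold is 0.05 / 3, so G2 (p = 0.04) is dropped even though it would pass an uncorrected 0.05 cutoff, and only G1 and G3 contribute to the total.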
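To address the problem of feature selection for tumor gene expression data, a feature selection method called LLE Score is proposed. It is a typical filter-type feature selection method: based on sample class information, LLE Score evaluates how well each feature vector preserves local neighborhoods and selects features according to that evaluation, thereby achieving good feature selection. In the experiments, feature selection is performed on tumor datasets and classification accuracy is computed with a support vector machine classifier. The classification accuracy demonstrates the effectiveness of the method.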
BACKGROUND Hepatocellular carcinoma (HCC) is a common cancer with a poor prognosis. Previous studies revealed that the tumor microenvironment (TME) plays an important role in HCC progression, recurrence, and metastasis, leading to poor prognosis. However, the effects of genes involved in the TME on the prognosis of HCC patients remain unclear. Here, we investigated the HCC microenvironment to identify prognostic genes for HCC. AIM To identify a robust gene signature associated with the HCC microenvironment to improve prognosis prediction of HCC. METHODS We computed the immune/stromal scores of HCC patients obtained from The Cancer Genome Atlas based on the ESTIMATE algorithm. Additionally, a risk score model was established based on differentially expressed genes (DEGs) between patients with high and low immune/stromal scores. RESULTS A risk score model consisting of eight genes was constructed and validated in the HCC patients. The patients were divided into high- or low-risk groups. The genes (Disabled homolog 2, Musculin, C-X-C motif chemokine ligand 8, Galectin 3, B-cell-activating transcription factor, Killer cell lectin like receptor B1, Endoglin and adenomatosis polyposis coli tumor suppressor) involved in our risk score model were considered to be potential immunotherapy targets, and they may provide better performance in combination. Functional enrichment analysis showed that the immune response and the T cell receptor signaling pathway represented the major function and pathway, respectively, related to the immune-related genes in the DEGs between high- and low-risk groups. Receiver operating characteristic (ROC) curve analysis confirmed the good potency of the risk score prognostic model. Moreover, we validated the risk score model using the International Cancer Genome Consortium and the Gene Expression Omnibus database. A nomogram was established to predict the overall survival of HCC patients. CONCLUSION The risk score model and the nomogram will benefit HCC patients through personalized immunotherapy.
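A risk score model of the kind described in this abstract is commonly a weighted sum of gene expression values, with patients split into high- and low-risk groups at a cutoff. The sketch below illustrates that general idea only. The abstract does not give the formula, the coefficients, or the cutoff, so the linear form, the median cutoff, the placeholder gene names (`GENE_A`, `GENE_B`) and all numbers are assumptions made for illustration.

```python
from statistics import median
from typing import Dict, List


def risk_scores(
    expression: List[Dict[str, float]],
    coefficients: Dict[str, float],
) -> List[float]:
    """Weighted sum of expression values for each patient.

    expression:   one dict per patient, gene name -> expression value.
    coefficients: gene name -> weight (hypothetical values in this sketch).
    """
    return [
        sum(weight * patient[gene] for gene, weight in coefficients.items())
        for patient in expression
    ]


def stratify(scores: List[float]) -> List[str]:
    """Label each patient 'high' or 'low' risk relative to the median score.

    The median cutoff is an assumption; the abstract does not state one.
    """
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


# Invented toy data: two placeholder genes, three patients.
coefs = {"GENE_A": 0.5, "GENE_B": -0.25}
patients = [
    {"GENE_A": 2.0, "GENE_B": 4.0},
    {"GENE_A": 4.0, "GENE_B": 0.0},
    {"GENE_A": 0.0, "GENE_B": 4.0},
]
scores = risk_scores(patients, coefs)
groups = stratify(scores)
```

In this toy example the scores are 0.0, 2.0 and -1.0, the median is 0.0, and only the second patient is labeled high risk.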
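Objective: To evaluate the ability of a risk score model based on allograft rejection-related genes (ARRGS) to predict the prognosis of patients with hepatocellular carcinoma (HCC), and to verify the expression level of the key gene glomulin (GLMN) in HCC. Methods: HCC expression data were downloaded from The Cancer Genome Atlas (TCGA) database and the ARRGS gene set was downloaded from the MSigDB database to construct ARRGS-related molecular subtypes. A prognostic model was established by Lasso-Cox regression analysis, and its predictive performance was evaluated by Kaplan-Meier survival analysis, time-dependent ROC curves, and internal validation. Twelve pairs of HCC tumor and adjacent tissue samples were collected from patients who underwent radical surgery at Fuyang Minsheng Hospital between January 2020 and December 2023. Western blot (WB), real-time quantitative PCR (RT-qPCR), and immunohistochemistry (IHC) staining were used to verify GLMN expression in the HCC samples, and the correlation between GLMN and alpha-fetoprotein (AFP) levels in HCC patients was analyzed. Results: Two molecular subtypes were constructed based on ARRGS; Cluster 1 showed a poorer prognosis and a higher level of immune infiltration. A prognostic model composed of DYRK3, GLMN, MRPL3, and NME1 was built from ARRGS-related differential genes and showed high predictive performance for 1-, 3-, and 5-year prognosis. In the clinical samples, WB and RT-qPCR confirmed that GLMN protein levels (t=4.275, P=0.023) and mRNA levels (t=8.313, P=0.003) were elevated in tumor tissue and closely correlated with elevated AFP levels (r=0.688, P=0.028). Conclusion: The ARRGS-based risk score model can effectively predict the prognosis of HCC patients, and GLMN, whose expression is elevated in HCC tumor tissue, may become a reliable diagnostic marker for HCC.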
Funding: Supported by National Natural Science Foundation of China, No. 81972255, No. 81772597 and No. 81672412; Guangdong Natural Science Foundation, No. 2017A030311002; Guangdong Science and Technology Foundation, No. 2017A020215196; Fundamental Research Funds for the Central Universities of Sun Yat-Sen University, No. 17ykpy44; Science Foundation of Jiangxi, No. 20181BAB214002; Education Department Science and Technology Foundation of Jiangxi, No. GJJ170936; Grant from Guangdong Science and Technology Department, No. 2017B030314026.
Funding: Supported by grants from the National Natural Science Foundation of China (81171280) and the Biomedical Engineering Cross Research Foundation of Shanghai Jiao Tong University (YG2013MS65).