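Objective: To construct and validate an immune-gene prognostic risk model for lung adenocarcinoma using bioinformatics methods, and to explore the model's potential for predicting prognosis in patients with early-stage lung adenocarcinoma. Methods: Lung adenocarcinoma and normal tissue data from The Cancer Genome Atlas (TCGA) served as the training set, and lung adenocarcinoma and normal tissue data from the Gene Expression Omnibus (GEO) served as the test set. Starting from the immune-related genes provided by the Immunology Database and Analysis Portal (ImmPort), a prognostic risk model was built from the training set using bioinformatics methods and externally validated in the test set. The model was then applied to transcriptome data from 38 clinical patients with early-stage lung adenocarcinoma to assess differences in clinicopathological parameters between the high-risk and low-risk groups. Results: A prognostic risk model for lung adenocarcinoma comprising 12 differentially expressed immune genes (CYBB, ARG2, UTS2, LIFR, SHC3, CTLA4, FGF2, SEMA7A, INHA, GPI, ANGPTL4, and TNFRSF11A) was constructed. The area under the ROC curve (AUC) of the model was 0.759 in the training set and 0.707 in the test set. The model also showed good prognostic predictive ability for the early-stage lung adenocarcinoma patients in the training and test sets. In the clinical early-stage lung adenocarcinoma samples, high risk was associated with larger tumor diameter and worse pathological subtype. Conclusion: The model showed good prognostic predictive ability in both the training and test sets and has some indicative value for the prognosis of clinical early-stage lung adenocarcinoma patients. These immune genes may provide directions for early diagnosis of lung adenocarcinoma, prognostic assessment, and research on new therapeutic targets.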
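The abstract does not state how the risk score is computed. A minimal sketch, assuming the common form of such models (a linear combination of gene expression values, with patients split into high-risk and low-risk groups at the median score), is given below. The coefficients, expression values, and outcomes are hypothetical toy numbers, only three of the 12 genes are used, and the AUC is computed for a simple binary outcome rather than the time-dependent ROC analysis a survival study would normally use.

```python
from statistics import median

# Hypothetical coefficients for three of the model genes (not from the study).
coefficients = {"CTLA4": -0.2, "FGF2": 0.3, "GPI": 0.5}

# Hypothetical expression values for four patients.
expression = [
    {"CTLA4": 2.0, "FGF2": 1.0, "GPI": 1.0},
    {"CTLA4": 1.0, "FGF2": 2.0, "GPI": 2.0},
    {"CTLA4": 3.0, "FGF2": 1.0, "GPI": 0.0},
    {"CTLA4": 0.0, "FGF2": 1.0, "GPI": 3.0},
]
# Hypothetical binary outcome: 1 = event (death) observed, 0 = no event.
events = [0, 1, 0, 1]


def risk_scores(samples, coefs):
    """Linear risk score: sum of coefficient * expression over the model genes."""
    return [sum(coefs[g] * s[g] for g in coefs) for s in samples]


def split_by_median(scores):
    """Label each patient 'high' or 'low' relative to the median risk score."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


def auc(scores, outcomes):
    """Rank-based AUC: probability that an event case outscores a non-event case.

    Ties between an event case and a non-event case count as 0.5.
    """
    pos = [s for s, e in zip(scores, outcomes) if e == 1]
    neg = [s for s, e in zip(scores, outcomes) if e == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


scores = risk_scores(expression, coefficients)
groups = split_by_median(scores)
model_auc = auc(scores, events)
print(groups, model_auc)
```

On real data the coefficients would come from the fitted model, and the same `risk_scores` call would be applied unchanged to the external test set so that validation uses the training-set coefficients.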
Using the spatial analysis functions in ArcGIS and China's county-level census data for 2000, this paper classifies and maps population density by class, and from these maps builds a map system of population distribution and a curve of population centers. Following the principle of geographical proximity, the density classes were then reclassified to obtain a population density map with spatial clustering characteristics. Multi-layer superposition based on the density classification shows that population density increases from the Northwest to the Southeast; the multi-layer clustering of China's population distribution is pronounced, and the population tends to gather along rivers and coastlines. The curve of population centers shows that, on the whole, population density transitions from high-density regions to low-density ones, although low-density areas contain relatively dense pockets and high-density areas contain relatively sparse ones. Reclassifying the population density map on the basis of the curve of population centers shows that China's population densities can be divided into 9 classes, and accordingly the geographical distribution of China's population can be divided into 9 types of region: the concentration core zone, high concentration zone, moderate concentration zone, low concentration zone, general transitional zone, relatively sparse area, absolute sparse area, extreme sparse area, and basic no-man's land.
More than 3/4 of China's population is concentrated in less than 1/5 of the land area, and more than half of the land area is inhabited by less than 2% of the population; the result gives a clearer picture of the spatial pattern of China's population distribution.
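The two quantities this abstract leans on, a population center and the share of land holding a given share of the population, can be illustrated with a short sketch. It assumes a population center is the population-weighted mean of unit centroids (a planar approximation) and that concentration is measured by accumulating units from densest to sparsest; the abstract does not spell out either definition, and the three units below are hypothetical toy values, not census data.

```python
# Hypothetical spatial units: centroid (lon, lat), population, and land area.
units = [
    {"lon": 120, "lat": 30, "pop": 600, "area": 10},
    {"lon": 110, "lat": 35, "pop": 300, "area": 30},
    {"lon": 90, "lat": 40, "pop": 100, "area": 160},
]


def population_center(items):
    """Population-weighted mean of the unit centroids, as (lon, lat)."""
    total = sum(u["pop"] for u in items)
    lon = sum(u["lon"] * u["pop"] for u in items) / total
    lat = sum(u["lat"] * u["pop"] for u in items) / total
    return lon, lat


def area_share_holding(items, pop_fraction):
    """Share of total area needed to hold at least pop_fraction of the population.

    Units are accumulated from the densest to the sparsest.
    """
    ranked = sorted(items, key=lambda u: u["pop"] / u["area"], reverse=True)
    total_pop = sum(u["pop"] for u in items)
    total_area = sum(u["area"] for u in items)
    pop = 0.0
    area = 0.0
    for u in ranked:
        pop += u["pop"]
        area += u["area"]
        if pop >= pop_fraction * total_pop:
            break
    return area / total_area


center = population_center(units)
share = area_share_holding(units, 0.75)
print(center, share)
```

Computing `population_center` separately for the units in each density class and connecting the results in order of density is one plausible way to obtain a curve of population centers.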
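High-entropy alloys have superior physical and chemical properties owing to the solid-solution, intermetallic-compound, and amorphous phases that form their distinctive microstructures. Phase prediction is therefore the first step in designing a high-entropy alloy. Three machine-learning models (support vector machine, random forest, and decision tree) were used to predict the phase classification of high-entropy alloys; the models were optimized by grid search, then cross-validated and evaluated. The results show that the random forest has the best predictive ability, reaching a prediction accuracy of 0.93, and that it classifies the solid-solution phase best. Finally, the random forest model was used to predict the phases formed in Ti-Zr-Nb-Mo refractory high-entropy alloys, and the predicted phases agree with the experimental results. Machine learning can thus be of great help in the future design of high-entropy alloys.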
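The tuning workflow named in this abstract (grid search over hyperparameters, scored by cross-validation) can be sketched without external libraries. The example below is an illustration only: it uses a small hand-written decision tree instead of the study's support vector machine, random forest, and decision tree implementations, the single tuned hyperparameter is the maximum tree depth, and the two features and the labeling rule are synthetic placeholders rather than alloy descriptors. The labels "SS", "IM", and "AM" stand for the solid-solution, intermetallic, and amorphous phases.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(X, y, max_depth):
    """Grow a binary tree greedily; a leaf is the majority label of its samples."""
    majority = Counter(y).most_common(1)[0][0]
    if max_depth == 0 or len(set(y)) == 1:
        return majority
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [i for i, row in enumerate(X) if row[f] <= t]
            right = [i for i, row in enumerate(X) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right])) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:
        return majority
    _, f, t, left, right = best
    return (f, t,
            build_tree([X[i] for i in left], [y[i] for i in left], max_depth - 1),
            build_tree([X[i] for i in right], [y[i] for i in right], max_depth - 1))


def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


def cross_val_accuracy(X, y, max_depth, k=5):
    """k-fold cross-validation accuracy, with folds taken by index stride."""
    correct = 0
    for j in range(k):
        test_idx = list(range(j, len(X), k))
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        tree = build_tree([X[i] for i in train_idx],
                          [y[i] for i in train_idx], max_depth)
        correct += sum(predict(tree, X[i]) == y[i] for i in test_idx)
    return correct / len(X)


def grid_search(X, y, depths, k=5):
    """Score every candidate depth by cross-validation and return the best one."""
    results = {d: cross_val_accuracy(X, y, d, k) for d in depths}
    return max(results, key=results.get), results


# Synthetic placeholder data on a 10 x 10 grid with a made-up labeling rule.
X = [((2 * a + 1) / 20, (2 * b + 1) / 20) for a in range(10) for b in range(10)]
y = ["SS" if x0 > 0.5 else ("IM" if x1 > 0.5 else "AM") for x0, x1 in X]

best_depth, scores = grid_search(X, y, depths=[0, 1, 2, 3])
print(best_depth, scores)
```

With a real dataset, the same loop would be repeated for each model family and its own hyperparameter grid, and the families would then be compared on their cross-validated scores.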
Funding (population distribution study): National Project of Strategic Research on Population Development, No. F2005-5