Abstract
[Objective] To promote exchange and collaboration among researchers and improve research efficiency, this paper proposes a modified translation model, TransTopic, for predicting research collaboration in the stem cell field. [Methods] TransTopic maps the nodes and edges of the research collaboration network into a shared low-dimensional vector space. An LDA topic model extracts the topic distribution of each paper, a deep autoencoder encodes these topic features into edge vectors, node vectors are then obtained through the translation mechanism, and collaboration is predicted through semantic computation between the vectors. [Results] On link prediction, TransTopic achieved the best AUC (95.21%) and MeanRank (17.48) among the compared models, and its topic prediction accuracy reached 86.52%. [Limitations] The method considers only one-step translation paths and does not fully use additional information such as authors' institutions, research interests, and publication levels. [Conclusions] The translation-model-based method can effectively predict research collaboration in the stem cell field.
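The translation mechanism and the MeanRank metric named in the abstract can be illustrated with a minimal sketch. The abstract does not state the scoring function, so the code below assumes a TransE-style Euclidean distance between `u + e` and `v`, where `u` and `v` are author (node) vectors and `e` is the edge vector encoded from paper topics. All function names and the toy vectors are hypothetical and are not taken from the paper.

```python
import numpy as np


def translation_score(u, e, v):
    """Distance ||u + e - v||_2; a smaller value means a more plausible link.

    Assumption: a TransE-style score. The paper's exact function may differ.
    """
    return float(np.linalg.norm(u + e - v))


def rank_of_true_partner(u, e, candidates, true_index):
    """1-based rank of the true partner among candidate node vectors."""
    distances = np.linalg.norm(u + e - candidates, axis=1)
    # Count the candidates that score strictly better than the true partner.
    return int(np.sum(distances < distances[true_index])) + 1


def mean_rank(ranks):
    """Average rank over all test links; lower is better."""
    return float(np.mean(ranks))


# Toy data (hypothetical): one author, one topic-derived edge, three candidates.
u = np.array([0.0, 0.0])
e = np.array([1.0, 0.0])
candidates = np.array([[1.0, 0.0], [0.0, 1.0], [3.0, 3.0]])

ranks = [
    rank_of_true_partner(u, e, candidates, 0),
    rank_of_true_partner(u, e, candidates, 1),
]
print(ranks, mean_rank(ranks))
```

In this toy case the first candidate coincides with `u + e`, so it ranks first, and the second candidate ranks behind it. A real evaluation would average such ranks over every held-out collaboration link.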
Author
Chen Wenjie (陈文杰), Chengdu Library and Information Center, Chinese Academy of Sciences, Chengdu 610041, China
Source
Data Analysis and Knowledge Discovery (《数据分析与知识发现》)
Indexed in: CSSCI, CSCD, Peking University Core Journals
2020, No. 10, pp. 28-36 (9 pages)
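Funding
This work is one of the research outcomes of the Chinese Academy of Sciences 13th Five-Year Informatization Program project "Scientific Research Informatization Application for Knowledge Discovery in the Stem Cell Field" (Grant No. XXH13506).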