Abstract
A confidence score describes how confident a question-answering (QA) system is in the answer it returns. This paper presents an algorithm based on the maximum entropy (ME) model: several factors are first extracted from a training corpus to train an ME model, and the trained model is then used to compute confidence scores for the questions in the test set. In the QA evaluation of the 2002 Text REtrieval Conference (TREC 11), the QA system used this algorithm to compute a confidence score for each question's answer and ranked its answers accordingly; the system's performance improved markedly after confidence ranking.
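The train-then-score procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not list the factors, so the three binary features below (answer type matches the question, high keyword overlap, answer supported by several passages) and all data are hypothetical. The sketch also assumes a two-class setting (answer correct or incorrect), in which a maximum entropy model is equivalent to logistic regression, and fits it by plain gradient ascent.

```python
import math

def confidence(weights, bias, features):
    """P(correct | features) under a binary maximum entropy model."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def train_maxent(samples, labels, epochs=500, lr=0.5):
    """Fit weights by gradient ascent on the conditional log-likelihood."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(samples, labels):
            err = y - confidence(weights, bias, x)  # observed minus expected
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w + lr * g / n for w, g in zip(weights, grad_w)]
        bias += lr * grad_b / n
    return weights, bias

def rank_by_confidence(weights, bias, answers):
    """Return (confidence, question_id) pairs, most confident first."""
    scored = [(confidence(weights, bias, feats), qid) for qid, feats in answers]
    return sorted(scored, reverse=True)

# Hypothetical training data: one feature vector per answered question,
# label 1 if the answer was judged correct, 0 otherwise.
train_x = [[1, 1, 1], [1, 1, 0], [1, 0, 1], [0, 1, 0], [0, 0, 1], [0, 0, 0]]
train_y = [1, 1, 1, 0, 0, 0]
weights, bias = train_maxent(train_x, train_y)

# Hypothetical test questions, ranked by the trained model's confidence.
test_answers = [("q1", [1, 1, 1]), ("q2", [0, 0, 0]), ("q3", [1, 0, 0])]
ranked = rank_by_confidence(weights, bias, test_answers)
for score, qid in ranked:
    print(f"{qid}: {score:.3f}")
```

Sorting by the model's output places the answers the system is most sure of at the top of the list, which is the ordering a confidence-weighted evaluation rewards.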
Source
Journal of Software (《软件学报》)
EI
CSCD
PKU Core (北大核心)
2005, Issue 8, pp. 1407-1414 (8 pages)
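Funding
National Natural Science Foundation of China, Grant No. 60435020
Key Project of the Shanghai Science and Technology Commission, Grant No. 035115028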
Keywords
natural language processing
information retrieval
question-answering system
maximum entropy model
confidence score