摘要
话题检测与跟踪是一项面向新闻媒体信息流进行未知话题识别和已知话题跟踪的信息处理技术。自从1996年前瞻性的探索以来,该领域进行的多次大规模评测为信息识别、采集和组织等相关技术提供了新的测试平台。由于话题检测与跟踪相对于信息检索、信息挖掘和信息抽取等自然语言处理技术具备很多共性,并面向具备突发性和延续性规律的新闻语料,因此逐渐成为当前信息处理领域的研究热点。本文简要介绍了话题检测与跟踪的研究背景、任务定义、评测方法以及相关技术,并通过分析目前TDT领域的研究现状展望未来的发展趋势。
Topic detection and tracking, as one of natural language processing technologies, is to detect unknown topic and track known topic from the information of news medium. Since its pilot research in 1996, several largescale evaluation conferences have provided a good environment for evaluating technologies of recognition, collection and organization. As topic detection and tracking shares similar challenges with information retrieval, data mining and information extraction in abrupt and successive data, it has become a hot research issue in the field of nature language processing. This paper introduced the background, definition, evaluation and methods in topic detection and tracking, and explored its future development trend through analyzing current research.
出处
《中文信息学报》
CSCD
北大核心
2007年第6期71-87,共17页
Journal of Chinese Information Processing
基金
国家自然科学基金资助项目(60435020
60575042
60503072)
关键词
计算机应用
中文信息处理
综述
话题检测与跟踪
自然语言处理
事件
新闻报道
computer application
Chinese information processing
overview
topic detection and tracking
natural language processing
event
news story