Abstract
Query classification has been an active research topic in information retrieval in recent years and has attracted wide attention in many fields. This article focuses on research into automatic query classification according to query intention. It surveys the background of query classification, its key techniques, the classification algorithms used, and the evaluation methods, and it outlines the problems and challenges facing query intention classification. The key open issues it identifies are the lack of an authoritative evaluation standard, performance that has not been comprehensively tested on large-scale datasets, the accurate acquisition of query features, and how to demonstrate the completeness and independence of a category system.
Source
《中文信息学报》
CSCD
PKU Core Journal
2008, No. 4, pp. 75-82 (8 pages)
Journal of Chinese Information Processing
Funding
National Natural Science Foundation of China (60603094)
National Basic Research Program of China (973 Program) (2004CB318109)
Keywords
computer application
Chinese information processing
automatic query classification
query intention classification
classification approach
dataset
feature extraction
machine learning