Abstract
This paper presents a question classification method based on incremental, automatic rule learning with rough set theory. Question features are extracted in depth, the training corpus is organized as a decision table, and classification rules are then acquired automatically by machine learning. Compared with other methods, the approach has two advantages. First, the classification rules are generated automatically, and rough set reduction yields an optimized minimal rule set. Second, it introduces incremental learning into question classification for the first time, which improves classification precision, avoids tedious retraining, greatly speeds up learning, and makes the classifier more extensible and adaptable. Comparative experiments show that the method achieves high classification precision and adapts well. It also performed well in the TREC 2005 Q/A track evaluation.
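The pipeline named in the abstract (decision table, rough set reduction, rule generation, incremental update) can be illustrated with a minimal sketch. This is not the authors' algorithm: the feature names (`wh_word`, `head_noun`, `has_number_cue`), the class labels, the toy rows, and the greedy reduct strategy are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical toy decision table: condition attributes -> decision (question class).
ATTRS = ("wh_word", "head_noun", "has_number_cue")
TABLE = [
    ({"wh_word": "who",   "head_noun": "person", "has_number_cue": "no"},  "HUMAN"),
    ({"wh_word": "who",   "head_noun": "none",   "has_number_cue": "no"},  "HUMAN"),
    ({"wh_word": "where", "head_noun": "none",   "has_number_cue": "no"},  "LOCATION"),
    ({"wh_word": "what",  "head_noun": "city",   "has_number_cue": "no"},  "LOCATION"),
    ({"wh_word": "how",   "head_noun": "none",   "has_number_cue": "yes"}, "NUMERIC"),
    ({"wh_word": "what",  "head_noun": "person", "has_number_cue": "no"},  "HUMAN"),
]

def group_by(table, attrs):
    """Indiscernibility classes: rows with equal values on attrs share a block."""
    blocks = defaultdict(list)
    for cond, dec in table:
        blocks[tuple(cond[a] for a in attrs)].append(dec)
    return blocks

def positive_region_size(table, attrs):
    """Count rows whose block carries exactly one decision value."""
    return sum(len(d) for d in group_by(table, attrs).values() if len(set(d)) == 1)

def greedy_reduct(table, attrs):
    """Drop attributes one at a time while the positive region is preserved.
    (A simple heuristic; it finds a reduct, not necessarily the smallest one.)"""
    full = positive_region_size(table, attrs)
    kept = list(attrs)
    for a in attrs:
        trial = [x for x in kept if x != a]
        if positive_region_size(table, trial) == full:
            kept = trial
    return tuple(kept)

def induce_rules(table, attrs):
    """One rule per consistent block: attribute values -> (decision, support)."""
    return {k: (d[0], len(d))
            for k, d in group_by(table, attrs).items() if len(set(d)) == 1}

def update_incrementally(rules, attrs, cond, dec):
    """Fold one new labelled question into the rule set without retraining.
    Returns False when the example contradicts an existing rule; in this
    sketch that signals the reduct would have to be recomputed."""
    key = tuple(cond[a] for a in attrs)
    if key not in rules:
        rules[key] = (dec, 1)          # unseen pattern: add a new rule
        return True
    old_dec, support = rules[key]
    if old_dec == dec:
        rules[key] = (dec, support + 1)  # confirming example: raise support
        return True
    return False                          # conflicting example

reduct = greedy_reduct(TABLE, ATTRS)
rules = induce_rules(TABLE, reduct)
new_question = {"wh_word": "when", "head_noun": "none", "has_number_cue": "no"}
added = update_incrementally(rules, reduct, new_question, "TIME")
```

On this toy table the attribute `has_number_cue` is redundant, so the rules are expressed over the two remaining attributes, and the new "when" question is absorbed as one extra rule while the existing rules stay untouched.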
Source
Journal of Electronics & Information Technology (《电子与信息学报》)
2008, No. 5, pp. 1127-1130 (4 pages)
Indexed in: EI, CSCD, Peking University Core Journals
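Funding
National Natural Science Foundation of China, Key Program (60435020)
National Natural Science Foundation of China (60504021)
National High-Tech R&D Program of China (863 Program), goal-oriented project (2006AA01Z197)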
Keywords
Rough set
Question classification
Incremental learning
Decision table
Feature selection