Abstract
Starting from an analysis of the structural characteristics of emergency-event news reports, this paper explores a text classification method for a specific domain. Based on these characteristics, it discards the low-information parts of each text and uses the title and the lead as the indexing source, which improves classification speed. In feature extraction, it considers the contributions of both Chinese characters and words to classification, which improves classification precision.
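The two ideas in the abstract (restricting the indexing source to the title and lead, and combining character and word features) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the paper's implementation: the function names, the tiny `LEXICON`, the choice of one lead sentence, and the use of forward maximum matching as the word segmenter are all hypothetical.

```python
from collections import Counter

# Hypothetical toy lexicon; a real system would use a full segmentation dictionary.
LEXICON = {"地震", "火灾", "发生", "救援", "事故"}
MAX_WORD_LEN = 4


def indexing_source(title, body, lead_sentences=1):
    """Keep only the title and the first few sentences of the body."""
    sentences = [s for s in body.split("。") if s.strip()]
    return title + "。" + "。".join(sentences[:lead_sentences])


def segment(text):
    """Forward maximum matching against LEXICON.

    Characters that do not start a lexicon word are skipped, so only
    multi-character lexicon words are returned.
    """
    words, i = [], 0
    while i < len(text):
        for n in range(min(MAX_WORD_LEN, len(text) - i), 1, -1):
            if text[i:i + n] in LEXICON:
                words.append(text[i:i + n])
                i += n
                break
        else:
            i += 1  # no lexicon word starts here; move on one character
    return words


def combined_features(text):
    """Count character features ("c:") and word features ("w:") together."""
    feats = Counter()
    for ch in text:
        if "\u4e00" <= ch <= "\u9fff":  # CJK unified ideographs only
            feats["c:" + ch] += 1
    for w in segment(text):
        feats["w:" + w] += 1
    return feats


if __name__ == "__main__":
    src = indexing_source("某地发生地震", "救援队已出发。后续报道见第二版。")
    print(src)
    print(combined_features(src))
```

The prefixes keep the two feature types in separate namespaces, so a downstream classifier can weight a character and a word independently even when they overlap. How the paper actually weights or selects the combined features is not stated in the abstract.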
Source
《长治学院学报》
2006, Issue 2, pp. 34-35 (2 pages)
Journal of Changzhi University
Keywords
text classification
emergency news
feature extraction
feature combination