Abstract
Automatic word segmentation is the first step in Chinese information processing. A segmentation system is judged mainly on two criteria: segmentation speed and segmentation accuracy. This paper presents a dictionary structure and a knowledge-evaluation-based algorithm for automatic Chinese word segmentation. The approach greatly increases segmentation speed and, by applying rules and patterns built on complex feature sets, resolves most segmentation ambiguities. Finally, the paper offers some proposals for disambiguation.
Source
《情报学报》
CSSCI
PKU Core Journal (北大核心)
1996, No. 2, pp. 95-105 (11 pages)
Journal of the China Society for Scientific and Technical Information