Abstract
Chinese word segmentation is the most basic and most important research topic in Chinese text processing and natural language processing, and its quality directly affects the results of subsequent research in these fields. This paper describes and discusses in detail the existing dictionary-based, statistics-based, and understanding-based segmentation methods and analyzes their advantages and disadvantages. It also introduces the current difficulties in word segmentation and, on this basis, offers suggestions for the further development of Chinese word segmentation.
Source
Software (《软件》)
2012, Issue 12, pp. 103-108 (6 pages)
Keywords
Computer Application, Natural Language Processing, Chinese Word Segmentation