Abstract
This paper presents a combined language model for Chinese that incorporates part-of-speech (POS) information into a trigram-based statistical language model. Theoretical analysis and experiments both show that the model achieves lower perplexity (PP) than the trigram model and is also less dependent on the domain of the test text.
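As a rough illustration of the two quantities the abstract refers to, the sketch below computes perplexity from per-word probabilities and combines a word-trigram estimate with a POS-based estimate. The abstract does not say how the two models are combined, so the linear interpolation here is an assumption, and the names `interpolated_prob`, `perplexity`, and `lam` are hypothetical.

```python
import math


def interpolated_prob(p_word_trigram, p_pos_based, lam):
    """Mix two probability estimates for the same word.

    ASSUMPTION: linear interpolation is used only for illustration;
    the paper's actual combination scheme is not given in the abstract.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_word_trigram + (1.0 - lam) * p_pos_based


def perplexity(word_probs):
    """Perplexity of a text: PP = exp(-(1/N) * sum(log p_i)).

    word_probs holds the model probability assigned to each of the
    N words of the test text, each strictly greater than zero.
    """
    if not word_probs:
        raise ValueError("word_probs must not be empty")
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))
```

A lower value returned by `perplexity` on the same test text means the model assigns that text higher probability on average, which is the sense in which one model is compared with another here.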
Funding
National Natural Science Foundation of China
Provincial Natural Science Foundation