Context Information and Fragments Based Cross-Domain Word Segmentation (cited by: 8)
Authors: Huang Degen, Tong Deqin. China Communications (SCIE, CSCD), 2012, No. 3, pp. 49-57 (9 pages).
A new joint decoding strategy that combines character-based and word-based conditional random field models is proposed. In this segmentation framework, fragments are used to generate candidate out-of-vocabulary words (OOVs). After the initial segmentation, the segmentation fragments are divided into two classes: "combination" (combining several fragments into an unknown word) and "segregation" (separating them into individual words). As a result, more OOVs can be recalled. Moreover, to suit the characteristics of cross-domain segmentation, context information is used to guide Chinese Word Segmentation (CWS). Several experiments on test data from SIGHAN Bakeoff 2007 and Bakeoff 2010 show the method to be effective: OOV recall improves, and overall segmentation performance is good.
Keywords: cross-domain CWS; Conditional Random Fields (CRFs); joint decoding; context variables; segmentation fragments