Abstract
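This paper gives a comprehensive account of boundary ambiguity in Chinese phrase structures. It examines the different types of boundary ambiguity from three angles: the components of the ambiguous pattern, the effect the ambiguity has outside the phrase, and the correspondence between pattern-level ambiguity and instance-level ambiguity. It also reports preliminary statistics on these types. The aim is to make the phrase-boundary ambiguity problems that arise when computers process Chinese clearer, for the reference of theoretical researchers and developers of applied systems.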
This paper analyses ambiguity in determining the boundaries of Chinese phrases during
automatic parsing. The ambiguity can be classified from three perspectives. In terms of the
components of the ambiguous structure, ambiguous phrases fall into two kinds: those that
include terminal symbols, and those that consist only of non-terminal symbols. In terms of
the influence of the ambiguity, ambiguous phrases also fall into two kinds: self-confined
and non-self-confined. The influence of the former lies mainly inside the ambiguous phrase,
while the influence of the latter extends outside it. In terms of the relation between type
and token, ambiguous phrases fall into three kinds: true ambiguity, quasi-ambiguity, and
pseudo-ambiguity. The distribution of these types in Modern Chinese is then surveyed, based
on the above analysis and on a set of rules used in a Chinese-English machine translation
system. The authors hope that this analysis will help to solve the problem of phrase
structure ambiguity in Chinese.
Source
《中文信息学报》
CSCD
PKU Core Journal (北大核心)
1999, No. 3, pp. 9-17 (9 pages)
Journal of Chinese Information Processing
Funding
National 863 Program (国家863项目)