Abstract
In the international standard ISO/DP 10646, a script is defined as a complete set of graphic characters used for the written form of one or more languages. In our opinion, "a complete set" should be "an order-complete set". Words in a dictionary, as well as character strings of other kinds, are conventionally arranged in a definite traditional order. This paper discusses the ordering of characters in several scripts: English, the Latin Zhuang writing system (used in Guangxi, China), the Latin writing systems of some European languages, Mongolian script, Arabic script, Uygur script, Korean hangul, and Tibetan writing, among others. The discussion focuses on the consistency between the order of strings of coded characters and the traditional order of words in a dictionary. Unfortunately, many of these scripts, unlike English, lack this consistency. The ordering of coded characters has been ignored in many coding standards. In fact, for many of the phonetic scripts discussed, with the exception of Tibetan, this consistency can be achieved through reasonable coding.
Keywords: order-complete set, coded character set, coding standard, multilingual software.
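The gap between coded order and dictionary order can be illustrated with a minimal sketch. The choice of Swedish, the sample words, and the names `SWEDISH_ALPHABET`, `ORDERED_CODE` and `encode` are assumptions made for illustration only and do not come from the paper. The sketch covers lowercase letters only and assigns each letter a hypothetical code equal to its position in the conventional alphabet, so that comparing code sequences reproduces the dictionary order.

```python
# Illustrative assumption: the Swedish lowercase alphabet, which ends in
# a-ring, a-umlaut, o-umlaut (written as escapes so the code points are explicit).
A_RING, A_UMLAUT, O_UMLAUT = "\u00e5", "\u00e4", "\u00f6"
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyz" + A_RING + A_UMLAUT + O_UMLAUT

# Hypothetical order-preserving code: a letter's code is its alphabet index.
ORDERED_CODE = {ch: i for i, ch in enumerate(SWEDISH_ALPHABET)}


def encode(word):
    """Encode a lowercase word as a tuple of order-preserving codes."""
    return tuple(ORDERED_CODE[ch] for ch in word)


words = [A_UMLAUT + "ra", A_RING + "ra", "zon", O_UMLAUT + "ra"]

# Plain string comparison uses Unicode code points, where a-umlaut (U+00E4)
# precedes a-ring (U+00E5), the reverse of the alphabet order.
by_code_point = sorted(words)

# Comparing the order-preserving codes yields the dictionary order.
by_ordered_code = sorted(words, key=encode)

print(by_code_point)
print(by_ordered_code)
```

In practice, collation for a living language involves more than a single letter table (case, digraphs, accents treated as variants), so this shows only the basic idea of an order-preserving assignment of codes.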
Source
Journal of Chinese Information Processing (《中文信息学报》)
CSCD
1991, No. 1, pp. 28-35 (8 pages)
Funding
Natural Science Foundation of China (中国自然科学基金)