The Sorting Mathematical Model and Algorithm of Written Tibetan Language (书面藏语排序的数学模型及算法)

Cited by: 25
Abstract: The Chinese national standard GB 16959-1997 and ISO/IEC 10646-1:1993 define the coded character set for Tibetan information processing, and applying that set across software and databases creates an engineering need for sorting. Sorting written Tibetan words involves three notions (construction order, classes of constitution, and character sequence), so it differs in nature from the ordering of Chinese or English; a written Tibetan word has a complex, multi-level structure. The paper analyzes in detail the shapes and structural forms of Tibetan words, the order of construction categories, the traditional sequence of characters in each structural position, and the length of words and the height of their vertical stacks, and from this analysis builds a mathematical model of Tibetan sorting. Following the model, each class of Tibetan symbol is assigned a numeric value; an algorithm then determines the position of each character in a word step by step and identifies it against character-value lists; finally, the values extracted from the characters of each word are combined and compared to put the words in order. The model has been implemented on the Windows platform (Windows 2000/NT).
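Authors: Jiang Di (江荻), Kang Caijun (康才晙)
Source: Chinese Journal of Computers (《计算机学报》), 2004, No. 4, pp. 524-529 (6 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.
Funding: Supported by the National Natural Science Foundation of China (60173024)
Keywords: written Tibetan; construction order; classes of constitution; character sequence; sorting by computer; mathematical model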
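
The value-assignment strategy summarized in the abstract can be illustrated with a minimal Python sketch. It is not the paper's model. It assumes each syllable has already been decomposed into structural slots, and the `Syllable` class, the Wylie-style letter names, the concrete numeric values, and the slot priority (root, prefix, superscript, subscript, vowel, suffix, second suffix) are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass

# Wylie names of the 30 consonants in traditional alphabetical order.
CONSONANTS = ["k", "kh", "g", "ng", "c", "ch", "j", "ny", "t", "th",
              "d", "n", "p", "ph", "b", "m", "ts", "tsh", "dz", "w",
              "zh", "z", "'", "y", "r", "l", "sh", "s", "h", "a"]
VOWELS = ["a", "i", "u", "e", "o"]  # inherent "a" comes first

# Assumed numeric values: 0 marks an empty slot so that it sorts first.
CONSONANT_VALUE = {c: i + 1 for i, c in enumerate(CONSONANTS)}
CONSONANT_VALUE[""] = 0
VOWEL_VALUE = {v: i for i, v in enumerate(VOWELS)}


@dataclass(frozen=True)
class Syllable:
    """A syllable that is already split into structural slots (assumption)."""
    root: str
    vowel: str = "a"
    prefix: str = ""
    superscript: str = ""
    subscript: str = ""
    suffix: str = ""
    second_suffix: str = ""


def sort_key(s):
    """Combine the per-slot values into one tuple that compares slot by slot."""
    c = CONSONANT_VALUE
    return (c[s.root], c[s.prefix], c[s.superscript], c[s.subscript],
            VOWEL_VALUE[s.vowel], c[s.suffix], c[s.second_suffix])


def sort_syllables(syllables):
    """Order syllables by their combined numeric keys."""
    return sorted(syllables, key=sort_key)


words = [
    Syllable("k", vowel="o"),     # ko
    Syllable("g"),                # ga
    Syllable("k", suffix="ng"),   # kang
    Syllable("k"),                # ka
    Syllable("k", prefix="b"),    # bka
]
ordered = sort_syllables(words)
```

Because Python compares tuples element by element, a slot listed earlier in the key outranks every slot after it; a different construction order would be expressed by reordering the tuple fields. Identifying the slots from raw encoded text, which the abstract describes as a separate step, is not attempted here.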
References (3)

  • 1. National Standard of PRC. Information Technology: Tibetan Coded Character Set for Information Interchange, Basic Set (GB 16959-1997). Beijing: Standards Press of China, 1998 (in Chinese) (中华人民共和国国家标准. 信息技术 信息交换用藏文编码字符集 基本集 (GB 16959-1997). 北京: 中国标准出版社, 1998)
  • 2. ISO/IEC 10646-1:1993. Information Technology: Universal Multiple-Octet Coded Character Set (UCS)
  • 3. Jiang Di (江荻), Zhou Jiwen (周季文). 论藏文的序性及排序方法 [On the ordering properties of written Tibetan and its sorting methods]. 中文信息学报 (Journal of Chinese Information Processing), 2000, 14(1): 56-64 (in Chinese). Cited by: 34
