Abstract: In the last two decades of the 20th century, interest in the study of Hong Kong literature grew among both academics and the general public in Hong Kong. Recognizing the emerging need for resources on Hong Kong literature, the University Library System of the Chinese University of Hong Kong set up the Hong Kong Literature Database (the "Database"), which in 2000 became the first Chinese literature database on the Internet. The paper examines how the Database is constructed using XML technology and a metadata schema. The Database also employs Unicode UTF-8 as its internal encoding. A mapping table for traditional and simplified Chinese characters was created based on Unihan and is used behind the scenes, so that a user can input either traditional or simplified Chinese characters and retrieval returns results in both. Currently, OCR technology has been applied to 65% of the journals, making full-text searching possible. The Chinese OCR technology is examined in greater detail. Special features of the Database are described, such as the page-by-page browse mode, position highlighting for full-page newspapers, and links to tables of contents and book jackets from the Library catalogue. The paper also raises the problem of massive downloading and compares the state-of-the-art technologies and their shortcomings. It shows how the Hong Kong Literature Database facilitates future collaboration and data exchange by using open standards, a shareable structure and the latest technology.
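The abstract does not give the Database's implementation, so the following is only a minimal sketch of how a traditional/simplified mapping table could be used to expand a query so that either script retrieves both. The tiny `TRAD_TO_SIMP` table, the function names and the sample documents are all hypothetical; a real table would be derived from the Unihan variant fields (`kSimplifiedVariant`, `kTraditionalVariant`) and would have to handle one-to-many mappings, which this sketch ignores.

```python
# Hypothetical miniature mapping table (assumption, not the Database's data).
# A production table would be built from the Unihan variant fields.
TRAD_TO_SIMP = {
    "學": "学",
    "資": "资",
    "庫": "库",
    "圖": "图",
    "書": "书",
    "館": "馆",
}
# Reverse table; assumes a one-to-one mapping, which real data does not guarantee.
SIMP_TO_TRAD = {simp: trad for trad, simp in TRAD_TO_SIMP.items()}


def to_simplified(text):
    """Replace each traditional character that has a known simplified form."""
    return "".join(TRAD_TO_SIMP.get(ch, ch) for ch in text)


def to_traditional(text):
    """Replace each simplified character that has a known traditional form."""
    return "".join(SIMP_TO_TRAD.get(ch, ch) for ch in text)


def expand_query(query):
    """Return the distinct script variants of a query, original first."""
    variants = []
    for candidate in (query, to_simplified(query), to_traditional(query)):
        if candidate not in variants:
            variants.append(candidate)
    return variants


def search(query, documents):
    """Return documents containing the query in either script."""
    variants = expand_query(query)
    return [doc for doc in documents if any(v in doc for v in variants)]
```

With this sketch, a call such as `search("文学", ["香港文學資料庫", "香港文学资料库"])` matches both strings, because the query is tried in its simplified and traditional forms.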
Abstract: As the number of Chinese learners increases each year, the international influence of the Chinese language has also grown. Consequently, Chinese auxiliary learning tools should align more closely with the needs of present-day learners. This paper focuses on a typical Chinese dictionary app and tests and compares its optical character recognition (OCR) function. Specifically, it compares the OCR function of a widely used Chinese learning dictionary with the more advanced OCR technology built into current mobile devices. Having identified the technical gap, the paper proposes a design that incorporates deep learning techniques to enhance the dictionary's built-in OCR function. This improvement aims to significantly raise recognition accuracy in natural scenarios and ultimately improve the user experience of the Chinese learning dictionary app.
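The abstract does not state which metric the paper uses to compare the two OCR functions. One common way to quantify such a gap is the character error rate (CER): the edit distance between the recognized text and a reference transcription, divided by the reference length. The sketch below illustrates that metric only; the function names are hypothetical and it is not presented as the paper's method.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two strings, counted in characters."""
    # prev[j] holds the distance between the processed prefix of
    # `reference` and the first j characters of `hypothesis`.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_ch in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_ch in enumerate(hypothesis, start=1):
            cost = 0 if ref_ch == hyp_ch else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance normalized by the length of the reference text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Running both OCR functions on the same set of images and averaging `character_error_rate` against hand-checked transcriptions would give one number per system, which makes the gap between them directly comparable.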