Abstract
This paper describes in detail the design and implementation of Search2000, a Web-based Chinese text retrieval system. Compared with traditional full-text retrieval systems, Web-based retrieval systems face several new characteristics: Web pages are semi-structured documents, pages are connected to one another through hyperlinks, and page content spans different application domains and contains large numbers of proper nouns and abbreviations. These characteristics are the main factors affecting query precision. Search2000 is designed around them: it uses intelligent page relevance analysis and scoring, efficient data access and compression algorithms, and knowledge-base support, which makes it easy to use, shortens query response time, and improves query precision.
Source
Journal of Chinese Information Processing (《中文信息学报》)
CSCD
Peking University Core Journals
2000, No. 6, pp. 14-20 (7 pages)
Funding
National Natural Science Foundation of China, Young Scientists Fund (69983009)