Funding: Supported by the National Natural Science Foundation of China (No. 61173024)
Abstract: To improve the efficiency of search engines, the query result cache has drawn much attention recently. Based on the characteristics of query processing and the locality of users' query logs, a new hybrid result cache strategy that combines caching heat and caching worth is proposed to compute the cache score in accordance with cost-aware strategies. Specifically, the query repeat distance and a query length factor are used to improve the static result policy, and the dynamic policy is adjusted by the caching worth. The hybrid result cache is implemented in terms of the document content and the document id (docId) sequence. Based on a score format and the new hybrid structure, an initial algorithm and a new routing algorithm are designed for the result cache. Experimental results show that the improved caching policies effectively decrease the average response time and significantly increase the system throughput. With a suitable combination of page cache and docIds cache, the new hybrid caching strategy reduces the average query time by more than 20% compared with the basic page-only cache and docId-only cache.
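The two-level lookup implied by a hybrid page/docIds cache can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `LRU` and `HybridResultCache` classes, the `search` and `render` callbacks, and the use of plain LRU eviction. The abstract does not give the score formula or the routing algorithm, so the cost-aware scoring by caching heat and worth is not reproduced.

```python
from collections import OrderedDict


class LRU:
    """Tiny LRU map, used here as a stand-in for the paper's scored eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used


class HybridResultCache:
    """Hypothetical hybrid cache: full result pages plus docId sequences."""

    def __init__(self, page_capacity, docid_capacity, search, render):
        self.pages = LRU(page_capacity)     # query -> rendered result page
        self.docids = LRU(docid_capacity)   # query -> list of docIds
        self.search = search                # assumed: full query processing
        self.render = render                # assumed: builds a page from docIds

    def lookup(self, query):
        page = self.pages.get(query)
        if page is not None:
            return page, "page-hit"         # cheapest path: page is ready
        ids = self.docids.get(query)
        if ids is not None:
            source = "docid-hit"            # skip ranking, only rebuild the page
        else:
            ids = self.search(query)        # full processing on a miss
            self.docids.put(query, ids)
            source = "miss"
        page = self.render(ids)
        self.pages.put(query, page)
        return page, source
```

A docIds entry is much smaller than a rendered page, so under a fixed memory budget the split between the two capacities trades hit cost against hit rate, which is the combination the abstract refers to.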
Funding: Project (No. ABA048) supported by the Natural Science Foundation of Hubei Province, China
Abstract: Continuously monitoring multiple K-nearest neighbor (K-NN) queries over dynamic object and query datasets is valuable for many location-based applications. A practical method is to partition the data space into grid cells and index both the object table and the query table by this grid structure; the problem is then solved by periodically joining the cells of objects with the queries whose influence regions intersect those cells. In the worst case, every object cell is accessed once. Object and query cache strategies are proposed to further reduce the I/O cost. With the object cache strategy, queries that remain static in the current processing cycle seldom incur I/O cost and can be answered quickly. The main I/O cost comes from moving queries, so the query cache strategy restricts their search regions by using the current query results held in the main-memory buffer. The queries can share not only accesses to object pages but also their influence regions. A theoretical analysis of the expected I/O cost is presented; in the experimental results, the I/O cost is about 40% of that of the SEA-CNN method.
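The grid partitioning and the restriction of a query to the cells that its influence region touches can be sketched as follows. This is only an illustration under stated assumptions: the influence region is taken to be a circle of a given `radius` around the query point, its bounding square is used as a conservative superset of the intersecting cells, and all function names are hypothetical. The object and query cache strategies themselves are not modelled.

```python
import heapq
import math
from collections import defaultdict


def cell_of(point, cell_size):
    """Map a 2-D point to the integer coordinates of its grid cell."""
    x, y = point
    return (int(x // cell_size), int(y // cell_size))


def build_grid(objects, cell_size):
    """Index object ids by grid cell; `objects` maps id -> (x, y)."""
    grid = defaultdict(list)
    for oid, point in objects.items():
        grid[cell_of(point, cell_size)].append(oid)
    return grid


def cells_in_region(center, radius, cell_size):
    """Cells overlapping the bounding square of the assumed circular region."""
    cx, cy = center
    x0, y0 = cell_of((cx - radius, cy - radius), cell_size)
    x1, y1 = cell_of((cx + radius, cy + radius), cell_size)
    return [(i, j) for i in range(x0, x1 + 1) for j in range(y0, y1 + 1)]


def knn_in_region(query, radius, k, objects, grid, cell_size):
    """Return the k nearest (distance, id) pairs found inside the region.

    The answer is exact only if `radius` is at least the true k-th
    neighbor distance, for example the distance from the previous cycle.
    """
    candidates = []
    for cell in cells_in_region(query, radius, cell_size):
        for oid in grid.get(cell, []):
            candidates.append((math.dist(query, objects[oid]), oid))
    return heapq.nsmallest(k, candidates)
```

Only the cells returned by `cells_in_region` are read, so a query whose region is small touches few object pages, which is where the I/O saving described above would come from.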
Funding: Supported by the National Research Foundation of Korea under Grant No. 2010-0016165 and by the IT R&D Program of MIC/IITA under Grant No. 2007-S-016-02.
Abstract: Due to the proliferation of the Internet and intranets, distributed storage systems have received a lot of attention. These systems span a large number of machines and store huge amounts of data for many users. In a distributed storage system, a row can be accessed directly using its row key. We concentrate on the problem of efficiently processing queries whose predicate is on a column rather than on the row key. In this paper, we present a cache management technique called DICE, which maintains the results of range queries to support subsequent range queries. To accelerate the search of cached query results, we use modified Interval Skip Lists. In addition, we devise a novel cache replacement policy, since DICE maintains intervals rather than individual data items. Because our cache replacement policy considers the properties of intervals, the proposed technique is more efficient than traditional buffer replacement algorithms. Our experimental results demonstrate the efficiency of the proposed technique.
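The idea of answering a new range query from a cached range result can be shown with a small sketch. It is not DICE: a sorted list searched with `bisect` stands in for the modified interval skip list, no replacement policy is included, and the class name, the inclusive bounds, and the `(value, row_key)` row layout are all assumptions made for illustration.

```python
import bisect


class RangeResultCache:
    """Hypothetical cache of range-query results keyed by column interval."""

    def __init__(self):
        self.starts = []   # sorted lower bounds of cached intervals
        self.entries = []  # parallel list of (lo, hi, rows)

    def put(self, lo, hi, rows):
        """Cache the rows returned for the inclusive range [lo, hi]."""
        i = bisect.bisect_left(self.starts, lo)
        self.starts.insert(i, lo)
        self.entries.insert(i, (lo, hi, sorted(rows)))

    def get(self, lo, hi):
        """Return rows for [lo, hi] if one cached interval covers it, else None."""
        # Only intervals that start at or before `lo` can cover the query.
        i = bisect.bisect_right(self.starts, lo)
        for _c_lo, c_hi, rows in self.entries[:i]:
            if c_hi >= hi:
                # Rows are (value, row_key); keep those inside the new range.
                return [row for row in rows if lo <= row[0] <= hi]
        return None
```

The scan over candidate intervals is linear here; an interval index such as the one named in the abstract would replace that scan with a faster stabbing search.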