1. LRP: learned robust data partitioning for efficient processing of large dynamic queries
Authors: Pengju LIU, Pan CAI, Kai ZHONG, Cuiping LI, Hong CHEN. Frontiers of Computer Science, 2025, Issue 9, pp. 43-60 (18 pages).
The interplay between query processing and data partitioning is pivotal for accelerating massive data processing during query execution, primarily by minimizing the number of scanned block files. Existing partitioning techniques predominantly focus on query accesses to numeric columns when constructing partitions, often overlooking non-numeric columns and thus limiting optimization potential. Additionally, although these techniques create fine-grained partitions from representative queries to enhance system performance, they suffer notable performance declines due to unpredictable fluctuations in future queries. To tackle these issues, we introduce LRP, a learned robust partitioning system for dynamic query processing. LRP first proposes a method for data and query encoding that captures comprehensive column access patterns from historical queries. It then employs Multi-Layer Perceptron and Long Short-Term Memory networks to predict shifts in the distribution of historical queries. To create high-quality, robust partitions based on these predictions, LRP adopts a greedy beam search algorithm for optimal partition division and implements a data redundancy mechanism to share frequently accessed data across partitions. Experimental evaluations reveal that LRP yields partitions with more stable performance under incoming queries and significantly surpasses state-of-the-art partitioning methods.
Keywords: data partitioning; data encoding; query prediction; beam search; data redundancy
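The idea of choosing partition boundaries so that a query workload scans as little data as possible can be illustrated with a small beam search. The sketch below is not the LRP algorithm: it assumes a single numeric column, closed range queries, a fixed list of candidate cut points, and a cost equal to the number of rows in every block a query touches. The names `rows_scanned` and `beam_search_cuts` are hypothetical.

```python
from bisect import bisect_left, bisect_right


def rows_scanned(values, cuts, query):
    """Rows read by a closed range query (lo, hi).

    values: sorted column values.
    cuts:   sorted cut points; blocks are (-inf, c0), [c0, c1), ..., [c_last, +inf).
    Every block that overlaps the query range is read in full.
    """
    lo, hi = query
    bounds = [float("-inf")] + list(cuts) + [float("inf")]
    first = bisect_right(cuts, lo)  # index of the block containing lo
    last = bisect_right(cuts, hi)   # index of the block containing hi
    start = bisect_left(values, bounds[first])
    end = bisect_left(values, bounds[last + 1])
    return end - start


def beam_search_cuts(values, queries, candidates, num_cuts, beam_width=3):
    """Pick up to num_cuts cut points, keeping the beam_width cheapest partial layouts."""

    def cost(cuts):
        return sum(rows_scanned(values, cuts, q) for q in queries)

    beam = [()]
    for _ in range(num_cuts):
        expanded = set()
        for cuts in beam:
            for c in candidates:
                if c not in cuts:
                    expanded.add(tuple(sorted(cuts + (c,))))
        if not expanded:
            break
        # Ties are broken by the cut tuple itself so the result is deterministic.
        beam = sorted(expanded, key=lambda cuts: (cost(cuts), cuts))[:beam_width]
    return list(min(beam, key=lambda cuts: (cost(cuts), cuts)))


values = list(range(100))
queries = [(0, 9), (50, 59)]
best = beam_search_cuts(values, queries, candidates=[10, 25, 50, 60, 75], num_cuts=3)
```

With this toy workload the search isolates each query range in its own block. A beam wider than one matters here because all of the best single cuts tie on cost, so a purely greedy choice is decided only by tie-breaking.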
2. Query Performance Prediction for Information Retrieval Based on Covering Topic Score (cited by: 3)
Authors: 郎皓, 王斌, Gareth Jones, 李锦涛, 丁凡, 刘宜轩. Journal of Computer Science & Technology (indexed in SCIE, EI, CSCD), 2008, Issue 4, pp. 590-601 (12 pages).
We present a statistical method called Covering Topic Score (CTS) to predict query performance for information retrieval. Estimation is based on how well the topic of a user's query is covered by documents retrieved from a certain retrieval system. Our approach is conceptually simple and intuitive, and can be easily extended to incorporate features beyond bag-of-words, such as phrases and proximity of terms. Experiments demonstrate that CTS significantly correlates with query performance in a variety of TREC test collections, and in particular that CTS gains further predictive power from phrase and term-proximity features. We compare CTS with previous state-of-the-art methods for query performance prediction, including clarity score and robustness score. Our experimental results show that CTS consistently performs better than, or at least as well as, these other methods. In addition to its high effectiveness, CTS is also shown to have very low computational complexity, meaning that it can be practical for real applications.
Keywords: information storage and retrieval; information search and retrieval; query performance prediction; covering topic score
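The abstract does not give the CTS formula, so the following is only a toy proxy for the underlying intuition that a query is likely to perform well when its terms are well covered by the retrieved documents. It assumes bag-of-words documents represented as sets of terms and a smoothed IDF weight per query term; the function `coverage_score` and its weighting are illustrative assumptions, not the method from the paper.

```python
import math


def coverage_score(query_terms, retrieved_docs, doc_freq, num_docs):
    """Toy coverage proxy in [0, 1]; NOT the CTS formula from the paper.

    query_terms:    list of query terms.
    retrieved_docs: list of sets of terms (top-ranked documents).
    doc_freq:       term -> number of collection documents containing it.
    num_docs:       collection size, used for a smoothed IDF weight.
    """
    if not retrieved_docs:
        return 0.0
    weights = {
        t: math.log((num_docs + 1) / (doc_freq.get(t, 0) + 1))
        for t in set(query_terms)
    }
    total = sum(weights.values())
    if total == 0:
        return 0.0
    score = 0.0
    for term, weight in weights.items():
        # Fraction of retrieved documents that contain this query term.
        covered = sum(1 for doc in retrieved_docs if term in doc)
        score += weight * covered / len(retrieved_docs)
    return score / total


docs = [{"query", "prediction", "retrieval"}, {"query", "index"}]
freq = {"query": 9, "prediction": 9}
score = coverage_score(["query", "prediction"], docs, freq, num_docs=99)
```

Rarer query terms receive larger weights, so missing a distinctive term lowers the score more than missing a common one.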