期刊文献+

基于多表频繁项投票和桶映射链的快速检索方法 被引量:5

A Fast Retrieval Method Based on Frequent Items Voting of Multi Table and Bucket Map Chain
在线阅读 下载PDF
导出
摘要 为解决基于随机映射的高维向量快速检索方法位置敏感哈希存在的随机性强和内存消耗大两个问题,在E2LSH(Exact Euclidean Locality Sensitive Hashing)的基础上提出了基于多表频繁项投票和桶映射链的快速检索方法。该方法用检索结果构造基准索引矩阵,并对基准索引矩阵进行频繁项投票和校正得出最终索引来降低检索的随机性;桶映射链利用E2LSH的数据划分特性减少检索时读入内存的数据点的数目,以此来降低内存消耗。实验证明该方法能减弱检索的随机性,并有效地降低检索的内存消耗。这对于提高大规模信息检索尤其是图像检索的可行性有着较大的作用。 To solve the problem of strong randomicity and high memory cost of fast retrieval method Locality Sensitive Hashing (LSH) based on random projection, a fast retrieval method is presented based on multi table frequent items voting and bucket map chain on the basis of Exact Euclidean Locality Sensitive Hashing (E2LSH). The method constructs an index matrix with retrieval vectors, and performs frequent items voting and calibration on this matrix to decrease the randomocity. It also reduces the number of points loaded into memory by making use of the data partition property of E2LSH to decrease the memory cost. The experiments show that this method can decrease the randomicity and efficiently reduce the memory cost of retrieval. This is very important for increasing the feasibility of large scale information retrieval especially image retrieval.
出处 《电子与信息学报》 EI CSCD 北大核心 2012年第11期2574-2581,共8页 Journal of Electronics & Information Technology
基金 国家自然科学基金(60872142)资助课题
关键词 信息检索 位置敏感哈希 随机性 内存消耗 频繁项投票 桶映射链 Information retrieval Locality Sensitive Hashing (LSH) Randomicity Memory cost Frequent items voting Bucket map chain
  • 相关文献

参考文献24

  • 1Indyk P and Motwani R. Approximate nearest neighbors: towards removing the curse of dimensionality[C]. Proceedings of the Symposium on Theory of Computing, Dallas, Texas, USA, ACM, 1998: 604-613.
  • 2Gionis A, Indyk P, and Motwani R. Similarity search in high dimensions via hashing[C]. Proceedings of the 25th International Conference on Very Large Data Bases. Edinburgh, Scotland, Morgan Kaufmann, 1999: 518-529.
  • 3Andoni A and Indyk P. E2LSH 0.1 user manual[OL], http:// www.mit.edu/-andoni/LSH/manual.pdf, October 20, 2011.
  • 4Datar M, Immorlica N, Indyk P, et al.. Locality sensitive hashing scheme based on p-stable distributions[C]. Proceedings of the ACM Symposium on Computational Geometry, New York, USA, ACM, 2004: 253-262.
  • 5Jegou H, Douze M, and Schmid C. Improving bag-of-features for large scale image search[J]. International Journal of Computer Vision, 2010, 87(3): 316-336.
  • 6Gruman K Efficiently searching for similar images[J]. Communications of the A CM, 2010, 53(6): 85-94.
  • 7Avrithis Y, Tolias G, and Kalantidis Y. Feature map hashing: sub-linear indexing of appearance and global geometry[C]. Proceedings of the International Conference on Multimedia, New York, NY, USA. 2010: 231-240.
  • 8Mu Ya-dong and Yan Shui-cheng. Non-metric locality- sensitive gashing[C]. Proceedings of the 24th AAAI Conference on Artificial Intelligence, Atlanta, Georgia, USA, 2010: 539-544.
  • 9Liu Zhu, Liu Tao, and David G. Effective and scalable video copy detection[C]. Proceedings of the ACM SIGMM International Conference on Multimedia Information Retrieval, New York, USA, 2010: 119-128.
  • 10Cao Yi-qun, Jiang Tao, and Thomas G. Accelerated similarity searching and clustering of large compound sets by geometric embedding and locality sensitive hashing[J]. Bioinfomatics, 2010, 26(7): 953-959.

同被引文献84

  • 1冯兵,李芝棠,花广路.基于灰度—梯度共生矩阵的图像型垃圾邮件识别方法[J].通信学报,2013,34(S2):1-4. 被引量:11
  • 2林海卓,王继龙,吴建平,杨家海,徐聪.高校误判垃圾邮件自动召回系统的研究与实现[J].通信学报,2013,34(S2):121-132. 被引量:1
  • 3BREESE J, HECHERMAN D, KADIE C. Empirical analysis of predictive algorithms for collaborative filtering[ C] // Proceedings of the 14th Conference on Uncertainty in Artificial Intelligence. San Francisco: Morgan Kaufmann Publishers, 1998:43 -52.
  • 4ZHANG S, WANG W, FORD J, et al. Learning from incomplete ratings using non-negative matrix factorization[ C]// proceedings of the 6th SIAM Conference on Data Mining. Philadelphia: SIAM, 2006:549-553.
  • 5CAI R, ZHANG C, ZHANG L, et al. Scalable Music recommenda- tion by search[ C]// Proceedings of the 15th ACM International Conference on Multimedia. New York: ACM, 2007:1065 - 1074.
  • 6DATAR M, IMMORLICA N, INDYK P, et al. Locality-sensitive hashing scheme based on p-stable distributions[ C}//Proceedings of the 20th ACM Symposium on Computational Geometry. New York: ACM, 2004:253-262.
  • 7MAREE R, DENIS P, WEHENKEL L. Incremental indexing and distributed image search using shared randomized vocabularies [ C]//Proceedings of the 2007 ACM SIGMM International Confer- ence on Very Large Data Bases. New York: ACM, 2007:950 - 961.
  • 8ANDONI A, INDYK P. Nearest-optimal hashing algorithms for ap- proximate nearest neighbor in high dimensions[ J]. Communications oftheACM, 2008,51(1): 117 -122.
  • 9ANDONI A, INDYK P. E2LSH: Exact Euclidean Locality-Sensi- tive Hashing (E2LSH) 0.1 user manual[ EB/OL]. [ 2005 - 06 - 01]. http://www, mit. edu/ andoni/ LSH/manual. pdf.
  • 10MALCOLM S, MICHAEL C. Locality-sensitive hashing for finding nearest neighbors [ J]. IEEE Signal Processing Magazine, 2008, 8 (3): 128-131.

引证文献5

二级引证文献27

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部