期刊文献+

基于位置敏感哈希的高效近似最近邻检索研究

Research on efficient approximate nearest neighbor search based on Locality-Sensitive Hashing
在线阅读 下载PDF
导出
摘要 位置敏感哈希(Locality-Sensitive Hashing,LSH)是一种有效的随机化技术,广泛应用于众多机器学习任务中。然而,哈希计算的开销与数据维度成正比,因此在高维数据和使用大量哈希函数的场景下,往往成为性能瓶颈。本文在l2范数下设计了一种简单而高效的LSH方法,称为FastLSH。该方法通过结合随机采样与随机投影,将哈希时间复杂度从O(n)降低至O(m)(其中n为数据维度,m<n为采样维度数)。更重要的是,FastLSH保留了可证明的LSH性质。论文在多个真实和合成数据集上进行了针对最近邻搜索任务的全面实验。实验结果表明,FastLSH在查询质量、空间开销和查询效率等方面与当前先进方法表现相当,同时在哈希函数计算上可实现最高达80倍的加速。 Locality-Sensitive Hashing(LSH)is an effective randomized technique and widely used in many machine learning tasks.The cost of hashing is proportional to data dimensions,and thus the performance bottleneck often occurs when dimensionality is high and the number of hash functions involved is large.This paper designs a simple yet efficient LSH scheme,named FastLSH,under l2 norm.By combining random sampling and random projection,FastLSH reduces the time complexity from O(n)to O(m)(m<n),where n is the data dimensionality and m is the number of sampled dimensions.Moreover,FastLSH has provable LSH property.The paper conducts comprehensive experiments over a collection of real and synthetic datasets for the nearest neighbor search task.Experimental results demonstrate that FastLSH is on par with the state-of-the-arts in terms of answer quality,space occupation and query efficiency,while enjoying up to 80×speedup in hash function evaluation.
作者 谭宗元 王洪亚 TAN Zongyuan;WANG Hongya(School of Computer Science and Technology,Donghua University,Shanghai 201620,China)
出处 《智能计算机与应用》 2025年第9期19-25,共7页 Intelligent Computer and Applications
关键词 位置敏感哈希 机器学习 随机采样 LSH性质 Locality-Sensitive Hashing machine learning random sampling LSH property
  • 相关文献

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部