Nearest Neighbor Retrieval of Massive High-dimensional Data Based on Improved Random Forest

Cited by: 3
Abstract: Traditional nearest neighbor retrieval of massive high-dimensional data suffers from low recall and high overhead because high-dimensional, large-sample data are classified inadequately. To address these problems, a nearest neighbor retrieval method for massive high-dimensional data based on an improved random forest is proposed. High-dimensional data are collected and reduced in dimension using locally linear embedding. A nearest neighbor retrieval index is then created, and the improved random forest algorithm determines the type of the high-dimensional data, which completes the nearest neighbor retrieval. To test the designed method, a comparative experiment against a traditional retrieval method was carried out. It shows that the designed method raises the average recall rate by 1.2% and reduces both memory overhead and time overhead.
Author: SUN Hao (孙昊) (Urumqi Vocational University, Urumqi 830002, China)
Source: Techniques of Automation and Applications (《自动化技术与应用》), 2022, No. 11, pp. 73-76 (4 pages)
Funding: Urumqi Vocational University 2019 university-level key research project (2018XZ001).
Keywords: improved random forest algorithm; massive high-dimensional data; data retrieval; nearest neighbor retrieval
