期刊文献+

SCBT-index:基于谱编码的子图索引算法 被引量:1

SCBT-index:Subgraph Indexing Algorithm Based on Spectral Coding
在线阅读 下载PDF
导出
摘要 随着图模型规模的扩大,单机算法难以适应大规模数据集下的子图查询.而现有的分布式算法基于无索引的简单遍历,join过程容易出现内存溢出,而且查询图分布异常时易出现负载不均衡.提出了一种基于谱编码的二叉索引树(SCBT-index),首先对数据图中的顶点谱编码,根据编码信息构建二叉索引树.然后对查询图使用最小查询计划进行分解,最后join过程使用3个剪枝策略:基于拓扑结构的预剪枝、序列化join和基于分布式下的join优化.实验结果表明,SCBT-index在图集下的综合性能优于现有主流算法,单图下的查询时间为现有算法的1/2到1/4. With the expansion of graph scale,single-machine algorithms can hardly adapt to the sub-graph queries in large-scale data sets.As existing distributed algorithms are based on simple traversal without index,the join process is prone to memory overflow in the distributed algorithms and load imbalance occurs when the query graph distribution is abnormal.Therefore,a binary index tree based on spectral coding named SCBT-index is proposed.Firstly,for vertex spectrum coding in the data graph,a binary index tree is constructed according to the coding information.Then,the query graph is decomposed using the minimum query plan.Finally,three pruning strategies are used in the join process:structure matching based on topological structure,serialized join and the distributed join optimization.The experimental results show the comprehensive performance of SCBT-index under the graph set is better than that of the popular algorithms.In addition,the query time under the single graph is 1/2 to 1/4 of that of the existing algorithms.
作者 施炜杰 董一鸿 钱江波 陈华辉 辛宇 SHI Wei-jie;DONG Yi-hong;QIAN Jiang-bo;CHEN Hua-hui;XIN Yu(Faculty of Electrical Engineering and Computer Science,Ningbo University,Ningbo,Zhejiang 315211,China)
出处 《电子学报》 EI CAS CSCD 北大核心 2020年第1期110-117,共8页 Acta Electronica Sinica
基金 国家自然科学基金(No.61572266,No.61602133) 浙江省自然科学基金(No.LY20F020009,No.LZ20F020001)
关键词 谱编码 GINI系数 子图查询 子图索引 spectral coding Gini subgraph query subgraph index
  • 相关文献

参考文献1

二级参考文献28

  • 1汪卫,周皓峰,袁晴晴,楼宇波,施伯乐.基于图论的频繁模式挖掘[J].计算机研究与发展,2005,42(2):230-235. 被引量:17
  • 2李先通,李建中,高宏.一种高效频繁子图挖掘算法[J].软件学报,2007,18(10):2469-2480. 被引量:34
  • 3Borgelt C, Berthold M R, Patterson D E. Molecular fragment mining for drug discovery [G] //Symbolic and Quantitative Approaches to Reasoning with Uncertainty. Berlin: Springer, 2005 : 1002-1013.
  • 4Guralnik V, Karypis G. A scalable algorithm for clustering sequential data [C] //Proc of the 1st IEEE Int Conf on Data Mining. Piscataway, NJ: IEEE, 2001:179-186.
  • 5Yan X, Yu P S, Han J. Graph indexing: A frequent structure-based approach [C] //Proc of the 17th ACM SIGMOD Int Conf on Management of Data. New York: ACM, 2004: 335-346.
  • 6Liu Y, Jiang X, Chen H, et al. Mapreduce-based pattern finding algorithm applied in motif detection for prescription compatibility network [G] //Advanced Parallel Processing Technologies. Berlin: Springer, 2009: 341-355.
  • 7Shahrivari S, Jalili S. Distributed discovery of /requent subgraphs of a network using MapReduce [OL]. [2015-03- 25]. http://link, springer, corn/article/10. 1007/s00607-015 0446 9.
  • 8Elseidy M, Abdelhamid E, Skiadopoulos S, et al. GRAMI: Frequent subgraph and pattern mining in a single large graph [C] //Proc of the 40th Int Conf on Very Large Data Bases. Berlin: Springer, 2014:517-528.
  • 9Bhuiyan M A, A1 Hasan M. An iterative MapReduce based frequent subgraph mining algorithm [J]. IEEE Trans on Knowledge and Data Engineering, 2013, 27(3): 608-620.
  • 10Lu W, Chen G, Tung A K H, et al. Efficiently extracting frequent subgraphs using mapreduce [C] //Proc of the 1st IEEE Int Conf on Big Data. Piscataway, NJ: IEEE, 2013: 639-647.

共引文献20

同被引文献18

引证文献1

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部