期刊文献+

基于随机游走路径的分布式SimRank算法 被引量:2

Distributed Sim Rank Algorithm Based on Random Walk Path
在线阅读 下载PDF
导出
摘要 SimRank算法是一种常用的相似性度量模型,它基于图的拓扑结构信息来衡量任意两个对象之间的相似程度。随着数据规模的不断增大,集中式SimRank算法已不适用,而已有的分布式Sim Rank算法在运行效率和扩展性等方面存在缺陷。针对上述问题,提出了一种两阶段的基于随机游走路径的分布式Sim Rank算法。第一阶段基于BSP(bulk synchronous parallel)模型建立随机游走路径索引信息,支持新路径的动态添加,并通过阈值过滤尽可能减少生成路径的数量;第二阶段利用第一阶段生成的索引信息,提出了基于MapReduce的分布式SimRank算法。最后,通过实验验证了算法的可行性和有效性。 SimRank is a widely used model for computing similarity, it measures similarity between objects based on graph topology. With the rapid increase of data, the way of centralized SimRank is not applicable and current distributed SimRank approaches have some drawbacks in efficiency and scalability. This paper presents a two-stage distributed SimRank algorithm based on random walk path. The first stage is data preprocessing and a Find-K-Paths algorithm based on BSP (bulk synchronous parallel) model is proposed. The algorithm can effectively build the index information of random walk path and support the dynamic adding of new paths. The number of the generated paths can be reduced by threshold filtering. In the second stage, based on the index information, a distributed SimRank algorithm is proposed under MapReduce. The experiments demonstrate the feasibility and effectiveness of the proposed algorithm.
出处 《计算机科学与探索》 CSCD 2014年第12期1422-1431,共10页 Journal of Frontiers of Computer Science and Technology
基金 国家自然科学基金 国家重点基础研究发展计划(973计划) 中央高校基本科研业务费专项基金 高等学校博士学科点专项科研基金 教育部-英特尔信息技术专项科研基金~~
关键词 分布式SimRank 随机游走路径 BSP模型 MAPREDUCE distributed SimRank random walk path BSP model MapReduce
  • 相关文献

参考文献13

  • 1Li Cuiping, Han Jiawei, He Guoming, et al. Fast computa- tion of SimRank for static and dynamic information net- works[C]//Proceedings of the 13th International Confer- ence on Extending Database Technology (EDBT '10), Laus- anne, Switzerland, 2010. New York, NY, USA: ACM, 2010: 465-476.
  • 2Yu Weiren, Lin Xuemin, Zhang Wenjie. Towards efficient SimRank computation on large graphs[C]//Proceedings of the 29th International Conference on Data Engineering (ICDE ' 13), Brisbane, Australia, 2013. Piscataway, N J, USA: IEEE, 2013: 601-612.
  • 3Valiant L G. A bridging model for parallel computation[J]. Communications of the ACM, 1990, 33(8): 103-111.
  • 4Ioaimis A, Hector G M, Chi C C. SimRank++: query rewriting through link analysis of the click graph[J]. Proceedings of the VLDB Endowment, 2008, 1(1): 408-421.
  • 5Cao Liangliang, Kim H D, Tsai M H. Delta-SimRank com- puting on MapReduce[C]//Proceedings of the 1 st Interna- tional Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms Systems Programming Models and Applications (BigMine '12), Beijing, China, 2012. New York, NY, USA: ACM, 2012: 28-35.
  • 6Li Lina, Li Cuiping, Chen Hong, el al. Mapreduce-based SimRank computation and its application in social recom- mender systern[C]//Proceedings of the 2013 IEEE Interna- tional Congress on Big Data (BigData '13), Santa Clara, USA, 2013. Piscataway, NJ, USA: IEEE, 2013: 133-140.
  • 7Malewicz G, Austern M H, Bik A J, et al. Pregel: a system for large-scale graph processing[C]//Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data (SIGMOD '10), Ashraf Aboulnaga, 2010. New York, NY, USA: ACM, 2010: 135-146.
  • 8Xi Wenxi, Fox E A, Fan Weiguo, et al. Simfusion: measuring similarity using unified relationship matrix[C]//Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '05), Salvador, Brazil, 2005. New York, NY, USA: ACM, 2005: 130-137.
  • 9Jeh G, Widom J. SimRank: a measure of structural-context similarity[C]//Proceedings of the 8th ACM SIGKDD Inter- national Conference on Knowledge Discovery and Data Mining (KDD '02), Strathcona, USA, 2002. New York, NY, USA: ACM, 2002: 538-543.
  • 10Page L, Brin S, Motwani R, el al. The pagerank citation ranking: bringing order to the Web[R/OL]. Stanford Univer- sity Database Group (1998)[2014-03-10]. http://citeseer, nj. nec.com/368196.html.

二级参考文献15

  • 1Shah V B, Adviser-Gilbert J R. An interactive system for combinatorial scientific computing with an emphasis on programmer productivity[M]. Santa Barbara: University of California at Santa Barbara, 2007.
  • 2Rabin M 0, Vazirani V V. Maximum matchings in general graphs through randomization[J]. Journal of Algorithms, 1989, 10(4): 557-567.
  • 3Yuster R, Zwick U. Detecting short directed cycles using rectangular matrix multiplication and dynamic program?ming[C]/lProceedings of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '04). Philadel?phia, PA, USA: Society for Industrial and Applied Mathe?matics, 2004: 254-260.
  • 4Zhang Yudong, Wu Lenan, Wei Geng, et al. A novel algo?rithm for all pairs shortest path problem based on matrix multiplication and pulse coupled neural network[J]. Digital Signal Processing, 2011, 21(4): 517-521.
  • 5Qian Zhengping, Chen Xiuwei, Kang Nanxi, et al. Mad?LINQ: large-scale distributed matrix computation for the cloud[C]/lProceedings of the 7th ACM European Confer?ence on Computer Systems (EuroSys '12), Bern, Switzer?land, 2012: 197-210.
  • 6Loos S M, Wise D S. Strassen' s matrix multiplication rela?beled[EB/OL]. (2009)[20 13-03]. http://src.acm.org/loos/loos. html.
  • 7Kang U, Tsourakakis C E, Faloutsos C. PEGASUS: mining peta-scale graphs[J]. Knowledge and Information Systems, 20 II, 27(2): 303-325.
  • 8Seo S, Yoon E J, Kim J, et al. HAMA: an efficient matrix computation with the MapReduce framework[C]//Proceedings of the IEEE 2nd International Conference on Cloud Com?puting Technology and Science (CloudCom ' 10). Wasbing?ton, DC, USA: IEEE Computer Society, 2010: 721-726.
  • 9Saad Y. Iterative methods for sparse linear systems[M). Phila?delphia, PA, USA: Society for Industrial and Applied Mathe?matics, 2003.
  • 10Park S C, Draayer J P, Zbeng S Q. Fast sparse matrix multi?plication[J]. Computer Physics Communications, 1992, 70(3): 557-568.

共引文献2

同被引文献15

  • 1Lieberman E,Hauert C,Nowak M A.Evolutionary dynamics on graphs[J].Nature,2005,433(20):312-316.
  • 2Shakarian P,Roos P,Johnson A.A review of evolutionary graph theory with applications to game theory[J].Bio Systems,2012,107(2):66-80.
  • 3Broom M,Rychtˇr J,Stadler B T.Evolutionary dynamics on graphs—the effect of graph structure and initial placement on mutant spread[J].The Journal of Statistical Theory and Practice,2011,5(3):369-381.
  • 4Ohtsuki H,Hauert C,Lieberman E,et al.A simple rule for the evolution of cooperation on graphs and social networks[J].Nature,2006,441(7092):502-505.
  • 5Nie Puyan,Zhang Peiai.Evolutionary graphs with frequency dependent fitness[J].Internal Journal of Modern Physics B,2009,23(4):537-543.
  • 6Nie Puyan,Zhang Peiai.Fixation time for evolutionary graphs[J].International Journal of Modern Physics B,2010,24(27):5285-5293.
  • 7Zhang Peiai,Nie Puyan,Hu Daiqiang.Bi-level evolutionary graphs with multi-fitness[J].IET Systems Biology,2010,4(1):33-38.
  • 8Zhang Peiai.Fixation probabilities of evolutionary graphs bassed on the positions of new appearing mutants[DB/OL].http://www.hindawi.com/journals/jam/2014/901363/2014-04-17.
  • 9Nie Puyan.Evolutionary graphs on two levels[J].Ars Combinatoria,2008,86:115-120.
  • 10Whigham P A,Dick G.Evolutionary dynamics for the spatial Moran process[J].Genetic Programming and Evolvable Machines,2008,9(2):157-170.

引证文献2

二级引证文献1

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部