Web Information Extraction Method Based on MapReduce (基于MapReduce的网络信息提取方法)

Cited by: 3
Abstract: Web information extraction has become very important for extracting required information from massive data quickly and accurately. To address the challenges posed by large-scale computation, this paper proposes a MapReduce-based method for web information extraction. Using Taobao as the data source, the method extracts users' degree of interest in commodities. Simulation experiments show that the method is efficient and adapts well to the extraction of massive web information.
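Affiliation: Tongling University (铜陵学院)
Source: Journal of Anhui Science and Technology University (《安徽科技学院学报》), 2013, No. 2, pp. 72-75 (4 pages)
Funding: Provincial Natural Science Research Project of Anhui Higher Education Institutions (KJ2012Z417); Tongling University Talent Research Start-up Fund (2011tlxyrc09)
Keywords: Information extraction; Cloud computing; MapReduce; Hadoop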

References (12)

  • 1. Wang Peng. Key Technologies and Application Examples of Cloud Computing [M]. Beijing: Posts & Telecom Press, 2009. (in Chinese)
  • 2. Guo Yan. Research on web information extraction technology [J]. Information Technology Letter, 2008(6): 15-23. (in Chinese)
  • 3. Amazon Incorporation. Amazon elastic compute cloud [EB/OL]. http://aws.amazon.com/ec2. [2010-04-30/2012-09-06].
  • 4. Sun Shaoling, Luo Zhiguo, Xu Meng, Qian Ling, Wang Xu. Research and implementation of cloud computing and its applications [J]. Telecom Engineering Technics and Standardization, 2009, 22(11): 2-7. (in Chinese) Cited by: 21
  • 5. WINANS B, BROWN S. Cloud computing: a collection of working papers [EB/OL]. http://www.johnseelybrown.com/cloudcomputingpapers.pdf. [2010-01-28/2012-09-06].
  • 6. Hadoop [EB/OL]. http://hadoop.apache.org/common/docs/r0.18.2/cn/quickstart.html. [2008-12-01/2012-09-06].
  • 7. Amazon introduces Elastic MapReduce (Hadoop Framework) Service [EB/OL]. http://www.Byleonic.com/2009/amazon-introduces-elastic-mapreduce-Hadoop-framework-service/. [2010-09-27/2012-09-06].
  • 8. Hadoop Map/Reduce Tutorial [EB/OL]. http://hadoop.apache.org/common/docs/r0.18.2/mapred-tutorial.html. [2010-09-27/2012-09-06].
  • 9. Intelligence Science. PDMiner: a parallel distributed data mining platform based on Hadoop [EB/OL]. http://www.intsei.ac.cn/pdrn/msmirer.hml. [2010-06-23/2012-09-06]. (in Chinese)
  • 10. China Mobile Labs. Research on a parallel data mining tool platform based on mobile computing (I) [EB/OL]. http://labs.chinamobile.com. [2009-03-25/2012-09-06]. (in Chinese)

Secondary references (47)

  • 1. Xi Jingke, Yan Dashun. Research on data integration in Web data mining [J]. Computer Engineering and Design, 2006, 27(8): 1366-1368. (in Chinese) Cited by: 6
  • 2. Sims K. IBM introduces ready-to-use cloud computing collaboration services get clients started with cloud computing. 2007. http://www-03.ibm.com/press/us/en/pressrelease/22613.wss
  • 3. Boss G, Malladi P, Quan D, Legregni L, Hall H. Cloud computing. IBM White Paper, 2007. http://download.boulder.ibm.com/ibmdl/pub/software/dw/wes/hipods/Cloud_computing_wp_final_8Oct.pdf
  • 4. Zhang YX, Zhou YZ. 4VP+: A novel meta OS approach for streaming programs in ubiquitous computing. In: Proc. of IEEE the 21st Int'l Conf. on Advanced Information Networking and Applications (AINA 2007). Los Alamitos: IEEE Computer Society, 2007. 394-403.
  • 5. Zhang YX, Zhou YZ. Transparent Computing: A new paradigm for pervasive computing. In: Ma JH, Jin H, Yang LT, Tsai JJP, eds. Proc. of the 3rd Int'l Conf. on Ubiquitous Intelligence and Computing (UIC 2006). Berlin, Heidelberg: Springer-Verlag, 2006. 1-11.
  • 6. Barroso LA, Dean J, Holzle U. Web search for a planet: The Google cluster architecture. IEEE Micro, 2003, 23(2): 22-28.
  • 7. Brin S, Page L. The anatomy of a large-scale hypertextual Web search engine. Computer Networks, 1998, 30(1-7): 107-117.
  • 8. Ghemawat S, Gobioff H, Leung ST. The Google file system. In: Proc. of the 19th ACM Symp. on Operating Systems Principles. New York: ACM Press, 2003. 29-43.
  • 9. Dean J, Ghemawat S. MapReduce: Simplified data processing on large clusters. In: Proc. of the 6th Symp. on Operating System Design and Implementation. Berkeley: USENIX Association, 2004. 137-150.
  • 10. Burrows M. The chubby lock service for loosely-coupled distributed systems. In: Proc. of the 7th USENIX Symp. on Operating Systems Design and Implementation. Berkeley: USENIX Association, 2006. 335-350.
