Abstract
Web information extraction has become essential for retrieving the required information from massive data quickly and accurately. To address the challenges posed by large-scale computation, this paper proposes a MapReduce-based method for Web information extraction. Using Taobao as the data source, the method extracts the degree of users' interest in commodities. Simulation experiments show that the method is efficient and adapts well to extracting information from massive Web data.
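As a rough illustration of how a MapReduce job could aggregate user interest in commodities, the following minimal sketch runs the map, shuffle, and reduce phases in memory. The record layout `(user, item, action)`, the `ACTION_WEIGHTS` table, and all function names are assumptions made for illustration; the abstract does not specify the paper's actual interest measure or implementation.

```python
from collections import defaultdict

# Hypothetical action weights: the abstract does not define how interest is scored.
ACTION_WEIGHTS = {"view": 1, "favorite": 3, "purchase": 5}


def map_record(record):
    """Map step: emit ((user, item), weight) for one log record."""
    user, item, action = record
    weight = ACTION_WEIGHTS.get(action, 0)
    if weight:  # records with unknown actions are skipped
        yield (user, item), weight


def shuffle(pairs):
    """Group mapped values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_interest(key, values):
    """Reduce step: sum the weights collected for one (user, item) key."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce sequentially over an in-memory record list."""
    mapped = (pair for record in records for pair in map_record(record))
    return dict(reduce_interest(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    sample = [
        ("u1", "phone", "view"),
        ("u1", "phone", "purchase"),
        ("u2", "phone", "view"),
        ("u1", "book", "favorite"),
        ("u2", "phone", "share"),  # not in ACTION_WEIGHTS, so ignored
    ]
    for (user, item), score in sorted(run_job(sample).items()):
        print(user, item, score)
```

On a real cluster the map and reduce functions would run in parallel across partitions of the log data, which is where the efficiency on massive data would come from; the sequential driver here only shows the data flow.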
Source
Journal of Anhui Science and Technology University (《安徽科技学院学报》)
2013, No. 2, pp. 72-75 (4 pages)
Funding
Anhui Provincial Natural Science Research Project for Universities (KJ2012Z417)
Tongling University Talent Research Start-up Fund Project (2011tlxyrc09)