
大数据分析的神经网络方法 (cited by: 94)

Big Data Analysis Using Neural Networks
Abstract: Big data carries great social, economic, and scientific value and has become a focus of both academia and industry. Its key technologies can be divided into three levels: the data platform, the analysis platform, and the presentation platform. Of these, the analysis platform is the bridge that turns big data into value. Big data is generally characterized by the "4V" properties: large volume, multi-source heterogeneity (variety), rapid generation (velocity), and sparse value. These properties enlarge the value space of big data, but they also pose great challenges for its analysis. Three challenges stand out: multi-source heterogeneous data, the storage of massive amounts of unstructured data, and value that is sparse and changes quickly. The three corresponding core scientific problems are the representation, storage, and prediction of big data. Because traditional data analysis methods are not up to the task, new methods for big data analysis are needed.
The human brain is a natural big data processing engine. Neural networks are a computational approach that simulates the brain's mechanisms for analyzing big data, and they are currently the most successful method for big data analysis. Research on neural networks has two main strands: simulating the structure of neural networks in the brain to build structural models, and simulating the brain's memory mechanisms to develop learning algorithms. The history of neural network research has had its ups and downs. In recent years, with the steady growth of computing power, big data analysis based on neural networks has achieved great success, particularly in applications such as speech, image, and medical big data analysis, and it has led the development of artificial intelligence. AlphaGo's victory over a human champion at Go attracted wide public attention. "Big data + neural networks" has become an important force driving innovation, promoting social development, and changing how people work and live. Taking big data and neural networks as its thread, this paper reviews the basic concepts and key technologies of big data and outlines the basic framework of neural network research, showing that the two fit together closely and reinforce each other. On the one hand, neural networks have strong feature extraction and abstraction capabilities: they can integrate information from multiple sources, process heterogeneous data, and capture dynamic changes, which makes them the bridge for turning big data into value. On the other hand, the sheer volume of big data supplies abundant training samples, making it possible to train ever larger neural networks.
Although "big data + neural networks" has achieved breakthroughs in many application areas, core scientific problems remain. On the neural network side, network structure needs further study, the choice of network size still lacks theoretical guidance, and learning algorithms still have inherent problems. With respect to the three core problems of big data analysis, the open questions are how to ensure that sparse representations in high-dimensional spaces preserve the consistency of the data, how to "store only knowledge rather than raw data", and how to characterize the spatio-temporal correlations in data so that big data can be used for prediction. Continued investment in this area is therefore needed, in both applied and theoretical research. Cross-disciplinary research is especially important: work that parallels big data processing in the human brain and draws on cognitive science, neuroscience, and related disciplines can help resolve the core scientific problems in neural networks and big data applications and advance big data analysis based on neural network methods.
Source: Advanced Engineering Sciences (《工程科学与技术》), 2017, No. 1, pp. 9-18 (10 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list (北大核心).
Funding: Key Program of the National Natural Science Foundation of China (61432012, U1435213)
Keywords: big data; neural networks; artificial intelligence