Abstract
In practice, massive data are often generated continuously, and how to mine information and knowledge from them dynamically has become a theoretical problem of wide interest. Following the idea of feeding a data set into processing in batches and in steps, and taking the copula function as the theoretical basis, this paper presents an efficient stepwise algorithm for measuring correlation in massive data. Simulation experiments verify the feasibility of the algorithm. The results show that the proposed algorithm markedly improves the efficiency of measuring correlation effects and can effectively handle the measurement of correlation effects in super-massive, or even infinite, data.
Source
《统计与信息论坛》
CSSCI
2014, No. 4, pp. 10-13 (4 pages)
Journal of Statistics and Information
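Funding
National Social Science Fund of China project "Research on the Theory and Application of Multi-Factor Spatial Sampling Surveys" (13BTJ006)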
Keywords
correlation effect
Copula function
mining algorithm