Abstract
This paper surveys how data cleaning techniques are described and defined in the existing literature, and briefly introduces several error detection methods that automatically identify potential errors in data sets. It reports experimental results on real-world data sets that support the use of such methods, and discusses future research directions for the data cleaning problem.
Source
《山东科技大学学报(自然科学版)》
CAS
2004, Issue 2, pp. 55-57 (3 pages)
Journal of Shandong University of Science and Technology(Natural Science)
Keywords
data cleaning
error detection
pattern
clustering
association rules