A Human-Centered Process Model for Data Cleansing under Data Warehousing

Cited by: 15
Abstract: Data cleansing is an important step in both data warehousing and data mining. This paper first reviews the concepts related to data cleansing, lists the data quality problems that the cleansing process must resolve, and summarizes the techniques and methods for resolving them. On this basis, it proposes a human-centered process model for data cleansing. The model integrates workflow, data integration, data transformation, and data mining techniques. The paper also describes the basic functions that each toolkit should provide.
Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2004, No. 5, pp. 52-55 (4 pages)
Funding: Supported by the National Natural Science Foundation of China (Grant No. 60173051)
Keywords: data warehouse; user; data cleansing; data mining; quality problems; workflow technology; data integration; data transformation; process model; data quality