
Research on Some Basic Problems in Data Quality Control

Original title: 数据质量控制研究中若干基本问题. Cited by: 16
Abstract: The definition of data quality, the sources of data quality problems, and the approaches to improving data quality are basic problems that underlie data quality control research. This paper analyzes the limitations and one-sidedness of existing definitions of data quality and redefines data quality according to the definition of quality given by the International Organization for Standardization (ISO). It divides the sources of data quality problems into four categories: data entry errors, measurement errors, distillation errors, and data integration errors. It summarizes concrete means of improving data quality and points out that data quality control requires a combination of management and technical measures. The paper corrects misconceptions about these basic problems and provides a basis for more in-depth research on data quality.
Source: 《微计算机信息》 (Control & Automation), 2010, No. 9, pp. 12-14 (3 pages)
Keywords: data quality; data quality control; data error; data integration; total data quality management

