Abstract
AI datasets in China currently face two challenges: the lack of quality evaluation methods and an unclear capability-building system. This article reviews the composition and classification of artificial intelligence (AI) datasets and, drawing on structured data quality assessment, proposes a set of methods for evaluating AI dataset quality. Based on industry practice, it distills a high-quality AI data engineering system for enterprises and a pathway for building the corresponding capabilities. Finally, it offers policy recommendations for building high-quality datasets in China.
Authors
姜春宇
白玉真
刘渊
王超伦
JIANG Chunyu; BAI Yuzhen; LIU Yuan; WANG Chaolun (China Academy of Information and Communication Technology, Beijing 100191, China)
Source
《大数据》
2025, No. 6, pp. 47-56 (10 pages in total)
Big Data Research