Abstract
Open-sourcing a large model requires releasing not only the conventional software components, such as the model architecture and training code, but also the model parameters and datasets. Under the analytical frameworks of the "four-factor test" and the "three-step test", and considering in particular the transformative nature and purpose of datasets distributed under open licenses as well as the public interest in technological development and application, the distribution of datasets for open-source large models can be regarded as fair use and therefore does not require a copyright license from upstream right holders. This conclusion satisfies governance requirements for artificial-intelligence transparency and also helps promote knowledge sharing.
Authors
赵云虎
杨宇宙
秦琳
ZHAO Yunhu; YANG Yuzhou; QIN Lin (Open-Source Innovation and Digital Governance Research Institute, Shanghai University of International Business and Economics, Shanghai 200120, China; Department of Intellectual Property Rights, Beijing Dacheng Law Offices, LLP (Shanghai), Shanghai 200120, China)
Source
《华东师范大学学报(自然科学版)》
Peking University Core Journal
2025, Issue 5, pp. 183-190 (8 pages)
Journal of East China Normal University(Natural Science)
Keywords
artificial intelligence act
open-source large model
licensing
dataset
fair use