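LIN Tao. Study on constraints and policy responses for production and circulation of AI training data in China[J]. Bulletin of Chinese Academy of Sciences, 2025, 40(4): 672-680.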

Study on constraints and policy responses for production and circulation of AI training data in China
Author
LIN Tao1,2
The Institute for International Affairs, Qianhai, The Chinese University of Hong Kong, Shenzhen, Shenzhen 518172, China; Department of Political Science, University of Washington, Seattle 98105, USA
Keywords
artificial intelligence; training data; data element circulation; intellectual property; personal information protection; public data
Abstract
The quantity and quality of training data are critical to the performance of artificial intelligence (AI) models. However, training data produced in China is currently insufficient in quantity, low in quality, and fragmented in distribution, and its production is constrained by the commercial ecosystem, regulatory policy, and the limited development and utilization of public data. To address these problems, this study proposes a series of policy recommendations: encouraging research institutions to produce open-source datasets, fostering AI application scenarios, adopting a “lenient entry, strict exit” regulatory approach, introducing intellectual property exemption provisions, refining the implementation rules for personal information protection, and accelerating the construction of a unified national public data platform.
DOI: 10.16418/j.issn.1000-3045.20241204003