Artificial Intelligence (AI) is often regarded as the modern-day equivalent of electricity, powering countless human interactions daily. However, startups and developing nations face a clear disadvantage as Big Tech companies and richer nations dominate the field, especially when it comes to two critical areas: training datasets and computational power.
The global regulatory landscape for AI is highly complex and fragmented, with regulations and public-private collaborations varying widely across jurisdictions and stakeholders. This complexity is further exacerbated by the need to harmonize regulatory frameworks and standards across international borders.
The regulations governing fair use of AI training datasets differ across regions. For instance, the European Union’s AI Act prohibits the use of copyrighted materials for training AI models without explicit authorization from rights holders. Conversely, Japan’s Text and Data Mining (TDM) law permits the use of copyrighted data for AI model training, without distinguishing between legally and illegally accessed materials. In contrast, China has introduced several principles and regulations to govern the use of AI training datasets that are more in line with the EU in that they require the training data to be lawfully obtained. However, those regulations only target AI services accessible to the general public and exclude those developed and used by enterprises and research institutions.
The regulatory environment often shapes a startup’s trajectory, significantly influencing its ability to innovate and scale. An AI startup focused on training models—whether in the pre-training or post-training phase—will encounter varying regulatory challenges that could affect its long-term success, depending on the region in which it operates. For example, a startup in Japan would have an advantage over one in the EU when it comes to crawling copyrighted internet data and using it to train powerful AI models, because it would be protected by Japan’s TDM law. Given that AI technologies transcend national borders, cross-border solutions and global cooperation among key stakeholders are necessary.
In terms of computational power, a significant disparity exists between large players—whether state-owned or private entities—and startups. Bigger tech companies and state entities have the resources to buy and hoard computational power that would support their future AI development goals, whereas smaller players that do not have those resources depend on the bigger players for AI training and inference infrastructure. The supply chain issues surrounding compute resources have intensified this gap, which is even more pronounced in the global South. For example, out of the top 100 high-performance computing (HPC) clusters in the world capable of training large AI models, not one is hosted in a developing country.
In October 2023, the UN’s High-Level Advisory Body (HLAB) on AI was formed as part of the UN Secretary-General’s Roadmap for Digital Cooperation and was designed to offer UN member states analysis and recommendations for the international governance of AI. The group is made up of 39 people with diverse backgrounds (by geography, gender, age, and discipline), spanning government, civil society, the private sector, and academia, to ensure recommendations for AI governance are both fair and inclusive.
As part of this process, we conducted interviews with experts from startups and small-to-medium enterprises (SMEs) to explore the challenges they face in relation to AI training datasets. Their feedback underscored the importance of a neutral, international body, such as the United Nations, in overseeing the international governance of AI.
The HLAB’s recommendations on AI training dataset standards, covering both pre-training and post-training, are detailed in the new report Governing AI for Humanity and include the following:
1. Establishing a global marketplace for the exchange of anonymized data that standardizes data-related definitions, principles for the global governance of AI training data and AI training data provenance, and transparent, rights-based accountability. This includes introducing data stewardship and exchange processes and standards (an illustrative sketch of what such a provenance record could look like follows this list).
2. Promoting data commons that incentivize the curation of underrepresented or missing data.
3. Ensuring interoperability for international data access.
4. Creating mechanisms to compensate data creators in a rights-respecting manner.
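To picture the first recommendation more concretely, below is a minimal sketch of a provenance record that a dataset listed on such an exchange could carry. It is illustrative only: the `ProvenanceRecord` class, its field names, and the example values are assumptions made for this sketch, not definitions taken from the HLAB report Governing AI for Humanity or from any existing standard.

```python
# Illustrative only: a hypothetical provenance record for a dataset offered on a
# global exchange of anonymized AI training data. Field names and allowed values
# are assumptions for the sake of example, not an HLAB or industry standard.
from dataclasses import dataclass, field, asdict
from datetime import date
import json


@dataclass
class ProvenanceRecord:
    dataset_id: str             # stable identifier assigned by the exchange
    source_description: str     # where the data came from (e.g., a licensed news archive)
    jurisdiction: str           # legal regime under which the data was collected
    lawful_basis: str           # e.g., "license", "TDM exception", "public domain"
    anonymized: bool            # whether personal data has been removed or de-identified
    intended_use: str           # e.g., "pre-training" or "post-training/fine-tuning"
    creators_compensated: bool  # whether a rights-respecting compensation scheme applies
    collection_date: date = field(default_factory=date.today)

    def to_json(self) -> str:
        """Serialize the record so it can be published alongside the dataset."""
        record = asdict(self)
        record["collection_date"] = self.collection_date.isoformat()
        return json.dumps(record, indent=2)


if __name__ == "__main__":
    # Example listing: a hypothetical, lawfully licensed and anonymized text corpus.
    example = ProvenanceRecord(
        dataset_id="exchange-000123",
        source_description="Licensed newspaper archive, 2000-2020",
        jurisdiction="EU",
        lawful_basis="license",
        anonymized=True,
        intended_use="pre-training",
        creators_compensated=True,
    )
    print(example.to_json())
```

A shared record format of this kind is one conceivable way to operationalize the data-related definitions, provenance tracking, and transparent, rights-based accountability the recommendation calls for, since data providers, model developers, and regulators would all be reading the same fields.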
To address the compute gap, the HLAB proposes the following recommendations:
1. Developing a network for capacity building under common-benefit frameworks to ensure equitable distribution of AI’s benefits.
2. Establishing a global fund to support access to computational resources for researchers and developers aiming to apply AI to local public interest use cases.
International governance of AI, particularly of training datasets and computational power, is crucial for startups and developing nations. It provides a robust framework for accessing essential resources and fosters international cooperation, positioning startups to innovate and scale responsibly in the global AI landscape.
Nazneen Rajani, PhD, is the CEO of Collinear AI and a member of the UN’s High-Level Advisory Body on AI.
The opinions expressed in Fortune.com commentary pieces are solely the views of their authors and do not necessarily reflect the opinions and beliefs of Fortune.