斯坦福报告：AI图像生成工具使用大量儿童性虐待图片进行训练

Matt O'Brien, Haleluya Hadero, 美联社

2024-01-13

AI图像生成工具到底如何训练？

图片来源：GETTY IMAGES

一项最新报告披露，热门AI图片生成工具使用了数千张儿童性虐待图片进行训练，该报告呼吁相关公司采取措施，解决它们开发的技术存在的有害缺陷。

这些图片使AI系统更容易生成逼真露骨的虚假儿童图像，并且可以将青少年在社交媒体上穿着衣服的照片变成裸体照片，这引起了世界各地学校和执法部门的警惕。

直到最近，反虐待研究人员还认为，一些不受控制的AI工具生成非法儿童图像的唯一方法，就是把它们从成人色情内容和良性的儿童照片这两组在线图像中提取的信息组合在一起。

但斯坦福互联网观察站（Stanford Internet Observatory）在庞大的AI数据库LAION中发现了3,200多张疑似儿童性虐待图片。LAION是一个在线图片与标题索引，被用于训练Stable Diffusion等当前领先的图像生成工具。该观察组织来自斯坦福大学（Stanford University）。它与加拿大儿童保护中心（Canadian Centre for Child Protection）和其他反虐待慈善机构合作，发现非法材料，并将原始照片链接举报给执法机关。

它们的行动很快得到响应。在2023年12月20日斯坦福互联网观察站报告发布前夜，LAION对美联社（The Associated Press）表示，它已经临时移除了其数据集。

LAION是非营利组织大规模AI开放网络（Large-scale Artificial Intelligence Open Network）的缩写。该组织在一份声明中称，其“对于非法内容坚持零容忍的政策，我们采取了高度谨慎的做法，把LAION数据集下线，会在保证安全之后再重新发布。”

虽然这些图片在LAION约58亿张图片索引里只是九牛一毛，但斯坦福互联网观察站指出，它可能会影响AI工具生成有害结果的能力，并让多次出现的真实受害者再次回想起先前遭到的虐待。

报告的作者、斯坦福互联网观察站的首席技术专家大卫·泰尔表示，这个问题并不容易解决，原因能够追溯到许多生成式AI项目因为竞争激烈而“急于上市”，并大范围推广。

泰尔在接受采访时说：“抓取整个互联网上的数据，并把它做成用于训练模型的数据集，这种做法即便要做，也本应仅限于研究用途，而且在没有经过更严格审视的情况下不应该开源。”

LAION的一个主要用户是位于英国伦敦的初创公司Stability AI，它为LAION数据集的开发提供了帮助。Stability AI开发了文本生成图片的模型Stable Diffusion。斯坦福的报告称，虽然新版Stable Diffusion使用户更难生成有害内容，但2022年发布的一个旧版本（Stability AI称其并未发布该版本）依然被整合到其他应用和工具当中，而且仍然是“最受欢迎的生成露骨图片的模型”。

加拿大儿童保护中心的信息技术总监劳埃德·理查森表示：“我们无法回收这款模型。它被许多人安装在本地的机器上。”加拿大儿童保护中心负责运营加拿大的在线性剥削举报热线。

Stability AI在12月20日表示，其仅提供经过筛查的Stable Diffusion版本，并且“自从接管了对Stable Diffusion的独家开发任务之后，公司便积极采取了预防措施，以减少其被滥用的风险。”

该公司在一份事先准备的声明里称：“这些过滤工具会阻止不安全的内容进入模型。这样做又可以反过来帮助阻止模型生成不安全的内容。”

LAION源自德国研究人员和教师克里斯托弗·舒曼提出的一种理念。他在2023年早些时候告诉美联社，他之所以希望把一个如此庞大的可视化数据库对外公开，部分原因是为了确保未来AI的发展不会被几家强大的公司所控制。

他说：“如果我们能够将AI发展民主化，使整个研究界和全人类都可以从中受益，这将是更安全、更公平的做法。”

LAION的大部分数据来自另外一个数据库Common Crawl。Common Crawl不断从开放互联网中抓取数据，但其执行董事里奇·斯克伦塔指出，LAION“有义务”在使用数据之前进行扫描和过滤。

LAION在2023年年底表示，其开发了“严格的过滤工具”，能够在发布数据集之前检测和移除非法内容，并且依旧在努力完善这些工具。斯坦福的报告承认，LAION的开发者曾经试图过滤掉“未成年”露骨内容，但如果他们更早征求儿童安全专家的意见，本可以做得更好。

许多文本生成图片的工具都使用了LAION数据库进行训练，但尚不确定具体的名单。DALL-E和ChatGPT的开发者OpenAI表示，其并未使用LAION，并且改进了其模型，能够拒绝涉及未成年人的性内容请求。

谷歌（Google）的文本生成图像工具Imagen模型基于LAION的数据集，但2022年，由于谷歌对数据库审查后“发现了大量不良内容，包括色情图像、种族歧视性语言和有害的社会刻板印象”，因此公司决定放弃公开发布该模型。

追溯性清除相关数据困难重重，因此斯坦福互联网观察站呼吁采取更激进的措施。其中一项措施是，任何人如果基于LAION-5B（该数据集包含超过50亿个图片-文本数据对，因此而得名）构建了训练数据集，就应该“删除数据集，或者与中间方合作清理相关材料”。另外一项措施是让旧版Stable Diffusion从互联网上除最阴暗角落之外的所有地方消失。

泰尔表示，“合法平台可以停止提供相关版本下载”，尤其是在工具被频繁用于生成不良图像且没有阻止此类行为的安全防护措施的情况下。

例如，泰尔点名了CivitAI平台。该平台被人们用于制作AI生成的色情内容而受到欢迎，但该平台缺乏杜绝生成儿童图片的安全措施。报告中还呼吁AI公司Hugging Face采取更有效的方法，举报和删除虐待材料的链接。Hugging Face为模型提供训练数据。

该公司称，它长期与监管部门和儿童安全团体合作，识别和删除儿童虐待材料。CivitAI并未回复在其网页提交的置评请求。

斯坦福的报告还质疑，根据联邦《儿童在线隐私保护法案》（Children’s Online Privacy Protection Act）规定的保护措施，未经家人同意，是否应该把任何儿童的照片，即便是最良性的照片，输入AI系统。

反儿童性虐待组织Thorn的数据科学总监瑞贝卡·波特诺夫表示，她所在机构的研究发现，虽然AI生成的儿童性虐待图像在虐待者中并不流行，但这类图像的流传范围正在持续扩大。

开发者可以通过确保开发AI模型所使用的数据集中不含儿童虐待材料，来减轻这些伤害。波特诺夫称，即使在模型已经流通之后，仍旧有机会减少这类有害的使用。

科技公司和儿童安全团体目前会为视频和图像分配“哈希值”，即独特的数字签名，用以跟踪和移除儿童虐待内容。波特诺夫指出，这一理念同样适用于被滥用的AI模型。

她说：“AI行业目前还没有这样做。但我认为，他们可以而且应该采取这种措施。”（财富中文网）

译者：刘进龙

审校：汪皓

Hidden inside the foundation of popular artificial intelligence image-generators are thousands of images of child sexual abuse, according to a new report that urges companies to take action to address a harmful flaw in the technology they built.

Those same images have made it easier for AI systems to produce realistic and explicit imagery of fake children as well as transform social media photos of fully clothed real teens into nudes, much to the alarm of schools and law enforcement around the world.

Until recently, anti-abuse researchers thought the only way that some unchecked AI tools produced abusive imagery of children was by essentially combining what they’ve learned from two separate buckets of online images — adult pornography and benign photos of kids.

But the Stanford Internet Observatory found more than 3,200 images of suspected child sexual abuse in the giant AI database LAION, an index of online images and captions that’s been used to train leading AI image-makers such as Stable Diffusion. The watchdog group based at Stanford University worked with the Canadian Centre for Child Protection and other anti-abuse charities to identify the illegal material and report the original photo links to law enforcement.

The response was immediate. On the eve of the December 20 release of the Stanford Internet Observatory’s report, LAION told The Associated Press it was temporarily removing its datasets.

LAION, which stands for the nonprofit Large-scale Artificial Intelligence Open Network, said in a statement that it “has a zero tolerance policy for illegal content and in an abundance of caution, we have taken down the LAION datasets to ensure they are safe before republishing them.”

While the images account for just a fraction of LAION’s index of some 5.8 billion images, the Stanford group says it is likely influencing the ability of AI tools to generate harmful outputs and reinforcing the prior abuse of real victims who appear multiple times.

It’s not an easy problem to fix, and traces back to many generative AI projects being “effectively rushed to market” and made widely accessible because the field is so competitive, said Stanford Internet Observatory’s chief technologist David Thiel, who authored the report.

“Taking an entire internet-wide scrape and making that dataset to train models is something that should have been confined to a research operation, if anything, and is not something that should have been open-sourced without a lot more rigorous attention,” Thiel said in an interview.

A prominent LAION user that helped shape the dataset’s development is London-based startup Stability AI, maker of the Stable Diffusion text-to-image models. New versions of Stable Diffusion have made it much harder to create harmful content, but an older version introduced in 2022 — which Stability AI says it didn’t release — is still baked into other applications and tools and remains “the most popular model for generating explicit imagery,” according to the Stanford report.

“We can’t take that back. That model is in the hands of many people on their local machines,” said Lloyd Richardson, director of information technology at the Canadian Centre for Child Protection, which runs Canada’s hotline for reporting online sexual exploitation.

Stability AI on December 20 said it only hosts filtered versions of Stable Diffusion and that “since taking over the exclusive development of Stable Diffusion, Stability AI has taken proactive steps to mitigate the risk of misuse.”

“Those filters remove unsafe content from reaching the models,” the company said in a prepared statement. “By removing that content before it ever reaches the model, we can help to prevent the model from generating unsafe content.”

LAION was the brainchild of a German researcher and teacher, Christoph Schuhmann, who told the AP earlier in 2023 that part of the reason to make such a huge visual database publicly accessible was to ensure that the future of AI development isn’t controlled by a handful of powerful companies.

“It will be much safer and much more fair if we can democratize it so that the whole research community and the whole general public can benefit from it,” he said.

Much of LAION’s data comes from another source, Common Crawl, a repository of data constantly trawled from the open internet, but Common Crawl’s executive director, Rich Skrenta, said it was “incumbent on” LAION to scan and filter what it took before making use of it.

LAION said at the end of 2023 that it had developed “rigorous filters” to detect and remove illegal content before releasing its datasets and is still working to improve those filters. The Stanford report acknowledged LAION’s developers made some attempts to filter out “underage” explicit content but might have done a better job had they consulted earlier with child safety experts.

Many text-to-image generators are derived in some way from the LAION database, though it’s not always clear which ones. OpenAI, maker of DALL-E and ChatGPT, said it doesn’t use LAION and has fine-tuned its models to refuse requests for sexual content involving minors.

Google built its text-to-image Imagen model based on a LAION dataset but decided against making it public in 2022 after an audit of the database “uncovered a wide range of inappropriate content including pornographic imagery, racist slurs, and harmful social stereotypes.”

Trying to clean up the data retroactively is difficult, so the Stanford Internet Observatory is calling for more drastic measures. One is for anyone who’s built training sets off of LAION‐5B — named for the more than 5 billion image-text pairs it contains — to “delete them or work with intermediaries to clean the material.” Another is to effectively make an older version of Stable Diffusion disappear from all but the darkest corners of the internet.

“Legitimate platforms can stop offering versions of it for download,” particularly if they are frequently used to generate abusive images and have no safeguards to block them, Thiel said.

As an example, Thiel called out CivitAI, a platform that’s favored by people making AI-generated pornography but which he said lacks safety measures to guard against making images of children. The report also calls on AI company Hugging Face, which distributes the training data for models, to implement better methods to report and remove links to abusive material.

Hugging Face said it is regularly working with regulators and child safety groups to identify and remove abusive material. CivitAI didn’t return requests for comment submitted to its webpage.

The Stanford report also questions whether any photos of children — even the most benign — should be fed into AI systems without their family’s consent due to protections in the federal Children’s Online Privacy Protection Act.

Rebecca Portnoff, the director of data science at the anti-child sexual abuse organization Thorn, said her organization has conducted research that shows the prevalence of AI-generated images among abusers is small, but growing consistently.

Developers can mitigate these harms by making sure the datasets they use to develop AI models are clean of abuse materials. Portnoff said there are also opportunities to mitigate harmful uses down the line after models are already in circulation.

Tech companies and child safety groups currently assign videos and images a “hash” — unique digital signatures — to track and take down child abuse materials. According to Portnoff, the same concept can be applied to AI models that are being misused.

“It’s not currently happening,” she said. “But it’s something that in my opinion can and should be done.”

财富中文网所刊载内容之知识产权为财富媒体知识产权有限公司及/或相关权利人专属所有或持有。未经许可，禁止进行转载、摘编、复制及建立镜像等任何使用。
