Scientists dig into the “brain” of a computer and make a surprising discovery

Jeremy Kahn 2021-03-05
The A.I. research company OpenAI says it has developed more advanced methods for peering into the inner workings of the artificial intelligence software known as neural networks, which could help demystify their notoriously opaque decision-making.

Researchers say they have made an important finding that could have big implications for the study of computer brains and, possibly, human ones too.

OpenAI, the San Francisco–based A.I. research company, says it has developed more advanced methods for peering into the inner workings of the artificial intelligence software known as neural networks, helping to make their notoriously opaque decision-making more interpretable. In the process, its researchers found that individual neurons in a large neural network can encode a particular concept, a finding that parallels one neuroscientists have glimpsed in the human brain.

Neural networks are a kind of machine-learning software loosely modeled on the human brain. The use of these networks has been responsible for most of the rapid advances in artificial intelligence in the past eight years, including the speech recognition found in digital assistants, facial recognition software, and new ways to discover drugs.

But one drawback in large neural networks is that it can be challenging to understand the rationale behind their decisions, even for the machine-learning experts who create them. As a result, it is difficult to know exactly when and how this software can fail. And that has made people understandably reluctant to use such A.I. software, even when these A.I. systems seem to outperform other kinds of automated software or humans. This has particularly been true in medical and financial settings, where a wrong decision may cost money or even lives.

“Because we don’t understand how these neural networks work, it can be hard to reason about their errors,” says Ilya Sutskever, OpenAI’s cofounder and chief scientist. “We don’t know if they are reliable or if they have hidden vulnerabilities that are not apparent from testing.”

But researchers at the company recently used several techniques to probe the inner workings of a large neural network they had created for identifying images and putting them into broad category buckets. The researchers discovered that individual neurons in the network were associated with one particular label or concept.
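
One common probing technique of this kind is simply to run a large collection of images through the network and record which ones most strongly activate a given unit. The PyTorch sketch below illustrates that idea on an off-the-shelf ResNet-50; the model, layer, channel index, and image folder are illustrative stand-ins, not OpenAI's actual setup.

```python
import torch
import torchvision
from torch.utils.data import DataLoader

# Minimal sketch: find the images in a folder that most strongly activate one
# channel of a pretrained vision model. All choices here (model, layer,
# channel 7, the "images/" folder) are illustrative, not OpenAI's setup.
model = torchvision.models.resnet50(weights="IMAGENET1K_V1").eval()

feats = {}
model.layer4.register_forward_hook(lambda mod, inp, out: feats.update(out=out))

transform = torchvision.transforms.Compose([
    torchvision.transforms.Resize(256),
    torchvision.transforms.CenterCrop(224),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                     std=[0.229, 0.224, 0.225]),
])
dataset = torchvision.datasets.ImageFolder("images/", transform=transform)
loader = DataLoader(dataset, batch_size=32)

channel = 7          # the "neuron" (feature-map channel) being inspected
scores = []
with torch.no_grad():
    for batch, _ in loader:
        model(batch)
        # One score per image: the channel's mean activation over space.
        scores.extend(feats["out"][:, channel].mean(dim=(1, 2)).tolist())

top10 = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:10]
print("Indices of the ten most-activating images:", top10)
```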

This was significant, OpenAI said in a blog post discussing its research, because it echoed findings from a landmark 2005 neuroscience study that found the human brain may have “grandmother” neurons that fire in response to one very specific image or concept. For instance, the neuroscientists discovered that one subject in their study seemed to have a neuron that was associated with the actress Halle Berry. The neuron fired when the person was shown an image of Berry, but the same neuron was also activated when the person heard the words “Halle Berry,” or when shown images associated with Berry’s iconic roles.

OpenAI’s research focused on an A.I. system it debuted in January that can perform a wide variety of image classification tasks with a high degree of accuracy, without being specifically trained for those tasks with labeled data sets. The system, called CLIP (short for Contrastive Language-Image Pre-training), ingested 400 million images from the Internet, each paired with a caption. From this information, the technology learned to predict which of 32,768 text snippet labels was most likely to be associated with any given image, even those it had never encountered before. For instance, show CLIP a picture of a bowl of guacamole, and not only is it able to correctly label the image as guacamole but it also knows that guacamole is “a type of food.”
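
The open-source release of CLIP makes this kind of zero-shot matching straightforward to reproduce. The sketch below scores a few candidate captions against a single image using OpenAI's published CLIP code; the image file and the tiny label set are stand-ins for the 32,768 text snippets described above.

```python
import torch
import clip                      # OpenAI's open-source CLIP package
from PIL import Image

# Minimal sketch of CLIP-style zero-shot classification: embed one image and a
# few candidate captions, then rank the captions by similarity.
# "guacamole.jpg" and the three labels are illustrative stand-ins.
device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

image = preprocess(Image.open("guacamole.jpg")).unsqueeze(0).to(device)
labels = ["a photo of guacamole, a type of food",
          "a photo of a dog",
          "a photo of a city street"]
text = clip.tokenize(labels).to(device)

with torch.no_grad():
    # The model returns similarity logits between the image and each caption.
    logits_per_image, _ = model(image, text)
    probs = logits_per_image.softmax(dim=-1)[0].tolist()

for label, p in zip(labels, probs):
    print(f"{p:.3f}  {label}")
```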

In the new research, OpenAI used techniques that reverse-engineer what makes a particular artificial neuron fire the most, in order to build up a picture of that neuron’s “Platonic ideal” for a given concept. For instance, OpenAI probed one neuron associated with the concept “gold” and found that the images that most activated it contained shiny yellow coin-like objects as well as the printed word “gold” itself. A neuron affiliated with “Spider-Man” was triggered by photos of a person dressed up as the comic book hero but also by the word “spider.” Interestingly, a neuron affiliated with the concept “yellow” fired in response to the words “banana” and “lemon,” as well as to the color yellow and the word “yellow” itself.
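
The “reverse engineering” described here is typically some form of feature visualization: start from random noise and follow the gradient of one unit's activation until an image emerges that strongly excites it. A bare-bones version of that loop on an illustrative off-the-shelf model, rather than CLIP itself, might look like the sketch below; OpenAI's published visualizations add many regularizers (jitter, transformations, and so on) that are omitted here.

```python
import torch
import torchvision

# Minimal sketch of activation maximization: optimize an input image so that
# it maximally excites one channel of a late convolutional layer. Model,
# layer, and channel are illustrative; real feature visualization adds many
# regularizers to produce recognizable images.
model = torchvision.models.resnet50(weights="IMAGENET1K_V1").eval()
for p in model.parameters():
    p.requires_grad_(False)

acts = {}
model.layer4[2].conv3.register_forward_hook(
    lambda mod, inp, out: acts.update(out=out))

channel = 42                                   # hypothetical unit to visualize
img = torch.randn(1, 3, 224, 224, requires_grad=True)
optimizer = torch.optim.Adam([img], lr=0.05)

for step in range(256):
    optimizer.zero_grad()
    model(img)
    # Gradient ascent on the channel's mean activation = descent on its negative.
    loss = -acts["out"][0, channel].mean()
    loss.backward()
    optimizer.step()

# `img` now approximates an input that strongly activates the chosen channel.
```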

“This is maybe evidence that these neural networks are not as incomprehensible as we might think,” Gabriel Goh, the OpenAI researcher who led the team working on interpreting CLIP’s conceptual reasoning, told Fortune. In the future, such methods could be used to help companies using neural networks to understand how they arrive at decisions and when a system is likely to fail or exhibit bias. It might also point a way for neuroscientists to use artificial neural networks to investigate the ways in which human learning and concept formation may take place.

Not every CLIP neuron was associated with a distinct concept. Many fired in response to a number of different conceptual categories. And some neurons seemed to fire together, possibly meaning that they represented a complex concept.

OpenAI said that some concepts that the researchers expected the system to have a neuron for were absent. Even though CLIP can accurately identify photographs of San Francisco and can often even identify the neighborhood of the city in which they were taken, the neural network did not seem to have a neuron associated with a concept of “San Francisco” or even “city” or “California.” “We believe this information to be encoded within the activations of the model somewhere, but in a more exotic way,” OpenAI said in its blog post.
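
A standard way to test whether a concept is encoded “somewhere” in the activations without living in any single neuron is a linear probe: a simple classifier fit to the activation vectors themselves. The sketch below uses random placeholder arrays to show the mechanics; in practice the acts array would hold activation vectors collected from the model for images that do and do not depict, say, San Francisco.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Minimal sketch of a linear probe. The arrays below are random placeholders;
# real use would substitute activation vectors extracted from the network and
# binary labels for the concept of interest (e.g., "San Francisco" vs. not).
rng = np.random.default_rng(0)
acts = rng.normal(size=(2000, 1024))      # placeholder activation vectors
labels = rng.integers(0, 2, size=2000)    # placeholder concept labels

X_train, X_test, y_train, y_test = train_test_split(
    acts, labels, test_size=0.25, random_state=0)

probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("Probe accuracy:", probe.score(X_test, y_test))
# With real data, accuracy well above chance would suggest the concept is
# encoded in a distributed way across many neurons rather than in just one.
```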

In a demonstration that this technique can be used to uncover hidden biases in neural networks, the researchers discovered that CLIP also had what OpenAI dubbed a “Middle East” neuron that fired in response to images and words associated with the region, but also in response to those associated with terrorism, the company said. It had an “immigration” neuron that responded to Latin America. And the researchers found a neuron that fired for both dark-skinned people and gorillas, which OpenAI noted was similar to other racist photo tagging that had previously caused problems for neural network–based image classification systems at Google.

Racial and gender biases hidden in large A.I. models, especially those trained on massive amounts of data culled from the Internet, have become a growing area of concern for A.I. ethics researchers and civil society organizations.

The researchers also said that their methods had uncovered a particular bias in how CLIP makes decisions that would make it possible for someone to fool the A.I. into making incorrect identifications. The system associated the written form of a word or symbol so strongly with the concept it represents that if someone put that symbol or word on a different object, the system would misclassify it. For instance, a dog with a big “$$$” sign on it might be misclassified as a piggy bank.
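
A typographic attack of this sort is easy to try with the zero-shot setup sketched earlier: overlay text on an image and check whether the top label flips. The file name, label set, and crude text overlay below are illustrative.

```python
import torch
import clip
from PIL import Image, ImageDraw

# Minimal sketch of a typographic attack on zero-shot classification: paste
# the text "$$$" onto a dog photo and check whether CLIP's preferred label
# shifts toward "piggy bank". "dog.jpg" and the labels are illustrative; a
# larger font (via ImageFont.truetype) makes the effect more pronounced.
device = "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

img = Image.open("dog.jpg").convert("RGB")
ImageDraw.Draw(img).text((20, 20), "$$$", fill="white")   # overlay the text

labels = ["a photo of a dog", "a photo of a piggy bank"]
tokens = clip.tokenize(labels).to(device)
image = preprocess(img).unsqueeze(0).to(device)

with torch.no_grad():
    logits_per_image, _ = model(image, tokens)
    probs = logits_per_image.softmax(dim=-1)[0].tolist()

for label, p in zip(labels, probs):
    print(f"{p:.3f}  {label}")
```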

“I think you definitely see a lot of stereotyping in the model,” Goh said. Sutskever said that being able to identify these biases was a first step toward trying to correct them, something he thought could be accomplished by providing the neural network with a relatively small number of additional training examples that are specifically designed to break the inappropriate correlation the system has learned.
