专访谷歌顶级科学家：人工智能离普及还有多远？

财富中文网

2016-12-25

谷歌的一位顶级科研人员接受《财富》专访谈人工智能的发展。

等你下一次不管是用谷歌搜索引擎搜索问题也好，还是在谷歌地图上找一家电影院的位置也罢，请你记住，在你看不见的地方，正有一个巨大的大脑在为你提供相关搜索结果，使你不至于在开车时迷了路。

当然，这里说的并不是人的大脑，而是网络搜索巨头谷歌的“谷歌大脑”（Google Brain）研究团队。《财富》记者罗杰·帕洛夫曾专门撰文揭开了这支团队的神秘面纱。“谷歌大脑”研究团队迄今已经开发了1000多个所谓的“深度学习”项目，它们也是YouTube、谷歌翻译、谷歌照片等近年来谷歌公司多个成功产品背后的大功臣。通过深度学习技术，研究人员能够将海量数据输入“神经元网络”软件系统进行处理，该系统能够以人脑完全无法企及的速度，在海量数据中进行学习和模式分析。

近日，“谷歌大脑”团队的创始人和负责人之一的杰夫·迪恩接受了《财富》杂志专访，并谈到了人工智能领域的研究进展及其带来的挑战，以及人工智能技术在谷歌产品中的应用。出于篇幅考虑，以下采访稿有删节。

问：在推动人工智能领域研究的过程中，科研人员主要面临哪些挑战？

人类的学习有大量内容来自无监督式的学习，也就是说，你只是在观察周围的世界，理解事物的道理。这是机器学习研究的一个非常活跃的领域，但目前研究的进展与监督式学习还是不能比拟的。

也就是说，无监督式学习指的是一个人通过观察和感知进行的学习，如果计算机也能自行进行观察和感知，就能帮助我们解决更复杂的问题了？

是的，人类的视觉主要是通过无监督式学习训练出来的。你从小就会观察世界，但偶尔你也会得到一些监督式学习的信号，比如有人会告诉你：“那是一只长颈鹿”或“那是一辆小汽车”。获得这些少量的监督式信息后，你就会据此自然地形成对世界的心智模型。

我们需要将监督式和无监督式学习更紧密地结合起来。不过以我们大部分机器学习系统的工作模式来看，我们现在还没有完全进展到那个地步。

你能解释一下什么是“强化学习”技术吗？

“强化学习”背后的理念是，你并不一定理解你可能要采取的行动，所以你会先尝试你应该采取的一系列行动，比如你觉得某个想法很好，就可以先尝试一下，然后观察外界的反应。这就好比玩桌游，你可以针对对手的举动做出回应。最终在一系列的类似行为之后，你就会获得某种奖励信号。

强化学习的理念就是，在你获得奖励信号的同时，可以将功劳或过错分配给你在尝试过程中采取的所有行动。这项技术在今天的某些领域的确非常有效。

我觉得强化学习面临的一些挑战主要集中在当你可以采取的行为状态极为宽泛的时候。在真实世界中，人类在任何给定的时候都可以采取一系列极为宽泛的行为。而在你玩桌游的时候，你能采取的只有有限的一系列行为，因为游戏的规则限制了你，而且奖励信号也要明确得多——不是赢就是输。

如果我的目标是泡一杯咖啡之类的，那我可能采取的潜在行为就相当宽泛了，而奖励信号也没有那么明显了。

不过你们还是可以将步骤分解开，对吧？比如，如果你想泡一杯咖啡，你就可以通过学习得知，如果你在冲泡之前不将咖啡豆充分研磨，泡出来的咖啡就不会好喝。

对。我认为强化学习的一个特点就是它需要探索，所以在物理系统环境下使用它往往有些困难。不过我们已经开始尝试在机器人上使用这种技术了。当机器人需要实际采取某些行动时，它在一天之内能够执行的动作次数是有限的。但是如果使用计算机模拟的话，就可以轻易地使用大量计算机获得上百万个样本。

谷歌已经开始将强化学习技术用在核心搜索产品上了吗？

我们在核心产品中应用强化学习技术的主要场合，是DeepMind（一家人工智能领域的创业公司，2014年被谷歌收购）与我们的数据中心运营人员的合作。他们将这项技术运用在了数据中心的空调温控系统上，在大大降低能耗的同时，达到了相同的、安全的冷却效果和运行条件。他们能够探索温控旋钮的哪种设置是合理的，以及当你朝不同方向调节旋钮时系统会如何响应。

通过强化学习技术，他们能够探索这18个或者更多个温控旋钮的最优设置，而这可能是连专门负责温控的工作人员都没有做过的。熟悉温控系统的人可能会觉得：“这个设置真奇怪。”然而事实上它的工作效果非常好。

什么样的任务更适合应用强化学习技术？

上面说的数据中心这个案例之所以效果很好，就是因为在一段给定时间内并没有太多不同的行为。温控系统大概有18个温控旋钮，你可以把一个旋钮调高或调低，结果都是很容易衡量的。只要你在可以接受的适当温度范围内运行，你的能耗利用率就会更好。从这个角度看，这几乎是一个理想的强化学习技术的使用案例。

而至于在网络搜索中，我应该显示哪些搜索结果，这应该是强化学习技术的运用效果稍差的一个用例了。针对不同的搜索提问，我可以选择显示的搜索结果的面是很宽的，而且奖励信号也不明确。比方说一名用户看到了搜索结果，至于他心里喜不喜欢这个搜索结果，这是很不明显的。

如果他们不喜欢某一搜索结果，你连衡量它都很难吧？

是的，的确有点棘手。我认为这个例子就能说明强化学习技术可能还不够成熟，在这种奖励信号不够明确、约束条件太少的环境下，还不能真正有效地运行。

你们研究出来的这些技术要想应用到人们日常使用的产品中，还将面临哪些最大的挑战？

首先，很多机器学习解决方案和针对这些解决方案的研究是可以在各个不同领域重复使用的。比如我们与谷歌地图团队就在某些研究上展开了合作。他们希望能够识别出街景图片中的所有商户名称和标志牌，以更深入地了解这个世界——比如确定这究竟是一家披萨店还是别的什么。

事实证明，要想识别这些图像中的文字，你可以对一个机器学习模型进行“训练”，给它一些人们在文字周围画圈或画框的样本数据。这样一来，机器学习模型就会学会分辨图像中的哪些部分包含了文字。

这项能力总体还是很有用的。地图团队的另一部分人还将该技术运用到了一项卫星图像分析项目中，主要用来分辨美国和全世界的建筑物的房顶，以估算太阳能电池板在房顶上的安装位置。

我们还发现，同样的模型还能协助我们进行医学影像分析方面的一些初级工作。比如说你有一些医学影像，你想在其中发现一些与临床相关的有趣的部分，你就可以用这个模型来帮忙。（财富中文网）

作者：Jonathan Vanian

译者：朴成奎

The next time you enter a query into Google’s search engine or consult the company’s map service for directions to a movie theater, remember that a big brain is working behind the scenes to provide relevant search results and make sure you don’t get lost while driving.

Well, not a real brain per se, but the Google Brain research team. As Fortune’s Roger Parloff wrote, the Google Brain research team has created over 1,000 so-called deep learning projects that have supercharged many of Google’s products over the past few years like YouTube, translation, and photos. With deep learning, researchers can feed huge amounts of data into software systems called neural nets that learn to recognize patterns within the vast information faster than humans.

In an interview with Fortune, one of Google Brain’s co-founders and leaders, Jeff Dean, talks about cutting-edge A.I. research, the challenges involved, and using A.I. in its products. The following has been edited for length and clarity.

What are some challenges researchers face with pushing the field of artificial intelligence?

A lot of human learning comes from unsupervised learning where you’re just sort of observing the world around you and understanding how things behave. That’s a very active area of machine-learning research, but it’s not a solved problem to the extent that supervised learning is.

So unsupervised learning refers to how one learns from observation and perception, and if computers could observe and perceive on their own that could help solve more complex problems?

Right, human vision is trained mostly by unsupervised learning. You’re a small child and you observe the world, but occasionally you get a supervised signal where someone would say, “That’s a giraffe” or “That’s a car.” And that’s your natural mental model of the world in response to that small amount of supervised data you got.

We need to use more of a combination of supervised and unsupervised learning. We’re not really there yet, in terms of how most of our machine learning systems work.

Can you explain the A.I. technique called reinforcement learning?

The idea behind reinforcement learning is you don’t necessarily know the actions you might take, so you explore the sequence of actions you should take by taking one that you think is a good idea and then observing how the world reacts. Like in a board game where you can react to how your opponent plays. Eventually after a whole sequence of these actions you get some sort of reward signal.

Reinforcement learning is the idea of being able to assign credit or blame to all the actions you took along the way while you were getting that reward signal. It’s really effective in some domains today.
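The credit-assignment idea described here can be illustrated with a small sketch. It assumes one common scheme, discounted returns, in which each step is credited with the reward that followed it, scaled down the further away that reward was. The function name, the discount factor `gamma`, and the example episode are all illustrative assumptions, not details from the interview.

```python
def discounted_returns(rewards, gamma=0.9):
    """Return the discounted future reward credited to each step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backward so each step accumulates the reward that came after it.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical four-move board game: no reward until the final, winning move (+1).
episode_rewards = [0.0, 0.0, 0.0, 1.0]
credit = discounted_returns(episode_rewards, gamma=0.5)
print(credit)  # [0.125, 0.25, 0.5, 1.0]
```

Moves closer to the win receive more credit, while earlier moves still receive a share. A loss could be handled the same way with a final reward of -1, which spreads blame instead.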

I think where reinforcement learning has some challenges is when the action-state you may take is incredibly broad and large. A human operating in the real world might take an incredibly broad set of actions at any given moment. Whereas in a board game there’s a limited set of moves you can take, and the rules of the game constrain things a bit and the reward signal is also much clearer. You either won or lost.

If my goal was to make a cup of coffee or something, there’s a whole bunch of actions I might want to take, and the reward signal is a little less clear.

But you can still break the steps down, right? For instance, while making a cup of coffee, you could learn that you didn’t fully grind the beans before they were brewed, and that it resulted in bad coffee.

Right. I think one of the things about reinforcement learning is that it tends to require exploration. So using it in the context of physical systems is somewhat hard. We are starting to try to use it in robotics. When a robot has to actually take some action, it’s limited in the number of actions it can take in a given day. Whereas in computer simulations, it’s much easier to use a lot of computers and get a million examples.

Is Google incorporating reinforcement learning in the core search product?

The main place we’ve applied reinforcement learning in our core products is through collaboration between DeepMind [the AI startup Google bought in 2014] and our data center operations folks. They used reinforcement learning to set the air conditioning knobs within the data center and to achieve the same, safe cooling operations and operating conditions with much lower power usage. They were able to explore which knob settings make sense and how they reacted when you turn something this way or that way.

Through reinforcement learning they were able to discover knob settings for these 18 or however many knobs that weren’t considered by the people doing that task. People who knew about the system were like, “Oh, that’s a weird setting,” but then it turned out that it worked pretty well.

What makes a task more appropriate for incorporating reinforcement learning?

The data center scenario works well because there are not that many different actions you can take at a time. There’s like 18 knobs, you turn a knob up or down, and you’re there. The outcome is pretty measurable. You have a reward for better power usage assuming you’re operating within the appropriate margins of acceptable temperatures. From that perspective, it’s almost an ideal reinforcement learning problem.
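To make the shape of this problem concrete, here is a toy sketch: a small set of knobs, a measurable outcome, and a reward for lower power that only counts while the temperature stays within a safe margin. Everything in it is assumed for illustration, including the `simulate` model, the temperature limit, and the step sizes. The search is a plain trial-and-error local search over the reward, not the method DeepMind and the data center team actually used.

```python
import random

NUM_KNOBS = 18          # the interview mentions roughly 18 knobs
MAX_SAFE_TEMP = 27.0    # hypothetical safety margin, in degrees Celsius


def simulate(knobs):
    """Toy stand-in for a data center: returns (power_used, temperature)."""
    power = sum(k * k for k in knobs)        # assumed: more cooling costs more power
    temperature = 35.0 - 0.5 * sum(knobs)    # assumed: more cooling lowers temperature
    return power, temperature


def reward(knobs):
    """Lower power is better, but only while the temperature stays in the safe range."""
    power, temperature = simulate(knobs)
    if temperature > MAX_SAFE_TEMP:
        return -1000.0                       # unsafe settings are heavily penalized
    return -power


def explore(steps=2000, seed=0):
    """Turn one randomly chosen knob up or down per step, keeping only improvements."""
    rng = random.Random(seed)
    best = [3.0] * NUM_KNOBS                 # cautious start: plenty of cooling
    best_reward = reward(best)
    for _ in range(steps):
        candidate = list(best)
        i = rng.randrange(NUM_KNOBS)
        step = rng.choice([-0.25, 0.25])
        candidate[i] = min(5.0, max(0.0, candidate[i] + step))
        r = reward(candidate)
        if r > best_reward:
            best, best_reward = candidate, r
    return best, best_reward


best, best_reward = explore()
print(best_reward, simulate(best))
```

Because every unsafe setting scores far below any safe one, the search never accepts a configuration that exceeds the temperature limit, which mirrors the "appropriate margins" condition in the answer above.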

An example of a messier reinforcement learning problem is perhaps trying to use it to decide which search results to show. There’s a much broader set of search results I can show in response to different queries, and the reward signal is a little noisy. Like if a user looks at a search result and likes it or doesn’t like it, that’s not that obvious.

How would you even measure if they didn’t like a certain result?

Right. It’s a bit tricky. I think that’s an example of where reinforcement learning is maybe not quite mature enough to really operate in these incredibly unconstrained environments where the reward signals are less crisp.

What are some of the biggest challenges in applying what you’ve learned doing research to actual products people use each day?

One of the things is that a lot of machine learning solutions and research into those solutions can be reused in different domains. For example, we collaborated with our Map team on some research. They wanted to be able to read all the business names and signs that appeared in street images to understand the world better, and know if something’s a pizzeria or whatever.

It turns out that to actually find text in these images, you can train a machine learning model where you give it some example data where people have drawn circles or boxes around the text. You can actually use that to train a model to detect which pixels in the image contain text.
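One way to picture the training data described here is as a per-pixel label grid built from the boxes people drew. The sketch below shows only that label-preparation step, under assumed conventions: boxes are given as `(top, left, bottom, right)` in pixels with exclusive bottom and right edges, and the function name and the example box are hypothetical. It does not reproduce the model or pipeline used at Google.

```python
def boxes_to_mask(height, width, boxes):
    """Build a 0/1 grid marking which pixels fall inside any annotated text box.

    Each box is (top, left, bottom, right) in pixels; bottom and right are exclusive.
    """
    mask = [[0] * width for _ in range(height)]
    for top, left, bottom, right in boxes:
        # Clamp to the image so a box that runs off the edge is simply cut short.
        for y in range(max(0, top), min(height, bottom)):
            for x in range(max(0, left), min(width, right)):
                mask[y][x] = 1
    return mask


# Hypothetical annotation: one box around a sign in a tiny 4x6 image.
annotations = [(1, 1, 3, 5)]
mask = boxes_to_mask(4, 6, annotations)
for row in mask:
    print("".join(str(v) for v in row))
# 000000
# 011110
# 011110
# 000000
```

A model trained against grids like this learns to predict, for each pixel, whether it belongs to text, which is the kind of output that could then be reused for other "find this region" tasks.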

That turns out to be a generally useful capability, and a different part of the Map team is able to reuse that for a satellite-imagery analysis task where they wanted to find rooftops in the U.S. or around the world to estimate the location of solar panel installations on rooftops.

And then we’ve found that the same kind of model can help us on preliminary work on medical imaging problems. Now you have medical images and you’re trying to find interesting parts of those images that are clinically relevant.

财富中文网所刊载内容之知识产权为财富媒体知识产权有限公司及/或相关权利人专属所有或持有。未经许可，禁止进行转载、摘编、复制及建立镜像等任何使用。
