Is Amazon wise to crowdsource answers to questions?

David Morris 2019-09-25

Amazon executives say the company uses machine learning and algorithms to weed out troublemakers, but experts are skeptical.

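An Amazon Echo multimedia smart speaker, controlled by the Alexa voice assistant. Experts worry that trolls will find ways to hijack Amazon's new portal. Photo: Joby Sessions/T3 Magazine/Future via Getty Images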

Translator: Charlie

Reviewer: Xia Lin

Did Albert Einstein wear socks? How do you prevent tears when cutting an onion? Did Burt Reynolds marry Sally Field? What makes wasabi green? The average person might not know the answers to these questions, but Amazon Alexa, through the new Alexa Answers portal that was announced Thursday, might. Well, more accurately, an Alexa user could.

An online community where anyone who logs in can suggest answers to user-supplied questions posed to the voice-activated Alexa A.I. assistant, Alexa Answers is designed to answer the tough questions that can’t already be answered by the voice-enabled assistant. Once the answers are submitted, they are vetted for accuracy, scored, and if they are good enough, make their way back to Alexa users.

But is crowdsourcing Alexa's smarts a good idea? From a Microsoft chatbot subverted by racist trolls to Yahoo Answers, a similar service to Alexa Answers that has become notoriously rife with bad information, the past few years have been littered with cases of user-generated data systems gone bad. So it's not hard to imagine the worst-case scenario: an Alexa-backed smart speaker blithely spouting fake news, dangerous conspiracy theories, or white supremacist talking points.

Describing Alexa Answers to Fast Company, Bill Barton, Amazon’s Vice President of Alexa Information, struck an optimistic tone. “We’re leaning into the positive energy and good faith of the contributors,” he said. “And we use machine learning and algorithms to weed out the noisy few, the bad few.”

Experts on data use and its impacts are markedly less cheery.

“We have plenty of examples of why this is not going to play out well,” says Dr. Chris Gillard, who studies the data policies of Amazon and other tech companies at Macomb Community College near Detroit. Crowdsourcing data, and then using that data in training the Alexa algorithm, he says, presents “pitfalls that Amazon seems intent on stepping right into.”

The race to beat Google

While better assistants and smart speakers drive sales of accessories like voice-activated lights, Google’s decades in the search business seem to have given it an advantage over Amazon when it comes to understanding queries and returning data. Google's smart speaker has steadily gained market share against the Echo, and Google Assistant has almost uniformly outperformed Alexa in comparison tests.

In fact, almost all of the questions above, from Einstein’s socks to wasabi’s color, are currently answered by Google Assistant, though they were taken directly from the Alexa Answers website. Google’s answers come from its search engine’s results, featured snippets, and knowledge graph. Amazon is trying to use crowd-supplied answers to catch up in this space.

“Amazon’s not Google,” says Dr. Nicholas Agar, a technology ethicist at Victoria University of Wellington, New Zealand. “They don’t have Google’s [data] power, so they need us.”

Beyond just providing missing answers to individual questions, data from Alexa Answers will be used to further train the artificial intelligence systems behind the voice assistant. “Alexa Answers is not only another way to expand Alexa's knowledge,” an Amazon spokesperson tells Fortune, “but also... makes her more helpful and informative for other customers.” In its initial announcement of Alexa Answers, Amazon referred to this as Alexa “getting smarter.”

Money for nothing, facts for free

As important as Alexa Answers might be for Amazon, contributors won’t get any financial compensation for helping out. The system will have human editors who are presumably paid for their work, but contributed answers will be rewarded only through a system of points and ranks, a practice known in industry parlance as ‘gamification.’

Agar believes this will be effective, because Amazon is leveraging people’s natural helpfulness. But he also thinks a corporation leveraging those instincts should give us pause. “There’s a difference between the casual inquiry of a human being, and Amazon relying on those answers,” he says. “I think it’s an ethical red flag.”

Gillard also thinks Amazon should pay people to provide answers, whether by using its own workers or by partnering with an established fact-checking group.

Amazon certainly has the infrastructure to do it. The e-commerce giant already runs Mechanical Turk, a ‘gig’ platform that pays “Turkers” for performing small, repetitive tasks, which would seem well suited to supplementing Alexa’s training.

But Gillard believes that relying on a ‘community’ model insulates Amazon if Alexa starts spouting bad or offensive answers based on crowd input. “I think not paying people lets you say, well, it was sort of the wisdom of the crowd,” he says. “If you pay people, you’re going to be accused of bias.”

A gamified incentive system, though, is not without its own risk. In 2013, Yahoo Answers disabled part of its user voting system. That's allegedly because some participants created fake accounts to upvote their own (not necessarily accurate) answers. (Source: Quora. Also, this is a good example of how crowd-sourcing information impacts reliability.)

Troll stoppers

The biggest question facing Alexa Answers is whether Amazon can effectively prevent abuse of its new platform. Amazon declined to answer questions from Fortune about the precise role of human editors in the system. But their presence alone represents an acceptance that automated systems in their current state can’t reliably detect offensive content or evaluate the accuracy of facts.

Amazon has never grappled with these challenges as directly as companies like Facebook and Twitter, though according to some critics, it has failed even to consistently detect fake reviews in its own store. Barton told Fast Company that Amazon will try to keep political questions out of the system, a subtle task Gillard says will likely fall to humans. “A.I. can’t do those things,” he says. “It can’t do context.”

Yet automated systems can easily detect and block individual offensive terms, though even that has its downsides. In a test, this reporter attempted to reference the ‘90s rock band Porno for Pyros when suggesting an Alexa Answer. The answer was rejected, not because of inaccuracy, but because of the word ‘porno.’ According to a notification, “Alexa wouldn’t say that.”

Not everything has an answer

Barton told Fast Company that “we’d love it if Alexa can answer any question people ask her,” but that’s clearly impossible. Alexa cannot be expected to know, for instance, what the meaning of life is, and crowdsourcing answers to questions that are enigmas could make the entire system more fragile. In a 2018 study, researchers found that search queries with limited relevant data, which they called “data voids,” were easier for malicious actors to spoof with fake or misleading results.

And trolls aren’t the only risk to Alexa’s mental hygiene. Even well-intentioned questions can wind up nonsensical, if Alexa doesn’t properly interpret the questioner’s speech. For example, the question “What is a piglet titus?” appeared on Alexa Answers Friday morning. It seems likely the user actually asked “What is Epiglottitis?” (Answer: a rare throat condition). If enough users tried to answer the nonsense question—perhaps Winnie the Pooh fans, or users hungry for points—it could muddy the data pool, instead of improving it.

It’s unclear how Alexa’s overall performance might be impacted by messy or malicious data; those answers are a ways away yet. But it’s a wonder if, after all the stumbles of similar systems, Amazon is taking the risks of crowdsourced answers seriously.