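Column - Weekly Read

The limits of Big Data

Michael Schrage, October 25, 2012

Weekly Read is a column in which Fortune's editorial staff reviews new books from the business world and beyond. A new review runs every week.
This week's Weekly Read recommends two new books: Samuel Arbesman's The Half-Life of Facts and Nate Silver's The Signal and The Noise. Both argue that algorithms cannot fully replace human judgment.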

    Stop me if you've heard this one: Three statisticians go rabbit hunting. They spot a rabbit. The first statistician shoots. He misses the rabbit's head by a foot. The second statistician fires; misses the rabbit's tail by a foot. The third statistician cries out, "We got him!"

    Even if you don't find this joke remotely amusing, you've probably worked with exactly the kind of managerial rabbit hunters it describes. Their math may be impeccable but their real-world results, alas, are rubbish. Lies, damned lies, etc. What must organizations know to improve the odds that their quants will deliver real value instead of statistical illusions? How can stochastically innumerate executives be sure they're not being bamboozled by Big Data?

    Excellent answers can be found in Samuel Arbesman's The Half-Life of Facts and Nate Silver's The Signal and The Noise, two distinct but complementary efforts that explore how "data" become "evidence" and why so many sophisticated mathematical models fail so spectacularly at distinguishing the two. The books embrace and extend the themes of uncertainty and quantitative self-deception articulated by Nassim Taleb's popular and insightful Fooled By Randomness and The Black Swan, as well as Nobel laureate Daniel Kahneman's superior Thinking, Fast and Slow. Like their precursors, Arbesman and Silver have produced entertainingly actionable books.

    Both authors cite the cynically apt line -- variously attributed to Mark Twain, Will Rogers and Charles Kettering -- that "It ain't so much the things we don't know that get us into trouble. It's the things we know that just ain't so." Both discuss the media and mechanisms used to distinguish between "real" knowledge and "ain't so's." Arbesman and Silver both argue persuasively that the "ain't so's" are winning. The more data you deal with, the more attention that case deserves.

    Arbesman, an applied mathematician and fellow at Harvard's Institute for Quantitative Social Science, deconstructs what it means to be a fact. He mercifully avoids getting bogged down in post-modernist philosophy. Instead he explores how serious scientists attempt to nail down what it is they think they know about what they're studying. This "scientometric" approach -- the science of how science measures its process and progress -- proves extraordinarily helpful in identifying the lifecycles and ecosystems of what scientists call "facts." This approach allows Arbesman to ask intriguing questions, such as: How are "facts" born? How do they typically replicate, mutate and evolve? How long do they take to die?

    The provocative core of Arbesman's argument is that there is a virtual physics of facts. Depending upon how they're defined and measured, "facts" follow defined laws and trajectories. "Every day that we read the news we have the possibility of being confronted with a fact about our world that is wildly different from what we thought we knew," he writes. "…But it turns out that these rapid changes, while true phase transitions in our knowledge, are not unexpected or random. We understand how they behave in the aggregate, through the use of probability, but we can also predict these changes by searching for the slower, regular changes in our knowledge that underlie them. Fast changes in facts, just like everything else we've seen, have an order to them. One that is measurable and predictable."
