

Jeremy Kahn 2022-06-25









Andrew Ng is among the pioneers of deep learning, the use of large neural networks in A.I. He’s also one of the most thoughtful A.I. experts on how real businesses are using the technology. His company, Landing AI, where Ng is founder and CEO, is building software that makes it easy for people, even those without coding skills, to build and maintain A.I. systems. This should allow almost any business to adopt A.I., especially computer vision applications. Landing AI’s customers include major manufacturing firms such as toolmaker Stanley Black & Decker, electronics manufacturer Foxconn, and automotive parts maker Denso.

Ng has become an evangelist for what he calls “data-centric A.I.” The basic premise is that state-of-the-art A.I. algorithms are increasingly ubiquitous thanks to open-source repositories and the publication of cutting-edge A.I. research. Companies that would struggle to hire PhDs from top computer science schools can nonetheless access the same software code that Google or NASA might use. The real differentiator between businesses that are successful at A.I. and those that aren’t, Ng argues, comes down to data: what data is used to train the algorithm, how it is gathered and processed, and how it is governed. Data-centric A.I., Ng tells me, is the practice of “smartsizing” data so that a successful A.I. system can be built using the least amount of data possible. And he says that “the shift to data-centric A.I.” is the most important shift businesses need to make today to take full advantage of A.I., calling it as important as the shift to deep learning that has occurred in the past decade.

Ng says that if data is carefully prepared, a company may need far less of it than it thinks. With the right data, he says, companies with just a few dozen or a few hundred examples can have A.I. systems that work as well as those built by consumer internet giants that have billions of examples. He says one of the keys to extending the benefits of A.I. to companies beyond the online giants is to use techniques that enable A.I. systems to be trained effectively from much smaller datasets.

What’s the right data? Well, Ng has some tips that include making sure that data is what he calls “y consistent.” In essence, this means there should be a clear boundary between when something receives a particular classification label and when it doesn’t. (For example, take an A.I. designed to find defects in pills for a pharma company. This system will perform better with less training data if any scratch below a certain length is labelled “not defective,” and any scratch longer than that threshold is labelled “defective,” than if there is no consistency in which scratch lengths are labelled defective.)
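To make the pill example concrete, here is a minimal Python sketch of a "y consistent" labeling rule. The threshold value, the function name, and the choice to treat a scratch exactly at the threshold as "not defective" are all assumptions made for illustration; the article gives no specific numbers, and this is not Landing AI's software.

```python
# Hypothetical cutoff in millimeters; the article does not state one.
DEFECT_THRESHOLD_MM = 0.5


def label_scratch(length_mm: float, threshold_mm: float = DEFECT_THRESHOLD_MM) -> str:
    """Label a scratch using its length only.

    Because the label depends on nothing but the length and a fixed
    threshold, two scratches of the same length always get the same label.
    Assumption: a scratch exactly at the threshold counts as "not defective".
    """
    return "defective" if length_mm > threshold_mm else "not defective"


# Made-up scratch lengths, in millimeters.
lengths = [0.1, 0.4, 0.5, 0.6, 1.2]
labels = [label_scratch(x) for x in lengths]

for length, label in zip(lengths, labels):
    print(f"{length} mm: {label}")
```

The point of the sketch is the single, explicit boundary. Where exactly the threshold sits matters less than every labeler applying the same one.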

He says that one way to spot data inconsistencies is to assign the same images in a training set to multiple people to label. If their labels don’t agree, the person designing the system can make a call on the correct label, or that example can be discarded from the training set. Ng also urges those curating data sets to clarify labeling instructions by tracking down ambiguous examples. These are tricky cases that are likely to lead to inconsistent labels. Any examples that are unclear or confusing should be eliminated from the data set altogether, he says. Finally, he says people should analyze the errors an A.I. system makes to figure out which subsets of examples tend to trip the system up. Adding just a few additional examples in key data subsets leads to faster performance improvements than adding additional examples where the software is already doing well. He also says that A.I. users should see data curation, data improvement, and retraining the A.I. on updated data as an ongoing cycle, not something a user does only once.
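Two of these tips, flagging images whose labelers disagree and finding the subsets where the system errs most, can be sketched in a few lines of Python. The function names, image IDs, subset names, and data below are all hypothetical and invented for illustration. This is one simple way to do the bookkeeping, not a description of any particular tool.

```python
from collections import Counter


def find_disagreements(labels_by_image):
    """Return the IDs of images whose labelers did not all give the same label."""
    return sorted(
        image_id
        for image_id, labels in labels_by_image.items()
        if len(set(labels)) > 1
    )


def error_rate_by_subset(records):
    """Compute the error rate per subset.

    `records` is an iterable of (subset, true_label, predicted_label) tuples.
    """
    totals = Counter()
    errors = Counter()
    for subset, truth, predicted in records:
        totals[subset] += 1
        if truth != predicted:
            errors[subset] += 1
    return {subset: errors[subset] / totals[subset] for subset in totals}


# Made-up labels from several people for the same images.
labels_by_image = {
    "img_001": ["defective", "defective", "defective"],
    "img_002": ["defective", "not defective", "defective"],
    "img_003": ["not defective", "not defective"],
}
to_review = find_disagreements(labels_by_image)
print("Needs a decision or removal:", to_review)

# Made-up model predictions, grouped by a hypothetical defect type.
records = [
    ("scratch", "defective", "defective"),
    ("scratch", "defective", "not defective"),
    ("chip", "defective", "defective"),
    ("chip", "not defective", "not defective"),
    ("discoloration", "defective", "not defective"),
    ("discoloration", "defective", "not defective"),
]
rates = error_rate_by_subset(records)
worst_subset = max(rates, key=rates.get)
print("Collect more examples for:", worst_subset)
```

In this toy data, the subset with the highest error rate is where a few extra examples would be expected to help most, in line with Ng's advice to add data where the system struggles rather than where it already does well.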

The idea of treating the building and training of A.I. models as a continuous cycle, not a one-off project, also comes across in a recent report on A.I. adoption from consulting firm Accenture. It found that only 12% of the 1,200 companies it looked at globally have advanced their A.I. maturity to the stage where they are seeing superior growth and business transformation. (Another 25% are somewhat advanced in their deployment of A.I., while the rest are still just running pilot projects, if anything.) What sets that 12% apart? Well, one factor Accenture identifies is that they have “industrialized” A.I. tools and processes, and that they have created a strong A.I. core team. Other key factors are organizational too: they have top executives who champion A.I. as a strategic priority; they invest heavily in A.I. talent; they design A.I. responsibly from the start; and they prioritize both long- and short-term A.I. projects.
