Kill Artificial Intelligence

    Tech · 2025-03-16

    1. Background
    2. History of AI
    3. Limits of ML
    4. Next Steps
    5. Killing AI
    6. Closing Thoughts

    TL;DR: Data scientists have a responsibility to shepherd the term artificial intelligence out of the world. We need to show maturity in walking back the near-unattainable promise embedded in this misused phrase.

    Background

    “That’s a really strange pronunciation of machine learning, friend.”

    This is my response to a fellow data scientist’s use of the words artificial intelligence during a mid-week phone call about how we might grow our business.

    He pauses, and, cognizant that snarky interruptions aren’t an effective teaching tool, I take a deep breath and explain: “Let’s say machine learning because it more accurately describes the process of teaching mathematical models to generate insights by providing them with training examples.”

    As I’ve deepened my understanding of data science, I’ve increasingly found that the phrase artificial intelligence sparkles with false promise:

    The term misleads decision-makers into investing in advanced analytics before their organization has reached sufficient data maturity. After all, intelligence should be able to cope with a little data complexity, right?

    It confuses the heck out of young students, who might’ve paid a little extra attention in their statistics class if they understood that was the path to a promising career in data science.

    It distracts from tried and true methodologies within data science.

    I’m ready to kill artificial intelligence.

    History of Artificial Intelligence

    The academic discipline got its start in 1955 with the goal of creating machines capable of mimicking human cognitive function. Learning, problem solving, applying common sense, operating under conditions of ambiguity — taken together, these traits form the basis for general intelligence, the long-standing goal of artificial intelligence.

    Since inception, AI research has experienced boom-and-bust cycles, fueled by an abundance of optimism followed by a collapse of funding. These setbacks have been so dramatic and so endemic to the field that they received their own neologism: AI winter. The two most dramatic winters occurred in the mid-to-late ’70s and from the mid-’80s to the mid-’90s. Failure to appropriately manage hype is the commonly cited cause of this unfortunate antipattern.

    In the words of the chief data scientist of the Strategic Artificial Intelligence Lab, T. Scott Clendaniel:

    “I’m really concerned that we’re going to enter a third AI winter … because I think the field has been so overhyped that there’s no way it can possibly live up to expectations.”

    Lately, the machine learning community has been fielding a lot of journalistic inquiries about AI. In May 2020, OpenAI released their GPT-3 model for natural language processing. The cost to initially train the model was $4.6 million, requiring 355 GPU years of computational power. As of now, OpenAI has released the model through a controlled API access point rather than making the code freely available to researchers.

    Setting aside concerns of impracticality and inaccessibility for a moment — GPT-3 has produced some impressive feats. And yet, importantly, this overhyped development does not move us closer to artificial intelligence.

    If advancing research into AGI is analogous to sending a spacecraft to explore Mars, then the development of GPT-3 is more or less analogous to investing $4.6 million into a rocket that creates a beautiful fire cloud of exhaust without ever leaving the launchpad.

    Limits of Machine Learning

    The general consensus of the research community is that AGI won’t be attained by deepening machine learning techniques.

    The creation of natural intelligence costs on average $125,000 and many sleepless nights, but it’s ultimately very fulfilling. Photo by Shirota Yuri on Unsplash.

    Machine learning capabilities are narrow. An ML algorithm may be able to achieve better-than-human performance but only on exceedingly specific tasks and only after immensely expensive training.

    Human capabilities are comparatively broad. We seem to be exceptionally good at one-shot learning — making inferences and assigning categories based on a very few examples. Young children quickly master tasks of matching, sorting, comparing, and ordering. Infants have an innate desire to explore novelty and draw conclusions about the wider world.

    These capacities are as yet unmatched by machine intelligence. There’s somewhat of a paradox in the fact that machine learning can defeat a grandmaster at Go, but a robot can’t beat a toddler at sorting blocks.

    Following the path of improved machine learning techniques seems unlikely to result in the attainment of human levels of common-sense reasoning or versatile problem solving. To quote machine learning pioneer Stuart Russell:

    “I don’t think deep learning evolves into AGI. Artificial General Intelligence is not going to be reached by just having bigger deep learning networks and more data… Deep learning systems don’t know anything, they can’t reason, and they can’t accumulate knowledge, they can’t apply what they learned in one context to solve problems in another context etc. And these are just elementary things that humans do all the time.”

    In other words, statistics-based solutions are fairly good at interpolation — i.e., drawing conclusions about novel examples that fall within the bounds of data they’ve already seen. They aren’t very good at extrapolation — i.e., using what they’ve learned to make conclusions about the broader world.

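To make the interpolation/extrapolation distinction concrete, here’s a minimal sketch, assuming only NumPy (the polynomial stands in for any statistics-based model): fit a curve to sin(x) on a fixed interval, then evaluate it inside and just beyond that interval.

```python
import numpy as np

# Fit a degree-7 polynomial to sin(x) on [0, 2*pi] -- a toy
# "statistics-based solution" trained on a bounded slice of the world.
x_train = np.linspace(0, 2 * np.pi, 200)
y_train = np.sin(x_train)
coeffs = np.polyfit(x_train, y_train, deg=7)

# Interpolation: predictions inside the training range track sin(x) closely.
x_in = np.linspace(0.5, 5.5, 50)
err_in = np.max(np.abs(np.polyval(coeffs, x_in) - np.sin(x_in)))

# Extrapolation: one period past the training range, the fit falls apart,
# even though sin(x) itself just keeps repeating.
x_out = np.linspace(3 * np.pi, 4 * np.pi, 50)
err_out = np.max(np.abs(np.polyval(coeffs, x_out) - np.sin(x_out)))

print(f"max interpolation error: {err_in:.4f}")
print(f"max extrapolation error: {err_out:.1f}")
```

The interpolation error stays small while the extrapolation error explodes, which is the whole point: nothing in the fitted coefficients encodes the idea "this function is periodic."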

    It’s possible that “the greatest trick AI ever pulled was convincing the world it exists.”

    Next Steps

    I think it’s extremely important for machine learning researchers to ask themselves if they’re solving the right challenges.

    Here’s a YouTube video about StarGAN v2, a machine learning model that can take a cat photo and use it to create a bunch of similar images inspired by photos of dogs. Meanwhile, data-quality issues cost U.S. organizations $3.1 trillion a year according to analysis from IBM.

    Perhaps removing AI from our lexicon will help debunk the notion that an artificially intelligent tool can address substantive data-management failures. Low-quality data is unfortunately ubiquitous — and it’ll impair business function and impede the implementation of even the simplest of advanced-analytics tools.

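Against that ubiquity, even a few lines of profiling can surface the most common defects before any modeling begins. A minimal sketch with pandas — the table, column names, and checks are hypothetical, not from any particular system:

```python
import pandas as pd

# A hypothetical customer table with the kinds of defects that derail
# analytics long before any "AI" enters the picture.
df = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 4],
    "email": ["a@x.com", None, None, "c@x.com", "not-an-email"],
    "signup_date": ["2020-01-05", "2020-02-30", "2020-02-11", None, "2020-03-01"],
})

report = {
    # Rows whose id already appeared earlier in the table.
    "duplicate_ids": int(df["customer_id"].duplicated().sum()),
    # Outright missing email addresses.
    "missing_email": int(df["email"].isna().sum()),
    # Present but malformed emails (crude check: no "@"); NaN is skipped.
    "bad_email_format": int((~df["email"].str.contains("@", na=True)).sum()),
    # Dates that are missing or don't parse (e.g., February 30th).
    "unparseable_dates": int(
        pd.to_datetime(df["signup_date"], errors="coerce").isna().sum()
    ),
}
print(report)
```

A report like this won’t fix the pipeline, but it turns "our data is bad" into specific, countable problems with owners.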

    Killing Artificial Intelligence

    If you’re a data scientist or machine learning engineer, I hope your takeaway from this article is a sense of responsibility to quit feeding into the hype around AI. Unless you’re on the absolute cutting edge of AI research, your use of the term should be reserved for discussion of the superintelligence control problem and other future-oriented considerations.

    If you’re a nontechnical person, you can safely replace just about every use of artificial intelligence with very, very advanced statistics. This is certainly true for any tool available on the marketplace today. There’s still room for philosophical discussion around questions of artificial general intelligence, but that technology is still a long ways off.

    And if you happen to be a business leader, particularly one with AI in your title, here’s what Anil Chaudhry, the director of AI implementations at the U.S. GSA, has to say about his role:

    “I describe AI to people as augmented intelligence, not artificial intelligence.”

    Even leaders with AI in their title are cringing away from AI.

    In summary, the future vision of artificial intelligence won’t be achieved through contemporary methods. The hype around massive, impractical models such as GPT-3 reveals a lack of understanding about the current state of machine intelligence — or lack thereof.

    The overuse of artificial intelligence isn’t just a whimsical exaggeration — it’s damaging to the data science community and risks tipping the field into a crisis of confidence.

    Closing Thoughts

    Here are three trends I see for data science in the next three to five years.

    Stewardship of the language used in our field

    Clearly, I feel strongly that there’s an imperative to remove AI from the lexicon.

    Increased reliance on human-centered design to identify risks associated with machine learning malpractice

    Not all errors should be treated as equally bad.

    To quote Stuart Russell again:

    “Some kinds of errors are relatively cheap. Whereas classifying a human as a gorilla, as Google found out, is really expensive, like in the billions of dollars of trashing your goodwill and global reputation. I’m sure it was sort of an innocent error, coming from just using a uniform loss function.”

    Identifying the gravity of this potential mistake from the outset of the design process could have saved the ML engineering team at Google a lot of heartache. Incorporating human-centered design into the model-creation process doesn’t just help with selection of the optimal loss function, but it also helps identify potential sources of bias (e.g., racially unbalanced training data) and other risks.

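One way to bake that gravity into evaluation is to replace the uniform 0/1 loss with an explicit cost matrix, so grave errors are penalized more than cheap ones. A minimal sketch — the labels and cost values are illustrative, not drawn from any real system:

```python
import numpy as np

# cost[i][j] = cost of predicting labels[j] when the truth is labels[i].
# A uniform loss would make every off-diagonal entry 1.
labels = ["cat", "dog", "person"]
cost = np.array([
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0],
    [5.0, 5.0, 0.0],   # misclassifying a person is far more costly
])

y_true = np.array([0, 1, 2, 2, 0])  # indices into labels
y_pred = np.array([0, 2, 2, 1, 1])

# Uniform loss treats all errors alike; the weighted loss surfaces
# that one of these mistakes matters much more than the others.
uniform_loss = float(np.mean(y_true != y_pred))
weighted_loss = float(np.mean(cost[y_true, y_pred]))
print(uniform_loss, weighted_loss)
```

The two numbers rank the same predictions very differently, which is exactly the conversation a human-centered design review should force before training begins: which cells of that matrix are expensive, and who decides?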

    A return to the fundamentals, including a renewed focus on understanding the end-to-end process of data generation

    I’d love to see data scientists develop a breadth of skills spanning the data-generation pipeline.

    Quoting again, this time from Harvard professor Gary King:

    “You have to know the whole data generation process from which the data emerge… We always focus on that whole chain of evidence… After all, we’re studying the world — we’re not studying some bunch of data.”

    This is why I think it’s crucial for data scientists to develop familiarity with the principles of end-to-end data strategy.

    Translated from: https://medium.com/better-programming/kill-artificial-intelligence-7bc02f85ea70
