By Katia Moskvitch

“The dog hid under the bed. Again.”

At any other time, IBM computer scientist Danny Gutfreund, then at IBM’s Haifa lab in Israel, would probably have ignored his older brother’s comment. But this was the third year in a row that the dog, Bona, had hidden under the bed in the morning, before the start of the fireworks marking Israel’s Independence Day. Somehow the dog knew about the imminent loud bangs; something in her brain made her connect the dots. The year was 2012, and the odd yearly pattern in the dog’s behavior quickly became the subject of dinner conversations. The puzzle was too tempting: there’s still so much we don’t know about the brain, be it a dog’s (which learns complex associations from just a few examples) or a human’s. Yet we’ve been trying to imitate it in machines for decades. Intrigued by Bona’s behavior, Danny started working in artificial intelligence (AI).

While researchers have been making great progress in AI, they still haven’t been able to give machines the special ingredient that makes us ‘us’: common sense. We just know, seeing a person walk in soaking wet, that it’s raining outside. Dogs have some basic common sense too, or rather what’s called rapid learning; Bona knew, from observing her owners perhaps set up a BBQ on that specific day, year after year, that fireworks were coming. No robot can even do that — yet.

Today, machines translate languages and recognize objects and speech. But ask a smartphone assistant something more complex than a basic command, and it will struggle. Machines with common sense, which rely on an emerging AI technique known as neurosymbolic AI, could greatly increase the value of AI for businesses and society at large. Such AI would also require far less training data and manual annotation. Supervised learning consumes so much data and energy that, if we keep on our current path of computing growth, by 2040 we’ll exceed the ‘power budget’ of the Earth. There’s simply not enough data or power to continue on with today’s AI.

This is exactly what a new collaboration between IBM and MIT, Harvard and Stanford universities, financed by the US defense department’s research agency DARPA, aims to change. The idea is to get computers to learn like humans by developing the same basic building blocks for learning as a six-month-old infant. The researchers want AI not just to recognize objects, but to understand what it sees and apply reasoning to act accordingly in a new situation.

“There is in principle no barrier to creating artificially intelligent systems that match or exceed our capabilities,” says David Cox, computer scientist and IBM director of the MIT-IBM Watson AI Lab in Cambridge, MA.

Early attempts to simulate the brain

go back to the mid-1950s, with the coining of the term “artificial intelligence” in 1955. The field really kicked off the year after. Back then, AI relied on symbols representing objects and actions, similar to how humans process information. That’s the essence of the once-mainstream approach to machine intelligence called symbolic AI, which is still used today, albeit not very widely. It is based on the idea that humans make sense of the world by creating internal symbolic representations and rules for dealing with them, based on logic. These rules can be turned into a computer algorithm that captures our everyday knowledge, describing, for instance, that if a ball is thrown and there is no wall, it should keep going straight, but if there is a wall, it should bounce back. The computer uses these structured representations of knowledge and applies logic to manipulate them, gaining new knowledge and ‘reasoning’ somewhat as humans do.
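
The ball-and-wall rule can be written down directly as code. The sketch below is only an illustration of the symbolic style, not any historical system: the `Scene` class, its fields and the `predict_ball` function are names made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scene:
    """Symbolic description of a scene: only explicitly stated facts (hypothetical)."""
    ball_thrown: bool
    wall_ahead: bool


def predict_ball(scene: Scene) -> str:
    """Apply hand-written if-then rules to derive a new fact about the ball."""
    if not scene.ball_thrown:
        return "stays at rest"
    if scene.wall_ahead:
        return "bounces back"
    return "keeps going straight"


open_field = predict_ball(Scene(ball_thrown=True, wall_ahead=False))
blocked = predict_ball(Scene(ball_thrown=True, wall_ahead=True))
```

Every situation the program can handle has to be anticipated by a rule someone wrote by hand, which hints at why such systems needed constant updating.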

The IBM 702: a computer used by the first generation of AI researchers.

The field looked promising for over a decade, with the US government in the 1960s pouring billions into symbolic AI research. But the technology wasn’t progressing as fast as expected, the systems were costly and needed constant updating, and would often become less precise when more rules were added to them. AI researchers themselves were getting pessimistic, with doubts spilling into the media. Government funding turned into a trickle and the research stalled around 1974, tossing us straight into the first AI winter, with overpromises followed by under-deliveries.

Symbolic AI simply could not cope with the messiness of the real world. For instance, to create a program that would comb through hundreds of images and select those of a specific person, such systems had to compare any new images to the original one. But if in the new images the person was pictured from a different angle, the program floundered. There were some spikes of progress: in the mid-1980s, a team led by computer scientist Terry Sejnowski decided to challenge symbolic AI and developed a data-fueled program that could learn how to pronounce words, just like babies do.

The 1990s saw major developments through really powerful tools of probabilistic models and statistical inference, the paradigm that gave us the modern field of machine learning. “People might not have called it AI, but it was as much as today’s AI,” says Josh Tenenbaum, an MIT professor of computational cognition. “It’s all about trying to make good guesses, building models that make inferences from patterns of observed data to the underlying causes.” This probabilistic approach led to advances in natural language processing and machine learning, and drove technologies we now take for granted such as scalable internet search at Google.

Then in 2009, Stanford University computer scientist Fei-Fei Li created ImageNet, giving a boost to a different approach to machine intelligence: deep learning. While deep learning has been around since the 1960s, it was the combination of large-scale datasets (such as ImageNet), increasingly powerful computing machinery (such as graphics processing units, or GPUs) and advances in algorithms and programming languages that created the perfect conditions for this revolution.

Deep learning relies on neural networks originally inspired by an attempt to replicate the nerve cells in the brain, the neurons, and all the complex interactions they have when you make a split-second decision like putting your hand out to catch a vase falling from a shelf. Machines can’t yet do tasks like this. Still, during the past decade deep learning-based AI has made huge progress, processing mountains of data, computing complex problems humans struggle with and creating models predicting future outcomes based on previous patterns.

The development of deep learning triggered an AI boom and the field exploded around 2012, launching an era in which bits, the units a computer relies on to process information, converged with synthetic neurons, the basic computing units of a neural net. Researchers and companies suddenly realized that data had much more value than they had ever imagined.

Dan Gutfreund, Computer Scientist, IBM Research

That was the year when Danny Gutfreund became the manager of one of the most ambitious AI initiatives yet: IBM’s Project Debater. His colleague at the Haifa lab, computer scientist Noam Slonim, decided to build it after the supercomputer IBM Watson, using a combination of symbolic AI and probabilistic inference, outwitted two humans in the TV quiz show Jeopardy! in February 2011. Fast-forward eight years, and Project Debater, a neural nets-based brain embodied by a black monolith with blinking blue lights, confronted debate champion Harish Natarajan on 11 February 2019. “I suspect you’ve never debated a machine,” Project Debater said to its rival. “Welcome to the future.” Natarajan chuckled, slightly uneasy at first, but quickly got used to speaking to his digital opponent as if it were human. Able to sift through hundreds of millions of articles and answer queries based on the data it acquired, the AI could be of use to businesses.

And, crucially, Project Debater’s digital brain follows processes similar to those humans go through, to an extent. Its neural nets are driven by data, learning from examples. That’s why neural networks are good at recognizing patterns, be it in language or imagery. But while we need only one or two examples to recognize an object or understand a sentence with an unfamiliar word, a neural net needs hundreds.

Harish Natarajan with his opponent, IBM Project Debater, in San Francisco, 2019

Still, deep learning has led to dramatic advances in many areas. In computer vision, instead of searching for specific pixel patterns, such as edges, as symbolic AI would, the neural net’s algorithm is first trained on many images. It then creates a model so that when faced with a new picture, it outputs a probability for each possible prediction, leading to accurate image recognition. Deep neural networks have also greatly improved natural language processing, enabling machines to perform complex translation between multiple languages. They help us find errors and inconsistencies in heaps of tax returns and assist us in the design of new materials by creating predictive models for unknown molecules: exactly the tasks where symbolic AI fails.
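
To make “a probability for each possible prediction” concrete, here is a minimal sketch of the softmax function, a common way (assumed here, since the article does not name one) to turn a network’s raw output scores into probabilities. The labels and scores are invented for illustration; a real network would compute the scores from the pixels.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that are positive and sum to 1."""
    top = max(scores)
    # Subtracting the largest score avoids overflow without changing the result.
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


labels = ["cat", "dog", "vase"]   # hypothetical classes
scores = [2.0, 1.0, 0.1]          # hypothetical raw network outputs
probs = softmax(scores)
best = labels[probs.index(max(probs))]
```

The prediction is simply the label with the highest probability; the remaining probabilities indicate how strongly the model considered the alternatives.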

But deep learning isn’t without its limitations.


One significant challenge is that neural nets can’t explain how objects relate to each other. As they rely on available data, they can’t reason — they can’t have common sense. “Common sense is all of the implicit knowledge that we have that’s never written down anywhere,” says Cox. “I know that if I take a cup and put it on the table, the table will support it. And even if we ingest a giant corpus of natural language into a machine, we’re not going to find a lot of examples of somebody stating that fact.” For all their awesomeness, neural nets don’t work the way human brains do — and likely never will.

The yearning to solve this common-sense riddle

brought Danny Gutfreund from sunny Haifa to Cambridge, home to MIT and Harvard, on the eastern coast of Massachusetts Bay. He wanted to try something new.

To help machines reason like us, Gutfreund looked to mix the symbolic AI of the past with neural nets, fusing logic and learning. Neural nets, he reasoned, would enhance symbolic AI systems by splitting the world into symbols — recognizing images and translating pixels into a symbolic representation. And symbolic AI algorithms would inject into the neural nets common sense reasoning and domain knowledge. They would apply logic and semantic reasoning to describe relationships between objects, predict various interactions, answer questions and make decisions — just like a human would. He wanted to give neurosymbolic AI a try — a new field, with a handful of groups exploring it.

In Boston, Gutfreund encountered the babies.


Or rather, the data from research spanning several decades into how babies perceive the world. “Many people imagine young babies as passive recipients of environmental experience. They look passive because they can’t do anything,” says Rebecca Saxe, an MIT professor of cognitive neuroscience. Indeed, very young infants can’t yet sit, walk, or talk, and the way they seem to learn may resemble how researchers pre-train machine learning algorithms: scientists feed vast banks of data into software as passive experience, and let machines extract statistics, patterns and structure.

But human infants are not passive, says Saxe. “Right from the very moment that they are born, they are making choices of what their experience is like,” she says. They learn by extracting structure from vast amounts of experience — but they are actively choosing what to look at and what to learn from.

Rebecca Saxe speaking at TEDxCambridge

For years, Saxe has been trying to understand human cognition, observing five-month-old infants and studying their gaze. Her work, as well as that of Harvard cognitive psychologist Elizabeth Spelke and others, has given crucial insights into the processes inside the still-growing brain, drawn from the way a baby looks at an object and for how long. For their experiments, the researchers use near-infrared spectroscopy (NIRS), studying neural activity with light. “You shine light through a baby’s scalp, and then use a detector to measure the amount of reflectance of two different wavelengths,” says Saxe. “That tells you the relative oxygenation of the blood in the brain, because when neurons are more active, they consume more oxygen.”

The measurements have helped her understand changes in the neural activity. A baby may be looking at something for longer because it’s a familiar object, or because he or she likes it, or finds it surprising, or perhaps scary. Different triggers lead to sparks of activity in different brain regions. “It surprises me that you can measure a baby’s cognition from their gaze,” says Saxe. “It surprises me that you can disentangle their different motivations using neuroimaging. It’s pretty wild, but it seems to be working.”

Intrigued by Saxe’s results, Gutfreund, Cox and their IBM AI colleagues in Boston decided to have a chat with her about a possible collaboration. What if we combine psychology and neuroscience with machine learning, they reckoned, to eventually try to apply theories about infant cognition to AI algorithms? “The hypothesis is that as we make AI more like babies in those ways, we will get insights that will push new AI away from doing just pattern classification, which it mostly is right now, and towards being actual reasoning and cognition,” says Saxe.

In addition to Saxe, the IBM team also joined forces with Elizabeth Spelke and Josh Tenenbaum, MIT professor of cognitive science and computation, along with Harvard psychology professor Tomer Ullman and MIT computer scientist Vikash Mansinghka. The MIT and Harvard researchers have an intriguing approach to AI, drawing on the insights of Spelke, Saxe and others about infant minds: they posit that humans are born with a pre-programmed rough understanding of the world, in some ways analogous to the game engines used to build interactive immersive video games. This “game engine in the head” provides the ability to simulate the world and our interactions with it, and serves as the target of perception and the world model that guides our planning.

Crucially, this game engine learns from data, starting in infancy, to be able to model the actual situations — the endless range of “games” — we find ourselves in. It is approximate yet gets more and more efficient — to the point that very quickly, humans make instant mental approximations that are good enough to thrive in the world. And, the researchers think, it’s possible to replicate this type of system in a machine by embedding ideas and tools from game engine design inside frameworks for neurosymbolic AI and probabilistic modeling and inference known as probabilistic programs.

In August 2019, the researchers got to work,

aiming to give machines true common sense — by reverse-engineering a child’s brain. Soon, more scientists joined, including other developmental psychologists, computational neuroscientists, computer scientists and cognitive scientists from MIT, Harvard and Stanford. With the blessing of DARPA, which awarded the collaboration several million dollars for a four-year project to research and build computational models mimicking core cognitive capabilities of babies, Gutfreund and colleagues embarked on an ambitious adventure. Because, according to DARPA, the absence of common sense is the most significant barrier between the narrowly-focused AI applications of today and the more general, human-like AI systems hoped for in the future.

David Cox

“I think this is where we have to go,” says Cox. “AI has gone in common waves of winters and springs: we overpromise, then underdeliver. We’re in an AI spring right now. And I think it’s existential that AI research moves in this direction — learning like babies do.”

Recently, researchers from the MIT and Harvard teams created an algorithm that combines neural networks and symbolic AI, powered by a probabilistic physics inference model, to track and react to objects as they move and possibly become suddenly hidden from view. Babies already know by the time they are three months old that if this happens, the object they can no longer see will remain in place and not vanish.

To get the machine to learn this common-sense knowledge, the researchers relied on a deep neural network to identify the physical properties of the objects — their shape type, location and velocity. The model translated the pixels in the video to symbolic representations. Then, feeding on the symbols, a probabilistic physics-based reasoning model tracked how the scene unfolded, indicating any unexpected event — such as an object suddenly vanishing. The machine did well — when the cube in the simulation suddenly disappeared after the blocking object was removed, the software flagged it as an implausible event — just like a baby would look at the empty space for longer, surprised at the violation of physics.
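
The second half of that pipeline, reasoning over symbols to spot an impossible event, can be sketched in a few lines. This is a deliberately simplified, deterministic stand-in for the researchers’ probabilistic physics model, not their implementation: the frame format, the one-dimensional positions and the function name are all assumptions made for this example, and the perception step (the neural network that would produce these symbols from video) is skipped.

```python
def find_implausible_frames(frames, velocity=0.0):
    """Return indices of frames where the cube is missing with nothing hiding it.

    Each frame is a dict of symbols (hypothetical format):
      "cube":     observed x-position of the cube, or None if it is not seen
      "occluder": (left, right) interval hidden from view, or None
    """
    flagged = []
    believed_x = None  # where the tracker believes the cube currently is
    for index, frame in enumerate(frames):
        seen = frame["cube"]
        if seen is not None:
            believed_x = seen          # observation overrides the belief
            continue
        if believed_x is None:
            continue                   # cube has never been seen; nothing to expect
        believed_x += velocity         # simple physics: constant velocity
        occluder = frame["occluder"]
        hidden = occluder is not None and occluder[0] <= believed_x <= occluder[1]
        if not hidden:
            flagged.append(index)      # object vanished without an explanation
    return flagged


frames = [
    {"cube": 5.0, "occluder": None},          # cube in plain view
    {"cube": None, "occluder": (4.0, 6.0)},   # a screen covers the cube
    {"cube": None, "occluder": None},         # screen removed, cube is gone
]
surprises = find_implausible_frames(frames)
```

Only the last frame is flagged: while the screen is in place the missing cube is explained, but once it is removed the belief that the cube should still be there is violated.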

And it’s not just vision that makes us reason the way we do. It’s also language. “Our ability to bridge from our perceptual systems, including vision, to language is crucial to our intelligence: that we can talk about the things that we perceive and imagine scenes when we talk,” says Roger Levy, a cognitive scientist at MIT. “I can tell you about a silver dusted porcupine that’s living five miles underwater, on the ocean floor near a coral reef. That definitely doesn’t exist in the world, it’s absurd. But you probably have a very rich picture in your mind of what that’s like right now, because you can go from perception to mental representations to language and back.” We should be able to recreate all of this in a machine, he adds, and also include all the other sensory modalities that connect to language.

There are multiple groups looking into the language

side of machine intelligence, among them another team at IBM. Led by Alexander Gray, VP of IBM AI Science based in the company’s Yorktown lab near New York, the researchers are relying on recent advances in statistical AI for natural language processing. “Classical AI is not cool anymore; deep learning is cool. So we’re definitely in a minority — or you can look at it as we’re ahead of the game,” laughs Gray. “We think we’re ahead of the game.”

His aim is to gradually move from pure black box neural net models to models that can be understood as logic-like knowledge — but not necessarily the knowledge elicited from humans. “You can’t rely on a bunch of humans to write down all the knowledge in the world,” says Gray. “Instead, we’re going to learn that knowledge, to acquire it automatically from text.”

For the past few years, Gray and his team have been using so-called semantic parsing, translating a natural language sentence into a logic-like sentence — mapping the words to explicit symbolic concepts. “Take the phrase ‘Mary had a little lamb’ — we will identify the word Mary, and map it to the concept of a person within a knowledge graph, allowing the use of other rich information, such as the fact that a person is a kind of mammal, which is a kind of living thing, and so on. This allows us to apply common sense knowledge that the machine can use to perform more general tasks,” says Gray.
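
Gray’s example can be mimicked with a toy knowledge graph. The sketch below assumes a plain dictionary of “is a kind of” links holding only the facts from the quote; the names `IS_A`, `ancestors` and `is_kind_of` are invented for illustration, and a real semantic parser would produce the mapping from word to concept automatically.

```python
# Toy knowledge graph: each concept points to the broader concept it belongs to.
IS_A = {
    "Mary": "person",
    "person": "mammal",
    "mammal": "living thing",
}


def ancestors(concept, is_a=IS_A):
    """Follow 'is a' links upward and return every broader concept, in order."""
    chain = []
    visited = set()
    while concept in is_a and concept not in visited:
        visited.add(concept)           # guards against accidental cycles
        concept = is_a[concept]
        chain.append(concept)
    return chain


def is_kind_of(concept, category, is_a=IS_A):
    """Answer questions such as 'Is Mary a living thing?' by walking the graph."""
    return category in ancestors(concept, is_a)


mary_chain = ancestors("Mary")
mary_is_alive = is_kind_of("Mary", "living thing")
```

Nothing in the sentence says that Mary is alive; the answer comes from chaining facts already in the graph, which is the kind of inference a logic-like representation makes possible.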

Another part of the research program will automatically acquire that knowledge. “The advantage of using a model which has a logic-like form is that you can then perform reasoning to get the answer to more sophisticated questions,” says Gray. “This is a possible path to true natural language understanding.”

Another team at the MIT-IBM Watson AI Lab is also interested in combining vision and language. The researchers developed an algorithm called the Neuro-Symbolic Concept Learner, where an AI with two neural networks answers questions about objects in images. One network creates a table with characteristics of the objects such as color, location and size. The other one is trained on question-answer pairs, such as “What’s the color of the cube?” — “Red.” That neural net then transforms each question into a symbolic AI program that references the table to get an answer.
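
The division of labor described here can be illustrated with a toy version of the final, symbolic step. In the sketch below the object table and the program are written by hand, whereas in the system described above both would come from neural networks; the operation names and the table layout are assumptions for this example, not the Neuro-Symbolic Concept Learner’s actual format.

```python
# Table of object attributes, as the first (vision) network might produce it.
scene_table = [
    {"shape": "cube", "color": "red", "size": "large"},
    {"shape": "sphere", "color": "blue", "size": "small"},
]


def run_program(program, table):
    """Execute a list of (operation, argument) steps against the object table."""
    result = table
    for operation, argument in program:
        if operation == "filter_shape":
            result = [obj for obj in result if obj["shape"] == argument]
        elif operation == "query":
            if len(result) != 1:
                raise ValueError("query needs exactly one matching object")
            result = result[0][argument]
        else:
            raise ValueError(f"unknown operation: {operation}")
    return result


# "What's the color of the cube?" expressed as a symbolic program.
program = [("filter_shape", "cube"), ("query", "color")]
answer = run_program(program, scene_table)
```

Because the answer is produced by explicit steps over an explicit table, each stage can be inspected, unlike the internals of an end-to-end neural network.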

That’s perception — the equivalent of a photon of light hitting our retina and streaming the visual data into the brain, which then translates it into something we can describe in language. Crucially, the researchers have been able to gradually relax the amount of innate knowledge that the system has to have. First, it had to know the different kinds of objects that were there, their colors and sizes. Then the system knew that there was something called color but it didn’t know that blue or red are colors — it had to figure it out from context and learn implicitly how it was tied to language. And finally, the system didn’t even know that color was a concept, it had to figure it out and learn what the color corresponds to. “They’ve been on this interesting progression where the system is given less and less and it has to learn — basically, developing common sense,” says Cox.

These experiments are just scratching the surface:

there’s a lot more to learn about the brain. Perhaps in a roundabout way, our progress in giving machines the ability to learn and reason like humans will also help us understand how babies know that objects don’t teleport. Perhaps it will even help Danny Gutfreund finally find out how his brother’s dog knew that every year at around the same time there would be fireworks.

But most importantly, this neurosymbolic AI research should help us build the machines of the future: autonomous systems able to accomplish tasks without external input, which is vitally important in critical situations such as natural disasters or industrial accidents.


We may be in an AI spring right now, but there’s a long way to get to an AI summer — and to stay there without overhyping the research and underdelivering. “We’re still working in a petri dish. But we are now finally starting to mimic the human ability to acquire common sense knowledge in an unsupervised way, by acting and being in an environment of learning,” says Cox. “Yes, AI research is still a very simple toy world. But a toy world that at last holds a lot of promise.”

Translated from: https://medium.com/swlh/neurosymbolic-ai-to-give-us-machines-with-true-common-sense-9c133b78ab13

