Visual Turing test

Researchers at MIT, New York University, and the University of Toronto have developed a computer system whose ability to produce a variation of a character in an unfamiliar writing system, on the first try, is indistinguishable from that of humans.

That means that the system in some sense discerns what’s essential to the character — its general structure — but also what’s inessential — the minor variations characteristic of any one instance of it.

As such, the researchers argue, their system captures something of the elasticity of human concepts, which often have fuzzy boundaries but still seem to delimit coherent categories. It also mimics the human ability to learn new concepts from few examples. It thus offers hope, they say, that the type of computational structure it’s built on, called a probabilistic program, could help model human acquisition of more sophisticated concepts as well.

“In the current AI landscape, there’s been a lot of focus on classifying patterns,” says Josh Tenenbaum, a professor in the Department of Brain and Cognitive sciences at MIT, a principal investigator in the MIT Center for Brains, Minds and Machines, and one of the new system’s co-developers. “But what’s been lost is that intelligence isn’t just about classifying or recognizing; it’s about thinking.”

“This is partly why, even though we’re studying hand-written characters, we’re not shy about using a word like ‘concept,’” he adds. “Because there are a bunch of things that we do with even much richer, more complex concepts that we can do with these characters. We can understand what they’re built out of. We can understand the parts. We can understand how to use them in different ways, how to make new ones.”

The new system was the thesis work of Brenden Lake, who earned his PhD in cognitive science from MIT last year as a member of Tenenbaum’s group, and who won the Glushko Prize for outstanding dissertations from the Cognitive Science Society. Lake, who is now a postdoc at New York University, is first author on a paper describing the work in the latest issue of the journal Science. He’s joined by Tenenbaum and Ruslan Salakhutdinov, an assistant professor of computer science at the University of Toronto who was a postdoc in Tenenbaum’s group from 2009 to 2011.

Rough ideas

“We analyzed these three core principles throughout the paper,” Lake says. “The first we called compositionality, which is the idea that representations are built up from simpler primitives. Another is causality, which is that the model represents the abstract causal structure of how characters are generated. And the last one was learning to learn, this idea that knowledge of previous concepts can help support the learning of new concepts. Those ideas are relatively general. They can apply to characters, but they could apply to many other types of concepts.”

The researchers subjected their system to a battery of tests. In one, they presented it with a single example of a character in a writing system it had never seen before and asked it to produce new instances of the same character — not identical copies, but nine different variations on the same character. In another test, they presented it with several characters in an unfamiliar writing system and asked it to produce new characters that were in some way similar. And in a final test, they asked it to make up entirely new characters in a hypothetical writing system.

Human subjects were then asked to perform the same three tasks. Finally, a separate group of human judges was asked to distinguish the human subjects’ work from the machine’s. Across all three tasks, the judges could identify the machine outputs with about 50 percent accuracy — no better than chance.

Conventional machine-learning systems — such as the ones that led to the speech-recognition algorithms on smartphones — often perform very well on constrained classification tasks, but they must first be trained on huge sets of training data. Humans, by contrast, frequently grasp concepts after just a few examples. That type of “one-shot learning” is something that the researchers designed their system to emulate.

Learning to learn

Like a human subject, however, the system comes to a new task with substantial background knowledge, which in this case is captured by a probabilistic program. Whereas a conventional computer program systematically decomposes a high-level task into its most basic computations, a probabilistic program requires only a very sketchy model of the data it will operate on. Inference algorithms then fill in the details of the model by analyzing a host of examples.