By James Bridle, from New Dark Age, which was published this month by Verso. Bridle is a writer and artist.
Here’s a story about how machines learn. Say you are the US Army and you want to be able to locate enemy tanks in a forest. The tanks are painted with camouflage, parked among trees, and covered in brush. To the human eye, the blocky outlines of the tanks are indistinguishable from the foliage. But you develop another way of seeing: you train a machine to identify the tanks. To teach the machine, you take a hundred photos of tanks in the forest, then a hundred photos of the empty forest. You show half of each set to a neural network, a piece of software designed to mimic a human brain. The neural network doesn’t know anything about tanks and forests; it just knows that there are fifty pictures with something important in them and fifty pictures without that something, and it tries to spot the difference. It examines the photos again and again, tweaking its internal connections and judging the results, without any of the distracting preconceptions inherent in the human brain.
When the network has finished learning, you take the remaining photos—fifty of tanks, fifty of empty forest—which it has never seen before, and ask it to sort them. And it does so, perfectly. But once out in the field, the machine fails miserably. In practice it turns out to be about as good at spotting tanks as a coin toss. What happened?
The story goes that when the US Army tried this exercise, it made a crucial error. The photos of tanks were taken in the morning, under clear skies. Then the tanks were removed, and by afternoon, when the photos of the empty forest were taken, the sky had clouded over. The machine hadn’t learned to discern the presence or absence of tanks, but merely whether it was sunny or not.
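To make the failure concrete, here is a minimal sketch, in Python with NumPy and scikit-learn, of the train-and-test procedure the story describes. Everything in it is an illustrative assumption: the "photos" are synthetic grids of pixel brightness, the image size and lighting values are invented, and the model is a simple logistic regression rather than anything the Army might actually have built. The point is only that a classifier trained on mismatched lighting learns the lighting.

```python
# A minimal sketch of the train/test procedure from the tank story, using
# synthetic "photos". All sizes and brightness values are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def fake_photos(n, brightness):
    """Each 'photo' is a 32x32 grid of pixel intensities around a mean brightness."""
    return rng.normal(loc=brightness, scale=0.1, size=(n, 32 * 32))

# Tanks photographed on a sunny morning; empty forest on a cloudy afternoon.
tanks  = fake_photos(100, brightness=0.8)   # label 1
forest = fake_photos(100, brightness=0.4)   # label 0

X = np.vstack([tanks, forest])
y = np.array([1] * 100 + [0] * 100)

# Show half of each set to the model; hold the rest back for testing.
train_idx = np.r_[0:50, 100:150]
test_idx  = np.r_[50:100, 150:200]

model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
print("held-out accuracy:", model.score(X[test_idx], y[test_idx]))   # near 1.0

# "In the field": tanks and forest photographed under the same light.
field_tanks  = fake_photos(50, brightness=0.6)
field_forest = fake_photos(50, brightness=0.6)
X_field = np.vstack([field_tanks, field_forest])
y_field = np.array([1] * 50 + [0] * 50)
print("field accuracy:", model.score(X_field, y_field))              # roughly a coin toss
```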
This cautionary tale, repeated often in the academic literature on machine learning, is probably apocryphal, but it illustrates an important question about artificial intelligence: What can we know about what a machine knows? Whatever artificial intelligence might come to be, it will be fundamentally different from us, and ultimately inscrutable. Despite increasingly sophisticated systems of computation and visualization, we do not truly understand how machine learning does what it does; we can only adjudicate the results.
The first neural network, developed in the Fifties for the United States Office of Naval Research, was called the Perceptron. Like many early computers, it was a physical machine: a set of four hundred light-detecting cells randomly connected by wires to switches. The idea behind the Perceptron was connectionism: the belief that intelligence comes from the connections between neurons, and that by imitating these winding pathways of the brain, machines might be induced to think.
One of the advocates of connectionism was Friedrich Hayek, best known today as the father of neoliberalism. Hayek believed in a fundamental separation between the sensory world of the mind—unknowable, unique to each individual—and the “natural,” external world. Thus the task of science was the construction of a model of the world that ignored human biases. Hayek’s neoliberal ordering of economics, where an impartial and dispassionate market directs the action, offers a clear parallel.
The connectionist model of artificial intelligence fell out of favor for several decades, but it reigns supreme again today. Its primary proponents are those who, like Hayek, believe that the world has a natural order that can be examined and computed without bias.
In the past few years, several important advances in computing have spurred a renaissance of neural networks and led to a revolution in expectations for artificial intelligence. One of the greatest champions of AI is Google; cofounder Sergey Brin once said, “You should presume that someday, we will be able to make machines that can reason, think, and do things better than we can.”
A typical first task for testing intelligent systems is image recognition, something that is relatively easy for companies like Google, which builds ever-larger networks of ever-faster processors while harvesting ever-greater volumes of data from its users. In 2011, Google revealed a project called Google Brain, and soon announced that it had created a neural network using a cluster of a thousand machines containing some 16,000 processors. This network was fed 10 million unlabeled images culled from YouTube videos, and developed the ability to recognize human faces (and cats) with no prior knowledge about what those things signified. Facebook, which had developed a similar program, used 4 million user images to create a piece of software called DeepFace, which can recognize individuals with 97 percent accuracy.
Soon this software will be used not only to recognize but to predict. Two researchers from Shanghai Jiao Tong University recently trained a neural network with the ID photos of 1,126 people with no criminal record and 730 photos of convicted criminals. In a paper published in 2016, they claimed that the software could tell the difference between criminal and noncriminal faces—that is, it used photos of faces to make inferences about criminality.
The paper provoked an uproar on technology blogs, in international newspapers, and among academics. The researchers were accused of reviving nineteenth-century theories of criminal physiognomy and attacked for developing a facial recognition method that amounted to digital phrenology. Appalled at the backlash, they responded, “Like most technologies, machine learning is neutral,” and insisted that if machine learning “can be used to reinforce human biases in social computing problems … then it can also be used to detect and correct human biases.” But machines don’t correct our flaws—they replicate them.
Technology does not emerge from a vacuum; it is the reification of the beliefs and desires of its creators. It is assembled from ideas and fantasies developed through evolution and culture, pedagogy and debate, endlessly entangled and enfolded. The very idea of criminality is a legacy of nineteenth-century moral philosophy, and the neural networks used to “infer” it are, as we’ve seen, the products of Hayek’s worldview: the apparent separation of the mind and the world, the apparent neutrality of this separation. The belief in an objective schism between technology and the world is a fiction, and one with very real consequences.
Encoded biases are frequently found hidden in new devices: cameras unwittingly optimized for Caucasian eyes, say, or light skin. These biases, given time and thought, can be detected, understood, and corrected for. But there are further consequences of machine learning that we cannot recognize or understand, because they are produced by new models of automated thought, by cognitive processes utterly unlike our own.
Machine thought now operates at a scale beyond human understanding. In 2016, the Google Translate system started using a neural network developed by Google Brain, and its abilities improved dramatically. Since its launch in 2006, the system had used a technique known as statistical machine translation, which compared vast corpora of parallel texts in different languages, with no attempt to understand how languages actually worked. It was clumsy, the results too literal, more often a source of humor than a display of sophisticated intelligence.
Reprogrammed by Google Brain, the Translate network no longer simply cross-references great volumes of text and produces a set of two-dimensional connections between words, but rather builds its own model of the world: a map of the entire territory. In this new architecture, words are encoded by their distance from one another in a mesh of meaning that only the computer can comprehend. While a human can draw a line between the words “tank” and “water” easily enough, it quickly becomes impossible to add the lines between “tank” and “revolution,” between “water” and “liquidity,” and all the emotions and inferences that cascade from those connections. The Translate network’s map does it easily because it is multidimensional, extending in more directions than the human mind can conceive. Thus the space in which machine learning creates its meaning is, to us, unseeable.
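The flavor of that mesh can be suggested with a toy sketch in Python. The vectors below are invented purely for illustration; a system like Translate learns vectors with hundreds of dimensions from data, and no human writes them or can read their axes. Distance between words then becomes a simple calculation, whatever the number of dimensions.

```python
# A toy sketch of "distance in a mesh of meaning". These 4-dimensional vectors
# are hand-invented for illustration; real learned embeddings have hundreds of
# dimensions and are not written by anyone.
import numpy as np

embeddings = {
    "tank":       np.array([0.9, 0.1, 0.7, 0.0]),
    "water":      np.array([0.8, 0.0, 0.1, 0.6]),
    "revolution": np.array([0.2, 0.9, 0.6, 0.1]),
    "liquidity":  np.array([0.1, 0.2, 0.0, 0.9]),
}

def closeness(a, b):
    """Cosine similarity: 1.0 means the vectors point the same way."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

for w1, w2 in [("tank", "water"), ("tank", "revolution"), ("water", "liquidity")]:
    print(f"{w1} ~ {w2}: {closeness(embeddings[w1], embeddings[w2]):.2f}")
```

The arithmetic works identically in four dimensions or a thousand; what changes is that no human can picture the space in which it happens.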
Our inability to visualize is also an inability to understand. In 1997, when Garry Kasparov, the world chess champion, was defeated by the supercomputer Deep Blue, he claimed that some of the computer’s moves were so intelligent and creative that they must have been the result of human intervention. But we know quite well how Deep Blue made those moves: it was capable of analyzing 200 million board positions per second. Kasparov was not outthought; he was outgunned by a machine that could hold more potential outcomes in its mind.
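Deep Blue's style of play, exhaustive search through possible futures, can be shown at miniature scale. The sketch below, in Python, plays tic-tac-toe rather than chess and uses plain minimax with none of Deep Blue's specialized hardware or evaluation functions; it is meant only to show what "analyzing board positions" looks like as a procedure.

```python
# A miniature illustration of brute-force game-tree search (minimax), applied to
# tic-tac-toe. Deep Blue's search was vastly more sophisticated and ran on
# purpose-built hardware; the shape of the procedure is the same.
WIN_LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(board):
    for a, b, c in WIN_LINES:
        if board[a] and board[a] == board[b] == board[c]:
            return board[a]
    return None

def best_move(board, player, counter):
    """Try every legal move, score every resulting position, keep the best."""
    counter[0] += 1
    w = winner(board)
    if w:                                   # the previous player has already won
        return (1 if w == player else -1), None
    moves = [i for i, cell in enumerate(board) if cell is None]
    if not moves:
        return 0, None                      # draw
    best_score, best = -2, None
    for m in moves:
        board[m] = player
        opp_score, _ = best_move(board, "O" if player == "X" else "X", counter)
        board[m] = None
        if -opp_score > best_score:
            best_score, best = -opp_score, m
    return best_score, best

counter = [0]
score, move = best_move([None] * 9, "X", counter)
print(f"examined {counter[0]} positions; opening move {move} guarantees at least a draw")
```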
By 2016, when Google’s AlphaGo software defeated Lee Sedol, one of the highest-ranked go players in the world, something crucial had changed. In their second game, AlphaGo stunned Sedol and spectators by placing one of its stones on the far side of the board, seeming to abandon the battle in progress. Fan Hui, another professional go player watching the game, was initially mystified. He later commented, “It’s not a human move. I’ve never seen a human play this move.” He added: “So beautiful.” Nobody in the history of the 2,500-year-old game had ever played in such a fashion. AlphaGo went on to win the game, and the series.
AlphaGo’s engineers developed the software by feeding a neural network millions of moves by expert go players, then having it play itself millions of times, rapidly, learning new strategies that outstripped those of human players. Those strategies are, moreover, unknowable—we can see the moves AlphaGo makes, but not how it decides to make them.
The same process that Google Translate uses to connect and transform words can be applied to anything described mathematically, such as images. Given a set of photographs of smiling women, unsmiling women, and unsmiling men, a neural network can produce entirely new images of smiling men, as shown in a paper published in 2015 by Facebook researchers.
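What makes this possible is that the images are mapped into the same kind of multidimensional space as the words above, where "smiling" becomes a direction that can be added and subtracted. The sketch below shows only that arithmetic, on invented three-dimensional vectors; in the published work the vectors are learned from photographs, and a generator network turns the resulting point back into an image.

```python
# Toy sketch of vector arithmetic in a learned image space. The 3-dimensional
# vectors are invented for illustration; real latent vectors are learned from
# photographs and decoded back into images by a neural network.
import numpy as np

smiling_women = np.array([0.9, 0.8, 0.1])   # hypothetical average latent vectors
neutral_women = np.array([0.1, 0.8, 0.1])
neutral_men   = np.array([0.1, 0.1, 0.9])

# "smiling woman" minus "unsmiling woman" isolates a smile direction;
# adding it to "unsmiling man" lands on a point for a smiling man.
smiling_men = smiling_women - neutral_women + neutral_men
print(smiling_men)   # a new point in the space, with no photograph behind it
```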
A similar process is already at work in your smartphone. In 2014, Robert Elliott Smith, an artificial intelligence researcher at University College London, was browsing through family vacation photos on Google+ when he noticed an anomaly. In one image, he and his wife were seated at a table in a restaurant, both smiling at the camera. But this photograph had never been taken. His father had held the button down on his iPhone a little long, resulting in a burst of images of the same scene. In one of them, Smith was smiling, but his wife was not; in another, his wife was smiling, but he was not. From these two images, taken fractions of a second apart, Google’s photo-sorting algorithms had conjured a third: a composite in which both subjects were smiling. The algorithm was part of a package later renamed Assistant, which performs a range of tweaks on uploaded images: applying nostalgic filters, making charming animations, and so forth. In this case, the result was a photograph of a moment that had never happened: a false memory, a rewriting of history. Though based on algorithms written by humans, this photo was not imagined by them—it was purely the invention of a machine’s mind.
Machines are reaching further into their own imaginary spaces, to places we cannot follow. After the activation of Google Translate’s neural network, researchers realized that the system was capable of translating not merely between languages but across them. For example, a network trained on Japanese–English and English–Korean text is capable of generating Japanese–Korean translations without ever passing through English. This is called zero-shot translation, and it implies the existence of an interlingual representation: a metalanguage known only to the computer.
In 2016 a pair of researchers at Google Brain decided to see whether neural networks could develop cryptography. Their experiment was modeled on the use of an adversary, an increasingly common component of neural network designs wherein two competing elements attempt to outperform and outguess each other, driving further improvement. The researchers set up three networks called, in the tradition of cryptographic experiments, Alice, Bob, and Eve. Their task was to learn how to encrypt information. Alice and Bob both knew a number—a key, in cryptographic terms—that was unknown to Eve. Alice would perform some operation on a string of text and send it to Bob and Eve. If Bob could decode the message, Alice’s score increased, but if Eve could also decode it, Alice’s score decreased. Over thousands of iterations, Alice and Bob learned to communicate without Eve cracking their code; they developed a private form of encryption like that used in emails today. But as with the other neural networks we’ve seen, we can’t fully understand how this encryption works. What is hidden from Eve is also hidden from us. The machines are learning to keep their secrets.
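The structure of the experiment can be sketched in a few dozen lines. The version below is written in PyTorch and is a simplified reading of the published setup, not a reproduction of it: the network shapes, bit lengths, and loss terms are illustrative assumptions. The three players are trained in alternating turns, with Eve punished for failing to read the ciphertext, and Alice and Bob punished when Bob fails or when Eve does better than guessing.

```python
# Simplified sketch of adversarial neural cryptography (after the Alice/Bob/Eve
# experiment). Sizes, architectures, and loss terms here are illustrative.
import torch
import torch.nn as nn

BITS, BATCH = 16, 256          # plaintext and key length, as +/-1 values

def mlp(in_dim, out_dim):
    # Small fully connected network; the original used a mix of dense and
    # convolutional layers.
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                         nn.Linear(64, out_dim), nn.Tanh())

alice = mlp(2 * BITS, BITS)    # (plaintext, key) -> ciphertext
bob   = mlp(2 * BITS, BITS)    # (ciphertext, key) -> recovered plaintext
eve   = mlp(BITS, BITS)        # ciphertext alone -> guessed plaintext

opt_ab = torch.optim.Adam(list(alice.parameters()) + list(bob.parameters()), lr=1e-3)
opt_e  = torch.optim.Adam(eve.parameters(), lr=1e-3)

def random_bits(n, bits):
    return torch.randint(0, 2, (n, bits)).float() * 2 - 1   # random +/-1 vectors

for step in range(5000):
    # Eve's turn: learn to read the ciphertext without knowing the key.
    p, k = random_bits(BATCH, BITS), random_bits(BATCH, BITS)
    c = alice(torch.cat([p, k], dim=1)).detach()
    eve_loss = (eve(c) - p).abs().mean()
    opt_e.zero_grad(); eve_loss.backward(); opt_e.step()

    # Alice and Bob's turn: communicate while keeping Eve at chance level.
    p, k = random_bits(BATCH, BITS), random_bits(BATCH, BITS)
    c = alice(torch.cat([p, k], dim=1))
    bob_loss = (bob(torch.cat([c, k], dim=1)) - p).abs().mean()
    eve_err  = (eve(c) - p).abs().mean()
    # Push Eve's error toward chance (about 1.0 per +/-1 bit) while Bob's falls.
    ab_loss = bob_loss + (1.0 - eve_err) ** 2
    opt_ab.zero_grad(); ab_loss.backward(); opt_ab.step()

print("Bob's error:", float(bob_loss), " Eve's error:", float(eve_err))
```

Even in this toy version, the mapping from plaintext and key to ciphertext is not written down anywhere a person can read it; it exists only as the weights of Alice's network.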
Isaac Asimov’s three laws of robotics, formulated in the Forties, state that a robot may not injure a human being or allow a human being to come to harm, that a robot must obey the orders given it by human beings, and that a robot must protect its own existence. To these we might consider adding a fourth: a robot—or any intelligent machine—must be able to explain itself to humans. Such a law must come before the others. It has, by our own design, already been broken, and the others will surely follow. We face a world, not in the future but today, where we do not understand our own creations. The result of such opacity is always and inevitably violence.
When Kasparov was defeated by Deep Blue, he left the game in disbelief. But he channeled his frustration into finding a way to rescue chess from the dominance of machines. He returned a year later with a form of chess he called Advanced Chess.
In Advanced Chess, a human and a computer play as a team against another human-computer pair. The results have been revolutionary, opening up strategies of play previously unseen in the game. Blunders are eliminated, and the human players can analyze their potential moves so deeply that tactical play approaches perfection and strategic plans can be deployed with far greater rigor.
But perhaps the most extraordinary outcome of Advanced Chess is seen when human and machine play against a solo machine. Since Deep Blue, many computer programs have been developed that can beat humans with ease. But even the most powerful program can be defeated by a skilled human player with access to a computer—even a computer less powerful than the opponent. Cooperation between human and machine turns out to be a more potent strategy than trusting to the computer alone.
This strategy of cooperation, drawing on the respective skills of human and machine rather than pitting one against the other, may be our only hope for surviving life among machines whose thought processes are unknowable to us. Nonhuman intelligence is a reality—it is rapidly outstripping human performance in many disciplines, and the results stand to be catastrophically destructive to our working lives. These technologies are becoming ubiquitous in everyday devices, and we do not have the option of retreating from or renouncing them. We cannot opt out of contemporary technology any more than we can reject our neighbors in society; we are all entangled. To move forward, we need an ethics of transparency and cooperation. And perhaps we’ll learn from such interactions how to live better with these other entities—human and nonhuman—that we share the planet with.