
On April 1, 2009, Google unveiled Gmail Autopilot, a plug-in that promised to read and generate contextually relevant replies to the messages piling up in users’ inboxes. “As more and more everyday communication takes place over email, lots of people have complained about how hard it is to read and respond to every message,” the product page explained. “This is because they actually read and respond to all their messages.” For those who hadn’t registered the date, the terms-and-conditions page spelled out the joke: “No, we don’t plan to scan every one of your incoming messages and automatically send the perfect reply.”
On November 5, 2015, Google unveiled Smart Reply, a plug-in that reads and suggests responses to e-mails. This time the innovation actually exists, as part of the company’s Inbox app for Android and iOS. If Smart Reply thinks that it understands a message that requires an answer, it will suggest three options, alongside a cheerful invitation to “start composing your reply with one tap.” When I wrote to Matt Jones, the design director at Google Research, to say that it had been great to see him the other night and to ask whether he could put me in touch with the team behind Smart Reply, he sent back a screenshot of the options that the app had given him: “It was great seeing you too!” “It was fun!” “Will do!” He connected me with Alex Gawley, the product manager for Gmail, who said that I shouldn’t be offended. (“I mean, I’m sure it was great to see you.”)
The April Fools’ joke of 2009 began to be taken seriously in early 2015, Gawley told me, thanks to two developments. The Google Research team, which had recently acquired DeepMind, the company behind an arcade-game-winning form of artificial intelligence, was making rapid progress in language-related areas of machine learning, including translation and speech analysis. At the same time, Americans were reading more and more of their e-mails on mobile devices—fifty-three per cent of them, in fact, up from eight per cent in 2011. “It’s a little screen and a little keyboard, which means e-mail is easy to read and really hard to reply to,” Gawley said. He added that the combination of fat thumbs and autocorrect is “a real pain point for our users.”
Smart Reply uses what is known as an artificial neural network—an intimidating term for a particular kind of mathematical model—to tease out the patterns and probabilities that underlie e-mail communications. For privacy reasons, humans are not allowed to read Google’s vast corpus of e-mail messages. Machines, however, are, and by drawing on that data they can gradually sort sentences into “thought vectors,” or coördinates in linguistic space. In other words, by plotting similarities in context, word frequency, and sentence structure, the neural network can teach itself to recognize and group together the endless variety of ways that humans have developed to say much the same thing: “How does this afternoon look for a call?” “Can we talk later today?” Or “Does this P.M. work for a quick chat?” By trawling through the data again, the machine can then find and suggest the most typical responses to this particular thought vector: “Sure, what time were you thinking?” “Sure, anytime.” Or “Sure, what’s up?”
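The idea of placing messages in a shared space and reusing the replies attached to their nearest neighbors can be sketched in a few lines of Python. This is only a toy illustration, not Google's system: a word-count vector stands in for the learned "thought vector," and the names `embed`, `cosine`, `suggest`, and `HISTORY` are hypothetical, as is the fourth history entry.

```python
import math
from collections import Counter

def embed(sentence):
    """Toy stand-in for a thought vector: a bag-of-words count.
    (Assumption: the real model learns its vectors with a neural network.)"""
    cleaned = sentence.lower()
    for mark in "?.,!":
        cleaned = cleaned.replace(mark, "")
    return Counter(cleaned.split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical history of (incoming message, reply that was sent).
HISTORY = [
    ("How does this afternoon look for a call?", "Sure, what time were you thinking?"),
    ("Can we talk later today?", "Sure, anytime."),
    ("Does this P.M. work for a quick chat?", "Sure, what's up?"),
    ("Please send me the quarterly report", "Attached."),
]

def suggest(message, history=HISTORY, k=3):
    """Return the replies attached to the k most similar past messages."""
    query = embed(message)
    ranked = sorted(
        history,
        key=lambda pair: cosine(query, embed(pair[0])),
        reverse=True,
    )
    return [reply for _, reply in ranked[:k]]

if __name__ == "__main__":
    print(suggest("Can we have a call this afternoon?"))
```

Word overlap is a crude proxy for meaning; the point of a learned representation is that "Can we talk later today?" and "Does this P.M. work for a quick chat?" would land close together even though they share almost no words.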
Earlier this year, Google researchers used the network to develop an intelligent chatbot with which they could discuss the purpose of life. (“To serve the greater good,” according to the machine.) Still, when the company’s engineers applied their neural net to the problem of e-mail, it did not work perfectly right away. “Most machine-learning work is actually about tuning,” Gawley said. The A.I. has a tendency to suggest replies that say the same thing in slightly different ways, which is less useful than offering users responses that represent a range of different likely replies. (For instance, “No, sorry, I’m busy” would have been a more useful alternative than “Sure, anytime” in the example above.) The team has corrected for this, to some extent, by adding a parameter that encourages the machine to choose disparate responses—ones that have sufficient distance between them when plotted as vectors in semantic space.
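One simple way to picture the diversity adjustment is a greedy filter that walks through candidate replies from most to least likely and skips any that sit too close to one already chosen. The sketch below assumes this greedy approach for illustration; the article does not say how Google's parameter works. The function `pick_diverse`, the `min_distance` threshold, the two-dimensional vectors, the scores, and the "tomorrow" reply are all hypothetical.

```python
import math

def pick_diverse(candidates, k=3, min_distance=1.0):
    """Greedily pick up to k replies, highest score first, skipping any
    candidate whose vector lies within min_distance of a chosen one.

    candidates: list of (reply_text, score, vector) tuples.
    """
    chosen = []
    for reply, _score, vec in sorted(candidates, key=lambda c: c[1], reverse=True):
        if all(math.dist(vec, other) >= min_distance for _, other in chosen):
            chosen.append((reply, vec))
        if len(chosen) == k:
            break
    return [reply for reply, _ in chosen]

# Hypothetical candidates: made-up scores and 2-D positions in "semantic space".
CANDIDATES = [
    ("Sure, what time were you thinking?", 0.90, (1.0, 0.0)),
    ("Sure, anytime.", 0.85, (1.1, 0.1)),
    ("Sure, what's up?", 0.80, (0.9, 0.2)),
    ("No, sorry, I'm busy.", 0.40, (-1.0, 0.0)),
    ("Can we do tomorrow instead?", 0.30, (0.0, 1.5)),
]

if __name__ == "__main__":
    # Without the distance constraint: three variations on "Sure".
    print(pick_diverse(CANDIDATES, min_distance=0.0))
    # With it: one reply from each distinct region of the space.
    print(pick_diverse(CANDIDATES, min_distance=1.0))
```

Setting `min_distance` to zero reproduces the original problem, three near-identical ways of saying yes, while a larger threshold trades some likelihood for variety.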
The early iterations of Smart Reply were overly affectionate. “I love you” was the machine’s most common suggested response. This was a touch awkward: because the model has no knowledge of the relationship between an e-mail’s sender and its receiver, it provides the same suggested responses whether you are corresponding with your boss or a long-lost sibling. “The team was really puzzled about this,” Gawley said. “It turns out that our internal testers are very affectionate and that ‘I love you’ is a very common thing for people at Google to say.” When the engineers inspected their model, they discovered that whenever an e-mail did not give a particularly strong signal as to the appropriate response, the machine hedged its bets with a declaration of affection. This fact may yet become the subj