How Siri Works

What is Siri?

Siri is ready to talk to you -- just tap the mic to start the conversation.
©iStockphoto.com/Alexander Kirch

Siri is a virtual assistant that listens to your requests and performs actions accordingly. The more you work with it and correct it when it misinterprets you, the better it's supposed to get at understanding what you mean. Rather than doing most of its work on your phone's processor, Siri communicates with servers in the cloud to interpret your requests and retrieve the information you need. Since most of Siri's brain exists on remote servers accessed by a great many people, it's also supposed to learn from everyone else: the more people who use it, the more it improves.

AI assistants have been the dream of technologists for a long time, but they weren't very feasible until recently. They have finally become practical thanks to much faster wireless speeds, more powerful processors (especially those in our mobile devices), the availability of vast amounts of data for training the AI, the advent of cloud computing and improvements in speech recognition methodologies. Most voice-recognition systems in the wild are like the ones you've probably dealt with when calling a large company on the telephone -- they only understand a very limited vocabulary. Siri has more data and learning ability behind it, and it continues to learn and grow.

Siri was not entirely developed by Apple. Instead, it sprang out of a huge AI initiative started in 2003 that was funded by the U.S. Defense Department's Defense Advanced Research Projects Agency (DARPA) and run by SRI International, a research entity affiliated with Stanford University until the 1970s. The intent was to come up with something that could help military personnel with office work and decision-making. The result of this project was called Cognitive Assistant that Learns and Organizes (CALO), an artificially intelligent assistant that could learn from its users and the vast amounts of data available to it. Not only could it be used to do things like schedule meetings and organize all the necessary documents for meeting participants, but it could even make decisions. For instance, if someone backed out of a meeting, CALO could assess whether that person was vital enough to warrant canceling and rescheduling. Another SRI International project called Vanguard created a prototype assistant for a smartphone, but one with nowhere near CALO's capabilities. Several SRI employees created a startup to marry the ideas from both projects. Alumni from organizations such as NASA, Apple and Google also worked for the new company, and their work led to Siri Assistant for iPhone 3GS.

This version of Siri would take questions from users via voice or keystrokes, and would send the voice or text data to a remote server for transcription (in the former case) and interpretation. Rather than try to break down entire sentences and interpret their meaning as a whole, as other natural language research has generally attempted, Siri used models of real objects and concepts, as well as how they might work together, to decipher the requests. People can say the same thing in a number of different ways, making sentence interpretation a Herculean effort, so Siri looked for keywords and their context instead. This simpler paradigm, along with a host of programmed phrases and requests it was designed to recognize and carry out, allowed Siri to guess with a fair amount of accuracy what the user was asking, and to respond appropriately, without having to understand every single word. It had access to a large amount of data via various Web sites, and could use these sites' application programming interfaces (APIs) to tap into any services they offered.
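To make the keyword-and-context idea concrete, here is a minimal Python sketch. It is not Siri's actual code: the intent names, the keyword sets and the `guess_intent` function are all hypothetical, chosen only to show how differently worded requests can land on the same guess without parsing the whole sentence.

```python
import re

# Hypothetical "domain models": each intent lists keywords that hint at it.
# These names and word lists are illustrative assumptions, not Siri's data.
DOMAIN_KEYWORDS = {
    "restaurant_reservation": {"table", "reservation", "dinner", "restaurant", "book"},
    "movie_times": {"movie", "showtimes", "playing", "theater"},
    "weather": {"weather", "rain", "forecast", "umbrella", "temperature"},
}


def guess_intent(request):
    """Return the intent sharing the most keywords with the request, or None."""
    # Lowercase the request and split it into words, ignoring punctuation.
    words = set(re.findall(r"[a-z']+", request.lower()))
    best_intent, best_score = None, 0
    for intent, keywords in DOMAIN_KEYWORDS.items():
        score = len(words & keywords)  # number of keywords present
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent


# Two phrasings of the same request produce the same guess.
print(guess_intent("Book me a table for dinner tonight"))
print(guess_intent("I'd like a reservation at a restaurant"))
```

A request with no recognized keywords returns `None`, which is the point at which an assistant would ask for clarification instead of acting.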

Apple tweaked this app into the Siri we know today. Siri actually lost some capabilities upon integration with iOS, as it used to have access to far more Web sites and services than Apple has partnered with to date. It also lost some of its more biting humor and, apparently, a propensity for potty-mouthed responses. But it gained other skills, such as better integration with the iPhone's built-in features, multilingual capabilities and an audible voice. New features have also been added with subsequent iOS updates. For instance, it regained its ability to book dinner reservations and return movie times and reviews with the introduction of iOS 6. The ability to purchase movie tickets returned with the iOS 6.1 update in January 2013; however, it now books your seats through Fandango rather than Movietickets.com.

Unlike a search engine that returns long raw lists of links related to keywords you select, Siri is designed to interpret your request, home in on what it thinks you want, and perform actions to give you a smaller but more relevant set of data or services in return. Siri understands context. And it still goes to servers in the cloud to retrieve answers via third-party services, albeit a smaller set of them than before. Anything related to mathematical computation or scientific fact is likely to come from Wolfram|Alpha. Information related to businesses like restaurants or retail stores is likely to come from Yelp, although restaurant reservations are made through OpenTable. Weather info comes from Apple's built-in Weather app, powered by Yahoo. And movie time listings, reviews and other movie information are likely to come from Rotten Tomatoes. If Siri doesn't understand a request, it will ask you for more information to clarify, or ask whether you want it to look the request up on the Web. It uses your phone's GPS to retrieve and return information relevant to your current location.
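The routing described above can be pictured as a simple lookup from the kind of request to the service that answers it. The Python sketch below is an illustration only: the topic labels, the `SERVICE_BY_TOPIC` table and the `route_request` function are hypothetical, and the real system presumably does far more than a dictionary lookup. The service names themselves are the ones mentioned in this article.

```python
# Hypothetical mapping from request topic to the answering service.
# Topic labels are made up for illustration; the services are those
# named in the article.
SERVICE_BY_TOPIC = {
    "math_or_science": "Wolfram|Alpha",
    "local_business": "Yelp",
    "restaurant_reservation": "OpenTable",
    "weather": "Weather app (Yahoo data)",
    "movies": "Rotten Tomatoes",
}


def route_request(topic):
    """Return (action, detail) for a topic, falling back to asking the user."""
    service = SERVICE_BY_TOPIC.get(topic)
    if service is None:
        # Unknown topic: ask for clarification or offer a Web search.
        return ("ask_user", "clarify or offer a Web search")
    return ("query_service", service)


print(route_request("math_or_science"))
print(route_request("stock_prices"))
```

The fallback branch mirrors the behavior described in the text: a request that doesn't match a known topic leads to a clarifying question or an offer to search the Web.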