Prior to joining the Watson team, I spent about 18 years on the Lotus Notes team, most recently as the chief architect for Lotus Notes. By pretty much any definition I can think of, Lotus Notes is some pretty complex technology.
When I joined Watson, everyone warned me about how incredibly complex Watson is. I've been working on Watson for about 9 months now -- leading the "Watson Platform" team -- and I'm still not convinced that it is all that complex. Before you think I am dissing a technology that I think is extremely cool, innovative, and game-changing, let me explain.
To answer a question, Watson simply performs a series of discrete tasks on the input text, each task working on what the previous tasks have accumulated. I liken the processing to Henry Ford's assembly line for the Model T. Except that for Watson, we call it a pipeline, not an assembly line, and we are building an evidence-supported response rather than a mass-produced automobile.
As we add to the algorithmic mix of the Watson engine, we add new tasks (called "annotators") or tweak existing ones. This pipeline, based on the Apache open source UIMA technology (Unstructured Information Management Architecture at http://uima.apache.org), would make Henry Ford proud.
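To make the assembly-line idea a bit more concrete, here is a toy sketch in Python. It is not Watson's code and it does not use the UIMA API; the names (Analysis, tokenize, find_focus, run_pipeline) are made up for illustration. The only point is the shape: each annotator reads what earlier annotators left behind and adds its own piece.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Analysis:
    """Hypothetical container for the text plus everything accumulated so far."""
    text: str
    annotations: Dict[str, object] = field(default_factory=dict)


# An annotator is just a function that adds to the accumulated analysis.
Annotator = Callable[[Analysis], None]


def tokenize(analysis: Analysis) -> None:
    # First station on the line: split the question into words.
    analysis.annotations["tokens"] = analysis.text.rstrip("?.!").split()


def find_focus(analysis: Analysis) -> None:
    # Second station: builds on the tokens produced by the previous annotator.
    question_words = {"who", "what", "where", "when", "which"}
    tokens = analysis.annotations["tokens"]
    analysis.annotations["focus"] = next(
        (t.lower() for t in tokens if t.lower() in question_words), None
    )


def run_pipeline(text: str, annotators: List[Annotator]) -> Analysis:
    analysis = Analysis(text)
    for annotate in annotators:  # each task runs in order, like stations on a line
        annotate(analysis)
    return analysis


result = run_pipeline("Who built the Model T?", [tokenize, find_focus])
print(result.annotations)
```

Adding a new task to a pipeline like this means appending one more function to the list, which is roughly the appeal of the real thing as well.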
Of course, the secret sauce of Watson is in each of the tasks and how they build on each other.
The natural language parser at the head of the line, for example, is pretty impressive. It is a "Deep NLP" parser that came out of IBM Research, from the team that originally built Watson and succeeded in the Jeopardy! challenge. By "Deep," I mean that it does much more than find keywords and identify parts of speech. It can tell, for example, that in "Sally took her car to the beach," her refers to Sally. (Technically, her is called an anaphoric expression -- if you want to impress your friends.)
Other key tasks include searching for possible answers based on the parsed question, finding and scoring evidence for each of the possibilities based on a litany of different algorithms, and finally applying machine learning to separate the wheat from the chaff of the various scores based on historical training data. I admit it. There are years and years of hard work by an amazing team that went into developing those tasks.
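The last two steps, scoring evidence and then letting a trained model weigh the scores, can also be sketched in a few lines of Python. This is a deliberately simplified illustration, not how Watson actually does it: the scorer names, the weights, and the bias are all invented here, and I am assuming a plain logistic combination where a real system would learn its model from historical training data.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical weights; in a real system these would be learned from training data.
WEIGHTS = {"keyword_overlap": 2.0, "source_reliability": 1.0}
BIAS = -1.5


def confidence(scores: Dict[str, float]) -> float:
    """Combine per-algorithm evidence scores into one confidence between 0 and 1."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in scores.items())
    return 1.0 / (1.0 + math.exp(-z))  # logistic function


def rank(candidates: Dict[str, Dict[str, float]]) -> List[Tuple[str, float]]:
    """Return candidate answers sorted from most to least confident."""
    scored = [(answer, confidence(scores)) for answer, scores in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up evidence scores for two candidate answers.
candidates = {
    "Thomas Edison": {"keyword_overlap": 0.3, "source_reliability": 0.7},
    "Henry Ford": {"keyword_overlap": 0.9, "source_reliability": 0.8},
}

ranked = rank(candidates)
for answer, conf in ranked:
    print(f"{answer}: {conf:.2f}")
```

With these invented numbers, the candidate with stronger evidence ends up on top, and the confidence value gives a way to decide whether the top answer is worth giving at all.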
In the end, though, what is the test for simplicity? I'll pass on PhD-level tests as well as mundane tests like number of lines of code. I'll stick to the basics -- can I explain roughly how it works to my 7-year-old? Yep -- chalk one up for simple!
I view that as a tremendous compliment.
As Steve Jobs said: "Simple can be harder than complex: You have to work hard to get your thinking clean to make it simple. But it's worth it in the end because once you get there, you can move mountains."
And from Albert Einstein: "Most of the fundamental ideas of science are essentially simple, and may, as a rule, be expressed in a language comprehensible to everyone."
Certainly an apropos goal for Watson -- expressing responses in a "language comprehensible to everyone."