… on being a serious tinkerer, if not a serious student
In middle school and high school, I perhaps didn’t pay attention in class like I should have! But I loved technology and mechanical things. Thanks to my father, who repaired all manner of electrical equipment, I was building electronic circuits, radio-controlled planes, all sorts of things — even taking apart batteries to turn their acid into hydrogen (which I don’t recommend trying at home).
Eventually, though, the math and physics courses at the French missionary high school I attended did grab my attention. And at American University, I was part of the first generation of students who could speak French and English, in addition to Arabic. This ability is what opened the door for me to study in the US.
… on switching from signals of zeroes and ones, to subjects and verbs
Electrical engineering is about understanding signals being sent within a machine and between machines. To a machine, language and speech are just signals. And that was my first job at a research firm in Boston — working on the analysis of speech signals. I essentially developed call-center technology that lets people say “one” or “operator” instead of pressing 1 or 0 on their phone.
From this work, IBM Research invited me to speak at a conference in Austria, an invitation I parlayed into a job offer. Now, this was almost 30 years ago. It was the beginning of machines being able to understand natural language — baby steps compared to what Watson does today. To achieve accurate speech understanding, we built databases full of the words and phrases people might say in a specific domain. For example, in the late 1990s my team developed a system for T. Rowe Price that could understand phrases like “move money to an account.” Then we annotated the words with semantics (“cash” also means “money”) so their customers could make financial transactions over the phone.
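The annotation idea can be sketched in a few lines. This is my own toy illustration, not the actual T. Rowe Price system: the lexicon, the tag names, and the `annotate` function are all invented for the example.

```python
# Toy semantic lexicon: map domain words to semantic tags, so that
# "cash" and "money" resolve to the same meaning in a spoken request.
SEMANTIC_TAGS = {
    "money": "FUNDS",
    "cash": "FUNDS",
    "funds": "FUNDS",
    "move": "TRANSFER",
    "transfer": "TRANSFER",
    "account": "ACCOUNT",
}

def annotate(utterance):
    """Tag each word with its semantic label, if the domain lexicon has one."""
    return [(w, SEMANTIC_TAGS.get(w)) for w in utterance.lower().split()]

print(annotate("move cash to an account"))
# [('move', 'TRANSFER'), ('cash', 'FUNDS'), ('to', None), ('an', None), ('account', 'ACCOUNT')]
```

Because both “cash” and “money” map to the same tag, the downstream transaction logic only has to reason about `FUNDS`, not about every way a caller might phrase it.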
… on trying to develop a digital Rosetta stone
I moved to language translation a few years after joining IBM. And one of my first big projects was to develop Direct Translation Model (DTM), an Arabic-English speech-to-speech translator.
The project’s major aha! moment came in 2003, when we got the computer system to handle multiple parameters. This meant it could learn to translate quickly, because it searched thousands of parallel sentences: 200 million words of parallel Arabic and English phrases.
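The kind of learning from parallel sentences described here can be illustrated with a toy co-occurrence count. The miniature corpus below (Arabic transliterated for readability) and the `best_translation` helper are my own invention, vastly simpler than a real alignment model, but they show how repeated co-occurrence across sentence pairs lets a word’s likely translation emerge:

```python
from collections import Counter

# Hypothetical miniature parallel corpus of (Arabic, English) sentence pairs.
parallel = [
    ("kitab", "book"),
    ("kitab jadid", "new book"),
    ("bayt jadid", "new house"),
]

# Count how often each source word co-occurs with each target word.
cooccur = Counter()
for src, tgt in parallel:
    for s in src.split():
        for t in tgt.split():
            cooccur[(s, t)] += 1

def best_translation(word):
    """Pick the target word that co-occurs most often with the source word."""
    candidates = {t: c for (s, t), c in cooccur.items() if s == word}
    return max(candidates, key=candidates.get)

print(best_translation("jadid"))  # "new": it appears in both "jadid" sentences
```

“jadid” co-occurs with “new” twice but with “book” and “house” only once each, so the count alone is enough to pick the right translation here. Real systems scale this idea to millions of sentence pairs with far more sophisticated statistics.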
The Direct Translation Model is still the most accurate system on the market and has since been scaled to support 12 language pairs. Today it’s better known as Watson Language Translation on the Watson Developer Cloud, and it can be customized by domain.
… on translating difficult language pairs
Another wrinkle in machine translation is producing fluent output, especially between English and the many Asian languages that have different sentence structures. This requires software that analyzes the words and sentences of the source language. Given that analysis, the system translates the input sentence into the target language incrementally. After translating an initial fragment, or even a single word, the machine looks for the next thing in the input to translate — which might not be the next fragment in the input.
Behind the scenes, the system tries every translation possibility, because it doesn’t know in advance which one is correct. Based on context, a sentence can have multiple meanings. Which one should the machine pick? The system tries all of them and prunes away the unlikely ones; the end of this pruning process should be an accurately translated document. But we still have a tremendous amount of work to do on quality and accuracy. Japanese and Chinese remain challenging languages for a machine to translate into another language. And in some domains, like healthcare, we need very high accuracy, perhaps 99 percent, before a translation system can be used.
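This try-everything-then-prune search is commonly implemented as a beam search. A minimal sketch follows, with an invented two-word lexicon and made-up scores — my illustration of the general technique, not the Watson implementation:

```python
import heapq

# Toy lexicon: each source word has candidate translations with scores.
# (Scores are made up; a real system would learn them from parallel data.)
candidates = {
    "bank": [("bank", 0.6), ("riverbank", 0.4)],
    "deposit": [("deposit", 0.7), ("sediment", 0.3)],
}

def beam_translate(source_words, beam_size=2):
    """Expand every candidate translation, pruning to the best few each step."""
    beams = [([], 1.0)]  # (partial translation, score)
    for word in source_words:
        # Try every continuation of every surviving hypothesis.
        expanded = [
            (partial + [t], score * p)
            for partial, score in beams
            for t, p in candidates[word]
        ]
        # Prune: keep only the highest-scoring hypotheses.
        beams = heapq.nlargest(beam_size, expanded, key=lambda b: b[1])
    best, _ = beams[0]
    return " ".join(best)

print(beam_translate(["bank", "deposit"]))  # → "bank deposit"
```

The beam width controls the trade-off the passage hints at: a wider beam keeps more competing meanings alive, improving accuracy at the cost of more search.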
My goal is that whatever a user’s native language is, they can get information from a machine, independent of the language or format of the source information.
… on languages he wants to learn
I would like to learn Chinese. My team has worked on Chinese-English translation, and Chinese is a language where the speaker assumes context — so not everything that is happening is actually spoken. This makes it hard for the machine. Plus, Chinese speakers often drop pronouns, and the spoken language does not mark gender — another challenge for a machine. It would be great to know enough Chinese to understand how Chinese speakers speak, and are understood.