Humans and systems: Creating natural interfaces to augment human ability
Natural Language Processing is the key to enabling devices to listen, learn, and act appropriately by understanding data and language in context.
Imagine a world where artificial intelligence and cognitive computing work in tandem with humans. Natural Language Processing (NLP) can help to facilitate interactions between users and systems. A system can be a device installed in the house, a huge operating system that resides in a corporate or industrial setting, a mobile phone, or a large-scale cloud application with which a person or group of people needs to interact. To connect these systems with people, IBM has created APIs that link unstructured data with Watson's computing prowess and help accelerate the development of cognitive IoT solutions and services on Watson IoT.
The IoT is driving digital disruption of the physical world, with more than 12 billion IoT devices around the world currently connected to the Internet.¹ IDC predicts there will be 30 billion IoT devices connected to the Internet by 2020, and that number is expected to keep growing within the next decade, with estimates ranging anywhere from 50 billion to 1 trillion.¹ Within the same time span, 40% of all generated data will come from connected sensors, yielding IoT insights that drive economic value of more than $11 trillion by 2025.¹
It is virtually impossible to fathom the scale of data creation and distillation resulting from this proliferation of connected homes, livestock, people, or cars:
- 212 billion sensors enabled by 2020 2
- 110 million cars produced with 5.5 billion sensors 2
- 1.6 billion connected livestock 2
- 500 million sensors in US factories alone 2
- 1.2 million homes with more than 200 million sensors 2
- 330 million people with a billion sensors 2
Learn more about how the IoT and data are driving digital transformation by downloading the infographic.
Data refinement is the key to competitive differentiation
Competitiveness comes down to how effectively and creatively your organization can use the influx of IoT data, together with other forms of data, to transform core functions and customer relationships. With all the data being collected or stored, is it possible to make sense of the exabytes of device data being created? How can organizations integrate all these connected devices to their existing systems? And how easily and quickly can these new apps, devices and sensors be constructed and altered according to business needs?
Use data as a natural resource with cognitive IoT capabilities
The Internet of Things (IoT) will enable much more intelligence to be engineered into the things people interact with every day. By fusing computing capability into our physical world, we gain the ability to capture the experience of any physical interaction — for example, with home appliances such as a washing machine, dryer, or dishwasher. By gathering these experiences as data in a repository, they can be analysed, compared, accessed, and then used by other individuals in different situations. These devices are also being extended with virtual interfaces via web pages and mobile devices that allow remote access to functions — turn on the dishwasher with a smart phone app, close the garage door, and so on.
To bring the power of cognitive analytics to IoT-connected devices, IBM published an API for connecting unstructured data with Watson’s computing prowess. Because this data can come in many forms, such as audio, video, images, and text, there are four classes of APIs now available as part of the IBM Watson IoT Platform offering: a Natural Language Processing (NLP) API, a Machine Learning API, a Video and Image Analytics API, and a Text Analytics API. In addition, new offerings enable customers to tap the Watson IoT platform to develop new voice interfaces – in homes, cars, stores, hotels, and offices. For example, Local Motors uses a Watson-powered natural language interface for Olli – one of the world’s first self-driving vehicles capable of natural language interaction with its passengers.
What are cognitive APIs?
Cognitive APIs deliver natural-language processing, machine-learning capabilities, text analytics, and video and image analytics to help you realize the potential of the cognitive era with Watson IoT. The Watson APIs for IoT help accelerate the development of cognitive IoT solutions and services on Watson IoT. Using the Watson APIs enables you to build cognitive applications that include:
- Natural language processing (NLP): Enables users to interact with systems and devices by using simple human language.
- Machine learning: Automates data processing and continuously monitors new data and user interactions to rank data and results based on learned priorities.
- Video and image analytics: Enables users to monitor unstructured data from video feeds and image snapshots to identify scenes and patterns in video data.
- Text analytics: Enables the mining of unstructured textual data, including transcripts from customer calls at a call center, maintenance procedures and troubleshooting, technician maintenance logs, blog comments, and tweets – to help find correlations and patterns in the vast amounts of data from these sources.
Moving out of sci-fi and into practice
The concept of natural language interaction has been a dream for many years. It’s frequently used as a plot device in film and television – part of a well-trodden science fiction theme in which an automated robot or cyber machine interacts with a human character, resulting in collaboration between humans and bots. Whether it’s Marvel Comics and Iron Man, Star Trek, Star Wars, 2001: A Space Odyssey, or Aliens – there is a fascination with the use of artificial intelligence and natural language processing. Natural Language Processing enables computers, systems, and devices to understand human language – context, nuances, slang, and ultimately, intent.
By connecting the user to Watson’s computing power, the Text Analytics API enables mining of unstructured textual data coming from sources like transcripts from customer call centers, maintenance technician logs, blog comments, and tweets. There are many things that can be done today using the available technology in combination with natural language processing that go beyond simply translating voice into text. The Watson API also offers the ability to understand text and semantic meaning, in addition to the nuances in how different people ask the same thing.
NLP facilitates interactions between users and systems
Natural Language Processing (NLP) can help to facilitate interactions between users and systems. Currently, the accepted way to interact with a system is through a basic user interface — either a user interface in a mobile app, a browser or a control center — where an individual pushes buttons and clicks a set of switches. What makes things different with Natural Language Processing (NLP) is how an individual can invoke this action.
Natural Language Processing (NLP) adds a new dimension – a very natural ‘dialog’ – to be used with the same devices, giving commands verbally or querying them for status and issues. An easy way to imagine this in practice is with a maintenance technician trying to troubleshoot a problem. Rather than pulling the machine apart to explore different components, the technician has the ability to ask the machine what is happening or has happened. The Internet of Things (IoT) is what makes this possible – connectivity, automated data collection, correlation with other data sources such as weather, and more embedded processing power.
Interestingly, the insights generated by Watson APIs can be used again to retrain the system, enabling the user to further improve the accuracy and timeliness of the information generated going forward. Furthermore, the APIs of the platform can be combined to bring information in another format, such as audio, into the text database for Text Analytics. Tools are available for converting speech into text and text into speech using the Natural Language Processing (NLP) API, which enables users to interact with the system using simple human language.
The benefits of using natural language processing in IoT solutions
Applying Natural Language Processing (NLP) in IoT instances helps to address different challenges. One capability derived from the application of NLP in IoT is hands-free operation. For example, while driving, a driver notices a warning light on the vehicle dashboard. Instead of having to stop and read the manual, the driver is able to ask the car whether they need to stop immediately at a service station to have the light investigated. Another example is the maintenance technician who is using his hands and tools to work on an asset, but at the same time needs to ask the device or a maintenance system how to do a specific task, or wants to know if other technicians have reported the same kind of humming noise he is hearing.
A more dramatic example of this capability applied in a real situation might be the technician working on top of a power line, or wind turbine – 100 to 200 feet in the air, with a strong wind blowing, a set of tools on their belt, and a need to identify a problem and fix it on the spot. In this scenario, the technician on site is performing regular scheduled maintenance at the wind farm. While there, the technician hears a subtle vibrating sound coming from the motor – something that is irregular. Having never heard that sound before, the technician starts performing some routine diagnostics to determine the root cause. With no success determining the root cause from diagnostics, the technician connects a mobile device such as a smart phone to the wind turbine network to review the data from the past two months.
Using the “Ask Watson” feature, the technician describes the noise to Watson. Watson understands the verbal description of the issue and immediately searches historical sensor data, maintenance records, and technician logs for information about that specific turbine as well as other turbines of the same model. In seconds, Watson provides several potential causes and possible solutions, displayed on the technician’s phone along with a percentage of certainty for each. The technician logs his own report on his mobile device. Recording the actual cause of the problem and its solution lets Watson learn from the experience. The next time a technician has a similar question, Watson will be ready.
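Ranking logic of this kind can be sketched in a few lines. The following is an illustrative approximation only, not Watson's actual method: a hypothetical `rank_causes` helper scores candidate causes by word overlap between the spoken description and historical log entries, and reports the score as a percentage of certainty.

```python
# Illustrative sketch only: rank candidate causes by how well a spoken
# symptom description matches historical maintenance-log entries.

def rank_causes(description, history):
    """Score each historical cause by word overlap with the description."""
    words = set(description.lower().split())
    scored = []
    for cause, log_text in history.items():
        log_words = set(log_text.lower().split())
        overlap = len(words & log_words) / max(len(words), 1)
        scored.append((cause, round(overlap * 100)))  # percent certainty
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical log entries for this turbine model
history = {
    "worn bearing": "subtle vibrating humming noise from motor bearing",
    "loose mounting bolt": "rattling noise at high wind speed",
}
print(rank_causes("subtle vibrating noise from the motor", history))
```

A real system would use semantic matching rather than raw word overlap, but the shape of the answer — an ordered list of causes with confidence scores — is the same.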
NLP can help to improve many different processes
NLP capabilities can help to improve many different processes. For example, manufacturers can predict device failures before they occur. By collecting data on the performance of a device in a system, including a variety of conditions and parameters under which it fails, and applying advanced analytics to this information stored in a database, Watson IoT can generate conditions for device failures and enable manufacturers to avoid them before they occur in real time. Similarly, by employing this deep level of data processing in real time on information gathered from engineers, field technicians, customers, and sales representatives, Watson IoT can help companies create innovative products and services.
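As a rough illustration of deriving a failure condition from device data, the sketch below assumes hypothetical labeled temperature readings and learns a simple alert threshold; a real Watson IoT deployment would use far richer models and many more parameters.

```python
# Illustrative sketch only: learn a simple failure condition (a
# temperature threshold) from labeled sensor history, then flag
# new readings before a failure occurs. Field names are hypothetical.

def learn_threshold(readings):
    """Return the lowest temperature observed in any failure reading."""
    return min(temp for temp, failed in readings if failed)

def at_risk(temp, threshold):
    """Flag a reading that has entered the observed failure range."""
    return temp >= threshold

history = [(62, False), (71, False), (88, True), (93, True)]
threshold = learn_threshold(history)  # 88
print(at_risk(90, threshold))         # True: intervene before failure
```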
An airline might combine data from sensors and technicians measuring stress on aircraft with in-flight turbulence data to optimize maintenance schedules and thereby eliminate expensive repairs, including failures in flight. Organizations engaged in constructing safe, energy-efficient buildings can combine their expertise with Watson’s analytical abilities to deliver buildings that are reliable, environmentally friendly, cost-optimized, and sustainable. Another scenario might involve a technician who is able to access contextual help on a service call to fix a problem with an elevator or washing machine. Because the large amount of available information is scanned and filtered based on the context or state of the machine, the technician is able to quickly access and review the most relevant information, and as a result is equipped to make more accurate decisions, faster. In short, there are countless businesses, big and small, that can benefit from these capabilities.
Another example of a device interface that uses NLP is Olli, a self-driving vehicle brought to market by Local Motors. In Olli’s case, natural language recognition helps to create a relationship between the passenger and the vehicle. The potential for machines to understand human language – where an individual could enter a vehicle and say, ‘Take me to work’ – is the proverbial tip of the iceberg. To interact with the passenger, the vehicle relies on more than 30 sensors that pick up environmental cues to enable autonomous driving. Olli also makes use of streams of data from devices connected to the IBM cloud. Through Watson, passengers are able to interrogate the vehicle, asking, for example, how Olli works, where they are going, or why Olli is making a decision about a route or speed. Olli can also provide suggestions for popular restaurants or historical sites based on the stated personal preferences of the passenger.
Access to a rich set of APIs democratizes cognitive capabilities
IBM’s NLP is based on years of research and development. The bulk of the technology originated from the Watson supercomputer that played Jeopardy! against past champions and beat them quite soundly. The same technology has since evolved and grown, and is now being democratized and made available for a broader set of inventors, innovators, and engineers to use.
The ability to analyze large amounts of literature and text, and then extract knowledge and semantics from it, is what IBM has been striving to achieve. Using these new technology capabilities enables organizations to create valuable applications that continue to extend the boundaries of innovation. But more importantly, IBM is helping to democratize these capabilities by making them readily available and consumable through APIs, analytics, and cloud. Only a few years ago, in order to engineer this kind of system, an organization might have needed a team of PhDs working alongside developers to manage a project that delivered results.
In an API economy, developers are comfortable going straight to the API to start prototyping with it. A common model is to use the API for 30 minutes at a time over 30 days, test it, and then decide what to do with it. For developers and engineers, Natural Language Processing (NLP) instances offer access to a rich set of APIs for enhancing and improving the user experience surrounding their IoT applications and devices. To this end, IBM is making a concerted effort in the cognitive space to decompose the big monster into a set of small, easy-to-use, consumable services. When creating a cognitive system on the hardware side, developers also need a way to interact with a microphone and probably a speaker.
The Olli project is the perfect example of the democratization of capability — developers are exposed to technology advancements (rendered as APIs) in consumable formats that are easily accessible, at scale, on a cloud platform. The maturity of the technology, coupled with the fact that it now exists in the form of an API — something ready for consumption from a programmer’s perspective — makes all the difference. Now any developer can access the capability without having a PhD in text analytics and natural language processing, and can very quickly make the API effective in a new domain. This is a radical change in the market. But how does it work?
Orchestrating IoT solutions
When an IoT solution is defined using Natural Language Processing and voice aspects, invoking a single API is not enough. Typically a developer needs to compose several APIs, define an API flow, define the learning aspects and the feedback loop, and create a training set both initially and later on. A critical step is to ensure the solution architecture can be described, tested, and eventually deployed. IBM offers developers several capabilities to address this step. One tool that is very easy to use is Node-RED — a well-known, open-source environment that allows a developer to create these interactions. Node-RED is well suited to this purpose because all the cognitive APIs are exposed as nodes in a visual environment.
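The kind of API flow Node-RED wires together visually can be sketched in plain code. The stage functions below are stubs standing in for cognitive service nodes (a real flow would call the Watson Speech to Text and NLP services), but they show the composition and the feedback loop that feeds retraining.

```python
# Illustrative sketch only: a voice-command flow composed of staged
# services, with a feedback log that can later be used for retraining.

def speech_to_text(audio):
    # Stub: a real flow would call a speech-to-text service here.
    return audio["transcript"]

def classify_intent(text):
    # Stub intent classifier trained on one key phrase.
    return "start_dishwasher" if "dishwasher" in text else "unknown"

def execute(intent):
    status = "dispatched" if intent != "unknown" else "ignored"
    return {"intent": intent, "status": status}

def run_flow(audio, feedback_log):
    text = speech_to_text(audio)
    intent = classify_intent(text)
    result = execute(intent)
    feedback_log.append((text, intent))  # feedback loop: retraining data
    return result

log = []
print(run_flow({"transcript": "please start the dishwasher"}, log))
```

The design point is that each stage is independently replaceable — exactly what exposing each cognitive API as a separate node gives you.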
Aggregating higher level services into pre-built patterns
At present, IBM provides a wide range of services for implementing Natural Language Processing (NLP) patterns. There are more than eight services on IBM Bluemix directly related to NLP and several more that can be used to further analyze text content. IBM continues to build out NLP services and expand capabilities that enable devices and systems to better understand human intent. More specifically in IoT, IBM is aggregating higher level services that provide pre-built patterns. For example, a voice interaction pattern for a device pre-integrates two to three Watson services to deliver an interactive voice experience with any device. IBM is still learning how to accelerate these processes to improve time to value for Natural Language Processing (NLP) patterns.
Creating an evolving knowledge base
In order to achieve a successful implementation of Natural Language Processing (NLP), the developer needs to create a knowledge base on which an intelligent system can rely — in effect, an NLP system that continues to regenerate itself, augmented over time as it learns. As a cognitive system, it starts from a base of knowledge that is refined during initial training, and then continues to evolve with every new piece of information that is added. To understand the knowledge base, developers need to understand what a cognitive system is and how it becomes relevant to IoT — essentially a process that replicates how human beings process nuances in language, taking into account variable conditions such as weather or mood (and indeed weather can affect mood).
Can systems simulate human decision-processing which includes logic and emotion?
Think about any decision a person makes – from a simple decision about whether to take the stairs or the elevator, to something more complex that involves evaluating a set of criteria – for example, selecting a new model of car, or determining a holiday destination. It is through observation and information input that humans make decisions – we rely on facts to rationalize decisions, looking at artefacts and evidence to weigh up alternatives. In so doing, we interpret decisions through the lens of our own values – what we care about, how we think, what motivates us. In parallel, human decisions are influenced and sometimes ruled by emotional responses to situations. It is well documented that people tend to make emotional decisions first, and then use facts and statistics afterwards as a means of rationalizing their decision. It is for this reason that sentiment analysis becomes so interesting and relevant.
Understanding subtleties through sentiment analysis
Sentiment analysis is one way of applying a layer of contextual data that can help guide a system or device towards a deeper understanding of meaning and concepts which are rooted in emotion, yet derived through language. The ability to appreciate and assimilate subtleties such as sarcasm and tone of voice becomes incredibly important when deciphering language and meaning in everyday conversation – and is equally important in the automation and application of intelligent services or APIs used within devices. The question is whether comprehension of emotion can influence machine learning enough to appropriately shape the outcome of machine-generated decisions.
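To make the idea concrete, here is a deliberately minimal, lexicon-based sentiment sketch. Production sentiment services rely on trained models rather than word lists, and the vocabulary below is purely illustrative.

```python
# Illustrative sketch only: a tiny lexicon-based sentiment scorer.
# Real services use trained models; these word lists are hypothetical.

POSITIVE = {"great", "love", "helpful", "fast"}
NEGATIVE = {"broken", "slow", "hate", "noisy"}

def sentiment(text):
    """Classify text by counting positive versus negative words."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("the turbine is noisy and slow"))  # negative
```

A lexicon approach obviously misses exactly the subtleties the section discusses — sarcasm and tone — which is why trained models and contextual signals matter.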
Applying Natural Language Processing patterns in machines
The ability to make informed decisions requires the ability to understand value functions. Value functions enable individuals to evaluate the end result — how a specific decision will influence metrics. By maximizing or minimizing a specific value, a decision is made. That’s the natural process humans go through every day — making every decision without even thinking about it — it’s like breathing for the brain. However, making a computer operate like this is very different from the natural way a human brain processes information. In a programming model, creating a basic or deterministic decision relies on decision-tree analysis — following a specific script or trace until the program eventually does something. Watson tries to do the same thing — to define the field of knowledge, which is the domain. During the process of gathering the body of knowledge in a given area, human intervention is used to tell the machine which resources are valuable, and which values should be disregarded or deprioritized.
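A value function of the kind described can be sketched as a weighted score over decision criteria; the weights and options below (including the stairs-versus-elevator choice mentioned earlier) are illustrative assumptions.

```python
# Illustrative sketch only: decision making by maximizing a value
# function. Criteria, weights, and scores are hypothetical.

WEIGHTS = {"cost": -1.0, "comfort": 2.0, "speed": 1.5}

def value(option):
    """Weighted sum over the decision criteria."""
    return sum(WEIGHTS[k] * option[k] for k in WEIGHTS)

def decide(options):
    """Pick the option that maximizes the value function."""
    return max(options, key=lambda name: value(options[name]))

options = {
    "stairs":   {"cost": 0, "comfort": 1, "speed": 1},
    "elevator": {"cost": 0, "comfort": 3, "speed": 2},
}
print(decide(options))  # elevator
```

The human-intervention step the text describes corresponds to choosing the weights: telling the machine which criteria are valuable and which to deprioritize.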
The simplest pattern where Natural Language Processing (NLP) can be applied is a command-response pattern. The Natural Language Processing (NLP) services can be trained on a few key words that map to actions the device can initiate, such as “Watson, please start the dishwasher.” More complex dialog-type patterns (think about the Doctor in Star Trek: Voyager — “Please state the nature of the medical emergency”) are not out of the question; they just take a bit more effort and time to train, given the fluid nature of these exchanges.
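At its simplest, the command-response pattern reduces to a mapping from trained key phrases to device actions. The phrases and action names below are hypothetical.

```python
# Illustrative sketch only: the simplest command-response pattern,
# mapping a few key phrases to device actions. Names are hypothetical.

COMMANDS = {
    "start the dishwasher": "DISHWASHER_START",
    "close the garage door": "GARAGE_CLOSE",
    "turn on the lights": "LIGHTS_ON",
}

def respond(utterance):
    """Return the device action for the first matching key phrase."""
    text = utterance.lower()
    for phrase, action in COMMANDS.items():
        if phrase in text:
            return action
    return "UNRECOGNIZED"

print(respond("Watson, please start the dishwasher"))  # DISHWASHER_START
```

A dialog-type pattern replaces this single lookup with a trained model plus conversation state, which is where the extra training effort goes.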
Common examples of human and system interactions
In terms of ideas for the future, there are two compelling areas of interaction between humans and systems or machines that offer real promise and are worthy of further exploration. The first and most common example is the interaction between a user and a device or thing that results in a hands-free experience. There are any number of potential situations where hands-free voice interaction between a person and a simple thing, or a complex IoT system, results in immediate value being derived by the user.
One example is where the user is not necessarily just a consumer sitting in a house speaking into a microphone or to a speaker. A user might also be a driver in a vehicle interacting through NLP with the vehicle; a passenger in a train or a bus, like Olli; an employee on a production floor; or a doctor in surgery or a patient in a hospital room.
A few months ago, IBM Watson IoT and Thomas Jefferson University Hospital formulated an idea around the future of healthcare. The team envisioned a hospital setting where patients can speak naturally to a cognitive concierge in their hospital room that will be able to answer questions, adjust their environment based on personal preferences, and anticipate their needs.
Very soon after those early conversations, the teams got to work on a pilot project to produce a working prototype of a cognitive environment of care. Via a speaker-based interface, the envisioned smart hospital room can answer questions and execute requests that are very specific to the context of the user. The system will be able to correlate information across building systems, patient records, CRM systems, and administrative records to draw patterns, remember preferences, and allow for a personalized, engaging, and interactive patient experience. The solution is on the cutting edge of a cognitive future that will deliver amazing customer experiences, put the user experience above all, and usher in the cognitive era.
A second, more complex scenario is the incorporation of contextual information for the purpose of guiding decision making. Take a situation where a maintenance worker needs to follow a specific procedure. When completing the task or procedure, the maintenance worker has several options, which include asking someone, searching on the internet, or consulting a manual or other piece of literature to learn how to complete the task.
A third option becoming available right now is to ‘ask the system’. Instead of trying to find out why a system has displayed an error code, you could ask the system something like: ‘What does this error code mean? Why am I seeing this error code? Why does the red light continue to blink after I’ve replaced this pump?’ Another scenario where ‘interrogating’ a system is useful is when the driver of a vehicle is alerted by a blinking red light. Instead of pulling over to consult the manual, the driver can simply ask the vehicle: ‘What is this red light?’ To answer the question, the vehicle needs to tell the system what the red light is, and communicate any error codes or other metrics in context. In this instance, the physical environment is extremely important to answering the question.
Facilitating the ability to interrogate a system
Although there is a lot of sophistication in the plumbing, it is easy to see how NLP, used in conjunction with a text analytics capability, facilitates a person’s ability to interrogate a system. Giving individuals the ability to ask questions without having to type into an internet search engine enables contextual information to be used to help identify an answer or solution; the internet lacks the context for that particular IoT device. The benefit of being able to ‘interrogate’ a system becomes relevant because the system is able to use data about its current state and can offer suggestions based on that knowledge and understanding. Not only does it know what is going wrong, it knows what the error code means and has access to the system’s history from the messages that the device sent over the last week or year. All this information – whether it is from yesterday or the previous year – provides context to help understand why something might not be functioning in the present.
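The value of on-device context can be illustrated with a small sketch: the same error code gets a different explanation depending on the device's current state and recent history. The error code, state fields, and messages below are all hypothetical.

```python
# Illustrative sketch only: explaining an error code using current
# device state and message history. Codes and fields are hypothetical.

def explain(code, state, history):
    """Return a context-aware explanation for an error code."""
    if code == "E42" and state.get("pump_replaced_recently"):
        recent = [e for e in history if e["code"] == "E42"]
        if recent:
            return "E42 persists after pump replacement: check wiring harness."
        return "E42: pump pressure low."
    return f"{code}: see manual."

state = {"pump_replaced_recently": True}
history = [{"code": "E42", "when": "yesterday"}]
print(explain("E42", state, history))
```

An internet search for "E42" could never produce the first answer, because only the system knows the pump was just replaced and the code recurred.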
Enabling human and system collaboration
Imagine a world where artificial intelligence and cognitive computing work in tandem with employees and extended partners to run more efficient operations through natural language recognition, speech-to-text transformation, and video and image recognition. For example, Ricoh, the office supply giant, uses the IBM Watson IoT platform to embed cognitive capabilities into its whiteboards, using the Watson Natural Language Classifier API to enable an interactive, cognitive capability by capturing and translating speech in real time. The whiteboards are currently on display in the IBM IoT headquarters in Munich, so that clients can experience their capabilities first hand.
Reach new heights of efficiency by creating natural interfaces between humans and machines
Contextual understanding and analysis are all services that Watson can provide today to enable staff and technicians to catch issues early, maybe even before an issue turns into a real problem. Watson’s services combined with IoT can drastically change and improve how an organization can keep up with thousands of machines, and hundreds of thousands of component parts. What is that level of efficiency worth to an organization at scale?
Cognitive capabilities enable organizations and individuals to reach new levels of efficiency, create unique customer experiences, and offer more opportunities for businesses to grow.
“Cognitive computing provides incredible opportunities to create unparalleled, customized experiences for customers, taking advantage of the massive amounts of streaming data from all devices connected to the Internet of Things, including an automobile’s myriad sensors and systems.” – Harriet Green, IBM General Manager, Watson Internet of Things, Commerce & Education
Like speech and language, Natural Language Processing (NLP) is a unique technology — there’s really nothing else quite like it, although we’ve been referencing it in literature and film for decades. Natural Language Processing (NLP) solutions in IoT range from simple command-response exchanges in a given domain, to rationalizing a user’s location, IoT data, and third-party data to truly understand the context a user is in and is asking questions about. These areas represent two ends of a spectrum of training and correlated intelligence, and today both require a high degree of planning and training.
Start your cognitive journey with Watson IoT Platform
IBM can help software engineers, developers, and data scientists to easily and securely connect devices, create apps that bring together more data, and perform analytics that yield new insights using the Watson IoT Platform. The platform makes it easy to take advantage of Watson APIs, including machine learning and image, video, and text analytics, to build more advanced apps and create products that adapt and evolve over time to meet changing demands. Developers can easily integrate new data sources, such as weather data, to enrich analytics insights.
- Explore cognitive capabilities using a variety of smart Watson Services.
- Find more Watson services in the Bluemix catalog.
- A complete list of Watson services and their details can be accessed on the Watson Developer Cloud portal.
- Start developing with Watson IoT Platform at no charge.
- Discover more about Watson IoT Platform capabilities by visiting Developer Resources.
A number of recipes are available on the IBM developerWorks site that discuss integrating the Watson APIs with Node-RED:
- Getting started using Node-RED and Watson IoT Platform.
- General Architecture for Voice Interaction Quickstart guide.
- Watson building concierge.
- Cognitive application using Watson Text to Speech that enables voice interaction between device and human.
- Explore these cognitive computing recipes to gain cognitive IoT skills.
- Bookmark the cognitive IoT cookbook for future updates.
1:IDC, Worldwide Internet of Things Update, 2016-2020, doc# US40755516, May 2016
2:IDC, Worldwide Internet of Things Update, 2016-2020, doc# US40755516, May 2016