25/03/2020 | Written by: Peter Den Haan and Ronald van der Knaap
Categorized: Data & AI
Companies and institutions that take the step towards applying artificial intelligence (AI) and machine learning (ML) often overlook the technical integration and internal alignment that these capabilities require. As a result, AI/ML initiatives never make it into production. With the right insights and techniques, this promising technology can be put to good use.
With the help of ML we have gained insights into complex subjects, and AI can take this one step further by supporting employees in a wide range of activities. Very complex AI models that could not be executed within a reasonable time in the past can now be run with today’s technology.
As a company, how are you going to use and shape all these new possibilities? Ronald van der Knaap, Data & AI Architect at IBM, has an answer: “With Cloud Pak for Data, all the capabilities you need are combined into one platform. Cloud Pak for Data gives you an integrated and governed experience for all the different functions you need to ingest data, transform data, create and manage ML/AI models, and deploy them so they can be easily consumed by users.”
Peter den Haan, Technical Sales Manager, agrees: “Everyone talks about AI and getting value out of data, but it’s not that simple. Everyone does something, but adopting the right approach systematically and incorporating it into your operational landscape often doesn’t happen.”
Van der Knaap likes to illustrate the importance of good data and AI models by asking whether people would let themselves be treated in hospital on the basis of a computer model, and whether they would trust the outcome. “In that case, the data must be of high quality and as a user, the doctor in this case, you must also trust the data.”
In the vision paper ‘Ladder to AI’, IBM summarised how organisations can prepare themselves to make good use of the possibilities of AI.
Structured Data, Governance
There is an important point here: everyone needs to understand what the data they work with actually means. Van der Knaap: “A hardcore data scientist understands the code that has been programmed by his team, but a citizen data scientist, who typically is more comfortable using visual methods, does not have that skill. Yet both must have the same understanding of the data and the operations applied. That is why it is also important to ensure that data governance methods are used and no mistakes are made in the interpretation of the data.”
Den Haan: “The use of visual programming is very practical. It improves, for example, the discussion with an end user, in this case a doctor, who really understands the logical pieces of the puzzle but cannot program them himself. In this way you can validate as an end user and subject expert.”
“In the end, the doctor knows which assumptions are logical and which are not. This requires collaboration with the data scientist,” adds Van der Knaap.
Deep learning, a widely used and more recent technique within artificial intelligence, further improves machine learning modelling. However, deep learning creates new challenges. Outcomes of deep learning models are more difficult to explain. In other words, it is no longer clear how a model arrives at an outcome. Van der Knaap explains that IBM has functionality to validate whether an outcome is indeed correct. “We can reason back via certain parameters to check that a model does not suffer from certain incorrect assumptions (biases).”
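One simple way to make such a bias check concrete is to compare how often a model gives a favourable outcome to different groups. The sketch below is illustrative only and is not IBM's validation functionality: the records, the field names ("group", "approved") and the function names are all hypothetical. It computes the disparate impact ratio, a common fairness measure in which values well below 1.0 (often below 0.8, the so-called four-fifths rule) suggest the model may be treating one group less favourably.

```python
from collections import defaultdict


def favorable_rate_by_group(records, group_key, outcome_key):
    """Return the share of favourable (truthy) outcomes for each group."""
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for record in records:
        group = record[group_key]
        totals[group] += 1
        if record[outcome_key]:
            favorable[group] += 1
    return {group: favorable[group] / totals[group] for group in totals}


def disparate_impact(records, group_key, outcome_key, unprivileged, privileged):
    """Ratio of favourable-outcome rates: unprivileged group / privileged group."""
    rates = favorable_rate_by_group(records, group_key, outcome_key)
    if rates[privileged] == 0:
        raise ValueError("privileged group has no favourable outcomes")
    return rates[unprivileged] / rates[privileged]


# Hypothetical model predictions, invented purely for illustration.
predictions = (
    [{"group": "A", "approved": True}] * 4
    + [{"group": "A", "approved": False}] * 1
    + [{"group": "B", "approved": True}] * 2
    + [{"group": "B", "approved": False}] * 3
)

ratio = disparate_impact(
    predictions, "group", "approved", unprivileged="B", privileged="A"
)
print(f"Disparate impact ratio: {ratio:.2f}")
```

A low ratio does not prove that a model is biased, but it tells the data scientist and the subject expert where to look first.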
Analysing unstructured data is a big task and AI can be very helpful with that. Van der Knaap says that it is quite normal for people to make certain connections that a computer would normally not be able to make. People build up a lot of specific domain knowledge if they work in a certain domain for a long time. New users do not yet have this knowledge, and acquiring it takes a long time. It would help if these users could search through an enormous amount of documentation about that domain and quickly find an answer. For example, as machines become more complex, it becomes impossible for one engineer to oversee and know everything. Certain processes must not be allowed to come to a standstill, and if something goes wrong, an engineer must be able to identify the problem very quickly.
“The system must understand what you’re asking”, says Van der Knaap. “This is what we call linguistic intelligence, but in the beginning the system is still oblivious. You have to introduce that knowledge as if you were raising a child. Then such a system learns the vocabulary, the dependencies between all kinds of components, in a completely different way than with a straightforward search system. That’s what we use IBM Watson for.”
“The bottom line is that it’s about being able to search through documents very quickly and not just on words, but on understanding what it means. This adds a whole new functionality to the underlying IBM platform, in this case the analysis of unstructured data from text and language, together with a tool for annotation, which you can easily add to the IBM platform by means of cartridges.”
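The difference between searching on words and searching on meaning can be illustrated with a toy example. The sketch below is not how IBM Watson works internally: it is a minimal illustration under the assumption that domain knowledge is supplied as a small hand-made vocabulary of related terms. The vocabulary, the documents and all function names are hypothetical.

```python
import re

# Hypothetical domain vocabulary "taught" to the system: each concept maps to
# terms that an experienced engineer would treat as related.
DOMAIN_VOCABULARY = {
    "pump": {"pump", "impeller", "compressor"},
    "leak": {"leak", "leaking", "seepage", "drip"},
}


def tokenize(text):
    """Lower-case the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_search(query, documents):
    """Return documents that contain every query word literally."""
    terms = tokenize(query)
    return [doc for doc in documents if terms <= tokenize(doc)]


def expand(term):
    """Return the term plus all terms that share a concept with it."""
    related = {term}
    for words in DOMAIN_VOCABULARY.values():
        if term in words:
            related |= words
    return related


def concept_search(query, documents):
    """Return documents that mention each query word or a related term."""
    results = []
    for doc in documents:
        doc_tokens = tokenize(doc)
        if all(expand(term) & doc_tokens for term in tokenize(query)):
            results.append(doc)
    return results


documents = [
    "Seepage detected near the impeller housing",
    "Replace the air filter every six months",
]

keyword_hits = keyword_search("pump leak", documents)
concept_hits = concept_search("pump leak", documents)
print("Keyword search:", keyword_hits)
print("Concept search:", concept_hits)
```

The literal keyword search finds nothing for "pump leak", while the vocabulary-aware search returns the maintenance note about seepage near the impeller. Real systems learn far richer relationships, but the principle of introducing domain knowledge first is the same.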
As a final example, Van der Knaap describes a case about shop locations and assortments. “A customer in London, for example, wanted to know how to get more people into the store. It had to become clear where the target group lives, what they are interested in and which products the store needs to stock to prevent people from only buying online. We researched this with many different data points. And we did that for more projects in cities all over the world,” says Van der Knaap.
“And you can also use that for shared bikes, smart bikes and supply chains: patterns that recur in many other projects,” adds Den Haan.
The number of questions you can investigate using machine learning and AI is almost endless. It is a constant journey of research and implementation in which everything becomes a little bit smarter with each incremental step. Make sure to use an integrated platform approach when setting up the data management chain between data source and machine learning/AI environment, so that it brings maximum value to your organization, with optimal quality and appropriate governance.