This entry is the first in a series of blog entries covering the topic of AI (Augmented Intelligence) and IP (Intellectual Property).
Intellectual Property includes Patents, Trade secrets, Know-How, Domain Names, Copyrights, Trademarks, Service Marks and Defensive Publications. For this discussion, we will focus on patents.
Analyzing a patent portfolio quickly becomes overwhelming and laborious, and analyzing all patents worldwide is nearly impossible. There is simply too much information in too many formats, languages and sources. Let’s take the use case of Prior Art. The US Patent Office now has over 10 Million patents! Factor in applications, issued patents, abandoned filings, etc. and the number grows considerably. There are at least 10 Million Patent records worldwide. These patents are spread across more than 200,000 CPC Codes. Trying to gather this information and make sense of it in a timely manner is nearly impossible. Since Prior Art also includes published information, we also need to consider the over 4 Billion web pages and, of course, all publications, trade journals, etc., all in different formats and languages.
Now let’s look at the current commercial search and analysis tools. They gather data from various patent offices, and some attempt to ‘cleanse’ the data by correcting ownership, corporate trees, etc. In addition to this cleansing, some provide their own ‘value added data’ such as technology concepts, valuation numbers, etc. These tools provide various forms of searching: by company, by keyword and semantic, to name a few. In most cases, these tools are basic search tools with advanced filtering capabilities, which gives the user a relatively efficient way to perform patent-based searches. The ‘value added data’ (concepts, categories, rankings, etc.) may be determined via AI techniques such as machine learning; however, a complete AI approach has yet to be utilized in these commercial tools.
There is also the difficulty of term matching. Some tools use stemming and synonym tables to attempt to match terminology. However, the user has little to no control over how these features are applied and little insight into the definitions behind them. As a result, terms used in the search may not exactly match the terms in the data, and relevant documents may be excluded from the search results. Most companies have their own taxonomy that is different from the patent office codes, yet no commercial tools allow you to add your own.
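To make the term matching problem concrete, here is a minimal sketch of how stemming plus a synonym table can decide whether a document is found at all. Everything in it is an assumption for illustration: the `SYNONYMS` table, the `naive_stem` helper and the `matches` function are hypothetical, and the suffix stripping is a crude stand-in for a real stemmer. It is not how any particular commercial tool works.

```python
# Hypothetical, user-editable synonym table: each term maps to a canonical concept.
# In a real setting this is where a company's own taxonomy could be plugged in.
SYNONYMS = {
    "automobile": "vehicle",
    "car": "vehicle",
    "truck": "vehicle",
}


def naive_stem(word: str) -> str:
    """Strip a couple of common English suffixes (a crude stand-in for a real stemmer)."""
    for suffix in ("ing", "s"):
        # Only strip when at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(text: str, synonyms: dict[str, str]) -> set[str]:
    """Lowercase, strip punctuation, stem, then map each stem through the synonym table."""
    tokens = (t.strip(".,;:()").lower() for t in text.split())
    stems = (naive_stem(t) for t in tokens if t)
    return {synonyms.get(s, s) for s in stems}


def matches(query: str, document: str, synonyms: dict[str, str]) -> bool:
    """A document matches when every normalized query term appears in it."""
    return normalize(query, synonyms) <= normalize(document, synonyms)


if __name__ == "__main__":
    doc = "A braking system for automobiles."
    print(matches("cars braking", doc, SYNONYMS))  # synonym table applied
    print(matches("cars braking", doc, {}))        # no synonym table
```

With the synonym table, "cars" and "automobiles" both normalize to the same concept and the document is found. With an empty table, the same query misses it. This is the kind of behavior a user cannot inspect or adjust when the table is hidden inside the tool.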
Once a search has been performed, the commercial tools provide ‘analytics’. These analytics are basically tools that gather and present the data in various formats (bar charts, line graphs, etc.). The only real analysis provided is potentially in the ‘value added data’, where items such as concepts or technologies may help interpret the results. The analysis tools themselves are nothing more than charting tools, and the user is still required to perform the analysis.
The currently available commercial tools are very good, but limited in the types of analysis they support and in their flexibility of usage. User-based searching and analysis is additionally limited by the user’s timeframe, budget, knowledge and skills. These factors directly control the type of analysis performed and affect the outcome. I believe the commercial tools available today are insufficient to perform a proper analysis that includes insights.
Artificial/Augmented Intelligence seems to be the only practical answer to this data/analytics problem. As with any new technology, we are all learning how to apply AI to IP effectively. In future blogs we will discuss the different challenges AI brings to the solutions.
IBM Watson is capable of ingesting, digesting, understanding and analyzing massive amounts of unstructured data, in several languages, rapidly and thoroughly. It has a Natural Language understanding of complex documents and technologies and, even better, can be taught to understand additional nuances and technologies as needed (a feature lacking in commercial patent search tools). I believe that using Watson as a co-worker, to augment the effort required to read, understand and analyze this data, is the most cost-effective and efficient way to handle this data problem. We have created IBM IP Advisor with Watson, leveraging AI for fast ingestion, better insights and analytics.
The next blogs will cover AI topics such as Model Generation, Enrichments, Advanced Analytics, etc. Stay tuned.