In a 2016 New York Times article about the challenges facing data scientists, one was quoted as saying, “We really need better tools so we can spend less time on data wrangling and get to the sexy stuff.”1
Manually converting or mapping raw data into data that can generate actionable insights is an important but repetitive task that takes a huge amount of time. Even the terminology for people who deal with colossal amounts of modern data makes the work sound like a chore: data wrangler, data janitor.
The “sexy stuff” is analysis and modeling to enrich the data relationship.
The key to any good relationship is understanding: comprehension, intelligence, discernment, and sentiment. Watson can bring that understanding to your data relationship, finding the gems within and bringing them to the surface.
Watson Discovery Service improves the developer's relationship with really big data, using cognitive capabilities with simple tooling and APIs to quickly upload, enrich, and index large collections of data. Cognitive systems can take massive quantities of data, even the stuff never intended for machine consumption, and put it into context. Discovery Service's suite of APIs provides a pipeline to ingest, store, and enrich your data and get to the good stuff.
Understand your data at scale
Watson ingests and standardizes all your data, no matter the type and no matter how much. Advanced ingestion capabilities allow for easy analysis of HTML, PDF, Word, and JSON files. An important design consideration: anything done in the tooling is also available through the public API, so you don’t need to integrate the tooling into your workflow. Just feed the pipeline and create a recipe of things you want to do with your data.
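To make the "recipe" idea concrete, here is a minimal sketch in plain Python. It does not call the Discovery API: the function names (ingest, enrich, index, run_recipe) and the word-count "enrichment" are hypothetical stand-ins, chosen only to show how an ordered list of steps can be applied to every document fed into a pipeline.

```python
import json

def ingest(raw):
    """Standardize a raw JSON string into a dict with a clean `text` field."""
    doc = json.loads(raw)
    return {"text": str(doc.get("text", "")).strip()}

def enrich(doc):
    """Stand-in enrichment: attach a word count (not a real Watson enrichment)."""
    return {**doc, "word_count": len(doc["text"].split())}

def index(doc, store):
    """Stand-in index: append the document to an in-memory list."""
    store.append(doc)
    return doc

def run_recipe(raw_docs, steps):
    """Apply each step of the recipe, in order, to every raw document."""
    results = []
    for raw in raw_docs:
        item = raw
        for step in steps:
            item = step(item)
        results.append(item)
    return results

store = []
recipe = [ingest, enrich, lambda doc: index(doc, store)]
processed = run_recipe(['{"text": "  Watson reads documents  "}'], recipe)
```

Because the recipe is just a list, steps can be reordered or swapped without touching the code that feeds the pipeline.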
Enrich your data
Original data sources can be messy, which makes running analysis on them difficult, if not impossible. Embedded Watson algorithms enrich documents with natural language understanding, sentiment and emotion analysis, and concept tagging. Discovery Service takes care of cleaning and normalizing data to make it available and ready to dig through.
Connect your data sets
Watson organizes documents, utterances, and facts across time, people, events, and more to show correlations in data and causal factors. Segment and search data to find time-based correlations, identify geo-spatial coordinates, and unearth dimensional correlations. Use entity and keyword extractions to find relationships not explicitly mentioned in data. Chain aggregations, for example by grouping on categories and then on terms, to bubble up deep information hidden in data.
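As a rough illustration of what chaining an aggregation means, the sketch below groups documents by one field and then counts the values of a second field inside each group. It is plain Python over in-memory data, not the Discovery query language; the documents, the field names (category, entities), and the helper functions are all invented for the example.

```python
from collections import Counter, defaultdict

# Hypothetical, already-enriched documents (field names are illustrative only).
docs = [
    {"category": "technology", "entities": ["IBM", "Watson"]},
    {"category": "technology", "entities": ["IBM"]},
    {"category": "finance", "entities": ["IBM", "NYSE"]},
]

def as_list(value):
    """Treat a single value and a list of values uniformly."""
    return value if isinstance(value, list) else [value]

def term_counts(documents, field):
    """Count how many documents contain each distinct value of `field`."""
    counts = Counter()
    for doc in documents:
        counts.update(set(as_list(doc.get(field, []))))
    return counts

def chained_term_counts(documents, outer_field, inner_field):
    """Group documents by `outer_field`, then count `inner_field` per group."""
    groups = defaultdict(list)
    for doc in documents:
        for value in set(as_list(doc.get(outer_field, []))):
            groups[value].append(doc)
    return {value: term_counts(group, inner_field)
            for value, group in groups.items()}

result = chained_term_counts(docs, "category", "entities")
for category in sorted(result):
    print(category, dict(result[category]))
```

The inner counts are scoped to each outer group, which is why an entity that dominates one category can be absent from another.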
Try Watson Discovery Service for free right now on IBM Bluemix. Upload your own sample docs or pull a pre-enriched public data set of primarily English-language news sources. Watson News is updated continuously, with hundreds of thousands of new articles and blogs added daily. It’s like access to a full newsroom at your fingertips. See a demo of what you can build here.
See for yourself what Watson Discovery Service can do. Get a free trial today.