The million dollar question: Where is my data?

The core features comprising Watson Data Platform, Data Science Experience and Data Catalog on IBM Cloud, along with additional embedded AI services, including machine learning and deep learning, are now available in Watson Studio and Watson Knowledge Catalog. Get started for free at

The era of the CDO

Ten years ago, Chief Data Officers (CDOs) were a rarity. Large corporations such as Visa, Capital One and Yahoo! led the way in appointing CDOs, but the job title had yet to become mainstream. Then the global financial crisis of 2007-2008 hit. Organizations heard the alarm bells ring, and CDOs – suddenly in high demand – were asked to help align operations with a raft of new regulatory requirements around data governance and reporting.

Around the same time, companies began to wake up to the value of big data, recognizing the competitive advantage that it could potentially deliver. As big data analytics moved to the forefront, the spotlight on CDOs intensified. Today, beyond enforcing compliance, their role has expanded to put them in the front line of organizations’ efforts to become data-driven enterprises.

The increased importance of CDOs is reflected in their growing numbers: in a recent survey, 54 percent of firms report having appointed one, up from just 12 percent in 2012. As the big data revolution gets underway, CDOs are seen as having the potential to make or break an organization’s fortunes. And to deliver on these big expectations, they need to overcome some major obstacles.


In the hot seat

To get a grip on their organization’s data, CDOs first need to know what data the company has and who is using it. These questions may seem simple, but in reality they are not easy to answer.

It’s usually possible—although not trivial—to work out the location and usage of enterprise data because organizations can look at their data warehouses and operational systems. However, when it comes to data from outside the company, such as social media streams or data that is available publicly, the picture quickly becomes more complicated. And with sentiment analysis and customer profiling growing in popularity, the usage of external data is becoming increasingly important.

Most CDOs have no straightforward way to gain a comprehensive view of the internal and external data used by their organization. They often have to poll different departments individually, before manually building an overview. This manual approach to data governance is rarely 100-percent effective—and whenever there’s a gap in coverage, there’s a risk that breaches will occur.


Why is it so important to find a solution?

Firstly, until a CDO understands the ‘current’ state, it’s very difficult to make positive changes. Without a 360-degree view of data, it can be difficult, if not impossible, to enforce governance policies. As compliance is one of the CDO’s key responsibilities, this is a challenge that needs to be overcome post-haste.

Secondly, most companies have data assets that they’re not using to their full potential, simply because nobody knows about them except the original data owner. Unless the CDO can discover these data sets and make them more widely available, they cannot unlock the benefit for the rest of the enterprise—a missed opportunity.

Thirdly, the true benefit of analytics only manifests itself when you can combine data sets to get a full picture of a situation—and without a comprehensive view of the data assets you have, it’s impossible for your data scientists to achieve this.

For example, a telecoms company looking to upsell new handsets to its customers might look at its own internal data and conclude that it isn’t worthwhile marketing to customers who have upgraded within the last six months. However, by combining that internal data with insight from social media, they might be able to identify high-value customers who love having the latest gadgets and technologies—and who would be happy to upgrade again whenever a new phone is launched. By giving line-of-business teams a richer selection of data sets to include in their analyses, the CDO can play a key role in making those lines of business more data-driven and successful.
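The telecoms scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the customer IDs, dates, the 180-day threshold standing in for "six months", and the `upsell_targets` helper are all hypothetical, and the "social media insight" is reduced to a simple set of customer IDs.

```python
from datetime import date

# Hypothetical internal records: customer ID -> date of last handset upgrade.
last_upgrade = {
    "c001": date(2017, 1, 10),
    "c002": date(2017, 6, 2),
    "c003": date(2017, 7, 15),
}

# Hypothetical external signal: customers whose public posts suggest
# they like having the latest gadgets.
gadget_enthusiasts = {"c002", "c004"}


def upsell_targets(last_upgrade, enthusiasts, today, min_days=180):
    """Return known customers worth marketing a new handset to.

    A customer qualifies if the last upgrade is at least `min_days` old,
    or if the external signal marks them as a gadget enthusiast.
    """
    targets = []
    for customer_id, upgraded_on in sorted(last_upgrade.items()):
        days_since = (today - upgraded_on).days
        if days_since >= min_days or customer_id in enthusiasts:
            targets.append(customer_id)
    return targets


today = date(2017, 9, 1)
internal_only = upsell_targets(last_upgrade, set(), today)
combined = upsell_targets(last_upgrade, gadget_enthusiasts, today)
print(internal_only)
print(combined)
```

With internal data alone, the recently upgraded customer `c002` is excluded; adding the external signal brings that customer back into the target list.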

To truly liberate value from data, companies must enable self-service analytics for employees. Consequently, CDOs are also under pressure to ensure users can find and access the data they need. If CDOs are struggling to gain insight into their own data landscape, how can they present it to anyone else? It’s clear that they need to find a way to bring together all their sources of data and open them up for self-service—all within the context of a robust security and governance framework.


So, what can they do?

The answer to these pain points lies in a smart catalog solution. First, companies need to index their organization’s internal data storage, and then catalog their external sources too. Even these initial steps will deliver value, so it’s a relatively painless way for CDOs to make inroads in the journey towards a data-driven culture.
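As a rough illustration of what indexing internal and external sources in one place means, here is a toy in-memory catalog in Python. It is not the IBM Data Catalog API; the `DataAsset` and `Catalog` classes, the asset names, and the location strings are all hypothetical, chosen only to show a single searchable index over both kinds of data.

```python
from dataclasses import dataclass, field


@dataclass
class DataAsset:
    name: str
    origin: str      # "internal" or "external"
    location: str    # hypothetical pointer to where the data lives
    tags: set = field(default_factory=set)


class Catalog:
    """A toy index of data assets, searchable by tag."""

    def __init__(self):
        self._assets = {}

    def register(self, asset):
        # Index the asset's metadata; the data itself stays where it is.
        self._assets[asset.name] = asset

    def search(self, tag):
        # Return the names of all assets carrying the tag, in sorted order.
        return sorted(a.name for a in self._assets.values() if tag in a.tags)


catalog = Catalog()
catalog.register(DataAsset("billing_warehouse", "internal",
                           "warehouse://billing", {"customer", "finance"}))
catalog.register(DataAsset("social_mentions", "external",
                           "stream://social", {"customer", "sentiment"}))

found = catalog.search("customer")
print(found)
```

A search for `customer` returns both the internal warehouse and the external stream, which is the single view the CDO is missing when departments have to be polled one by one.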

For example, by providing a single place to find both enterprise and non-enterprise data, catalogs remove a huge burden from data engineers. Instead of being distracted by endless requests for data from different departments, engineers can focus on core projects while users find assets for themselves.

A data catalog can also encourage employees to take a collaborative approach to data sharing and analytics, confident that governance policies will be applied automatically. For example, employees who are used to working locally on spreadsheets might become comfortable publishing their analyses back into the company's data catalog, in case they are helpful to other staff members. In this way, CDOs can lead a shift away from redundant, repetitive work towards more efficient and productive analytics.

Critically, this democratization of access to and sharing of data does not come at the cost of data governance. With greater transparency, plus automatic enforcement of governance policies, CDOs will have more control than ever.


Benefits for everyone

The real beauty of such a solution is that regardless of where an organization is in its data journey, embracing a smart catalog can provide benefits extremely quickly. In stark contrast to the months or even years you’ll spend building a data lake, a solution such as IBM Data Catalog (currently in beta) empowers you to provision an instance and start connecting data sources in a matter of minutes.

For CDOs, enabling this sort of agility goes to the core of their mission. Gone will be the days of months-long analytics projects based on static, outdated data. In their place comes a new age, where data will truly become the new currency, delivering value to everyone who comes into contact with it. If you'd like to learn how your organization can get started, request access to the Data Catalog beta today.

Jay Limburn

Michael Tucker
