What does it take to make data fit for purpose?

I just finished reading IBM’s Science & Technology Outlook (STO) 2021. The report starts with a statement that really resonated with me.

“COVID-19’s impact on the world has emphasized the importance of science.” 

Reading the report, what stands out to me is the emphasis on the scientific method as an approach to discovery, as well as the accelerating need to scale discovery in order to build knowledge and make decisions.

“At the same time, science is experiencing a sea change of its own, with data and artificial intelligence being used in new ways to break through long-standing bottlenecks in scientific discovery.” (IBM STO 2021)

Besides the method for discovery, you also need to have something to discover from, namely Data.

The STO 2021 report sparked a lot of thoughts in relation to some of my favorite topics: Data and Science.

Challenges of making data available

As a positive but also realistic person, I first think of some of the challenges of making data available for scientific discovery, or any discovery for that matter. These challenges are far from new. Rather, they have been there for as long as I can remember: making data available at the right time, in the right place, with known quality.

The timing is getting more challenging as the needed data has to be current and up to date, even for discovery and analytical purposes. As stated in the STO 2021, “We need science to move faster”. This means continuously reducing the time from when data is created to when it is available for another purpose, usually referred to as data latency.
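
To make the term concrete, here is a minimal sketch in Python of how data latency could be measured for a single record. The function names, the example timestamps and the one-hour threshold are illustrative assumptions, not something taken from the STO 2021.

```python
from datetime import datetime, timedelta, timezone


def data_latency(created_at: datetime, available_at: datetime) -> timedelta:
    """Time from when data is created to when it is available for another purpose."""
    if available_at < created_at:
        raise ValueError("available_at must not be earlier than created_at")
    return available_at - created_at


def is_current_enough(latency: timedelta, max_latency: timedelta) -> bool:
    """Check the latency against a threshold chosen for the intended use (hypothetical)."""
    return latency <= max_latency


# Hypothetical record: created at 08:00 UTC, available to consumers at 08:45 UTC.
created = datetime(2021, 3, 1, 8, 0, tzinfo=timezone.utc)
available = datetime(2021, 3, 1, 8, 45, tzinfo=timezone.utc)

latency = data_latency(created, available)
print(latency)                                         # 0:45:00
print(is_current_enough(latency, timedelta(hours=1)))  # True
```

What counts as "current enough" depends on the purpose, so the threshold is passed in rather than fixed in the code.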

Where to place the data?

The place where the data is needed is also getting more challenging as new types of collaboration develop across enterprises and their partners. The STO 2021 notes that “Accelerated discovery requires integration of multiple complex workflows with different experts, implementers, and stakeholders”. The answer to this challenge is often “data needs to be available in the cloud”, but a better answer is perhaps “data available anywhere”. So, the definition of data placement should take data collaboration challenges into account and ensure that data is placed flexibly and dynamically, and is easy to move and share. However, this is not only a data placement challenge; it is also very much a data security challenge.

Known data quality

And finally, there is the challenge of delivering data with known quality, meaning that it is validated and described from the consumer’s perspective. This is especially challenging when combined with the two aspects above. Although the STO 2021 does not specifically call out data quality as a challenge, any increase in data use will increase the data quality challenge. The paper does, however, bring up some other data governance aspects: “Putting values into practice requires a “by-design” mindset, infusing privacy, security, and ethical considerations into our engineering and technology development—from the very outset.”

In the end, the data needs to be made available in a way that makes it easy to understand and consume for the person or application that requires it, whether for discovery or other purposes.

The power of the cloud

Back to my positive attitude: I believe that there are new ways and technology advancements to handle the growing needs of data and analytics, utilising the power of cloud capabilities:

  • Possibility to manage and store data in a diverse and flexible way
  • Dynamic scalability and conditional workload distribution of compute
  • Cost reduction through storage optimized for the type of data and type of consumption, and through pay-as-you-go commercial models

I also believe that, in a growing number of cases, there is an opportunity to consolidate data management by serving both operational and analytical needs, as their requirements overlap to a large extent.

Fit for purpose data

In summary, my key reflection from reading IBM’s STO 2021 is that business has a lot to apply from scientific approaches and methods, and that we need to take an even broader view of data needs and ensure that data is fit for purpose:

  1. Fit for purpose from a quality standpoint. Is the data statistically sound for the intended use?
  2. Fit for purpose from a timing perspective. Is the data current enough to form the basis for the intended use?
  3. Fit for purpose for the user. Whatever you discover in data, it needs to be understood in order to take action!

Then a final thought appears: if the purpose is to really discover new things in data, data in any shape should be considered. However, my experience tells me that you need to define how the data would be fit for purpose, even for discovery, in order to understand and take action on the findings you make.

Closing with a quote from STO 2021, which in my opinion summarises the relevance and importance of both Science and Data:

“The pandemic has highlighted the potential of science both to produce critical breakthroughs and serve as a rigorous methodology to build knowledge and make decisions.”

Please share your thoughts and reflections on data and accelerated discovery!

Link to the STO:
