What you need to know about hybrid cloud data strategies
From time to time, we invite industry thought leaders to share their opinions and insights on current technology trends to the IBM Systems IT Infrastructure blog. The opinions in these posts are their own, and do not necessarily reflect the views of IBM.
I was recently chatting with two of my friends during a Zoom happy hour (because that is where we are at now). “Tom” works at a defense contractor with, as you can imagine, very intense data requirements. “Bob” works at a small pharma startup with different but equally significant data needs. Because of my role at WWT as a Multicloud Consultant, they wanted to talk to me about the challenges they are facing, and as they talked I started noticing some striking similarities.
Both Tom and Bob, in two completely different types of businesses:
- Deal with several “clouds” because of various data needs
- Have the same challenges around having a cohesive data strategy
- Need to answer the same question: Where should my data live?
With my curiosity piqued, I started questioning my other friends in other verticals and saw that they, too, were struggling with some of the same basic issues. With this in mind, I decided to try to tackle some of the challenges a company would face when looking at creating a strategy around their data in a multicloud world.
In the conversation with Bob and Tom, we had some disagreements that stemmed from a lack of cohesive definitions around the types of clouds. Before we get started, we need to define some of the “multis” in this multicloud world:
On premises: Data center or colocation facility. This is your owned IT infrastructure: You bought the servers, switches, and SANs, and you are responsible for every layer in some capacity.
Public cloud: A cloud services provider (such as IBM, AWS, Azure, or GCP) provides access to standardized resources and services, which are available to subscribers on a pay-per-use basis.
Private cloud: Also known as a corporate/internal cloud. Provides a cloud-like offering to the consumer of the cloud while still having a defined hardware footprint.
Hybrid cloud: Combines resources from private, public, and on-premises environments to take advantage of the cost effectiveness each platform can deliver.
Multicloud: Multiple private, public, and on-premises environments. Whether this came from M&A activity or data requirements, you now need to move data around several platforms, each with its own unique needs and “quirks.”
It’s probably safe to assume you’re asking the same questions as Bob and Tom about your company’s data. I had the chance to speak with Kelly Robinson, VP Global Sales for IBM Storage, and we discussed why on premises should be considered even when your strategy may be “cloud first.”
Deepa Krishnan, Director, Offering Management Storage Cloud SW and SaaS for IBM, writes in her blog post about the journey to hybrid cloud: “We shouldn’t think about problem solving as ‘To cloud or not to cloud?’ Instead, we should ask ourselves, ‘What is the problem I’m trying to solve?’ and ‘Are cloud deployments (public or private) going to optimize my solution?’”
It’s 2020 and data makes the world go ‘round!
To start creating a comprehensive cloud strategy, first you must know your data. Here are some of the (many) questions that need to be considered. Please note that each of these questions may need to be answered separately for many different datasets within your organization. I have seen companies that have as many as 17-20 separate tiers of data, each with its own requirements.
- What is the scope and size of your datasets?
- Where does most of your data live?
- What applications need to access this data?
- How easy is it to move said data?
- If you need to move this data, what needs to move/change with it?
- Who should have access to your data?
- More importantly, who should NOT have access to your data?
- How are you keeping an eye on the above two groups of people?
- How long is your data valid/relevant/useful?
- What about backups and disaster recovery options?
- Are there any compliance or governance rules dictating where and how long your data can live?
- Once it has passed this date, what is supposed to happen to it?
- How easy is it to sift through your data (structured vs. unstructured)?
- What kind of latency/performance requirements do your applications have?
- How often does your data need to be accessed?
- If the dataset grows, how does this impact performance?
Other factors (each of which could be its own very long article):
- Cost (price, performance, tiers, data transfer rates)
- Business continuity requirements
- Disaster recovery requirements
- High availability requirements
- Alignment of your data strategy to your company’s overarching business strategies and guiding principles
Answering these questions is one of the steps towards creating a comprehensive cloud strategy. Each answer helps paint the larger picture of where data can and cannot live.
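To make the idea concrete, here is a minimal Python sketch of how answers to a few of these questions could be recorded per dataset and used to rule locations in or out. Everything in it is an assumption made for illustration: the field names, the `allowed_locations` helper, the two example datasets, and all of the numbers are hypothetical, and treating a private cloud as an "owned facility" is a simplification. A real assessment would cover far more of the questions above.

```python
from dataclasses import dataclass
from enum import Enum


class Location(Enum):
    ON_PREMISES = "on premises"
    PRIVATE_CLOUD = "private cloud"
    PUBLIC_CLOUD = "public cloud"


@dataclass(frozen=True)
class DatasetProfile:
    """Answers to a few of the questions, for one dataset (hypothetical fields)."""
    name: str
    size_tb: float                     # scope and size of the dataset
    must_stay_in_owned_facility: bool  # a compliance or governance rule
    max_latency_ms: float              # what the applications can tolerate


@dataclass(frozen=True)
class LocationProfile:
    """What one candidate location can offer (illustrative numbers only)."""
    location: Location
    owned_facility: bool
    typical_latency_ms: float
    max_dataset_tb: float


def allowed_locations(dataset, candidates):
    """Return the locations that violate none of the dataset's requirements."""
    allowed = []
    for candidate in candidates:
        if dataset.must_stay_in_owned_facility and not candidate.owned_facility:
            continue  # governance rule excludes this location
        if candidate.typical_latency_ms > dataset.max_latency_ms:
            continue  # too slow for the applications using the data
        if dataset.size_tb > candidate.max_dataset_tb:
            continue  # dataset does not fit
        allowed.append(candidate.location)
    return allowed


# Made-up capabilities for three candidate locations.
candidates = [
    LocationProfile(Location.ON_PREMISES, True, 2.0, 500.0),
    LocationProfile(Location.PRIVATE_CLOUD, True, 5.0, 200.0),
    LocationProfile(Location.PUBLIC_CLOUD, False, 40.0, 10000.0),
]

# Two made-up datasets with different requirements.
regulated = DatasetProfile("regulated-records", 120.0, True, 10.0)
archive = DatasetProfile("cold-archive", 800.0, False, 100.0)

for dataset in (regulated, archive):
    names = [loc.value for loc in allowed_locations(dataset, candidates)]
    print(f"{dataset.name}: {', '.join(names)}")
```

With these invented numbers, the regulated dataset can live on premises or in the private cloud, while the large archive only fits in the public cloud. The point is not the specific rules but the shape of the exercise: each answered question becomes a constraint, and the constraints together narrow down where a given dataset can and cannot live.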
What requirements have you seen that impacted your data storage strategy? Get on Twitter and let us know.
Check it out here 😊: https://t.co/fa3yGU7jgI
Thanks to @GestaltIT for introducing us!
— Chris Williams ☁️🐍 (@mistwire) October 14, 2020