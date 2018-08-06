Setting up a Big Data platform on-premises often requires a significant infrastructure investment to support data ingestion, processing, enrichment, storage, and analytics. Enterprises looking to migrate their applications and Big Data platforms to the cloud (to leverage its agility and scalability and to move from a significant capex investment to a pay-as-you-go model) should consider setting up a Big Data platform on IBM Cloud.

Businesses can reap the benefits of a Big Data-as-a-service solution on the cloud by leveraging IBM Message Hub (managed Kafka), IBM Streaming Analytics, IBM Analytics Engine (built on open-source Apache Hadoop and Apache Spark), and IBM Cloud Object Storage. Deploying a cloud solution provides flexibility and ease of use without the headaches of setup or high maintenance costs. Furthermore, data scientists can start providing value right away by accessing and analyzing data sets directly from Cloud Object Storage with IBM Data Science Experience.

Helping a mid-size company migrate to the cloud

A few months ago, the IBM Cloud Garage partnered with a mid-size company to assess their entire application portfolio and transform it for the cloud. At the end of our initial assessment, we provided a transformation vision and implementation plan based on the IBM Cloud Garage Method. We also provided a target cloud architecture, implemented squad models, and created an actionable plan that divides projects into multiple Minimum Viable Products (MVPs) and uses a strangler pattern to modernize and migrate their applications to the cloud.
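To make the strangler pattern mentioned above concrete, here is a minimal sketch in Python. It is an illustration only, not the client's implementation: the names `StranglerFacade`, `legacy_platform`, and `cloud_ingestion` are hypothetical, and the capabilities are stand-ins. The idea is that a facade sits in front of the legacy system and routes each capability to the new cloud service once that capability's MVP has been migrated, falling back to the legacy system for everything else.

```python
from typing import Callable, Dict

# A handler takes a request payload and returns a response (hypothetical shape).
Handler = Callable[[str], str]


def legacy_platform(request: str) -> str:
    """Stand-in for the existing on-premises platform."""
    return f"legacy:{request}"


def cloud_ingestion(request: str) -> str:
    """Stand-in for a capability already rebuilt on the cloud."""
    return f"cloud:{request}"


class StranglerFacade:
    """Routes each capability to its migrated handler, else to the legacy system."""

    def __init__(self, legacy: Handler) -> None:
        self._legacy = legacy
        self._migrated: Dict[str, Handler] = {}

    def migrate(self, capability: str, handler: Handler) -> None:
        # Called once per MVP: from now on this capability bypasses the legacy system.
        self._migrated[capability] = handler

    def handle(self, capability: str, request: str) -> str:
        # Unmigrated capabilities keep working through the legacy handler.
        handler = self._migrated.get(capability, self._legacy)
        return handler(request)


facade = StranglerFacade(legacy_platform)
facade.migrate("ingestion", cloud_ingestion)
```

In this sketch, callers only ever talk to the facade, so each MVP can move one capability to the cloud without a big-bang cutover; the legacy system is retired once no capability falls through to it.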

In this article, I share insights from this engagement, along with the components of the target architecture, to help others implement a Big Data platform on the cloud.

Based on this engagement and several others, here are a few key requirements for establishing a secure cloud platform:

Having the flexibility to scale the infrastructure up and down

Establishing a simplified and consolidated technology stack

Using cloud-native technologies

Ensuring governance is part of the technology stack

Using continuous delivery

Reducing IT costs

Minimizing vendor lock-in through technology choices

Motivation to move the Big Data stack to the cloud

While many companies today seek to harness Big Data to cultivate new business insights, this mid-size company’s use of Big Data is integral to their core mission and is baked into many of their business decisions. Big Data powers their innovation in customer service by anticipating what customers like and how they will interact. They then learn from these interactions to improve future experiences.

Their current Big Data platform was adequate but fairly expensive to maintain. It was also lagging behind current software and hardware technologies due to multiple acquisitions and several integrations. The platform badly needed a technology stack upgrade and updated processes. An update would enable faster innovation, provide data governance, reduce maintenance costs, and, most importantly, create a single source of truth to resolve data inconsistencies, which in turn would increase the business value of their technology.

Assessing the current application portfolio and drafting target deployment models

Before executing a cloud migration and embarking on digital transformation, an organization must understand its long-term business goals, its pain points, the archaeology of its application and data infrastructure portfolio, and its own structure, operations, and processes. The IBM Cloud Garage performed a comprehensive review of the client’s applications supporting the core business functions, grouping them into categories to evaluate against various cloud deployment models. As previously discussed, this client’s Big Data platform is a core component of all their applications. Moreover, in our assessment, we discovered that the platform had been built over time from a mix of technologies, which presented a series of complexities to consider.

Target cloud architecture for the Big Data platform

When we joined the client, they had already started building a target architecture model that applied leading open-source technologies. In doing so, however, the client planned to implement a roll-your-own technology stack for a Big Data platform on the cloud, without leveraging any of the cloud-native services that allow for rapid provisioning (of Hadoop and Spark clusters, for example) or for flexible handling of data at rest with object storage.

Noting their goal of embracing open-source technologies, our team proposed a new target architecture that would still meet their key requirements. Our proposed architecture was split into three major categories to address the data flows: