Cloud Garage and the journey to cloud
Last year, I had the honour of giving the opening keynote at JavaLand. When I was invited to speak, my first thought was excitement, and my second was panic. What did I have to say about the cloud that would be interesting to 1,200 people? I started thinking about the stories we tell each other about cloud adoption. In the Cloud Garage, we see some patterns over and over again—a development organisation wants to achieve a significant improvement and realises moving to cloud could be the way to do that. Along the way, they encounter multiple challenges, but in the end, they succeed in adopting cloud and see enormous benefits.
The monomyth
The pattern sounds familiar, right? It turns out this story structure isn’t just for cloud adoption stories—it’s common to almost all of the stories we tell, both modern and ancient. We know this because of Joseph Campbell, an American literature professor who studied comparative mythology in the 1940s. Previous researchers had tended to focus on the differences between different myths, but Campbell noticed that there were also important similarities. He called this fundamental structure the monomyth.
Prometheus
Think about some myths you know. Were you told the story of Prometheus as a child? In technology, we mostly talk about Prometheus because of the open source monitoring framework. That framework is named after a Greek god, one of the Titans. Prometheus first created humanity from clay. It soon became clear that compared to other animals, humans couldn’t really compete. Humans were a bit slow, a bit spindly, couldn’t eat grass, and didn’t even have fur. In order to give humans a bit of a hand, Prometheus journeyed to Mount Olympus and stole fire from the gods. When he gave it to humanity, it gave people the fiery help they needed to thrive in a hard world.
Laconian kylix with Prometheus and Atlas, 550 BC. Photo by Karl-Ludwig Poggemann.
Isis and Osiris
Although you might not know the details, you probably have also heard of Isis and Osiris. The Osiris myth is an important story from Old Kingdom Egypt, around two and a half millennia BC. Osiris, a god and a king of Egypt, was murdered by his brother Set. Set claimed Osiris’s throne, but that’s not the end of the story. Osiris’s wife, Isis, sought out her husband’s body and brought him back to life temporarily. After Osiris rose from the dead, he and Isis conceived a son who would grow up to reclaim the throne and restore order to Egypt. This is an early example of a story of a god being killed and reborn; it’s a theme which appears in many world religions.
The Knights of the Round Table and the Holy Grail
Jumping forward several millennia, the old English legends of King Arthur and the quest for the holy grail have been retold in dozens of books and movies. The holy grail story begins with a vision of the grail appearing to the court of King Arthur. The knights realised they must find and restore the sacred vessel, hidden somewhere in Britain. Most of the knights were either killed or slunk home empty-handed. However, three of the knights, led by Sir Galahad, succeeded in finding the grail. Once the grail was found, that was basically mission accomplished. The successful knights were too noble to hold on to the cup as a trophy, and instead, the grail was magically lifted to heaven for safekeeping.
Verdure with Deer and Shields. An accompaniment to the Holy Grail tapestries woven by Morris & Co for Stanmore Hall. This version was woven in 1900, around 700 years after the Arthurian stories were first recorded.
The hero’s journey
Despite their superficial differences, these legends have quite a lot in common. Distilled to their essence, all of these stories (and many others) are actually just one story—the quest. The quest story begins with a call to action. The hero is initially resistant, but they become convinced that they must act. On their journey, they encounter many obstacles—some mental, some physical, and some with big teeth and tentacles. Ultimately, though, the hero succeeds in their quest and is rewarded.
According to Joseph Campbell, many stories and myths start with a call to action. Once underway, the hero must overcome overwhelming obstacles before receiving a dazzling reward.
Does the hero live happily ever after when they reach the reward? Not always. Rather than being a straight line, the hero’s journey is a continuous cycle. After enjoying the reward for a while, the hero gets a new quest and must face new challenges. Nowadays, we’d call that the sequel; Luke destroys the Death Star in Star Wars, but finds he has even bigger problems in The Empire Strikes Back. Completing the cycle, Luke has to destroy the Death Star again in Return of the Jedi.
The hero’s journey is often cyclical; after completing one quest, the hero must undertake another.
Star Wars as modern mythology
The similarity to Star Wars isn’t coincidental. George Lucas was heavily influenced by Joseph Campbell’s ideas. Campbell even visited Lucas’s house to discuss myth structures. The reason Star Wars resonates with so many of us is that Lucas had a deep understanding of what makes a powerful story. Star Wars itself has now become modern mythology. Parents teach their children about Star Wars so that even children too little to watch the films know about Darth Vader. In perhaps the ultimate recognition of its mythic status, grown-ups around the world (particularly in English-speaking countries) list their religion as “Jedi” on national censuses. According to the 2001 census, Jedi was the fourth-most popular religion in the United Kingdom. In the same year, New Zealand showed the highest number of Jedi per capita, at 1.5%.
Mark Hamill, who plays Luke Skywalker, at a press conference in 1980. Photo courtesy of the Dutch National Archives.
The hero inside
As much as many of us dream of being a Jedi Knight, realistically, it’s not going to happen. Nonetheless, we all have our own stories. We undertake quests. Some of them are long ones, like working towards a degree. Some of them are shorter ones, like trying to figure out how to get faster hardware or how to get rid of annoyingly repetitive tasks at work. We’re all the (small-scale) hero of our own (small-scale) stories. What motivates most of us is earning an audience for our code creations, having our colleagues think we’re brilliant, sharing knowledge, bringing value to our employer, and, ultimately, making the world better in whatever small way we can.
Many of us hope to be a workplace hero (perhaps without the cape).
The joy of cloud
For many of us, the details of our technical quests are actually similar. We’ve seen a huge transformation in how we do work in the last 20 years. When I started my professional career, we all needed multiple versions of databases and application servers to develop and test against. To access those dependencies, we had to install them ourselves, onto our own hardware. A laptop and a desktop machine were the minimum hardware required to do a developer job, but the more computers, the higher the developer status. The best-connected developers would have an array of monitors and four desktop machines purring away under their desks.
Like it usually does, being cool required suffering—installing and patching all those operating systems and applications was drudgery. Naturally, any services we ran were only available on localhost, or at most on the LAN.
Things have changed. There’s a genre of Twitter jokes about localhost which amuses—but puzzles—me.
The joke recurs regularly:
Sometimes the idea is the same, but the port is different because technology trends have changed.
It’s true that a lot more startups get pitched than succeed, and many startups are paper-only or fail to connect to their end user. But I don’t think the barrier in any of these cases is getting from localhost to the cloud. Putting something on http://my-cool-startup.mybluemix.net is as easy as:
How could that be a barrier to a startup? With Kubernetes, there’s a bit more code, but it’s still a one-line process:
It isn’t news that standing apps up on the cloud is straightforward. We all know that the cloud makes it easy to get software to end users, fast. If the happily-ever-after is speedy hardware, elimination of repetitive tasks like patching server farms, and business value, then the way we get there is cloud.
The call to action
You might imagine that by now everyone is using containers, but the numbers say otherwise. The Cloud Native Computing Foundation spoke to 504 users across the globe, ranging from developers to business managers, and found that only 25% were using containers. Last year, the number was 22%, so it’s going up, but slowly. Why aren’t the other 75% taking advantage of this transformative technology? In many quests, the first barrier is the hero themselves. A wise person calls the reluctant hero to action by telling them how much they’re needed. Instead of leaping onto the next horse or into the first container, the hero says, “No thanks, I’m cool where I am.” Think of Luke Skywalker telling Obi-Wan Kenobi that he’s terribly sorry but he can’t come to save the universe right now because he has to stay and help with the harvest.
Security on the cloud
What’s holding people back? Security fears are a big part of it, particularly in regulated industries. Although the real picture is much more nuanced, it feels like inside the firewall is safe and outside it is scary.
Nobody wants to be the person in their annual review saying: “Hey boss, I put all our sensitive data on the public cloud—unencrypted.”
The sad conclusion to the Prometheus myth is that, in revenge for the theft of fire, the gods chained poor Prometheus to a rock in the Caucasus mountains for all eternity. To make the punishment even nastier, an eagle pecked out his liver every day. It’s not easy being immortal. Modern corporate performance management methodologies are (I hope) less grisly than the Prometheus treatment, but do any of us on the quest to cloud want to take the risk and find out what happens if we get it very, very wrong?
Prometheus and the eagle, by Theodoor Rombouts (1597-1637).
Notice how much more expressive Prometheus’s face is in this seventeenth-century Flemish painting than it was in the sixth-century BC Greek illustration of the same scene. (I definitely wouldn’t want to be the Flemish Prometheus.) Painting made some huge technical advances in the intervening 22 centuries or so. Our ability to secure things in the cloud has also advanced a great deal in an extremely short time. Modern end-to-end and at-rest encryption methodologies can meet most regulatory requirements, even on public cloud. For maximum protection, it’s possible to encrypt data in memory.
IBM is also devising new ways of delivering cloud capabilities within the firewall. Many AI technologies need to be trained on large volumes of company data, some of it sensitive. Exporting all that to the public cloud, no matter how much encryption there is, gives corporate compliance teams the cold sweats. In recognition of that, IBM Cloud Private for Data now embeds many Watson services so they can run on-premise. As a happy side-effect, that brings the services to the data, rather than the other way round. Not moving huge volumes of data around is always a nice efficiency.
There’s another potential problem with modern cloud stacks, unrelated to encryption and even to the firewall. In a container, the whole operating system gets included in the built artefact. Since an operating system (like all other software) is a security attack surface, packaging an operating system within the build artefact increases a developer’s area of responsibility. Developers have suddenly become responsible for securing parts of the system which used to be the ops team’s headache because they were far down the stack.
In a modern containerised build artefact (right box), the developer who created the artefact is responsible for much more of the stack, including middleware and operating system. This is different from more traditional built images (left box), where the developer is only responsible for managing the application itself.
Being in charge of security is a scary thought for many developers, but there is a big silver lining. Infrastructure as code means containers can be rebuilt at any time to eliminate vulnerabilities. Redeploying something through a pipeline is usually a lot cheaper than patching or manually reinstalling. Infrastructure as code also means we can use tooling, such as IBM’s Vulnerability Advisor, to monitor and prevent vulnerabilities.
Example output from IBM’s Vulnerability Advisor.
This means that cloud can actually be more secure than the old way we used to do things.
So what happens after the hero accepts the call to action and starts moving workloads to the cloud? In particular, why are many enterprises having a hard time with their cloud quest? This is where we come back to the obstacles part of the hero’s journey.
One CIO told an IBM vice president the story of their move to the cloud. They’d been working on getting their apps to the cloud for two years. “Great!” said the vice president, “How many have you done?” “We’ve got 2% onto the cloud,” was the forlorn answer.
In the move to cloud, not all obstacles have large teeth or tentacles. Most of us have, at one point or another, encountered the dreaded, “It works on my machine.” The operating environment in the cloud is so different that many things which were best practices in the data centre are anti-patterns in the cloud. Cloud-native applications are built for the cloud environment, but older applications are sometimes stuffed with cloud-liabilities. To take advantage of the cloud, an application should be developed with agility, deployed using devops techniques, and elastically scalable. To work at all on the cloud, an application should be stateless.
Being stateless
It turns out that writing stateless applications is pretty hard, and converting a stateful application to a stateless one is very hard. Servers can appear and disappear at any time because the cloud solution to an under-performing container is to kill it and start again (a bit like Osiris and Isis). In order to behave correctly after a restart, application state must be externalised. This is why distributed caching services, such as Redis, are so important, as are logging and observability tooling, such as Prometheus.
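To make the idea of externalised state concrete, here is a minimal sketch. The names `StateStore`, `InMemoryStore`, and `VisitHandler` are hypothetical, and the dictionary-backed store is only a stand-in for a real external service such as Redis, so that the example runs anywhere. The point is that the handler holds no state of its own, so a replacement instance carries on where a killed one left off.

```python
from typing import Protocol


class StateStore(Protocol):
    """Anything that can persist counters outside the application process."""

    def increment(self, key: str) -> int: ...


class InMemoryStore:
    """Stand-in for an external store (for example, Redis); illustration only."""

    def __init__(self) -> None:
        self._counters: dict[str, int] = {}

    def increment(self, key: str) -> int:
        self._counters[key] = self._counters.get(key, 0) + 1
        return self._counters[key]


class VisitHandler:
    """A stateless request handler: all state lives in the injected store."""

    def __init__(self, store: StateStore) -> None:
        self._store = store

    def handle_visit(self, user_id: str) -> str:
        count = self._store.increment(f"visits:{user_id}")
        return f"Hello {user_id}, visit number {count}"


if __name__ == "__main__":
    store = InMemoryStore()  # lives outside any single application instance
    first_instance = VisitHandler(store)
    print(first_instance.handle_visit("ada"))

    # The orchestrator kills the first instance and starts a replacement.
    del first_instance
    replacement = VisitHandler(store)
    print(replacement.handle_visit("ada"))  # the count continues from the store
```

If the counter had been a field on `VisitHandler` instead, every restart would silently reset it, which is exactly the kind of bug that only shows up once instances start being killed and replaced.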
Optimising the right things
In the cloud, what we need to optimise for is different, and what our dependencies need to optimise for is also different. Although the situation is much better now, for a while even JVMs had some peculiar behaviours in the cloud.
Memory is money, so a small memory footprint is economically critical. Containers stop and start often, so a fast startup is key. When I first started using Kubernetes, I had a hard learning experience with a slow application. My application was resource-intensive on startup. I wasn’t generous enough when I configured the liveness probe, so when I tried to bring the application up, the container orchestrator decided the application had failed to start and killed it while it was still starting. It then started a second instance, but because of the extra load of the stops and starts, that one started even slower, so it got killed too. It didn’t take too much of this before the failures cascaded and no instances would start.
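The restart loop described above can be sketched as a small simulation. This is a deliberately simplified model, not a description of how Kubernetes schedules probes internally: the three fields mirror the real probe settings `initialDelaySeconds`, `periodSeconds`, and `failureThreshold`, but the assumption that each restart makes the next startup slower by a fixed amount (`slowdown_per_restart`) is made up for illustration, as are all the numbers.

```python
from dataclasses import dataclass


@dataclass
class LivenessProbe:
    """Mirrors three Kubernetes probe settings; the timing model is simplified."""

    initial_delay_seconds: int
    period_seconds: int
    failure_threshold: int

    def kill_deadline(self) -> int:
        """Seconds after container start at which the last allowed probe runs."""
        return self.initial_delay_seconds + self.period_seconds * (
            self.failure_threshold - 1
        )


def simulate_restarts(probe, startup_seconds, slowdown_per_restart, max_attempts=5):
    """Return a list of (attempt, startup time, survived) tuples.

    Assumption for illustration: every restart adds load, so each new attempt
    takes `slowdown_per_restart` seconds longer to start than the previous one.
    """
    history = []
    for attempt in range(1, max_attempts + 1):
        # The instance survives only if it is up before the final probe fails.
        survived = startup_seconds <= probe.kill_deadline()
        history.append((attempt, startup_seconds, survived))
        if survived:
            break
        startup_seconds += slowdown_per_restart
    return history


if __name__ == "__main__":
    stingy = LivenessProbe(initial_delay_seconds=10, period_seconds=5, failure_threshold=3)
    generous = LivenessProbe(initial_delay_seconds=60, period_seconds=10, failure_threshold=3)
    print(simulate_restarts(stingy, startup_seconds=30, slowdown_per_restart=5))
    print(simulate_restarts(generous, startup_seconds=30, slowdown_per_restart=5))
```

With the stingy probe, the application needs 30 seconds but is only given 20, so every attempt is killed and each retry starts from a worse position. With the generous probe, the same application comes up on the first attempt.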
Observability
My cascading failure is an extreme (and admittedly naive) example, but the way we need to do ops in the cloud is different from the way we do it in the data centre. For folks new to the cloud, it can feel like there’s a lack of transparency. How do we monitor things we can’t feel and touch? How do we know a server’s on fire if we can’t see the flames? Unless we adapt our ops techniques to work in a stateless environment, we risk losing the information we need to diagnose problems. The mythological Prometheus brought fire to people, and the software Prometheus plays a similar role, by moving information about server fires from some hidden place in the cloud down to the people who need to know. I’ve mentioned Prometheus in particular because of the name, but there is a range of services in this area; some of them, such as Datadog and New Relic, came earlier than Prometheus. No matter which one you choose, you’ll want some kind of monitoring solution for production workloads.
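As a rough illustration of what “moving information down to the people who need to know” looks like in practice, the sketch below counts requests and renders them in the plain-text format that Prometheus scrapes. It uses only the standard library so that it runs anywhere; a real service would more likely use an official Prometheus client library and serve the text over HTTP. The class `RequestMetrics` and the metric name `app_requests_total` are hypothetical.

```python
from collections import Counter


class RequestMetrics:
    """Counts requests and renders them in the Prometheus text exposition format."""

    def __init__(self) -> None:
        self._requests: Counter = Counter()

    def record(self, method: str, status: int) -> None:
        """Note one handled request, labelled by HTTP method and status code."""
        self._requests[(method, status)] += 1

    def render(self) -> str:
        """Return the metrics as text, one labelled sample per line."""
        lines = [
            "# HELP app_requests_total Requests handled, by method and status.",
            "# TYPE app_requests_total counter",
        ]
        for (method, status), count in sorted(self._requests.items()):
            lines.append(
                f'app_requests_total{{method="{method}",status="{status}"}} {count}'
            )
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    metrics = RequestMetrics()
    metrics.record("GET", 200)
    metrics.record("GET", 200)
    metrics.record("POST", 500)
    print(metrics.render(), end="")
```

Because the monitoring system pulls and stores these numbers, they survive even when the container that produced them has been killed and replaced.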
Management
Management of cloud applications is another area where many of us run into trouble. In order to take advantage of the fast release cycles of the cloud, we split monolithic applications up into smaller microservices, each with its own release lifecycle. This works great when things work, but what happens when the network is unreliable? What happens if a service goes down? What happens if an API changes? Is the change compatible with all of its consumers? Worse yet, what happens if an API stays the same but the internal semantics have changed? How should services discover each other? Service meshes and contract testing help with these issues. Nonetheless, despite their best efforts, some organisations attempt the switch to microservices but just end up with a distributed monolith.
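To give a flavour of the contract testing mentioned above, here is a minimal hand-rolled sketch. It is not how any particular contract-testing tool works; the contract `ORDER_CONTRACT`, its field names, and the function `contract_violations` are all hypothetical. A consumer writes down the fields and types it relies on, and the provider’s build checks its responses against that list before releasing.

```python
# Hypothetical contract: the fields (and types) one consumer relies on.
ORDER_CONTRACT = {"order_id": str, "total_pence": int, "status": str}


def contract_violations(response: dict, contract: dict) -> list[str]:
    """List every way `response` fails to satisfy `contract`.

    Extra fields in the response are allowed: adding data is a compatible change.
    """
    problems = []
    for field, expected_type in contract.items():
        if field not in response:
            problems.append(f"missing field: {field}")
        elif not isinstance(response[field], expected_type):
            actual = type(response[field]).__name__
            problems.append(f"{field}: expected {expected_type.__name__}, got {actual}")
    return problems


if __name__ == "__main__":
    compatible = {"order_id": "A1", "total_pence": 1250, "status": "paid", "currency": "GBP"}
    renamed = {"order_id": "A1", "total": "12.50"}
    print(contract_violations(compatible, ORDER_CONTRACT))
    print(contract_violations(renamed, ORDER_CONTRACT))
```

A check like this catches renamed, removed, or retyped fields. It does nothing for the nastier case where the shape of the API stays the same but the meaning changes, which is why that case is so hard.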
Even when everything functions as expected, the complexity of a cloud system can far exceed that of its on-premise predecessor. There may be hundreds or even thousands of microservices, and usually, it’s not even a single cloud—it’s multiple clouds. It’s a multicloud world and a hybrid cloud world. Managing all of the clouds often needs something like IBM Multicloud Manager to keep things straight.
The reward
With so many obstacles, is it even worth attempting the journey to the cloud? The answer is, emphatically, YES. The cloud has given us extraordinary rewards. It’s enabled whole new business models. It’s allowed us to release new software versions many times a day, rather than once a year. We can engage with our users and respond to their needs in a much more satisfying way than we could when software was shipped on CDs. Developers have been freed from a whole category of boring jobs (patching and installing and configuring). Data scientists are also experiencing a cloud revolution, with cloud-accessible shared data removing some of the most tedious parts of their job.
Many businesses started the journey to cloud in search of lower costs. As well as lower costs, they found it unlocked innovation and hugely improved end-user technologies. Many of the most exciting technologies, such as artificial intelligence and blockchain, have been born on the cloud and are mostly delivered by the cloud. Sharing resources on the cloud also gives us access to hardware too exotic to run locally, such as huge GPU clusters and quantum computers.
IBM Q System One is the world’s first integrated quantum computing system. IBM was also the first to make quantum computers available on the cloud.
Get started on your hero’s journey to cloud
To find out more about how the cloud can help your business and get started on your cloud journey, you can request a consultation with the IBM Cloud Garage.