January 23, 2018 | Written by: Josh Mintz
Categorized: Data Analytics | Open Source
When your business is operating at web scale, every microsecond counts. For most of the past decade, it has been an open secret that website performance has a measurable impact on financial results – for example, a study conducted by one major web retailer revealed that every additional 100 milliseconds of latency resulted in substantial and costly losses in revenue.
The message is clear – users have a low tolerance for unresponsive websites, and when you’re building an application that is going to have to handle significant traffic, you need to take latency and throughput seriously.
Targeting millisecond response times
That’s why, when IBM Cloud started building the new backend for its global cloud solutions catalog, making the right choices about architecture was a top priority.
Dr. Gili Mendel, IBM Master Inventor, shared the requirements: “We needed a platform that could serve up all the information about our cloud product catalog—including plans, tiers, pricing, regions and availability—to tens or even hundreds of thousands of clients. And we were aiming for average response times within 10 milliseconds, to give those users the best possible UX.”
Search for the silver bullet
When starting a new project, there is always a temptation to reach for some new “silver bullet” software that promises to solve all the problems you’ve ever experienced with previous technologies. All too often, this is a mistake – the new tool may alleviate some of your issues, but there are almost always trade-offs.
However, when it does work, the results can be spectacular—and the IBM global catalog project is one such happy example.
Rich Kulp, a senior developer on the project, said: “We built prototypes using traditional relational databases, and then we tried NoSQL options like MongoDB. But for our specific use-case, we just couldn’t get them to scale as efficiently as we needed.
“We spoke to one of IBM’s architects, and they suggested taking a look at Scylla. We were skeptical at first, but when we tested it out, Scylla gave us exactly what we needed. It’s very fast, it’s very scalable, and it’s very amenable to cloud development, with good multi-instance support and robust backup and restore capabilities.”
Dr. Mendel adds: “As the icing on the cake, Scylla is available as a fully managed service via IBM Compose – giving us an easy, auto-scaling deployment system which delivers high availability and redundancy, automated no-stop backups, and excellent technical support.
“With Compose, we don’t need to worry about database setup, infrastructure or management – that’s all handled by the IBM experts. That takes a lot of pressure and workload away from our development team.”
Why Scylla makes sense
In practice, Scylla isn’t quite such a leap of faith as many new database technologies. For one thing, the Scylla team and several other major companies have published a number of impressive benchmarks, suggesting the platform’s performance is no fluke.
More importantly, Scylla hasn’t attempted to reinvent the wheel by developing its own unique data model, query language, or APIs. Instead, it is designed to be a drop-in replacement for Apache Cassandra, a relatively mature technology that is widely used and well-understood throughout the developer community.
Unleashing monster performance
So why use Scylla instead of Cassandra? In a word, speed.
Cassandra runs on the Java Virtual Machine (JVM) on Linux. The JVM’s “stop the world” approach to garbage collection means that Cassandra requires complex, expensive strategies around memory allocation to avoid latency hiccups and low throughput. The reliance on the Linux kernel also adds an extra layer that Cassandra needs to go through before it can address data directly.
By contrast, Scylla has been built from scratch in C++, with a “shared-nothing” design that uses asynchronous message passing when information needs to be shared between cores, instead of shared memory and expensive locking operations. As a result, it can make highly efficient use of all the available processor cores, and its developers claim it can easily achieve 1 million Cassandra Query Language (CQL) operations per second on a single commodity server, with latency of under one millisecond for inserts, deletes and reads.
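To make the “shared-nothing” idea more concrete, here is a minimal sketch in Go, the language the catalog services are written in. It is an illustration of the general pattern only, not Scylla’s actual C++ implementation: the names `Store`, `request` and `response` are hypothetical, and goroutines stand in for per-core shards even though Go does not pin them to specific cores. Each shard owns a private map that no other goroutine can reach, and all access happens by sending a message to the owning shard’s inbox, so no locks are needed.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// request is a message sent to the shard that owns a key (hypothetical type).
type request struct {
	put   bool
	key   string
	value string
	reply chan response
}

// response carries the shard's answer back to the caller.
type response struct {
	value string
	found bool
}

// Store routes every key to exactly one shard; shards never share memory.
type Store struct {
	inboxes []chan request
}

// NewStore starts n shard goroutines, each owning a private map.
func NewStore(n int) *Store {
	s := &Store{inboxes: make([]chan request, n)}
	for i := range s.inboxes {
		inbox := make(chan request, 64)
		s.inboxes[i] = inbox
		go func() {
			data := make(map[string]string) // visible to this goroutine only
			for req := range inbox {
				if req.put {
					data[req.key] = req.value
					req.reply <- response{value: req.value, found: true}
					continue
				}
				v, ok := data[req.key]
				req.reply <- response{value: v, found: ok}
			}
		}()
	}
	return s
}

// shardFor hashes the key to pick the single shard responsible for it.
func (s *Store) shardFor(key string) chan request {
	h := fnv.New32a()
	h.Write([]byte(key))
	return s.inboxes[h.Sum32()%uint32(len(s.inboxes))]
}

// Put sends a write message to the owning shard and waits for the reply.
func (s *Store) Put(key, value string) {
	reply := make(chan response, 1)
	s.shardFor(key) <- request{put: true, key: key, value: value, reply: reply}
	<-reply
}

// Get sends a read message to the owning shard and returns its answer.
func (s *Store) Get(key string) (string, bool) {
	reply := make(chan response, 1)
	s.shardFor(key) <- request{key: key, reply: reply}
	r := <-reply
	return r.value, r.found
}

// Close shuts down all shard goroutines by closing their inboxes.
func (s *Store) Close() {
	for _, inbox := range s.inboxes {
		close(inbox)
	}
}

func main() {
	store := NewStore(4)
	defer store.Close()

	store.Put("plan:lite", "free tier")
	v, ok := store.Get("plan:lite")
	fmt.Println(v, ok)

	_, ok = store.Get("plan:missing")
	fmt.Println(ok)
}
```

Because a given key always hashes to the same shard, reads and writes for that key are serialized by the shard’s inbox rather than by a mutex. That is the trade the paragraph above describes: a small messaging cost in exchange for avoiding contention on shared memory.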
Harnessing a thoroughly modern stack
In the architecture that IBM has deployed to production, Scylla sits at the heart of a thoroughly modern technology stack. The global catalog services themselves are written in Go, and they run in Docker containers orchestrated by Kubernetes and Cloud Foundry (as in IBM Cloud Container Service). RabbitMQ is used to handle messaging between services, and both RabbitMQ and Scylla are hosted and managed by IBM Compose.
“Using Compose for ScyllaDB and RabbitMQ is really nice for us, because it abstracts away all of the operational concerns around managing hardware and keeping our database and message queues online,” says Dr. Mendel. “That means we can focus on application development, and making the catalog as intuitive and responsive as possible.”
To learn more about IBM Compose for ScyllaDB, visit our website, or check out the resources below:
- Getting started with ScyllaDB – a simple tutorial to help you create and connect to your first ScyllaDB instance on Compose
- Talking ScyllaDB – a fascinating interview with ScyllaDB’s CTO, Avi Kivity, about the history, philosophy and future of the database
- Hyver case study – a detailed account of how Hyver uses Scylla to store, manage and retrieve API keys for its web marketing platform in a highly scalable way