All you really need to know about open source databases

Perhaps you’ve heard about NoSQL and open source databases but are still wondering what they are and whether you should even care. Maybe you’ve looked online and found pages of technical material that you couldn’t follow.

If only there were one place to get all you really need to know…

Well, there is, and you’ve found it here!

Explaining the terms

The term NoSQL came into common use in 2009 to describe the emergence of new databases that do not store data in strictly defined tables of rows and columns. It refers to non-relational databases, and is often read as “not only SQL”.

Closed source means software whose source code is kept secret to prevent copying. Open source is the opposite — software whose source code is open and available for study, modification and even redistribution. Open source software is often free to download and use.

Summary: Open source databases are database systems whose source code is openly available. An open source database can be relational (SQL) or non-relational (NoSQL).

Why should you care?

There are two forces at work in the database market today: the need for new applications and the need to lower costs. The need to lower costs doesn’t seem like anything new, but the need for new apps is driving the need to lower costs (they cost money to develop). So why are new applications needed?

With the advent of Web 2.0, static web pages have become dynamic and social media is all around us. Everyone is tweeting, posting, blogging, vlogging, sharing photos, chatting and commenting.

The Internet of Things (IoT) is emerging: a rapidly growing network of connected devices, such as sensors and smart devices, that collect and exchange data.

Altogether, this generates huge amounts of new data that businesses want to absorb and use to stay ahead, to provide features such as product recommendations and a better customer experience. The data can be analyzed in search of patterns for applications such as fraud detection and behavior analytics.

Much of the new data is unstructured, which means that it can’t be neatly stored in a tabular database. Imagine trying to design a database to hold data on your grocery shopping — what you like, how often you buy it, whether you prefer milk or cream with your coffee.
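
To make the grocery example a little more concrete, here is a minimal Python sketch. The shoppers, field names and values are all invented for illustration. It simply counts how many cells would be left empty if records with different shapes were forced into one fixed table, and shows that each record can instead be kept whole as a JSON document.

```python
import json

# Hypothetical shopping records: each shopper supplies different fields,
# so no single fixed set of columns fits all of them.
shoppers = [
    {"name": "Ana", "coffee": {"with": "milk"}, "weekly_items": ["bread", "eggs"]},
    {"name": "Ben", "coffee": {"with": "cream", "sugars": 2}},
    {"name": "Cy", "allergies": ["nuts"], "weekly_items": ["rice"]},
]

# A tabular design needs one column for every field that ever appears.
columns = sorted({field for shopper in shoppers for field in shopper})

# Count the cells that would be empty if the data were forced into that table.
empty_cells = sum(
    1 for shopper in shoppers for column in columns if column not in shopper
)

# A document-style store can keep each record as it is, for example as JSON text.
documents = [json.dumps(shopper) for shopper in shoppers]

print(columns)
print(empty_cells)
```

Every new kind of preference adds another mostly empty column to the table, while the document form just gains a field on the records that need it.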

New types of databases are needed to store the new data, and they need to be non-relational and ideally low cost. Ring any bells? Not relational as in NoSQL and low cost as in open source.

Types of NoSQL databases

We have seen that new data needs new databases, so it follows that a variety of new databases are needed to address the variety of new data and the applications that use it. The main types are listed here:

  • Key-value databases such as Redis store key and value data in memory for ultra-fast lookup.
  • Document databases store data as self-describing documents, typically in a JSON-like format, rather than as rows in tables. MongoDB is the best known and most widely used.
  • Wide-column store databases are similar to key-value but allow a very large number of columns. They are well suited for analyzing huge data sets, and Cassandra is the best known.
  • Graph databases such as Neo4j are used to explore the relationships that link data together, allowing rapid execution of complex queries over millions of connections. Use cases include recommendations, social networks and fraud detection.
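
The four types above can be pictured with plain Python data structures. This is only a rough sketch of the shape of the data in each model, not the API of Redis, MongoDB, Cassandra or Neo4j. All keys, names and values are made up, and `friends_of_friends` is a hypothetical helper standing in for the kind of relationship query a graph database answers.

```python
# Key-value: one value per key, looked up directly by that key.
key_value = {"session:42": "user=ana;cart=3"}

# Document: a nested, self-describing record; fields can differ between documents.
document = {"id": 1, "name": "Ana", "orders": [{"sku": "A1", "qty": 2}]}

# Wide-column: each row key maps to its own, possibly very large, set of columns.
wide_column = {
    "sensor-1": {"2024-01-01T00:00": 20.5, "2024-01-01T00:01": 20.7},
    "sensor-2": {"2024-01-01T00:00": 18.1},
}

# Graph: nodes plus the relationships between them (here, who follows whom).
follows = {"ana": {"ben"}, "ben": {"cy"}, "cy": set()}


def friends_of_friends(graph, person):
    """Return people two hops away who are not already followed directly."""
    direct = graph.get(person, set())
    two_hops = set()
    for friend in direct:
        two_hops |= graph.get(friend, set())
    return two_hops - direct - {person}


suggestions = friends_of_friends(follows, "ana")
print(suggestions)
```

The last lines hint at why graph databases suit recommendations: the question "who do my contacts follow that I don't?" is a walk along relationships rather than a join across tables.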

Redis, MongoDB, Cassandra and Neo4j are all open source and there’s no cost to use them. Paid enterprise editions are available that include support and additional features. Even so, enterprise editions are much less expensive than traditional commercial databases.

Open source relational databases

Companies are looking for money in their IT budgets and discovering how much is spent on support and maintenance of traditional relational database systems. And it’s a lot. Estimates vary but some say up to 35 percent of software infrastructure spending. Switching to lower-cost open source software saves money, which is why an estimated 78 percent of enterprises use it, including open source databases:

  • MySQL is the world’s most popular open source relational database. It came to Oracle in 2010 through Oracle’s acquisition of Sun Microsystems, and Oracle now charges for support. A free “community” edition is still available.
  • MariaDB is a drop-in replacement for MySQL. Uncertain about MySQL’s future with Oracle, many users have migrated to MariaDB. Support subscriptions are available from Mariadb.com.
  • PostgreSQL has a strong reputation for reliability and data integrity. It’s feature-rich and is often considered more robust and better performing than MySQL. The community edition is free.
  • PostgresPURE is available from Splendid Data on a subscription basis. It is built on PostgreSQL but with added tools and support to make an enterprise package.
  • EnterpriseDB (EDB) is also based on PostgreSQL but with additional features and tools, most notably Oracle compatibility features (which are closed source), enabling Oracle shops to transition to EDB more easily than to other PostgreSQL variants. EDB charges for these extras and for support.

Free databases – too good to be true?

Get ready for a surprise: Open source databases aren’t really free, because businesses usually choose subscription editions with support. Okay, maybe that’s not too surprising. And here’s something else: businesses don’t really care if they are open source or not. Very few people really want to tinker with the source code.

But open source NoSQL databases enable innovation with new data, and open source relational databases are lower cost compared to traditional relational database management systems.

Open source databases on IBM Power Systems

The open source databases described here all run on x86. But what is available for companies that have already invested in, or wish to invest in, IBM Power Systems? MongoDB, EnterpriseDB, Redis, Cassandra, Neo4j and MariaDB are all available on IBM Power, and all perform better than on a similarly configured x86 system. In fact, IBM guarantees 2x price-performance over x86 for MongoDB and 1.8x for EnterpriseDB, which means that clients choosing IBM Power can expect better performance and less server sprawl.
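
For readers who like to see the arithmetic, here is a small Python sketch of what a "2x price-performance" claim implies. It assumes price-performance means work done per unit of cost. Every figure below is hypothetical and chosen only to keep the sums easy; none of them are IBM measurements or prices.

```python
def price_performance(throughput, cost):
    """Work done per unit of cost (higher is better)."""
    return throughput / cost


# Hypothetical baseline: 1,000 units of work for a cost of 100.
x86 = price_performance(throughput=1000, cost=100)

# A "2x price-performance" claim means twice the work for the same cost.
power = 2 * x86

# Hypothetical workload the business needs to run.
workload = 5000

cost_on_x86 = workload / x86
cost_on_power = workload / power

print(cost_on_x86, cost_on_power)
```

Under these assumptions the same workload costs half as much, or, put the other way round, the same budget handles twice the workload, which is where the reduction in server sprawl would come from.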

This should cover the basics, but if you’d like to learn more about open source databases on IBM Power Systems, start here.

2 Comments

Stu Cunliffe

Great article Rick. How easy is it for a customer to migrate from Oracle to one of the open source relational databases?


Rick Murphy

Hi Stu, thank you for your comment.

There are several options available for clients migrating from Oracle to open source databases. As for how easy it is for clients to migrate, the answer is always “it depends”!

To a large extent it depends on how complex the application is and what advanced Oracle features the database uses. If the Oracle database lies at the heart of a deeply connected and integrated network of applications, that’s going to be much more complex than a one-database-one-application situation.

I’d always advise clients to undertake a Migration Assessment to look at issues like this before they start migration projects.

Because Oracle is a relational database, it makes sense for clients to first consider an open source relational database such as EnterpriseDB or PostgresPURE. Both are based on Postgres, which is highly respected, very stable and has the enterprise features that clients need.

EnterpriseDB has several ‘Oracle compatibility’ features that make it easy for clients to transition. There is compatibility with Oracle PL/SQL, compatibility with OCI and compatibility for applications with embedded SQL. Together with tools designed to be familiar to Oracle DBAs, this makes a powerful combination that minimizes the need for application changes and staff retraining.

EnterpriseDB says that 51% of client applications need little or no modification to work with their Postgres Advanced Server product, and their Migration Toolkit assists with the migration process.

Splendid Data’s PostgresPURE product takes a more native approach, with no ‘add ons’ for Oracle compatibility. Many clients prefer to “go native” if they are transitioning to Postgres, and Splendid Data offers a migration toolkit as well as services to assist clients. IBM has several success stories with Splendid Data in Europe, and these are available to IBM sellers and Business Partners as references.

Finally, some clients like to approach migration to open source databases head-on, and choose a non-relational database like MongoDB. Although application modifications will be needed, MongoDB’s schema-less design offers flexibility and agility that’s best-in-class, and clients such as Shutterfly and eHarmony have chosen this path. The migration process takes longer, but the resulting benefits and cost-savings can be significant. With their ability to scale both vertically and horizontally, Power servers are particularly well-suited for MongoDB.

My apologies for such a long reply to your simple question, but I hope it answers the question.