Everything you need to know about NoSQL, a type of database design that offers more flexibility than traditional databases.
What is a NoSQL database?
NoSQL, which stands for “non-SQL” (or, in another common reading, “not only SQL”), is a type of database design that does not follow the traditional row-and-column (i.e., “relational”) database design.
To better understand databases, let’s go back to the advent of the first databases designed for the masses, which appeared around 1960. Those databases included database management systems (DBMS) to allow users to organize large quantities of data.
The original DBMSs were flat-file/comma-delimited, often proprietary to a particular application, and limited in the relationships they could uncover among data. DBMSs were also complex.
This eventually led to the development of relational database management systems (RDBMSs). Relational databases arranged data in tables that could be connected or related by common fields, separated from applications, and queried with SQL. In other words, the relational database placed data into tables, and SQL created an interface for interacting with it.
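As a small illustration of tables related by a common field and queried with SQL, the sketch below uses Python's built-in sqlite3 module. The table names, column names, and sample rows are invented for this example and do not come from any particular system.

```python
import sqlite3

# In-memory database; all table and column names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customers(id), "
    "total REAL)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.5)],
)

# The common field customer_id relates the two tables, and SQL is the
# interface used to ask a question that spans both of them.
rows = conn.execute(
    "SELECT c.name, SUM(o.total) "
    "FROM customers c JOIN orders o ON o.customer_id = c.id "
    "GROUP BY c.id, c.name ORDER BY c.name"
).fetchall()
print(rows)
```

The schema is declared before any data is inserted, which is the up-front design step that the rest of this article contrasts with NoSQL's flexible schemas.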
Relational databases and SQL work well for large servers and storage media. But as larger sets of frequently evolving, disparate data became more common for things like e-commerce applications, programmers needed something more flexible than SQL. NoSQL is that alternative.
NoSQL databases are built for specific data models and have flexible schemas that allow programmers to create and manage modern applications. NoSQL is also more agile because it’s not built on the concept of tables and does not use SQL to manipulate or analyze data (although some NoSQL databases may have a SQL-inspired query language).
NoSQL encompasses structured data (data that conforms to a pre-defined format or data model), semi-structured data (data that contains tags or other markers to separate semantic elements and enforce hierarchies of records and fields within the data), unstructured data (information that either does not have a pre-defined data model or is not organized in a pre-defined manner), and polymorphic data (data that can be transformed to any distinct data type as required).
NoSQL enables you to be more agile and flexible and to iterate more quickly. A NoSQL database also enables a simpler design, better control over availability, and improved scalability.
Today’s cloud providers can support SQL or NoSQL databases. Which database you choose depends on your goals.
To learn more about the state of databases, see “A Brief Overview of the Database Landscape.”
Types and examples
NoSQL databases can accommodate a wide variety of data models, including key-value, document, column family (wide-column), and graph:
In the key-value structure, the key is usually a simple string of characters, and the value is a series of uninterrupted bytes that are opaque to the database. The data itself is usually some primitive data type (string, integer, array) or a more complex object that an application needs to persist and access directly.
This replaces the rigidity of relational schemas (schemas are basically a blueprint of how tables work) with a more flexible data model that allows developers to easily modify fields and object structures as their applications evolve. In general, key-value stores have no query language. They simply provide a way to store, retrieve, and update data using simple GET, PUT and DELETE commands. The simplicity of this model makes a key-value store fast, easy to use, scalable, portable, and flexible.
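The GET, PUT, and DELETE operations described above can be sketched with a toy in-memory store. This is only an illustration of the model, not the API of any real key-value database; the class name `KeyValueStore`, its method names, and the `cart:42` key are all hypothetical.

```python
import json
from typing import Dict, Optional


class KeyValueStore:
    """Toy in-memory key-value store; values are opaque bytes to the store."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        # Insert or overwrite; the store never inspects the bytes.
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # Return the stored bytes, or None if the key is absent.
        return self._data.get(key)

    def delete(self, key: str) -> bool:
        # Return True if a value was removed, False if the key was absent.
        return self._data.pop(key, None) is not None


store = KeyValueStore()

# The application, not the store, decides how to encode the object.
cart = {"items": ["book", "pen"], "total": 12.5}
store.put("cart:42", json.dumps(cart).encode("utf-8"))

# Changing the object's structure needs no schema change in the store.
restored = json.loads(store.get("cart:42").decode("utf-8"))
restored["coupon"] = "SPRING"
store.put("cart:42", json.dumps(restored).encode("utf-8"))

updated = json.loads(store.get("cart:42").decode("utf-8"))
deleted = store.delete("cart:42")
```

Because the store treats values as opaque bytes, it cannot filter on fields such as `total`; any such filtering would have to happen in the application.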
A document is an object whose keys (strings) map to values of recognizable types, including numbers, Booleans, and strings, as well as nested arrays and dictionaries. Document databases are designed for flexibility. They aren’t typically forced to have a schema and are therefore easy to modify. If an application requires the ability to store varying attributes along with large amounts of data, document databases are a good option. MongoDB and Apache CouchDB are examples of popular document-based databases.
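To make the idea of varying attributes concrete, the following sketch models documents as plain Python dictionaries. It is not the query API of MongoDB or CouchDB; the `find` helper, the field names, and the sample products are assumptions made for illustration.

```python
import json

# Two documents in the same collection with different attributes.
products = [
    {
        "_id": "p1",
        "name": "Laptop",
        "price": 999,
        "specs": {"ram_gb": 16, "ports": ["usb-c", "hdmi"]},
    },
    {
        "_id": "p2",
        "name": "T-shirt",
        "price": 15,
        "sizes": ["S", "M", "L"],
        "in_stock": True,
    },
]


def find(collection, field, value):
    """Return documents whose top-level field equals value.

    Documents that lack the field simply do not match.
    """
    return [doc for doc in collection if field in doc and doc[field] == value]


# Adding an attribute to one document does not affect the others.
products[1]["color"] = "blue"

in_stock = find(products, "in_stock", True)

# Documents of this shape serialize naturally to and from JSON.
round_trip = json.loads(json.dumps(products[0]))
```

Unlike the opaque values in a key-value store, the fields inside a document are visible, which is what makes a field-based lookup like `find` possible.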
Column family stores enable very quick data access using a row key, column name, and cell timestamp. The flexible schema of these types of databases means that the columns don’t have to be consistent across records, and you can add a column to specific rows without having to add it to every single record. The wide-column store data model, like that found in Apache Cassandra, is derived from Google's Bigtable paper.
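The addressing scheme of row key, column name, and cell timestamp can be sketched as nested dictionaries. This is a simplified mental model, not how Cassandra or Bigtable is implemented; the `ColumnFamily` class and its methods are hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```python
from collections import defaultdict


class ColumnFamily:
    """Toy column family: row key -> column name -> timestamped cells."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(list))

    def put(self, row_key, column, value, timestamp):
        # Each write adds a new (timestamp, value) cell version.
        self._rows[row_key][column].append((timestamp, value))

    def get(self, row_key, column):
        """Return the value with the highest timestamp, or None if absent."""
        versions = self._rows.get(row_key, {}).get(column)
        if not versions:
            return None
        return max(versions, key=lambda cell: cell[0])[1]

    def columns(self, row_key):
        """Return the sorted column names that exist for this row only."""
        return sorted(self._rows.get(row_key, {}))


users = ColumnFamily()
users.put("user:1", "email", "ada@example.com", timestamp=1)
users.put("user:1", "email", "ada@new.example.com", timestamp=2)

# Only this row gets a "phone" column; other rows are untouched.
users.put("user:2", "phone", "555-0100", timestamp=1)
```

Here `user:1` and `user:2` hold different sets of columns, and a read of `email` for `user:1` resolves to the most recent cell version.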
The modern graph database is a data storage and processing engine that makes the persistence and exploration of data and relationships more efficient. In graph theory, structures are composed of vertices and edges (data and connections), or what would later be called “data relationships.” Graphs behave similarly to how people think—in specific relationships between discrete units of data. This database type is particularly useful for visualizing, analyzing, or helping you find connections between different pieces of data. As a result, businesses leverage graph technologies for recommendation engines, fraud analytics, and network analysis. Examples of graph-based NoSQL databases include Neo4j and JanusGraph.
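The vertices-and-edges model can be illustrated with an adjacency map and two traversals: a shortest-path search and a simple "friends of friends" recommendation. This is a sketch of the underlying idea under the assumption of a small undirected graph held in memory; it is not the query language or API of Neo4j or JanusGraph, and the names in the graph are invented.

```python
from collections import deque

# Vertices are people; each edge is a mutual connection (hypothetical data).
graph = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave"},
    "dave": {"bob", "carol", "erin"},
    "erin": {"dave"},
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a list of vertices, or None if unreachable."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        # Sorted so that the traversal order is deterministic.
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


def friends_of_friends(graph, person):
    """Suggest vertices two hops away that are not already direct connections."""
    direct = graph.get(person, set())
    candidates = set()
    for friend in direct:
        candidates |= graph.get(friend, set())
    return candidates - direct - {person}


path = shortest_path(graph, "alice", "erin")
suggestions = friends_of_friends(graph, "alice")
```

Both functions follow edges directly from one vertex to the next, which is the access pattern behind use cases such as recommendation engines and network analysis.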
Examples of NoSQL databases
Many NoSQL databases were designed by young technology companies like Google, Amazon, Yahoo, and Facebook to provide more effective ways to store content or process data for huge websites. Some of the most popular NoSQL databases include Apache CouchDB, Apache Cassandra, MongoDB, Redis, and Elasticsearch.
When to use
Relational databases have been around since the 1970s, and technology has changed dramatically since then. A relational database uses SQL to perform tasks like updating data in a database or retrieving data from it. Some common relational database management systems that use SQL include Oracle, Db2, and Microsoft SQL Server. Maintaining high-end, commercial relational database management systems is expensive because they require purchased licenses, trained staff to manage and tune them, and powerful hardware.
NoSQL enables faster, more agile storage and processing, which means NoSQL databases are generally a better fit for modern, complex applications like e-commerce sites or mobile applications.
NoSQL databases’ horizontal scaling and flexible data model mean they can address large volumes of rapidly changing data, making them great for agile development, quick iterations, and frequent code pushes.
In a nutshell, the differences between relational databases and NoSQL databases come down to performance, availability, and scalability.
Some specific cases when NoSQL databases are a better choice than RDBMS include the following:
- When you need to store large amounts of unstructured data with changing schemas. NoSQL databases usually have horizontal scaling properties that allow them to store and process large amounts of data. And NoSQL enables ad-hoc schema changes. (In contrast, with a relational database, an engineer designs the data schema up front, and SQL queries are then run against the database; if subsequent schema changes are required, they’re often difficult and complex to carry out.)
- When you’re using cloud computing and storage. Most NoSQL databases are designed to be scaled across multiple data centers and run as distributed systems, which enables them to take advantage of cloud computing infrastructure—and its higher availability—out of the box. (For more, refer to “How to Choose a Database on the IBM Cloud.”)
- When you need to develop rapidly. With NoSQL, you don’t have to prepare data like you do if you’re using a relational database. That means you can develop and iterate more quickly.
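The horizontal scaling mentioned in the list above is commonly achieved by partitioning data across several nodes. The sketch below shows one simple approach, hash-based partitioning, with in-memory dictionaries standing in for nodes. The `shard_for` function and `ShardedStore` class are hypothetical, and real NoSQL databases differ in how they actually place data.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Toy store that spreads keys over several in-memory 'nodes'."""

    def __init__(self, shard_count: int) -> None:
        self._shards = [dict() for _ in range(shard_count)]

    def put(self, key, value):
        self._shards[shard_for(key, len(self._shards))][key] = value

    def get(self, key):
        return self._shards[shard_for(key, len(self._shards))].get(key)

    def sizes(self):
        """Return the number of keys held by each shard."""
        return [len(shard) for shard in self._shards]


sharded = ShardedStore(shard_count=4)
for i in range(100):
    sharded.put(f"user:{i}", {"id": i})

print(sharded.sizes())
```

A plain modulo scheme like this one remaps most keys whenever the shard count changes, so production systems often use techniques such as consistent hashing instead. The sketch also leaves out replication, which is what provides the availability discussed above.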
NoSQL and IBM Cloud
Today, many applications are delivered as services, and those services must be available 24/7, accessible from a wide range of devices, and scaled to what can potentially be millions of users.
NoSQL was created to manage the scale and agility challenges that face modern applications, but the suitability of a database depends on the problem it must solve. SQL and NoSQL are each suited to different use cases, so the right tool depends on what you are trying to accomplish. Further, over the past few years, SQL technologies like PostgreSQL have been bridging the gap between NoSQL and SQL by offering JSON support or scale-out capabilities. With IBM Cloud Databases for PostgreSQL, IBM offers enterprise-ready, fully managed PostgreSQL built with native integration into the IBM Cloud.
IBM Cloudant, in particular, is a scalable JSON document database optimized for web, mobile, IoT, and serverless applications. The service is compatible with an open source ecosystem that includes Apache CouchDB, PouchDB, and libraries for the most popular web and mobile development stacks.
Sign up for an IBMid and create your IBM Cloud account.