NoSQL, also referred to as “not only SQL”, “non-SQL”, is an approach to database design that enables the storage and querying of data outside the traditional structures found in relational databases. While it can still store data found within relational database management systems (RDBMS), it just stores it differently compared to an RDBMS. The decision to use a relational database versus a non-relational database is largely contextual, and it varies depending on the use case.
Instead of the typical tabular structure of a relational database, NoSQL databases, house data within one data structure, such as JSON document. Since this non-relational database design does not require a schema, it offers rapid scalability to manage large and typically unstructured data sets.
NoSQL is also type of distributed database, which means that information is copied and stored on various servers, which can be remote or local. This ensures availability and reliability of data. If some of the data goes offline, the rest of the database can continue to run.
Today, companies need to manage large data volumes at high speeds with the ability to scale up quickly to run modern web applications in nearly every industry. In this era of growth within cloud, big data, and mobile and web applications, NoSQL databases provide that speed and scalability, making it a popular choice for their performance and ease of use.
Structured query language (SQL) is commonly referenced in relation to NoSQL. To better understand the difference between NoSQL and SQL, it may help to understand the history of SQL, a programming language used for retrieving specific information from a database.
Before relational databases, companies used a hierarchical database system with a tree-like structure for the data tables. These early database management systems (DBMS) enabled users to organize large quantities of data. However, they were complex, often proprietary to a particular application, and limited in the ways in which they could uncover within the data. These limitations eventually led to the development of relational database management systems, which arranged data in tables. SQL provided an interface to interact with relational data, allowing analysts to connect tables by merging on common fields.
As time passed, the demands for faster and more disparate use of large data sets became increasingly more important for emerging technology, such as e-commerce applications. Programmers needed something more flexible than SQL databases (i.e. relational databases). NoSQL became that alternative.
While NoSQL provided an alternative to SQL, this advancement by no means replaced SQL databases. For example, let's say that you are managing retail orders at a company. In a relational model, individual tables would manage customer data, order data and product data separately, and they would be joined together through a unique, common key, such as a Customer ID or an Order ID. While this is great for storing and retrieving data quickly, it requires significant memory. When you want to add more memory, SQL databases can only scale vertically, not horizontally, which means your ability to add more memory is limited to the hardware you have. The result is that vertical scaling ultimately limits your company’s data storage and retrieval.
In comparison, NoSQL databases are non-relational, which eliminates the need for connecting tables.Their built-in sharding and high availability capabilities ease horizontal scaling. If a single database server is not enough to store all your data or handle all the queries, the workload can be divided across two or more servers, allowing companies to scale their data horizontally.
While each type of database has its own advantages, companies commonly utilize both NoSQL and relational databases in a single application. Today’s cloud providers can support SQL or NoSQL databases. Which database you choose depends on your goals.
For a deeper dive into the differences between the two options, see "SQL vs. NoSQL Databases: What's the Difference?"
NoSQL provides other options for organizing data in many ways. By offering diverse data structures, NoSQL can be applied to data analytics, managing big data, social networks, and mobile app development.
A NoSQL database manages information using any of these primary data models:
This is typically considered the simplest form of NoSQL databases. This schema-less data model is organized into a dictionary of key-value pairs, where each item has a key and a value. The key could be like something similar found in a SQL database, like a shopping cart ID, while the value is an array of data, like each individual item in that user’s shopping cart. It’s commonly used for caching and storing user session information, such as shopping carts. However, it's not ideal when you need to pull multiple records at a time. Redis and Memcached are examples of an open-source key-value databases.
As suggested by the name, document databases store data as documents. They can be helpful in managing semi-structured data, and data are typically stored in JSON, XML, or BSON formats. This keeps the data together when it is used in applications, reducing the amount of translation needed to use the data. Developers also gain more flexibility since data schemas do not need to match across documents (e.g. name vs. first_name). However, this can be problematic for complex transactions, leading to data corruption. Popular use cases of document databases include content management systems and user profiles. An example of a document-oriented database is MongoDB, the database component of the MEAN stack.
Want to know more about MongoBD? Check out the IBM tutorial on getting started with using IBM Cloud Databases for MongoDB.
These databases store information in columns, enabling users to access only the specific columns they need without allocating additional memory on irrelevant data. This database tries to solve for the shortcomings of key-value and document stores, but since it can be a more complex system to manage, it is not recommended for use for newer teams and projects. Apache HBase and Apache Cassandra are examples of open-source, wide-column databases. Apache HBase is built on top of Hadoop Distributed Files System that provides a way of storing sparse data sets, which is commonly used in many big data applications. Apache Cassandra, on the other hand, has been designed to manage large amounts of data across multiple servers and clustering that spans multiple data centers. It’s been used for a variety of use cases, such as social networking websites and real-time data analytics.
This type of database typically houses data from a knowledge graph. Data elements are stored as nodes, edges and properties. Any object, place, or person can be a node. An edge defines the relationship between the nodes. For example, a node could be a client, like IBM, and an agency like, Ogilvy. An edge would be categorize the relationship as a customer relationship between IBM and Ogilvy.
Graph databases are used for storing and managing a network of connections between elements within the graph. Neo4j (link resides outside IBM), a graph-based database service based on Java with an open-source community edition where users can purchase licenses for online backup and high availability extensions, or pre-package licensed version with backup and extensions included.
With this type of database, like IBM solidDB, data resides in the main memory rather than on disk, making data access faster than with conventional, disk-based databases.
Many companies have entered the NoSQL landscape. In addition to those mentioned above, here are some popular NoSQL databases:
To learn more about the state of databases, see “A Brief Overview of the Database Landscape.”
Each type of NoSQL database has strengths that make it better for specific use cases. However, they all share the following advantages for developers and create the framework to provide better service customers, including:
In a nutshell, NoSQL databases provide high performance, availability, and scalability.
The structure and type of NoSQL database you choose will depend on how your organization plans to use it. Here are some specific uses for various types of NoSQL databases.
The need for large companies to provide services without latency and to scale more quickly has spurred growth for microservices, which has led companies to examine what type of database to use for different applications.
Companies have found that using a single, relational database for every component of an application has its limitations, especially when better alternatives exist for specific components. Microservices are an attractive option, in part, because they eliminate the need for a single, shared data store for an entire application. Instead, the application has many, loosely coupled and independently deployable services, each with their own data model and database, and integrated via API gateways or an iPaaS.
The pattern of using multiple databases within a single application, also known as polyglot persistence, has helped to create space in the market for NoSQL databases to thrive. Today, developers can leverage the right database for the right microservice without trying to make everything work in the context of a single, relational database.
The data layer for hyperscale, resilient, globally available applications, based on open source Apache CouchDB
Today, many applications are delivered as services, and those services must be available 24/7, accessible from a wide range of devices, and scaled to what can potentially be millions of users. IBM Cloudant is a scalable JSON document database optimized for web, mobile, IoT, and serverless applications. The service is compatible with an open source ecosystem that includes Apache CouchDB, PouchDB, and libraries for the most popular web and mobile development stacks.