Graph databases – a natural way to represent data

Among the types of NoSQL databases, graph databases are increasingly popular because of their approach to data storage and retrieval. Instead of storing data by itself and requiring ad hoc queries to find relationships, as traditional relational databases do, a graph database stores data together with its relationships. In the simplest terms, a graph database is a database management system in which the relationships between data points are given the highest priority.

Alaa Mahmoud (@alaa_mahmoud), Master Inventor at IBM, is a lead developer of IBM’s graph database-as-a-service, IBM Graph, built using Apache TinkerPop and now available in a free beta trial. In Episode 12 of the New Builders podcast, “Of Graphs and Gremlins – Graph Database 101,” Mahmoud discusses the process behind building IBM Graph, the industry’s first enterprise-grade distributed property graph offered as a fully managed cloud service.

“Graph databases are special in the way that they store information,” he explains. “When you look at NoSQL, even traditional RDBMS systems, they store data as tables, documents, columns, rows. If you think about it, it’s not really the natural way of representing data.”

Mahmoud further explains that when you think of your relationships to the many people you have in your life, those relationships aren’t represented by a table. “You think of them as a graph, as people having relationships with others,” he says. In a graph database, data is stored in nodes (also called vertices), and the relationships that connect those nodes to each other are called edges.
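To make the vocabulary concrete, here is a minimal sketch of vertices and edges in plain Python. It is an illustration only, not how IBM Graph or any other product stores data; the `Vertex` and `Edge` classes, the `neighbors` helper, and the sample people are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    """A node in the graph, carrying arbitrary key/value properties."""
    id: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A directed, labeled relationship from one vertex to another."""
    source: str
    label: str
    target: str
    properties: dict = field(default_factory=dict)


# Hypothetical sample data: three people and how they relate.
vertices = {
    "alice": Vertex("alice", {"name": "Alice", "city": "Austin"}),
    "bob": Vertex("bob", {"name": "Bob", "city": "Cairo"}),
    "carol": Vertex("carol", {"name": "Carol", "city": "Austin"}),
}
edges = [
    Edge("alice", "friend_of", "bob", {"since": 2012}),
    Edge("alice", "works_with", "carol"),
    Edge("bob", "friend_of", "carol"),
]


def neighbors(vertex_id, label=None):
    """Return ids of vertices one outgoing edge away, optionally by label."""
    return [
        e.target
        for e in edges
        if e.source == vertex_id and (label is None or e.label == label)
    ]


print(neighbors("alice"))               # ['bob', 'carol']
print(neighbors("alice", "friend_of"))  # ['bob']
```

Note that the relationship itself (`friend_of`, with its own `since` property) is stored as data, rather than being reconstructed at query time.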

“Graph is a natural way of representing data,” says Mahmoud. “Instead of making selects and joins and this kind of unnatural way of looking for stuff, what you’re really looking for is ‘Give me all the nodes that have this particular property that have relationships with other nodes.’ So you’re naturally querying the data, and you’re naturally storing it.”
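The kind of question Mahmoud describes can be sketched in a few lines of plain Python. This is a toy in-memory example under assumed data, not a real graph database query language; the `vertices_with` function, the `knows` label, and the sample records are hypothetical.

```python
# Vertices keyed by id, each with a property map (hypothetical sample data).
vertices = {
    "alice": {"type": "person", "city": "Austin"},
    "bob": {"type": "person", "city": "Cairo"},
    "carol": {"type": "person", "city": "Austin"},
    "dave": {"type": "person", "city": "Austin"},
}

# Outgoing edges are stored with the vertex they start from: (label, target).
out_edges = {
    "alice": [("knows", "bob"), ("knows", "carol")],
    "bob": [("knows", "carol")],
    "carol": [],
    "dave": [],
}


def vertices_with(key, value, edge_label):
    """Ids of vertices whose property `key` equals `value` and that have
    at least one outgoing edge labeled `edge_label`."""
    return [
        vid
        for vid, props in vertices.items()
        if props.get(key) == value
        and any(label == edge_label for label, _ in out_edges.get(vid, []))
    ]


# "Give me all the nodes in Austin that know someone."
print(vertices_with("city", "Austin", "knows"))  # ['alice']
```

Because each vertex already holds its edges, the question is answered by following them directly; there is no separate table to join against.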

Graph databases in the real world

Graph databases are most commonly used to power recommendation engines (the kind used by e-commerce sites and streaming services, for example), a perfect use case for a database that sees the natural relationships between people and things.
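As a rough illustration of why recommendations map well onto a graph, the sketch below walks from a customer to the items they bought, on to other customers who bought the same items, and then to what those customers bought. The data, the `recommend` function, and the scoring rule (count of overlapping shoppers) are assumptions made for this example, not a description of any production engine.

```python
from collections import Counter

# Hypothetical "bought" edges: customer -> set of items.
purchases = {
    "ann": {"laptop", "mouse", "headphones"},
    "ben": {"laptop", "mouse"},
    "cat": {"laptop", "keyboard"},
    "dan": {"novel"},
    "eve": {"mouse", "headphones"},
}


def recommend(customer, limit=3):
    """Suggest items bought by customers who share a purchase with `customer`."""
    owned = purchases[customer]
    scores = Counter()
    for other, items in purchases.items():
        if other == customer or not (owned & items):
            continue  # skip self and customers with nothing in common
        for item in items - owned:
            scores[item] += 1  # one vote per overlapping shopper
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda item: (-scores[item], item))[:limit]


print(recommend("ben"))  # ['headphones', 'keyboard']
print(recommend("dan"))  # []
```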

Furthermore, as Mahmoud points out, graph databases are relevant to many industries’ use cases, including data modeling, pattern recognition, and network topology: “Network topology is one of the very famous, very common uses of graph databases,” he says. “I have printers and I have machines; I have laptops and I have servers in different data centers — and they connect, and they have a network that connects all these pieces together, and IoT-coupled devices. It’s a graph of different devices that are connected that I want to model: [it’s] network modeling.”
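The network example lends itself to a short sketch: devices are vertices, physical or logical links are edges, and "can this laptop reach that server?" becomes a path search. The device names and the `shortest_path` helper below are hypothetical, and a breadth-first search is just one simple way to answer the question.

```python
from collections import defaultdict, deque

# Hypothetical links between devices; each link works in both directions.
links = [
    ("laptop-1", "switch-a"),
    ("printer-1", "switch-a"),
    ("switch-a", "router-1"),
    ("router-1", "server-dc1"),
    ("sensor-7", "gateway-iot"),
]


def build_adjacency(links):
    """Turn a list of undirected links into an adjacency map."""
    adjacency = defaultdict(list)
    for a, b in links:
        adjacency[a].append(b)
        adjacency[b].append(a)
    return adjacency


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns the hop-by-hop path or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in adjacency.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


adjacency = build_adjacency(links)
print(shortest_path(adjacency, "laptop-1", "server-dc1"))
# ['laptop-1', 'switch-a', 'router-1', 'server-dc1']
print(shortest_path(adjacency, "sensor-7", "server-dc1"))  # None
```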

Because graph databases can detect patterns in real time, they are also well suited to fraud detection. In a recent blog post, Larry Weber, IBM Analytics Program Director, detailed how graph databases can more effectively detect fraud as it is happening. For example, traditional databases can detect fraud if a charge is unusually large or falls outside the buyer’s normal habits. But thieves have adjusted their patterns, making several small purchases that go unnoticed by traditional fraud detection systems.

“In a graph database, however, data and connections are stored together,” writes Weber. “Accordingly, such databases store not only data points but also data relationships and properties, allowing transactional applications to be imbued with real-time analytics functions…. A graph database is adept at detecting exactly this sort of nuanced transaction.”
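One way to picture the "several small purchases" pattern is to treat each purchase as an edge from a card to a merchant and look for a burst of small charges on the same card. The sketch below is a simplified illustration in plain Python; the thresholds (`small_amount`, `min_count`, `window`), the sample transactions, and the `suspicious_cards` function are all assumptions, not Weber's method or a real fraud rule.

```python
from collections import defaultdict

# Each edge is a purchase: (card, merchant, amount, minute of the day).
purchases = [
    ("card-1", "coffee", 4.50, 600),
    ("card-1", "grocer", 62.10, 1020),
    ("card-2", "games", 9.99, 700),
    ("card-2", "music", 8.99, 703),
    ("card-2", "apps", 7.49, 705),
    ("card-2", "video", 9.49, 709),
]


def suspicious_cards(purchases, small_amount=10.0, min_count=4, window=15):
    """Flag cards with at least `min_count` charges below `small_amount`
    that all fall within `window` minutes of each other."""
    small_charge_times = defaultdict(list)
    for card, _merchant, amount, minute in purchases:
        if amount < small_amount:
            small_charge_times[card].append(minute)

    flagged = []
    for card, minutes in small_charge_times.items():
        minutes.sort()
        # Slide across the sorted times, checking each run of min_count charges.
        for i in range(len(minutes) - min_count + 1):
            if minutes[i + min_count - 1] - minutes[i] <= window:
                flagged.append(card)
                break
    return flagged


print(suspicious_cards(purchases))  # ['card-2']
```

No single charge on `card-2` would stand out by size; it is the cluster of related edges that reveals the pattern.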

The rapid rise of graph databases

Though relatively new, graph databases continue to gain traction with developers. Ben Kepes writes in Forbes that “much of this demand is driven by developers’ demand for a database that isn’t simply about data storage and retrieval. Being able to derive business value from related data, within the context of the database itself, means organizations can more rapidly react to customer actions.” Kepes also notes that Forrester predicted graph databases would be used by 25 percent of enterprises by 2017.

As the inventor of Neo4j, the first operational property graph database, Emil Eifrem attributes the rise of these databases to three factors. Graph databases are able to:

  • Analyze the world around us
  • Model and allow for significantly faster development than other database options
  • Get real-time information for analysis

For relationship-heavy queries, graph databases can be faster and provide richer analytics than traditional databases, and as such they are becoming widely adopted in the enterprise. Mahmoud feels so strongly about the power and potential of graph databases that he was hard-pressed to offer an example where they might not be the appropriate database choice. Graph database systems are, as he says, the “natural way of representing data.”

If you are simply looking to get back basic information from a query, like an employee number — if you’re never going to need analytics — then maybe an RDBMS would be a better fit. But, Mahmoud warns, “If your data doesn’t have relationships, your data is pretty much useless, and you want to think more about what you have and what you’re really getting out of the data that you’re storing.”

Ready to get started?

Begin today with the free beta trial of IBM Graph on Bluemix.

Listen to more episodes of the New Builders Podcast.
