Graph databases – a natural way to represent data

Among the types of NoSQL databases, graph databases are increasingly popular because of their unique approach to data storage and retrieval. Rather than storing records in isolation and relying on ad hoc queries to find the relationships between them, as traditional relational databases do, a graph database stores data together with its relationships. In the simplest terms, a graph database is a database management system in which the relationships between data points have the highest priority.

Alaa Mahmoud (@alaa_mahmoud), Master Inventor at IBM, is a lead developer of IBM’s graph database-as-a-service, IBM Graph, built using Apache TinkerPop and now available in a free beta trial. In Episode 12 of the New Builders podcast, “Of Graphs and Gremlins – Graph Database 101,” Mahmoud discusses the process behind building IBM Graph, the industry’s first enterprise-grade distributed property graph offered as a fully managed cloud service.

“Graph databases are special in the way that they store information,” he explains. “When you look at NoSQL, even traditional RDBMS systems, they store data as tables, documents, columns, rows. If you think about it, it’s not really the natural way of representing data.”

Mahmoud further explains that when you think of your relationships to the many people you have in your life, those relationships aren’t represented by a table. “You think of them as a graph, as people having relationships with others,” he says. In a graph database, data is stored in nodes (also called vertices), which are connected to each other by relationships called edges.
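To make the node-and-edge model concrete, here is a minimal in-memory sketch in Python. It is not how IBM Graph or any particular product stores data; the `Vertex`, `Edge`, and `PropertyGraph` names, the people, and the `knows` and `works_with` labels are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    id: str
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    label: str
    out_v: str  # id of the vertex the edge starts from
    in_v: str   # id of the vertex the edge points to
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical toy graph: vertices and edges both carry properties."""

    def __init__(self):
        self.vertices = {}   # vertex id -> Vertex
        self.out_edges = {}  # vertex id -> list of outgoing Edge objects

    def add_vertex(self, vertex_id, label, **properties):
        self.vertices[vertex_id] = Vertex(vertex_id, label, properties)
        self.out_edges.setdefault(vertex_id, [])
        return self.vertices[vertex_id]

    def add_edge(self, out_v, label, in_v, **properties):
        if out_v not in self.vertices or in_v not in self.vertices:
            raise KeyError("both endpoints must exist before adding an edge")
        edge = Edge(label, out_v, in_v, properties)
        self.out_edges[out_v].append(edge)
        return edge

    def neighbors(self, vertex_id, label=None):
        """Vertices reached by outgoing edges, optionally filtered by edge label."""
        return [
            self.vertices[e.in_v]
            for e in self.out_edges[vertex_id]
            if label is None or e.label == label
        ]


graph = PropertyGraph()
graph.add_vertex("alice", "person", name="Alice")
graph.add_vertex("bob", "person", name="Bob")
graph.add_vertex("carol", "person", name="Carol")
graph.add_edge("alice", "knows", "bob", since=2012)
graph.add_edge("alice", "works_with", "carol")

# Following a relationship is a lookup on the vertex, not a join across tables.
friends = [v.properties["name"] for v in graph.neighbors("alice", "knows")]
```

Because each vertex keeps its own outgoing edges, asking who Alice knows only touches Alice's edge list, which is the sense in which the data is stored with its relationships.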

“Graph is a natural way of representing data… ‘Give me all the nodes with this property and these relationships’ is what you’re after.” (Alaa Mahmoud, IBM)

“Graph is a natural way of representing data,” says Mahmoud. “Instead of making selects and joins and this kind of unnatural way of looking for stuff, what you’re really looking for is ‘Give me all the nodes that have this particular property that have relationships with other nodes.’ So you’re naturally querying the data, and you’re naturally storing it.”
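The kind of query Mahmoud describes can be sketched in a few lines of plain Python. This is only an illustration under assumed data: the vertices, the `city` property, the `knows` label, and the `vertices_with` helper are hypothetical, and a real graph database would express the same idea in its own query language (Gremlin, in the case of TinkerPop) rather than in application code.

```python
# Hypothetical sample data: vertex id -> properties, plus (source, label, target) edges.
vertices = {
    "alice": {"label": "person", "city": "Cairo"},
    "bob": {"label": "person", "city": "Austin"},
    "carol": {"label": "person", "city": "Cairo"},
    "dave": {"label": "person", "city": "Cairo"},
}
edges = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "knows", "alice"),
]


def vertices_with(key, value, edge_label):
    """Ids of vertices whose property `key` equals `value` and that have
    at least one outgoing edge labelled `edge_label`."""
    sources = {src for src, label, _ in edges if label == edge_label}
    return sorted(
        vid
        for vid, props in vertices.items()
        if props.get(key) == value and vid in sources
    )


# "Give me all the nodes in Cairo that have a 'knows' relationship."
result = vertices_with("city", "Cairo", "knows")
```

Here `dave` matches on the property but is left out because he has no `knows` relationship, so the property filter and the relationship filter are both part of one question.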

Graph databases in the real world

Graph databases are most commonly used to power recommendation engines (the kind used by e-commerce sites and streaming services, for example), a perfect use case for a database that sees the natural relationships between people and things.
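As a rough illustration of why recommendations map well onto a graph, the sketch below walks from a shopper to the items they bought, on to other shoppers who bought the same items, and then to what those shoppers bought. The purchase data and the `recommend` function are made up for this example, and production engines use far richer signals than a simple co-purchase count.

```python
from collections import Counter

# Hypothetical "bought" edges: shopper -> set of items.
purchases = {
    "ann": {"book", "lamp"},
    "ben": {"book", "desk"},
    "cat": {"book", "desk", "pen"},
    "dan": {"lamp"},
}


def recommend(user, top_n=3):
    """Rank items bought by shoppers who share at least one purchase with `user`."""
    owned = purchases[user]
    scores = Counter()
    for other, items in purchases.items():
        if other == user or not (owned & items):
            continue  # skip the user and anyone with no purchase in common
        for item in items - owned:
            scores[item] += 1
    # Highest count first; ties are broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked][:top_n]


recs = recommend("ann")
```

Each step in the walk follows an edge that already exists in the data, so the recommendation comes from traversing relationships rather than reconstructing them at query time.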

Furthermore, as Mahmoud points out, graph databases are relevant to many industries’ use cases, including data modeling, pattern recognition, and network topology: “Network topology is one of the very famous, very common uses of graph databases,” he says. “I have printers and I have machines; I have laptops and I have servers in different data centers — and they connect, and they have a network that connects all these pieces together, and IoT-coupled devices. It’s a graph of different devices that are connected that I want to model: [it’s] network modeling.”
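A network like the one Mahmoud describes can be modeled as devices (vertices) and links (edges), and a common question is which devices can reach which. The following sketch answers it with a breadth-first search; the device names and the `build_adjacency` and `reachable` helpers are hypothetical and only meant to show the shape of the problem.

```python
from collections import deque

# Hypothetical physical links between devices; each link works in both directions.
links = [
    ("laptop-1", "switch-a"),
    ("printer-1", "switch-a"),
    ("switch-a", "router-1"),
    ("router-1", "server-1"),
    ("sensor-1", "gateway-1"),  # an IoT segment not wired to the rest
]


def build_adjacency(links):
    """Turn a list of undirected links into device -> set of neighbours."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def reachable(adjacency, start):
    """Breadth-first search: every device connected to `start`, including itself."""
    seen = {start}
    queue = deque([start])
    while queue:
        device = queue.popleft()
        for neighbour in adjacency.get(device, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


adjacency = build_adjacency(links)
connected = reachable(adjacency, "laptop-1")
```

In this toy topology the laptop can reach the server through the switch and router, while the sensor and its gateway form a separate island, which is the kind of fact a topology model is meant to surface.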

Because graph databases can detect patterns in real time, they are also well suited to fraud detection. In a recent blog post, Larry Weber, IBM Analytics Program Director, detailed how graph databases can more effectively detect fraud as it is happening. For example, traditional databases can detect fraud if a charge is unusually large or falls outside the buyer’s normal habits. But thieves have adjusted their patterns to make several small purchases that go unnoticed by traditional fraud detection systems.

“In a graph database, however, data and connections are stored together,” writes Weber. “Accordingly, such databases store not only data points but also data relationships and properties, allowing transactional applications to be imbued with real-time analytics functions…. A graph database is adept at detecting exactly this sort of nuanced transaction.”
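One way to picture the difference is to compare a per-transaction threshold with a check on how transactions are connected. The sketch below is an assumption-heavy illustration, not Weber's method: the transactions, the idea of linking cards through a shared device, and the `LARGE_CHARGE` and `MIN_CARDS` values are all hypothetical.

```python
from collections import defaultdict

LARGE_CHARGE = 500.0  # hypothetical per-transaction alert threshold
MIN_CARDS = 3         # hypothetical: this many distinct cards on one device looks odd

# Hypothetical edges: (card, device used for the purchase, amount).
transactions = [
    ("card-1", "device-x", 19.99),
    ("card-2", "device-x", 24.50),
    ("card-3", "device-x", 12.00),
    ("card-4", "device-y", 80.00),
    ("card-4", "device-y", 35.00),
]


def flag_large(transactions):
    """Threshold rule: looks at each charge in isolation."""
    return [t for t in transactions if t[2] >= LARGE_CHARGE]


def flag_shared_devices(transactions):
    """Relationship rule: devices connected to many distinct cards."""
    cards_by_device = defaultdict(set)
    for card, device, _amount in transactions:
        cards_by_device[device].add(card)
    return sorted(
        device
        for device, cards in cards_by_device.items()
        if len(cards) >= MIN_CARDS
    )


large = flag_large(transactions)
suspicious = flag_shared_devices(transactions)
```

None of the small charges trips the threshold rule, but the relationship rule singles out the one device tied to three different cards, because the signal lies in the connections rather than in any single amount.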

The rapid rise of graph databases

Though relatively new, graph databases continue to gain traction with developers. Ben Kepes writes in Forbes that “much of this demand is driven by developers’ demand for a database that isn’t simply about data storage and retrieval. Being able to derive business value from related data, within the context of the database itself, means organizations can more rapidly react to customer actions.” Kepes also notes that Forrester predicted graph databases would be used by 25 percent of enterprises by 2017.

As the inventor of Neo4j, the first operational property graph database, Emil Eifrem attributes the rise of these databases to three factors. Graph databases are able to:

  • Analyze the world around us
  • Model and allow for significantly faster development than other database options
  • Get real-time information for analysis

For queries that follow relationships across connected data, graph databases can be faster and provide richer analytics than traditional databases, and they are becoming more widely adopted in the enterprise as a result. Mahmoud feels so strongly about the power and potential of graph databases that he was hard-pressed to offer an example where they might not be the appropriate database choice. Graph database systems are, as he says, the “natural way of presenting data.”

If you are simply looking to get back basic information from a query, like an employee number — if you’re never going to need analytics — then maybe an RDBMS would be a better fit. But, Mahmoud warns, “If your data doesn’t have relationships, your data is pretty much useless, and you want to think more about what you have and what you’re really getting out of the data that you’re storing.”

Ready to get started?

Begin today with the free beta trial of IBM Graph on Bluemix:

Listen to more episodes of the New Builders Podcast:
