IBM Db2 Graph

Using the Apache TinkerPop graph analytics framework, IBM® Db2® Graph transforms and optimizes Gremlin queries for analyzing data in your Db2 database. With a graph overlay file that defines each row in a table as either a vertex or an edge, Db2 Graph exposes your Db2 data to graph queries natively, without third-party software.

To meet the challenge of analyzing the rapidly growing graph and network data created by modern applications, many different graph database implementations are emerging. They mainly target low-latency graph queries, such as finding the neighbors of a vertex with certain properties or retrieving the shortest path between two vertices.

Although many graph databases handle graph-only queries well, they fall short for real-life applications that involve graph analysis. Graph queries are rarely the whole of an analytics workload. They are often only one part of an integrated, heterogeneous analytics pipeline, which can include SQL, machine learning, graph, and other analytics. Graph queries therefore need to work together with those other analytics.

Unfortunately, most existing graph databases are stand-alone and cannot easily integrate with other analytics systems:
  • Customers need to create and maintain data transformation, export, and loading jobs
  • The time to export and load data is time that could be spent analyzing the data to gain insights
  • Access control and auditing become problematic when there are two copies of the data
  • Custom views of graphs require complex logic to create and to maintain as the underlying data changes
Db2 Graph solves these challenges by letting you run Gremlin queries on your existing relational data structure to perform graph analytics without requiring any changes to the underlying database structure.

How Db2 Graph works

Db2 Graph is written with the Apache TinkerPop graph analytics framework. It transforms Gremlin queries into optimized SQL statements, which are sent over a JDBC connection and processed efficiently in Db2. It works by creating a virtual graph model that defines each row in a table as either a vertex or an edge.
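
To make the idea of rows acting as vertices and edges concrete, the following is a minimal sketch only. It uses Python's built-in sqlite3 module as a stand-in for Db2, and the person and knows tables, their columns, and the out_neighbors helper are all hypothetical. The SQL shown is a hand-written illustration of the kind of join a neighbor traversal corresponds to, not the SQL that Db2 Graph actually generates.

```python
import sqlite3

# SQLite stands in for Db2 here; the schema below is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE knows (src INTEGER, dst INTEGER);
    INSERT INTO person VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO knows VALUES (1, 2), (1, 3), (2, 3);
""")
# Each row of person plays the role of a vertex.
# Each row of knows plays the role of an edge from src to dst.

# A Gremlin traversal such as
#   g.V().has('person', 'name', 'Alice').out('knows').values('name')
# asks for the names of the vertices that Alice's outgoing edges point to.
# Over the tables above, that corresponds conceptually to this join.
NEIGHBORS_SQL = """
    SELECT dst.name
    FROM person AS src
    JOIN knows AS e ON e.src = src.id
    JOIN person AS dst ON dst.id = e.dst
    WHERE src.name = ?
    ORDER BY dst.name
"""

def out_neighbors(connection, name):
    """Return the names reachable over one outgoing 'knows' edge."""
    return [row[0] for row in connection.execute(NEIGHBORS_SQL, (name,))]

print(out_neighbors(conn, "Alice"))  # ['Bob', 'Carol']
```

The point of the sketch is that no data leaves the relational tables: the traversal is answered by ordinary SQL over the rows that are already there.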

This means that data already stored in Db2 can be exposed for graph queries without:
  • exporting the data
  • transforming the data
  • loading the data into a separate graph analytics application
Because the graph queries run on the data stored in Db2, any new or updated data is available to graph queries immediately, so graph analytics can be performed on transactional data in real time. The vertex and edge definitions are not limited to tables. Custom views in Db2 can also be used to define vertices or edges. This allows a graph to be customized instantly, with no need to maintain complex logic that creates or modifies edges to produce different graph views.
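
The two points above, views acting as edge definitions and new rows being visible at once, can be sketched as follows. This is an illustration under stated assumptions: sqlite3 again stands in for Db2, and the account and transfer tables, the large_transfer view, and the large_transfer_targets helper are hypothetical names invented for the example.

```python
import sqlite3

# SQLite stands in for Db2 here; all table and view names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT);
    CREATE TABLE transfer (from_id INTEGER, to_id INTEGER, amount REAL);
    CREATE VIEW large_transfer AS
        SELECT from_id AS src, to_id AS dst, amount
        FROM transfer
        WHERE amount >= 1000;
    INSERT INTO account VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO transfer VALUES (1, 2, 50.0), (1, 3, 2500.0);
""")
# The view selects only transfers of 1000 or more, so treating its rows
# as edges yields a filtered graph without copying or rewriting any data.

def large_transfer_targets(connection, owner):
    """Owners reached from `owner` over one edge of the large_transfer view."""
    sql = """
        SELECT dst.owner
        FROM account AS src
        JOIN large_transfer AS e ON e.src = src.id
        JOIN account AS dst ON dst.id = e.dst
        WHERE src.owner = ?
        ORDER BY dst.owner
    """
    return [row[0] for row in connection.execute(sql, (owner,))]

before = large_transfer_targets(conn, "Alice")

# A new transactional row shows up in the edge view with no export or reload.
conn.execute("INSERT INTO transfer VALUES (1, 2, 4000.0)")
after = large_transfer_targets(conn, "Alice")

print(before)  # ['Carol']
print(after)   # ['Bob', 'Carol']
```

Changing the graph view here means changing only the view definition, which is the property the paragraph above describes.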

A number of security and audit processes also become much easier with Db2 Graph. Because the data stays in Db2 and does not need to be exported or transformed, existing practices for data security and auditing can remain in place. To Db2, Db2 Graph is simply another client accessing the data.

Db2 Graph is preconfigured with the Gremlin server. The Gremlin console is also preconfigured, but only on x86 platforms.

Further reading

For an introduction to Apache TinkerPop and Gremlin, the graph analytics framework that Db2 Graph is built upon, see these pages:

For an in-depth look at Gremlin and Apache TinkerPop, see the guide Practical Gremlin: An Apache TinkerPop Tutorial.

The Db2 Graph demo

For a visual walk-through of Db2 Graph, see the IBM Db2 Graph demo.