Db2 Graph

Built on the Apache TinkerPop graph analytics framework, Db2 Graph transforms and optimizes Gremlin queries for analyzing data in your Db2 Warehouse database. Using a graph overlay file that defines each row in a table as either a vertex or an edge, Db2 Graph exposes your Db2 Warehouse data to graph queries natively, without third-party software.

Important: Db2 Graph is not FIPS 140-2 compliant. Beginning with Cloud Pak for Data 4.6.0, if Db2 Graph is deployed in a FIPS 140-2 environment, it will operate in a mode that disables FIPS enforcement for the execution of Db2 Graph processes.

To meet the challenge of analyzing the rapidly growing graph and network data created by modern applications, many different graph database implementations are emerging. They mainly target low-latency graph queries, such as finding the neighbors of a vertex with certain properties or retrieving the shortest path between two vertices.

Although many graph databases handle graph-only queries well, they fall short for real-life applications that involve graph analysis. Graph queries are rarely the whole of an analytics workload. They are often only one part of an integrated, heterogeneous analytics pipeline, which can include SQL, machine learning, graph, and other analytics. Graph queries therefore need to work alongside the other analytics in that pipeline.

Unfortunately, most existing graph databases are stand-alone and cannot easily integrate with other analytics systems:
  • Customers need to create and maintain data transformation, export, and loading jobs
  • The time to export and load data is time that could be spent analyzing the data to gain insights
  • Access control and auditing become problematic when there are two copies of the data
  • Custom views of graphs require complex logic to create and to maintain as the underlying data changes

Db2 Graph solves these challenges by letting you run Gremlin queries on your existing relational data to perform graph analytics, without requiring any changes to the underlying database structure.

How Db2 Graph works

Db2 Graph is written with the Apache TinkerPop graph analytics framework. It transforms Gremlin queries into optimized SQL statements, which are sent over a JDBC connection and processed efficiently in Db2 Warehouse. It works by creating a virtual graph model that defines each row in a table as either a vertex or an edge.

This means that data already stored in Db2 Warehouse can be exposed for graph queries without:

  • exporting the data
  • transforming the data
  • loading the data into a separate graph analytics application
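The idea of treating rows as vertices and edges, and answering a graph traversal with SQL, can be sketched as follows. This is only an illustration under stated assumptions: SQLite stands in for Db2 Warehouse, the `OVERLAY` dictionary is a hypothetical mapping and not the Db2 Graph overlay file format, and `out_values_sql` is a hypothetical helper that is far simpler than the translation Db2 Graph actually performs.

```python
import sqlite3

# Hypothetical overlay: says which table rows are vertices and which are edges.
# This is NOT the Db2 Graph overlay file format, only an illustration.
OVERLAY = {
    "vertices": {"person": {"table": "person", "id": "id"}},
    "edges": {
        "knows": {
            "table": "knows", "src": "src_id", "dst": "dst_id",
            "src_label": "person", "dst_label": "person",
        }
    },
}


def out_values_sql(overlay, vertex_label, key, edge_label, value):
    """Build SQL roughly equivalent to the Gremlin traversal
    g.V().has(vertex_label, key, ?).out(edge_label).values(value)."""
    v = overlay["vertices"][vertex_label]
    e = overlay["edges"][edge_label]
    d = overlay["vertices"][e["dst_label"]]
    # Names come from the trusted overlay; only the filter value is a parameter.
    return (
        f"SELECT d.{value} FROM {v['table']} s "
        f"JOIN {e['table']} e ON e.{e['src']} = s.{v['id']} "
        f"JOIN {d['table']} d ON d.{d['id']} = e.{e['dst']} "
        f"WHERE s.{key} = ? ORDER BY d.{value}"
    )


def demo():
    # In-memory SQLite database standing in for existing relational data.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE knows (src_id INTEGER, dst_id INTEGER);
        INSERT INTO person VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
        INSERT INTO knows VALUES (1, 2), (1, 3), (2, 3);
    """)
    sql = out_values_sql(OVERLAY, "person", "name", "knows", "name")
    rows = conn.execute(sql, ("alice",)).fetchall()
    conn.close()
    return [r[0] for r in rows]
```

Calling `demo()` returns `['bob', 'carol']`, the names of the people that `alice` knows. The relational tables are queried in place, with no export, transformation, or load step.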

Because the graph queries run on the data stored in Db2 Warehouse, any new or updated data is available to graph queries immediately, which allows graph analytics to be performed on transactional data in real time. The vertex and edge definitions are not limited to tables. Custom views in Db2 Warehouse can also be used to define vertices or edges. This allows a graph to be customized instantly, with no need to maintain complex logic that creates or modifies edges to produce different graph views.
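As a rough illustration of using a view as an edge definition, the sketch below again uses SQLite as a stand-in for Db2 Warehouse. The `orders` table, the `bulk_purchase` view, and the `edges` helper are all hypothetical names chosen for this example. It shows that a row inserted into the base table appears as a new edge on the next query, with no reload.

```python
import sqlite3


def edges(conn, edge_source):
    """Return (customer, product) pairs from a table or view used as an edge source."""
    # edge_source is a trusted, hard-coded name in this sketch.
    return conn.execute(
        f"SELECT customer_id, product_id FROM {edge_source} "
        "ORDER BY customer_id, product_id"
    ).fetchall()


def demo_view_edges():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (customer_id INTEGER, product_id INTEGER, qty INTEGER);
        INSERT INTO orders VALUES (1, 100, 2), (2, 101, 12);
        -- The view defines a custom edge set: only orders of 10 or more units.
        CREATE VIEW bulk_purchase AS
            SELECT customer_id, product_id FROM orders WHERE qty >= 10;
    """)
    before = edges(conn, "bulk_purchase")
    # A new transactional row is visible through the view immediately.
    conn.execute("INSERT INTO orders VALUES (3, 100, 25)")
    after = edges(conn, "bulk_purchase")
    conn.close()
    return before, after
```

Here `demo_view_edges()` returns `([(2, 101)], [(2, 101), (3, 100)])`. Changing the view definition changes the edge set, while the base table stays untouched.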

Security and audit processes also become much easier with Db2 Graph. Because the data stays in Db2 Warehouse and does not need to be exported or transformed, existing practices for data security and auditing can remain in place. Db2 Graph is treated as just another client accessing the data.

Db2 Graph is preconfigured with the Gremlin console and Gremlin server.

Note: For full details about using the Db2 Graph service, see IBM Db2 Graph in the Db2 documentation.