06 February, 2025
Graph retrieval augmented generation (Graph RAG) is emerging as a powerful technique for generative AI applications to use domain-specific knowledge and relevant information. Graph RAG is an alternative to vector search methods that use a vector database. Knowledge graphs are knowledge systems where graph databases such as Neo4j or Amazon Neptune represent structured data. In a knowledge graph, the relationships between data points, called edges, are as meaningful as the data points themselves, called vertices or sometimes nodes. A knowledge graph makes it easy to traverse a network and process complex queries about connected data. Knowledge graphs are especially well suited for use cases involving chatbots, identity resolution, network analysis, recommendation engines, customer 360 and fraud detection.
A Graph RAG approach leverages the structured nature of graph databases to give retrieved information about networks or complex relationships greater depth and context. When a graph database is paired with a large language model (LLM), a developer can automate significant parts of the graph creation process from unstructured data like text. An LLM can process text data, identify entities, understand their relationships and represent them in a graph structure.
There are many ways to create a Graph RAG application, for instance Microsoft's GraphRAG or pairing GPT-4 with LlamaIndex. For this tutorial, you'll use Memgraph, an open source graph database solution, to create a RAG system with Meta's Llama 3 on watsonx. Memgraph uses Cypher, a declarative query language that shares some similarities with SQL but focuses on nodes and relationships rather than tables and rows. You'll have Llama 3 both create and populate your graph database from unstructured text and query information in the database.
While you can choose from several tools, this tutorial walks you through how to set up an IBM account to use a Jupyter Notebook.
Log in to watsonx.ai™ using your IBM Cloud® account.
Create a watsonx.ai project.
You get your project ID from within your project. Click the Manage tab. Then, copy the project ID from the Details section of the General page. You need this Project ID for this tutorial.
Next, associate your project with the watsonx.ai Runtime:
a. Create a watsonx.ai Runtime service instance (choose the Lite plan, which is a free instance).
b. Generate an API Key in watsonx.ai Runtime. Save this API key for use in this tutorial.
c. Go to your project and select the Manage tab.
d. In the left tab, select Services and Integrations.
e. Select IBM services.
f. Select Associate service and pick watsonx.ai Runtime.
g. Associate the watsonx.ai Runtime with the project that you created in watsonx.ai.
Now you'll need to install Docker from https://www.docker.com/products/docker-desktop/.
Once you've installed Docker, install Memgraph using its Docker container. On macOS or Linux, you can use this command in a terminal:
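One option is Memgraph's quick-start script, which pulls and runs the Memgraph Platform image through Docker (check Memgraph's documentation in case the install URL has changed):

```sh
curl https://install.memgraph.com | sh
```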
On a Windows computer use:
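In PowerShell, again assuming Memgraph's current quick-start script:

```powershell
iwr https://windows.memgraph.com | iex
```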
Follow the installation steps to get the Memgraph engine and Memgraph Lab up and running.
On your computer, create a fresh virtualenv for this project:
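For example, using Python's built-in venv module (the environment name is arbitrary):

```sh
python -m venv graphrag-env
source graphrag-env/bin/activate  # on Windows: graphrag-env\Scripts\activate
```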
In the Python environment for your notebook, install the following Python libraries:
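The package set below is an assumption based on the imports used later in this tutorial (the neo4j driver provides the Bolt protocol that Memgraph speaks); pin versions as needed:

```sh
pip install langchain langchain-community langchain-experimental langchain-ibm neo4j
```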
Now you're ready to connect to Memgraph.
If you've configured Memgraph to use a username and password, set them here; otherwise, you can use the defaults of leaving both blank. That's not good practice for a production database, but for a local development environment that doesn't store sensitive data, it's not an issue.
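A minimal connection sketch, assuming Memgraph is listening on the default Bolt port 7687:

```python
from langchain_community.graphs import MemgraphGraph

url = "bolt://localhost:7687"
username = ""  # set these if you configured authentication
password = ""

graph = MemgraphGraph(url=url, username=username, password=password)
```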
Now create a sample string that describes a dataset of relationships that you can use to test the graph-generating capabilities of your LLM system. You could use more complex data sources, but this simple example helps demonstrate the algorithm.
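The passage below is a hypothetical example, with names and titles chosen to be consistent with the query results shown later in this tutorial; any short text describing people, titles and groups will work:

```python
source_text = """
John is the Director of the Digital Marketing Group. John collaborates
with Sarah, who is the Senior Content Strategist in the Digital Marketing
Group. John also collaborates with David, who is a Data Analyst in the
Analytics Group.
"""
```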
Enter the watsonx API key that you created in the first step:
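One way to read the credentials without hardcoding them in the notebook:

```python
import getpass

watsonx_api_key = getpass.getpass("watsonx.ai API key: ")
watsonx_project_id = input("watsonx.ai project ID: ")  # from the Manage tab of your project
```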
Now configure a WatsonxLLM instance to generate text. The temperature should be fairly low and the number of tokens high to encourage the model to generate as much detail as possible without hallucinating entities or relationships that aren't present.
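A sketch of the configuration; the model ID and region URL are assumptions that should be adjusted to match your watsonx.ai account:

```python
from langchain_ibm import WatsonxLLM

llm = WatsonxLLM(
    model_id="meta-llama/llama-3-70b-instruct",  # any Llama 3 instruct model works
    url="https://us-south.ml.cloud.ibm.com",     # use your region's endpoint
    apikey=watsonx_api_key,
    project_id=watsonx_project_id,
    params={
        "decoding_method": "sample",
        "temperature": 0.2,      # low temperature to limit hallucinated entities
        "max_new_tokens": 2000,  # high token budget so no detail is truncated
    },
)
```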
The LLMGraphTransformer allows you to set what kinds of nodes and relationships you'd like the LLM to generate. In your case, the text describes employees at a company, the groups they work in and their job titles. Restricting the LLM to just those entities makes it more likely that you'll get a good representation of the knowledge in a graph.
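The node and relationship labels below are assumptions that fit the sample text; restrict them to whatever your own data describes:

```python
from langchain_experimental.graph_transformers import LLMGraphTransformer

llm_transformer = LLMGraphTransformer(
    llm=llm,
    allowed_nodes=["Person", "Title", "Group"],
    allowed_relationships=["TITLE", "GROUP", "COLLABORATES"],
)
```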
The call to convert_to_graph_documents has the LLMGraphTransformer create a knowledge graph from the text. This step extracts the relevant entities and their relationships and represents them in a structure that can be inserted into the graph database as Cypher statements.
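Wrap the source text in a LangChain Document and run the conversion:

```python
from langchain_core.documents import Document

documents = [Document(page_content=source_text)]
graph_documents = llm_transformer.convert_to_graph_documents(documents)
```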
Now clear any old data out of the Memgraph database and insert the new nodes and edges.
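For example (DETACH DELETE removes every node along with its relationships):

```python
# Remove any nodes and relationships left over from previous runs
graph.query("MATCH (n) DETACH DELETE n")

# Insert the LLM-generated nodes and edges, then refresh the cached schema
graph.add_graph_documents(graph_documents)
graph.refresh_schema()
```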
The generated nodes and relationships are stored in the graph_documents objects. You can inspect them simply by printing them as strings.
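For example:

```python
print(graph_documents)
```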
The schema and data types created by the Cypher can be seen in the graph's get_schema property:
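```python
print(graph.get_schema)
```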
You can also explore the graph structure visually in the Memgraph Lab viewer.
The LLM has done a reasonable job of creating the correct nodes and relationships. Now it's time to query the knowledge graph.
Prompting the LLM correctly requires some prompt engineering. LangChain provides a FewShotPromptTemplate that can be used to give the LLM examples in the prompt to ensure that it writes correct and succinct Cypher syntax. The following code gives several examples of questions along with the queries the LLM should generate for them. An overly chatty LLM might add extra information that would lead to invalid Cypher queries, so the prompt template also asks the model to output only the query itself.
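A sketch of the few-shot setup; the example questions and queries are hypothetical and should be adapted to your schema:

```python
from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate

examples = [
    {
        "question": "What is Johns title?",
        "query": "MATCH (p:Person {id: 'John'})-[:TITLE]->(t:Title) RETURN t.id",
    },
    {
        "question": "Who is in the Digital Marketing Group?",
        "query": "MATCH (p:Person)-[:GROUP]->(g:Group {id: 'Digital Marketing Group'}) RETURN p.id",
    },
]

example_prompt = PromptTemplate.from_template(
    "Question: {question}\nCypher query: {query}"
)
```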
Adding an instructive prefix also helps to constrain the model behavior and makes it more likely that the LLM will output correct Cypher syntax.
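For instance:

```python
cypher_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix=(
        "You are an expert in the Cypher query language. Given a graph "
        "schema and a question, output a single Cypher query that answers "
        "the question. Output only the query, with no explanation or "
        "extra text.\n\nSchema:\n{schema}\n"
    ),
    suffix="Question: {question}\nCypher query:",
    input_variables=["schema", "question"],
)
```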
Next, you'll create a prompt to control how the LLM answers the question with the information returned from Memgraph. You'll give the LLM examples and instructions on how to respond once it has context information back from the graph database.
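A simplified sketch, using the {context} and {question} variables that the chain fills in (the full tutorial prompt gives several examples; a single instruction block keeps this code short):

```python
qa_prompt = PromptTemplate(
    input_variables=["context", "question"],
    template=(
        "You are an assistant that answers questions using only the "
        "provided context from a graph database. Answer concisely and do "
        "not add information that is not in the context.\n\n"
        "Context: {context}\n"
        "Question: {question}\n"
        "Answer:"
    ),
)
```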
Now it's time to create the question-answering chain. The MemgraphQAChain allows you to set which LLM you'd like to use, the graph schema to be used and whether to print debugging information. Using a temperature of 0 and a length penalty encourages the LLM to keep the generated Cypher query short and straightforward.
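A sketch, assuming the MemgraphQAChain interface shipped in recent langchain-community releases; check the current signature if the import fails:

```python
from langchain_community.chains.graph_qa.memgraph import MemgraphQAChain

# A second WatsonxLLM instance tuned for query generation: greedy decoding
# is deterministic (equivalent to temperature 0), and the length penalty
# discourages long, rambling queries.
cypher_llm = WatsonxLLM(
    model_id="meta-llama/llama-3-70b-instruct",
    url="https://us-south.ml.cloud.ibm.com",
    apikey=watsonx_api_key,
    project_id=watsonx_project_id,
    params={
        "decoding_method": "greedy",
        "max_new_tokens": 200,
        "length_penalty": {"decay_factor": 1.1},
    },
)

chain = MemgraphQAChain.from_llm(
    cypher_llm,
    graph=graph,                    # supplies the schema
    cypher_prompt=cypher_prompt,
    qa_prompt=qa_prompt,
    verbose=True,                   # print the generated Cypher for debugging
    return_intermediate_steps=True,
    allow_dangerous_requests=True,  # acknowledges the chain runs LLM-written queries
)
```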
Now you can invoke the chain with a natural language question (note that your responses might be slightly different because LLMs are not purely deterministic).
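For example:

```python
response = chain.invoke({"query": "What is Johns title?"})
print(response)
```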
This will output:
```
> Entering new MemgraphQAChain chain...
Generated Cypher:
MATCH (p:Person {id: 'John'})-[:TITLE]->(t:Title) RETURN t.id
Full Context:
[{'t.id': 'Director of the Digital Marketing Group'}]

> Finished chain.
{'query': 'What is Johns title?', 'result': ' \nAnswer: Director of the Digital Marketing Group.', 'intermediate_steps': [{'query': " MATCH (p:Person {id: 'John'})-[:TITLE]->(t:Title) RETURN t.id"}, {'context': [{'t.id': 'Director of the Digital Marketing Group'}]}]}
```
In the next question, ask the chain a slightly more complex question:
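For instance, a reverse lookup (a hypothetical question; substitute your own):

```python
response = chain.invoke({"query": "Who is the Director of the Digital Marketing Group?"})
print(response["result"])
```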
This should return:
The correct answer is contained in the response. In some cases, there may be extra text that you'd want to remove before returning the answer to an end user.
You can ask the Memgraph chain about Group relationships:
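For example, using a name from the hypothetical sample text from earlier:

```python
response = chain.invoke({"query": "Which group is Sarah in?"})
print(response["result"])
```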
This will return:
This is the correct answer.
Finally, ask the chain a question with two outputs:
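For example, again based on the hypothetical sample text, in which John has two collaborators:

```python
response = chain.invoke({"query": "Who does John collaborate with?"})
print(response["result"])
```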
This should output:
The chain correctly identifies both of the collaborators.
In this tutorial, you built a Graph RAG application using Memgraph and watsonx to generate graph data structures and query them. Using an LLM through watsonx, you extracted node and edge information from natural language source text and generated Cypher syntax to populate a graph database. You then used watsonx to turn natural language questions about that source text into Cypher queries that extracted information from the graph database. Finally, with prompt engineering, the LLM turned the results from the Memgraph database into natural language responses.