Overview
In this guide, we’ll use Ollama, an open-source tool that makes it easy to download and run AI models locally.

1. Install Ollama
Install Ollama for Linux with the official install script from ollama.com. The installer also registers a systemd service named ollama.service that runs the server in the background.
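The install step is a one-liner (this is the official script; you may want to inspect it before piping it to a shell):

```shell
# Download and run the official Ollama install script for Linux.
curl -fsSL https://ollama.com/install.sh | sh
```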
To manage the service manually:
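Because Ollama is registered as a standard systemd unit, the usual systemctl commands apply:

```shell
# Check whether the Ollama server is running.
systemctl status ollama

# Stop or start the background server.
sudo systemctl stop ollama
sudo systemctl start ollama

# Disable automatic start at boot (and re-enable it later).
sudo systemctl disable ollama
sudo systemctl enable ollama
```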
2. Download Granite Models
Ollama supports a range of IBM Granite models. Larger models provide better results but require more resources. To download Granite 4, pull it from the Ollama library.

3. Run Granite

To start chatting with Granite, run the model you downloaded (e.g. granite4).
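The download and run steps above, assuming the model is published under the tag granite4 in the Ollama library (other sizes and variants have their own tags):

```shell
# Fetch the model weights from the Ollama library.
ollama pull granite4

# Start an interactive chat session in the terminal.
ollama run granite4
```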
4. Notes on Context Length
By default, Ollama runs models with a short context length to save memory.
For longer conversations, you can adjust it by setting:
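For example, the context window can be raised server-wide with the OLLAMA_CONTEXT_LENGTH environment variable, or per session from inside the interactive REPL; 8192 here is an arbitrary choice, and larger windows use more memory:

```shell
# Raise the context window (in tokens) for the whole server:
OLLAMA_CONTEXT_LENGTH=8192 ollama serve

# Or, inside an `ollama run` session, for the current session only:
#   /set parameter num_ctx 8192
```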
5. Using the API
You can also interact with Granite programmatically using Ollama’s OpenAI-compatible API.

6. Adding Documents
You can add documents to your context using the special "document <title>" role:
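Putting the two together, here is a minimal sketch of a request against Ollama’s local OpenAI-compatible /v1 endpoint. The model tag granite4, the document title notes, and the document text are placeholders, and the role string simply follows the "document <title>" convention described above:

```python
import json
import urllib.request

# Chat request in OpenAI-compatible format. The "document notes" role
# attaches a document titled "notes" to the context, following the
# convention above; "granite4" is the model tag pulled earlier.
payload = {
    "model": "granite4",
    "messages": [
        {"role": "document notes",
         "content": "Granite is a family of open models from IBM."},
        {"role": "user",
         "content": "Summarize the attached document."},
    ],
}

req = urllib.request.Request(
    "http://localhost:11434/v1/chat/completions",  # Ollama's default local endpoint
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Sending the request requires a running Ollama server:
# with urllib.request.urlopen(req) as resp:
#     reply = json.load(resp)
#     print(reply["choices"][0]["message"]["content"])
```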