Trusted Legal AI Built for Enterprise Scale

Built on IBM watsonx.data, the platform combines hybrid retrieval and secure deployment for regulated legal environments

Making legal AI accurate, complete and trusted

Legal research demands more than speed: it requires precision, completeness and verifiable citations. Law firms and legal departments working across hundreds of thousands of documents cannot afford partial answers or hallucinations. Getting “70% of the answer” introduces real risk.

Modern AI techniques such as semantic retrieval and retrieval-augmented generation (RAG) offer promise, but scaling them in regulated environments presents challenges. Sensitive legal data, which often contains personally identifiable information (PII) or protected health information (PHI), may need to remain on premises due to regulatory or risk constraints.

Shorthills AI set out to build a production-grade legal AI assistant capable of delivering traceable, multi-angle, citation-backed answers while operating securely at enterprise scale.

60% more complete and accurate results with fewer missed cases
4X more complete responses across multiple query aspects
9X increase in diversity of legal reasoning

“We needed a platform that could handle hundreds of thousands of legal documents without compromising speed, accuracy or security. IBM watsonx.data gave us the flexibility and performance to deliver more comprehensive, reliable results for our clients while maintaining enterprise-grade security and governance.”
Paramdeep Singh, Co-founder, Shorthills AI

Building scalable hybrid legal search

Shorthills AI collaborated with IBM to build a production-ready hybrid search platform using IBM® watsonx.data® and Langflow. The team designed a modular architecture combining keyword, vector and graph-based search.

Documents were ingested into a secure data lake, intelligently chunked and enriched with entity extraction (e.g., judge names, case types, citations). Embeddings were stored in watsonx.data to support hybrid search at scale. A routing layer directed queries to the optimal search method—keyword for document IDs, vector for semantic questions and graph for deeper legal reasoning—balancing cost, speed and relevance.
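
The routing idea described above can be illustrated with a minimal Python sketch. It is not the Shorthills AI or IBM implementation: the `SearchMethod` enum, the `route_query` function, the document-ID pattern and the cue phrases are all hypothetical, chosen only to show how a router might send identifier lookups to keyword search, relationship-heavy questions to graph search and everything else to vector search.

```python
import re
from enum import Enum


class SearchMethod(Enum):
    KEYWORD = "keyword"
    VECTOR = "vector"
    GRAPH = "graph"


# Hypothetical document-ID format, e.g. "DOC-2021-00457"; real IDs will differ.
DOC_ID_PATTERN = re.compile(r"\b[A-Z]{2,5}-\d{4}-\d{3,6}\b")

# Hypothetical cue phrases suggesting a query needs deeper legal reasoning.
GRAPH_CUES = (
    "grounds for",
    "rebuttal",
    "precedent",
    "overruled",
    "cited by",
    "relied on",
)


def route_query(query: str) -> SearchMethod:
    """Pick a search method for a query using simple, ordered heuristics."""
    # 1. Exact identifiers are cheapest to resolve with keyword search.
    if DOC_ID_PATTERN.search(query):
        return SearchMethod.KEYWORD
    # 2. Questions about relationships between cases go to graph search.
    lowered = query.lower()
    if any(cue in lowered for cue in GRAPH_CUES):
        return SearchMethod.GRAPH
    # 3. Everything else is treated as a semantic question.
    return SearchMethod.VECTOR


if __name__ == "__main__":
    examples = (
        "Open DOC-2021-00457",
        "What are the grounds for rebuttal in a wrongful termination claim?",
        "How do courts treat late filing of appeals?",
    )
    for q in examples:
        print(f"{route_query(q).value:8s} <- {q}")
```

A production router would likely replace these string heuristics with a classifier or an LLM-based intent step, but the ordering shown here (cheapest method first, most expensive last) reflects the cost, speed and relevance trade-off the paragraph describes.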

This approach transformed research workflows from manual and fragmented searches to automated, context-aware retrieval with citations and reranking for accuracy.

Faster, deeper and more reliable legal research

With IBM watsonx.data, Shorthills AI delivered measurable improvements: a 4X increase in comprehensiveness, a 9X increase in diversity of legal arguments and over 60% improvement in recall and precision. Search times decreased while relevance improved, enabling legal teams to uncover multiple grounds for rebuttal and supporting case law more reliably.

Researchers now receive broader, citation-backed results instead of partial answers. The platform is scalable, secure and suitable for on-premises deployments required by regulated industries.

Looking ahead, Shorthills AI plans to extend the solution with AI agents that can draft responses, summarize findings and automate downstream workflows—deepening its collaboration with IBM as a long-term technology partner.

About Shorthills AI

Founded in 2018, Shorthills AI is a technology company specializing in generative AI and data engineering solutions. With a global presence across the USA, India, Canada and Australia, it serves clients across legal, healthcare and enterprise sectors, building scalable AI systems that drive measurable ROI through intelligent automation and advanced search.

Solution component: IBM® watsonx.data®

Power enterprise AI with IBM watsonx.data

Explore IBM watsonx.data to see how a hybrid, open data lakehouse can unify and govern your data to power trusted AI and analytics at scale.

Legal

© Copyright IBM Corporation March, 2026.

IBM, the IBM logo, and IBM watsonx.data® are trademarks of IBM Corp., registered in many jurisdictions worldwide.

Examples presented as illustrative only. Actual results will vary based on client configurations and conditions and, therefore, generally expected results cannot be provided.