What is Milvus?

6 January 2025

Authors

Matthew Kosinski

Enterprise Technology Writer

What is Milvus?

Milvus is an open-source vector database developed by Zilliz. Milvus is known for providing scalable storage for large numbers of vector embeddings and supporting high-performance similarity searches of vector data.

Zilliz first developed Milvus in 2017 and contributed the project to the Linux® Foundation in 2020. Milvus is now available both as open-source software under the Apache License 2.0 and as a fully managed cloud service from Zilliz.

What are vector databases, and why do they matter?

Vector databases store and manage datasets in the form of vectors. They can help organizations manage unstructured data, and they are critical to advanced artificial intelligence (AI) and machine learning (ML) efforts.

Vectors are arrays of numbers that represent complex concepts and objects, such as words and images.  

Unstructured data—such as text, video and audio—makes up a significant portion of enterprise data today, but traditional databases are often ill-suited to organize and manage this data.  

Organizations can feed this data to specialized deep learning embedding models, which output vector representations called “embeddings.” For example, the word “cat” might be represented by the vector [0.2, -0.4, 0.7], while the word “dog” might be represented by [0.6, 0.1, 0.5].
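
How close two embeddings are is typically measured with a metric such as cosine similarity. The following minimal sketch compares the toy “cat” and “dog” vectors from the example above. The “car” vector and the helper name `cosine_similarity` are invented for illustration, and real embeddings usually have hundreds or thousands of dimensions rather than three.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy three-dimensional embeddings from the example above.
cat = [0.2, -0.4, 0.7]
dog = [0.6, 0.1, 0.5]
# Hypothetical vector for an unrelated word, invented for this sketch.
car = [-0.5, 0.8, -0.1]

cat_dog = cosine_similarity(cat, dog)
cat_car = cosine_similarity(cat, car)
print(f"cat vs. dog: {cat_dog:.3f}")
print(f"cat vs. car: {cat_car:.3f}")
```

With these made-up numbers, “cat” scores closer to “dog” than to “car,” which is the kind of semantic relationship an embedding model is meant to capture.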

Transforming data into vectors enables organizations to store different kinds of unstructured data in a shared format in one vector database.  

Vectors also help organizations unlock the value of this data for AI and ML. Vectors capture the semantic relationships between elements, enabling effective processing by large language models (LLMs) and generative AI (gen AI) tools. Most advanced AI and ML applications today rely on vectors for training and content generation.

Like other vector databases, Milvus gives organizations a way to manage and organize embedding vectors. The Milvus vector database’s highly scalable storage and efficient vector search capabilities have made it a popular choice for retrieval augmented generation (RAG), recommendation systems and other AI applications.  


Milvus's architecture

Milvus is a cloud-native vector database with a multilayered, microservices-based architecture. Milvus separates storage and compute resources, which enables organizations to scale each layer independently and horizontally.

Milvus is compatible with several different embedding models. Organizations can connect their models to Milvus, which ingests the embeddings along with metadata and other pertinent information. Milvus supports streaming and batch embedding uploads.

Milvus has four layers:

  • Access layer: This is the external-facing layer, which accepts inputs from users and services and returns outputs.  

  • Coordinator service: Zilliz refers to this layer as the system’s “brain” because it orchestrates load balancing, data management, query execution and other important tasks. 

  • Worker nodes: This layer executes queries, updates data and builds indexes. 

  • Storage layer: This layer includes a metadata store, a log broker that records real-time data changes and an object store that holds log snapshots, index files and intermediate computation results.

Milvus deployment types

  • Milvus Lite: A Python library that allows users to run Milvus in local environments. Milvus Lite currently supports Ubuntu and macOS, but not Microsoft Windows.

  • Milvus Standalone: A complete Milvus database packaged in a single Docker image and run on a single machine. 

  • Milvus Cluster: A distributed vector database that spreads services across groups of nodes in a Kubernetes cluster.  

  • Zilliz Cloud: The fully managed version of Milvus.

Key characteristics and capabilities of Milvus

Advanced search capabilities 

Milvus supports high-performance vector similarity searches, a type of vector search that returns results that are semantically similar to a query. The benefit of similarity search is that it is not limited to exact matches, as a traditional keyword search would be.  

For example, a keyword search for “best pizza restaurant” would return only results containing the words “best,” “pizza” and “restaurant.” A similarity search for the same query would also surface highly recommended pizza places, even if the exact words “best pizza restaurant” do not appear in the content.

Milvus supports several similarity search types, including top-k approximate nearest neighbor (ANN) and range ANN.  
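
The difference between the two search types can be sketched in a few lines. Note that this sketch uses exact, brute-force search over a tiny in-memory dictionary rather than an approximate algorithm, and it is not Milvus code. The function names `top_k_search` and `range_search` and the sample vectors are assumptions made for illustration.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_k_search(query, vectors, k):
    """Return the k (id, distance) pairs closest to the query."""
    scored = [(vec_id, l2_distance(query, vec)) for vec_id, vec in vectors.items()]
    return sorted(scored, key=lambda pair: pair[1])[:k]

def range_search(query, vectors, radius):
    """Return every (id, distance) pair within `radius` of the query."""
    scored = [(vec_id, l2_distance(query, vec)) for vec_id, vec in vectors.items()]
    return sorted((p for p in scored if p[1] <= radius), key=lambda pair: pair[1])

# Hypothetical two-dimensional vectors keyed by ID.
vectors = {
    "a": [0.0, 0.0],
    "b": [1.0, 0.0],
    "c": [0.0, 2.0],
    "d": [3.0, 3.0],
}
query = [0.1, 0.0]

top_two = top_k_search(query, vectors, k=2)                 # fixed number of results
within_radius = range_search(query, vectors, radius=2.5)    # fixed distance cutoff
print(top_two)
print(within_radius)
```

A top-k search always returns a fixed number of results, however far away they are, while a range search returns however many results fall inside the distance cutoff.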

Milvus also supports hybrid searches, which combine semantic vector searches with other criteria, such as metadata filtering or keyword search.

Hybrid searches can make searches more efficient and more relevant. Consider a search that combines keyword and vector search. The search can first use specific keywords to filter results based on exact matches and then use vector similarity search to search those filtered results for the most semantically relevant content.  
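
The filter-then-rank flow described above might look like the following sketch. It is plain Python over a small in-memory list, not the Milvus API. The documents, their two-dimensional vectors and the function name `hybrid_search` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical documents with made-up embeddings.
documents = [
    {"id": 1, "text": "Top-rated pizza place downtown", "vector": [0.9, 0.1]},
    {"id": 2, "text": "Pizza dough recipe for beginners", "vector": [0.2, 0.9]},
    {"id": 3, "text": "Highly recommended sushi bar", "vector": [0.8, 0.3]},
]

def hybrid_search(keyword, query_vector, docs, limit):
    # Step 1: keep only documents whose text contains the keyword.
    candidates = [d for d in docs if keyword.lower() in d["text"].lower()]
    # Step 2: rank the surviving documents by similarity to the query vector.
    ranked = sorted(
        candidates,
        key=lambda d: cosine_similarity(query_vector, d["vector"]),
        reverse=True,
    )
    return ranked[:limit]

results = hybrid_search("pizza", [1.0, 0.0], documents, limit=2)
print([d["id"] for d in results])
```

In this sketch the sushi document is excluded by the keyword filter even though its vector is close to the query, which shows how the exact-match step narrows the set before similarity ranking is applied.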

Indexing 

Milvus supports several indexing types, including hierarchical navigable small world (HNSW), inverted file (IVF) and GPU-based indexes.

Indexing vectors can help speed up searches. For example, HNSW links similar vectors together in a layered graph during index construction, so a search can navigate toward relevant results without scanning every vector.
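
Indexes speed up searches by limiting how many vectors a query must be compared against. As a rough illustration of the inverted file (IVF) idea, the sketch below assigns each vector to the bucket of its nearest centroid and then scans only the buckets closest to the query. It is a simplification and not Milvus's implementation: the centroids are hand-picked here, whereas IVF indexes normally learn them with a clustering step such as k-means, and all names (`build_ivf_index`, `ivf_search`, `nprobe`) are used for illustration.

```python
import math
from collections import defaultdict

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_ivf_index(vectors, centroids):
    """Assign each vector ID to the bucket of its nearest centroid."""
    buckets = defaultdict(list)
    for vec_id, vec in vectors.items():
        nearest = min(range(len(centroids)), key=lambda i: l2_distance(vec, centroids[i]))
        buckets[nearest].append(vec_id)
    return buckets

def ivf_search(query, vectors, centroids, buckets, nprobe, k):
    """Scan only the nprobe buckets whose centroids are closest to the query."""
    probe_order = sorted(range(len(centroids)), key=lambda i: l2_distance(query, centroids[i]))
    candidates = [vid for i in probe_order[:nprobe] for vid in buckets.get(i, [])]
    return sorted(candidates, key=lambda vid: l2_distance(query, vectors[vid]))[:k]

# Hypothetical data: five vectors and three hand-picked centroids.
vectors = {
    "a": [0.1, 0.2],
    "b": [0.3, 0.1],
    "c": [5.0, 5.1],
    "d": [5.2, 4.9],
    "e": [9.8, 0.2],
}
centroids = [[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]]

buckets = build_ivf_index(vectors, centroids)
nearest = ivf_search([0.2, 0.2], vectors, centroids, buckets, nprobe=1, k=2)
print(dict(buckets))
print(nearest)
```

Scanning fewer buckets makes a search faster but can miss a true neighbor that landed in an unscanned bucket, which is why searches of this kind are called approximate.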

Scalability 

Milvus’s layers can scale independently of one another, which can give organizations a cost- and resource-effective way to handle massive amounts of vector data and intensive searches.

Integrations and compatibility 

Milvus offers numerous software development kits (SDKs) to support development in various languages, including Python (pymilvus), Java and Go.

Milvus can also integrate with analytics tools such as Apache Spark, frameworks such as LangChain, and AI platforms and models such as IBM watsonx™, Meta’s Llama and OpenAI’s GPT models.

Open source

A large open-source community contributes bug fixes, updates and other enhancements to Milvus’s GitHub repos.

Milvus vs. other vector databases

Milvus vs. Pinecone

Both Pinecone and Milvus offer low-latency search and scalable storage, but Pinecone is a proprietary vector database and is available only as a managed service.

Milvus vs. Weaviate

Like Milvus, Weaviate is open source and supports hybrid searches. One key difference is that Milvus offers more indexing types than Weaviate.

Milvus vs. Qdrant

Another open-source vector database, Qdrant is known for its strong metadata filtering capabilities. While Qdrant is well suited for moderate-scale uses, Milvus can generally handle higher volumes of vector data.

Milvus vs. Chroma

Chroma focuses on ease of use and quick local deployments. Chroma does not have a distributed architecture, making it less scalable. While Chroma is commonly used for prototyping and testing, Milvus can support a wider range of use cases.

Common Milvus use cases

Organizations use Milvus to support numerous AI applications, including:

  • Retrieval augmented generation (RAG)

  • Recommendation systems

  • Media searches 

  • Anomaly and fraud detection

Retrieval augmented generation (RAG)

RAG is an architecture that connects AI models to external knowledge bases to help them deliver more relevant, accurate results.  

Milvus is common in RAG implementations because of its support for efficient hybrid searches. By combining the contextual understanding of semantic search with the precision of keyword search and metadata filtering, Milvus can help surface relevant docs, code snippets and other information from RAG sources.

GPTCache, an open-source semantic cache for LLMs developed by Zilliz, also helps Milvus support RAG implementations. GPTCache stores responses from generative AI apps as vector embeddings.

Connected services—such as RAG interfaces—don’t need to make an API call to the generative AI for every search. Instead, they can check the cache first and call the gen AI only if the answer isn’t there.
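
The check-the-cache-first pattern can be sketched as follows. This is a conceptual illustration in plain Python and does not use the API of the caching tool described above. The class `SemanticCache`, the function `answer`, the similarity threshold and the stand-in `fake_model` function are all assumptions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

class SemanticCache:
    """Stores (query embedding, response) pairs and matches by similarity."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.entries = []

    def lookup(self, embedding):
        """Return the best cached response if it is similar enough, else None."""
        best_score, best_response = 0.0, None
        for cached_embedding, response in self.entries:
            score = cosine_similarity(embedding, cached_embedding)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def store(self, embedding, response):
        self.entries.append((embedding, response))

def answer(embedding, cache, call_model):
    """Check the cache first and call the model only on a miss."""
    cached = cache.lookup(embedding)
    if cached is not None:
        return cached, "cache"
    response = call_model()
    cache.store(embedding, response)
    return response, "model"

calls = []

def fake_model():
    # Stand-in for an expensive generative AI API call.
    calls.append(1)
    return "Milvus is an open-source vector database."

cache = SemanticCache(threshold=0.95)
first = answer([0.9, 0.1, 0.3], cache, fake_model)       # miss: calls the model
second = answer([0.88, 0.12, 0.31], cache, fake_model)   # near-duplicate: served from cache
print(first[1], second[1], len(calls))
```

The threshold controls the tradeoff: a higher value avoids returning a cached answer to a question that only looks similar, while a lower value saves more model calls.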

Recommendation systems

Milvus is popular in recommendation systems that match content, products and ads to users based on past behaviors. 

User preferences can be represented as vectors, and a similarity search can surface the vector representations of products, ads and content that are close to the user’s preferences.
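
One simple way to build a preference vector, assumed here purely for illustration, is to average the vectors of items a user has already engaged with and then rank the remaining items by similarity to that average. The catalog, its vectors and the function names below are hypothetical, and production systems generally use richer signals.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mean_vector(vectors):
    """Average a list of equal-length vectors component by component."""
    return [sum(components) / len(vectors) for components in zip(*vectors)]

def recommend(history, catalog, limit):
    """Rank items the user has not seen by similarity to their preference vector."""
    preference = mean_vector([catalog[item] for item in history])
    candidates = [item for item in catalog if item not in history]
    ranked = sorted(
        candidates,
        key=lambda item: cosine_similarity(preference, catalog[item]),
        reverse=True,
    )
    return ranked[:limit]

# Hypothetical product catalog with made-up embeddings.
catalog = {
    "running shoes": [0.9, 0.1],
    "trail shoes": [0.8, 0.2],
    "hiking socks": [0.7, 0.3],
    "yoga mat": [0.6, 0.6],
    "cookbook": [0.1, 0.9],
}

recommendations = recommend(["running shoes", "trail shoes"], catalog, limit=2)
print(recommendations)
```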

Media searches

Milvus's similarity search capabilities can help streamline image search, audio search, video search and other media searches.

Anomaly and fraud detection

Milvus can be used to help spot defects in products by comparing the vectors of product images against the vectors representing those products' proper forms. Differences between the vectors might indicate defects.

Milvus can also help spot anomalies in other contexts. In cybersecurity, vectors representing observed network activity can be compared with vectors representing authorized behavior or known malicious activity. Likewise, in finance, vectors representing transactions can be analyzed to identify deviations that might indicate fraud.
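
A distance threshold is one straightforward way to turn these vector comparisons into a yes-or-no signal. In the sketch below, each sample is compared with a reference vector and flagged when it lies too far away. The reference vector, sample values, threshold and function name `find_anomalies` are invented for illustration, and a real threshold would need tuning against known good and bad examples.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def find_anomalies(reference, samples, threshold):
    """Return the IDs of samples farther than `threshold` from the reference."""
    return [
        sample_id
        for sample_id, vector in samples.items()
        if l2_distance(reference, vector) > threshold
    ]

# Hypothetical embedding of a product's proper form.
reference = [1.0, 0.0, 0.5]

# Hypothetical embeddings of inspected units.
samples = {
    "unit-1": [0.98, 0.02, 0.51],
    "unit-2": [0.4, 0.7, 0.1],
    "unit-3": [1.01, -0.01, 0.49],
}

flagged = find_anomalies(reference, samples, threshold=0.2)
print(flagged)
```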
