Milvus is an open-source vector database developed by Zilliz, known for providing scalable storage for large volumes of vector embeddings and supporting high-performance similarity searches of vector data.
Zilliz first developed Milvus in 2017 and contributed the project to the Linux® Foundation in 2020. Milvus is now available both as open-source software under the Apache License 2.0 and as a fully managed cloud service from Zilliz.
Vector databases store and manage datasets in the form of vectors. They can help organizations manage unstructured data, and they are critical to advanced artificial intelligence (AI) and machine learning (ML) efforts.
Vectors are arrays of numbers that represent complex concepts and objects, such as words and images.
Unstructured data—such as text, video and audio—makes up a significant portion of enterprise data today, but traditional databases are often ill-suited to organize and manage this data.
Organizations can feed this data to specialized deep learning embedding models, which output vector representations called “embeddings.” For example, the word “cat” might be represented by the vector [0.2, -0.4, 0.7], while the word “dog” might be represented by [0.6, 0.1, 0.5].
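Closeness between embeddings is often measured with cosine similarity. A minimal sketch in plain Python, using the toy "cat" and "dog" vectors above (real embeddings typically have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: closer to 1 means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

cat = [0.2, -0.4, 0.7]
dog = [0.6, 0.1, 0.5]
score = cosine_similarity(cat, dog)  # related words score closer to 1 than unrelated ones
```

Vector databases such as Milvus compute comparisons like this at scale, across millions or billions of stored vectors.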
Transforming data into vectors enables organizations to store different kinds of unstructured data in a shared format in one vector database.
Vectors also help organizations unlock the value of this data for AI and ML. Vectors capture the semantic relationships between elements, enabling effective processing by large language models (LLMs) and generative AI (gen AI) tools. Most advanced AI and ML applications today rely on vectors for training and content generation.
Like other vector databases, Milvus gives organizations a way to manage and organize embedding vectors. The Milvus vector database’s highly scalable storage and efficient vector search capabilities have made it a popular choice for retrieval augmented generation (RAG), recommendation systems and other AI applications.
Milvus is a cloud-native vector database with a multilayered, microservices-based architecture. Milvus separates storage and compute resources, which enables organizations to scale each layer independently and horizontally.
Milvus is compatible with several different embedding models. Organizations can connect their models to Milvus, which ingests the embeddings along with metadata and other pertinent information. Milvus supports streaming and batch embedding uploads.
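A sketch of batch ingestion using the pymilvus `MilvusClient` API. The collection name, field names and URI here are hypothetical, and the `ingest` helper assumes a running Milvus instance:

```python
def build_rows(ids, embeddings, metadata):
    """Combine embeddings with their metadata into Milvus-ready rows."""
    return [
        {"id": i, "vector": vec, **meta}
        for i, vec, meta in zip(ids, embeddings, metadata)
    ]

def ingest(rows, collection="docs", uri="http://localhost:19530"):
    # Requires a running Milvus instance; the import is deferred so
    # build_rows can be used without the pymilvus dependency.
    from pymilvus import MilvusClient
    client = MilvusClient(uri=uri)
    client.insert(collection_name=collection, data=rows)

rows = build_rows(
    ids=[1, 2],
    embeddings=[[0.2, -0.4, 0.7], [0.6, 0.1, 0.5]],
    metadata=[{"word": "cat"}, {"word": "dog"}],
)
```

For streaming uploads, the same `insert` call can be made repeatedly as new embeddings arrive, rather than in one batch.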
Milvus has four layers:
Access layer: This is the external-facing layer, which accepts inputs from users and services and returns outputs.
Coordinator service: Zilliz refers to this layer as the system’s “brain” because it orchestrates load balancing, data management, query execution and other important tasks.
Worker nodes: This layer executes queries, updates data and builds indexes.
Object storage layer: This layer includes a metadata store, a log broker that records real-time data changes and an object store that holds log snapshots, index files and intermediate computation results.
Milvus supports high-performance vector similarity searches, a type of vector search that returns results that are semantically similar to a query. The benefit of similarity search is that it is not limited to exact matches, as a traditional keyword search would be.
For example, a keyword search for “best pizza restaurant” would return only results containing the words “best,” “pizza” and “restaurant.” A similarity search for the same phrase would surface highly recommended pizza places even if the exact words “best pizza restaurant” never appear in the content.
Milvus supports several similarity search types, including top-k approximate nearest neighbor (ANN) and range ANN.
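To illustrate the two search types, here is an exact, brute-force version of each in plain Python with made-up vectors. ANN indexes approximate the top-k version for speed, while a range search keeps every vector within a distance threshold instead of a fixed k:

```python
import math

corpus = [
    ("cat", [0.2, -0.4, 0.7]),
    ("dog", [0.6, 0.1, 0.5]),
    ("car", [-0.8, 0.9, 0.1]),
]

def top_k(query, corpus, k):
    # Exact top-k by Euclidean distance; ANN indexes approximate this result.
    return sorted(corpus, key=lambda item: math.dist(query, item[1]))[:k]

def range_search(query, corpus, radius):
    # Range variant: every vector within `radius` of the query.
    return [item for item in corpus if math.dist(query, item[1]) <= radius]

query = [0.25, -0.35, 0.65]
```

Top-k always returns k results regardless of how far away they are; a range search can return any number of results, including none.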
Milvus also supports hybrid searches, which combine semantic vector searches with other criteria, such as metadata filtering or keyword search.
Hybrid searches can make searches more efficient and more relevant. Consider a search that combines keyword and vector search. The search can first use specific keywords to filter results based on exact matches and then use vector similarity search to search those filtered results for the most semantically relevant content.
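The two-stage hybrid search described above can be sketched in plain Python. The documents and two-dimensional vectors are invented for illustration:

```python
import math

docs = [
    {"text": "best pizza restaurant downtown", "vector": [0.9, 0.1]},
    {"text": "top-rated pizza place", "vector": [0.85, 0.2]},
    {"text": "sushi bar near me", "vector": [0.2, 0.9]},
]

def hybrid_search(query_vec, keyword, docs):
    # Stage 1: keyword filter narrows the candidate set with exact matches.
    filtered = [d for d in docs if keyword in d["text"]]
    # Stage 2: rank the survivors by vector similarity to the query.
    return sorted(filtered, key=lambda d: math.dist(query_vec, d["vector"]))

results = hybrid_search([0.9, 0.15], "pizza", docs)
```

In Milvus, the filtering stage would typically use metadata or scalar fields rather than a substring check, but the shape of the pipeline is the same.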
Milvus supports several indexing types, including hierarchical navigable small world (HNSW), inverted file (IVF) and GPU-based indexes.
Indexing vectors can help speed up searches. For example, HNSW clusters similar vectors together during the index construction process, making it easier to find relevant results faster.
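The speedup from clustering can be seen in a toy inverted-file (IVF) style index: vectors are grouped into buckets around centroids, and a search probes only the bucket nearest the query instead of scanning everything. The centroids here are fixed by hand; real IVF learns them with k-means, and HNSW uses a layered graph rather than buckets:

```python
import math

centroids = [[0.0, 0.0], [1.0, 1.0]]
vectors = [[0.1, 0.2], [0.05, -0.1], [0.9, 1.1], [1.2, 0.8]]

def nearest(point, candidates):
    """Index of the candidate closest to point."""
    return min(range(len(candidates)), key=lambda i: math.dist(point, candidates[i]))

# Build step: assign each vector to its nearest centroid's bucket.
buckets = {i: [] for i in range(len(centroids))}
for vec in vectors:
    buckets[nearest(vec, centroids)].append(vec)

def search(query):
    # Probe only the closest bucket instead of scanning every vector.
    bucket = buckets[nearest(query, centroids)]
    return min(bucket, key=lambda v: math.dist(query, v))
```

With many buckets and millions of vectors, probing a handful of buckets skips the vast majority of distance computations.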
Milvus’s layers can scale independently of one another, which can give organizations a cost- and resource-effective way to handle massive amounts of vector data and intensive searches.
Milvus offers numerous software development kits (SDKs) to support development in various languages, including Python (pymilvus), Java and Go.
Milvus can also integrate with analytics tools such as Apache Spark, frameworks such as LangChain and gen AI models such as IBM watsonx™, Meta’s Llama and OpenAI’s GPT models.
A large open-source community contributes bug fixes, updates and other enhancements to Milvus’s GitHub repos.
Both Pinecone and Milvus offer low-latency search and scalable storage, but Pinecone is a proprietary vector database and is available only as a managed service.
Like Milvus, Weaviate is open source and supports hybrid searches. One key difference is that Milvus offers more indexing types than Weaviate.
Qdrant, another open-source vector database, is known for its strong metadata filtering capabilities. While Qdrant is well suited for moderate-scale uses, Milvus can generally handle higher volumes of vector data.
Chroma focuses on ease of use and quick local deployments. Chroma does not have a distributed architecture, making it less scalable. While Chroma is commonly used for prototyping and testing, Milvus can support a wider range of use cases.
Organizations use Milvus to support numerous AI applications, including:
Retrieval augmented generation (RAG)
Recommendation systems
Media searches
Anomaly and fraud detection
RAG is an architecture that connects AI models to external knowledge bases to help them deliver more relevant, accurate results.
Milvus is common in RAG implementations because of its support for efficient hybrid searches. By combining the contextual understanding of semantic search with the precision of keyword search and metadata filtering, Milvus can help surface relevant docs, code snippets and other information from RAG sources.
GPTCache, an open-source semantic cache for LLMs developed by Zilliz, also helps Milvus support RAG implementations. GPTCache stores responses from generative AI apps as vector embeddings.
Connected services, such as RAG interfaces, don’t need to make an API call to the gen AI model for every search. Instead, they can check the cache first and call the model only if the answer isn’t there.
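A toy semantic cache in this spirit: before calling the model, look for a cached query whose embedding is within a similarity threshold. The `fake_embed` and `fake_model` functions are stand-ins, not part of any real library:

```python
import math

cache = {}  # maps a query's embedding (as a tuple) to a cached answer
CALLS = {"model": 0}  # counts how often the expensive model is invoked

def fake_embed(text):
    # Stand-in embedding; real systems use a learned embedding model.
    return (len(text) / 10.0, text.count("a") / 5.0)

def fake_model(text):
    # Stand-in for an expensive gen AI API call.
    CALLS["model"] += 1
    return f"answer to: {text}"

def answer(query, threshold=0.1):
    qvec = fake_embed(query)
    for cached_vec, cached_answer in cache.items():
        if math.dist(qvec, cached_vec) <= threshold:  # cache hit: skip the model call
            return cached_answer
    result = fake_model(query)
    cache[qvec] = result
    return result
```

Semantically similar queries land near each other in embedding space, so the cache can answer paraphrased questions, not just exact repeats.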
Milvus is popular in recommendation systems that match content, products and ads to users based on past behaviors.
User preferences can be represented as vectors, and a similarity search can surface the vector representations of products, ads and content that are close to the user’s preferences.
Milvus’s similarity search capabilities can help streamline image search, audio search, video search and other media searches.
Milvus can be used to help spot defects in products by comparing the vectors of product images against the vectors representing those products’ proper forms. Differences between the vectors might indicate defects.
Milvus can also help spot anomalies in other contexts. In cybersecurity, vectors representing authorized network activity can be compared to vectors representing known malicious activity. Likewise, in finance, vectors representing transactions can be analyzed to identify deviations that might indicate fraud.
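Distance-based anomaly detection can be sketched in a few lines: an item whose vector is far from every “normal” reference vector gets flagged. The vectors and threshold here are invented for illustration:

```python
import math

# Reference vectors representing known-normal activity (made-up values).
normal = [[0.1, 0.2], [0.15, 0.25], [0.12, 0.18]]

def is_anomalous(vec, references, threshold=0.5):
    """Flag a vector whose nearest normal reference is beyond the threshold."""
    return min(math.dist(vec, ref) for ref in references) > threshold

flag_typical = is_anomalous([0.13, 0.21], normal)  # close to normal activity
flag_outlier = is_anomalous([0.9, 0.95], normal)   # far from everything
```

In production, the reference set would be the indexed vectors in Milvus, and the nearest-neighbor lookup would be a similarity search rather than a linear scan.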