Adding a Milvus service

Milvus is a vector database that stores, indexes, and manages large volumes of embedding vectors that are generated by deep neural networks and other machine learning (ML) models. It is designed to power embedding similarity search and AI applications. Milvus makes unstructured data search more accessible and consistent across various environments.

watsonx.data on IBM Software Hub

watsonx.data Developer edition

About this task

You can add Milvus as a service in IBM® watsonx.data through the web console by completing the following steps.

Procedure

  1. Log in to the watsonx.data console.
  2. From the navigation menu, select Infrastructure Manager.
  3. To define and connect to a service, click Add component, select Add service, select Milvus, and then click Next.
  4. In the Add service window, select Milvus from the Type list. In the Add component - Milvus window, provide the following details.
    Field: Description
    Display name: Enter the Milvus service name to be displayed on the screen.
    Size: Select the suitable size.
    • Starter: Recommended for 1 million vectors, 64 index parameters, 1024 segment size, and 1024 dimensions.
    • Small: Recommended for 10 million vectors, 64 index parameters, 1024 segment size, and 1024 dimensions.
    • Medium: Recommended for 50 million vectors, 64 index parameters, 1024 segment size, and 1024 dimensions.
    • Large: Recommended for 100 million vectors, 64 index parameters, 1024 segment size, and 1024 dimensions.
    • Custom: Recommended for up to 3 billion vectors, 64 index parameters, and 1024 segment size. The actual number of vectors and dimensions supported depends on the index type and the maximum supported vCPU configuration.

      • IVF_SQ8 - Up to 3 billion vectors.
      • IVF_FLAT - Up to 1.3 billion vectors.
      • HNSW - Up to 1 billion vectors.
    Add storage bucket: Associate an external storage for the Starter, Small, Medium, or Large sizes. To associate an external storage, you must have the storage configured. For more information about adding an external storage, see Adding storage.
    Path: For external storages, specify the path where you want to store vectorized data files.
    Note: Running multiple Milvus instances that share the same rootPath within a single MinIO bucket is not recommended because their data and metadata overlap, which leads to conflicts and a high risk of data corruption or loss. To ensure data integrity and isolation, configure each Milvus instance with a unique minio.rootPath value in its configuration file before you start it, even if the instances use the same bucket.
    Important:
    • You can scale up Milvus between predefined T-shirt sizes (small, medium, large) or custom sizes. Scaling down from a higher capacity might impact performance. If collections no longer fit into memory after scaling down, the service might be impacted. In that case, the only solution is to either drop the collection or scale back up. Even if the service does not crash, collections that were previously loaded but now exceed the available memory might encounter issues.
      Note: A scaling operation introduces a service delay of 5 to 10 minutes. Ongoing operations might be disrupted during scaling transitions.
    • If the schema of the collection changes (the number of fields in the collection increases, the size of a varchar field increases beyond 256 characters, or multiple vector fields are added to the collection), the number of records that can be stored might decrease.
    • For storages used by Milvus, you must provide the endpoint without trailing slashes. For region-specific storages such as S3, the endpoint must include the region. For example:
      https://s3.<REGION>.amazonaws.com
    Note: The Milvus service can connect to a storage without a catalog. You can perform actions on Milvus even after disabling the storage.
  5. Click Create.
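
The size guidance in step 4 can be sketched as a small lookup helper. This is an illustration only and is not part of watsonx.data. The function name suggest_size is hypothetical, and treating each "recommended for" figure as an upper bound is an assumption. The figures apply to the profile listed for each size (64 index parameters, 1024 segment size, and 1024 dimensions for the predefined sizes).

```python
# Hypothetical helper: maps an expected vector count to one of the sizes
# described in step 4. The thresholds are copied from the size descriptions;
# treating them as upper bounds is an assumption made for this sketch.
PREDEFINED_SIZES = [
    ("Starter", 1_000_000),
    ("Small", 10_000_000),
    ("Medium", 50_000_000),
    ("Large", 100_000_000),
]

# Upper bounds listed for the Custom size, by index type.
CUSTOM_LIMITS = {
    "IVF_SQ8": 3_000_000_000,
    "IVF_FLAT": 1_300_000_000,
    "HNSW": 1_000_000_000,
}


def suggest_size(vector_count: int, index_type: str = "HNSW") -> str:
    """Return the smallest size whose listed vector count covers vector_count."""
    if vector_count < 0:
        raise ValueError("vector_count must not be negative")

    # Try the predefined sizes from smallest to largest.
    for name, limit in PREDEFINED_SIZES:
        if vector_count <= limit:
            return name

    # Beyond Large, only Custom applies, and its limit depends on the index type.
    limit = CUSTOM_LIMITS.get(index_type)
    if limit is None:
        raise ValueError(f"unknown index type: {index_type}")
    if vector_count <= limit:
        return "Custom"
    raise ValueError(
        f"{vector_count} vectors exceeds the listed limit for {index_type}"
    )
```

For example, suggest_size(20_000_000) returns "Medium", and suggest_size(2_000_000_000, "IVF_SQ8") returns "Custom". Because the actual capacity also depends on the schema and the vCPU configuration, treat the result as a starting point.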

    Related API: For information on related API, see
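
The storage guidance in step 4 (endpoints without trailing slashes, and a unique root path for each Milvus instance that shares a bucket) can be checked before you create the service. The following is a minimal sketch, not a watsonx.data API: the function names normalize_endpoint and find_shared_root_paths and the input layout are hypothetical. It does not verify that a region is present in the endpoint, because that depends on the storage provider.

```python
from typing import Dict, List, Tuple
from urllib.parse import urlparse


def normalize_endpoint(endpoint: str) -> str:
    """Strip surrounding whitespace and trailing slashes from an endpoint URL."""
    cleaned = endpoint.strip().rstrip("/")
    parsed = urlparse(cleaned)
    # Require an explicit scheme and host, for example https://s3.<REGION>.amazonaws.com
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not a valid http(s) endpoint: {endpoint!r}")
    return cleaned


def find_shared_root_paths(
    instances: Dict[str, Tuple[str, str]]
) -> Dict[Tuple[str, str], List[str]]:
    """Find Milvus instances that share the same root path in the same bucket.

    `instances` maps a (hypothetical) instance name to (bucket, root_path).
    The result maps each conflicting (bucket, root_path) pair to the names
    of the instances that use it.
    """
    seen: Dict[Tuple[str, str], List[str]] = {}
    for name, (bucket, root_path) in instances.items():
        # Ignore leading and trailing slashes so "files" and "files/" match.
        key = (bucket, root_path.strip("/"))
        seen.setdefault(key, []).append(name)
    return {key: names for key, names in seen.items() if len(names) > 1}
```

An empty result from find_shared_root_paths means that no two instances in the input share a bucket and root path. Instances that use different root paths within the same bucket are not reported.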