Deploying AI services with code

You can build a custom AI service from the ground up that is tailored to your generative AI application. For example, if you are deploying an asset that uses retrieval-augmented generation (RAG), you can capture the logic for retrieving answers from the grounding documents in the AI service.

Methods for deploying AI services with code

You can use the following methods for coding and deploying your AI services:

  1. Coding and deploying AI services manually

    You can create a notebook in your project that contains the AI service and its connections. The AI service captures the logic of your RAG application and contains the generation function, which is a deployable unit of code. The generation function is promoted to a deployment space, where it is used to create a deployment. The deployment is exposed as a REST API endpoint that other applications can access. To use the deployed AI service for inferencing, you send a request to the REST API endpoint. The deployed AI service processes the request and returns a response.

    For more information, see Coding and deploying AI services manually.

  2. Coding and deploying AI services with templates

    You can use predefined templates to deploy your AI services in watsonx.ai. AI service templates provide a pre-built foundation for AI applications, so developers can focus on the core logic of their application rather than starting from scratch. Because a template supplies a predefined structure, configuration, and set of tools, it simplifies deployment, reduces the risk of errors, and makes AI development and deployment more consistent.

    For more information, see Coding and deploying AI services with templates.
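The manual method describes an AI service that wraps the RAG logic and contains a generation function. The following is a minimal sketch of that shape, not the watsonx.ai implementation: the names `FakeContext`, `ai_service`, `retrieve`, and the `get_json` method are assumptions made for illustration, the retrieval is a toy word-overlap match, and no foundation model is called.

```python
from typing import Callable


class FakeContext:
    """Hypothetical stand-in for the request context a runtime would pass in."""

    def __init__(self, payload: dict):
        self._payload = payload

    def get_json(self) -> dict:
        return self._payload


def ai_service(context, documents: list[str]) -> Callable:
    """Outer function: runs once and captures setup such as grounding documents."""

    def retrieve(question: str) -> str:
        # Toy retrieval: pick the document that shares the most words with the question.
        words = set(question.lower().split())
        return max(documents, key=lambda d: len(words & set(d.lower().split())))

    def generate(context) -> dict:
        """Generation function: the unit of code that handles each request."""
        question = context.get_json()["question"]
        passage = retrieve(question)
        # A real service would send the passage and question to a foundation model here.
        return {"body": {"question": question, "grounding": passage}}

    return generate


docs = [
    "Deployment spaces hold promoted assets.",
    "Templates give a pre-built project structure.",
]
generate = ai_service(FakeContext({}), documents=docs)
response = generate(FakeContext({"question": "What do templates give?"}))
print(response["body"]["grounding"])
```

Keeping the retrieval logic inside the outer function means the generation function carries everything it needs when it is deployed, which is the idea behind treating it as a single deployable unit.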

Choosing the right method for deployment

There are two approaches to deploying AI services with code: coding manually and using developer templates. Each approach has trade-offs, and the right choice depends on the requirements of your project. Developer templates suit simple deployments that need little customization, while manual coding suits complex deployments that need extensive customization.

The following table compares the two approaches for deploying AI services with code:

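Key differences

| Approach            | Ease of use | Customization | Scalability | Time-to-market |
|---------------------|-------------|---------------|-------------|----------------|
| Manual coding       | Difficult   | Full          | High        | Slow           |
| Developer templates | Easy        | Limited       | Limited     | Fast           |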
