Inference

Using the elastic distributed inference capability in WML Accelerator, users can create inference services and deploy published models.

Elastic distributed inference is a secure, robust, and scalable inference service that exposes the WML Accelerator REST API so that users can publish and manage inference services, REST clients can consume those services, and administrators can manage the service as a whole. Any authorized client can send inference requests to a running service, as in the sketch that follows.
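For example, a REST client might consume a running inference service roughly as shown below. This is a minimal sketch only: the base URL, port, endpoint path, service name, and payload shape are illustrative assumptions, not the documented WML Accelerator REST API, so consult the product's REST API reference for the actual paths and schemas.

```python
# Minimal sketch of a REST client consuming an inference service.
# The base URL, endpoint path, service name, and payload fields are
# hypothetical; check the WML Accelerator REST API reference for the
# real values.
import requests

BASE_URL = "https://wmla-host:9243/dlim/v1"   # hypothetical base URL
SERVICE_NAME = "resnet50-classifier"          # hypothetical service name

def run_inference(token: str, payload: dict) -> dict:
    """POST an inference request to a running service and return the result."""
    resp = requests.post(
        f"{BASE_URL}/inference/{SERVICE_NAME}",        # hypothetical endpoint
        json=payload,
        headers={"Authorization": f"Bearer {token}"},  # authorized clients only
        verify=False,   # or the path to the cluster's CA certificate
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    result = run_inference(token="<auth-token>",
                           payload={"data": [[0.1, 0.2, 0.3]]})
    print(result)
```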

The elastic distributed inference feature can host models for all projects, or for each line of business. Developers develop and publish models for a specific project; a published model can then run as an inference service, which can be started and stopped as needed (see the sketch after this paragraph).
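To illustrate that lifecycle, the following sketch starts and stops a service through the management REST API. As above, the endpoint paths and service name are assumptions made for this example rather than the product's documented API.

```python
# Minimal sketch of starting and stopping an inference service through
# the management REST API. Endpoint paths and the service name are
# hypothetical; the real WML Accelerator paths may differ.
import requests

BASE_URL = "https://wmla-host:9243/dlim/v1"   # hypothetical base URL

def set_service_state(token: str, service: str, action: str) -> None:
    """Start or stop a published model's inference service."""
    assert action in ("start", "stop")
    resp = requests.put(
        f"{BASE_URL}/services/{service}/{action}",     # hypothetical endpoint
        headers={"Authorization": f"Bearer {token}"},
        verify=False,   # or the path to the cluster's CA certificate
        timeout=30,
    )
    resp.raise_for_status()

# Example usage: start the service, then stop it later.
# set_service_state("<auth-token>", "resnet50-classifier", "start")
# set_service_state("<auth-token>", "resnet50-classifier", "stop")
```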