PowerAI Vision Inference Server

With a PowerAI Vision Inference server, you can quickly and easily deploy multiple trained models to a single server. These models are portable and can be used by many users and on different systems. This allows you to make trained models available to others, such as customers or collaborators.

Hardware requirements

  • The amount of GPU memory required depends on the type of model you deploy.
    Note: When initially deployed, a model uses less memory than it requires when inferences are run against it. The memory figures below are based on the memory used at inference time; follow them to avoid out-of-memory errors.
    • A classification model (GoogLeNet) requires about 750 MB GPU memory. For example, for a system with 16 GB memory GPUs, 19-20 image classification models can be deployed.
    • An object detection model (Faster R-CNN) requires about 2 GB GPU memory. For example, for a system with 16 GB memory GPUs, 8 object detection models can be deployed.
    • An object detection model (tiny YOLO V2) requires about 750 MB GPU memory. For example, for a system with 16 GB memory GPUs, 19-20 of these object detection models can be deployed.
    • A custom classification model based on TensorFlow will take all memory available on a GPU. However, you can deploy it to a GPU that has at least 2 GB memory.
    • A custom object detection model based on TensorFlow will take all memory available on a GPU. However, you can deploy it to a GPU that has at least 2 GB memory.
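
Before deploying, you can check how much memory is free on each GPU to decide where a model fits. A minimal sketch using the standard nvidia-smi query options; the 750 MB cutoff is only an illustration taken from the classification model figure above.

# Show per-GPU memory so you can pick a GPU with enough free memory.
nvidia-smi --query-gpu=index,name,memory.total,memory.used,memory.free --format=csv

# Print each GPU with at least 750 MiB free (illustrative cutoff; roughly
# one GoogLeNet classification model, per the figures above).
nvidia-smi --query-gpu=index,memory.free --format=csv,noheader,nounits |
  awk -F', ' '$2 >= 750 {print "GPU " $1 ": " $2 " MiB free"}'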

Software requirements

Linux
  • Red Hat Enterprise Linux (RHEL) 7.5 (little endian).
  • Ubuntu 16.04 or later.
NVIDIA CUDA
Docker
  • RHEL - Docker Version 1.13, or later, which is the version of Docker that is installed with RHEL 7.5.
  • Ubuntu - Docker CE 18.06.01
  • nvidia-docker2 is supported when running Docker. For support of nvidia-docker2 on Docker CE, see Using nvidia-docker2 with PowerAI Vision Inference Server.
Unzip
The unzip package is required on the system to deploy the zipped models.
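
Before installing, you can verify the prerequisites with a quick check such as the following. This is a sketch; the nvidia-docker2 check applies to Ubuntu systems, and exact package names can vary by distribution.

# Verify the software prerequisites on the target system.
docker --version                  # Docker 1.13+ (RHEL) or Docker CE 18.06.01 (Ubuntu)
nvidia-smi                        # confirms the NVIDIA CUDA driver is working
command -v unzip >/dev/null || echo "unzip is missing - install it before deploying models"

# On Ubuntu, check that nvidia-docker2 is installed:
dpkg -s nvidia-docker2 >/dev/null 2>&1 || echo "nvidia-docker2 not found"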

Installing

  1. Download the install files by using one of these methods:
    • Download the product tar file from the IBM Passport Advantage website.
    • Download the product tar.gz file from Advanced Administration System (AAS). This system is also called Entitled Software Support (ESS).
  2. Run the appropriate command to install the product, depending on the platform you are installing on. There are RPM files for installation on RHEL (x86 and ppc64le) and DEB files for installation on Ubuntu (amd64 and ppc64le).
    RHEL
    rpm -i file_name.rpm
    Ubuntu
    dpkg -i file_name.deb
  3. Load the product Docker images with the appropriate container's tar file. The file name has this format: powerai-vision-inference-<arch>-containers-<release>.tar, where <arch> is x86 or ppc64le, and <release> is the product version being installed.
    /opt/powerai-vision/dnn-deploy-service/bin/load_images.sh -f <tar_file>

PowerAI Vision Inference Server will be installed at /opt/powerai-vision/dnn-deploy-service.
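
As a quick sanity check after installation, you can confirm that the scripts are in place and that the Docker images loaded. The grep pattern below is only an assumption; the exact image names vary by release.

# Confirm the deployment scripts were installed.
ls /opt/powerai-vision/dnn-deploy-service/bin/

# Confirm the product Docker images were loaded (image names vary by release).
docker images | grep -i vision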

Deploying a trained model

The following types of models can be deployed: object detection models using Faster R-CNN (default), tiny YOLO V2, or a custom TensorFlow model, and image classification models using GoogLeNet (default) or a custom TensorFlow model. To deploy a model, run this command:

/opt/powerai-vision/dnn-deploy-service/bin/deploy_zip_model.sh
Notes:
  • The first time you run this command, you are prompted to accept the license agreement.
  • On a RHEL system with SELinux enabled (default), the loaded model files must have an appropriate SELinux context to be loaded into a container. To ensure a model has the proper context, run:
    sudo chcon -t svirt_sandbox_file_t <model_path>
Usage:
./deploy_zip_model.sh -m <model-name> -p <port> -g <gpu> zipped_model_file
model-name
The docker container name for the deployed model.
port
The port to deploy the model to.
gpu
Optional: The GPU to deploy the model to. If specified as -1, the model will be deployed to CPU.
zipped_model_file
The full path and file name of the trained model that was exported from PowerAI Vision. It can be an image classification model or an object detection model, but must be in zip format.

Examples:

/opt/powerai-vision/dnn-deploy-service/bin/deploy_zip_model.sh --model dog --port 6001 --gpu 1 ./dog_classification.zip
/opt/powerai-vision/dnn-deploy-service/bin/deploy_zip_model.sh -m face -p 6002 /home/user/mydata/face.zip
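
If you need to make several models available at once, the deployment can be scripted. A sketch under assumed names: the zip files under /home/user/models are hypothetical, each model gets a sequential port starting at 6001, and everything is placed on GPU 0. Adjust ports and GPU assignments to match the memory figures above.

# Deploy every exported model zip in a directory, one port per model.
# /home/user/models and the starting port are hypothetical choices.
port=6001
for zip in /home/user/models/*.zip; do
  name=$(basename "$zip" .zip)
  sudo chcon -t svirt_sandbox_file_t "$zip"   # needed on RHEL with SELinux enabled
  /opt/powerai-vision/dnn-deploy-service/bin/deploy_zip_model.sh \
    -m "$name" -p "$port" -g 0 "$zip"
  port=$((port + 1))
done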

Inference

Inference can be done by using the deployed model with a local file or an image URL.

Example 1 - Classification:

curl -F "imagefile=@/home/testdata/cocker-spaniel-dogs-puppies-1.jpg" http://localhost:6001/inference

Example 2 - Object detection:

curl -G -d "imageurl=https://assets.imgix.net/examples/couple.jpg" http://localhost:6002/inference
Example 3 - Object detection with a tiny YOLO model and a confidence threshold:
curl -F "imagefile=@/home/testdata/Chihuahua.jpeg" -F "confthre=0.8" http://localhost:6001/inference
Note: The confidence threshold works for Faster R-CNN and tiny YOLO object detection models and for GoogLeNet image classification models. It is a value in the range 0.0 - 1.0, treated as a percentage. Only results with a confidence greater than the specified threshold are returned. The lower the threshold, the more results are returned; a threshold of 0 returns every result, because no filtering is done based on the model's confidence. The default confidence threshold is 0.5.
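
To run inference over a batch of images, you can loop over a directory and collect the responses. A minimal sketch, assuming a model deployed on port 6001; /home/testdata and results.jsonl are hypothetical names.

# Send every JPEG in a directory to the model deployed on port 6001.
for img in /home/testdata/*.jpg; do
  curl -s -F "imagefile=@$img" -F "confthre=0.5" \
    http://localhost:6001/inference >> results.jsonl
  echo >> results.jsonl   # keep one JSON response per line
done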

Output

The PowerAI Vision Inference Server can deploy both image classification models and object detection models.
Image classification model
A successful classification will report something similar to the following:
Example 1 output - success
{"classified": {"Cocker Spaniel": 0.93}, "result": "success"}
The image has been classified as a Cocker Spaniel with a confidence of 0.93.
Example 1 output - fail
{"result": "fail"}
The image could not be classified. This might happen if the image could not be loaded, for example.
Object detection model
A successful detection will report something similar to the following:
Example 2 output - success
{"classified": [{"confidence": 0.94, "ymax": 335, "label": "face", "xmax": 576, 
                  "xmin": 424, "ymin": 160, "attr": []}], "result": "success"}
The faces in the image are located at the specified coordinates. The confidence of each label is given.
Example 2 output - success
{"classified": [], "result": "success"}
Object detection was carried out successfully, but there was nothing to be labeled that has confidence above the threshold.
Example 2 output - fail
{"result": "fail"}
Objects could not be detected. This might happen if the image could not be loaded, for example.
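
The JSON responses are straightforward to consume in scripts. A minimal sketch for an object detection response, assuming the jq utility is installed, that checks the result field and prints each detected label with its confidence:

# Query a detection model and parse the response; assumes jq is installed.
response=$(curl -s -F "imagefile=@/home/testdata/Chihuahua.jpeg" http://localhost:6002/inference)

if [ "$(echo "$response" | jq -r '.result')" = "success" ]; then
  # Print one "label confidence" pair per detected object.
  echo "$response" | jq -r '.classified[] | "\(.label) \(.confidence)"'
else
  echo "inference failed" >&2
fi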

Stopping a deployed model

To stop the deployed model, run the following command. When you stop the deployed model, the GPU memory is made available.
docker stop <model-name>; docker rm <model-name>
Example 1:
docker stop dog; docker rm dog
Example 2:
docker stop face; docker rm face
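
If you deployed several models with a script, they can be stopped the same way. A minimal sketch; the container names are hypothetical, so substitute the names you passed with -m when deploying.

# Stop and remove a set of deployed models by container name.
for name in dog face; do
  docker stop "$name" && docker rm "$name"
done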