Prerequisites
- Red Hat Enterprise Linux Server release 7.5 on the POWER9 or x86_64 architecture.
- At least one NVIDIA CUDA-capable GPU with at least 8 GB of GPU memory on x86_64 or at least 10 GB of GPU memory on POWER9.
- NVIDIA driver version 396.xx or later (the driver must be compatible with CUDA 9.2).
- Docker 17.12.0-ce or later.
- NVIDIA Container Runtime for Docker (nvidia-docker) 2.0.3 or later.
- Docker Compose 1.21.2 or later.
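Before installing the other prerequisites, you can confirm that the host matches the operating system and architecture requirements above. This is an optional check using standard RHEL commands, not part of the DLE tooling:
cat /etc/redhat-release
uname -m
The first command should report release 7.5, and the second should print x86_64 or ppc64le (POWER9).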
Install the prerequisites
- NVIDIA GPU driver
a) Go to http://www.nvidia.com/Download/index.aspx
b) Select the options that match your GPU and operating system (if prompted, select 9.2 for CUDA toolkit).
c) Install the driver.
d) Check that the driver installation was successful by typing:
nvidia-smi
You should see output similar to the output below:
Fri Jun 22 13:51:51 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.26                 Driver Version: 396.26                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000004:04:00.0 Off |                    0 |
| N/A   30C    P0    38W / 191W |      0MiB / 15360MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-SXM2...  On   | 00000004:05:00.0 Off |                    0 |
| N/A   33C    P0    35W / 191W |      0MiB / 15360MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla V100-SXM2...  On   | 00000035:03:00.0 Off |                    0 |
| N/A   28C    P0    36W / 191W |      0MiB / 15360MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla V100-SXM2...  On   | 00000035:04:00.0 Off |                    0 |
| N/A   35C    P0    37W / 191W |      0MiB / 15360MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
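If you prefer a compact check against the prerequisites above (driver 396.xx or later, and 8 GB or 10 GB of GPU memory), nvidia-smi also supports a query mode. This is an optional sketch, not part of the original instructions:
nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv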
- Docker and NVIDIA Container Runtime for Docker (nvidia-docker)
a) Install Docker:
See https://docs.docker.com/install/
b) Install nvidia-docker:
See https://github.com/NVIDIA/nvidia-docker
c) Verify the installation:
$ nvidia-docker version
NVIDIA Docker: 2.0.3
Client:
 Version:       17.12.0-ce
 API version:   1.35
 Go version:    go1.9.2
 Git commit:    52b8a7c
 Built:         Wed Jan 17 16:35:55 2018
 OS/Arch:       linux/ppc64le

Server:
 Engine:
  Version:      17.12.0-ce
  API version:  1.35 (minimum version 1.12)
  Go version:   go1.9.2
  Git commit:   52b8a7c
  Built:        Wed Jan 17 16:42:23 2018
  OS/Arch:      linux/ppc64le
  Experimental: false
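As an optional extra check (not part of the original steps), you can confirm that the nvidia runtime is registered with Docker and that containers can see the GPUs. The CUDA image name below is an assumption; on POWER9, NVIDIA publishes the CUDA images under a separate ppc64le repository:
docker info | grep -i runtimes
docker run --runtime=nvidia --rm nvidia/cuda:9.2-base nvidia-smi
The second command should print the same nvidia-smi table shown earlier, this time from inside a container.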
- Docker Compose
a) Install Docker Compose:
See https://docs.docker.com/compose/install/#install-compose
b) Verify the installation:
$ docker-compose version
docker-compose version 1.21.2, build a133471
docker-py version: 3.4.0
CPython version: 2.7.5
OpenSSL version: OpenSSL 1.0.2k-fips 26 Jan 2017
Unpack the DLE installer archive and start the DLE service
Note: You might need to run some of the following commands with sudo.
- Obtain the installer archive file for your computer architecture.
- Select a directory on your DLE host to contain the DLE and copy the installer archive file to this location. For example, /opt/ibm (hereafter referred to as $DLE_HOME).
- Change to the $DLE_HOME directory and extract the installer archive.
cd $DLE_HOME
tar -xvzf 2.5.0.0-IVA-DLE-Cntr-$(COMPUTER_ARCHITECTURE).tar.gz
- Change to the ivadle subdirectory and start the DLE. The first time you run this command, you will be prompted to accept the license agreement, and the command will take longer because the Docker images are loaded.
cd ivadle
./start.sh
- If you are deploying the DLE on a POWER9 system with V100 GPUs, you can instead run a customized service that allocates additional instances of the person parts detection models:
./start.sh --service_file=docker-compose-ppc64le-v100.yml
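If you are unsure which service definition files shipped with your installer, you can list them from the ivadle directory before choosing one. This is an optional sketch; the exact file names depend on your release:
ls docker-compose*.yml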
Verify the installation
- Check that the service is running by typing:
docker ps
The output should look similar to the output below (on POWER9, the image names will be slightly different):
CONTAINER ID   IMAGE                          COMMAND                  CREATED          STATUS          PORTS                              NAMES
215b4c17ca39   nginx                          "nginx -g 'daemon of…"   11 seconds ago   Up 6 seconds    80/tcp, 0.0.0.0:14001->14001/tcp   ivadle_nginx_1
af8f44411d58   frapps_x86_64                  "/bin/sh -c \"frapps\""    15 seconds ago   Up 11 seconds   18003/tcp                          ivadle_align-extract_1
5e2cedc5e3bc   frapps_x86_64                  "/bin/sh -c \"frapps\""    15 seconds ago   Up 11 seconds   18002/tcp                          ivadle_detect-align-extract_1
356570d85482   dlie_caffe_x86_64_gpu:latest   "/bin/sh -c \"dlie-se…"   26 seconds ago   Up 15 seconds   16004/tcp                          ivadle_torso-pattern_1
7e8112e51d9a   dlie_caffe_x86_64_gpu:latest   "/bin/sh -c \"dlie-se…"   26 seconds ago   Up 15 seconds   16003/tcp                          ivadle_face-combined_1
db4e635e1aed   dlie_caffe_x86_64_gpu:latest   "/bin/sh -c \"dlie-se…"   26 seconds ago   Up 15 seconds   15001/tcp                          ivadle_person-parts_1
465cd1a3c061   dlie_caffe_x86_64_gpu:latest   "/bin/sh -c \"dlie-se…"   26 seconds ago   Up 15 seconds   16005/tcp                          ivadle_whole-body-backpack_1
4c69856305bc   dlcomp_x86_64                  "/bin/sh -c \"dlcomp\""    26 seconds ago   Up 16 seconds   17001/tcp                          ivadle_face-comp_1
ea0a02537da2   dlie_caffe_x86_64_gpu:latest   "/bin/sh -c \"dlie-se…"   26 seconds ago   Up 15 seconds   16001/tcp                          ivadle_face-gender_1
5f69a893ded9   dlie_caffe_x86_64_gpu:latest   "/bin/sh -c \"dlie-se…"   26 seconds ago   Up 15 seconds   16002/tcp                          ivadle_face-age_1
c5b2c266e3b1   dlie_caffe_x86_64_gpu:latest   "/bin/sh -c \"dlie-se…"   26 seconds ago   Up 15 seconds   18001/tcp                          ivadle_face-recognition_1
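If you want a quick confirmation without reading the whole table, you can count the DLE containers; the default x86_64 service shown above starts eleven (other service files may start a different number):
docker ps --filter "name=ivadle" --format '{{.Names}}' | wc -l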
- Run the "check" script. This will feed test images to the DLE and compare the output with the expected output. If the difference is not within a defined margin of error, the tests will fail.
bin/check-default-service.sh
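If the check script reports failures, the logs of the individual containers are usually the quickest place to look. For example, using one of the container names from the docker ps output above:
docker logs ivadle_face-recognition_1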
Uninstall the DLE
If you would like to uninstall the DLE, complete the following steps.
- Change to the $DLE_HOME directory. As indicated above, $DLE_HOME is one level above the ivadle directory, which contains the DLE installation files.
cd $DLE_HOME
- Run the uninstall script.
If you only want to delete the DLE Docker containers and images:
ivadle/uninstall.sh
If you want to delete the DLE Docker containers and images as well as all of the DLE installation files:
ivadle/uninstall.sh --full
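To confirm that the cleanup completed, you can check that no DLE containers or images remain. The image-name patterns below are taken from the x86_64 docker ps output in the verification section and are slightly different on POWER9:
docker ps -a | grep ivadle
docker images | grep -E 'dlie|frapps|dlcomp'
Both commands should produce no output after a successful uninstall.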
Deep Learning Orchestrator (DLO) Command Line Interface (CLI)
The CLI is an executable file named svc-cfg, located in the bin subdirectory, that allows you to add or remove models from the DLE configuration files. For usage instructions, see Deep Learning Orchestrator (DLO) Command Line Interface (CLI) Usage Instructions.