Getting started with PyTorch and TensorRT

WML CE 1.6.1 includes a Technology Preview of TensorRT. TensorRT is a C++ library provided by NVIDIA that focuses on running pre-trained networks quickly and efficiently for inference. Full technical details on TensorRT can be found in the NVIDIA TensorRT Developer Guide.

Installing TensorRT

Support for TensorRT in PyTorch is enabled by default in WML CE 1.6.1; therefore, TensorRT is installed as a prerequisite when PyTorch is installed.

For detailed instructions about how to install PyTorch see Installing the MLDL frameworks.

TensorRT is also available as a standalone package in WML CE; however, those installation details are not covered in this section.

Validate PyTorch and TensorRT installation

You can validate the installation of TensorRT alongside PyTorch, Caffe2, and ONNX by running the following commands from your Python 3 environment:

python
Python 3.6.7 |Anaconda, Inc.| (default, Oct 23 2018, 19:29:21)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> import caffe2
>>> import tensorrt
>>>
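The interactive check above can also be scripted. The following sketch (the helper name check_imports is our own, not part of WML CE) attempts each import and reports which frameworks are available in the current environment:

```python
import importlib

def check_imports(names):
    """Try to import each named module; report 'ok' or 'missing'."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = "ok"
        except ImportError:
            status[name] = "missing"
    return status

print(check_imports(["torch", "caffe2", "tensorrt"]))
```

A result other than "ok" for any of the three names indicates the installation should be revisited.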

If desired, extended validation of the Caffe2, ONNX, and TensorRT features found in PyTorch can be run using the caffe2-test script.

The extended tests can be executed as follows, from your Python 3 environment:

caffe2-test -t trt/test_trt.py

The tests will take a few minutes to complete.

Note: During the technology preview, several tests are skipped; however, all other tests are expected to pass.

Code Samples for TensorRT

Sample code provided by NVIDIA can be installed as a separate package in WML CE 1.6.1.

Installing TensorRT sample code

Install the TensorRT samples into the same virtual environment as PyTorch. From your Python 3 environment:

conda install tensorrt-samples

Install a compatible compiler into the virtual environment. From your Python 3 environment:

conda install gxx_linux-ppc64le=7  # on Power

OR

conda install gxx_linux-64=7       # on x86

If you plan to run the Python sample code, you also need to install PyCUDA. From your Python 3 environment:

pip install pycuda
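
After installing PyCUDA, it can be useful to confirm that it can see a GPU before running the Python samples. The sketch below (the helper name pycuda_status is our own) degrades gracefully when PyCUDA or a CUDA driver is absent:

```python
def pycuda_status():
    """Report whether PyCUDA is installed and how many GPUs it can see."""
    try:
        import pycuda.driver as cuda
    except ImportError:
        return "pycuda not installed"
    try:
        cuda.init()  # initialize the CUDA driver API
        return "%d CUDA device(s) visible" % cuda.Device.count()
    except Exception as exc:  # broad catch: driver/init failures vary
        return "CUDA initialization failed: %s" % exc

print(pycuda_status())
```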

After installation of the samples has completed, you will find an assortment of C++ and Python-based samples in the $CONDA_PREFIX/samples/tensorrt directory.

C++ Samples

Every C++ sample includes a README.md file. Refer to the $CONDA_PREFIX/samples/tensorrt/<sample-name>/README.md file for detailed information about how the sample works, sample code, and step-by-step instructions about how to run and verify its output.

In addition to the readme files, an online description of the C++ samples can be found on the NVIDIA website.

Find additional information for working with the C++ API at Working With TensorRT Using The C++ API.

Python Samples

Every Python sample includes a README.md file. Refer to the $CONDA_PREFIX/samples/tensorrt/python/<sample-name>/README.md file for detailed information about how the sample works, sample code, and step-by-step instructions on how to run and verify its output.

In addition to the readme files, an online description of the Python samples can be found on the NVIDIA website.

Find additional information for working with the Python API at Using The Python API.
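
As a quick sanity check before working through the Python samples, the Python API can be exercised with a minimal smoke test. This sketch (the helper name tensorrt_smoke_check is our own) builds a Logger and a Builder, the usual entry points of the TensorRT Python API, and returns the library version when available:

```python
def tensorrt_smoke_check():
    """Return the TensorRT version if the Python API is usable, else None."""
    try:
        import tensorrt as trt
    except ImportError:
        return None
    logger = trt.Logger(trt.Logger.WARNING)  # collects TensorRT diagnostics
    builder = trt.Builder(logger)            # entry point for building engines
    return trt.__version__

print(tensorrt_smoke_check())
```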

Known Issues

TensorRT support is being provided as a Technology Preview. Some features may be disabled or still a work in progress.

The WML CE team is aware of the following issues:

Due to a compiler mismatch between the NVIDIA-supplied TensorRT ONNX Python bindings and the compiler used to build the WML CE version of the ONNX package, attempting to import both tensorrt and onnx in the same Python environment results in a segmentation fault. For example:

(my-py3-env) $ python
Python 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:34:02) 
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import onnx 
>>> import tensorrt
Segmentation fault (core dumped)
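
Because this clash crashes the interpreter outright, a safe way to probe whether the issue affects your environment is to attempt the imports in a child process and inspect its exit status (a segfault shows up as a nonzero or negative returncode). The helper name imports_coexist below is our own:

```python
import subprocess
import sys

def imports_coexist(*modules):
    """Try importing the given modules in a child interpreter.

    Returns True only if the child exits cleanly; a segfault in the
    child cannot take down the parent process this way.
    """
    code = ";".join("import %s" % m for m in modules)
    proc = subprocess.run([sys.executable, "-c", code])
    return proc.returncode == 0

# While the issue persists, imports_coexist("onnx", "tensorrt") is
# expected to return False in a WML CE 1.6.1 environment.
```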

Python Samples:

fc_plugin_caffe_mnist
Due to a compiler mismatch between the NVIDIA-supplied TensorRT ONNX Python bindings and the compiler used to build the fc_plugin example code, a segfault occurs when attempting to execute the example.
yolov3_onnx
This example currently fails to execute properly: the example code imports both the onnx and tensorrt modules, resulting in a segfault. The WML CE team is working with NVIDIA to resolve the issue.

C++ Samples:

In order to compile the C++ sample code for use with PyTorch, a couple of changes are required.

The current version of Makefile.conf may not find the correct compiler.

To work around this problem, modify $CONDA_PREFIX/samples/tensorrt/Makefile.conf so that it finds the compiler correctly, as follows:

else ifeq ($(TARGET), ppc64le)
CUDA_LIBDIR=lib
CUDNN_LIBDIR=lib
GCC_PREFIX=$(CONDA_PREFIX)/bin/powerpc64le-conda_cos7-linux-gnu-
CC = $(GCC_PREFIX)g++
CUCC = $(CUDA_INSTALL_DIR)/bin/nvcc -m64
else ifeq ($(TARGET), qnx)
sampleFasterRCNN
This example cannot be executed because a required file (VGG16_faster_rcnn_final.caffemodel) is missing.
sampleSSD
This example cannot be executed because a required file (ssd.prototxt) is missing.
sampleMovielens
There is an issue with the data file location. To work around the issue, execute the following commands from your Python 3 environment:
mkdir -p $CONDA_PREFIX/samples/bin/data/samples/movielens/
cp $CONDA_PREFIX/data/tensorrt/movielens/* $CONDA_PREFIX/samples/bin/data/samples/movielens/
cd $CONDA_PREFIX/samples/bin
./sample_movielens
sampleMLP
There is an issue with the data file location. To work around the issue, execute the following commands from your Python 3 environment:
cp $CONDA_PREFIX/data/tensorrt/mnist/*.pgm $CONDA_PREFIX/data/tensorrt/mlp/
cd $CONDA_PREFIX/samples/bin
./sample_mlp -d $CONDA_PREFIX/data/tensorrt/mlp