Getting started with PyTorch and TensorRT
WML CE 1.6.1 includes a Technology Preview of TensorRT. TensorRT is a C++ library provided by NVIDIA that focuses on running pre-trained networks quickly and efficiently for inference. Full technical details on TensorRT can be found in the NVIDIA TensorRT Developer Guide.
Installing TensorRT
Support for TensorRT in PyTorch is enabled by default in WML CE 1.6.1; therefore, TensorRT is installed as a dependency when PyTorch is installed.
For detailed instructions about how to install PyTorch see Installing the MLDL frameworks.
TensorRT is also available as a standalone package in WML CE; however, those installation details are not covered in this section.
Validate PyTorch and TensorRT installation
You can validate the installation of TensorRT alongside PyTorch, Caffe2, and ONNX by running the following commands from your Python 3 environment:
python
Python 3.6.7 |Anaconda, Inc.| (default, Oct 23 2018, 19:29:21)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> import caffe2
>>> import tensorrt
>>>
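The interactive check above can also be scripted. The following standard-library sketch (the helper name `check_modules` is ours, not part of WML CE) reports which of the expected modules can be found without actually importing them:

```python
import importlib.util

def check_modules(names):
    """Map each top-level module name to whether it is installed."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

# Modules expected after installing PyTorch with TensorRT support in WML CE
for name, found in check_modules(["torch", "caffe2", "tensorrt"]).items():
    print(f"{name}: {'found' if found else 'NOT FOUND'}")
```

Because `find_spec` only locates a module rather than importing it, this check avoids triggering the import-order segfault described under Known Issues below.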
If desired, extended validation of the Caffe2, ONNX, and TensorRT features found in PyTorch can be performed using the caffe2-test script.
The extended tests can be executed as follows, from your Python 3 environment:
caffe2-test -t trt/test_trt.py
The tests will take a few minutes to complete.
Code Samples for TensorRT
Sample code provided by NVIDIA can be installed as a separate package in WML CE 1.6.1.
Installing TensorRT sample code
Install the TensorRT samples into the same virtual environment as PyTorch. From your Python 3 environment:
conda install tensorrt-samples
Install a compatible compiler into the virtual environment. From your Python 3 environment:
conda install gxx_linux-ppc64le=7 # on Power
OR
conda install gxx_linux-64=7 # on x86
If you plan to run the Python sample code, you also need to install PyCUDA. From your Python 3 environment:
pip install pycuda
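After installing PyCUDA, a quick sanity check can confirm that it can see your GPUs. This is a small sketch, not part of the samples; the helper name `cuda_device_names` is ours, and the import is guarded so the script degrades gracefully on a machine without PyCUDA or a CUDA driver:

```python
def cuda_device_names():
    """Return the names of visible CUDA devices, or [] if PyCUDA/CUDA is absent."""
    try:
        import pycuda.driver as cuda  # assumed installed via `pip install pycuda`
        cuda.init()
    except Exception:                 # PyCUDA missing or no usable CUDA driver
        return []
    return [cuda.Device(i).name() for i in range(cuda.Device.count())]

print(cuda_device_names())
```

An empty list indicates that PyCUDA could not initialize a CUDA context, in which case the Python samples will not run.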
After installation of the samples has completed, you will find an assortment of C++ and Python-based samples located in the $CONDA_PREFIX/samples/tensorrt directory.
C++ Samples
Every C++ sample includes a README.md file. Refer to the $CONDA_PREFIX/samples/tensorrt/<sample-name>/README.md file for detailed information about how the sample works, sample code, and step-by-step instructions about how to run and verify its output.
In addition to the readme files, an online description of the C++ samples can be found on the NVIDIA website.
Find additional information for working with the C++ API at Working With TensorRT Using The C++ API.
Python Samples
Every Python sample includes a README.md file. Refer to the $CONDA_PREFIX/samples/tensorrt/python/<sample-name>/README.md file for detailed information about how the sample works, sample code, and step-by-step instructions on how to run and verify its output.
In addition to the readme files, an online description of the Python samples can be found on the NVIDIA website.
Find additional information for working with the Python API at Using The Python API.
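As a rough illustration of what the Python API looks like, the sketch below outlines building an engine from an ONNX model using TensorRT 5.x-era calls (trt.Builder, trt.OnnxParser, build_cuda_engine). The model.onnx path is a hypothetical placeholder, the exact API surface varies between TensorRT releases, and the import is guarded so the script runs even where TensorRT is absent; treat the NVIDIA samples and documentation as authoritative:

```python
# Minimal outline of building a TensorRT engine from an ONNX file via the
# Python API. API names follow the TensorRT 5.x-era bindings shipped with
# WML CE 1.6.1; later releases changed some of these calls.
try:
    import tensorrt as trt
    HAVE_TRT = True
except ImportError:          # TensorRT not installed in this environment
    HAVE_TRT = False

def build_engine(onnx_path):
    """Parse an ONNX model and return a built CUDA engine (or None on failure)."""
    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network()
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError("failed to parse ONNX model")
    builder.max_batch_size = 1
    builder.max_workspace_size = 1 << 28   # 256 MiB of build scratch space
    return builder.build_cuda_engine(network)

if HAVE_TRT:
    engine = build_engine("model.onnx")    # hypothetical model file
    print("engine built:", engine is not None)
else:
    print("tensorrt module not available; see the installation steps above")
```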
Known Issues
TensorRT support is being provided as a Technology Preview. Some features may be disabled or still "work in progress".
The WML CE team is aware of the following issues:
Due to a compiler mismatch between the NVIDIA-supplied TensorRT ONNX Python bindings and the compiler used to build the WML CE version of the ONNX package, attempting to import both tensorrt and onnx in a Python environment results in a segfault. For example:
(my-py3-env) $ python
Python 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:34:02)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import onnx
>>> import tensorrt
Segmentation fault (core dumped)
Python Samples:
- fc_plugin_caffe_mnist - Due to a compiler mismatch between the NVIDIA-supplied TensorRT ONNX Python bindings and the compiler used to build the fc_plugin example code, a segfault will occur when attempting to execute the example.
- yolov3_onnx - This example currently fails to execute properly; its code imports both the onnx and tensorrt modules, resulting in a segfault. The WML CE team is working with NVIDIA to resolve the issue.
C++ Samples:
In order to compile the C++ sample code for use with PyTorch, a couple of changes are required. The current version of Makefile.conf may not find the correct compiler. To work around this problem, modify $CONDA_PREFIX/samples/tensorrt/Makefile.conf to correctly find the compiler as follows:
31 else ifeq ($(TARGET), ppc64le)
32 CUDA_LIBDIR=lib
33 CUDNN_LIBDIR=lib
34 GCC_PREFIX=$(CONDA_PREFIX)/bin/powerpc64le-conda_cos7-linux-gnu-
35 CC = $(GCC_PREFIX)g++
36 CUCC = $(CUDA_INSTALL_DIR)/bin/nvcc -m64
37 else ifeq ($(TARGET), qnx)
- sampleFasterRCNN - There are missing files for this example (VGG16_faster_rcnn_final.caffemodel) so it cannot be executed.
- sampleSSD - There are missing files for this example (ssd.prototxt not found) so it cannot be executed.
- sampleMovielens - There is an issue with the data file location; execute the following commands from your Python 3 environment to work around the issue:
mkdir -p $CONDA_PREFIX/samples/bin/data/samples/movielens/
cp $CONDA_PREFIX/data/tensorrt/movielens/* $CONDA_PREFIX/samples/bin/data/samples/movielens/
cd $CONDA_PREFIX/samples/bin
./sample_movielens
- sampleMLP - There is an issue with the data file location; execute the following commands from your Python 3 environment to work around the issue:
cp $CONDA_PREFIX/data/tensorrt/mnist/*.pgm $CONDA_PREFIX/data/tensorrt/mlp/
cd $CONDA_PREFIX/samples/bin
./sample_mlp -d $CONDA_PREFIX/data/tensorrt/mlp