WML CE tuning recommendations
Use the following recommended settings to get optimal deep learning performance on the IBM Power System S822LC for High Performance Computing and the IBM Power System AC922.
Enable Performance Governor
sudo yum install kernel-tools
sudo cpupower -c all frequency-set -g performance
Enable GPU persistence mode
sudo systemctl enable nvidia-persistenced
sudo systemctl start nvidia-persistenced
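To confirm that persistence mode took effect on every GPU, the per-GPU state can be queried and filtered. The following is a minimal sketch, not part of the WML CE tooling: gpus_without_persistence is a hypothetical helper name, and it assumes the "index, state" line format that nvidia-smi emits for --query-gpu=index,persistence_mode --format=csv,noheader. The sample input in the last line is illustrative, not captured output.

```shell
# Hypothetical helper: read "index, state" lines on stdin and print the
# index of every GPU whose persistence mode is not "Enabled".
gpus_without_persistence() {
    awk -F', *' '$2 != "Enabled" { print $1 }'
}

# On a system with the NVIDIA driver installed, feed it real data:
#   nvidia-smi --query-gpu=index,persistence_mode --format=csv,noheader \
#       | gpus_without_persistence

# Illustrative input only (two GPUs, the second one not persistent):
printf '0, Enabled\n1, Disabled\n' | gpus_without_persistence   # prints 1
```

An empty result means that every listed GPU reports persistence mode as enabled.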
Set GPU memory and graphics clocks
S822LC with NVIDIA Tesla P100, set clocks to maximum (memory 715 MHz, graphics 1480 MHz)
sudo nvidia-smi -ac 715,1480
AC922 with NVIDIA Tesla V100, set clocks to NVIDIA defaults
sudo nvidia-smi -rac
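On a mixed fleet, the two clock settings above can be selected from the GPU model name. This is a sketch under stated assumptions: clock_command is a hypothetical helper, and it assumes the model string reported by nvidia-smi contains "P100" or "V100". It only prints the command, so nothing changes until the output is run with sudo.

```shell
# Hypothetical helper: print the nvidia-smi clock command for a GPU model name.
clock_command() {
    case "$1" in
        *P100*) echo "nvidia-smi -ac 715,1480" ;;  # S822LC: maximum memory,graphics clocks
        *V100*) echo "nvidia-smi -rac" ;;          # AC922: reset to NVIDIA defaults
        *)      echo "unknown GPU model: $1" >&2
                return 1 ;;
    esac
}

# On a real system (requires the NVIDIA driver):
#   model=$(nvidia-smi --query-gpu=name --format=csv,noheader | head -n 1)
#   sudo $(clock_command "$model")

clock_command "Tesla P100-SXM2-16GB"   # prints: nvidia-smi -ac 715,1480
```

Any model other than the two covered here returns a non-zero status, so a calling script can stop instead of applying clocks that this topic does not recommend.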
For TensorFlow, set the SMT mode
S822LC with NVIDIA Tesla P100, set SMT=2.
sudo ppc64_cpu --smt=2
AC922 with NVIDIA Tesla V100, set SMT based on DDL usage:
sudo ppc64_cpu --smt=4 # for TensorFlow WITHOUT DDL
sudo ppc64_cpu --smt=2 # for TensorFlow WITH DDL
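The SMT recommendations above amount to a small decision table, which could be scripted as follows. This is a sketch only: smt_command is a hypothetical helper, and the arguments (a system name of S822LC or AC922, and yes or no for DDL usage) are conventions invented for this example.

```shell
# Hypothetical helper: print the ppc64_cpu command for a system and DDL choice.
# $1 = S822LC or AC922; $2 = yes if TensorFlow runs with DDL (default: no).
smt_command() {
    case "$1" in
        S822LC)
            echo "ppc64_cpu --smt=2" ;;
        AC922)
            if [ "${2:-no}" = "yes" ]; then
                echo "ppc64_cpu --smt=2"   # TensorFlow WITH DDL
            else
                echo "ppc64_cpu --smt=4"   # TensorFlow WITHOUT DDL
            fi ;;
        *)
            echo "unknown system: $1" >&2
            return 1 ;;
    esac
}

# On a real system:
#   sudo $(smt_command AC922 yes)
#   ppc64_cpu --smt        # with no value, shows the current SMT mode

smt_command AC922 no   # prints: ppc64_cpu --smt=4
```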
When running multiple TensorFlow jobs on an AC922 or S822LC, consider both the operating system process limit and the number of threads in the TensorFlow thread pools. In some cases, a high number of threads in these pools can degrade performance. For instructions on configuring the number of threads in the TensorFlow thread pools, see the TensorFlow Manual tuning topic.
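As a starting point for sizing the thread pools, the logical CPUs can be split evenly across the concurrent jobs. The following sketch is an assumption for illustration, not a recommendation from this topic: threads_per_job is a hypothetical helper, and the even split is only one possible heuristic for choosing values such as TensorFlow's intra_op_parallelism_threads. The CPU count of 128 in the example call is arbitrary.

```shell
# Hypothetical helper: divide logical CPUs evenly across jobs, never below 1.
# $1 = number of logical CPUs; $2 = number of concurrent TensorFlow jobs.
threads_per_job() {
    if [ "$2" -lt 1 ]; then
        echo "job count must be at least 1" >&2
        return 1
    fi
    n=$(( $1 / $2 ))                  # integer division, rounds down
    if [ "$n" -lt 1 ]; then n=1; fi   # more jobs than CPUs: one thread each
    echo "$n"
}

# On a real system, use the live values and compare the total thread count
# against the per-user process limit (on Linux, threads count toward it):
#   threads_per_job "$(nproc)" 4
#   ulimit -u

threads_per_job 128 4   # prints 32
```

The resulting number is only an initial value; measure the actual workload before settling on a thread pool size.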