PyCUDA is a great library if you want to use GPU computing with NVIDIA chips. If you want a more portable approach, or if you have ATI chips instead of NVIDIA, then you might consider PyOpenCL instead of PyCUDA. I provided instructions on how to install PyOpenCL on Anaconda for Windows in a previous entry.
Installing PyCUDA on Anaconda for Windows can be tricky. Here is what worked fine for me. I am using the latest Anaconda distribution with Python 3.5 in it.
- Install the CUDA toolkit. You have to select the installer that matches your Windows version and architecture. For me it was Windows 7, 64 bits, with the local installer.
- Install a Visual C++ version. I have Visual Studio 2015 installed but it is not yet supported by PyCUDA, hence I resorted to my Visual Studio 2013.
Add the path to its executables to your PATH environment variable. For me it was:
C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin
How you set environment variables in Windows depends on the version you are using; see for instance here.
- Install the latest Anaconda distribution with Python 3.5 in it. I used the 64-bit version.
- Get the PyCUDA prebuilt binary from Christoph Gohlke (his page is an invaluable resource for installing Python packages on Windows). Select the file that matches your OS, your CUDA version, and your CPython version. For me it was pycuda-2015.1.3+cuda7518-cp35-none-win_amd64.whl
- Use pip to install the above package. Simply type this in your Anaconda prompt:
pip install <directory>\pycuda-2015.1.3+cuda7518-cp35-none-win_amd64.whl
where <directory> is where you downloaded PyCUDA. For me it was:
[Anaconda3] C:\Users\IBM_ADMIN>pip install c:\Users\IBM_ADMIN\Downloads\pycuda-2015.1.3+cuda7518-cp35-none-win_amd64.whl
Then you should be all set!
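If something goes wrong, the prerequisites from the steps above can be checked with a few lines of Python. This is only a sketch under some assumptions: the helper names (`parse_wheel_tags`, `interpreter_tag`, `missing_tools`) are made up for this example, and I assume the executables to look for are `cl` (the Visual C++ compiler) and `nvcc` (the CUDA compiler), both of which should be reachable through PATH. The wheel filename is the one used above; replace it with yours.

```python
import shutil
import sys


def parse_wheel_tags(filename):
    """Split a wheel filename into its name, version and compatibility tags."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    parts = stem.split("-")
    # Wheel filenames look like name-version[-build]-python-abi-platform,
    # so the three compatibility tags are always the last three fields.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def interpreter_tag():
    """CPython tag of the running interpreter, e.g. 'cp35' for Python 3.5."""
    return "cp{}{}".format(sys.version_info[0], sys.version_info[1])


def missing_tools(tools=("cl", "nvcc")):
    """Return the executables from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Hypothetical usage with the wheel downloaded in the steps above.
wheel = "pycuda-2015.1.3+cuda7518-cp35-none-win_amd64.whl"
tags = parse_wheel_tags(wheel)
print("wheel targets   :", tags["python"], tags["platform"])
print("this interpreter:", interpreter_tag())
print("missing on PATH :", missing_tools())
```

If the Python tag of the wheel differs from the tag of your interpreter, pip will most likely refuse to install the file. If `cl` or `nvcc` is reported as missing, revisit the PATH step.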
You can test your installation by running the hello_gpu.py displayed on the front page of PyCUDA documentation. If you get an array full of zeros then you're fine.
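For reference, here is a minimal sketch of such a test, modeled on the element-wise multiplication done in hello_gpu.py. The wrapper function `gpu_smoke_test` and its return convention are my own assumptions, not part of PyCUDA; the PyCUDA calls themselves (`pycuda.autoinit`, `SourceModule`, `drv.In`, `drv.Out`) are the ones used in the documentation example. It returns None instead of failing when PyCUDA or a usable GPU is not available.

```python
def gpu_smoke_test(n=400):
    """Multiply two vectors on the GPU and compare with NumPy.

    Returns True if the results match, False if they differ, and None
    if PyCUDA cannot be imported or no CUDA context can be created.
    """
    try:
        import numpy as np
        import pycuda.autoinit  # noqa: F401 (creates a CUDA context on import)
        import pycuda.driver as drv
        from pycuda.compiler import SourceModule
    except Exception as exc:  # ImportError, or a CUDA error without a usable GPU
        print("PyCUDA is not usable here:", exc)
        return None

    # Compiling the kernel calls nvcc, which in turn needs the Visual C++ compiler.
    mod = SourceModule("""
    __global__ void multiply_them(float *dest, float *a, float *b)
    {
        const int i = threadIdx.x;
        dest[i] = a[i] * b[i];
    }
    """)
    multiply_them = mod.get_function("multiply_them")

    a = np.random.randn(n).astype(np.float32)
    b = np.random.randn(n).astype(np.float32)
    dest = np.zeros_like(a)

    # One block of n threads; n must not exceed the device's block size limit.
    multiply_them(drv.Out(dest), drv.In(a), drv.In(b),
                  block=(n, 1, 1), grid=(1, 1))

    # dest - a * b should be an array of zeros, as in hello_gpu.py.
    return bool(np.allclose(dest, a * b))


result = gpu_smoke_test()
print("GPU result matches NumPy:", result)
```

If the kernel compilation step raises an error, the most likely culprit is a compiler that is missing from PATH.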
I could run the code in a notebook without any problem.
This looks promising to me. I used my PyCUDA installation to compute Mandelbrot sets more efficiently; see How To Quickly Compute The Mandelbrot Set In Python.
I hope this will help readers willing to use PyCUDA and Anaconda on Windows.
In writing the above I found this page to be useful: Up and running with Theano (GPU) + PyCUDA on Windows. However, the pipwin step did not work for me. I could also install Theano, but my NVIDIA GPU is too old for it, hence Theano does not use PyCUDA on my machine.
If the above does not work for you, for instance if you do not want to use Anaconda, or cannot use Microsoft Visual Studio, then you may get help from these: