Windows Server 2019 Data Science Virtual Machine: Conv2D not working in py38_tensorflow conda environment when GPU turned on

Question

The Conv2D layer in TensorFlow does not work when the GPU is turned on. The operating system is Windows (Windows Server 2019 Datacenter). The VM size is Standard NV6_Promo (6 vCPUs, 56 GiB memory) in East US 2.

As a result, Conv2D layers cannot be used to train image classification models on the GPU. Note that TensorFlow Dense layers do work with the GPU.

I have created a simple stand-alone example.

# tensorflow_test_Conv2D.py
# compute application of Conv2D layer on tensorflow tensor
# if following lines are uncommented, then code turns off GPU
# if following lines are commented, then code uses GPU
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

import tensorflow as tf

if __name__ == "__main__":
    # create tensorflow object: start with 2d object and expand dims to get 4d object
    x = tf.constant([[1.0, 2.02, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])
    x = tf.expand_dims(x, axis=0)
    x = tf.expand_dims(x, axis=3)
    print("input tensor shape: {} and value {}".format(x.shape, x))
    # apply convolutional layer
    layer = tf.keras.layers.Conv2D(1, (2, 2), input_shape=(3, 3, 1))
    y = layer(x)
    print("Conv2D(x) shape: {} and value {}".format(y.shape, y))
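
The GPU toggle described in the comments above can also be written as a small helper. This is only a sketch: `set_gpu_visible` is a hypothetical name, not part of the original script, and it assumes it is called before TensorFlow initializes CUDA (calling it before `import tensorflow` is the safe choice).

```python
import os


def set_gpu_visible(visible: bool) -> None:
    """Hide or expose CUDA devices for libraries imported afterwards.

    Hypothetical helper, not part of the original script.
    """
    if visible:
        # Removing the variable restores the default (all GPUs visible).
        # Note: this also discards any value the user had set earlier.
        os.environ.pop("CUDA_VISIBLE_DEVICES", None)
    else:
        # "-1" matches no device ID, so CUDA reports no GPU.
        os.environ["CUDA_VISIBLE_DEVICES"] = "-1"


set_gpu_visible(False)  # CPU only; pass True to let TensorFlow use the GPU
# import tensorflow as tf  # import only after the call above
```

With `set_gpu_visible(False)` the script should take the CPU path that is reported to work; with `set_gpu_visible(True)` it takes the GPU path that fails on this image.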

The program runs when the GPU is turned off (the os lines uncommented) and fails when the GPU is turned on (the os lines commented out). Here is the error message:

2021-07-15 04:49:36.851512: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudnn64_8.dll
2021-07-15 04:49:37.199399: E tensorflow/stream_executor/cuda/cuda_dnn.cc:352] Loaded runtime CuDNN library: 8.0.5 but source was compiled with: 8.1.0. CuDNN library needs to have matching major version and equal or higher minor version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2021-07-15 04:49:37.215405: E tensorflow/stream_executor/cuda/cuda_dnn.cc:352] Loaded runtime CuDNN library: 8.0.5 but source was compiled with: 8.1.0. CuDNN library needs to have matching major version and equal or higher minor version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
Traceback (most recent call last):
  File "tensorflow_test_Conv2D.py", line 19, in <module>
    y = layer(x)
  File "c:\Miniconda\envs\py38_tensorflow\lib\site-packages\tensorflow\python\keras\engine\base_layer.py", line 1030, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "c:\Miniconda\envs\py38_tensorflow\lib\site-packages\tensorflow\python\keras\layers\convolutional.py", line 249, in call
    outputs = self._convolution_op(inputs, self.kernel)
  File "c:\Miniconda\envs\py38_tensorflow\lib\site-packages\tensorflow\python\util\dispatch.py", line 206, in wrapper
    return target(*args, **kwargs)
  File "c:\Miniconda\envs\py38_tensorflow\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 1012, in convolution_v2
    return convolution_internal(
  File "c:\Miniconda\envs\py38_tensorflow\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 1142, in convolution_internal
    return op(
  File "c:\Miniconda\envs\py38_tensorflow\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 2596, in _conv2d_expanded_batch
    return gen_nn_ops.conv2d(
  File "c:\Miniconda\envs\py38_tensorflow\lib\site-packages\tensorflow\python\ops\gen_nn_ops.py", line 931, in conv2d
    _ops.raise_from_not_ok_status(e, name)
  File "c:\Miniconda\envs\py38_tensorflow\lib\site-packages\tensorflow\python\framework\ops.py", line 6897, in raise_from_not_ok_status
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. [Op:Conv2D]
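
The key lines in this log are the two cuDNN messages: the loaded library is 8.0.5, but TensorFlow was compiled against 8.1.0. The rule stated in the message (matching major version, equal or higher minor version) can be checked with a short sketch like the one below. The function names are hypothetical, and the regular expression assumes the message wording shown in the log above.

```python
import re

# Pattern for the cuDNN mismatch message as it appears in the log above.
_MISMATCH = re.compile(
    r"Loaded runtime CuDNN library: (\d+)\.(\d+)\.(\d+) "
    r"but source was compiled with: (\d+)\.(\d+)\.(\d+)"
)


def cudnn_compatible(loaded, compiled):
    """Rule from the log: same major version, loaded minor >= compiled minor."""
    return loaded[0] == compiled[0] and loaded[1] >= compiled[1]


def parse_cudnn_mismatch(line):
    """Return (loaded, compiled) version tuples, or None if the line does not match."""
    match = _MISMATCH.search(line)
    if match is None:
        return None
    numbers = tuple(int(group) for group in match.groups())
    return numbers[:3], numbers[3:]


log_line = (
    "Loaded runtime CuDNN library: 8.0.5 but source was compiled with: 8.1.0."
)
loaded, compiled = parse_cudnn_mismatch(log_line)
print(loaded, compiled, cudnn_compatible(loaded, compiled))
# prints: (8, 0, 5) (8, 1, 0) False
```

Here the major versions match (8) but the loaded minor version (0) is lower than the compiled one (1), which is consistent with cuDNN failing to initialize and the Conv2D op failing.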

Answer

Hello:
I was able to overcome the problem by doing some installations myself.
For the same Windows Data Science Virtual Machine, I did the following:
(1) Upgraded the NVIDIA driver for the Tesla M60
(2) Created a new Anaconda environment with Python 3.8
(3) Followed the approach in the YouTube video https://www.youtube.com/watch?v=toJe8ZbFhEc to install cudatoolkit=11.2, cuDNN=8.1, and tensorflow-gpu=2.5.0 (this approach installs cudatoolkit and cuDNN using conda)
(4) Installed some other packages (matplotlib, etc.)
The Conv2D layer works after these upgrades.
Thanks, Satish Reddy
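
After steps like the ones above, it can help to confirm what the new environment actually reports. The following is a minimal sketch, not something from the original post: `summarize_build` and `check_environment` are hypothetical names, and the exact format of the version strings returned by `tf.sysconfig.get_build_info()` may differ between platforms and TensorFlow releases.

```python
def summarize_build(build_info, gpus):
    """Format the CUDA/cuDNN versions TensorFlow was built for and the GPU count."""
    cuda = build_info.get("cuda_version", "unknown")
    cudnn = build_info.get("cudnn_version", "unknown")
    return "built for CUDA {} / cuDNN {}; visible GPUs: {}".format(
        cuda, cudnn, len(gpus)
    )


def check_environment():
    """Query TensorFlow if it is installed; return None otherwise."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return summarize_build(
        tf.sysconfig.get_build_info(),           # build-time CUDA/cuDNN info
        tf.config.list_physical_devices("GPU"),  # GPUs TensorFlow can see
    )


if __name__ == "__main__":
    print(check_environment())
```

If the summary shows zero visible GPUs, or the cuDNN mismatch message still appears in the log, the environment is probably still picking up an older cuDNN library.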

Answer

@Satish Reddy Thanks for the question. Could you please add more details about the TensorFlow version you are using? Based on the UnknownError, if you are using conda environments, install only tensorflow-gpu and not cudatoolkit or cuDNN separately, because they are already installed by tensorflow-gpu (see this answer). Note, though, that newer conda tensorflow-gpu versions may not install cudatoolkit or cuDNN. In that case, the solution is to install a lower version of tensorflow-gpu and then upgrade it with pip (see this answer).

Answer

@Satish Reddy Thanks. We have forwarded this to the product team. We will preinstall a newer version of cuDNN in the new images of the DSVM for Windows.
