"TensorFlow normally detects GPUs automatically; to confirm, look for lines like\n",
"\n",
"```\n",
" Successfully opened dynamic library libcudart.so.10.1\n",
"``` \n",
"\n",
"in the output when running a Jupyter Notebook from the Anaconda Prompt, for example\n",
"\n",
"```\n",
" > jupyter-notebook \n",
"```\n",
"\n",
"On my Linux system, a successful startup produces output like\n",
"\n",
"```\n",
" 2022-03-31 13:39:09.426791: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set\n",
" 2022-03-31 13:39:09.427505: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1\n",
" 2022-03-31 13:39:09.452592: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero\n",
" 2022-03-31 13:39:09.452795: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: \n",
" 2022-03-31 13:39:09.452823: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1\n",
" 2022-03-31 13:39:09.455163: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10\n",
" 2022-03-31 13:39:09.455217: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.10\n",
" 2022-03-31 13:39:09.458319: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10\n",
" 2022-03-31 13:39:09.458839: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10\n",
" 2022-03-31 13:39:09.461252: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10\n",
" 2022-03-31 13:39:09.462146: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10\n",
" 2022-03-31 13:39:09.465156: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.7\n",
" 2022-03-31 13:39:09.465399: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero\n",
" 2022-03-31 13:39:09.465604: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero\n",
" 2022-03-31 13:39:09.465685: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0\n",
"```\n",
"\n",
"but if even a single dynamic library fails to open ('cannot open shared object file'), all GPU support is disabled. This typically shows up directly in the Notebook as a warning\n",
"\n",
"```\n",
" 2022-03-31 13:51:46.925659: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: .:/opt/cuda-11.2/lib64:/opt/opencv/opencv4-4.5.1/lib:/opt/pylon5/lib64\n",
" 2022-03-31 13:51:46.925763: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)\n",
" 2022-03-31 13:51:46.925817: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (leno): /proc/driver/nvidia/version does not exist\n",
"```\n",
"\n",
"NOTE: does TensorFlow still require the `tensorflow-gpu` package to be installed, or is `tensorflow` sufficient nowadays?\n",
"\n",
"### Testing for GPUs\n",
"\n",
"Look for inspiration in the Python file\n",
"\n",
"```libitmal/kernelfuns.py```\n",
" \n",
"and try to call the function \n",
"\n",
"```StartupSequence_GPU(verbose=True)```\n",
"\n",
"If this still fails, try writing a function that checks for GPUs, for example:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "ba90ea99",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"MyGPUCheck():..\n",
" found 0 GPU device(s)\n",
"OK\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"2022-03-31 14:08:21.834532: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: .:/opt/cuda-11.2/lib64:/opt/opencv/opencv4-4.5.1/lib:/opt/pylon5/lib64\n",
"2022-03-31 14:08:21.834582: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)\n",
"2022-03-31 14:08:21.834608: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (leno): /proc/driver/nvidia/version does not exist\n"
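A check along the lines of the one that produced the output above could be sketched as follows. This is only a minimal sketch, not the original implementation: the name `MyGPUCheck` and the message format are taken from the printed output, while the `verbose` parameter and the fallback for a missing TensorFlow install are assumptions added for illustration. It relies on `tf.config.list_physical_devices`, which is available in TensorFlow 2.1 and later.

```python
def MyGPUCheck(verbose=True):
    """Return the number of GPUs TensorFlow can see (0 if TensorFlow is missing)."""
    if verbose:
        print("MyGPUCheck():..")
    try:
        # Imported lazily so that a missing install is reported instead of crashing.
        import tensorflow as tf
    except ImportError as ex:
        if verbose:
            print(f"  could not import tensorflow: {ex}")
        return 0

    # Lists the GPUs visible to TensorFlow; the list is empty when the CUDA
    # libraries or the NVIDIA driver could not be loaded.
    gpus = tf.config.list_physical_devices("GPU")
    if verbose:
        print(f"  found {len(gpus)} GPU device(s)")
        for gpu in gpus:
            print(f"    {gpu.name} ({gpu.device_type})")
    return len(gpus)


n_gpus = MyGPUCheck()
print("OK")
```

A return value of 0 together with `Could not load dynamic library` warnings on stderr suggests that the driver or `LD_LIBRARY_PATH` should be checked before looking at the TensorFlow installation itself.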
%% Cell type:markdown id:7d6393b1 tags:
A successful call that lists the physical devices may output something like