Commit 0d340a34 authored by Carsten Eie Frigaard

update

parent 4ced2e8a
%% Cell type:markdown id:26fe9c4e tags:
# SWMAL
## Jupyter Notebook WIKI
%% Cell type:markdown id:793c2a07 tags:
### Missing GPU Support in TensorFlow
TensorFlow normally detects GPUs without trouble; to confirm this, look for lines like
```
Successfully opened dynamic library libcudart.so.10.1
```
in the console output when starting Jupyter Notebook from the Anaconda Prompt, e.g.
```
> jupyter-notebook
```
On my Linux system, a healthy startup produces output like
```
2022-03-31 13:39:09.426791: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2022-03-31 13:39:09.427505: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2022-03-31 13:39:09.452592: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-03-31 13:39:09.452795: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:09:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.69GiB deviceMemoryBandwidth: 871.81GiB/s
2022-03-31 13:39:09.452823: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
2022-03-31 13:39:09.455163: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2022-03-31 13:39:09.455217: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.10
2022-03-31 13:39:09.458319: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2022-03-31 13:39:09.458839: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2022-03-31 13:39:09.461252: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2022-03-31 13:39:09.462146: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2022-03-31 13:39:09.465156: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.7
2022-03-31 13:39:09.465399: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-03-31 13:39:09.465604: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-03-31 13:39:09.465685: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
```
but if even a single dynamic library fails to open ('cannot open shared object file'), all GPU support will be disabled. This typically manifests itself directly in the Notebook as a warning such as
```
2022-03-31 13:51:46.925659: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: .:/opt/cuda-11.2/lib64:/opt/opencv/opencv4-4.5.1/lib:/opt/pylon5/lib64
2022-03-31 13:51:46.925763: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2022-03-31 13:51:46.925817: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (leno): /proc/driver/nvidia/version does not exist
```
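Scanning a long startup log by eye is error-prone, so the lookup described above can be sketched as a small helper. This is only an illustration: the function name `summarize_dso_log` is made up for this wiki, and the two patterns are copied from the `dso_loader` messages shown above, so they may need adjusting for other TensorFlow versions.

```python
import re

# Patterns copied from the dso_loader messages shown above (assumption:
# other TensorFlow versions may word these messages differently).
OPENED = re.compile(r"Successfully opened dynamic library (\S+)")
FAILED = re.compile(r"Could not load dynamic library '([^']+)'")

def summarize_dso_log(lines):
    """Return (opened, failed) lists of library names found in the log lines."""
    opened, failed = [], []
    for line in lines:
        match = OPENED.search(line)
        if match:
            opened.append(match.group(1))
            continue
        match = FAILED.search(line)
        if match:
            failed.append(match.group(1))
    return opened, failed
```

Pass it any iterable of log lines, for instance the lines of a saved console log. A non-empty `failed` list indicates that GPU support has most likely been disabled.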
NOTE: does TensorFlow still require the `tensorflow-gpu` package to be installed, or is `tensorflow` sufficient nowadays?
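Whatever the answer is for a given TensorFlow version, it helps to know which of the two packages is actually present in the active environment. The following is a minimal sketch using only the standard library (`importlib.metadata`, Python 3.8 or newer); the helper name `installed_tensorflow_packages` is hypothetical and not part of `libitmal`.

```python
from importlib import metadata

def installed_tensorflow_packages(names=("tensorflow", "tensorflow-gpu")):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment
            found[name] = None
    return found

print(installed_tensorflow_packages())
```

Run it from the same environment (kernel) that the Notebook uses, since a different Anaconda environment may have a different set of packages installed.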
### Testing for GPUs
Look for inspiration in the Python file
```libitmal/kernelfuns.py```
and try calling the function
```StartupSequence_GPU(verbose=True)```
If this still fails, try writing a function that checks for GPUs, e.g.:
%% Cell type:code id:ba90ea99 tags:
```
import tensorflow

def MyGPUCheck():
    print("MyGPUCheck():..")
    # Ask TensorFlow which GPU devices it can see
    physical_devices = tensorflow.config.list_physical_devices('GPU')
    n = len(physical_devices)
    print(f"  found {n} GPU device(s)")
    for i in physical_devices:
        print(f"    {i}")
    print("OK")

MyGPUCheck()
```
%% Output
MyGPUCheck():..
found 0 GPU device(s)
OK
2022-03-31 14:08:21.834532: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: .:/opt/cuda-11.2/lib64:/opt/opencv/opencv4-4.5.1/lib:/opt/pylon5/lib64
2022-03-31 14:08:21.834582: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2022-03-31 14:08:21.834608: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (leno): /proc/driver/nvidia/version does not exist
%% Cell type:markdown id:7d6393b1 tags:
A successful call to list the physical devices may output something like
```
MyGPUCheck():..
found 1 GPU device(s)
PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')
OK
```
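If the check reports zero GPUs, it can help to find out which library the dynamic loader cannot open, independently of TensorFlow. The sketch below uses `ctypes.CDLL`, which raises `OSError` when a library cannot be opened. The helper name `can_load` is made up for this example, and the library names are simply taken from the Linux log output earlier in this notebook, so other CUDA versions or platforms will use different names.

```python
import ctypes

def can_load(library_name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        # Typically 'cannot open shared object file' on Linux
        return False

# Library names taken from the log output earlier in this notebook (Linux);
# adjust them to match the CUDA version on your own system.
for name in ("libcuda.so.1", "libcudart.so.10.1", "libcudnn.so.7"):
    status = "ok" if can_load(name) else "cannot open"
    print(f"{name}: {status}")
```

A library reported as `cannot open` here should match the one named in the TensorFlow warning, which narrows the problem down to a missing driver, a missing CUDA component, or a wrong `LD_LIBRARY_PATH`.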