Commit 687a0b32 authored by Carsten Eie Frigaard

update

parent 5485544f
%% Cell type:markdown id:26fe9c4e tags:
# SWMAL
## Jupyter Notebook WIKI
%% Cell type:markdown id:191a5d7f tags:
### Check Existing Versions
Run this to see installed packages and their versions.
%% Cell type:code id:e20109f2 tags:
``` python
from libitmal import versions
print("PRINT VERSIONS..\n")
versions.Versions()
print("\nOK")
```
%% Output
PRINT VERSIONS..
Python version: 3.9.7.
Scikit-learn version: 0.24.2.
Keras version: 2.4.3
Tensorflow version: 2.4.1
Tensorflow.keras version: 2.4.0
Opencv2 version: 4.5.1
OK
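If `libitmal` is not importable on your machine, a similar report can be sketched with the standard library alone. This is only a rough stand-in for `versions.Versions()`, not its implementation: the helper name `report_versions` is hypothetical, and the distribution names below are assumed pip names (conda installs may register some packages under other names, in which case they show up as "not installed").

``` python
from importlib import metadata
import platform

def report_versions(packages=("scikit-learn", "keras", "tensorflow", "opencv-python")):
    """Return {name: version string or 'not installed'} for the given distributions."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            # Reads the installed package metadata; does not import the package itself
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions

for name, version in report_versions().items():
    print(f"{name:>15}: {version}")
```

Because only metadata is read, the check stays fast and does not trigger TensorFlow's own startup logging.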
%% Cell type:code id:bb656cbc-c194-417c-93c3-800ef8cdbd2d tags:
``` python
print("\nPRINT ENVIRONMENTS (only works on a Linux platform)..\n")
! echo "ENV SETUP:"
! echo " PATH = $PATH"
! echo " PYTHONPATH = $PYTHONPATH"
! echo " CONDA_ROOT = $CONDA_ROOT"
! echo " CONDA_DEFAULT_ENV= $CONDA_DEFAULT_ENV"
! echo " VIRTUAL_ENV = $VIRTUAL_ENV"
! echo " LC_ALL = $LC_ALL"
! echo -n " which conda = " ; which conda
! echo -n " conda info = " ; conda info
! echo -n " hostname = " ; hostname
```
%% Output
PRINT ENVIRONMENTS (only works on a Linux platform)..
ENV SETUP:
PATH = /opt/anaconda-2021.11/bin:/opt/anaconda-2021.11/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/bin
PYTHONPATH =
CONDA_ROOT = /opt/anaconda-2021.11/
CONDA_DEFAULT_ENV=
VIRTUAL_ENV =
LC_ALL =
which conda = /opt/anaconda-2021.11/bin/conda
conda info =
active environment : None
user config file : /home/cef/.condarc
populated config files : /home/cef/.condarc
conda version : 4.12.0
conda-build version : 3.21.5
python version : 3.9.7.final.0
virtual packages : __linux=5.8.0=0
__glibc=2.31=0
__unix=0=0
__archspec=1=x86_64
base environment : /opt/anaconda-2021.11 (read only)
conda av data dir : /opt/anaconda-2021.11/etc/conda
conda av metadata url : None
channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
https://repo.anaconda.com/pkgs/main/noarch
https://repo.anaconda.com/pkgs/r/linux-64
https://repo.anaconda.com/pkgs/r/noarch
package cache : /opt/anaconda-2021.11/pkgs
/home/cef/.conda/pkgs
envs directories : /home/cef/.conda-2021.11/envs
/home/cef/.conda/envs
/opt/anaconda-2021.11/envs
platform : linux-64
user-agent : conda/4.12.0 requests/2.26.0 CPython/3.9.7 Linux/5.8.0-50-generic ubuntu/20.04.3 glibc/2.31
UID:GID : 1000:1000
netrc file : None
offline mode : False
hostname = leno
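Since the shell cell above only works on Linux, a platform-independent variant might look like the sketch below. It reads the same environment variables through `os.environ` instead of `echo`; the helper name `environment_report` is hypothetical, and the `conda info` part is deliberately left out.

``` python
import os
import platform
import shutil

# Same variables as in the shell cell above
ENV_VARS = ("PATH", "PYTHONPATH", "CONDA_ROOT", "CONDA_DEFAULT_ENV", "VIRTUAL_ENV", "LC_ALL")

def environment_report():
    """Collect the environment settings without relying on a Unix shell."""
    report = {name: os.environ.get(name, "") for name in ENV_VARS}
    # shutil.which returns None when conda is not on PATH
    report["which conda"] = shutil.which("conda") or ""
    report["hostname"] = platform.node()
    return report

print("ENV SETUP:")
for key, value in environment_report().items():
    print(f" {key:<18}= {value}")
```

Unset variables are printed as empty strings, matching the empty fields in the output above.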
%% Cell type:markdown id:793c2a07 tags:
### Missing GPU Support in TensorFlow
TensorFlow normally detects GPUs automatically. To confirm this, look for lines like
```
Successfully opened dynamic library libcudart.so.10.1
```
in the console output when you start Jupyter Notebook from the Anaconda Prompt, e.g.
```
> jupyter-notebook
```
On my Linux system, a successful startup produces output like
```
2022-03-31 13:39:09.426791: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2022-03-31 13:39:09.427505: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2022-03-31 13:39:09.452592: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-03-31 13:39:09.452795: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:09:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.69GiB deviceMemoryBandwidth: 871.81GiB/s
2022-03-31 13:39:09.452823: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
2022-03-31 13:39:09.455163: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2022-03-31 13:39:09.455217: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.10
2022-03-31 13:39:09.458319: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2022-03-31 13:39:09.458839: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2022-03-31 13:39:09.461252: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2022-03-31 13:39:09.462146: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2022-03-31 13:39:09.465156: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.7
2022-03-31 13:39:09.465399: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-03-31 13:39:09.465604: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-03-31 13:39:09.465685: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
```
However, if even one dynamic library fails to open ('cannot open shared object file'), all GPU support is disabled. This typically shows up directly in the Notebook as a warning
```
2022-03-31 13:51:46.925659: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: .:/opt/cuda-11.2/lib64:/opt/opencv/opencv4-4.5.1/lib:/opt/pylon5/lib64
2022-03-31 13:51:46.925763: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2022-03-31 13:51:46.925817: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (leno): /proc/driver/nvidia/version does not exist
```
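To avoid reading a long log by eye, the two message types can be picked out with a small script. This is a minimal sketch: the helper name `summarize_library_loads` is hypothetical, and the patterns are taken from the TensorFlow 2.4.1 log lines quoted above, so the wording may differ in other versions.

``` python
import re

# Patterns copied from the dso_loader messages shown above
OPENED = re.compile(r"Successfully opened dynamic library (\S+)")
FAILED = re.compile(r"Could not load dynamic library '([^']+)'")

def summarize_library_loads(log_text):
    """Return (opened, failed) lists of library names found in TensorFlow log text."""
    return OPENED.findall(log_text), FAILED.findall(log_text)

# Two lines taken from the logs above, one success and one failure
sample = """\
2022-03-31 13:39:09.452823: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
2022-03-31 13:51:46.925659: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
"""

opened, failed = summarize_library_loads(sample)
print("opened:", opened)
print("failed:", failed)
if failed:
    print("At least one library failed to load, so expect GPU support to be disabled.")
```

Paste the console output of your own Jupyter session into `sample` to check it.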
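NOTE: does TensorFlow still require the `tensorflow-gpu` package to be installed, or is `tensorflow` sufficient nowadays? (For pip installs, the `tensorflow` package has included GPU support since TensorFlow 2.1; conda packages may differ.)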
### Testing for GPUs
Look for inspiration in the Python file
```libitmal/kernelfuns.py```
and try calling the function
```StartupSequence_GPU(verbose=True)```
If this still fails, try writing a function that checks for GPUs, such as:
%% Cell type:code id:ba90ea99 tags:
``` python
import tensorflow

def MyGPUCheck():
    print("MyGPUCheck():..")
    # Ask TensorFlow which GPU devices it can see
    physical_devices = tensorflow.config.list_physical_devices('GPU')
    n = len(physical_devices)
    print(f" found {n} GPU device(s)")
    for i in physical_devices:
        print(f" {i}")
    print("OK")

MyGPUCheck()
```
%% Output
MyGPUCheck():..
found 0 GPU device(s)
OK
2022-03-31 14:08:21.834532: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: .:/opt/cuda-11.2/lib64:/opt/opencv/opencv4-4.5.1/lib:/opt/pylon5/lib64
2022-03-31 14:08:21.834582: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2022-03-31 14:08:21.834608: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (leno): /proc/driver/nvidia/version does not exist
%% Cell type:markdown id:7d6393b1 tags:
A successful call that lists the physical devices may output something like
```
MyGPUCheck():..
found 1 GPU device(s)
PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')
OK
```
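When zero devices are found, it can help to tell apart a TensorFlow build without CUDA support from a missing driver or library. The sketch below is one possible way to do that, assuming TensorFlow 2.x; the helper name `gpu_summary` is hypothetical, and it returns placeholder values if TensorFlow cannot be imported at all.

``` python
def gpu_summary():
    """Return a small dict describing TensorFlow's view of the GPUs."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow itself is missing, so nothing more can be checked
        return {"tensorflow": None, "built_with_cuda": None, "gpus": []}
    gpus = tf.config.list_physical_devices('GPU')
    return {
        "tensorflow": tf.__version__,
        # False means this build cannot use a GPU, whatever drivers are installed
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "gpus": [d.name for d in gpus],
    }

print(gpu_summary())
```

If `built_with_cuda` is `True` but the `gpus` list is empty, the problem is more likely a driver or library issue like the `libcuda.so.1` warning shown earlier.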