Commit 687a0b32 authored by Carsten Eie Frigaard

update

parent 5485544f
%% Cell type:markdown id:26fe9c4e tags:
# SWMAL
## Jupyter Notebook WIKI
%% Cell type:markdown id:191a5d7f tags:
### Check Existing Versions
Run this to see installed packages and their versions.
%% Cell type:code id:e20109f2 tags:
``` python
from libitmal import versions
print("PRINT VERSIONS..\n")
versions.Versions()
print("\nOK")
```
%% Output
PRINT VERSIONS..
Python version: 3.9.7.
Scikit-learn version: 0.24.2.
Keras version: 2.4.3
Tensorflow version: 2.4.1
Tensorflow.keras version: 2.4.0
Opencv2 version: 4.5.1
OK
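If `libitmal` is not on your `PYTHONPATH`, a similar report can be produced with the standard library alone. The following is a minimal sketch, not the implementation of `versions.Versions()` (which is not shown here). The helper name `get_versions` and the package list are assumptions; the distribution names are the usual PyPI names and may differ for conda-installed packages (for example OpenCV).

```python
import sys
from importlib import metadata

# Assumed distribution names (PyPI naming); adjust to your installation.
PACKAGES = ["scikit-learn", "keras", "tensorflow", "opencv-python"]

def get_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

if __name__ == "__main__":
    print(f"Python version: {sys.version.split()[0]}")
    for name, version in get_versions().items():
        print(f"{name} version: {version if version else 'not installed'}")
```

A value of `None` means the distribution metadata was not found, which is not quite the same as the module being impossible to import.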
%% Cell type:code id:bb656cbc-c194-417c-93c3-800ef8cdbd2d tags:
``` python
print("\nPRINT ENVIRONMENTS (only works on a Linux platform)..\n")
! echo "ENV SETUP:"
! echo " PATH = $PATH"
! echo " PYTHONPATH = $PYTHONPATH"
! echo " CONDA_ROOT = $CONDA_ROOT"
! echo " CONDA_DEFAULT_ENV= $CONDA_DEFAULT_ENV"
! echo " VIRTUAL_ENV = $VIRTUAL_ENV"
! echo " LC_ALL = $LC_ALL"
! echo -n " which conda = " ; which conda
! echo -n " conda info = " ; conda info
! echo -n " hostname = " ; hostname
```
%% Output
PRINT ENVIRONMENTS (only works on a Linux platform)..
ENV SETUP:
PATH = /opt/anaconda-2021.11/bin:/opt/anaconda-2021.11/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/bin
PYTHONPATH =
CONDA_ROOT = /opt/anaconda-2021.11/
CONDA_DEFAULT_ENV=
VIRTUAL_ENV =
LC_ALL =
which conda = /opt/anaconda-2021.11/bin/conda
conda info =
active environment : None
user config file : /home/cef/.condarc
populated config files : /home/cef/.condarc
conda version : 4.12.0
conda-build version : 3.21.5
python version : 3.9.7.final.0
virtual packages : __linux=5.8.0=0
__glibc=2.31=0
__unix=0=0
__archspec=1=x86_64
base environment : /opt/anaconda-2021.11 (read only)
conda av data dir : /opt/anaconda-2021.11/etc/conda
conda av metadata url : None
channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
https://repo.anaconda.com/pkgs/main/noarch
https://repo.anaconda.com/pkgs/r/linux-64
https://repo.anaconda.com/pkgs/r/noarch
package cache : /opt/anaconda-2021.11/pkgs
/home/cef/.conda/pkgs
envs directories : /home/cef/.conda-2021.11/envs
/home/cef/.conda/envs
/opt/anaconda-2021.11/envs
platform : linux-64
user-agent : conda/4.12.0 requests/2.26.0 CPython/3.9.7 Linux/5.8.0-50-generic ubuntu/20.04.3 glibc/2.31
UID:GID : 1000:1000
netrc file : None
offline mode : False
hostname = leno
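Since the cell above relies on shell commands and only works on Linux, here is a rough portable sketch that collects most of the same information from Python. The helper name `collect_env` is hypothetical, and the `conda info` report is deliberately left out, because it requires running the `conda` executable.

```python
import os
import shutil
import socket

# The same variables that the shell cell above prints.
ENV_VARS = ["PATH", "PYTHONPATH", "CONDA_ROOT", "CONDA_DEFAULT_ENV", "VIRTUAL_ENV", "LC_ALL"]

def collect_env(names=ENV_VARS):
    """Return a dict with the environment variables, the conda path and the hostname."""
    info = {name: os.environ.get(name, "") for name in names}
    # shutil.which returns None when no conda executable is found on PATH.
    info["which conda"] = shutil.which("conda") or ""
    info["hostname"] = socket.gethostname()
    return info

if __name__ == "__main__":
    print("ENV SETUP:")
    for key, value in collect_env().items():
        print(f"  {key:18s}= {value}")
```

Unset variables are shown as empty strings, matching how the shell version prints them.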
%% Cell type:markdown id:793c2a07 tags:
### Missing GPU Support in TensorFlow
GPUs are normally detected automatically by TensorFlow. Look for lines like
```
Successfully opened dynamic library libcudart.so.10.1
```
in the output when running a Jupyter Notebook from the Anaconda Prompt, e.g.
```
> jupyter-notebook
```
On my Linux system, a working setup produces output like
```
2022-03-31 13:39:09.426791: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2022-03-31 13:39:09.427505: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2022-03-31 13:39:09.452592: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-03-31 13:39:09.452795: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:09:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.69GiB deviceMemoryBandwidth: 871.81GiB/s
2022-03-31 13:39:09.452823: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
2022-03-31 13:39:09.455163: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2022-03-31 13:39:09.455217: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.10
2022-03-31 13:39:09.458319: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2022-03-31 13:39:09.458839: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2022-03-31 13:39:09.461252: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2022-03-31 13:39:09.462146: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2022-03-31 13:39:09.465156: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.7
2022-03-31 13:39:09.465399: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-03-31 13:39:09.465604: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-03-31 13:39:09.465685: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
```
but if even one dynamic library fails to open ('cannot open shared object'), all GPU support is disabled. This typically shows up directly in the Notebook as a warning:
```
2022-03-31 13:51:46.925659: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: .:/opt/cuda-11.2/lib64:/opt/opencv/opencv4-4.5.1/lib:/opt/pylon5/lib64
2022-03-31 13:51:46.925763: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2022-03-31 13:51:46.925817: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (leno): /proc/driver/nvidia/version does not exist
```
NOTE: does TensorFlow still require the `tensorflow-gpu` package to be installed, or is `tensorflow` sufficient nowadays?
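The 'cannot open shared object' failure can also be checked without starting TensorFlow, by asking the dynamic loader to open each library directly. This is only a sketch that approximates the check; it is not how TensorFlow itself loads the libraries. The helper names `can_load` and `check_cuda_libs` are hypothetical, and the library names are copied from the log output above, so they assume that particular CUDA and cuDNN version.

```python
import ctypes

# Library names taken from the log above; they depend on your CUDA/cuDNN version.
CUDA_LIBS = [
    "libcuda.so.1",
    "libcudart.so.10.1",
    "libcublas.so.10",
    "libcufft.so.10",
    "libcurand.so.10",
    "libcusolver.so.10",
    "libcusparse.so.10",
    "libcudnn.so.7",
]

def can_load(libname):
    """Return True if the dynamic loader can open the named library."""
    try:
        ctypes.CDLL(libname)
        return True
    except OSError:
        # Raised, for instance, for 'cannot open shared object file'.
        return False

def check_cuda_libs(libnames=CUDA_LIBS):
    """Map each library name to whether it could be opened."""
    return {name: can_load(name) for name in libnames}

if __name__ == "__main__":
    for name, ok in check_cuda_libs().items():
        print(f"  {name:22s} {'OK' if ok else 'cannot open'}")
```

Any library reported as `cannot open` is a candidate for a missing driver, a missing CUDA installation, or an incomplete `LD_LIBRARY_PATH`.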
### Testing for GPUs
Look for inspiration in the Python file
```libitmal/kernelfuns.py```
and try calling the function
```StartupSequence_GPU(verbose=True)```
If this still fails, try writing a function that checks for GPUs, like:
%% Cell type:code id:ba90ea99 tags:
``` python
import tensorflow

def MyGPUCheck():
    print("MyGPUCheck():..")
    # Ask TensorFlow which GPU devices it can see
    physical_devices = tensorflow.config.list_physical_devices('GPU')
    n = len(physical_devices)
    print(f" found {n} GPU device(s)")
    for i in physical_devices:
        print(f" {i}")
    print("OK")

MyGPUCheck()
```
%% Output
MyGPUCheck():..
found 0 GPU device(s)
OK
2022-03-31 14:08:21.834532: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: .:/opt/cuda-11.2/lib64:/opt/opencv/opencv4-4.5.1/lib:/opt/pylon5/lib64
2022-03-31 14:08:21.834582: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2022-03-31 14:08:21.834608: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (leno): /proc/driver/nvidia/version does not exist
%% Cell type:markdown id:7d6393b1 tags:
A successful call to list the physical devices may output something like
```
MyGPUCheck():..
found 1 GPU device(s)
PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')
OK
```
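If a notebook should branch on the result rather than just print it, the check can return the device names instead. The sketch below assumes TensorFlow 2.1 or later, where `tensorflow.config.list_physical_devices` is available; the helper name `list_gpu_names` is hypothetical and not part of `libitmal`.

```python
def list_gpu_names():
    """Return the names of GPUs visible to TensorFlow.

    Returns None if TensorFlow cannot be imported, and an empty list
    if TensorFlow is available but sees no GPU.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices('GPU')]

if __name__ == "__main__":
    names = list_gpu_names()
    if names is None:
        print("TensorFlow is not installed")
    elif not names:
        print("No GPU found, running on CPU")
    else:
        print(f"Found {len(names)} GPU device(s): {names}")
```

Keeping "TensorFlow missing" (`None`) apart from "no GPU" (empty list) makes it easier to tell an installation problem from a driver problem.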
......
#!/bin/bash q#!/bin/bash
set -ea set -ea
......