Commit 1e4984d3 authored by Carsten Eie Frigaard

pre-lesson-01-update

parent ce36fff0
%% Cell type:markdown id: tags:
# ITMAL Intro
## Mini Python Demo
REVISIONS||
---------||
2019-0128|CEF, initial.
2019-0820|CEF, E19 ITMAL update.
2019-0828|CEF, split into more cells.
2020-0125|CEF, F20 ITMAL update.
2020-0831|CEF, E20 ITMAL update, fixed typo in y.shape and make gfx links to BB.
2021-0201|CEF, F21 ITMAL update.
### Mini Python/Jupyter notebook demo
Built-in Python lists and NumPy arrays...
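As a minimal sketch of why the distinction matters (the variable names below are made up for illustration), the same `*` operator behaves differently on a built-in nested list and on a NumPy array:

```python
import numpy as np

nested_list = [[1, 2, 3], [4, 5, 6]]   # built-in Python list of lists
array_2d = np.array(nested_list)       # NumPy array built from the same data

# '*' repeats a list, but multiplies a NumPy array element-wise
repeated = nested_list * 2
scaled = array_2d * 2

print(len(repeated))      # 4 (the two rows, repeated twice)
print(scaled.tolist())    # [[2, 4, 6], [8, 10, 12]]
print(array_2d.shape)     # (2, 3)
```

A list has no `shape` attribute at all, which is one reason the numeric work in the cells that follow is done on NumPy arrays.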
%% Cell type:code id: tags:
``` python
%reset -f
# import clause, imports numpy under the name 'np'
import numpy as np
# python built-in nested list (used here as a 2D array)
x = [[1, 2, 3], [4, 5, 6]]
# print using f-string syntax, preferred over, say, print('x = ',x)
print(f'x = {x}')
print('OK')
```
%% Cell type:code id: tags:
``` python
# create a numpy array (notice the 1.0, which makes the elements doubles)
y = np.array( [[1.0, 2, 3, 4], [10, 20, 30, 42]] )
print(f'y = {y}')
print()
print(f'y.dtype={y.dtype}, y.itemsize={y.itemsize}, y.shape={y.shape}')
print('\nOK')
```
%% Cell type:code id: tags:
``` python
print("indexing...like a (m x n) matrix")
print(y[0,1])
print(y[0,-1]) # elem 0 from the 'right' (the last element), strange but pythonic
print(y[0,-2]) # elem 1 from the 'right'
# print a column, but will display as 'row'
print(y[:,1])
print('\nOK')
```
%% Cell type:markdown id: tags:
#### Matrix multiplication
Just use a NumPy array as a matrix-like class; create a (3 x 4) matrix and do some matrix operations on it...
<img src='https://itundervisning.ase.au.dk/GITMAL/L01/Figs/matrix.jpg' alt="WARNING: you need to be logged into Blackboard to view images">
(NOTE: do not use `numpy.matrix`, <a href='https://docs.scipy.org/doc/numpy/reference/generated/numpy.matrix.html'>it is unfortunately deprecated.</a>)
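A small sketch, using two made-up (2 x 2) arrays rather than the (3 x 4) matrix of this demo, of how the element-wise `*` differs from the matrix product on plain NumPy arrays:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[10, 20], [30, 40]])

elementwise = a * b      # multiplies matching entries
matrix_product = a @ b   # rows of a times columns of b

print(elementwise.tolist())      # [[10, 40], [90, 160]]
print(matrix_product.tolist())   # [[70, 100], [150, 220]]
print(np.array_equal(matrix_product, np.dot(a, b)))   # True
```

For 2D arrays `a @ b`, `np.matmul(a, b)` and `np.dot(a, b)` give the same result, so `numpy.matrix` is not needed to get a matrix product.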
%% Cell type:code id: tags:
``` python
x = np.array([ [2, -5, -11 ,0], [-9, 4, 6, 13], [4, 7, 12, -2]])
y = np.transpose(x)
print(f'x={x}\nx.shape={x.shape}\ny.shape={y.shape}')
# The * operator in numpy is element-wise, not a matrix product;
# x*y will throw ValueError: operands could not be broadcast together with shapes (3,4) (4,3)
#z=x*y
# numpy dot is a typical multi-purpose python function;
#   inner-product if x and y are 1D arrays (vectors)
#   matrix multiplication if x and y are 2D arrays (matrices)
z = np.dot(x, y)
print(f'\nThe dot product, np.dot(x, y)={z}')
# alternatives to .dot:
print(np.matmul(x, y))
print(x @ y)
# the deprecated numpy matrix
mx = np.matrix(x)
my = np.matrix(y)
mz = mx*my
print(f'\nmatrix type mult: mx*my={mz}')
print('\nOK')
```
%% Cell type:markdown id: tags:
#### Writing pythonic, robust code
Range-checks and fail-fast...
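One possible way to apply fail-fast to your own functions is sketched here. The helper `column_mean` is hypothetical and not part of the course code; it checks its arguments with asserts before doing any work:

```python
import numpy as np

def column_mean(matrix, column):
    """Return the mean of one column, failing fast on bad input."""
    assert matrix.ndim == 2, f'expected a 2D array, got ndim={matrix.ndim}'
    assert 0 <= column < matrix.shape[1], f'column={column} out of range'
    return float(matrix[:, column].mean())

data = np.array([[1.0, 2.0], [3.0, 4.0]])
print(column_mean(data, 1))   # 3.0

try:
    column_mean(data, 5)      # violates the range check
except AssertionError as e:
    print(f'caught: {e}')
```

The point of the asserts is that a bad argument is reported at the call site with a readable message, instead of surfacing later as a confusing error deep inside the computation.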
%% Cell type:code id: tags:
``` python
import sys, traceback
print('Writing pythonic,robust code: range-checks and fail-fast...')
# python does all kinds of range-checks: robust coding
#print(y[:,-5]) # will throw!
print('a pythonic assert..')
# note: True==0 is False, so this assert raises an AssertionError
assert True==0, 'notice the lack of () in python asserts'
print('\nOK')
```
%% Cell type:code id: tags:
``` python
def MyTrace(some_exception):
    print(f'caught exception e="{some_exception}"')
    traceback.print_exc(file=sys.stdout)
    print()

print('a try-catch block..')
try:
    print(y[:,-5])
except IndexError as e:
    MyTrace(e)
finally:
    print('finally executed last no matter what..')
print('\nOK')
```
%% Cell type:code id: tags:
``` python
# This is python, but weird for C/C++/C# aficionados:
try:
    import a_non_existing_lib
except ImportError:
    print("you do not have the 'a_non_existing_lib' library!")
print("\nOK")
```
%% Cell type:markdown id: tags:
## Administration
REVISIONS||
---------||
2019-01-28| CEF, initial.
2019-08-20| CEF, E19 ITMAL update.
2019-08-28| CEF, split into more cells.
2020-01-25| CEF, F20 ITMAL update.
2020-08-31| CEF, E20 ITMAL update, fixed typo in y.shape and make gfx links to BB.
2021-02-01| CEF, F21 ITMAL update.
2021-08-02| CEF, update to E21 ITMAL.
%% Cell type:markdown id: tags:
# ITMAL Exercise
## Intro
We start by reusing parts of `01_the_machine_learning_landscape.ipynb` from Géron [GITHOML]. So we begin with what Géron says about life satisfaction vs GDP per capita.
Halfway down this notebook, a list of questions for ITMAL is presented.
%% Cell type:markdown id: tags:
## Chapter 1 – The Machine Learning landscape
_This is the code used to generate some of the figures in chapter 1._
%% Cell type:markdown id: tags:
### Setup
First, let's make sure this notebook works well in both python 2 and 3, import a few common modules, ensure Matplotlib plots figures inline and prepare a function to save the figures:
%% Cell type:code id: tags:
``` python
# To support both python 2 and python 3
from __future__ import division, print_function, unicode_literals
# Common imports
import numpy as np
import os
# to make this notebook's output stable across runs
np.random.seed(42)
# To plot pretty figures
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
plt.rcParams['axes.labelsize'] = 14
plt.rcParams['xtick.labelsize'] = 12
plt.rcParams['ytick.labelsize'] = 12
# Where to save the figures
PROJECT_ROOT_DIR = "."
CHAPTER_ID = "fundamentals"
def save_fig(fig_id, tight_layout=True):
    path = os.path.join(PROJECT_ROOT_DIR, "images", CHAPTER_ID, fig_id + ".png")
    print("IGNORING: Saving figure", fig_id) # ITMAL: I've disabled saving of figures
    #if tight_layout:
    #    plt.tight_layout()
    #plt.savefig(path, format='png', dpi=300)
# Ignore useless warnings (see SciPy issue #5998)
import warnings
warnings.filterwarnings(action="ignore", module="scipy", message="^internal gelsd")
print("OK")
```
%% Output
OK
%% Cell type:markdown id: tags:
### Code example 1-1
This function just merges the OECD's life satisfaction data and the IMF's GDP per capita data. It's a bit too long and boring and it's not specific to Machine Learning, which is why I left it out of the book.
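The core of that merge is an index-aligned `pd.merge` followed by a sort. A minimal sketch on made-up toy numbers (the country names and values below are hypothetical, not the OECD or IMF data):

```python
import pandas as pd

# hypothetical toy data, indexed by country name
life = pd.DataFrame({'Life satisfaction': [5.0, 7.0]},
                    index=['A-land', 'B-land'])
gdp = pd.DataFrame({'GDP per capita': [30000, 10000, 50000]},
                   index=['B-land', 'A-land', 'C-land'])

# join on the index of both frames; only countries present in both survive
merged = pd.merge(left=life, right=gdp, left_index=True, right_index=True)
merged = merged.sort_values(by='GDP per capita')
print(merged)
```

`C-land` is dropped because it has no life-satisfaction row; `pd.merge` performs an inner join by default.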
%% Cell type:code id: tags:
``` python
def prepare_country_stats(oecd_bli, gdp_per_capita):
    oecd_bli = oecd_bli[oecd_bli["INEQUALITY"]=="TOT"]
    oecd_bli = oecd_bli.pivot(index="Country", columns="Indicator", values="Value")
    gdp_per_capita.rename(columns={"2015": "GDP per capita"}, inplace=True)
    gdp_per_capita.set_index("Country", inplace=True)
    full_country_stats = pd.merge(left=oecd_bli, right=gdp_per_capita,
                                  left_index=True, right_index=True)
    full_country_stats.sort_values(by="GDP per capita", inplace=True)
    remove_indices = [0, 1, 6, 8, 33, 34, 35]
    keep_indices = list(set(range(36)) - set(remove_indices))
    return full_country_stats[["GDP per capita", 'Life satisfaction']].iloc[keep_indices]
print("OK")
```
%% Cell type:markdown id: tags:
The code in the book expects the data files to be located in the current directory. I just tweaked it here to fetch the files in datasets/lifesat.
%% Cell type:code id: tags:
``` python
import os
datapath = os.path.join("../datasets", "lifesat", "")
# NOTE: a ! prefix lets us run system commands..
# (command 'dir' for windows, 'ls' for Linux or Macs)
#
! dir
print("\nOK")
```
%% Cell type:code id: tags:
``` python
# Code example
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import sklearn.linear_model
# Load the data
try:
    oecd_bli = pd.read_csv(datapath + "oecd_bli_2015.csv", thousands=',')
    gdp_per_capita = pd.read_csv(datapath + "gdp_per_capita.csv",thousands=',',delimiter='\t',
                                 encoding='latin1', na_values="n/a")
except Exception as e:
    print(f"ITMAL NOTE: well, you need to have the 'datasets' dir in path, please unzip 'datasets.zip' and make sure that it is included in the datapath='{datapath}' setting in the cell above..")
    raise e
# Prepare the data
country_stats = prepare_country_stats(oecd_bli, gdp_per_capita)
X = np.c_[country_stats["GDP per capita"]]
y = np.c_[country_stats["Life satisfaction"]]
# Visualize the data
country_stats.plot(kind='scatter', x="GDP per capita", y='Life satisfaction')
plt.show()
# Select a linear model
model = sklearn.linear_model.LinearRegression()
# Train the model
model.fit(X, y)
# Make a prediction for Cyprus
X_new = [[22587]] # Cyprus' GDP per capita
y_pred = model.predict(X_new)
print(y_pred) # outputs [[ 5.96242338]]
print("OK")
```
%% Cell type:markdown id: tags:
## ITMAL
Now we plot the linear regression result.
Just ignore all the data plotter code mumbo-jumbo here (code taken directly from the notebook, [GITHOML])...and see the final plot.
%% Cell type:code id: tags:
``` python
oecd_bli = pd.read_csv(datapath + "oecd_bli_2015.csv", thousands=',')
oecd_bli = oecd_bli[oecd_bli["INEQUALITY"]=="TOT"]
oecd_bli = oecd_bli.pivot(index="Country", columns="Indicator", values="Value")
#oecd_bli.head(2)
gdp_per_capita = pd.read_csv(datapath+"gdp_per_capita.csv", thousands=',', delimiter='\t',
                             encoding='latin1', na_values="n/a")
gdp_per_capita.rename(columns={"2015": "GDP per capita"}, inplace=True)
gdp_per_capita.set_index("Country", inplace=True)
#gdp_per_capita.head(2)
full_country_stats = pd.merge(left=oecd_bli, right=gdp_per_capita, left_index=True, right_index=True)
full_country_stats.sort_values(by="GDP per capita", inplace=True)
#full_country_stats
remove_indices = [0, 1, 6, 8, 33, 34, 35]
keep_indices = list(set(range(36)) - set(remove_indices))
sample_data = full_country_stats[["GDP per capita", 'Life satisfaction']].iloc[keep_indices]
#missing_data = full_country_stats[["GDP per capita", 'Life satisfaction']].iloc[remove_indices]
sample_data.plot(kind='scatter', x="GDP per capita", y='Life satisfaction', figsize=(5,3))
plt.axis([0, 60000, 0, 10])
position_text = {
    "Hungary": (5000, 1),
    "Korea": (18000, 1.7),
    "France": (29000, 2.4),
    "Australia": (40000, 3.0),
    "United States": (52000, 3.8),
}
for country, pos_text in position_text.items():
    pos_data_x, pos_data_y = sample_data.loc[country]
    country = "U.S." if country == "United States" else country
    plt.annotate(country, xy=(pos_data_x, pos_data_y), xytext=pos_text,
                 arrowprops=dict(facecolor='black', width=0.5, shrink=0.1, headwidth=5))
    plt.plot(pos_data_x, pos_data_y, "ro")
#save_fig('money_happy_scatterplot')
plt.show()
from sklearn import linear_model
lin1 = linear_model.LinearRegression()
Xsample = np.c_[sample_data["GDP per capita"]]
ysample = np.c_[sample_data["Life satisfaction"]]
lin1.fit(Xsample, ysample)
t0 = 4.8530528
t1 = 4.91154459e-05
sample_data.plot(kind='scatter', x="GDP per capita", y='Life satisfaction', figsize=(5,3))
plt.axis([0, 60000, 0, 10])
M=np.linspace(0, 60000, 1000)
plt.plot(M, t0 + t1*M, "b")
plt.text(5000, 3.1, r"$\theta_0 = 4.85$", fontsize=14, color="b")
plt.text(5000, 2.2, r"$\theta_1 = 4.91 \times 10^{-5}$", fontsize=14, color="b")
#save_fig('best_fit_model_plot')
plt.show()
print("OK")
```
%% Cell type:markdown id: tags:
## Ultra-brief Intro to the Fit-Predict Interface in Scikit-learn
OK, the important lines in the cells above are really just
```python
#Select a linear model
model = sklearn.linear_model.LinearRegression()
# Train the model
model.fit(X, y)
# Make a prediction for Cyprus
X_new = [[22587]] # Cyprus' GDP per capita
y_pred = model.predict(X_new)
print(y_pred) # outputs [[ 5.96242338]]
```
What happens here is that we create a model, called LinearRegression (for now just a 100% black-box method), put in our training data matrix $\mathbf{X}$ and the corresponding desired ground-truth training vector $\mathbf{y}$ (aka $\mathbf{y}_{true}$), and then train the model.
After training we extract a _predicted_ $\mathbf{y}_{pred}$ vector from the model, for some input scalar $x$=22587.
### Supervised Training via Fit-predict
The fit-predict (or train-predict) process on some data can be visualized as
<img src="https://itundervisning.ase.au.dk/GITMAL/L01/Figs/supervised_learning.png" alt="WARNING: you need to be logged into Blackboard to view images" style="height:250px">
In this figure the untrained model is a `sklearn.linear_model.LinearRegression` python object. When trained via `model.fit()`, using some known answers for the data, $\mathbf{y}_{true}~$, it becomes a blue-boxed trained model.
The trained model can be used to _predict_ values from new, yet-unseen, data, via the `model.predict()` function.
In other words, how high is life-satisfaction for Cyprus' GDP=22587 USD?
Just call `model.predict()` on a matrix with one single numerical element, 22587, well, not a matrix really, but a python list-of-lists, `[[22587]]`
```y_pred = model.predict([[22587]])```
Apparently 5.96, the model answers!
(you get used to the python built-in containers and numpy on the way..)
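The same fit-predict flow can be sketched on tiny synthetic data, so it runs without the `datasets` directory. The numbers below are made up (points on the line $y = 1 + 2x$) and have nothing to do with the life-satisfaction data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# synthetic training data on the line y = 1 + 2*x (an assumption for illustration)
X_train = np.array([[0.0], [1.0], [2.0], [3.0]])   # shape (4, 1): 4 samples, 1 feature
y_train = 1.0 + 2.0 * X_train                      # shape (4, 1)

model = LinearRegression()
model.fit(X_train, y_train)       # train on known (X, y) pairs

X_new = [[4.0]]                   # one new sample, as a list-of-lists
y_pred = model.predict(X_new)
print(y_pred.shape)               # (1, 1)
print(y_pred)
```

Note that both `fit` and `predict` expect a 2D input with one row per sample, which is why a single value is wrapped as `[[4.0]]`.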
%% Cell type:markdown id: tags:
### Qa) The $\theta$ parameters and the $R^2$ Score
Géron uses some $\theta$ parameters from this linear regression model in his examples and plots above.
How do you extract the $\theta_0$ and $\theta_1$ coefficients in his life-satisfaction figure from the linear regression model, via the model's python attributes?
Read the documentation for the linear regressor at
http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html
Extract the score=0.734 for the model using data (X,y) and explain what the $R^2$ score measures in broad terms
$$
\begin{array}{rcll}
R^2 &=& 1 - u/v\\
u &=& \sum (y_{true} - y_{pred}~)^2 ~~~&\small \mbox{residual sum of squares}\\
v &=& \sum (y_{true} - \mu_{true}~)^2 ~~~&\small \mbox{total sum of squares}
\end{array}
$$
with $y_{true}~$ being the true data, $y_{pred}~$ being the predicted data from the model and $\mu_{true}~$ being the true mean of the data.
What are the minimum and maximum values for $R^2~$?
Is it best to have a low $R^2$ score or a high $R^2$ score? That is, is $R^2$ a loss/cost function or a function that measures fitness/goodness?
NOTE$_1$: the $R^2$ is just one of many scoring functions used in ML, we will see plenty of other methods later.
NOTE$_2$: there are different definitions of the $R^2$, 'coefficient of determination', in statistics. We strictly use the formulation above.
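As a sanity check of the formulation above, here is a small sketch that computes $u$, $v$ and $R^2$ by hand and compares with `sklearn.metrics.r2_score`. The two vectors are made-up numbers, not the life-satisfaction data:

```python
import numpy as np
from sklearn.metrics import r2_score

# made-up true and predicted values, for illustration only
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])

u = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
v = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
r2_manual = 1 - u / v

print(r2_manual)
print(np.isclose(r2_manual, r2_score(y_true, y_pred)))   # True
```

Here $u = 0.10$ and $v = 5$, so the hand-computed value is $1 - 0.10/5 = 0.98$.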
OPTIONAL: Read the additional in-depth literature on $R^2~$:
> https://en.wikipedia.org/wiki/Coefficient_of_determination
%% Cell type:code id: tags:
``` python
# TODO: add your code here..
assert False, "TODO: solve Qa, and remove me.."
```
%% Cell type:markdown id: tags:
## The Merits of the Fit-Predict Interface
Now comes the really fun part: all supervised models in Scikit-learn have this fit-predict interface, and you can easily interchange models in your code just by instantiating a new and perhaps better ML model.
There are still a lot of per-model parameters to tune, but fortunately, the built-in default values provide you with a good initial guess for a good model setup.
Later on, you might want to go into the parameter details, trying to optimize some params (opening the lid of the black-box ML algo), but for now, we pretty much stick to the default values.
Let's try to replace the linear regression now, and test a _k-nearest neighbour algorithm_ instead (still black-boxed algorithm-wise)...
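A minimal sketch of what "interchangeable" means in practice, on synthetic data. The data-generating line and the query point `5.0` are assumptions made for this illustration only:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

# synthetic 1D regression data: a noisy straight line
rng = np.random.default_rng(42)
X_demo = rng.uniform(0, 10, size=(30, 1))
y_demo = 3.0 + 0.5 * X_demo[:, 0] + rng.normal(0, 0.3, size=30)

# two different models, driven by exactly the same fit/predict/score calls
models = [LinearRegression(), KNeighborsRegressor(n_neighbors=3)]
for m in models:
    m.fit(X_demo, y_demo)
    prediction = m.predict([[5.0]])
    print(type(m).__name__, prediction, m.score(X_demo, y_demo))
```

Only the line that instantiates the model changes; the loop body is identical for both estimators.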
### Qb) Using k-Nearest Neighbors
Change the linear regression model to a `sklearn.neighbors.KNeighborsRegressor` with k=3 (as in [HOML:p21,bottom]), and rerun the `fit` and `predict` using this new model.
What do the k-nearest neighbours estimate for Cyprus (it should yield 5.77), compared to the linear regression?
What _score-method_ does the k-nearest model use, and is it comparable to the linear regression model?
Seek out the documentation in Scikit-learn; if the scoring methods are not equal, can they be compared to each other at all then?
Remember to put pointers/text from the Scikit-learn documentation in the journal...(did you find the right kNN model etc.)
%% Cell type:code id: tags:
``` python
# this is our raw data set:
sample_data
```
%% Cell type:code id: tags:
``` python
# and this is our preprocessed data
country_stats
```
%% Cell type:code id: tags:
``` python
# Prepare the data
X = np.c_[country_stats["GDP per capita"]]
y = np.c_[country_stats["Life satisfaction"]]
print("X.shape=",X.shape)
print("y.shape=",y.shape)
# Visualize the data
country_stats.plot(kind='scatter', x="GDP per capita", y='Life satisfaction')
plt.show()
# Select and train a model
# TODO: add your code here..
assert False, "TODO: add your instantiation and training of the knn model here.."
# knn = ..
```
%% Cell type:markdown id: tags:
### Qc) Tuning Parameter for k-Nearest Neighbors and A Sanity Check
But that is not the full story. Try plotting the prediction for both models in the same graph and tune the `n_neighbors` parameter of the `KNeighborsRegressor` model.
Choosing `n_neighbors=1` produces a nice `score=1`, that seems optimal...but is it really so good?
Plotting the two models in a 'Life Satisfaction-vs-GDP capita' 2D plot, by creating an array in the range 0 to 60000 (USD) (the `M` matrix below) and then predicting the corresponding y values, will shed some light on this.
Now reusing the plot stubs below, try to explain why the k-nearest neighbour with `n_neighbors=1` has such a good score.
Does a score=1 with `n_neighbors=1` also mean that this would be the preferred estimator for the job?
Hint here is a similar plot of a KNN for a small set of different k's: Hint here is a similar plot of a KNN for a small set of different k's:
<img src="https://blackboard.au.dk/bbcswebdav/courses/BB-Cou-UUVA-91831/Fildeling/L01/Figs/regression_with_knn.png" alt="WARNING: you need to be logged into Blackboard to view images" style="height:150px"> <img src="https://itundervisning.ase.au.dk/GITMAL/L01/Figs/regression_with_knn.png" alt="WARNING: you need to be logged into Blackboard to view images" style="height:150px">
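As a starting point for the sanity check, here is a minimal sketch of how the training score can be compared for a few values of k. It assumes synthetic stand-in data (`X_demo`, `y_demo` and the loop variables are hypothetical names, not the course data set); note that in Scikit-learn the keyword argument of `KNeighborsRegressor` is spelled `n_neighbors`.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data (hypothetical, NOT the course data set):
# 20 distinct 'GDP' values and a noisy linear 'life satisfaction'
rng = np.random.default_rng(42)
X_demo = np.linspace(10000, 50000, 20).reshape(-1, 1)
y_demo = 5.0 + 4e-5 * X_demo.ravel() + rng.normal(0.0, 0.3, 20)

scores = {}
for k in (1, 3, 10):
    knn_demo = KNeighborsRegressor(n_neighbors=k)
    knn_demo.fit(X_demo, y_demo)
    # score() is evaluated on the SAME data the model was fitted on
    scores[k] = knn_demo.score(X_demo, y_demo)
    print(f"n_neighbors={k}: training score = {scores[k]:.3f}")
```

With k=1 and all training inputs distinct, each training point is its own nearest neighbour, so the model simply returns the stored target value. The score here is measured on the training data only, which says little about how the model behaves between or beyond the training points.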
%% Cell type:code id: tags:
``` python
sample_data.plot(kind='scatter', x="GDP per capita", y='Life satisfaction', figsize=(5,3))
plt.axis([0, 60000, 0, 10])
# create a test matrix M, with the same dimensionality as X, and in the range [0;60000]
# and a step size of your choice
m=np.linspace(0, 60000, 1000)
M=np.empty([m.shape[0],1])
M[:,0]=m
# from this test M data, predict the y values via the lin.reg. and k-nearest models
y_pred_lin = model.predict(M)
y_pred_knn = knn.predict(M) # ASSUMING the variable name 'knn' of your KNeighborsRegressor
# use plt.plot to plot x-y into the sample_data plot..
plt.plot(m, y_pred_lin, "r")
plt.plot(m, y_pred_knn, "b")
# TODO: add your code here..
assert False, "TODO: try knn with different n_neighbors params, that is re-instantiate knn, refit and replot.."
```
%% Cell type:markdown id: tags:
### Qd) Trying out a Neural Network
Let us then try a Neural Network on the data. Using the fit-predict interface allows us to plug a new model into our existing code.
There are a number of different NNs available; let's just hook into Scikit-learn's Multi-Layer Perceptron for regression, that is, an `MLPRegressor`.
Now, the data set for training the MLP is really not well scaled, so we need to tweak a lot of parameters in the MLP just to get it to produce some sensible output: without preprocessing and scaling of the input data, `X`, the MLP is really a bad choice of model for the job, since it so easily produces garbage output.
Try training the `mlp` regression model below, predict the value for Cyprus, and find the `score` value for the training set...just as we did for the linear and KNN models.
Can the `MLPRegressor` score function be compared with the linear and KNN scores?
%% Cell type:code id: tags:
``` python
from sklearn.neural_network import MLPRegressor
# Setup MLPRegressor, can be very tricky for the tiny-data
mlp = MLPRegressor( hidden_layer_sizes=(10,), solver='adam', activation='relu', tol=1E-5, max_iter=100000, verbose=True)
mlp.fit(X,y.ravel())
# let's make an MLP regressor prediction and redo the plots
y_pred_mlp = mlp.predict(M)
plt.plot(m, y_pred_lin, "r")
plt.plot(m, y_pred_knn, "b")
plt.plot(m, y_pred_mlp, "k")
# TODO: add your code here..
assert False, "TODO: predict value for Cyprus and fetch the score() from the fitting."
```
%% Cell type:markdown id: tags:
### [OPTIONAL] Qe) Neural Network with pre-scaling
Now, the neurons in neural networks normally expect input data in the range `[0;1]` or sometimes in the range `[-1;1]`, meaning that for values outside this range the output of the neuron will saturate to its min or max value (also typically `0` or `1`).
A concrete value of `X` is, say, 22,000 USD, which is far away from what the MLP expects. A fix to the problem in Qd) is to preprocess the data by scaling it down to something more sensible.
Try to scale X to a range of `[0;1]`, re-train the MLP, re-plot and find the new score from the rescaled input. Any better?
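One way to do the scaling is plain min-max scaling, sketched below with NumPy only. The data values and the names `X_demo`, `x_min`, `x_max` and `scale_like_training` are hypothetical and only serve the illustration; Scikit-learn's `MinMaxScaler` from `sklearn.preprocessing` is a ready-made alternative.

```python
import numpy as np

# Hypothetical GDP-per-capita values in USD (NOT the course data), shape (n, 1) like X
X_demo = np.array([[9000.0], [22000.0], [37000.0], [55000.0]])

# Min-max scaling: map the smallest value to 0 and the largest to 1
x_min = X_demo.min(axis=0)
x_max = X_demo.max(axis=0)
X_scaled = (X_demo - x_min) / (x_max - x_min)


def scale_like_training(x_new):
    """Scale new inputs with the min/max found on the training data."""
    return (np.asarray(x_new, dtype=float) - x_min) / (x_max - x_min)


print(X_scaled.ravel())
print(scale_like_training([[22000.0]]))
```

Whatever scaling is fitted on the training input must also be applied to every later input, such as the test matrix `M` or a single value to predict, before calling `predict`.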
%% Cell type:code id: tags:
``` python
# TODO: add your code here..
assert False, "TODO: try prescale data for the MLP...any better?"
```
%% Cell type:markdown id: tags:
REVISIONS||
---------||
2018-12-18|CEF, initial.
2019-01-24|CEF, spell checked and update.
2019-01-30|CEF, removed reset -f, did not work on all PC's.
2019-08-20|CEF, E19 ITMAL update.
2019-08-26|CEF, minor mod to NN exercise.
2019-08-28|CEF, fixed dataset dir issue, datapath"../datasets" changed to "./datasets".
2020-01-25|CEF, F20 ITMAL update.
2020-08-06|CEF, E20 ITMAL update, minor fix of ls to dir and added exception to datasets load, updated figs paths.
2020-09-24|CEF, updated text to R2, Qa exe.
2020-09-28|CEF, updated R2 and theta extraction, use python attributes, moved revision table. Added comment about MLP.
2021-01-12|CEF, updated Qe.
2021-02-08|CEF, added ls for Mac/Linux to dir command cell.
2021-08-02|CEF, update to E21 ITMAL.
%% Cell type:markdown id: tags:
# ITMAL Exercise
## Python Basics
### Modules and Packages in Python
Reuse of code in Jupyter notebooks can be done by including a raw Python source file via a magic command
```python
%load filename.py
```
but this just pastes the source into the notebook and creates all kinds of pain regarding code maintenance.
A better way is to use a Python __module__. A module (strictly speaking, a package) consists simply (and pythonically) of a directory with a module init file in it (possibly empty)
```python
libitmal/__init__.py
```
To this directory you can add modules in the form of plain Python files, say
```python
libitmal/utils.py
```
That's about it! The `libitmal` file tree should now look like
```
libitmal/
├── __init__.py
├── __pycache__
│   ├── __init__.cpython-36.pyc
│   └── utils.cpython-36.pyc
├── utils.py
```
with the cache part only being present once the module has been initialized.
You should now be able to use the `libitmal` unit via an import directive, like
```python
import numpy as np
from libitmal import utils as itmalutils
print(dir(itmalutils))
print(itmalutils.__file__)
X = np.array([[1,2],[3,-100]])
itmalutils.PrintMatrix(X,"mylabel=")
itmalutils.TestAll()
```
#### Qa Load and test the `libitmal` module
Try out the `libitmal` module from [GITMAL]. Load this module and run the function
```python
from libitmal import utils as itmalutils
itmalutils.TestAll()
```
from this module.
##### Implementation details
Note that there is a Python module ___include___ search path that you may have to investigate and modify. For my Linux setup, I have an export or declare statement in my .bashrc file, like
```bash
declare -x PYTHONPATH=~/ASE/ML/itmal:$PYTHONPATH
```
but your `itmal`, the [GITMAL] root dir, may be placed elsewhere.
For ___Windows___, you have to add `PYTHONPATH` to your user environment variables...see the screenshot below (enlarge it by modifying the image width tag, or find the original png in the Figs directory).
<img src="https://itundervisning.ase.au.dk/GITMAL/L01/Figs/Screenshot_windows_enviroment_variables.png" alt="WARNING: you need to be logged into Blackboard to view images" style="width:350px">
or if you, like me, hate setting things up in a GUI and prefer a console, try in a CMD on Windows
```bash
CMD> setx.exe PYTHONPATH "C:\Users\auXXYYZZ\itmal"
```
replacing the username and path with whatever you have. If everything fails, you could programmatically add the path to your libitmal directory as
```python
import sys,os
sys.path.append(os.path.expanduser('~/itmal'))
from libitmal import utils as itmalutils
print(dir(itmalutils))
print(itmalutils.__file__)
```
For the journal: remember to document your particular PATH setup.
%% Cell type:code id: tags:
``` python
# TODO: Qa...
```
%% Cell type:markdown id: tags:
#### Qb Create your own module, with some functions, and test it
Now create your own module, with some dummy functionality. Load it and run your dummy function in a Jupyter Notebook.
Keep this module at hand when coding, and try to capture reusable Python functions in it as you invent them!
For the journal: remember to document your particular library setup (where did you place files, etc).
%% Cell type:code id: tags:
``` python
# TODO: Qb...
```
%% Cell type:markdown id: tags:
#### Qc How do you 'recompile' a module?
When you change the module code, Jupyter will keep running the old version of the module. How do you force the Jupyter notebook to reload the module changes?
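The effect can be reproduced in a small, self-contained experiment, sketched here with the standard-library function `importlib.reload`. The module name `itmal_reload_demo` and the temporary directory are hypothetical and only serve the demonstration; in a notebook, the IPython `autoreload` extension (`%load_ext autoreload` followed by `%autoreload 2`) is another option worth investigating.

```python
import importlib
import pathlib
import sys
import tempfile

# Write a tiny throw-away module to a temporary directory (hypothetical name)
tmpdir = tempfile.mkdtemp()
module_path = pathlib.Path(tmpdir) / "itmal_reload_demo.py"
module_path.write_text("def answer():\n    return 1\n")

# Make the directory importable and start from a fresh import if re-run
sys.path.insert(0, tmpdir)
sys.modules.pop("itmal_reload_demo", None)
importlib.invalidate_caches()
import itmal_reload_demo

first = itmal_reload_demo.answer()

# 'Edit' the source file on disk: the already imported module is unchanged
module_path.write_text("def answer():\n    return 4242\n")
before_reload = itmal_reload_demo.answer()

# Re-execute the module source in place to pick up the change
importlib.reload(itmal_reload_demo)
after_reload = itmal_reload_demo.answer()

print(first, before_reload, after_reload)
```

Note that names imported with `from module import name` before the reload still point to the old objects, so accessing functions through the module object, as above, is the more robust pattern.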
%% Cell type:code id: tags:
``` python
# TODO: Qc...
```
%% Cell type:markdown id: tags:
#### [OPTIONAL] Qd Write a Howto on Python Modules and Packages
Write a short description of how to use modules in Python (notes on the module path, import directives, directory structure, etc.)
%% Cell type:code id: tags:
``` python
# TODO: Qd...
```
%% Cell type:markdown id: tags:
### Classes in Python
Good news: Python has classes. Bad news: they are somewhat obscure compared to C++ classes.
Though we will not use object-oriented programming in Python intensively, we still need some basic understanding of Python classes. Let's just dig into a class demo; here is `MyClass` in Python
```python
class MyClass:
    myvar = "blah"

    def myfun(self):
        print("This is a message inside the class.")

myobjectx = MyClass()
```
NOTE: The following exercise assumes some C++ knowledge, in particular the OPRG and OOP courses. If you are an EE-student, then ignore the cryptic C++ comments and jump directly to some Python code instead. It's the Python solution here that is important!
#### Qe Extend the class with some public and private functions and member variables
How are private functions and member variables represented in Python classes?
What is the meaning of `self` in Python classes?
What happens to a function inside a class if you forget `self` in the parameter list, like `def myfun():` instead of `def myfun(self):`, and you try to call it like `myobjectx.myfun()`? Remember to document the demo code and result.
[OPTIONAL] What do 'class variables' and 'instance variables' in Python correspond to in C++? Maybe you can figure it out; I did not really get it from reading, say, this tutorial
> https://www.digitalocean.com/community/tutorials/understanding-class-and-instance-variables-in-python-3
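A small experiment can make these questions concrete. The sketch below is not a model answer; the class `Demo` and all of its member names are hypothetical and chosen only to show the underscore naming conventions and the effect of a missing `self`.

```python
class Demo:
    shared = "blah"                  # class variable: one copy shared by all instances

    def set_value(self, value):
        self.value = value           # instance variable: stored on this object only
        self._internal = value * 2   # single underscore: 'internal' by convention only
        self.__hidden = value * 3    # double underscore: name-mangled to _Demo__hidden

    def get_hidden(self):
        return self.__hidden         # inside the class the short name works

    def no_self():                   # 'self' forgotten on purpose
        return "unreachable via an instance"


d = Demo()
d.set_value(7)

hidden_via_method = d.get_hidden()
has_plain_name = hasattr(d, "__hidden")       # the unmangled name does not exist
hidden_via_mangled_name = d._Demo__hidden     # still reachable, so not truly private

try:
    d.no_self()                               # Python passes d as the first argument
    error_message = ""
except TypeError as err:
    error_message = str(err)

print(hidden_via_method, has_plain_name, hidden_via_mangled_name)
print(error_message)
```

The call `d.no_self()` fails because the instance is always passed implicitly as the first positional argument, and the function has no parameter to receive it.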
%% Cell type:code id: tags:
``` python
# TODO: Qe...
```
%% Cell type:markdown id: tags:
#### Qf Extend the class with a Constructor
Figure out a way to declare/define a constructor (CTOR) in a Python class. How is it done in Python?
Is there a class destructor (DTOR) in Python? Give a textual reason why/why not Python has a DTOR.
Hint: Python is garbage collected, like C#, and do not go into the details of the `__del__`, `__enter__`, `__exit__` functions...unless you find it irresistible to investigate.
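As a pointer for the constructor part, here is a minimal sketch using the special method `__init__`, which Python calls right after a new instance has been created. The class `Account` and its members are hypothetical names used only for illustration.

```python
class Account:
    def __init__(self, owner, balance=0.0):
        # Runs automatically on Account(...); sets up the instance variables
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance


acc = Account("Ada", balance=10.0)   # arguments are forwarded to __init__
new_balance = acc.deposit(5.0)
default_acc = Account("Bob")         # balance falls back to the default value

print(acc.owner, new_balance, default_acc.balance)
```

Python has no constructor overloading in the C++ sense; default and keyword arguments, as used for `balance` here, are the usual substitute.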
%% Cell type:code id: tags:
``` python
# TODO: Qf...
```
%% Cell type:markdown id: tags:
#### Qg Extend the class with a to-string function
Then find a way to serialize a class, that is, to make some `tostring()` functionality similar to a C++
```C++
friend ostream& operator<<(ostream& s,const MyClass& x)
{
    return s << ..
}
```
If you do not know C++, you might be aware of the C# way to string serialize
```
string s=myobject.tostring()
```
that is, a per-class built-in function `tostring()`. Now, what is the pythonic way of 'printing' a class instance?
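One common approach, sketched below, is to define the special methods `__str__` and `__repr__`, which `print()`, `str()`, f-strings and `repr()` look up on the object. The class `Sample` and its fields are hypothetical and only serve the illustration.

```python
class Sample:
    def __init__(self, name, value):
        self.name = name
        self.value = value

    def __str__(self):
        # Human-readable text, used by print(), str() and f-strings
        return f"Sample(name={self.name}, value={self.value})"

    def __repr__(self):
        # Unambiguous text, used by repr() and when echoing an object in a notebook cell
        return f"Sample({self.name!r}, {self.value!r})"


obj = Sample("demo", 42)
as_str = str(obj)
as_repr = repr(obj)
in_fstring = f"{obj}"

print(obj)
print(as_repr)
```

If only `__repr__` is defined, `str()` and `print()` fall back to it, so defining `__repr__` first is a reasonable default.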
%% Cell type:code id: tags:
``` python
# TODO: Qg...
```
%% Cell type:markdown id: tags:
#### [OPTIONAL] Qh Write a Howto on Python Classes
Write a _How-To use Classes Pythonically_, including a description of public/privacy, constructors/destructors, the meaning of `self`, and inheritance.
%% Cell type:code id: tags:
``` python
# TODO: Qh...
```
%% Cell type:markdown id: tags:
## Administration
REVISIONS||
---------||
2018-12-19| CEF, initial.
2018-02-06| CEF, updated and spell checked.
2018-02-07| CEF, made Qh optional.
2018-02-08| CEF, added PYTHONPATH for windows.
2018-02-12| CEF, small mod in itmalutils/utils.
2019-08-20| CEF, E19 ITMAL update.
2020-01-25| CEF, F20 ITMAL update.
2020-08-06| CEF, E20 ITMAL update, updated figs paths.
2020-09-07| CEF, added text on OPRG and OOP for EE's
2020-09-29| CEF, added elaboration for journal in Qa+b.
2021-02-06| CEF, fixed itmalutils.TestAll() in markdown cell.
2021-08-02| CEF, update to E21 ITMAL.