In this exercise, we need to explain all the important overall concepts in training. Let's begin with figure 5.3 from _Deep Learning_, Ian Goodfellow et al. [DL], which pretty much sums it all up:
<img src="https://itundervisning.ase.au.dk/GITMAL/L07/Figs/dl_generalization_error.png" alt="WARNING: you need to be logged into Blackboard to view images" style="height:500px">
### Qa) On Generalization Error
Write a detailed description of figure 5.3 (above) for your hand-in.
All concepts in the figure must be explained:
* training/generalization error,
* underfit/overfit zone,
* optimal capacity,
* generalization gap,
* and the two axes: x/capacity, y/error.
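%% Cell type:markdown id: tags:
To build intuition for the figure, the curves can be reproduced numerically. The sketch below (a stand-alone illustration, not part of the exercise code; the synthetic quadratic dataset and the degree range are assumptions) sweeps the model capacity, here the polynomial degree, and records the training and validation errors: the training error keeps falling, while the validation error follows the U-shape from figure 5.3, and the generalization gap widens in the overfit zone.
%% Cell type:code id: tags:
``` python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# synthetic quadratic data with noise (an assumption for this sketch)
rng = np.random.RandomState(42)
X = 6 * rng.rand(100, 1) - 3
y = 0.5 * X[:, 0]**2 + X[:, 0] + 2 + rng.randn(100)

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.5, random_state=42)

train_err, val_err = [], []
degrees = range(1, 16)                       # capacity = polynomial degree
for d in degrees:
    model = make_pipeline(PolynomialFeatures(degree=d), LinearRegression())
    model.fit(X_train, y_train)
    train_err.append(mean_squared_error(y_train, model.predict(X_train)))
    val_err.append(mean_squared_error(y_val, model.predict(X_val)))

# generalization gap = validation (generalization) error minus training error
gap = np.array(val_err) - np.array(train_err)
```
Plotting `train_err` and `val_err` against `degrees` gives a capacity/error plot directly comparable to figure 5.3.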
%% Cell type:code id: tags:
``` python
# TODO: ...in text
assert False, "TODO: write some text.."
```
%% Cell type:markdown id: tags:
### Qb) An MSE-Epoch/Error Plot
Next, we look at an SGD model for fitting a polynomial, that is, _polynomial regression_ similar to what Géron describes in [HOML] ("Polynomial Regression" + "Learning Curves").
Review the code below, which plots the RMSE vs. the iteration number or epoch (three cells, parts I/II/III).
Write a short description of the code, and comment on the important points in the generation of the (R)MSE array.
The training phase outputs lots of lines like
> `epoch= 104, mse_train=1.50, mse_val=2.37` <br>
> `epoch= 105, mse_train=1.49, mse_val=2.35`
What is an ___epoch___ and what is `mse_train` and `mse_val`?
NOTE$_1$: the generalization plot in figure 5.3 in [DL] (above) and the plots below have different x-axes, and are not to be compared directly!
NOTE$_2$: notice that a degree-90 polynomial is used for the polynomial regression. This is just to produce a model with an extremely high capacity.
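%% Cell type:markdown id: tags:
Before reviewing the actual code, the epoch loop behind log lines like the ones above can be sketched as follows (a minimal stand-alone sketch; the synthetic dataset is an assumption, and the `SGDRegressor` settings loosely follow the [GITHOML] notebook). The key trick is `max_iter=1` plus `warm_start=True`, so each `.fit()` call runs exactly one epoch, i.e. one full pass over the training set, continuing from the previous weights:
%% Cell type:code id: tags:
``` python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# synthetic quadratic data with noise (an assumption for this sketch)
rng = np.random.RandomState(42)
X = 6 * rng.rand(100, 1) - 3
y = 0.5 * X[:, 0]**2 + X[:, 0] + 2 + rng.randn(100)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.5, random_state=42)

# degree-90 feature expansion = extremely high capacity; SGD needs scaled inputs
poly = PolynomialFeatures(degree=90, include_bias=False)
scaler = StandardScaler()
X_train_p = scaler.fit_transform(poly.fit_transform(X_train))
X_val_p = scaler.transform(poly.transform(X_val))

# max_iter=1 + warm_start=True: one epoch per .fit() call
sgd = SGDRegressor(max_iter=1, tol=None, warm_start=True,
                   learning_rate="constant", eta0=0.0005, random_state=42)

mse_train, mse_val = [], []
for epoch in range(200):
    sgd.fit(X_train_p, y_train)
    mse_train.append(mean_squared_error(y_train, sgd.predict(X_train_p)))  # error on the data trained on
    mse_val.append(mean_squared_error(y_val, sgd.predict(X_val_p)))        # error on held-out data
```
Here `mse_train` measures the error on the training split and `mse_val` the error on the held-out validation split, recorded once per epoch.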
%% Cell type:code id: tags:
``` python
# Run code: Qb(part I)
# NOTE: modified code from [GITHOML], 04_training_linear_models.ipynb
```
%% Cell type:markdown id: tags:
### Qc) Early Stopping
How would you implement ___early stopping___ in the code above?
Write an explanation of the early stopping concept...that is, just write some pseudo-code that 'implements' early stopping.
OPTIONAL: also implement your early stopping pseudo-code in Python, and get it to work with the code above (and not just by flipping the hyperparameter `early_stopping=True` on the `SGDRegressor`).
%% Cell type:code id: tags:
``` python
# TODO: early stopping..
assert False, "TODO: explain early stopping"
```
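%% Cell type:markdown id: tags:
One common way to realize early stopping (a minimal stand-alone sketch, not necessarily the intended solution; the synthetic dataset, the `patience` scheme, and all hyperparameters are assumptions) is to track the validation error once per epoch, keep a snapshot of the best model seen so far, and stop when the validation error has not improved for `patience` epochs:
%% Cell type:code id: tags:
``` python
import copy

import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# synthetic quadratic data with noise (an assumption for this sketch)
rng = np.random.RandomState(42)
X = 6 * rng.rand(100, 1) - 3
y = 0.5 * X[:, 0]**2 + X[:, 0] + 2 + rng.randn(100)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.5, random_state=42)

sgd = SGDRegressor(max_iter=1, tol=None, warm_start=True,
                   learning_rate="constant", eta0=0.0005, random_state=42)

best_val, best_model = float("inf"), None
patience, epochs_without_improvement = 20, 0

for epoch in range(500):
    sgd.fit(X_train, y_train)            # one extra epoch (warm_start keeps the weights)
    val_mse = mean_squared_error(y_val, sgd.predict(X_val))
    if val_mse < best_val:               # validation error improved: snapshot this model
        best_val, best_model = val_mse, copy.deepcopy(sgd)
        epochs_without_improvement = 0
    else:
        epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            break                        # no improvement for `patience` epochs -> stop early
```
In the exercise the same loop would wrap the degree-90 polynomial pipeline from Qb; after the loop, `best_model` is the model at the bottom of the validation-error curve, not the (possibly overfitted) model from the last epoch.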
%% Cell type:markdown id: tags:
### Qd) Explain the Polynomial RMSE-Capacity plot
Now we revisit the concepts from the `capacity_under_overfitting.ipynb` notebook and the polynomial fitting with a given capacity (polynomial degree).
Peek into the cell below (code similar to what we saw in `capacity_under_overfitting.ipynb`) and explain the generated RMSE-Capacity plot. Why does the _training error_ keep dropping, while the _CV error_ drops until around capacity 3 and then begins to rise again?
What does the x-axis _Capacity_ and y-axis _RMSE_ represent?
Try increasing the model capacity. What happens when you do plots for `degrees` larger than around 10? Relate this to what you found via Qa+b in `capacity_under_overfitting.ipynb`.
%% Cell type:code id: tags:
``` python
# Run and review this code
# NOTE: modified code from [GITHOML], 04_training_linear_models.ipynb