This example demonstrates the problems of underfitting and overfitting and
how we can use linear regression with polynomial features to approximate
nonlinear functions.

The plot below shows the function that we want to approximate,
which is a part of the cosine function. In addition, the samples from the
real function and the approximations of different models are displayed. The
models have polynomial features of different degrees.

We can see that a linear function (a polynomial of degree 1) is not sufficient to fit the
training samples. This is called **underfitting**.
A polynomial of degree 4 approximates the true function almost perfectly. However, for higher degrees the model will **overfit** the training data, i.e. it learns the noise of the
training data.

We evaluate **overfitting**/**underfitting** quantitatively by using
cross-validation. We calculate the mean squared error (MSE) on the validation
set: the higher the MSE, the less likely it is that the model generalizes
correctly from the training data.
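
As a rough illustration of the setup described above, the following sketch fits polynomial models of degree 1, 4 and 15 to noisy cosine samples and estimates the validation MSE with 10-fold cross-validation. The sample size, noise level, seed and the variable names (`true_fun`, `mean_mse`) are assumptions chosen for this sketch; it is not necessarily identical to the code under review.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures


def true_fun(x):
    # The target function: a part of a cosine curve on [0, 1].
    return np.cos(1.5 * np.pi * x)


rng = np.random.RandomState(0)  # assumed seed
n_samples = 30                  # assumed sample size
x = np.sort(rng.rand(n_samples))
y = true_fun(x) + 0.1 * rng.randn(n_samples)  # assumed noise level
X = x[:, np.newaxis]  # scikit-learn expects a 2-D feature matrix

mean_mse = {}
for degree in (1, 4, 15):
    # Polynomial features followed by ordinary least squares:
    # the model is still linear in its parameters.
    model = make_pipeline(
        PolynomialFeatures(degree=degree, include_bias=False),
        LinearRegression(),
    )
    # The scorer returns -MSE, so negate it to get the MSE back.
    scores = cross_val_score(model, X, y, scoring="neg_mean_squared_error", cv=10)
    mean_mse[degree] = -scores.mean()
    print(f"degree {degree:2d}: CV MSE = {mean_mse[degree]:.2e}")
```

With these assumptions the degree 4 model should get the lowest cross-validated MSE of the three, while degree 1 and degree 15 both do worse, for opposite reasons.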

### Qa) Explain the polynomial fitting via code review

Review the code below, write a __short__ code review summary, and explain how the polynomial fitting is implemented.

NOTE: Do not dig into the plotting details (they are unimportant compared to the rest of the code); just explain the outcome of the plots.

%% Cell type:code id: tags:

``` python

# TODO: code review
#assert False, "TODO: remove me, and review this code"

# NOTE: code from https://scikit-learn.org/stable/auto_examples/model_selection/plot_underfitting_overfitting.html

# Summarize the cross-validation scores; `scores` and `plt` are expected
# to be defined by the code from the example referenced above.
print(f" CV sub-scores: mean = {scores.mean():.2}, std = {scores.std():.2}")
for i in range(len(scores)):
    print(f" CV fold {i} => score = {scores[i]:.2}")

plt.show()
print('OK')

```

%% Cell type:code id: tags:

``` python

# TODO: code review..
assert False, "TODO: review in text"

```

%% Cell type:markdown id: tags:

### Qb) Explain the capacity and under/overfitting concept

Write a textual description of the capacity and under/overfitting concept using the plots in the code above.

What happens when the polynomial degree is low/medium/high with respect to the under/overfitting concepts? Explain in detail.
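
One way to explore the capacity question numerically is to compare the error on the training samples with the error on fresh samples from the same function. The sketch below is a minimal illustration using NumPy only; the data generation, the seed and the names `train_mse` and `new_mse` are assumptions made for this example and are not taken from the code above.

```python
import numpy as np
from numpy.polynomial import Polynomial


def true_fun(x):
    # Same kind of target as in the example: part of a cosine curve.
    return np.cos(1.5 * np.pi * x)


rng = np.random.default_rng(0)  # assumed seed
x_train = rng.uniform(0.0, 1.0, size=30)
y_train = true_fun(x_train) + rng.normal(scale=0.1, size=30)
# Fresh samples that were never used for fitting.
x_new = rng.uniform(0.0, 1.0, size=200)
y_new = true_fun(x_new) + rng.normal(scale=0.1, size=200)

train_mse, new_mse = {}, {}
for degree in (1, 4, 15):
    # Least-squares polynomial fit; higher degree means higher capacity.
    poly = Polynomial.fit(x_train, y_train, deg=degree)
    train_mse[degree] = np.mean((poly(x_train) - y_train) ** 2)
    new_mse[degree] = np.mean((poly(x_new) - y_new) ** 2)
    print(
        f"degree {degree:2d}: train MSE = {train_mse[degree]:.3e}, "
        f"new-data MSE = {new_mse[degree]:.3e}"
    )
```

The training error can only go down as the degree grows, because each higher-degree model contains the lower-degree ones. The error on new data is the quantity that reveals whether the extra capacity helped or merely fitted the noise.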

%% Cell type:code id: tags:

``` python

# TODO: plot explanations..
assert False, "TODO: answer...in text"

```

%% Cell type:markdown id: tags:

### Qc) Score method

Why is the scoring method called `neg_mean_squared_error` in the code?

The $MSE$ is a well-known cost function $J$. Explain how it can conceptually move from being a cost function to being a score function.

What happens if you try to set it to `mean_squared_error`, i.e. does it work or does it raise an exception, ala

Remember to document the outcome for your journal.

What are the theoretical minimum and maximum score values (remember that the score range was $[-\infty;1]$ for the $r^2$ score)? Why does the Degree 15 model have a `Score(-MSE) = -1.8E8`? And why is this by no means the best model?
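
To check the sign convention behind `neg_mean_squared_error` empirically, the following sketch compares the values returned by `cross_val_score` with an MSE computed by hand on the same folds. The toy data, the seed and the names `scores` and `manual_mse` are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold, cross_val_score

# Toy data (assumed): a noisy straight line.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(40, 1))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=40)

cv = KFold(n_splits=5)  # deterministic folds, no shuffling

# Scores as reported by scikit-learn: higher is better.
scores = cross_val_score(
    LinearRegression(), X, y, scoring="neg_mean_squared_error", cv=cv
)

# The same folds evaluated by hand with the plain MSE: lower is better.
manual_mse = []
for train_idx, test_idx in cv.split(X):
    fitted = LinearRegression().fit(X[train_idx], y[train_idx])
    y_pred = fitted.predict(X[test_idx])
    manual_mse.append(mean_squared_error(y[test_idx], y_pred))
manual_mse = np.array(manual_mse)

print("scores from cross_val_score:", scores)
print("negated manual MSE:         ", -manual_mse)
```

The two printed arrays should agree, which suggests that the scorer is simply the MSE with its sign flipped so that it fits a "greater is better" convention.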