To predict the laptop’s price we’ll use linear regression, a model for predicting a numerical target.
We already have a dataset that contains a matrix $X$ of features and a vector $y$ with the values we want to predict. Recall the general form that represents a supervised machine learning model:
$$ y \thickapprox g(X) $$
To make the next concepts easier to follow, we take a single record, which we represent with $x_i$, and its target $y_i$, such that:
$$ y_i \thickapprox g(x_i) $$
Where $x_i = (x_{i1}, x_{i2}, x_{i3}, ..., x_{in})$ is a single record that has $n$ features.
There are many ways the function $g$ could look, but to do justice to the name, we’ll establish that $g$ is a linear relation between the feature vector $x_i$ and the target $y_i$. It has the following form:
$$ y_i = g(x_i) = g(x_{i1}, x_{i2}, x_{i3}, ..., x_{in}) = w_0 + x_{i1}w_1 + x_{i2}w_2 + x_{i3}w_3 + ... + x_{in}w_n $$
Where $w_0, w_1, w_2, ..., w_n$ are the variables that establish the relation between the features and the target. These variables are normally called the weights. In particular, $w_0$ is the bias term.
Let’s use a shorter notation:
$$ y_i = w_0 + \sum_{k=1}^{n} x_{ik}w_k $$
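As a quick sanity check of this formula, here is a small worked example with made-up numbers (three features; the weights and feature values are purely illustrative and are not learned from the laptop data):

$$ x_i = (16, 15.6, 512), \qquad w = (50, 20, 30, 0.5) $$

$$ y_i = 50 + 16 \cdot 20 + 15.6 \cdot 30 + 512 \cdot 0.5 = 50 + 320 + 468 + 256 = 1094 $$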
Based on this relationship, we can implement a prediction using linear regression, which in Python code would look like this:
```python
def linear_regression(X, w):
    # w[0] is the bias term; the remaining entries are the per-feature weights
    return w[0] + X.dot(w[1:])
```
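A quick usage sketch, reusing the function above (the records and weights are the same illustrative numbers as before; in practice $X$ would be built from the laptop dataset):

```python
import numpy as np

# Two hypothetical records with 3 features each (placeholder values).
X = np.array([
    [16, 15.6, 512],
    [8, 13.3, 256],
])

# Bias followed by one weight per feature (placeholder values).
w = np.array([50.0, 20.0, 30.0, 0.5])

print(linear_regression(X, w))  # -> [1094.  737.]
```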
But how do we find these weights? Easy: we learn them from the data. Writing the relationship for all records at once, in matrix form, we have the following expression:
$$ Xw = y $$
To find the vector $w$ we would like to invert $X$, but $X$ is generally not a square matrix, so it has no inverse. A way around this is to multiply both sides by $X^T$:
$$ X^TXw = X^Ty $$
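The matrix $X^TX$ is square, so when it is invertible the system above can be solved for $w$ with standard linear-algebra routines. Below is a minimal numpy sketch, assuming $X$ does not yet contain the column of ones for the bias term; the function name `train_linear_regression` is just a placeholder:

```python
import numpy as np

def train_linear_regression(X, y):
    # Prepend a column of ones so the bias w_0 is learned together
    # with the per-feature weights.
    ones = np.ones((X.shape[0], 1))
    X = np.column_stack([ones, X])

    # Solve the system (X^T X) w = X^T y for w.
    XTX = X.T.dot(X)
    w = np.linalg.solve(XTX, X.T.dot(y))

    return w[0], w[1:]  # bias term and feature weights
```

The returned bias and weights can then be plugged directly into the `linear_regression` function defined above.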