5. Example Multiple Linear Regression¶
Different methods used to demonstrate Multiple Linear Regression:
Ordinary Least Squares
- Python from scratch
- Scikit
Gradient Descent
- Python from scratch
- Scikit
5.1. Ordinary Least Squares¶
Loading the Boston house-price dataset from sklearn.datasets:

from sklearn import datasets

data = datasets.load_boston()
Define the data predictors and the target data:

import pandas as pd

# Define the data/predictors as the pre-set feature names
df = pd.DataFrame(data.data, columns=data.feature_names)
# Put the target (housing value -- MEDV) in another DataFrame
target = pd.DataFrame(data.target, columns=["MEDV"])
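The from-scratch code that follows works on plain NumPy arrays X and y rather than on the DataFrames themselves. A minimal sketch of pulling the arrays out, illustrated with a tiny stand-in frame (since load_boston has been removed from recent scikit-learn releases, the values here are made up for the sketch):

```python
import numpy as np
import pandas as pd

# Tiny stand-in for the df/target DataFrames built above
df = pd.DataFrame({"CRIM": [0.1, 0.2, 0.3], "RM": [6.5, 6.0, 7.0]})
target = pd.DataFrame({"MEDV": [24.0, 21.6, 34.7]})

X = df.to_numpy()               # (n_samples, n_features) design matrix
y = target["MEDV"].to_numpy()   # response vector

print(X.shape, y.shape)  # (3, 2) (3,)
```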
5.1.1. Python¶
Using the Ordinary Least Squares method derived in the previous section:

import numpy as np

Xt = np.transpose(X)
XtX = np.dot(Xt, X)
Xty = np.dot(Xt, y)
coef_ = np.linalg.solve(XtX, Xty)
Set of coefficients as calculated:
[ -9.28965170e-02 4.87149552e-02 -4.05997958e-03 2.85399882e+00
-2.86843637e+00 5.92814778e+00 -7.26933458e-03 -9.68514157e-01
1.71151128e-01 -9.39621540e-03 -3.92190926e-01 1.49056102e-02
-4.16304471e-01]
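Equivalently, NumPy can solve the least-squares problem directly with np.linalg.lstsq, which avoids forming \(X^TX\) explicitly and is better conditioned when features are nearly collinear. A minimal sketch on synthetic data (the Boston arrays would plug in the same way):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))           # 100 samples, 3 features
true_coef = np.array([1.5, -2.0, 0.5])
y = X @ true_coef                        # noise-free response for the sketch

# Normal equations, exactly as in the snippet above
XtX = X.T @ X
Xty = X.T @ y
coef_normal = np.linalg.solve(XtX, Xty)

# Direct least-squares solve: no explicit X^T X
coef_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(coef_normal, true_coef))  # True
print(np.allclose(coef_lstsq, coef_normal))  # True
```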
5.1.2. Scikit¶
Let's define our regression model:

from sklearn import linear_model

model = linear_model.LinearRegression(fit_intercept=False)
Note

We are using the same linear_model as in our simple linear regression method. Also, fit_intercept has been set to False. This is just to validate our derivation in the previous section. fit_intercept is set to True by default; it will then not assume that our response \(y\) is centered and will give a model.intercept_ value.
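To illustrate the note, a small sketch on synthetic data with a known intercept (values here are made up for the illustration): with fit_intercept=False the intercept is fixed at zero, while the default fit_intercept=True estimates it.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 3.0 + X @ np.array([1.0, -1.0])   # response with a true intercept of 3

no_intercept = LinearRegression(fit_intercept=False).fit(X, y)
with_intercept = LinearRegression(fit_intercept=True).fit(X, y)

print(no_intercept.intercept_)               # 0.0 -- nothing is estimated
print(round(with_intercept.intercept_, 6))   # 3.0
```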
Fitting our model:

model = model.fit(X, y)

The fitted coefficients (model.coef_) match those calculated from scratch:

[ -9.28965170e-02 4.87149552e-02 -4.05997958e-03 2.85399882e+00
-2.86843637e+00 5.92814778e+00 -7.26933458e-03 -9.68514157e-01
1.71151128e-01 -9.39621540e-03 -3.92190926e-01 1.49056102e-02
-4.16304471e-01]
Evaluating our model:

from sklearn.metrics import mean_squared_error, r2_score

y_predicted = model.predict(X)
print("Mean squared error: %.2f" % mean_squared_error(y, y_predicted))
print('R²: %.2f' % r2_score(y, y_predicted))
Mean squared error: 24.17
R²: 0.71
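Both metrics can also be computed by hand from their definitions, which is a useful sanity check on the scikit-learn values. A minimal sketch on toy values (the Boston predictions would plug in the same way):

```python
import numpy as np

# Toy response and predictions for the sketch
y = np.array([3.0, -0.5, 2.0, 7.0])
y_predicted = np.array([2.5, 0.0, 2.0, 8.0])

mse = np.mean((y - y_predicted) ** 2)      # mean squared error
ss_res = np.sum((y - y_predicted) ** 2)    # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)       # total sum of squares
r2 = 1.0 - ss_res / ss_tot                 # coefficient of determination

print("Mean squared error: %.3f" % mse)  # 0.375
print("R²: %.3f" % r2)                   # 0.949
```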