5. Example: Multiple Linear Regression

Different methods are used to demonstrate Multiple Linear Regression:

  • Ordinary Least Squares

    • Python from scratch
    • Scikit
  • Gradient Descent

    • Python from scratch
    • Scikit

5.1. Ordinary Least Squares

Loading the Boston house-price dataset from sklearn.datasets:

from sklearn import datasets

data = datasets.load_boston()
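
Note that load_boston was deprecated in scikit-learn 1.0 and removed in 1.2. On newer versions the same data can be loaded from its original source, as suggested by the scikit-learn deprecation notice; a minimal sketch (this yields plain arrays rather than a Bunch object):

import numpy as np
import pandas as pd

# The raw file stores each record across two physical lines,
# hence the interleaved slicing below.
data_url = "http://lib.stat.cmu.edu/datasets/boston"
raw_df = pd.read_csv(data_url, sep=r"\s+", skiprows=22, header=None)
data = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])
target = raw_df.values[1::2, 2]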

Define the predictors and the target:

import pandas as pd

# Define the data/predictors as the pre-set feature names
df = pd.DataFrame(data.data, columns=data.feature_names)

# Put the target (housing value -- MEDV) in another DataFrame
target = pd.DataFrame(data.target, columns=["MEDV"])
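
The snippets below operate on a design matrix X and a response vector y, whose construction is not shown in the excerpted listing; a minimal sketch of the assumed setup:

# Assumed setup (not shown in the original listing): all 13 features
# form the design matrix, MEDV is the response.
X = df.values
y = target["MEDV"].values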

5.1.1. Python

Using the Ordinary Least Squares solution \(\hat{\beta} = (X^{\top}X)^{-1}X^{\top}y\) derived in the previous section:

import numpy as np

# Normal equations: solve (X^T X) beta = (X^T y) for the coefficients
Xt = np.transpose(X)
XtX = np.dot(Xt, X)
Xty = np.dot(Xt, y)
coef_ = np.linalg.solve(XtX, Xty)

The set of coefficients as calculated:

[   -9.28965170e-02  4.87149552e-02 -4.05997958e-03  2.85399882e+00
    -2.86843637e+00  5.92814778e+00 -7.26933458e-03 -9.68514157e-01
    1.71151128e-01 -9.39621540e-03 -3.92190926e-01  1.49056102e-02
    -4.16304471e-01]
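
As a sanity check, the same coefficients can be obtained from NumPy's built-in least-squares solver, which is also more numerically stable than forming \(X^{\top}X\) explicitly; a minimal sketch:

# np.linalg.lstsq solves the least-squares problem via SVD, avoiding
# the explicit (and potentially ill-conditioned) X^T X product.
coef_lstsq, residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)
print(coef_lstsq)  # matches coef_ above to numerical precision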

5.1.2. Scikit

Let's define our regression model:

from sklearn import linear_model

model = linear_model.LinearRegression(fit_intercept=False)

Note

We are using the same linear_model module as in our simple linear regression method. Also, fit_intercept has been set to False; this is just to validate our derivation in the previous section. fit_intercept defaults to True, in which case the model does not assume that our response \(y\) is centered and will compute a model.intercept_ value.
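
For comparison, a minimal sketch of the default behaviour with fit_intercept=True (the variable name is illustrative):

# With the default fit_intercept=True, scikit-learn centers the data
# internally and reports the fitted intercept separately.
model_default = linear_model.LinearRegression()  # fit_intercept=True
model_default.fit(X, y)
print(model_default.intercept_)  # fitted intercept term
print(model_default.coef_)       # one coefficient per feature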

Fitting our model and printing the fitted coefficients, which match the from-scratch result above:

model = model.fit(X, y)
print(model.coef_)
[   -9.28965170e-02  4.87149552e-02 -4.05997958e-03  2.85399882e+00
    -2.86843637e+00  5.92814778e+00 -7.26933458e-03 -9.68514157e-01
    1.71151128e-01 -9.39621540e-03 -3.92190926e-01  1.49056102e-02
    -4.16304471e-01]

Evaluating our model:

from sklearn.metrics import mean_squared_error, r2_score

y_predicted = model.predict(X)
print("Mean squared error: %.2f" % mean_squared_error(y, y_predicted))
print('R²: %.2f' % r2_score(y, y_predicted))
Mean squared error: 24.17
R²: 0.71
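
The same metrics can also be computed directly from their definitions; a minimal sketch:

# MSE: mean of squared residuals.
mse = np.mean((y - y_predicted) ** 2)
# R²: one minus the ratio of residual to total sum of squares.
ss_res = np.sum((y - y_predicted) ** 2)
ss_tot = np.sum((y - np.mean(y)) ** 2)
r2 = 1.0 - ss_res / ss_tot
print("Mean squared error: %.2f" % mse)
print("R²: %.2f" % r2)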