STA 506 2.0 Linear Regression Analysis

Lecture 2-3: Simple Linear Regression

Dr Thiyanga S. Talagala


1 / 52

Recap: correlation

2 / 52

Recap: correlation (cont.)

Value of r         Interpretation
-1                 Perfect negative
(-1, -0.75)        Strong negative
(-0.75, -0.5)      Moderate negative
(-0.5, -0.25)      Weak negative
(-0.25, 0.25)      No linear association
(0.25, 0.5)        Weak positive
(0.5, 0.75)        Moderate positive
(0.75, 1)          Strong positive
1                  Perfect positive
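
The table can be turned into a small lookup, sketched below. The function name `interpret_cor` is hypothetical, and because the table uses open intervals it does not say where boundary values such as exactly 0.25 belong; the sketch assumes each interval is closed on the right.

```r
# Hypothetical helper: map a correlation coefficient to the labels in the table.
interpret_cor <- function(r) {
  stopifnot(is.numeric(r), all(abs(r) <= 1))
  labels <- c("Strong negative", "Moderate negative", "Weak negative",
              "No linear association", "Weak positive", "Moderate positive",
              "Strong positive")
  # Assumption: boundary values fall into the interval on their left
  out <- as.character(cut(r,
                          breaks = c(-1, -0.75, -0.5, -0.25, 0.25, 0.5, 0.75, 1),
                          labels = labels, include.lowest = TRUE))
  out[r == -1] <- "Perfect negative"
  out[r == 1]  <- "Perfect positive"
  out
}

interpret_cor(c(-1, -0.6, 0.1, 0.8, 1))
```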
3 / 52

Recap: Terminologies

  • Response variable: dependent variable

  • Explanatory variables: independent variables, predictors, regressor variables, features (in Machine Learning)

Response variable = Model function + Random Error

  • Parameter

  • Statistic

  • Estimator

  • Estimate
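
The four terms can be told apart in a few lines of R. This is only an illustrative sketch: the population mean of 5, the standard deviation, the sample size, and the seed are all assumed values, not taken from the lecture.

```r
set.seed(506)                          # arbitrary seed for reproducibility
mu <- 5                                # parameter: fixed population mean (assumed value)
x  <- rnorm(100, mean = mu, sd = 1)    # a random sample from the population

estimator <- mean                      # estimator: a rule (a statistic) applied to any sample
estimate  <- estimator(x)              # estimate: the number obtained from this sample
estimate
```

A different sample would give a different estimate, while the parameter and the estimator stay the same.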

Read my blogpost

4 / 52


Simple Linear Regression

Simple - single regressor

Linear - has a dual role here.

It may be taken to describe the fact that the relationship between $Y$ and $X$ is linear. More generally, the word linear refers to the fact that the regression parameters enter the model in a linear fashion.

7 / 52

Meaning of Linear Model

| What about this?


| Linear or nonlinear?

$Y = \beta_0 + \beta_1 x + \beta_2 x^2 + \epsilon$ | Linear or nonlinear?
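
One way to explore the quadratic model is to fit it with `lm()`. The sketch below uses simulated data with assumed coefficients (2, 1.5, -0.8); it suggests the model is linear in the parameters, because $x^2$ simply enters as another regressor column.

```r
set.seed(1)
x <- seq(-3, 3, length.out = 50)
# Hypothetical true coefficients, chosen only for illustration
y <- 2 + 1.5 * x - 0.8 * x^2 + rnorm(50, sd = 0.5)

# I(x^2) is treated as one more regressor, so ordinary least squares applies
quad_fit <- lm(y ~ x + I(x^2))
coef(quad_fit)
```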


What about this?


8 / 52

True relationship between X and Y in the population

$Y = f(X) + \epsilon$

If $f$ is approximated by a linear function

$Y = \beta_0 + \beta_1 X + \epsilon$

The error terms are normally distributed with mean $0$ and variance $\sigma^2$. Then the mean response of $Y$ at any value $x$ of $X$ is

$E(Y \mid X = x) = \beta_0 + \beta_1 x$

For a single unit $(y_i, x_i)$

$y_i = \beta_0 + \beta_1 x_i + \epsilon_i$ where $\epsilon_i \sim N(0, \sigma^2)$

We use sample values $(y_i, x_i)$, where $i = 1, 2, \ldots, n$, to estimate $\beta_0$ and $\beta_1$.

The fitted regression model is

$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$


9 / 52

Normal distribution

[1] 5.008403

[1] 5.082935
10 / 52

Normal distribution

11 / 52

Buckle up!

Let's walk through the steps.

15 / 52


True relationship between X and Y in the population

$Y = f(X) + \epsilon$

Example: Suppose you want to model daughters' height as a function of mothers' height.

Do you think an exact (deterministic) relationship exists between these two variables?

  1. A daughter's height may depend on many variables other than her mother's height.

  2. Even if many variables are included in the model, it is unlikely that we can predict the daughter's height exactly. Why?

There will almost certainly be some variation in the response that the model cannot explain.

This unexplained variation is assumed to arise from random phenomena, so it is referred to as random error.

21 / 52



In-class: Population Regression Line

True relationship between X and Y in the population

$Y = f(X) + \epsilon$

If $f$ is approximated by a linear function

$Y = \beta_0 + \beta_1 X + \epsilon$

The error terms are normally distributed with mean $0$ and variance $\sigma^2$. Then the mean response of $Y$ at any value $x$ of $X$ is

$E(Y \mid X = x) = \beta_0 + \beta_1 x$


23 / 52


In-class: Population Regression Line


For a single unit $(y_i, x_i)$

$y_i = \beta_0 + \beta_1 x_i + \epsilon_i$ where $\epsilon_i \sim N(0, \sigma^2)$
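
To make the model concrete, the sketch below simulates units from a population regression line. The parameter values (intercept 30, slope 0.5, sigma 2) and the range of x are assumptions chosen only for illustration; they are not the values for the heights data.

```r
set.seed(2)
beta0 <- 30; beta1 <- 0.5; sigma <- 2       # hypothetical population parameters
x   <- runif(200, min = 55, max = 71)       # hypothetical regressor values
eps <- rnorm(200, mean = 0, sd = sigma)     # random errors, N(0, sigma^2)
y   <- beta0 + beta1 * x + eps              # y_i = beta0 + beta1 * x_i + eps_i

# Mean response on the population line at x = 62
mean_response <- beta0 + beta1 * 62
mean_response
```

Individual values of `y` scatter around the line, while the mean response at a given x lies exactly on it.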

27 / 52

Take a sample:

The fitted regression line is

$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$


28 / 52

Our example (slope = 0.52, intercept = 30.7)

Dashboard: https://statisticsmart.shinyapps.io/SimpleLinearRegression/

30 / 52

Our example (slope = 0.582, intercept = 28.5)

Dashboard: https://statisticsmart.shinyapps.io/SimpleLinearRegression/

31 / 52

Our example (slope = 0.5, intercept = 32.5)

Dashboard: https://statisticsmart.shinyapps.io/SimpleLinearRegression/

Which is the best?

32 / 52

Which is the best?

Sum of squares of residuals (SSR)

$SSR = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$


33 / 52

Evaluating your answers: Fitted values

Dheight = 30.7 + 0.52 Mheight

df <- alr3::heights                      # mothers' and daughters' heights
df$fitted <- 30.7 + (0.52 * df$Mheight)  # fitted values for the first guess
head(df, 10)

   Mheight Dheight fitted
1     59.7    55.1 61.744
2     58.2    56.5 60.964
3     60.6    56.0 62.212
4     60.7    56.8 62.264
5     61.8    56.0 62.836
6     55.5    57.9 59.560
7     55.4    57.1 59.508
8     56.8    57.6 60.236
9     57.5    57.2 60.600
10    57.3    57.1 60.496

First fitted value: 30.7 + (0.52 * 59.7) = 61.744

34 / 52

Evaluating your answers

Sum of squares of residuals (SSR)

$SSR = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$

Dheight = 30.7 + 0.52 Mheight

df$resid_squared <- (df$Dheight - df$fitted)^2  # squared residuals
head(df, 10)

   Mheight Dheight fitted resid_squared
1     59.7    55.1 61.744     44.142736
2     58.2    56.5 60.964     19.927296
3     60.6    56.0 62.212     38.588944
4     60.7    56.8 62.264     29.855296
5     61.8    56.0 62.836     46.730896
6     55.5    57.9 59.560      2.755600
7     55.4    57.1 59.508      5.798464
8     56.8    57.6 60.236      6.948496
9     57.5    57.2 60.600     11.560000
10    57.3    57.1 60.496     11.532816

sum(df$resid_squared)
[1] 7511.118

SSR: 7511.118
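
The calculation can be wrapped in a small function so that any guessed line can be scored. The sketch below is self-contained: it uses only the first 10 rows printed on the slide, so its result is the sum over those rows and not the full-sample value of 7511.118. The function name `ssr` is hypothetical.

```r
# First 10 rows of the heights data, as printed on the slide
Mheight <- c(59.7, 58.2, 60.6, 60.7, 61.8, 55.5, 55.4, 56.8, 57.5, 57.3)
Dheight <- c(55.1, 56.5, 56.0, 56.8, 56.0, 57.9, 57.1, 57.6, 57.2, 57.1)

# Hypothetical helper: SSR for a candidate line y = b0 + b1 * x
ssr <- function(b0, b1, x, y) {
  fitted <- b0 + b1 * x
  sum((y - fitted)^2)
}

ssr(30.7, 0.52, Mheight, Dheight)   # first guess, first 10 rows only
```

The result, about 217.84, equals the sum of the ten resid_squared values in the table.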

35 / 52

Evaluating your answers

Dashboard: https://statisticsmart.shinyapps.io/SimpleLinearRegression/

  • Green: SSR = 7511.118 (slope = 0.52, intercept = 30.7)

  • Orange: SSR = 8717.41 (slope = 0.582, intercept = 28.5)

  • Purple: SSR = 7066.075 (slope = 0.5, intercept = 32.5)

36 / 52

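How to estimate $\beta_0$ and $\beta_1$?

Sum of squares of residuals

$SSR = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$

Observed value

$y_i$

Fitted value

$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$

The least-squares regression approach chooses coefficients $\hat{\beta}_0$ and $\hat{\beta}_1$ to minimize $SSR$.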
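
The idea of minimizing SSR can be checked numerically. The sketch below uses simulated data (the intercept 30, slope 0.5, and sigma 2 are assumptions) and a general-purpose optimizer, `optim()`, in place of the closed-form solution; x is centred purely to keep the numerical search well behaved.

```r
set.seed(3)
x  <- runif(100, 55, 71)                  # hypothetical data, not the heights sample
y  <- 30 + 0.5 * x + rnorm(100, sd = 2)
xc <- x - mean(x)                         # centred regressor

# SSR as a function of the two coefficients
ssr <- function(beta) sum((y - beta[1] - beta[2] * xc)^2)

num_fit <- optim(c(0, 0), ssr, method = "BFGS")  # numerical minimisation
ls_fit  <- coef(lm(y ~ xc))                      # least-squares fit for comparison
rbind(optim = num_fit$par, lm = ls_fit)
```

The two rows should agree closely, which is what one would expect if `lm()` returns the coefficients that minimize SSR.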

37 / 52

Least-squares Estimation of the Parameters

$y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \quad i = 1, 2, 3, \ldots, n.$

The least squares criterion is

$S(\beta_0, \beta_1) = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$


38 / 52

Least-squares Estimation of the Parameters (cont.)

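The least squares criterion is

$S(\beta_0, \beta_1) = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$

The least-squares estimators of $\beta_0$ and $\beta_1$, say $\hat{\beta}_0$ and $\hat{\beta}_1$, must satisfy

$\frac{\partial S}{\partial \beta_0}\Big|_{\hat{\beta}_0, \hat{\beta}_1} = -2 \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$

$\frac{\partial S}{\partial \beta_1}\Big|_{\hat{\beta}_0, \hat{\beta}_1} = -2 \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) x_i = 0$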




39 / 52

Least-squares Estimation of the Parameters (cont.)

Simplifying the two equations yields

$n \hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i,$

$\hat{\beta}_0 \sum_{i=1}^{n} x_i + \hat{\beta}_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} y_i x_i.$

These are called least-squares normal equations.
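
Since the normal equations are two linear equations in two unknowns, they can be solved directly as a 2 x 2 linear system. A minimal sketch on simulated data follows; the data-generating values are assumptions for illustration.

```r
set.seed(4)
x <- runif(50, 55, 71)                   # hypothetical sample
y <- 30 + 0.5 * x + rnorm(50, sd = 2)
n <- length(x)

# Coefficient matrix and right-hand side of the normal equations
A <- matrix(c(n,      sum(x),
              sum(x), sum(x^2)), nrow = 2, byrow = TRUE)
b <- c(sum(y), sum(x * y))

beta_hat <- solve(A, b)                  # c(intercept, slope)
beta_hat
coef(lm(y ~ x))                          # should agree with beta_hat
```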

40 / 52

Least-squares Estimation of the Parameters (cont.)



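The solution to the normal equations is

$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$

$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} y_i x_i - \frac{\left(\sum_{i=1}^{n} y_i\right)\left(\sum_{i=1}^{n} x_i\right)}{n}}{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}$

The fitted simple linear regression model is then

$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$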
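
The closed-form estimators can be computed by hand and compared with `lm()`. The sketch below uses simulated data with assumed parameters, and writes the slope in the equivalent centred form $S_{xy}/S_{xx}$, where $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$ and $S_{xx} = \sum (x_i - \bar{x})^2$.

```r
set.seed(5)
x <- runif(50, 55, 71)                   # hypothetical sample
y <- 30 + 0.5 * x + rnorm(50, sd = 2)

Sxy <- sum((x - mean(x)) * (y - mean(y)))
Sxx <- sum((x - mean(x))^2)

b1 <- Sxy / Sxx                          # slope estimate
b0 <- mean(y) - b1 * mean(x)             # intercept estimate
c(b0 = b0, b1 = b1)
coef(lm(y ~ x))                          # should match the hand calculation
```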


41 / 52

Least-squares fit

Try this with R

library(alr3) # to load the dataset
model1 <- lm(Dheight ~ Mheight, data = heights)
model1

Call:
lm(formula = Dheight ~ Mheight, data = heights)

Coefficients:
(Intercept)      Mheight
    29.9174       0.5417
42 / 52

Least-squares fit and your guesses

fit <- 0.5417 * df$Mheight + 29.9174
sum((df$Dheight - fit)^2)
[1] 7051.97
43 / 52

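Least-squares fit and your guesses

  • Green: SSR = 7511.118 (slope = 0.52, intercept = 30.7)

  • Orange: SSR = 8717.41 (slope = 0.582, intercept = 28.5)

  • Purple: SSR = 7066.075 (slope = 0.5, intercept = 32.5)

  • Blue: SSR = 7051.97 (slope = 0.5417, intercept = 29.9174)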

44 / 52

Try this with R

library(alr3) # to load the dataset
model1 <- lm(Dheight ~ Mheight, data = heights)
summary(model1)

Call:
lm(formula = Dheight ~ Mheight, data = heights)

Residuals:
   Min     1Q Median     3Q    Max
-7.397 -1.529  0.036  1.492  9.053

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 29.91744    1.62247   18.44   <2e-16 ***
Mheight      0.54175    0.02596   20.87   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.266 on 1373 degrees of freedom
Multiple R-squared:  0.2408,  Adjusted R-squared:  0.2402
F-statistic: 435.5 on 1 and 1373 DF,  p-value: < 2.2e-16
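
Two entries in this output can be reproduced by simple arithmetic: each t value is the estimate divided by its standard error, and in simple linear regression the F-statistic equals the square of the slope's t value. The sketch below uses the rounded figures printed above, so it agrees with the output only up to rounding.

```r
est <- c(intercept = 29.91744, slope = 0.54175)   # estimates from the output
se  <- c(intercept = 1.62247,  slope = 0.02596)   # standard errors from the output

t_value <- est / se
round(t_value, 2)            # compare with the "t value" column

t_value["slope"]^2           # compare with the F-statistic (435.5)
```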
45 / 52

Visualise the model: Try with R

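library(ggplot2)  # plotting functions
library(alr3)     # heights dataset
# Scatterplot of daughters' against mothers' heights with the least-squares line
ggplot(data = heights, aes(x = Mheight, y = Dheight)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", se = FALSE,
              col = "blue", lwd = 2) +
  theme(aspect.ratio = 1)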

46 / 52

Least squares regression line

    Mheight         Dheight
 Min.   :55.40   Min.   :55.10
 1st Qu.:60.80   1st Qu.:62.00
 Median :62.40   Median :63.60
 Mean   :62.45   Mean   :63.75
 3rd Qu.:63.90   3rd Qu.:65.60
 Max.   :70.80   Max.   :73.10

The LSRL passes through the point $(\bar{x}, \bar{y})$, that is, (sample mean of x, sample mean of y).
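
This property can be checked with the numbers already shown: plugging the mean of Mheight into the fitted line should return the mean of Dheight. The sketch uses the rounded means from the summary, so the match is approximate.

```r
b0 <- 29.91744; b1 <- 0.54175    # least-squares estimates from the model summary
x_bar <- 62.45; y_bar <- 63.75   # sample means as printed (rounded)

fitted_at_mean <- b0 + b1 * x_bar
fitted_at_mean                   # close to y_bar
```

Here 29.91744 + 0.54175 * 62.45 is about 63.7497, which matches 63.75 up to rounding.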

47 / 52

Least squares regression line

The least squares regression line doesn't match the population regression line perfectly, but it is a pretty good estimate. And, of course, we'd get a different least squares regression line if we took another (different) sample.
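
Sampling variability can be illustrated by simulation. In the sketch below the population line (intercept 30, slope 0.5, sigma 2), the sample size, and the function name `one_fit` are all assumptions; each call draws a fresh sample and refits the line.

```r
set.seed(6)
beta0 <- 30; beta1 <- 0.5; sigma <- 2      # hypothetical population line

# Draw one sample of size n and return the least-squares coefficients
one_fit <- function(n = 100) {
  x <- runif(n, 55, 71)
  y <- beta0 + beta1 * x + rnorm(n, sd = sigma)
  coef(lm(y ~ x))
}

fits <- t(replicate(5, one_fit()))          # one row of estimates per sample
fits
```

Each row gives a slightly different intercept and slope, all scattered around the population values.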

48 / 52

Extrapolation: beyond the scope of the model.
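
A simple guard against extrapolation is to compare a new x value with the range observed in the sample. The sketch below uses the fitted coefficients and the observed range of Mheight (55.4 to 70.8) reported earlier; the helper name `predict_height` is hypothetical.

```r
x_range <- c(55.4, 70.8)            # observed minimum and maximum of Mheight
b0 <- 29.91744; b1 <- 0.54175       # least-squares estimates from the model summary

# Hypothetical helper: predict and flag values outside the observed range
predict_height <- function(m) {
  data.frame(Mheight = m,
             predicted = b0 + b1 * m,
             extrapolation = m < x_range[1] | m > x_range[2])
}

predict_height(c(60, 80))
```

A mother's height of 60 lies inside the observed range, while 80 lies outside it, so the second prediction relies on the line holding where no data were observed.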

50 / 52

Next Lecture

More work - Simple Linear Regression, Residual Analysis, Predictions

51 / 52

All rights reserved by

Dr. Thiyanga S. Talagala

52 / 52
