ENV221 L09

Author

Peng Zhao

1 Overview of the module and R

2 R Basic Operations

3 R Programming

4 Statistical Graphs

5 Basic Concepts

6 Descriptive statistics

7 Distributions and the Central Limit Theorem

8 Hypothesis test

9 t-test

9.1 Learning objectives

In this lecture, you will

  1. Learn about the \(t\)-distribution.
  2. Use the \(t\)-test for one-sample and two-sample hypothesis testing.

9.2 Limitations of \(z\) distribution and \(z\)-test

When we know \(\sigma\), we can

  • estimate population parameters with \(z\)-distribution (see Lecture 7.8), and
  • do the \(z\)-test for the mean.

What if we do not know \(\sigma\)?

9.3 Student’s \(t\)-distribution

William Sealy Gosset, who published under the pen name “Student”

Student’s \(t\)-distribution:
If the distribution of a random variable \(x\) is (approximately) normal, then the statistic \(t = \frac{\bar x - \mu}{s / \sqrt n}\) follows a \(t\)-distribution.
Degrees of freedom (\(\nu\) (nu) or \(df\)):
The number of free choices left after a sample statistic such as \(\bar x\) is calculated. For the \(t\)-distribution, when your sample size is \(n\), \(df = n - 1\).
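
As a small illustration of this definition (a sketch added here, not in the original notes): once \(\bar x\) is fixed, only \(n - 1\) observations can vary freely, because the last one is forced by the mean.

# Sketch: with the sample mean fixed, the last value is not free
xbar <- 5                           # suppose the mean of n = 5 values is known
x_free <- c(2, 4, 6, 8)             # n - 1 = 4 values chosen freely
(x_last <- 5 * xbar - sum(x_free))  # the 5th value must be 25 - 20 = 5
mean(c(x_free, x_last))             # confirms the mean is 5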

Graphically:

mycol <- c("black", "darkblue", "blue", "green", "red")
# t-distribution with 3 degrees of freedom
curve(dt(x, df = 3), -4, 4, ylim = c(0, 0.4), col = mycol[1],
      xlab = "t", ylab = "f(t)", las = 1)
# add curves for increasing degrees of freedom
i <- 2
for (df in c(5, 10, 30)) {
  curve(dt(x, df = df), add = TRUE, col = mycol[i])
  i <- i + 1
}
# the standard normal curve is the limiting case (df = Inf)
curve(dnorm(x), add = TRUE, col = mycol[i])
legend("topright", lty = 1, col = mycol, title = "d.f.",
       legend = c(3, 5, 10, 30, "Inf"), bty = "n")

The probability density function (PDF):

\[f(t)=\frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu \pi} \Gamma\left(\frac{\nu}{2}\right)}\left(1+\frac{t^{2}}{\nu}\right)^{-\frac{\nu+1}{2}}\]

For \(\nu>1\) even,

\[ \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu \pi} \Gamma\left(\frac{\nu}{2}\right)}=\frac{(\nu-1)(\nu-3) \cdots 5 \cdot 3}{2 \sqrt{\nu}(\nu-2)(\nu-4) \cdots 4 \cdot 2} \]

For \(\nu>1\) odd,

\[ \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu \pi} \Gamma\left(\frac{\nu}{2}\right)}=\frac{(\nu-1)(\nu-3) \cdots 4 \cdot 2}{\pi \sqrt{\nu}(\nu-2)(\nu-4) \cdots 5 \cdot 3} \]
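
As a quick numerical check (a sketch, not in the original notes), we can verify the even-\(\nu\) closed form against R's gamma() and against dt(), since this constant equals \(f(0)\):

nu <- 4  # an even example
gamma((nu + 1) / 2) / (sqrt(nu * pi) * gamma(nu / 2))                 # via gamma()
prod(seq(nu - 1, 3, -2)) / (2 * sqrt(nu) * prod(seq(nu - 2, 2, -2)))  # closed form
dt(0, df = nu)                                                        # all three equal 0.375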

Properties:

  • Symmetric and bell shaped, like the normal distribution, but has heavier tails.
  • Mean = median = 0.
  • The larger the sample size (\(n\)), the more closely the distribution resembles the normal distribution. As \(\nu \to \infty\), \(f(t)=\frac{1}{\sqrt{2 \pi }} e^{-t^{2} / 2}\), which is the standard normal density (see the quick check below).
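
A quick numerical check of the last property (a sketch, not in the original notes):

dt(1.96, df = 5)     # heavier tail: larger than the normal density here
dt(1.96, df = 1000)  # almost identical to the normal density
dnorm(1.96)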

9.4 Estimating Population Parameters with \(t\)-distribution

Example: Battery life

A manufacturer develops a new battery and wants to know how long, on average, a battery lasts before it burns out. They test a sample of 30 batteries and find that the sample mean is 90 hours with a standard deviation of 25 hours. Estimate the population mean and the 95% confidence interval for the mean.

sample_mean <- 90
n <- 30
sdv <- 25                    # sample standard deviation
se <- sdv / sqrt(n)          # standard error of the mean
t_critical <- qt(0.025, lower.tail = FALSE, df = n - 1)  # upper 2.5% quantile
c(sample_mean - t_critical * se, sample_mean + t_critical * se)  # 95% CI
[1] 80.66485 99.33515
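
For comparison (a sketch, not part of the original example): if we treated \(s\) as a known \(\sigma\) and used the \(z\)-distribution from Lecture 7.8, the interval would be slightly narrower; the heavier tails of the \(t\)-distribution widen the interval to account for estimating \(\sigma\) from the sample.

z_critical <- qnorm(0.025, lower.tail = FALSE)  # 1.96, vs. t_critical = 2.05
c(sample_mean - z_critical * se, sample_mean + z_critical * se)  # narrower than the t-based CI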

Example: Heavy metal concentration (mg/L) in sediments of a river

A researcher sampled sediments of a river. She took 10 observations as a random sample. The Fe concentrations (mg/L) are 24.29, 21.47, 28.58, 22.41, 23.12, 23.91, 28.51, 16.25, 22.57, 23.17. Estimate the mean Fe concentration and its 95% confidence interval.

x <- c(24.29, 21.47, 28.58, 22.41, 23.12, 
        23.91, 28.51, 16.25, 22.57, 23.17)
sample_mean <- mean(x)
n <- length(x)
sdv <- sd(x)
se <- sdv / sqrt(n)
t_critical <- qt(0.025, df = n - 1, lower.tail = FALSE)
t_critical
[1] 2.262157
sample_mean
[1] 23.428
c(sample_mean - t_critical * se, sample_mean + t_critical * se)
[1] 20.91987 25.93613

Use t.test():

t.test(x)

    One Sample t-test

data:  x
t = 21.13, df = 9, p-value = 5.588e-09
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 20.91987 25.93613
sample estimates:
mean of x 
   23.428 

How do we get the t score?

t_score <- (sample_mean - 0)/se
t_score
[1] 21.13037
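
The \(p\)-value then comes from the upper tail of the \(t\)-distribution, doubled for the two-sided alternative; this reproduces the 5.588e-09 reported by t.test():

pt(t_score, df = n - 1, lower.tail = FALSE) * 2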

9.5 One-sample test

One-sample test
Test whether a population mean is significantly different from some hypothesized value.

9.5.1 One-tailed

Example: Battery life

A manufacturer develops a new battery and claims that the average battery life is not shorter than 100 hours. A consumer group believes this average is shorter. They test a sample of 10 batteries: 91, 100, 70, 87, 104, 92, 104, 88, 72, 119 hours. Can they show that the average life is shorter than the manufacturer claims?

  1. Hypotheses: \(H_0:\mu \ge 100, H_1: \mu< 100\). Reject \(H_0\)? Given \(\alpha = 0.05\).

  2. Gather data:

mu <- 100
x <- c(91, 100, 70, 87, 104, 92, 104, 88, 72, 119)
n <- length(x)
xbar <- mean(x)
s <- sd(x)
  3. Calculate a test statistic:

\[t =\frac{\bar x - \mu}{s / \sqrt{n}}\]

t_score <- (xbar - mu) / (s / sqrt(n))
t_score
[1] -1.547751
pt(t_score, n-1)
[1] 0.07804407

Instead of pt(), look up the \(t\) table:

  • \(df = 9\), \(t = -1.547751\).
  • the absolute value of \(t\) is between 1.383 and 1.833.
  • Thus, the \(p\) value is between 0.05 and 0.10.
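
The table entries themselves can be reproduced with qt() (a quick check, not in the original notes):

qt(c(0.90, 0.95), df = 9)  # 1.383029 1.833113, the two values quoted above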
  4. Decision

The consumer group cannot reject \(H_0\) at a significance level of \(\alpha = 0.05\).

One step in R:

t.test(x, mu = 100, alternative = "less")

    One Sample t-test

data:  x
t = -1.5478, df = 9, p-value = 0.07804
alternative hypothesis: true mean is less than 100
95 percent confidence interval:
     -Inf 101.3459
sample estimates:
mean of x 
     92.7 
  5. Conclusion

There is not enough evidence that the average battery life is shorter than 100 hours.

9.5.2 Two-tailed

Example: Zn in the water

A previous study reported that the mean concentration of zinc (Zn) in the water of a river is 0.02 mg/L. A scientist thinks the real Zn concentration is not 0.02 mg/L. She collects 8 samples and tests them. The concentrations are 0.011, 0.021, 0.001, 0.007, 0.031, 0.023, 0.026, 0.019. Is there evidence to support her hypothesis? Use \(\alpha = 0.10\) to carry out the analysis.

  1. Hypotheses: \(H_0:\mu = 0.02, H_1: \mu \ne 0.02\). Reject \(H_0\)? Given \(\alpha = 0.10\).

  2. Gather data:

mu <- 0.02
x <- c(0.011, 0.021, 0.001, 0.007, 0.031, 0.023, 0.026, 0.019)
xbar <- mean(x)
sdv <- sd(x)
n <- length(x)
  3. Calculate a test statistic:
t_score <- (xbar - mu) / (sdv / sqrt(n))
t_score
[1] -0.7301154
pt(t_score, n-1) * 2
[1] 0.4890274

Look up the t table:

  • \(df = 7\), \(t = -0.7301154\).
  • the absolute value of \(t\) is between 0.711 and 0.896.
  • Thus, the \(p\) value is between 0.4 (0.2 * 2) and 0.5 (0.25 * 2).
  4. Decision.

She cannot reject \(H_0\) at \(\alpha = 0.10\).

One step in R:

t.test(x, mu = 0.02, alternative = "two.sided", conf.level = 0.90)

    One Sample t-test

data:  x
t = -0.73012, df = 7, p-value = 0.489
alternative hypothesis: true mean is not equal to 0.02
90 percent confidence interval:
 0.01056338 0.02418662
sample estimates:
mean of x 
 0.017375 
  5. Conclusion.

There is no evidence that the mean concentration of Zn differs from 0.02 mg/L.

9.6 Two-sample test

9.6.1 Sampling distribution of the difference

Two-sample test:
Test the difference between two population means.
The sampling distribution of the difference between means:
The distribution of all possible values of differences between pairs of sample means, with the sample sizes held constant from pair to pair.
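
To make this definition concrete, here is a minimal simulation sketch (not in the original notes; the population parameters are invented for illustration):

# Draw many pairs of samples and record the difference between their means
set.seed(1)
diffs <- replicate(10000,
                   mean(rnorm(10, mean = 23, sd = 2.5)) -
                     mean(rnorm(10, mean = 20, sd = 2.5)))
hist(diffs, breaks = 40, las = 1, main = "",
     xlab = expression(bar(x)[1] - bar(x)[2]))
mean(diffs)  # close to mu_1 - mu_2 = 3
sd(diffs)    # close to sqrt(2.5^2/10 + 2.5^2/10) = 1.118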

Example: Two 3D printers

A scientist wants to choose between two 3D printers to produce a 3D model. The printing time is 24.58, 22.09, 23.70, 18.89, 22.02, 28.71, 24.44, 20.91, 23.83, 20.83 hours for Printer 1, and 21.61, 19.06, 20.72, 15.77, 19, 25.88, 21.48, 17.85, 20.86, 17.77 hours for Printer 2. Do these two printers have the same mean printing time?

Before we compare the two sample means, we have to test whether the two population variances are equal, using the \(F\)-test (see Lecture 8.6).

x1 <- c(24.58, 22.09, 23.70, 18.89, 22.02, 28.71, 24.44, 20.91, 23.83, 20.83)
x2 <- c(21.61, 19.06, 20.72, 15.77, 19, 25.88, 21.48, 17.85, 20.86, 17.77)
var.test(x2, x1, ratio = 1, alternative = "two.sided", conf.level = 0.95)

    F test to compare two variances

data:  x2 and x1
F = 1.0586, num df = 9, denom df = 9, p-value = 0.9338
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.2629492 4.2620464
sample estimates:
ratio of variances 
          1.058632 

Conclusion: there is no evidence that the two printers differ in their printing-time variances, so we treat the variances as equal.

9.6.2 Two-sample test with equal variance

A distribution curve of the difference between sample means:

# standard normal curve as a model for the sampling distribution of the difference
curve(dnorm(x), 
      xlab = expression(bar(x)[1] - bar(x)[2]), 
      ylab = "Density", 
      xlim = c(-5, 5), las = 1)

Distribution of the difference between two means.

According to the CLT, the mean of this sampling distribution is:

\[\mu_{\bar x_1 - \bar x_2} = \mu_1 - \mu_2\]

For simplicity, we use \(\mu_d\) for \(\mu_1 - \mu_2\), i.e.:

\[\mu_{\bar x_1 - \bar x_2} = \mu_d\]

The standard deviation (the standard error of the difference between means):

\[\sigma_{\bar x_1 - \bar x_2} = \sqrt {\frac{\sigma_1^2}{n} + \frac{\sigma_2^2}{m}}\]

Compared to the standard deviation we learnt from CLT:

\[\sigma_{\bar x} = \sigma \sqrt {\frac{1}{N}}\]

Calculate a test statistic for two-sample test with equal variances:

\[t = \frac{(\bar x_1 - \bar x_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{(n-1)s_1^2 + (m - 1)s_2^2}{(n - 1) + (m -1)} (\frac{1}{n} + \frac{1}{m})}}\]

or simplified as

\[t = \frac{\bar x_d -\mu_d}{s_p\sqrt{\frac{1}{n_p}}}\]

where \(s_p^2\) is the pooled variance:

\[s_p^2 = \frac{(n-1)s_1^2 + (m - 1)s_2^2}{(n - 1) + (m -1)} \]

\[\frac{1}{n} + \frac{1}{m} = \frac{1}{n_p}\]

\[\bar x_1 - \bar x_2 = \bar x_d \]

\[ \mu_1 - \mu_2 = \mu_d\]

Do the two printers have the same mean printing time?

  1. Hypotheses:

    • \(H_0: \mu_1 - \mu_2 = 0\)
    • \(H_1: \mu_1 - \mu_2 \ne 0\)
    • Reject \(H_0\)? Given \(\alpha = 0.05\).
  2. Collect data

  3. Calculate a test statistic:

n <- length(x1)
m <- length(x2)
xbar1 <- mean(x1)
xbar2 <- mean(x2)
s1 <- sd(x1)
s2 <- sd(x2)
# pooled standard deviation
sp <- sqrt(((n - 1) * s1 ^ 2 + (m - 1) * s2 ^ 2) / ((n - 1) + (m - 1)))
# "pooled" sample size: 1/np = 1/n + 1/m
np <- 1 / (1/n + 1/m)
xbar_d <- xbar1 - xbar2
mu_d <- 0
t_score <- (xbar_d - mu_d) / (sp / sqrt(np))
t_score
[1] 2.439594
# two-tailed critical value at alpha = 0.05
t_critical <- qt(0.975, df = n - 1 + m - 1)
t_critical
[1] 2.100922
# two-tailed p-value
pt(t_score, df = n - 1 + m - 1, lower.tail = FALSE) * 2
[1] 0.02528077
  4. Decision

As \(t_{\mathrm{score}} > t_{\mathrm{critical}}\) (the \(1-\alpha/2\) quantile for a two-tailed test) at \(\alpha = 0.05\), we can reject \(H_0\).

One step:

t.test(x1, x2, var.equal = TRUE, alternative = "two.sided", mu = 0)

    Two Sample t-test

data:  x1 and x2
t = 2.4396, df = 18, p-value = 0.02528
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.4164695 5.5835305
sample estimates:
mean of x mean of y 
       23        20 

In a data frame:

dtf <- data.frame(x = c(x1, x2), 
                 printer = rep(c("Printer1", "Printer2"), 
                               c(length(x1), length(x2))))
t.test(x ~ printer, var.equal = TRUE, alternative = "two.sided", mu = 0, data = dtf)

    Two Sample t-test

data:  x by printer
t = 2.4396, df = 18, p-value = 0.02528
alternative hypothesis: true difference in means between group Printer1 and group Printer2 is not equal to 0
95 percent confidence interval:
 0.4164695 5.5835305
sample estimates:
mean in group Printer1 mean in group Printer2 
                    23                     20 
  5. Conclusion.

The two printers have different mean printing times.

9.6.3 Two-sample test with unequal variances

If the variances of the two populations are unequal (based on the \(F\)-test), we use Welch's \(t\)-test:

\[t = \frac{\bar x_d - \mu_d}{\sqrt{\frac{s_1^2}{n} + \frac{s_2^2}{m}}}\]

Strictly, the degrees of freedom then follow the Welch–Satterthwaite approximation rather than \(n + m - 2\); the manual calculation below uses \(n + m - 2 = 18\) as a simplification, while t.test() computes the exact value (17.985, as shown in the output below).

t_score <- (xbar_d - mu_d) /sqrt(s1^2/n + s2^2/m)
t_score
[1] 2.439594
(t_critical <- qt(0.975, df = n - 1 + m -1))
[1] 2.100922
pt(t_score, df = n -1 + m -1, lower.tail = FALSE) * 2
[1] 0.02528077
t.test(x ~ printer, var.equal = FALSE, alternative = "two.sided", mu = 0, data = dtf)

    Welch Two Sample t-test

data:  x by printer
t = 2.4396, df = 17.985, p-value = 0.02529
alternative hypothesis: true difference in means between group Printer1 and group Printer2 is not equal to 0
95 percent confidence interval:
 0.4163193 5.5836807
sample estimates:
mean in group Printer1 mean in group Printer2 
                    23                     20 
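
The df = 17.985 reported by t.test() comes from the Welch–Satterthwaite approximation; it can be reproduced by hand (a sketch using the variables defined earlier):

# Welch-Satterthwaite approximate degrees of freedom
v1 <- s1^2 / n
v2 <- s2^2 / m
(v1 + v2)^2 / (v1^2 / (n - 1) + v2^2 / (m - 1))  # approximately 17.985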

9.6.4 Paired samples

When the samples are matched (paired), compute the difference within each pair and test the differences with a one-sample \(t\)-test, as in the example below.

Example: Weight-loss program.

Ten people take part in a weight-loss program. They are weighed before starting the program and again after the one-month program. Do they really lose weight?

  1. Hypotheses:
  • \(\mu_\mathrm{diff} = \mu_\mathrm{before} - \mu_\mathrm{after}\). \(H_0: \mu_\mathrm{diff} \le 0, H_1: \mu_\mathrm{diff} > 0\)
  • Reject \(H_0\)? Given \(\alpha = 0.05\).
  2. Collect data.
wl <- data.frame(
  id = LETTERS[1:10],
  before = c(198, 201, 210, 185, 204, 156, 167, 197, 220, 186),
  after = c(194, 203, 200, 183, 200, 153, 166, 197, 215, 184))
wl$diff <- wl$before - wl$after
id  before  after  diff
A      198    194     4
B      201    203    -2
C      210    200    10
D      185    183     2
E      204    200     4
F      156    153     3
G      167    166     1
H      197    197     0
I      220    215     5
J      186    184     2

Data of the weight-loss program.
  3. Calculate a test statistic.

\[t = \frac{\bar d - \mu_d}{s_d / \sqrt{n}}\]

where \(\bar d\) is the mean of the pairwise differences and \(s_d\) their standard deviation.

d_bar <- mean(wl$diff)
mu_d <- 0
se_d <- sd(wl$diff) / sqrt(nrow(wl))  # standard error of the mean difference
(t_score <- (d_bar - mu_d) / se_d)
[1] 2.82414
(t_critical <- qt(0.95, df = nrow(wl) - 1))
[1] 1.833113
pt(t_score, df = nrow(wl) - 1, lower.tail = FALSE)
[1] 0.009955992
  4. Decision.

As \(t_\mathrm{score} > t_\mathrm{critical, 1-\alpha}\), we can reject \(H_0\) at \(\alpha = 0.05\).

One step:

t.test(wl$before, wl$after, alternative = "greater", paired = TRUE)

    Paired t-test

data:  wl$before and wl$after
t = 2.8241, df = 9, p-value = 0.009956
alternative hypothesis: true mean difference is greater than 0
95 percent confidence interval:
 1.017647      Inf
sample estimates:
mean difference 
            2.9 
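
Equivalently, as noted at the start of this subsection, a one-sample \(t\)-test on the differences gives exactly the same result:

t.test(wl$diff, mu = 0, alternative = "greater")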
  5. Conclusion.

On average, the participants did lose weight.

9.7 Readings

  • Applied Environmental Statistics with R - Chapter 9

9.8 Highlights

  • Carry out a step-by-step \(t\)-test for a scientific question.
    • State \(H_0\) and \(H_1\).
    • Calculate the critical value for a given \(\alpha\).
    • Calculate the test statistic (the \(t\) score).
    • Draw a conclusion for the scientific question.