Epi Explained: Understanding T-Tests

Featured image: an oil painting of a European woman with short hair, wearing an orange off-the-shoulder overcoat and a blue dress. The painting is Portrait of Mrs Robert Celestin Guinness (Dickie) (1935) by Philip Alexius de László (Hungarian, 1869–1937).

T-tests are a foundational tool in statistics and public health, enabling researchers to make inferences about population means when the population standard deviation is unknown and the sample size is small. This entry of Epi Explained delves into the intricacies of T-tests, covering their types, applications, and the mathematical principles in a way that should be easy enough for anyone to understand.

Introduction to T-Tests

At its core, a T-test is a type of inferential statistic (a statistical analysis that uses a small sample of the population to determine traits about the larger whole) used to determine if there is a significant difference between the means of two groups, which may be related in certain features. It’s widely used in research to compare the averages of two samples, especially when dealing with small sample sizes and when the population’s standard deviation is unknown.

Historical Context

The T-test was developed by William Sealy Gosset, a chemist working for the Guinness brewery in Dublin, Ireland, in the early 1900s. Due to proprietary concerns, Gosset published his work under the pseudonym “Student,” which is why the T-test is sometimes referred to as Student’s T-test. It was originally designed as a way to monitor the quality of the brewery’s famous stout without using too much of it in testing. The T-test was formally published in 1908 and has seen constant use ever since.

Mathematical Foundation

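The T-test is built on the premise that when a population is normally distributed, the means of small samples drawn from it are also normally distributed around the population mean. Because the population standard deviation is unknown and has to be estimated from the sample, the resulting test statistic follows a T-distribution, which has heavier tails than the normal distribution. The formula for the T-statistic depends on the type of T-test being performed, so we’ll go over them one by one. Many variables appear in more than one formula, so we’ll explain each one the first time it appears.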

Types of T-Tests

There are four types of T-tests, each designed for specific situations:

One-Sample T-Test:

- Purpose: To compare the mean of a single sample to a known mean (from the population or a theoretical expectation).
- Use Case: For example, if you want to test if the average height of a particular plant species is different from a known average height.
- Formula: [math]t = \frac{\bar{x} - \mu}{\left(\frac{s}{\sqrt{n}}\right)} [/math]
  - Where:
    - [math]t[/math] is the Test Statistic, what we’re aiming to solve for.
    - [math]\bar{x}[/math] is the sample mean, or the average of all of our observations
      - [math]\text{Average} = \frac{\text{Sum of All Observations}}{\text{Number of Observations}}[/math]
    - [math]\mu[/math] is the population mean, which might also be a best estimate, theoretical mean, or observed in previous studies or work.
    - [math]s[/math] is the standard deviation of the sample.
      - [math] \text{Standard Deviation} = \sqrt{\frac{\text{Sum of squared differences between observations and the mean}}{\text{Number of observations} - 1}} [/math]. To get the sum of squared differences, subtract the mean from each observation, square each result, and then add all of those squared differences together.
    - [math]n[/math] is the sample size, that is, the number of observations.
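
The one-sample formula can be sketched in a few lines of Python using only the standard library. This is a minimal illustration rather than a definitive implementation: the function name `one_sample_t` and the plant heights are made up for the example, and `statistics.stdev` is used only as a cross-check because it applies the same n - 1 denominator described above.

```python
import math
import statistics


def one_sample_t(observations, mu):
    """Return (t, degrees of freedom) for a one-sample t-test."""
    n = len(observations)
    x_bar = sum(observations) / n  # sample mean
    # Sample standard deviation: sum of squared differences divided by n - 1
    squared_diffs = sum((x - x_bar) ** 2 for x in observations)
    s = math.sqrt(squared_diffs / (n - 1))
    t = (x_bar - mu) / (s / math.sqrt(n))
    return t, n - 1


# Hypothetical plant heights (cm) compared with a known average of 20 cm
heights = [21.5, 19.8, 22.1, 20.7, 23.0, 21.2]
t, df = one_sample_t(heights, mu=20.0)

# Cross-check the hand-rolled standard deviation against the standard library
check = (statistics.mean(heights) - 20.0) / (
    statistics.stdev(heights) / math.sqrt(len(heights))
)
print(f"t = {t:.3f}, df = {df}, matches statistics module: {math.isclose(t, check)}")
```

A positive t here means the sample mean sits above the reference mean, and a negative t means it sits below.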

Independent Two-Sample T-Test (also known as an unpaired T-test):

- Purpose: To compare the means of two independent groups to see if there is a statistically significant difference between them.
- Use Case: Comparing the average heights of plants from two different species.
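- Formula: [math] t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} [/math], where [math] s_p = \sqrt{\frac{(n_1 - 1)s^2_1 + (n_2 - 1)s^2_2}{n_1 + n_2 - 2}} [/math] is the pooled standard deviation, which assumes that both groups share the same population variance.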
  - Where:
    - [math]\bar{x}_1[/math] and [math]\bar{x}_2[/math] are the means of the first and second groups, respectively.
    - [math]s^2_1[/math] and [math]s^2_2[/math] are the variances of the first and second groups, respectively.
      - Variance is closely related to the standard deviation: it is the standard deviation squared. To calculate the variance ([math]s^2[/math]) of a sample, subtract the mean ([math]\bar{x}[/math]) from each data point ([math]x_i[/math]), square these differences, sum them, and then divide by the number of observations minus one ([math]n-1[/math]). The symbol [math]\sum[/math] means that you repeat the step for every observation, from the first ([math]i = 1[/math]) to the last ([math]i = n[/math]), and add the results together. The formula for sample variance is: [math] s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} [/math]
    - [math]n_1[/math] and [math]n_2[/math] are the sample sizes of the first and second groups, respectively.
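
Here is a minimal Python sketch of an independent two-sample comparison. It assumes the classic Student’s version of the test, which pools the two sample variances under an equal-variance assumption and uses n_1 + n_2 - 2 degrees of freedom; when the two groups are the same size, pooling gives the same t as dividing each variance by its own sample size. The function name and the plant heights are hypothetical.

```python
import math
import statistics


def pooled_two_sample_t(group_1, group_2):
    """Return (t, degrees of freedom) assuming equal population variances."""
    n1, n2 = len(group_1), len(group_2)
    mean_1, mean_2 = statistics.mean(group_1), statistics.mean(group_2)
    # statistics.variance divides by n - 1, i.e. it is the sample variance
    var_1, var_2 = statistics.variance(group_1), statistics.variance(group_2)
    # Pool the variances, weighting each by its own degrees of freedom
    pooled_var = ((n1 - 1) * var_1 + (n2 - 1) * var_2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean_1 - mean_2) / standard_error, n1 + n2 - 2


# Hypothetical heights (cm) for plants from two different species
species_a = [21.5, 19.8, 22.1, 20.7, 23.0, 21.2]
species_b = [18.9, 20.1, 19.4, 17.8, 20.6]
t, df = pooled_two_sample_t(species_a, species_b)
print(f"t = {t:.3f}, df = {df}")
```

If the two groups clearly have different spreads, the equal-variance assumption behind pooling is doubtful and Welch’s T-test is the safer choice.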

Paired Sample T-Test:

- Purpose: To compare the means of the same group or matched pairs at two different times or under two different conditions.
- Use Case: Measuring the blood pressure of patients before and after a treatment to see if the treatment had an effect.
- Formula: [math]t = \frac{\bar{d}}{s_d / \sqrt{n}}[/math]
  - Where:
    - [math]\bar{d}[/math] is the mean of the differences between paired observations.
      - For each pair of observations, subtract one observation from the other to get the difference. The order of subtraction should be consistent across all pairs. For example, if you have pre-test and post-test scores for a group of subjects, you might subtract the pre-test score from the post-test score for each subject to see how much change occurred. Then take all the differences, sum them, and divide by the number of pairs.
    - [math]s_d[/math] is the standard deviation of the differences between paired observations.
    - [math]n[/math] is the number of pairs.
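
The paired calculation is really a one-sample test on the differences, which the following Python sketch makes explicit. The function name `paired_t` and the blood pressure readings are hypothetical, and the subtraction order (after minus before) is just one consistent choice.

```python
import math
import statistics


def paired_t(before, after):
    """Return (t, degrees of freedom) for a paired-sample t-test."""
    if len(before) != len(after):
        raise ValueError("each 'before' value needs a matching 'after' value")
    # Same subtraction order for every pair: after minus before
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)  # mean of the differences
    s_d = statistics.stdev(diffs)   # standard deviation of the differences
    return d_bar / (s_d / math.sqrt(n)), n - 1


# Hypothetical systolic blood pressure (mmHg) before and after a treatment
before = [150, 142, 138, 160, 155]
after = [144, 140, 135, 150, 149]
t, df = paired_t(before, after)
print(f"t = {t:.3f}, df = {df}")
```

With this subtraction order, a negative t indicates that readings fell after the treatment.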

Welch’s T-Test:

- Purpose: Similar to the independent two-sample T-test but does not assume equal variances between the two groups, making it more reliable when the groups are expected to differ in how spread out their values are, especially if the sample sizes are also unequal.
- Use Case: Comparing test scores between two groups of students from different schools where the assumption of homogeneity of variances is not met.
- Formula: [math] t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s^2_1}{n_1} + \frac{s^2_2}{n_2}}} [/math]
  - [math] \bar{x}_1 [/math] and [math] \bar{x}_2 [/math] are the sample means.
  - [math] s^2_1 [/math] and [math] s^2_2 [/math] are the sample variances.
  - [math] n_1 [/math] and [math] n_2 [/math] are the sample sizes.
  - To calculate degrees of freedom for Welch’s T-Test in particular, a specific formula is needed:
    - [math] df = \frac{\left(\frac{s^2_1}{n_1} + \frac{s^2_2}{n_2}\right)^2}{\frac{\left(\frac{s^2_1}{n_1}\right)^2}{n_1 - 1} + \frac{\left(\frac{s^2_2}{n_2}\right)^2}{n_2 - 1}} [/math]
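
Welch’s statistic and its degrees of freedom can be computed directly from summary statistics, as in this Python sketch. The function name `welch_t` and the school test scores are hypothetical; note that the degrees of freedom usually come out as a non-integer.

```python
import math


def welch_t(mean_1, var_1, n1, mean_2, var_2, n2):
    """Return (t, degrees of freedom) for Welch's t-test from summary statistics."""
    a = var_1 / n1  # squared standard error contributed by group 1
    b = var_2 / n2  # squared standard error contributed by group 2
    t = (mean_1 - mean_2) / math.sqrt(a + b)
    # Welch's degrees of freedom formula
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df


# Hypothetical test scores from two schools: mean, variance (SD squared), sample size
t, df = welch_t(mean_1=72.0, var_1=8.0 ** 2, n1=20,
                mean_2=78.0, var_2=15.0 ** 2, n2=35)
print(f"t = {t:.3f}, df = {df:.1f}")
```

When both groups have the same variance and the same size, the formula returns exactly n_1 + n_2 - 2, matching the ordinary two-sample test.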

For all T-tests, the degrees of freedom are calculated and then used to look up a critical value in a T-distribution table. The degrees of freedom are [math] n - 1 [/math] for the one-sample and paired tests, [math] n_1 + n_2 - 2 [/math] for the independent two-sample test, and the result of the formula above for Welch’s T-test. If the absolute value of your T-statistic is greater than the critical value from the table at your chosen significance level, you can reject the null hypothesis [math] H_0 [/math], which states that there is no difference between the means being compared, in favor of the alternative hypothesis [math]H_a[/math] that there is a difference.
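
The table lookup can be sketched in Python as well. The dictionary below holds a handful of two-sided critical values at the 0.05 significance level, copied from a standard T-distribution table; the names `CRITICAL_T_05`, `critical_value`, and `reject_null` are made up for this example. When the exact degrees of freedom are not in the table, the sketch falls back to the nearest smaller row, which is a conservative choice because critical values shrink as the degrees of freedom grow. It compares the absolute value of t, which corresponds to a two-sided test.

```python
# Two-sided critical values at the 0.05 level for selected degrees of freedom
CRITICAL_T_05 = {
    1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
    10: 2.228, 20: 2.086, 30: 2.042, 60: 2.000, 120: 1.980,
}


def critical_value(df):
    """Use the table row with the largest degrees of freedom not exceeding df."""
    eligible = [row for row in CRITICAL_T_05 if row <= df]
    if not eligible:
        raise ValueError("degrees of freedom must be at least 1")
    return CRITICAL_T_05[max(eligible)]


def reject_null(t, df):
    """True when |t| exceeds the two-sided critical value at the 0.05 level."""
    return abs(t) > critical_value(df)


print(reject_null(-3.1, 12))  # uses the df = 10 row: |-3.1| > 2.228 -> True
print(reject_null(1.5, 45))   # uses the df = 30 row: 1.5 < 2.042 -> False
```

Statistical software reports an exact p-value instead of relying on a table, but the decision it leads to is the same.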

Application in Research

T-tests are indispensable across the subfields of public health, whether your interest is Environmental Health (say, comparing mean pollutant concentrations or lung function between populations), Epidemiology (looking at potentially higher rates of a condition after a specific illness), or Global Health (comparing infant mortality rates between countries).

Practice Problems

1. A public health researcher wants to investigate the effect of a new community-based exercise program on reducing stress levels among adults. To do this, the researcher measures the stress levels (using a standardized stress assessment tool) of 50 participants before they start the program and then again after they have completed the 8-week program.

Which type of T-test should the researcher use to analyze the data?

A) One-Sample T-Test
B) Independent Two-Sample T-Test
C) Paired Sample T-Test
D) Welch’s T-Test

2. A Mental Health Epidemiologist is curious as to whether the adults in their community have a sleep duration significantly different from the national average of 7 hours. In a study involving 25 adults, the sample mean was 6.5 hours and the standard deviation was 1 hour. Say they are using a significance level of 0.05. What would the T-statistic be, and would it be significant?

3. A Health Education Specialist is comparing the effectiveness of two different health education programs on improving knowledge about diabetes management. Program A was delivered to a group of 30 participants, and Program B was delivered to a group of 40 participants. The specialist measured the increase in knowledge scores (out of 100) before and after the programs and calculated the following statistics:

Program A: Mean increase = 15, Standard Deviation = 5
Program B: Mean increase = 18, Standard Deviation = 7

The specialist decides to use Welch’s T-test to determine if the difference in mean knowledge score increases between the two programs is statistically significant, given the differences in sample sizes and variances. Assuming a significance level of 0.05, which of the following outcomes is correct?

A) The difference in means is not statistically significant.
B) The difference in means is statistically significant, and Program B is more effective.
C) The difference in means is statistically significant, but it’s not clear which program is more effective.
D) The sample sizes are too small to conduct Welch’s T-test.

Answer Key

1. C
2. T = -2.5, which is significant
3. B
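
The numeric answers to problems 2 and 3 can be checked with a short Python sketch built from the formulas given earlier in the article. The variable names are illustrative, and the critical values mentioned in the comments (about 2.064 at 24 degrees of freedom and about 2.000 at 60) come from a standard two-sided T-distribution table at the 0.05 level.

```python
import math

# Problem 2: one-sample test against the national average of 7 hours
n, x_bar, s, mu = 25, 6.5, 1.0, 7.0
t_2 = (x_bar - mu) / (s / math.sqrt(n))
df_2 = n - 1
# |t| = 2.50 exceeds the critical value of about 2.064 at 24 df: significant
print(f"Problem 2: t = {t_2:.2f}, df = {df_2}")

# Problem 3: Welch's test from the summary statistics for Programs A and B
n_a, mean_a, sd_a = 30, 15.0, 5.0
n_b, mean_b, sd_b = 40, 18.0, 7.0
a = sd_a ** 2 / n_a
b = sd_b ** 2 / n_b
t_3 = (mean_a - mean_b) / math.sqrt(a + b)
df_3 = (a + b) ** 2 / (a ** 2 / (n_a - 1) + b ** 2 / (n_b - 1))
# |t| is about 2.09 with roughly 68 df, above the critical value of about
# 2.000 at 60 df: significant, and Program B has the larger mean increase
print(f"Problem 3: t = {t_3:.2f}, df = {df_3:.1f}")
```

The negative sign in problem 3 only reflects subtracting Program B’s mean from Program A’s; it is the size of t that is compared with the table.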

Conclusion

T-tests offer a robust method for comparing means across different samples, making them a crucial part of the statistical toolkit for public health practitioners. If you’re interested in how to do T-Tests via R or Python, we have tutorials available for both.

In any case, thank you for reading this entry of Epi Explained. Check back regularly for more public health content!

Humanities Moment

The featured image for this article was Portrait of Mrs Robert Celestin Guinness (Dickie) (1935) by Philip Alexius de László (Hungarian, 1869–1937). Philip Alexius de László, an Anglo-Hungarian painter renowned for his portraits of royal and aristocratic figures, was born in Budapest into a Jewish family and later became a British citizen in 1914. He pursued art education in Hungary, Munich, and Paris, achieving early acclaim with a portrait of Pope Leo XIII, and settled in London in 1907 after marrying Lucy Guinness (who was part of the so-called “banking branch” of the Guinness family more commonly known for brewing). Despite his successful career, honored by numerous awards and ennoblement, de László faced internment during WWI due to accusations of enemy contact but was later exonerated; he died in 1937 from heart problems, but not before his art was featured in the 1932 Olympics.

Epidemiology, Broadly Speaking