12 Hypothesis Testing
12.1 Writing a Hypothesis for a Study
12.1.1 Steps:
12.1.1.1 1. Identify the research question:
What relationship or difference are you testing?
Example: Does the minimum wage increase affect average worker wages?
12.1.1.2 2. State the null hypothesis
Typically, no effect, no difference, status quo
\(H_0\) : The minimum wage policy has no effect on wages.
12.1.1.3 3. State the alternative hypothesis
Typically, effect exists, difference exists
\(H_\alpha\) : The minimum wage policy increases wages.
12.1.1.4 4. Decide direction:
One-tailed → effect expected in a specific direction
Two-tailed → effect could be in either direction
Tip: Good hypotheses are CMT: Clear, Measurable, and Testable.
12.1.1.5 Difference between one-tailed test and two-tailed test
A two-tailed test checks whether a sample mean or proportion is significantly different from a given value in either direction (higher or lower).
Null Hypothesis (\(H_0\)): There is no difference between the sample statistic and the population value.
Alternative Hypothesis (\(H_\alpha\)): The sample statistic is either higher or lower than the population value.
A one-tailed test checks whether a sample statistic is significantly greater or smaller than a given value in one direction only.
Null Hypothesis (\(H_0\)): There is no difference, OR the difference is in the opposite direction.
Alternative Hypothesis (\(H_\alpha\)): The sample statistic is either higher or lower, but not both.
12.1.1.6 Deciding between One-Tailed versus Two-Tailed
- A central bank claims that the country’s average annual inflation rate is 3%. An economist wants to test whether this is incorrect.
- A government introduces a new minimum wage policy, and policymakers want to check whether it has decreased employment levels.
- A researcher wants to compare the GDP growth rate of two neighboring countries to see if they are different.
- An economist wants to determine whether male workers earn more than female workers in the same industry.
- A researcher wants to test whether developing countries receive less Foreign Direct Investment (FDI) than developed countries.
12.2 Test of Proportions
A test of proportion is used when we want to check whether the proportion of a certain event (e.g., unemployment rate, poverty rate) in a sample is different from a known proportion in the population.
12.2.1 One-Sample Test of Proportion
We use this test when we want to see if the proportion of a sample is different from the population proportion.
Example: Unemployment Rate
A government report claims that 8% of the labor force is unemployed. We collect a sample of 500 workers and find that 50 are unemployed. Is the actual unemployment rate significantly different from 8%?
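In R, the output below is produced with prop.test (its arguments are explained after the output):

```r
# H0: true unemployment rate is 0.08; two-sided alternative by default
prop.test(x = 50, n = 500, p = 0.08, correct = FALSE)
```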
##
## 1-sample proportions test without continuity
## correction
##
## data: 50 out of 500, null probability 0.08
## X-squared = 2.7174, df = 1, p-value = 0.09926
## alternative hypothesis: true p is not equal to 0.08
## 95 percent confidence interval:
## 0.07667756 0.12942191
## sample estimates:
## p
## 0.1
This employs prop.test; inside the parentheses, x is the number of successes, n is the total sample size, and p is the hypothesized population proportion. correct = FALSE turns off Yates' continuity correction, which adjusts the chi-squared statistic. The rule of thumb is to set it to TRUE for small samples and FALSE for large samples.
If the p-value < 0.05, we reject the null hypothesis, meaning the unemployment rate in our sample is significantly different from 8%.
If the p-value > 0.05, we fail to reject the null hypothesis, meaning we do not have enough evidence to say the unemployment rate differs from 8%.
If we believe the true unemployment rate is higher than 8%
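The one-sided version of the same call sets alternative = "greater":

```r
# H0: p = 0.08 vs H1: p > 0.08 (one-tailed)
prop.test(x = 50, n = 500, p = 0.08, alternative = "greater", correct = FALSE)
```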
##
## 1-sample proportions test without continuity
## correction
##
## data: 50 out of 500, null probability 0.08
## X-squared = 2.7174, df = 1, p-value = 0.04963
## alternative hypothesis: true p is greater than 0.08
## 95 percent confidence interval:
## 0.08003919 1.00000000
## sample estimates:
## p
## 0.1
Since the test is one-tailed, alternative = "greater" specifies the direction; change it to "less" if the expected effect is in the opposite direction.
If we believe the true unemployment rate is lower than 8%
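For the opposite direction, set alternative = "less":

```r
# H0: p = 0.08 vs H1: p < 0.08 (one-tailed)
prop.test(x = 50, n = 500, p = 0.08, alternative = "less", correct = FALSE)
```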
##
## 1-sample proportions test without continuity
## correction
##
## data: 50 out of 500, null probability 0.08
## X-squared = 2.7174, df = 1, p-value = 0.9504
## alternative hypothesis: true p is less than 0.08
## 95 percent confidence interval:
## 0.0000000 0.1242664
## sample estimates:
## p
## 0.1
12.2.2 Two-Sample Test of Proportion
We use this when comparing proportions between two groups (e.g., unemployment rates in two different regions).
12.2.2.1 Example: Employment Rate Comparison
A researcher wants to compare the employment rates between urban and rural areas.
In an urban area, 450 out of 500 people are employed.
In a rural area, 420 out of 500 people are employed.
We test whether the employment rates are significantly different.
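For two groups, prop.test takes a vector of successes and a vector of sample sizes; the call that produces the output below is:

```r
# urban: 450/500 employed; rural: 420/500 employed
prop.test(x = c(450, 420), n = c(500, 500), correct = FALSE)
```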
##
## 2-sample test for equality of proportions without
## continuity correction
##
## data: c(450, 420) out of c(500, 500)
## X-squared = 7.9576, df = 1, p-value = 0.004789
## alternative hypothesis: two.sided
## 95 percent confidence interval:
## 0.01847836 0.10152164
## sample estimates:
## prop 1 prop 2
## 0.90 0.84
Here x is a vector with the number of successes in each group, and n is a vector with the corresponding sample sizes.
Interpreting the Results:
If p-value < 0.05: Employment rates differ significantly between urban and rural areas.
If p-value > 0.05: There is no significant difference in employment rates.
If we believe urban employment rate is higher than rural:
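The one-sided call for this direction is:

```r
# H1: urban employment proportion > rural employment proportion
prop.test(x = c(450, 420), n = c(500, 500), alternative = "greater", correct = FALSE)
```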
##
## 2-sample test for equality of proportions without
## continuity correction
##
## data: c(450, 420) out of c(500, 500)
## X-squared = 7.9576, df = 1, p-value = 0.002394
## alternative hypothesis: greater
## 95 percent confidence interval:
## 0.02515394 1.00000000
## sample estimates:
## prop 1 prop 2
## 0.90 0.84
If we believe urban employment rate is lower than rural:
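And for the opposite direction:

```r
# H1: urban employment proportion < rural employment proportion
prop.test(x = c(450, 420), n = c(500, 500), alternative = "less", correct = FALSE)
```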
##
## 2-sample test for equality of proportions without
## continuity correction
##
## data: c(450, 420) out of c(500, 500)
## X-squared = 7.9576, df = 1, p-value = 0.9976
## alternative hypothesis: less
## 95 percent confidence interval:
## -1.00000000 0.09484606
## sample estimates:
## prop 1 prop 2
## 0.90 0.84
12.2.3 Real-World Example:
We are going to use the wage1 dataset from the wooldridge package, but we will work on a copy rather than the original dataset (though technically we could modify it directly): better safe than sorry.
Is the proportion of female workers greater than 50%?
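Assuming the wooldridge package is installed, the call that produces the output below is:

```r
library(wooldridge)   # provides the wage1 dataset
data("wage1")
sample <- wage1       # work on a copy, leaving the original untouched
# female is a 0/1 dummy, so sum() counts the female workers
prop.test(x = sum(sample$female), n = nrow(sample), p = 0.5,
          alternative = "greater", correct = FALSE)
```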
##
## 1-sample proportions test without continuity
## correction
##
## data: sum(sample$female) out of nrow(sample), null probability 0.5
## X-squared = 0.92015, df = 1, p-value = 0.8313
## alternative hypothesis: true p is greater than 0.5
## 95 percent confidence interval:
## 0.443458 1.000000
## sample estimates:
## p
## 0.4790875
12.3 T-Tests
A t-test is used to compare means (e.g., wages, GDP growth).
Types of T-Tests:
One-Sample T-Test: Compares a sample mean to a known value.
Independent-Samples T-Test: Compares means of two different groups.
Paired-Samples T-Test: Compares before and after effects within the same group.
12.3.1 One-Sample T-Test
A report states that the average monthly wage in a country is 20,500. We take a sample of 100 workers and test if their wages differ.
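The exact seed and standard deviation behind the output below are not shown; a sketch with assumed values (so the numbers will differ slightly):

```r
set.seed(123)                                # assumed seed
wages <- rnorm(100, mean = 21000, sd = 450)  # assumed sd; 100 simulated wages
t.test(wages, mu = 20500)                    # H0: true mean is 20500
```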
##
## One Sample t-test
##
## data: wages
## t = 11.946, df = 99, p-value < 0.00000000000000022
## alternative hypothesis: true mean is not equal to 20500
## 95 percent confidence interval:
## 20954.64 21135.76
## sample estimates:
## mean of x
## 21045.2
The rnorm call generates a random sample of 100 wages centered near 21,000. t.test then performs a one-sample t-test of whether the sample mean differs from the hypothesized mean of 20,500.
Interpreting the results:
If p-value < 0.05: The average wage is significantly different from 20,500.
If p-value > 0.05: There is no significant difference in wages.
12.3.2 Real-World Example
Is the average hourly wage significantly different from $5 per hour?
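With sample again being a copy of wage1, the call that produces the output below is:

```r
library(wooldridge)
data("wage1")
sample <- wage1
# H0: the average hourly wage is 5 dollars
t.test(sample$wage, mu = 5)
```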
##
## One Sample t-test
##
## data: sample$wage
## t = 5.5649, df = 525, p-value = 0.00000004186
## alternative hypothesis: true mean is not equal to 5
## 95 percent confidence interval:
## 5.579768 6.212437
## sample estimates:
## mean of x
## 5.896103
12.3.3 Independent-Samples T-Test
We compare male and female wages in the same industry.
set.seed(123)
male_wage<-rnorm(50, mean=23000, sd=600)
female_wage<-rnorm(50, mean=22500, sd=500)
t.test(male_wage, female_wage, var.equal=TRUE)
##
## Two Sample t-test
##
## data: male_wage and female_wage
## t = 4.4149, df = 98, p-value = 0.00002603
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 246.3177 648.5583
## sample estimates:
## mean of x mean of y
## 23020.64 22573.20
This performs a two-sample t-test assuming equal variances.
Interpreting the results:
If p-value < 0.05, wages differ significantly.
If p-value > 0.05, there is no significant difference.
12.3.3.1 Checking variance before running the T-Test
Variance measures how spread out the data is around the mean.
Low variance → Data points are close to the mean.
High variance → Data points are spread out from the mean.
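Using the simulated male_wage and female_wage vectors from above (repeated here so the chunk runs on its own), the F test that produces the output below is:

```r
set.seed(123)
male_wage   <- rnorm(50, mean = 23000, sd = 600)
female_wage <- rnorm(50, mean = 22500, sd = 500)
# H0: the ratio of the two variances is 1
var.test(male_wage, female_wage)
```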
##
## F test to compare two variances
##
## data: male_wage and female_wage
## F = 1.5057, num df = 49, denom df = 49, p-value =
## 0.1556
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.8544445 2.6533137
## sample estimates:
## ratio of variances
## 1.505692
If the variance ratio is close to 1 (p-value > 0.05), assume equal variances (var.equal = TRUE); otherwise use Welch's t-test (var.equal = FALSE, R's default).
12.3.4 Real-World Example
Do men and women earn different wages?
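The F test producing the output below subsets wages by the female dummy; the output's second subscript uses wage1 directly, which is equivalent here since sample is a copy of wage1:

```r
library(wooldridge)
data("wage1")
sample <- wage1
# compare wage variances: female workers (female == 1) vs male workers (female == 0)
var.test(sample$wage[sample$female == 1], sample$wage[wage1$female == 0])
```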
##
## F test to compare two variances
##
## data: sample$wage[sample$female == 1] and sample$wage[wage1$female == 0]
## F = 0.36954, num df = 251, denom df = 273, p-value =
## 0.000000000000004813
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.2900223 0.4714405
## sample estimates:
## ratio of variances
## 0.3695359
Interpreting the variance ratio:
Null Hypothesis (H0): The variances of male and female wages are equal
Alternative Hypothesis (\(H_\alpha\)): The variances of male and female wages are not equal
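Since the F test rejects equal variances, we use Welch's t-test (R's default when var.equal is not set to TRUE); the formula call that produces the output below is:

```r
library(wooldridge)
data("wage1")
sample <- wage1
# the ~ separates the dependent variable (wage) from the grouping variable (female)
t.test(sample$wage ~ sample$female)
```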
##
## Welch Two Sample t-test
##
## data: sample$wage by sample$female
## t = 8.44, df = 456.33, p-value =
## 0.0000000000000004243
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## 1.926971 3.096690
## sample estimates:
## mean in group 0 mean in group 1
## 7.099489 4.587659
The ~ (tilde) means "wage depends on gender": it separates the dependent variable (left of the ~) from the independent variable (right of the ~).
12.3.5 Paired-Samples T-Test
A new minimum wage policy was introduced. We analyze its impact on worker wages before and after implementation.
set.seed(0)
wage_before <- rnorm(50, mean = 32000, sd = 400)
wage_after <- wage_before + rnorm(50, mean = 2500, sd = 100)
t.test(wage_before, wage_after, paired = TRUE)
##
## Paired t-test
##
## data: wage_before and wage_after
## t = -206.45, df = 49, p-value < 0.00000000000000022
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -2526.496 -2477.785
## sample estimates:
## mean difference
## -2502.141
Here wage_after is simply wage_before shifted up by approximately 2,500 per worker, simulating the post-policy wages. Setting paired = TRUE makes t.test perform a paired t-test on the within-worker differences.
Interpreting the Results:
If p-value < 0.05: The minimum wage policy significantly increased wages.
If p-value > 0.05: The policy had no significant effect.
Did wages increase?
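The one-sided call producing the output below (data repeated so the chunk runs on its own). Note that the paired difference is computed as wage_before minus wage_after, so alternative = "greater" actually asks whether wages fell:

```r
set.seed(0)
wage_before <- rnorm(50, mean = 32000, sd = 400)
wage_after  <- wage_before + rnorm(50, mean = 2500, sd = 100)
# difference = wage_before - wage_after, so "greater" tests for a DECREASE;
# the p-value is essentially 1 because wages actually rose
t.test(wage_before, wage_after, paired = TRUE, alternative = "greater")
```

To test whether wages increased, set alternative = "less" (or swap the order of the two arguments).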
##
## Paired t-test
##
## data: wage_before and wage_after
## t = -206.45, df = 49, p-value = 1
## alternative hypothesis: true mean difference is greater than 0
## 95 percent confidence interval:
## -2522.46 Inf
## sample estimates:
## mean difference
## -2502.141
12.4 Chi-Square Test
This tests association between two categorical variables.
\(H_0\) variables are independent
\(H_\alpha\) variables are associated
This is often used in surveys or panel data with categorical outcomes. For example: is marital status associated with working in trade?
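The call producing the output below (Yates' continuity correction is applied by default for 2×2 tables):

```r
library(wooldridge)
data("wage1")
sample <- wage1
# cross-tabulate marital status and trade employment, then test independence
chisq.test(table(sample$married, sample$trade))
```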
##
## Pearson's Chi-squared test with Yates' continuity
## correction
##
## data: table(sample$married, sample$trade)
## X-squared = 9.2022, df = 1, p-value = 0.002417
We perform a chi-square test of independence between marital status and trade employment. We use table to count the observations for each combination of categories (a cross-tabulation), and the chi-square test checks whether these two variables are independent.
The results tell us that marital status and working in trade are associated: the distribution of trade employment differs between married and unmarried individuals. However, the chi-square test does not tell us the magnitude or direction of the difference, nor whether marriage causes industry choice.
married may be a confounding variable! If marriage influences both trade employment and wages, ignoring it can lead to omitted variable bias.
You can look at the contingency table to see which group drives the association.
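The two tables below can be produced with table and prop.table:

```r
library(wooldridge)
data("wage1")
sample <- wage1
counts <- table(sample$married, sample$trade)
counts                          # raw counts: rows = married, columns = trade
prop.table(counts, margin = 1)  # row proportions; each row sums to 1
```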
##
## 0 1
## 0 131 75
## 1 244 76
##
## 0 1
## 0 0.6359223 0.3640777
## 1 0.7625000 0.2375000
In the first table, the rows correspond to married (0 = unmarried, 1 = married) and the columns to trade (0 = not in trade, 1 = in trade); the cells are counts of observations.
The second table shows the row proportions, so each row sums to 1.
We are NOT saying that marriage causes people to work in trade; we are only showing a descriptive association.