Chi-Square Tests Explained: Testing Categorical Data
📊 Quick Answer
Chi-square tests analyze categorical (count) data to determine if observed frequencies differ significantly from expected frequencies. The two main types are the goodness-of-fit test (does data match an expected distribution?) and the test of independence (are two categorical variables related?). Both compare what you observed to what you’d expect if the null hypothesis were true.
What Is a Chi-Square Test?
A chi-square test (written χ² and pronounced “kai-square”) is a hypothesis test for categorical data. Unlike t-tests and ANOVA, which compare means of numerical data, chi-square tests compare counts or frequencies.
The core question: Do the observed counts differ significantly from what we’d expect by chance?
The chi-square statistic measures the discrepancy between observed (O) and expected (E) frequencies:
χ² = Σ (O − E)² / E
Sum across all categories
A larger χ² value means a greater discrepancy between observed and expected frequencies, and therefore stronger evidence against the null hypothesis.
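The formula takes only a few lines to compute directly; a minimal Python sketch (the counts are illustrative, matching the die example later in this guide):

```python
# Hand-compute the chi-square statistic: sum (O - E)^2 / E over every category.
observed = [8, 12, 7, 15, 9, 9]
expected = [10, 10, 10, 10, 10, 10]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_sq, 2))  # 4.4
```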
The Chi-Square Distribution
The chi-square distribution is the sampling distribution used to determine p-values for chi-square tests. It has several important properties:
- Always positive: Since we square the differences, χ² can never be negative
- Always right-skewed: Though it becomes more symmetric with higher degrees of freedom
- Shape depends on degrees of freedom: Different df values produce different distributions
The chi-square distribution is always right-skewed and becomes more symmetric as df increases
For chi-square tests:
- Goodness-of-fit: df = (number of categories) − 1
- Test of independence: df = (rows − 1) × (columns − 1)
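Given the degrees of freedom, critical values and p-values come from the right tail of this distribution. A quick sketch, assuming scipy is available:

```python
from scipy.stats import chi2

df = 5  # e.g. a goodness-of-fit test with 6 categories: df = 6 - 1

critical_value = chi2.ppf(0.95, df)  # reject H0 at alpha = 0.05 beyond this cutoff
p_value = chi2.sf(4.4, df)           # right-tail area beyond a statistic of 4.4

print(round(critical_value, 2))  # 11.07
print(round(p_value, 2))         # 0.49
```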
Chi-Square vs. T-Test vs. ANOVA: Which Test Do I Use?
| Question Type | Data Type | Test to Use |
|---|---|---|
| Compare means of 2 groups | Numerical (continuous) | T-test |
| Compare means of 3+ groups | Numerical (continuous) | ANOVA |
| Does data fit expected distribution? | Categorical (counts) | Chi-square goodness-of-fit |
| Are two categorical variables related? | Categorical (counts) | Chi-square independence |
| Compare proportions between groups | Categorical (counts) | Chi-square or z-test for proportions |
Key rule: If your data is counts/frequencies in categories, use chi-square. If your data is measurements/scores you can average, use t-test or ANOVA.
Chi-Square Goodness-of-Fit Test
The goodness-of-fit test determines whether observed data matches an expected distribution. It answers: “Do the frequencies in my sample match what I’d expect based on some theory or claim?”
When to Use It
- Testing if a die is fair (each face should appear 1/6 of the time)
- Checking if customer preferences match market research predictions
- Verifying if genetic ratios match Mendelian expectations (3:1 ratio)
- Testing if days of the week have equal accident rates
Hypotheses
H₀: The observed frequencies match the expected frequencies (data fits the expected distribution)
H₁: The observed frequencies do not match the expected frequencies
Goodness-of-fit compares what you observed to what you’d expect if the null hypothesis were true
Example: Testing a Die
You roll a die 60 times and want to test if it’s fair. If fair, each face should appear about 10 times (60 ÷ 6 = 10).
| Face | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Observed (O) | 8 | 12 | 7 | 15 | 9 | 9 |
| Expected (E) | 10 | 10 | 10 | 10 | 10 | 10 |
χ² = (8-10)²/10 + (12-10)²/10 + (7-10)²/10 + (15-10)²/10 + (9-10)²/10 + (9-10)²/10
χ² = 0.4 + 0.4 + 0.9 + 2.5 + 0.1 + 0.1 = 4.4
df = 6 − 1 = 5
Looking up χ² = 4.4 with df = 5 gives p ≈ 0.49. Since p > 0.05, we fail to reject H₀—the die appears fair.
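The same result falls out of scipy’s built-in goodness-of-fit function; when `f_exp` is omitted, equal expected counts are assumed, which is exactly the fair-die hypothesis:

```python
from scipy.stats import chisquare

observed = [8, 12, 7, 15, 9, 9]  # the 60 die rolls from the table above

stat, p = chisquare(observed)  # default: equal expected counts (10 per face)
print(round(stat, 2), round(p, 2))  # 4.4 0.49
```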
📝 Chi-Square Calculation Steps
- State hypotheses: H₀ (frequencies match expected) vs. H₁ (they don’t)
- Calculate expected frequencies: From the hypothesized distribution (goodness-of-fit) or (row total × column total) / grand total (independence)
- Check assumptions: All expected frequencies ≥ 5
- For each cell, calculate: (O − E)² / E
- Sum all cells: χ² = Σ (O − E)² / E
- Find degrees of freedom: df = k − 1 (goodness-of-fit) or df = (r−1)(c−1) (independence)
- Find p-value: Use chi-square table or calculator
- Compare to α: If p < α, reject H₀
Chi-Square Test of Independence
The test of independence determines whether two categorical variables are related or independent. It uses a contingency table (cross-tabulation) to organize the data.
When to Use It
- Is smoking status related to lung disease? (smoker/non-smoker vs. disease/no disease)
- Does gender affect voting preference? (male/female vs. candidate A/B/C)
- Is treatment effectiveness related to age group?
- Are education level and income bracket associated?
💡 Which Chi-Square Test Do I Need?
- One categorical variable, comparing to expected distribution → Goodness-of-fit test
- Two categorical variables, testing if they’re related → Test of independence
- “Is this die/spinner/distribution fair?” → Goodness-of-fit test
- “Is Variable A associated with Variable B?” → Test of independence
- Data in a single row/column of counts → Goodness-of-fit test
- Data in a contingency table (rows × columns) → Test of independence
Hypotheses
H₀: The two variables are independent (no association)
H₁: The two variables are not independent (there is an association)
A contingency table organizes counts for two categorical variables
Example: Treatment and Age
A study examines whether treatment effectiveness depends on age group. 120 patients are classified by age (young/old) and outcome (improved/not improved):

| | Improved | Not Improved | Total |
|---|---|---|---|
| Young | 45 | 15 | 60 |
| Old | 35 | 25 | 60 |
| Total | 80 | 40 | 120 |

If age and outcome are independent, we’d expect the improvement rate to be the same in both age groups. The expected frequencies are calculated from the row and column totals.
Using the data from the contingency table above:
- χ² = (45-40)²/40 + (15-20)²/20 + (35-40)²/40 + (25-20)²/20
- χ² = 0.625 + 1.25 + 0.625 + 1.25 = 3.75
- df = (2-1)(2-1) = 1
With χ² = 3.75 and df = 1, p ≈ 0.053. This is borderline—at α = 0.05, we’d barely fail to reject H₀, but at α = 0.10, we’d reject it.
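With the counts arranged as a 2×2 table, scipy reproduces this calculation. One caveat: `chi2_contingency` applies Yates’ continuity correction to 2×2 tables by default, so `correction=False` is needed to match the hand computation:

```python
from scipy.stats import chi2_contingency

# Rows: age group; columns: improved / not improved.
table = [[45, 15],
         [35, 25]]

stat, p, df, expected = chi2_contingency(table, correction=False)
print(round(stat, 2), df, round(p, 3))  # 3.75 1 0.053
```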
📊 Complete Worked Example: Test of Independence
Problem: A survey of 200 people asks about exercise habits (Regular/None) and stress level (High/Low). Is exercise associated with stress level?
| | High Stress | Low Stress | Total |
|---|---|---|---|
| Regular Exercise | 30 | 50 | 80 |
| No Exercise | 60 | 60 | 120 |
| Total | 90 | 110 | 200 |
Step 1: State hypotheses
H₀: Exercise and stress are independent (no association)
H₁: Exercise and stress are associated
Step 2: Calculate expected frequencies
E = (row total × column total) / grand total
- E(Regular, High) = (80 × 90) / 200 = 36
- E(Regular, Low) = (80 × 110) / 200 = 44
- E(None, High) = (120 × 90) / 200 = 54
- E(None, Low) = (120 × 110) / 200 = 66
Step 3: Check assumptions
All expected values ≥ 5 ✓
Step 4: Calculate χ²
χ² = (30−36)²/36 + (50−44)²/44 + (60−54)²/54 + (60−66)²/66
χ² = 1.00 + 0.82 + 0.67 + 0.55 = 3.04
Step 5: Degrees of freedom
df = (2−1)(2−1) = 1
Step 6: Find p-value and conclude
χ² = 3.04 with df = 1 gives p ≈ 0.081
Since p = 0.081 > α = 0.05, we fail to reject H₀.
Conclusion: There is not sufficient evidence to conclude that exercise and stress level are associated at the 0.05 significance level.
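Assuming scipy is available, the whole example can be verified in a few lines. Note that software keeps full precision, so it reports χ² ≈ 3.03 (the 3.04 above comes from summing rounded terms); the conclusion is the same:

```python
from scipy.stats import chi2_contingency

table = [[30, 50],   # regular exercise: high stress, low stress
         [60, 60]]   # no exercise

stat, p, df, expected = chi2_contingency(table, correction=False)
print(expected.tolist())            # [[36.0, 44.0], [54.0, 66.0]]
print(round(stat, 2), round(p, 3))  # 3.03 0.082
```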
Assumptions and Requirements
Chi-square tests have specific requirements that must be met for valid results:
✅ Chi-Square Requirements
- Random sampling: Data should come from a random sample
- Independence: Observations must be independent of each other
- Categorical data: Variables must be categorical (counts/frequencies)
- Expected frequency rule: All expected frequencies should be ≥ 5
- Mutually exclusive categories: Each observation fits in exactly one cell
⚠️ The Expected Frequency Rule
If any expected frequency is less than 5, the chi-square approximation may be inaccurate. Solutions: combine categories, collect more data, or use Fisher’s exact test (for 2×2 tables with small expected counts).
📘 Yates’ Continuity Correction (2×2 Tables)
Some courses require Yates’ correction for 2×2 contingency tables, which adjusts for the fact that chi-square is a continuous distribution but counts are discrete. The corrected formula subtracts 0.5 from each |O − E| before squaring:
χ²(Yates) = Σ (|O − E| − 0.5)² / E
Yates’ correction produces a more conservative (larger) p-value. Check your course requirements—some instructors require it, others don’t. Modern practice often skips it in favor of Fisher’s exact test for small samples.
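You can see the correction’s effect directly in scipy, which applies it to 2×2 tables by default (`correction=True`); comparing both versions on the treatment/age example:

```python
from scipy.stats import chi2_contingency

table = [[45, 15], [35, 25]]

stat_plain, p_plain, _, _ = chi2_contingency(table, correction=False)
stat_yates, p_yates, _, _ = chi2_contingency(table, correction=True)

print(round(stat_plain, 4), round(stat_yates, 4))  # 3.75 3.0375
print(p_yates > p_plain)  # True: the corrected p-value is more conservative
```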
Calculating Expected Frequencies
Goodness-of-Fit Test
Expected frequencies come from the hypothesized distribution:
- Equal distribution: E = n / (number of categories)
- Specified proportions: E = n × (hypothesized proportion)
Test of Independence
Expected frequencies assume the variables are independent:
E = (Row Total × Column Total) / Grand Total
This formula calculates what we’d expect in each cell if the row and column variables were completely unrelated.
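The formula vectorizes neatly: an outer product of the margins builds (row total × column total) for every cell at once. A sketch with numpy, using the exercise/stress counts from the worked example:

```python
import numpy as np

observed = np.array([[30, 50],
                     [60, 60]])

row_totals = observed.sum(axis=1)   # [80, 120]
col_totals = observed.sum(axis=0)   # [90, 110]
grand_total = observed.sum()        # 200

# E = (row total x column total) / grand total, for every cell at once.
expected = np.outer(row_totals, col_totals) / grand_total
print(expected.tolist())  # [[36.0, 44.0], [54.0, 66.0]]
```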
Interpreting Results
Decision Rule
Compare your calculated χ² to the critical value from the chi-square table (based on df and α), or use the p-value:
- If p < α: Reject H₀. For goodness-of-fit, data doesn’t match expected distribution. For independence, variables are associated.
- If p ≥ α: Fail to reject H₀. For goodness-of-fit, data is consistent with expected distribution. For independence, no evidence of association.
Effect Size: Cramér’s V
A significant chi-square tells you there’s an association, but not how strong. Cramér’s V measures effect size for chi-square tests of independence:
V = √(χ² / (n × (min(r,c) − 1)))
where r = rows, c = columns
Interpretation: V ranges from 0 to 1. Generally: V < 0.1 is negligible, V = 0.1–0.3 is small, V = 0.3–0.5 is medium, V > 0.5 is large.
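Cramér’s V takes one line once the χ² statistic is in hand; a sketch using the exercise/stress table from the worked example:

```python
import math
from scipy.stats import chi2_contingency

table = [[30, 50], [60, 60]]  # exercise/stress example from above

stat, p, df, expected = chi2_contingency(table, correction=False)

n = sum(sum(row) for row in table)   # total sample size: 200
k = min(len(table), len(table[0]))   # min(rows, columns)
cramers_v = math.sqrt(stat / (n * (k - 1)))
print(round(cramers_v, 3))  # 0.123 -> a small effect
```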
Common Student Mistakes
❌ Mistake #1: Using percentages instead of counts
Chi-square tests require raw frequency counts, not percentages or proportions. If your data is in percentages, convert back to counts before running the test.
❌ Mistake #2: Ignoring the expected frequency rule
If any expected frequency is below 5, your results may be invalid. Always check expected values before interpreting. If some are too small, combine categories or use an alternative test.
❌ Mistake #3: Using chi-square on numerical data
Chi-square is for categorical data only. If your variables are continuous (height, weight, test scores), you need a different test like correlation or t-test. The data must be counts in categories.
❌ Mistake #4: Confusing goodness-of-fit with test of independence
Goodness-of-fit: One categorical variable, comparing to a hypothesized distribution. Test of independence: Two categorical variables, testing if they’re related. The setup and df calculations differ.
❌ Mistake #5: Misinterpreting “independence”
Failing to reject H₀ doesn’t prove the variables ARE independent—it just means you don’t have enough evidence to say they’re related. Statistical tests can’t prove the null hypothesis.
❌ Mistake #6: Wrong degrees of freedom
For goodness-of-fit: df = categories − 1. For independence: df = (rows − 1)(columns − 1). Using the wrong df gives wrong p-values and wrong conclusions.
Platform-Specific Tips
ALEKS
ALEKS frequently asks you to identify the correct test before calculating. Know the difference between goodness-of-fit and independence problems from the wording. ALEKS is strict about rounding—follow their instructions exactly, and double-check your expected values calculation.
MyStatLab (Pearson)
StatCrunch handles chi-square tests well. For goodness-of-fit, go to Stat → Goodness-of-fit → Chi-Square Test. For independence, use Stat → Tables → Contingency → With Summary (if you have counts). MyStatLab often asks for both the test statistic and expected values—show your work.
WebAssign
WebAssign chi-square problems often give you a contingency table and ask for expected frequencies, χ², df, and conclusion. Calculate expected values carefully—they sometimes ask for specific cells. Pay attention to whether they want a one-tailed interpretation (rare for chi-square) or the standard right-tailed test.
Calculator Tips (TI-83/84)
- Enter observed data: Matrix [A] (MATRIX → EDIT)
- Run test: STAT → TESTS → χ²-Test
- Expected values: Automatically stored in Matrix [B] after running the test
- Goodness-of-fit: Use χ²GOF-Test if available, or calculate manually
Need help with these platforms? Our tutors work with ALEKS statistics, MyStatLab, and WebAssign every day.
Frequently Asked Questions
What’s the difference between chi-square goodness-of-fit and test of independence?
Goodness-of-fit tests one categorical variable against a hypothesized distribution (e.g., “Is this die fair?”). Test of independence tests whether two categorical variables are related (e.g., “Is smoking related to lung disease?”). Different setup, different df formula, but same χ² formula.
What if my expected frequencies are less than 5?
The chi-square approximation becomes unreliable when expected frequencies are too small. Options: combine categories to increase expected counts, collect more data, or use Fisher’s exact test (for 2×2 tables). Some sources say E ≥ 5 in all cells; others allow up to 20% of cells with E between 1 and 5. Check your course requirements.
Why is chi-square always a right-tailed test?
Because we square the differences (O − E)², the χ² value is always positive. Large χ² values indicate large discrepancies between observed and expected—which is what makes us doubt the null hypothesis. We never care about “too small” χ² values because they just mean observed matches expected well.
How do I calculate degrees of freedom?
Goodness-of-fit: df = k − 1, where k is the number of categories. Test of independence: df = (r − 1)(c − 1), where r is rows and c is columns. For a 3×4 table, df = (3-1)(4-1) = 6.
Can chi-square tell me which cells are different?
The overall chi-square test just tells you whether there’s a significant association, not where. To identify which specific cells contribute most to the chi-square value, examine the standardized residuals: (O − E) / √E. Cells whose residuals exceed about ±2 in magnitude differ notably from their expected counts.
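A sketch computing the residuals for the exercise/stress example from earlier in this guide:

```python
import numpy as np

# Observed and expected counts from the exercise/stress worked example.
observed = np.array([[30.0, 50.0], [60.0, 60.0]])
expected = np.array([[36.0, 44.0], [54.0, 66.0]])

residuals = (observed - expected) / np.sqrt(expected)
print(np.round(residuals, 2).tolist())  # [[-1.0, 0.9], [0.82, -0.74]]
# No cell exceeds +/- 2, consistent with the non-significant overall test.
```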
What’s the relationship between chi-square and other tests?
For a 2×2 table, the chi-square test of independence is equivalent to a z-test for two proportions (χ² = z²). Chi-square is also related to the G-test (likelihood ratio test), which some courses prefer. For paired categorical data, use McNemar’s test instead of the standard chi-square.
When should I use Fisher’s exact test instead?
Fisher’s exact test is preferred when sample sizes are small and expected frequencies fall below 5, especially in 2×2 tables. It calculates the exact probability rather than using the chi-square approximation. Most statistical software offers this option—look for “Fisher’s exact” when running chi-square on small samples.
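In scipy this is a one-liner; the counts below are hypothetical, chosen so that some expected frequencies fall under 5:

```python
from scipy.stats import fisher_exact

# A small-sample 2x2 table where the chi-square approximation would be shaky.
table = [[3, 7],
         [9, 2]]

odds_ratio, p = fisher_exact(table)  # exact two-sided p-value
print(round(p, 3))  # 0.03
```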
Can you help with my chi-square homework?
Absolutely. Chi-square tests are a common topic in intro statistics, and our tutors handle them regularly. Whether you’re working on goodness-of-fit problems, independence tests, or need help interpreting contingency tables, we work with ALEKS, MyStatLab, WebAssign, and other platforms daily. Get a free quote to get started.
Quick Reference Summary
📊 Goodness-of-Fit Test
Purpose: Test if data matches expected distribution
Data: One categorical variable
df = k − 1 (categories minus 1)
H₀: Observed = Expected
🔗 Test of Independence
Purpose: Test if two variables are related
Data: Two categorical variables
df = (r − 1)(c − 1)
H₀: Variables are independent
🧮 Key Formulas
| Chi-square statistic: | χ² = Σ (O − E)² / E |
| Expected (independence): | E = (row total × column total) / n |
| Yates’ correction: | χ² = Σ (|O − E| − 0.5)² / E |
| Cramér’s V: | V = √(χ² / (n × (min(r,c) − 1))) |
⚠️ Requirements Checklist
- Random sample ✓
- Independent observations ✓
- Categorical data (counts, not percentages) ✓
- All expected frequencies ≥ 5 ✓
- Each observation in exactly one cell ✓
Decision rule: If p-value < α (usually 0.05), reject H₀. Chi-square tests are always right-tailed.
Related Resources
Statistics Foundations
- Hypothesis Testing Guide
- T-Tests and ANOVA Explained
- Descriptive Statistics Explained
- Normal Distribution Guide
Statistics Help
Need Help With Chi-Square Tests?
Our tutors handle chi-square problems daily—from setting up contingency tables to calculating expected frequencies and interpreting results.