P-Values Explained: What They Really Mean (And Don’t Mean)
📊 Quick Answer
A p-value is the probability of getting results at least as extreme as yours, assuming the null hypothesis is true. In plain English: “If nothing interesting is really happening (H₀ is true), how surprised should we be by these results?” A small p-value means your results would be rare if H₀ were true—suggesting maybe H₀ isn’t true after all.
What Is a P-Value?
The p-value answers a specific question: “If the null hypothesis were true, what’s the probability of seeing data at least as extreme as what we observed?”
Let’s break that down:
- “If the null hypothesis were true” — We start by assuming H₀ is correct (no effect, no difference, nothing interesting is happening)
- “What’s the probability” — We calculate how likely certain results would be
- “Data at least as extreme as what we observed” — Results this far or farther from what H₀ predicts
Think of it like a “surprise meter.” If you assume nothing special is happening, how surprised should you be by your results?
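To make the definition concrete, here is a minimal sketch in Python (standard library only). The scenario is entirely hypothetical: a one-sample z-test where H₀ says the population mean is 100 and the population standard deviation of 15 is known.

```python
from statistics import NormalDist

# Hypothetical scenario: H0 says the population mean is 100
# (known sigma = 15). We sample n = 36 and observe a mean of 105.
mu0, sigma, n, xbar = 100, 15, 36, 105

# How many standard errors is the observed mean from what H0 predicts?
se = sigma / n ** 0.5          # standard error = 15 / 6 = 2.5
z = (xbar - mu0) / se          # z = 2.0

# Two-tailed p-value: probability, IF H0 is true, of a sample mean
# at least this far from 100 in either direction.
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(z, 2), round(p_two_tailed, 4))   # 2.0 0.0455
```

A p-value of about 0.045 says: in a world where the true mean really is 100, fewer than 5% of samples would land this far out. That is the "surprise meter" reading.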
💡 The Courtroom Analogy
In a trial, we assume “innocent until proven guilty” (that’s our H₀). The p-value is like asking: “If the defendant really IS innocent, how likely is all this evidence against them?” If the answer is “very unlikely,” we have reason to doubt the innocence assumption—just like a small p-value makes us doubt H₀.
What a P-Value Is NOT (The Misconceptions)
This is where most students go wrong. The p-value is one of the most misunderstood concepts in statistics—even researchers get it wrong. Here’s what it does NOT mean:
The p-value is about the data, not about the hypothesis being true or false
The 5 Biggest Misconceptions
❌ Misconception #1: “P = 0.03 means there’s a 3% chance the null hypothesis is true”
Reality: The p-value is NOT the probability that H₀ is true. H₀ is either true or false—it’s not a random variable. The p-value tells you about the probability of your DATA, assuming H₀ is already true.
❌ Misconception #2: “P = 0.03 means there’s a 97% chance the alternative hypothesis is true”
Reality: Same problem. The p-value doesn’t tell you the probability of H₁ being true. You can’t just subtract from 1. The math doesn’t work that way.
❌ Misconception #3: “P = 0.03 means there’s a 3% chance I’m making an error”
Reality: The probability of a Type I error (rejecting H₀ when it’s true) is α, not the p-value. If α = 0.05, your error rate is 5%—regardless of whether your specific p-value is 0.001 or 0.049.
❌ Misconception #4: “A small p-value means the effect is large/important”
Reality: P-values say nothing about the size of the effect. A tiny, meaningless effect can have a tiny p-value if your sample is large enough. Always look at effect size separately.
❌ Misconception #5: “P = 0.03 means the results are probably not due to chance”
Reality: The p-value tells you probability IF results are due to chance—not the probability that they ARE due to chance. It’s a subtle but critical difference. P(data | H₀) ≠ P(H₀ | data).
🔑 Remember this:
P-value = P(getting this data | H₀ is true)
P-value ≠ P(H₀ is true | this data)
These look similar but are completely different questions!
The P-Value on a Distribution
Visually, the p-value is the area under the curve in the tail(s) of the sampling distribution, beyond your test statistic.
The p-value is the shaded area: probability of getting results this extreme or more extreme
The curve represents all possible results you could get IF H₀ were true. Your test statistic (like t = 2.15) lands somewhere on this distribution. The p-value is the probability of landing that far out (or farther) in the tail.
Key insight: If H₀ is true, extreme values are rare. So if you get an extreme result (far in the tail), either:
- You got unlucky and saw a rare event, OR
- H₀ isn’t actually true
The smaller the p-value, the more we lean toward option 2.
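The same "rare if H₀ is true" logic can be seen by simulation: draw many results from a world where H₀ holds and count how often they land at least as far out as the observed statistic. A sketch, assuming a standard normal sampling distribution and a hypothetical observed statistic of 2.15:

```python
import random
from statistics import NormalDist

random.seed(42)
observed = 2.15   # hypothetical test statistic from your study

# Simulate 100,000 results from the null world (standard normal).
sims = [random.gauss(0, 1) for _ in range(100_000)]

# Two-tailed p-value by counting: the fraction of null-world results
# at least as extreme as the one we observed.
p_sim = sum(abs(z) >= abs(observed) for z in sims) / len(sims)

# Exact tail area for comparison (about 0.032).
p_exact = 2 * (1 - NormalDist().cdf(abs(observed)))
print(round(p_sim, 3), round(p_exact, 3))
```

The simulated fraction and the exact tail area agree closely, which is the point: the p-value is nothing more than "how often would the null world produce something this extreme?"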
Interpreting P-Values
P-values are typically interpreted on a scale of evidence against the null hypothesis:
Smaller p-values indicate stronger evidence against H₀
| P-Value Range | Evidence Against H₀ | Typical Interpretation |
|---|---|---|
| p > 0.10 | Little or none | Results consistent with H₀ |
| 0.05 < p ≤ 0.10 | Weak | Marginally significant (borderline) |
| 0.01 < p ≤ 0.05 | Moderate | Statistically significant |
| 0.001 < p ≤ 0.01 | Strong | Highly significant |
| p ≤ 0.001 | Very strong | Very highly significant |
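The scale in the table can be expressed as a small helper function (the cutoffs simply mirror the table; they are conventions, not laws of nature):

```python
def evidence_against_h0(p: float) -> str:
    """Map a p-value to the evidence scale in the table above."""
    if p > 0.10:
        return "little or none"
    if p > 0.05:
        return "weak (marginally significant)"
    if p > 0.01:
        return "moderate (statistically significant)"
    if p > 0.001:
        return "strong (highly significant)"
    return "very strong (very highly significant)"

print(evidence_against_h0(0.03))   # moderate (statistically significant)
```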
⚠️ Don’t Treat 0.05 as Magic
There’s nothing special about p = 0.05. It’s a convention, not a law of nature. A result with p = 0.049 is not fundamentally different from p = 0.051. The cutoff is arbitrary—always report the actual p-value so readers can judge for themselves.
The Decision Rule
In hypothesis testing, we compare the p-value to a predetermined significance level (α), usually 0.05:
The decision depends on whether p falls above or below your chosen α
📝 The Decision Rule
- If p < α: Reject H₀. Results are statistically significant.
- If p ≥ α: Fail to reject H₀. Results are not statistically significant.
Common α values:
- α = 0.05 — Most common; 5% significance level
- α = 0.01 — Stricter; used when you need strong evidence
- α = 0.10 — More lenient; used in exploratory research
Important: You must choose α BEFORE looking at your data. Changing α after seeing the p-value is cheating (called “p-hacking”).
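Once α is fixed in advance, the decision itself is purely mechanical. A minimal sketch:

```python
def decide(p: float, alpha: float = 0.05) -> str:
    """Apply the decision rule (alpha must be chosen BEFORE seeing the data)."""
    if p < alpha:
        return "Reject H0 (statistically significant)"
    return "Fail to reject H0 (not statistically significant)"

print(decide(0.03))         # Reject H0 (statistically significant)
print(decide(0.08))         # Fail to reject H0 (not statistically significant)
print(decide(0.08, 0.10))   # Reject H0 (statistically significant)
```

Note the last line: the same p-value leads to opposite decisions under different α levels, which is exactly why α must be fixed before the analysis, not tuned afterward.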
One-Tailed vs. Two-Tailed P-Values
The p-value depends on whether you’re doing a one-tailed or two-tailed test:
| Two-Tailed Test | One-Tailed Test |
|---|---|
| H₁: μ ≠ value (not equal) | H₁: μ > value OR μ < value |
| Looks for difference in EITHER direction | Looks for difference in ONE specific direction |
| P-value = area in BOTH tails | P-value = area in ONE tail only |
| Use when direction isn’t pre-specified | Use when you predict a specific direction |
| More conservative (harder to reject H₀) | Less conservative (easier to reject H₀) |
Key relationship: One-tailed p-value = (Two-tailed p-value) / 2, provided the test statistic falls in the direction your H₁ predicts. (If it falls in the opposite direction, the one-tailed p-value is 1 − two-tailed/2, not half of it.)
If a calculator gives you a two-tailed p-value of 0.04 and you’re doing a one-tailed test in the correct direction, your p-value is 0.02.
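Assuming a symmetric (normal) sampling distribution and a hypothetical statistic of z = 2.05, the relationship looks like this:

```python
from statistics import NormalDist

z = 2.05                               # hypothetical test statistic
p_one = 1 - NormalDist().cdf(z)        # one-tailed, H1: mu > value (~0.020)
p_two = 2 * p_one                      # two-tailed, H1: mu != value (~0.040)

# Halving only works in the predicted direction: had H1 said mu > value
# but z come out negative, the one-tailed p-value would be 1 - p_two / 2.
print(round(p_one, 4), round(p_two, 4))
```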
⚠️ When to Use One-Tailed Tests
Only use a one-tailed test when you have a strong, pre-specified reason to look in one direction only. If you’re just hoping for a smaller p-value, that’s cheating. When in doubt, use two-tailed—it’s the safer, more common choice.
Statistical vs. Practical Significance
A common trap: assuming a small p-value means an important finding. Statistical significance and practical significance are different things.
A tiny p-value doesn’t mean the effect matters; always consider effect size
Why this happens: With a large enough sample, ANY effect—no matter how tiny—becomes statistically significant. The p-value shrinks as sample size grows, even if the actual effect stays the same.
What to do:
- Always report effect size alongside p-values (Cohen’s d, R², etc.)
- Ask: “Even if this is real, does it matter in practice?”
- Consider the context: a 1-point difference might matter in some situations but not others
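A quick demonstration of why large samples make tiny effects "significant": hold a small standardized effect fixed and grow n. This sketch assumes a one-sample z-test and a hypothetical effect of 0.05 standard deviations:

```python
from statistics import NormalDist

effect = 0.05   # hypothetical standardized effect, fixed throughout
for n in (100, 1_000, 10_000, 100_000):
    z = effect * n ** 0.5    # for a one-sample z-test, z = effect * sqrt(n)
    p = 2 * (1 - NormalDist().cdf(z))
    print(f"n = {n:>7,}   z = {z:5.2f}   p = {p:.4f}")
```

The effect never changes, yet p slides from clearly non-significant to far below 0.001. That is why the p-value alone cannot tell you whether a finding matters.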
Common Student Mistakes
❌ Mistake #1: Saying “accept H₀”
We never “accept” the null hypothesis—we only “fail to reject” it. Absence of evidence isn’t evidence of absence. Not having enough evidence to reject H₀ doesn’t prove H₀ is true; you might just need more data.
❌ Mistake #2: Treating p = 0.05 as a cliff
There’s no meaningful difference between p = 0.049 and p = 0.051. Don’t say one is “significant” and the other is “not significant” as if they’re worlds apart. Report the actual p-value and interpret the strength of evidence on a continuum.
❌ Mistake #3: Confusing p-value with effect size
A small p-value does NOT mean a large effect. With n = 10,000, even a trivial difference becomes “highly significant.” Always look at effect size (like Cohen’s d) to assess how big the effect actually is.
❌ Mistake #4: Interpreting p as P(H₀ is true)
This is the #1 misconception. P = 0.03 does NOT mean “3% chance the null is true.” It means “IF H₀ is true, there’s a 3% chance of getting data this extreme.” The null hypothesis isn’t a random variable—it’s either true or false.
❌ Mistake #5: Choosing one-tailed tests to get significance
If your two-tailed p-value is 0.08 and you switch to one-tailed to get p = 0.04, that’s p-hacking. The direction must be specified BEFORE seeing the data, based on theory—not to manufacture significance.
❌ Mistake #6: Writing conclusions incorrectly
Wrong: “We proved the drug works.” Right: “We found statistically significant evidence that the drug is associated with improvement.” Statistics shows association and evidence, not proof.
Platform-Specific Tips
ALEKS
ALEKS often asks you to identify the correct conclusion based on a p-value and α. Use exactly these phrases:
- p < α: “Reject H₀” or “There is sufficient evidence to support [H₁]”
- p ≥ α: “Fail to reject H₀” or “There is not sufficient evidence to support [H₁]”
ALEKS is strict about wording—never say “accept H₀.”
MyStatLab (Pearson)
StatCrunch and MyStatLab usually provide exact p-values. Watch whether they give one-tailed or two-tailed—the software often defaults to two-tailed. If your alternative hypothesis is one-sided, you may need to divide by 2. MyStatLab frequently asks for interpretation sentences—include context and direction.
WebAssign
WebAssign problems often test conceptual understanding: “What does this p-value mean?” Review the misconceptions section—they love testing whether you know what the p-value ISN’T. Also watch for questions about Type I error vs. p-value.
TI-83/84 Calculator
The calculator gives you a p-value directly for most tests (STAT → TESTS). It typically gives two-tailed p-values for t-tests and z-tests. The p-value appears in the output—just compare to α and make your decision.
Need help with these platforms? Our tutors work with ALEKS statistics, MyStatLab, and WebAssign every day.
Quick Reference Summary
📐 Definition
P-value = Probability of getting results at least as extreme as observed, assuming H₀ is true
✅ Decision Rule
p < α: Reject H₀ (significant)
p ≥ α: Fail to reject H₀ (not significant)
❌ What P-Value Is NOT
- NOT the probability H₀ is true
- NOT the probability H₁ is true
- NOT the probability you made an error
- NOT a measure of effect size or importance
📝 Conclusion Templates
Reject H₀: “At the α = 0.05 significance level, there is sufficient evidence to conclude that [H₁ in context].”
Fail to reject H₀: “At the α = 0.05 significance level, there is not sufficient evidence to conclude that [H₁ in context].”
⚠️ Remember: Never say “accept H₀” • Always report actual p-value • Statistical significance ≠ practical importance
Frequently Asked Questions
What p-value is considered significant?
The most common threshold is α = 0.05 (5%). If p < 0.05, results are typically called “statistically significant.” However, this is just a convention—some fields use stricter thresholds (0.01) or more lenient ones (0.10). Always report the actual p-value and let readers judge.
Why can’t we say “accept the null hypothesis”?
Failing to reject H₀ doesn’t prove H₀ is true—it just means you don’t have enough evidence against it. Maybe the effect is real but your sample was too small to detect it. It’s like a jury verdict of “not guilty”—it doesn’t mean innocent, just not proven guilty beyond reasonable doubt.
Can the p-value ever be exactly 0?
Technically, no—there’s always some probability, however tiny. When software reports p = 0.000 or p < 0.001, it means the value is too small to display. You can say “p < 0.001” but not “p = 0.” In reality, p-values like 10⁻¹⁵ are effectively zero for practical purposes.
What’s the relationship between p-value and confidence interval?
They’re related: if the 95% confidence interval doesn’t include the null hypothesis value, the two-tailed p-value is < 0.05. They provide complementary information—the p-value tells you about significance, while the CI tells you about the range of plausible effect sizes.
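This duality is easy to check numerically. A sketch with hypothetical summary numbers (sample mean 105, standard error 2.5, H₀ value 100), again using the normal distribution:

```python
from statistics import NormalDist

xbar, se, mu0 = 105.0, 2.5, 100.0       # hypothetical summary statistics

# 95% confidence interval for the mean
z_crit = NormalDist().inv_cdf(0.975)    # about 1.96
ci = (xbar - z_crit * se, xbar + z_crit * se)

# Two-tailed p-value for H0: mu = mu0
z = (xbar - mu0) / se
p = 2 * (1 - NormalDist().cdf(abs(z)))

# The two always agree: the 95% CI excludes mu0 exactly when p < 0.05.
excludes_null = not (ci[0] <= mu0 <= ci[1])
print(excludes_null, p < 0.05)   # True True
```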
Why do some statisticians criticize p-values?
P-values are widely misunderstood and misused. Critics argue they encourage binary thinking (significant/not significant), lead to p-hacking, and don’t tell you what you really want to know (probability the hypothesis is true). Many now recommend also reporting effect sizes, confidence intervals, and using Bayesian methods.
What if my p-value is exactly 0.05?
This is a borderline case. Technically, if p = 0.05 exactly and α = 0.05, you’d fail to reject (since we need p < α, not p ≤ α). But in practice, this is splitting hairs. Report “p = 0.05” and acknowledge it’s borderline—the difference between 0.049 and 0.051 isn’t meaningful.
How do sample size and p-value relate?
Larger samples lead to smaller p-values for the same effect size. With enough data, even tiny effects become “statistically significant.” This is why you can’t interpret p-values without considering sample size. A significant result with n = 10,000 might represent a trivially small effect.
Can you help with my p-value homework?
Absolutely. P-value interpretation is one of the most common topics we help with because it’s so frequently misunderstood. Whether you need help calculating p-values, interpreting output, or writing conclusions, our tutors work with ALEKS, MyStatLab, WebAssign, and other platforms daily. Get a free quote to get started.
Related Resources
Statistics Foundations
- Hypothesis Testing Guide
- T-Tests and ANOVA Explained
- Confidence Intervals Explained
- Normal Distribution Guide
Need Help Understanding P-Values?
Our tutors explain p-values in plain English and help you interpret them correctly on homework and exams.