P-Values Explained: What They Really Mean (And Don’t Mean)
📊 Quick Answer
A p-value is the probability of getting results at least as extreme as yours, assuming the null hypothesis is true. In plain English: “If nothing interesting is really happening (H₀ is true), how surprised should we be by these results?” A small p-value means your results would be rare if H₀ were true—suggesting maybe H₀ isn’t true after all.
What Is a P-Value?
The p-value answers a specific question: “If the null hypothesis were true, what’s the probability of seeing data at least as extreme as what we observed?”
Let’s break that down:
- “If the null hypothesis were true” — We start by assuming H₀ is correct (no effect, no difference, nothing interesting is happening)
- “What’s the probability” — We calculate how likely certain results would be
- “Data at least as extreme as what we observed” — Results this far or farther from what H₀ predicts
Think of it like a “surprise meter.” If you assume nothing special is happening, how surprised should you be by your results?
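To make the definition concrete, here is a minimal sketch in Python (standard library only). The scenario is entirely hypothetical: a one-sample z-test where H₀ says the population mean is 100 and the population standard deviation of 15 is known.

```python
from statistics import NormalDist

# Hypothetical scenario: H0 says the population mean is 100
# (known sigma = 15). We sample n = 36 and observe a mean of 105.
mu0, sigma, n, xbar = 100, 15, 36, 105

# How many standard errors is the observed mean from what H0 predicts?
se = sigma / n ** 0.5          # standard error = 15 / 6 = 2.5
z = (xbar - mu0) / se          # z = 2.0

# Two-tailed p-value: probability, IF H0 is true, of a sample mean
# at least this far from 100 in either direction.
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(z, 2), round(p_two_tailed, 4))   # 2.0 0.0455
```

A p-value of about 0.045 says: in a world where the true mean really is 100, fewer than 5% of samples would land this far out. That is the "surprise meter" reading.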
💡 The Courtroom Analogy
In a trial, we assume “innocent until proven guilty” (that’s our H₀). The p-value is like asking: “If the defendant really IS innocent, how likely is all this evidence against them?” If the answer is “very unlikely,” we have reason to doubt the innocence assumption—just like a small p-value makes us doubt H₀.
What a P-Value Is NOT (The Misconceptions)
This is where most students go wrong. The p-value is one of the most misunderstood concepts in statistics—even researchers get it wrong. Here’s what it does NOT mean:
The p-value is about the data, not about the hypothesis being true or false
The 5 Biggest Misconceptions
❌ Misconception #1: “P = 0.03 means there’s a 3% chance the null hypothesis is true”
Reality: The p-value is NOT the probability that H₀ is true. H₀ is either true or false—it’s not a random variable. The p-value tells you about the probability of your DATA, assuming H₀ is already true.
❌ Misconception #2: “P = 0.03 means there’s a 97% chance the alternative hypothesis is true”
Reality: Same problem. The p-value doesn’t tell you the probability of H₁ being true. You can’t just subtract from 1. The math doesn’t work that way.
❌ Misconception #3: “P = 0.03 means there’s a 3% chance I’m making an error”
Reality: The probability of a Type I error (rejecting H₀ when it’s true) is α, not the p-value. If α = 0.05, your error rate is 5%—regardless of whether your specific p-value is 0.001 or 0.049.
❌ Misconception #4: “A small p-value means the effect is large/important”
Reality: P-values say nothing about the size of the effect. A tiny, meaningless effect can have a tiny p-value if your sample is large enough. Always look at effect size separately.
❌ Misconception #5: “P = 0.03 means the results are probably not due to chance”
Reality: The p-value tells you probability IF results are due to chance—not the probability that they ARE due to chance. It’s a subtle but critical difference. P(data | H₀) ≠ P(H₀ | data).
🔑 Remember this:
P-value = P(getting this data | H₀ is true)
P-value ≠ P(H₀ is true | this data)
These look similar but are completely different questions!
The P-Value on a Distribution
Visually, the p-value is the area under the curve in the tail(s) of the sampling distribution, beyond your test statistic.
The p-value is the shaded area: probability of getting results this extreme or more extreme
The curve represents all possible results you could get IF H₀ were true. Your test statistic (like t = 2.15) lands somewhere on this distribution. The p-value is the probability of landing that far out (or farther) in the tail.
Key insight: If H₀ is true, extreme values are rare. So if you get an extreme result (far in the tail), either:
- You got unlucky and saw a rare event, OR
- H₀ isn’t actually true
The smaller the p-value, the more we lean toward option 2.
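The same "rare if H₀ is true" logic can be seen by simulation: draw many results from a world where H₀ holds and count how often they land at least as far out as the observed statistic. A sketch, assuming a standard normal sampling distribution and a hypothetical observed statistic of 2.15:

```python
import random
from statistics import NormalDist

random.seed(42)
observed = 2.15   # hypothetical test statistic from your study

# Simulate 100,000 results from the null world (standard normal).
sims = [random.gauss(0, 1) for _ in range(100_000)]

# Two-tailed p-value by counting: the fraction of null-world results
# at least as extreme as the one we observed.
p_sim = sum(abs(z) >= abs(observed) for z in sims) / len(sims)

# Exact tail area for comparison (about 0.032).
p_exact = 2 * (1 - NormalDist().cdf(abs(observed)))
print(round(p_sim, 3), round(p_exact, 3))
```

The simulated fraction and the exact tail area agree closely, which is the point: the p-value is nothing more than "how often would the null world produce something this extreme?"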
Interpreting P-Values
P-values are typically interpreted on a scale of evidence against the null hypothesis:
Smaller p-values indicate stronger evidence against H₀
| P-Value Range | Evidence Against H₀ | Typical Interpretation |
|---|---|---|
| p > 0.10 | Little or none | Results consistent with H₀ |
| 0.05 < p ≤ 0.10 | Weak | Marginally significant (borderline) |
| 0.01 < p ≤ 0.05 | Moderate | Statistically significant |
| 0.001 < p ≤ 0.01 | Strong | Highly significant |
| p ≤ 0.001 | Very strong | Very highly significant |
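The scale in the table can be expressed as a small helper function (the cutoffs simply mirror the table; they are conventions, not laws of nature):

```python
def evidence_against_h0(p: float) -> str:
    """Map a p-value to the evidence scale in the table above."""
    if p > 0.10:
        return "little or none"
    if p > 0.05:
        return "weak (marginally significant)"
    if p > 0.01:
        return "moderate (statistically significant)"
    if p > 0.001:
        return "strong (highly significant)"
    return "very strong (very highly significant)"

print(evidence_against_h0(0.03))   # moderate (statistically significant)
```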
⚠️ Don’t Treat 0.05 as Magic
There’s nothing special about p = 0.05. It’s a convention, not a law of nature. A result with p = 0.049 is not fundamentally different from p = 0.051. The cutoff is arbitrary—always report the actual p-value so readers can judge for themselves.
The Decision Rule
In hypothesis testing, we compare the p-value to a predetermined significance level (α), usually 0.05:
The decision depends on whether p falls above or below your chosen α
📝 The Decision Rule
- If p < α: Reject H₀. Results are statistically significant.
- If p ≥ α: Fail to reject H₀. Results are not statistically significant.
Common α values:
- α = 0.05 — Most common; 5% significance level
- α = 0.01 — Stricter; used when you need strong evidence
- α = 0.10 — More lenient; used in exploratory research
Important: You must choose α BEFORE looking at your data. Changing α after seeing the p-value is cheating (called “p-hacking”).
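Once α is fixed in advance, the decision itself is purely mechanical. A minimal sketch:

```python
def decide(p: float, alpha: float = 0.05) -> str:
    """Apply the decision rule (alpha must be chosen BEFORE seeing the data)."""
    if p < alpha:
        return "Reject H0 (statistically significant)"
    return "Fail to reject H0 (not statistically significant)"

print(decide(0.03))         # Reject H0 (statistically significant)
print(decide(0.08))         # Fail to reject H0 (not statistically significant)
print(decide(0.08, 0.10))   # Reject H0 (statistically significant)
```

Note the last line: the same p-value leads to opposite decisions under different α levels, which is exactly why α must be fixed before the analysis, not tuned afterward.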
One-Tailed vs. Two-Tailed P-Values
The p-value depends on whether you’re doing a one-tailed or two-tailed test:
| Two-Tailed Test | One-Tailed Test |
|---|---|
| H₁: μ ≠ value (not equal) | H₁: μ > value OR μ < value |
| Looks for difference in EITHER direction | Looks for difference in ONE specific direction |
| P-value = area in BOTH tails | P-value = area in ONE tail only |
| Use when direction isn’t pre-specified | Use when you predict a specific direction |
| More conservative (harder to reject H₀) | Less conservative (easier to reject H₀) |
Key relationship: One-tailed p-value = (Two-tailed p-value) / 2, provided the test statistic falls in the direction your H₁ predicts. (If it falls in the opposite direction, the one-tailed p-value is 1 − two-tailed/2, not half of it.)
If a calculator gives you a two-tailed p-value of 0.04 and you’re doing a one-tailed test in the correct direction, your p-value is 0.02.
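Assuming a symmetric (normal) sampling distribution and a hypothetical statistic of z = 2.05, the relationship looks like this:

```python
from statistics import NormalDist

z = 2.05                               # hypothetical test statistic
p_one = 1 - NormalDist().cdf(z)        # one-tailed, H1: mu > value (~0.020)
p_two = 2 * p_one                      # two-tailed, H1: mu != value (~0.040)

# Halving only works in the predicted direction: had H1 said mu > value
# but z come out negative, the one-tailed p-value would be 1 - p_two / 2.
print(round(p_one, 4), round(p_two, 4))
```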
⚠️ When to Use One-Tailed Tests
Only use a one-tailed test when you have a strong, pre-specified reason to look in one direction only. If you’re just hoping for a smaller p-value, that’s cheating. When in doubt, use two-tailed—it’s the safer, more common choice.
Statistical vs. Practical Significance
A common trap: assuming a small p-value means an important finding. Statistical significance and practical significance are different things.
A tiny p-value doesn’t mean the effect matters; always consider effect size
Why this happens: With a large enough sample, ANY effect—no matter how tiny—becomes statistically significant. The p-value shrinks as sample size grows, even if the actual effect stays the same.
What to do:
- Always report effect size alongside p-values (Cohen’s d, R², etc.)
- Ask: “Even if this is real, does it matter in practice?”
- Consider the context: a 1-point difference might matter in some situations but not others
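A quick demonstration of why large samples make tiny effects "significant": hold a small standardized effect fixed and grow n. This sketch assumes a one-sample z-test and a hypothetical effect of 0.05 standard deviations:

```python
from statistics import NormalDist

effect = 0.05   # hypothetical standardized effect, fixed throughout
for n in (100, 1_000, 10_000, 100_000):
    z = effect * n ** 0.5    # for a one-sample z-test, z = effect * sqrt(n)
    p = 2 * (1 - NormalDist().cdf(z))
    print(f"n = {n:>7,}   z = {z:5.2f}   p = {p:.4f}")
```

The effect never changes, yet p slides from clearly non-significant to far below 0.001. That is why the p-value alone cannot tell you whether a finding matters.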
Common Student Mistakes
❌ Mistake #1: Saying “accept H₀”
We never “accept” the null hypothesis—we only “fail to reject” it. Absence of evidence isn’t evidence of absence. Not having enough evidence to reject H₀ doesn’t prove H₀ is true; you might just need more data.
❌ Mistake #2: Treating p = 0.05 as a cliff
There’s no meaningful difference between p = 0.049 and p = 0.051. Don’t say one is “significant” and the other is “not significant” as if they’re worlds apart. Report the actual p-value and interpret the strength of evidence on a continuum.
❌ Mistake #3: Confusing p-value with effect size
A small p-value does NOT mean a large effect. With n = 10,000, even a trivial difference becomes “highly significant.” Always look at effect size (like Cohen’s d) to assess how big the effect actually is.
❌ Mistake #4: Interpreting p as P(H₀ is true)
This is the #1 misconception. P = 0.03 does NOT mean “3% chance the null is true.” It means “IF H₀ is true, there’s a 3% chance of getting data this extreme.” The null hypothesis isn’t a random variable—it’s either true or false.
❌ Mistake #5: Choosing one-tailed tests to get significance
If your two-tailed p-value is 0.08 and you switch to one-tailed to get p = 0.04, that’s p-hacking. The direction must be specified BEFORE seeing the data, based on theory—not to manufacture significance.
❌ Mistake #6: Writing conclusions incorrectly
Wrong: “We proved the drug works.” Right: “We found statistically significant evidence that the drug is associated with improvement.” Statistics shows association and evidence, not proof.
Platform-Specific Tips
ALEKS
ALEKS often asks you to identify the correct conclusion based on a p-value and α. Use exactly these phrases:
- p < α: “Reject H₀” or “There is sufficient evidence to support [H₁]”
- p ≥ α: “Fail to reject H₀” or “There is not sufficient evidence to support [H₁]”
ALEKS is strict about wording—never say “accept H₀.”
MyStatLab (Pearson)
StatCrunch and MyStatLab usually provide exact p-values. Watch whether they give one-tailed or two-tailed—the software often defaults to two-tailed. If your alternative hypothesis is one-sided, you may need to divide by 2. MyStatLab frequently asks for interpretation sentences—include context and direction.
WebAssign
WebAssign problems often test conceptual understanding: “What does this p-value mean?” Review the misconceptions section—they love testing whether you know what the p-value ISN’T. Also watch for questions about Type I error vs. p-value.
TI-83/84 Calculator
The calculator gives you a p-value directly for most tests (STAT → TESTS). It typically gives two-tailed p-values for t-tests and z-tests. The p-value appears in the output—just compare to α and make your decision.
Need help with these platforms? Our tutors work with ALEKS statistics, MyStatLab, and WebAssign every day.
Quick Reference Summary
📐 Definition
P-value = Probability of getting results at least as extreme as observed, assuming H₀ is true
✅ Decision Rule
p < α: Reject H₀ (significant)
p ≥ α: Fail to reject H₀ (not significant)
❌ What P-Value Is NOT
- NOT the probability H₀ is true
- NOT the probability H₁ is true
- NOT the probability you made an error
- NOT a measure of effect size or importance
📝 Conclusion Templates
Reject H₀: “At the α = 0.05 significance level, there is sufficient evidence to conclude that [H₁ in context].”
Fail to reject H₀: “At the α = 0.05 significance level, there is not sufficient evidence to conclude that [H₁ in context].”
⚠️ Remember: Never say “accept H₀” • Always report actual p-value • Statistical significance ≠ practical importance
Frequently Asked Questions
What p-value is considered significant?
The most common threshold is α = 0.05 (5%). If p < 0.05, results are typically called “statistically significant.” However, this is just a convention—some fields use stricter thresholds (0.01) or more lenient ones (0.10). Always report the actual p-value and let readers judge.
Why can’t we say “accept the null hypothesis”?
Failing to reject H₀ doesn’t prove H₀ is true—it just means you don’t have enough evidence against it. Maybe the effect is real but your sample was too small to detect it. It’s like a jury verdict of “not guilty”—it doesn’t mean innocent, just not proven guilty beyond reasonable doubt.
Can the p-value ever be exactly 0?
Technically, no—there’s always some probability, however tiny. When software reports p = 0.000 or p < 0.001, it means the value is too small to display. You can say “p < 0.001” but not “p = 0.” In reality, p-values like 10⁻¹⁵ are effectively zero for practical purposes.
What’s the relationship between p-value and confidence interval?
They’re related: if the 95% confidence interval doesn’t include the null hypothesis value, the two-tailed p-value is < 0.05. They provide complementary information—the p-value tells you about significance, while the CI tells you about the range of plausible effect sizes.
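This duality is easy to check numerically. A sketch with hypothetical summary numbers (sample mean 105, standard error 2.5, H₀ value 100), again using the normal distribution:

```python
from statistics import NormalDist

xbar, se, mu0 = 105.0, 2.5, 100.0       # hypothetical summary statistics

# 95% confidence interval for the mean
z_crit = NormalDist().inv_cdf(0.975)    # about 1.96
ci = (xbar - z_crit * se, xbar + z_crit * se)

# Two-tailed p-value for H0: mu = mu0
z = (xbar - mu0) / se
p = 2 * (1 - NormalDist().cdf(abs(z)))

# The two always agree: the 95% CI excludes mu0 exactly when p < 0.05.
excludes_null = not (ci[0] <= mu0 <= ci[1])
print(excludes_null, p < 0.05)   # True True
```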
Why do some statisticians criticize p-values?
P-values are widely misunderstood and misused. Critics argue they encourage binary thinking (significant/not significant), lead to p-hacking, and don’t tell you what you really want to know (probability the hypothesis is true). Many now recommend also reporting effect sizes, confidence intervals, and using Bayesian methods.
What if my p-value is exactly 0.05?
This is a borderline case. Technically, if p = 0.05 exactly and α = 0.05, you’d fail to reject (since we need p < α, not p ≤ α). But in practice, this is splitting hairs. Report “p = 0.05” and acknowledge it’s borderline—the difference between 0.049 and 0.051 isn’t meaningful.
How do sample size and p-value relate?
Larger samples lead to smaller p-values for the same effect size. With enough data, even tiny effects become “statistically significant.” This is why you can’t interpret p-values without considering sample size. A significant result with n = 10,000 might represent a trivially small effect.
Can you help with my p-value homework?
Absolutely. P-value interpretation is one of the most common topics we help with because it’s so frequently misunderstood. Whether you need help calculating p-values, interpreting output, or writing conclusions, our tutors work with ALEKS, MyStatLab, WebAssign, and other platforms daily. Get a free quote to get started.
Related Resources
Statistics Foundations
- Hypothesis Testing Guide
- T-Tests and ANOVA Explained
- Confidence Intervals Explained
- Normal Distribution Guide
Need Help Understanding P-Values?
Our tutors explain p-values in plain English and help you interpret them correctly on homework and exams.