P-value

In statistical hypothesis testing, the p-value of a random variable T used as a test statistic is the probability that T will assume a value at least as extreme as the observed value t_observed, given that the null hypothesis under consideration is true. "More extreme" means less favorable to the null hypothesis: depending on the test, that can mean greater than, less than, or further away from a specified center.

For example, assume that a simple null hypothesis is rejected if the test statistic T exceeds a critical value c, and that in a particular case T was observed to equal t_observed. Then the p-value in that case is the probability, computed under the null hypothesis, that T would equal or exceed t_observed.

The p-value does not depend on unobservable parameters, only on the data; it is observable, and is therefore itself a "statistic."

In classical frequentist inference, one rejects the null hypothesis if the p-value is smaller than a number called the level of the test. In effect, the p-value itself is then being used as the test statistic. If the level is 0.05, then the probability that the p-value is less than 0.05, given that the null hypothesis is true, is 0.05, provided the test statistic has a continuous distribution. In that case, the p-value is uniformly distributed on the interval from 0 to 1 if the null hypothesis is true.
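The definition can be made concrete with a minimal Python sketch. It assumes, purely for illustration, that the test statistic T follows a standard normal distribution under the null hypothesis; the function names (normal_sf, p_value, rejection_rate_under_null) and the tail labels are hypothetical and not part of any library. The last function simulates many experiments in which the null hypothesis is true and counts how often the p-value falls below the level.

```python
import math
import random


def normal_sf(x):
    """Upper-tail probability P(Z >= x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def p_value(t_observed, tail="greater"):
    """p-value of an observed statistic, ASSUMING T ~ N(0, 1) under the null."""
    if tail == "greater":      # large T is unfavorable to the null
        return normal_sf(t_observed)
    if tail == "less":         # small T is unfavorable to the null
        return normal_sf(-t_observed)
    if tail == "two-sided":    # distance from the center 0 is unfavorable
        return min(1.0, 2.0 * normal_sf(abs(t_observed)))
    raise ValueError(f"unknown tail: {tail!r}")


def rejection_rate_under_null(level=0.05, trials=100_000, seed=0):
    """Fraction of simulated null experiments with p-value below the level."""
    rng = random.Random(seed)
    rejections = sum(
        p_value(rng.gauss(0.0, 1.0)) < level for _ in range(trials)
    )
    return rejections / trials


print(p_value(1.96, "two-sided"))     # close to 0.05
print(rejection_rate_under_null())    # close to the level, 0.05
```

Because the simulated statistic has a continuous distribution, the rejection rate should come out near the chosen level for any level between 0 and 1, which is what a uniformly distributed p-value implies.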

Frequent misunderstandings

There are several common misunderstandings about p-values. All of the following statements are FALSE:

a) The p-value is the probability that the null hypothesis is true, which would justify the "rule" of treating p-values closer to 0 (zero) as more significant. Comment: In fact, frequentist statistics does not, and cannot, attach probabilities to hypotheses. Comparison of Bayesian and classical approaches shows that p can be very close to zero while the posterior probability of the null hypothesis is very close to unity. This is the Jeffreys-Lindley paradox.

b) The p-value is the probability of falsely rejecting the null hypothesis. This error is called the prosecutor's fallacy. Comment: Suppose one selects the 5% significance level. The Type I error rate of 5% is an average over all outcomes with a p-value between 0 and 0.05; it is fixed in advance and says nothing about the particular p-value obtained. If the computed p-value is, say, 0.049999, the calibration of Sellke, Bayarri and Berger (2001) puts the probability that the rejection is in error at about 29% or more, given equal prior weight on the null and alternative hypotheses. On the other hand, if the p-value is very close to zero, that probability is much lower than 5%.

c) The p-value is the probability that a replicating experiment would not yield the same conclusion.
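The figure of roughly 29% in the comment on statement b) can be reproduced with the calibration from Sellke, Bayarri and Berger (2001), which bounds the conditional error probability from below by 1 / (1 + 1 / (-e p ln p)) for p-values below 1/e. The sketch below is an illustration under that paper's assumption of equal prior weight on the null and alternative hypotheses; the function name conditional_error_lower_bound is hypothetical.

```python
import math


def conditional_error_lower_bound(p):
    """Lower bound on the probability that rejecting the null is an error.

    Uses the calibration alpha(p) = 1 / (1 + 1 / (-e * p * ln p)),
    which is defined for 0 < p < 1/e and ASSUMES equal prior weight
    on the null and alternative hypotheses.
    """
    if not 0.0 < p < 1.0 / math.e:
        raise ValueError("calibration is defined only for 0 < p < 1/e")
    bayes_factor_bound = -math.e * p * math.log(p)
    return 1.0 / (1.0 + 1.0 / bayes_factor_bound)


for p in (0.049999, 0.01, 0.001):
    print(p, round(conditional_error_lower_bound(p), 3))
```

For p = 0.049999 the bound comes out at about 0.289, while for p = 0.001 it drops below 0.02, matching the contrast drawn in the comment.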

Reference

Sellke, T., Bayarri, M.J. and Berger, J. (2001). "Calibration of P-values for Testing Precise Null Hypotheses". The American Statistician 55, 62-71.

 
