{"id":2092,"date":"2013-10-08T23:24:20","date_gmt":"2013-10-09T06:24:20","guid":{"rendered":"http:\/\/qpr.ca\/blogs\/?p=2092"},"modified":"2013-10-22T20:29:02","modified_gmt":"2013-10-23T03:29:02","slug":"whats-wrong-with-p-values","status":"publish","type":"post","link":"https:\/\/qpr.ca\/blogs\/2013\/10\/08\/whats-wrong-with-p-values\/","title":{"rendered":"What&#8217;s Wrong With P-Values?"},"content":{"rendered":"<p>One of my favourite betes noires claims to have put\u00a0<a href=\"http:\/\/wmbriggs.com\/blog\/?p=9338\">everything wrong with P-Values under one roof<\/a>.<\/p>\n<p>My response started with \u00a0&#8220;<span style=\"background-color: #fffcfc; color: #1f1f1f; font-family: Arial, Helvetica, sans-serif;\">There\u2019s nothing wrong with p-values any more than with Popeye. They is what they is and that\u2019s that. To blame them for their own abuse is just a pale version of blaming any other victim.&#8221;<\/span><\/p>\n<p>Briggs replied saying &#8220;<span style=\"background-color: #fffcfc; color: #1f1f1f; font-family: Arial, Helvetica, sans-serif;\">This odd because there are several proofs showing there just are many things wrong with them. 
Particularly that their use is always fallacious.<\/span>&#8221; which is odd itself as it seems to be just a reworking of exactly what I said, namely that what is &#8220;wrong&#8221; with them is just the (allegedly) fallacious uses that are made of them.<\/p>\n<p>My comment continued with the following example:<\/p>\n<p style=\"background-color: #fffcfc; border: 0px; margin-bottom: 1.5em; outline: 0px; vertical-align: baseline; word-wrap: break-word; color: #1f1f1f; font-family: Arial, Helvetica, sans-serif; margin-top: 0px !important; margin-right: 0px !important; margin-left: 0px !important; padding: 0px !important;\">&#8220;But if you are the kind of pervert who really enjoys abuse here goes:<br \/>\nLet H0 be the claim that z=N(0,1) and let r=1\/z.<br \/>\nThen P(|r|&gt;20)=P(|z|&lt;.05)=approx.04&lt;.05<br \/>\nSo if z is within .05 of 0 then the p-value for r is less than .05 and so at the 95% confidence level we must reject the hypothesis that mean(z)=0.<span style=\"color: #000000; font-family: Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;\">&#8220;<\/span><\/p>\n<p>Now the joke here is really based on Briggs&#8217;s misstatement of what a p-value is. Not that there would be anything wrong with the thing he defined, but it just wouldn&#8217;t be properly called a p-value. And in order to criticize something (or even just the use of that thing) you need to know what it actually is. 
So for the enlightenment of Mr Briggs, let me explore what a p-value actually is.<\/p>\n<p>What Briggs defined as a p-value is as follows: &#8220;<span style=\"color: #1f1f1f; font-family: Arial, Helvetica, sans-serif; font-size: 15px; line-height: 21px; background-color: #fffcfc;\">Given the model used and the test statistic dependent on that model and given the data seen and assuming the null hypothesis (tied to a parameter) is\u00a0<\/span><em style=\"background-color: #fffcfc; border: 0px; margin: 0px; outline: 0px; padding: 0px; vertical-align: baseline; word-wrap: break-word; color: #1f1f1f; font-family: Arial, Helvetica, sans-serif; font-size: 15px; line-height: 21px;\">true<\/em><span style=\"color: #1f1f1f; font-family: Arial, Helvetica, sans-serif; font-size: 15px; line-height: 21px; background-color: #fffcfc;\">, the p-value is the probability of seeing a test statistic larger (in absolute value) than the one actually seen\u00a0<\/span><em style=\"background-color: #fffcfc; border: 0px; margin: 0px; outline: 0px; padding: 0px; vertical-align: baseline; word-wrap: break-word; color: #1f1f1f; font-family: Arial, Helvetica, sans-serif; font-size: 15px; line-height: 21px;\">if<\/em><span style=\"color: #1f1f1f; font-family: Arial, Helvetica, sans-serif; font-size: 15px; line-height: 21px; background-color: #fffcfc;\">\u00a0the experiment which generated the data were run an indefinite number of future times and where the milieu of the experiment is precisely the same except where it is \u201crandomly\u201d different.<\/span>&#8221; This has a number of oddities (excessive and redundant uses of the word &#8220;given&#8221; and the inclusion of an inappropriate repetition condition being among them) but the most significant thing wrong with it is that it only applies to certain kinds of test statistic &#8211; as demonstrated by my silly example above.<\/p>\n<p>A better definition might be: Given a stochastic model (which we call the null hypothesis) and a test 
statistic defined in terms of that model, the p-value of an observed value of that statistic is the probability in the model of having a value of the statistic which is further from the predicted mean than the observed value.<\/p>\n<p>With this definition, it becomes clear that if the null hypothesis is true (i.e. if the model does accurately predict probabilities) then the occurrence of a low p-value implies the occurrence of an improbable event, and so the logical disjunction that Briggs quotes from R A Fisher, namely &#8220;<span style=\"background-color: #fffcfc; color: #1f1f1f; font-family: Arial, Helvetica, sans-serif; font-size: 15px; line-height: 21px;\">Either the null hypothesis is false, or the p-value has attained by chance an exceptionally low value<\/span>&#8221; is indeed correct.<\/p>\n<p>Briggs&#8217;s claim that this is &#8220;<span style=\"background-color: #fffcfc; color: #1f1f1f; font-family: Arial, Helvetica, sans-serif; font-size: 15px; line-height: 21px;\">not a logical disjunction<\/span>&#8221; is of course nonsense (any statement of the form &#8220;Either A or B&#8221; is a logical disjunction), and this one has the added virtue of being true. Of course, if the observed statistic has a low p-value then the disjunction is essentially tautological, but then really so is anything else that we can be convinced of by logic.<\/p>\n<p>But Briggs is right to wonder if it has any significance &#8211; or at least, if it does then what is the reason for that.<\/p>\n<p>Why do <del>we<\/del>\u00a0some people consider the occurrence of a low p-value to be significant (in the common language sense rather than just by definition)? 
In other words, why and how should it reduce our faith in the null hypothesis?<\/p>\n<p>The first thing to note is that the disjunction \u00a0&#8220;<span style=\"background-color: #fffcfc; color: #1f1f1f; font-family: Arial, Helvetica, sans-serif; font-size: 15px; line-height: 21px;\">Either the null hypothesis is false, or something very improbable has happened<\/span>&#8221; should NOT actually do anything to reduce our faith in the null hypothesis. It certainly matters what <em>kind<\/em> of improbable thing we have seen happen. \u00a0For example a meteor strike destroying New York should not cause us to doubt the hypothesis that sex and gender are not correlated &#8211; so clearly the improbable observed thing must be something that is predicted to be improbable <em>by the null hypothesis model<\/em>. \u00a0But in fact, in any model with continuously distributed variables the occurrence of ANY particular exact observed value is an event of zero probability. One might hope to talk in such cases of the probability density instead, but the probability density can be changed just by re-scaling the variable, so that won&#8217;t do either.<\/p>\n<p>What is it about the special case of a low p-value, ie an improbably large deviation from the expected value of a variable, that reduces our faith in the null hypothesis?<\/p>\n<p>&#8230;to be <a href=\"https:\/\/qpr.ca\/blogs\/2013\/10\/09\/so-what-is-significant-about-p-values\/\">continued<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>One of my favourite betes noires claims to have put\u00a0everything wrong with P-Values under one roof. My response started with \u00a0&#8220;There\u2019s nothing wrong with p-values any more than with Popeye. They is what they is and that\u2019s that. 
To blame &hellip; <a href=\"https:\/\/qpr.ca\/blogs\/2013\/10\/08\/whats-wrong-with-p-values\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-2092","post","type-post","status-publish","format-standard","hentry","category-general"],"_links":{"self":[{"href":"https:\/\/qpr.ca\/blogs\/wp-json\/wp\/v2\/posts\/2092","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/qpr.ca\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/qpr.ca\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/qpr.ca\/blogs\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/qpr.ca\/blogs\/wp-json\/wp\/v2\/comments?post=2092"}],"version-history":[{"count":3,"href":"https:\/\/qpr.ca\/blogs\/wp-json\/wp\/v2\/posts\/2092\/revisions"}],"predecessor-version":[{"id":2112,"href":"https:\/\/qpr.ca\/blogs\/wp-json\/wp\/v2\/posts\/2092\/revisions\/2112"}],"wp:attachment":[{"href":"https:\/\/qpr.ca\/blogs\/wp-json\/wp\/v2\/media?parent=2092"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/qpr.ca\/blogs\/wp-json\/wp\/v2\/categories?post=2092"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/qpr.ca\/blogs\/wp-json\/wp\/v2\/tags?post=2092"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}