QQ Plot and Shapiro Wilk Test Disagree












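My QQ plot shows that the data is not normally distributed:

import pylab
from statsmodels.graphics.gofplots import qqplot  # assumed import; the question omits it
qqplot(residual_values, fit=True, line='45')  # standardize the data, draw the 45-degree line
pylab.show()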


[Figure: normal QQ plot of the residuals with a 45-degree reference line]



It has a skewness of about 0.55:



residual_values.skew()  # 0.5469389365591185


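But the p-value of the Shapiro-Wilk test is greater than 0.05, which tells me that it is normally distributed:

from scipy.stats import shapiro  # assumed import; the question omits it
shapiro(residual_values)  # (W statistic, p-value) = (0.9569438099861145, 0.2261517345905304)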


What is the correct inference from this: is it normally distributed or not?







Tags: regression, machine-learning






– Shinigami, asked 6 hours ago
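
For anyone who wants to reproduce the three checks end to end, here is a minimal self-contained sketch. The data are a hypothetical stand-in (30 mildly right-skewed values, since neither the original residuals nor their sample size are given), and `scipy.stats.probplot` is used in place of the plotting call so that nothing has to be drawn. Assuming `shapiro` in the question is `scipy.stats.shapiro`, it returns the W statistic first and the p-value second, so 0.957 above is W and 0.226 is the p-value.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical stand-in for residual_values: 30 mildly right-skewed values,
# centered at zero. The real residuals are not available.
rng = np.random.default_rng(0)
raw = rng.gamma(shape=16.0, scale=1.0, size=30)
residual_values = pd.Series(raw - raw.mean())

# Sample skewness (pandas uses a bias-adjusted estimator).
sample_skew = residual_values.skew()

# Shapiro-Wilk: W statistic first, p-value second.
w_stat, p_value = stats.shapiro(residual_values.to_numpy())

# QQ-plot coordinates without drawing: theoretical normal quantiles,
# ordered sample values, and the least-squares line through them.
(theoretical_q, ordered_values), (slope, intercept, r) = stats.probplot(
    residual_values.to_numpy(), dist="norm"
)

print(f"skewness = {sample_skew:.3f}")
print(f"Shapiro-Wilk W = {w_stat:.3f}, p = {p_value:.3f}")
print(f"QQ-plot correlation r = {r:.3f}")
```

Plotting `ordered_values` against `theoretical_q` gives the same kind of picture as the QQ plot in the question, up to the standardization applied by `fit=True`.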




The QQ plot looks consistent with being normally distributed. Did you expect every point to fall exactly on the line?
– The Laconic, 4 hours ago




It is approximately normally distributed if you are prepared to discount slight skewness. No procedure ever indicates more.
– Nick Cox, 3 hours ago












@TheLaconic Sorry, I am new to ML.
– Shinigami, 3 hours ago












@NickCox Thank you. Sorry, my reputation is too low to upvote you guys.
– Shinigami, 3 hours ago












It's approximately normal; the skewness in the sample is quite mild. This doesn't automatically mean the population is also skewed (though I expect it is). A high p-value on a test of normality doesn't mean that it is normal, only that you couldn't detect whatever population non-normality there was. (The answer to "is it normally distributed" is "no": unless you generated it to be normal, it won't actually be normal. But why would it have to be?)
– Glen_b, 1 hour ago












4 Answers
The QQ plot is an informal test of normality that can give you some insight into the nature of deviations from normality: for example, whether the distribution has some skew or fat tails, or whether specific observations deviate from what you would expect from a normal distribution (outliers). The QQ plot can often convince you that the distribution is definitely not normal, but this isn't such a case. Here, the points fall more or less along the line, which is broadly consistent with normality. Intuitively, this is the sort of variation you would expect to see in a small sample.

The Shapiro-Wilk test is a formal test of normality. I'm not familiar with the shapiro function's output, so I'm not sure which number, if either, is supposed to be the p-value, but if you say it's largish, then we are led to accept the null hypothesis of normality. And this is consistent with what we see qualitatively in the QQ plot.







– The Laconic, answered 2 hours ago
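
To put a rough number on "the sort of variation you would expect to see in a small sample", one option is to simulate truly normal samples and see how often their skewness is at least as large as the 0.547 reported in the question. This is only a sketch: the sample size `n = 30` is an assumption (the question does not state it), and the names are illustrative.

```python
import numpy as np
from scipy.stats import skew

n = 30                              # assumed sample size; not given in the question
observed_skew = 0.5469389365591185  # value reported in the question
n_sim = 20_000

rng = np.random.default_rng(42)
normal_samples = rng.standard_normal(size=(n_sim, n))

# bias=False selects the bias-adjusted estimator, the same one pandas uses.
simulated_skew = skew(normal_samples, axis=1, bias=False)

# Share of truly normal samples whose skewness is at least as extreme.
fraction_as_extreme = np.mean(np.abs(simulated_skew) >= abs(observed_skew))
print(f"fraction with |skewness| >= {observed_skew:.2f}: {fraction_as_extreme:.3f}")
```

If a noticeable share of genuinely normal samples of this size show that much skewness, the observed value by itself says little against normality.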

























The Q-Q plot is consistent with (not "proving") approximate normality, more or less.

The Shapiro-Wilk test is a formal test of normality and, as such, cannot confirm the null hypothesis of normality. The data may be reasonably consistent with normality yet still come from a nonnormal underlying distribution. Frequentist hypothesis tests, as a general rule, cannot prove a hypothesis, and failure to reject (p > alpha) does not support the null hypothesis.

@The Laconic gave some decent advice for interpreting the Q-Q plot. However, large p-values do not lead you to accept the null hypothesis. Therefore, you don't conclude normality based on this test; the best you can say is that there is insufficient evidence of nonnormality at the a priori chosen alpha level.







– LSC, answered 2 hours ago
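
The point that failing to reject does not establish normality can be illustrated with a small power simulation. The sketch below draws samples from a distribution that is known to be skewed and counts how often Shapiro-Wilk rejects at alpha = 0.05. The sample size `n = 30`, the gamma distribution, and the helper `rejection_rate` are all assumptions chosen for illustration; none of them come from the question.

```python
import numpy as np
from scipy.stats import shapiro


def rejection_rate(draw_sample, n_sim=2000, alpha=0.05):
    """Fraction of simulated samples for which Shapiro-Wilk rejects normality."""
    rejections = 0
    for _ in range(n_sim):
        _, p_value = shapiro(draw_sample())
        rejections += int(p_value < alpha)
    return rejections / n_sim


n = 30  # assumed sample size; not given in the question
rng = np.random.default_rng(1)

# A gamma distribution with shape 16 has population skewness 2 / sqrt(16) = 0.5,
# so these samples come from a population that is not normal.
power_skewed = rejection_rate(lambda: rng.gamma(shape=16.0, size=n))

# For truly normal data the rejection rate should sit near alpha.
type_i_rate = rejection_rate(lambda: rng.standard_normal(n))

print(f"rejection rate, skewed population: {power_skewed:.3f}")
print(f"rejection rate, normal population: {type_i_rate:.3f}")
```

When the first rate is well below 1, most samples from the skewed population produce p > 0.05, which is exactly the situation where a large p-value cannot be read as confirmation of normality.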























A Shapiro-Wilk p-value > 0.05 indicates a lack of evidence against normality. That is consistent with the QQ plot you showed, which is not too far off the line. I don't see what the inconsistency is here. Also, you should give a CI for the skewness coefficient.







– beta1_equals_beta2, answered 2 hours ago
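
One way to get a CI for the skewness coefficient is a percentile bootstrap. The answer does not say which method it has in mind, so this is just one possible choice, sketched on hypothetical stand-in data; `bootstrap_skew_ci` is an illustrative helper, not a library function. Bootstrap intervals for skewness can be rough at small sample sizes, so the result is best read as an indication of how uncertain the estimate is.

```python
import numpy as np
from scipy.stats import skew


def bootstrap_skew_ci(values, n_boot=5000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the sample skewness."""
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    # One row of resampled indices (with replacement) per bootstrap replicate.
    idx = rng.integers(0, len(values), size=(n_boot, len(values)))
    boot_skew = skew(values[idx], axis=1, bias=False)
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(boot_skew, [tail, 1.0 - tail])
    return lower, upper


# Hypothetical stand-in data: 30 mildly right-skewed values.
data = np.random.default_rng(7).gamma(shape=16.0, size=30)
lower, upper = bootstrap_skew_ci(data)
print(f"sample skewness = {skew(data, bias=False):.3f}")
print(f"95% percentile bootstrap CI: ({lower:.3f}, {upper:.3f})")
```

With the real data, `bootstrap_skew_ci(residual_values)` would be the corresponding call. An interval that comfortably includes 0 would fit the reading that the sample skewness is mild.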























My understanding is that, given power issues with normality tests, they are not highly recommended. As a result, I no longer use them, preferring QQ plots (which are recommended in the literature I have seen).







– user54285, answered 1 hour ago





















