Removing variable with big p-value?











I ran a regression with two explanatory variables. The regression summary shows that one of my variables has a large p-value (0.705). Should I include that variable when writing the y-hat equation?










      statistics linear-regression p-value






      asked Nov 14 at 23:21









      Camue

      31




          2 Answers
            accepted






This depends on what you expect from the model. In your case you have only two features, so if you remove one of them there is a high risk of losing important information.



Instead of removing the insignificant feature, first try to improve it by detecting anomalies or dropping outliers. A common approach is to plot the covariance matrix to see how strongly the features are related, and to inspect boxplots and adjust the outlier threshold to obtain more reliable data.
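As a rough sketch of the boxplot-style check described above, the snippet below flags points outside the conventional 1.5 × IQR whiskers and computes the Pearson correlation between two features. The data and the helper names `iqr_outliers` and `pearson` are made up for illustration; the multiplier `k` is the threshold you would adjust.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical feature values; 40.0 is a deliberately planted outlier.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 40.0]
x2 = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.0]

print("outliers in x1:", iqr_outliers(x1))
print("corr(x1, x2):", pearson(x1, x2))
```

Whether a flagged point should actually be dropped is a judgment call that depends on how the data were collected, not something the threshold can decide on its own.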



If you have enough data, you can split it into training, validation, and test sets. Then you can improve your model coefficients by using some voting methods on the validation set.
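A minimal sketch of such a split, assuming a 60/20/20 partition and a fixed random seed (both choices are illustrative, as is the function name `train_val_test_split`):

```python
import random

def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle a copy of rows and cut it into train/validation/test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

# Hypothetical data set: 50 row indices standing in for observations.
rows = list(range(50))
train, val, test = train_val_test_split(rows)
print(len(train), len(val), len(test))
```

The model is fitted on `train`, competing variants are compared on `val`, and `test` is touched only once at the end.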



Finally, you can compute summary statistics such as R-squared and the p-values, and use comparisons such as an ANOVA test or AIC scores, to compare the two cases (with and without the variable).
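To make the comparison concrete, here is a small self-contained sketch that fits both models by ordinary least squares and reports R-squared and AIC for each. The data are invented, the helper names (`solve`, `fit_ols`, `r_squared`, `aic`) are hypothetical, and the AIC is the Gaussian form up to an additive constant, which is enough for comparing two models fitted to the same data.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(columns, y):
    """Least-squares fit with an intercept; returns (coefficients, RSS)."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(X, y))
    return beta, rss

def r_squared(rss, y):
    mean = sum(y) / len(y)
    tss = sum((yi - mean) ** 2 for yi in y)
    return 1.0 - rss / tss

def aic(rss, n, n_coef):
    """Gaussian AIC up to a constant: n*ln(RSS/n) + 2*(number of coefficients)."""
    return n * math.log(rss / n) + 2 * n_coef

# Invented data: y depends on x1; x2 is unrelated noise.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [5.0, 3.0, 8.0, 1.0, 9.0, 2.0, 7.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

beta_full, rss_full = fit_ols([x1, x2], y)
beta_red, rss_red = fit_ols([x1], y)
n = len(y)
print("full:    R2 =", r_squared(rss_full, y), "AIC =", aic(rss_full, n, 3))
print("reduced: R2 =", r_squared(rss_red, y), "AIC =", aic(rss_red, n, 2))
```

R-squared can never decrease when a variable is added, so on its own it always favors the larger model; AIC adds a penalty per coefficient, and the model with the lower AIC is preferred.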







            answered Nov 15 at 6:53









            AnNg

            355




            • Thanks. Very helpful. Could you list some of the voting methods?
              – Camue
              Nov 17 at 10:24




















This depends on the goal of your analysis. Have you hypothesized that both of your explanatory variables affect the dependent variable? In that case you shouldn't remove the variable, since you'd be modifying your regression a posteriori (that is, after you've collected your data).



                Are you trying to make a descriptive statement about what you're analyzing? For example, are you trying to understand whether education and sex predict income? Similarly, you shouldn't drop a variable since you'll no longer be able to conclude that one of the two variables has no effect.



                Finally, are you trying to make a prediction? In this case, it's appropriate to try both models and compare their performance. You can do this using an F-test/ANOVA.
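For reference, the F-test for comparing the two nested models can be written as below. The notation is introduced here and is not from the original post: RSS is the residual sum of squares, n the number of observations, p the number of coefficients in the full model (3 here, counting the intercept), and q the number of coefficients dropped (1 here).

```latex
F = \frac{\left(\mathrm{RSS}_{\text{reduced}} - \mathrm{RSS}_{\text{full}}\right) / q}
         {\mathrm{RSS}_{\text{full}} / (n - p)}
\;\sim\; F_{q,\; n - p}
\quad \text{under } H_0 \text{: the dropped coefficients are zero.}
```

When only one variable is dropped (q = 1), this F statistic equals the square of that coefficient's t statistic, so the test gives the same p-value as the one reported in the regression summary.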







                answered Nov 14 at 23:33









                fny

                864612



