Hypothesis testing for non-independent data

Researchers collected the following data concerning the comparability of diagnoses of
schizophrenia obtained from primary-care physician reports as compared with proxy
reports (from spouses). Data were collected from 953 people. The researchers found
that schizophrenia was identified as present on 115 physician reports and 124 proxy
reports. Both physician and proxy informants identified 34 people as positive for
schizophrenia.

How do I compare the percentage of subjects identified as schizophrenic by the two reports? And how do I test whether there is significant agreement between the two reports? Can someone please help me with how to approach this problem?

  • Well, you could assume that the physician reports were normative, analyze the distribution that would yield, and see whether the proxy results were probable.
    – lulu
    Nov 20 at 0:44
  • @lulu, what do you mean by normative?
    – Lady
    Nov 20 at 0:45
  • Actually, I don't understand the question. What does the $953$ have to do with anything? And what is the difference between "schizophrenia identified as present" and "positive for schizophrenia"?
    – lulu
    Nov 20 at 0:46
  • I was assuming (probably incorrectly) that you had two samples: sample $A$ had $115$ members and $34$ positives, and sample $B$ had $124$ members and also had $34$ positives. But in that case I am ignoring the $953$ entirely.
    – lulu
    Nov 20 at 0:48
  • @lulu, 953 is the grand total number of people sampled. The intersection between the physician report and the proxy report is 34. That is, 34 people were identified as positive by both the physician report and the proxy report.
    – Lady
    Nov 20 at 0:50

statistics statistical-inference

asked Nov 20 at 0:33
Lady

1 Answer

Clue, to get you started: My interpretation of your question is that
you need to finish filling in the frequencies in the following $2 \times 2$ table. (This can be done by simple arithmetic from the counts provided.)
Then use the completed table to do a chi-squared test of independence.

                   Physician
               Yes     No    Total
-----------------------------------
Spouse  Yes     34             124
        No
-----------------------------------
Total          115             953

Addendum: Results from Minitab statistical software are shown below. Do you know how to find
the expected counts, the contributions, and the Pearson chi-squared statistic?
The null hypothesis that physicians and spouses make judgments
about schizophrenia independently is rejected. That is,
physicians and spouses must be responding to some of the same
symptoms. Briefly put, under independence we would 'expect' only
about 15 agreements on 'Yes', but we observe 34 (more than twice as
many as expected).



Chi-Square Test for Association: Spouse, Worksheet columns

Rows: Spouse   Columns: Worksheet columns

         PhyYes    PhyNo    All

Yes          34       90    124
           15.0    109.0
         24.219    3.324

No           81      748    829
          100.0    729.0
          3.623    0.497

All         115      838    953

Cell Contents:  Count
                Expected count
                Contribution to Chi-square

Pearson Chi-Square = 31.662, DF = 1, P-Value = 0.000
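
For readers who want to reproduce the expected counts, contributions, and Pearson statistic by hand, here is a minimal sketch using only the Python standard library. The function names are hypothetical, and the p-value uses the fact that for 1 degree of freedom the chi-squared upper tail equals `erfc(sqrt(x / 2))`; it is not how Minitab computes its output, just the same textbook formulas.

```python
import math

# Completed 2x2 table: rows = spouse (Yes, No), columns = physician (Yes, No).
observed = [[34, 90], [81, 748]]


def pearson_chi_square(obs):
    """Return (expected counts, per-cell contributions, chi-squared statistic)."""
    row_totals = [sum(row) for row in obs]
    col_totals = [sum(col) for col in zip(*obs)]
    n = sum(row_totals)
    # Under independence: expected = row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    # Each cell contributes (observed - expected)^2 / expected.
    contributions = [
        [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(obs, expected)
    ]
    chi2 = sum(sum(row) for row in contributions)
    return expected, contributions, chi2


def chi2_upper_tail_df1(x):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


expected, contributions, chi2 = pearson_chi_square(observed)
p_value = chi2_upper_tail_df1(chi2)

print([[round(e, 1) for e in row] for row in expected])
print([[round(c, 3) for c in row] for row in contributions])
print(round(chi2, 3), p_value)
```

The printed p-value is a very small positive number; Minitab's `0.000` is that value rounded to three decimals.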


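Note: Formally, a standard test comparing the physicians' estimate $115/953$ with
the spouses' estimate $124/953$ is shown below. It shows no significant
difference. However, this is supposed to be a comparison of two independent
proportions, which we do not seem to have here, because the same 953 people
appear in both reports. (The output below was run with $N = 935$, apparently a
transposition of 953; with $N = 953$ the sample proportions are about 0.130 and
0.121, and the conclusion is unchanged.)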



Test and CI for Two Proportions

Sample     X     N   Sample p
1        124   935   0.132620
2        115   935   0.122995

Difference = p (1) - p (2)
Estimate for difference:  0.00962567
95% CI for difference:  (-0.0206363, 0.0398876)
Test for difference = 0 (vs ≠ 0):  Z = 0.62  P-Value = 0.533
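
Because the two proportions come from the same subjects, one standard alternative that the answer does not mention is McNemar's test, which uses only the discordant pairs (spouse Yes and physician No, and the reverse). The sketch below is an assumption-laden illustration, not part of the original answer: it recomputes the unpaired z statistic with $N = 953$ using a pooled standard error (Minitab's exact setting is not stated), then computes the uncorrected McNemar statistic. All names are hypothetical.

```python
import math

n = 953
both_yes = 34
spouse_only = 90      # spouse Yes, physician No
physician_only = 81   # spouse No, physician Yes


def two_sided_p_from_z(z):
    """Two-sided normal p-value for a z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Unpaired two-proportion z test (treats the samples as independent,
# which they are not here). Pooled standard error is an assumption.
x_spouse = both_yes + spouse_only        # 124
x_physician = both_yes + physician_only  # 115
p_spouse = x_spouse / n
p_physician = x_physician / n
pooled = (x_spouse + x_physician) / (2 * n)
se = math.sqrt(pooled * (1 - pooled) * (2 / n))
z_unpaired = (p_spouse - p_physician) / se
p_unpaired = two_sided_p_from_z(z_unpaired)

# McNemar's test (no continuity correction): only discordant pairs matter.
mcnemar_chi2 = (spouse_only - physician_only) ** 2 / (spouse_only + physician_only)
# Chi-squared upper tail with 1 degree of freedom.
mcnemar_p = math.erfc(math.sqrt(mcnemar_chi2 / 2.0))

print(round(p_spouse, 4), round(p_physician, 4))
print(round(z_unpaired, 2), round(p_unpaired, 3))
print(round(mcnemar_chi2, 3), round(mcnemar_p, 3))
```

Under these assumptions both approaches point the same way: the two reports flag similar percentages of subjects, even though the chi-squared test above shows the two sets of judgments are associated.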





edited Nov 20 at 22:43
answered Nov 20 at 20:10
BruceET