Relative entropy is non-negative














Let $p=(p_1,\dotsc,p_r)$, $q=(q_1,\dotsc,q_r)$ be two different probability distributions. Define the relative entropy $$h(p||q) = \sum_{i=1}^r p_i (\ln p_i - \ln q_i).$$ Show $h(p||q)\geq 0$. I'm given the hint that I should show $-x\ln x$ is concave and then show that for any concave function, $f(y)-f(x)\leq (y-x)f'(x)$ holds. I rewrote the relative entropy as $$h(p||q)=\sum_{i=1}^r p_i \ln \left(\frac{p_i}{q_i}\right)= -\sum_{i=1}^r p_i \ln \left(\frac{q_i}{p_i}\right),$$ which sort of looks like $-x\ln x$, and I did show that $-x\ln x$ is concave, but I don't really understand what I'm supposed to do, or even if this hint is helpful.
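As a quick numerical sanity check of the claim (not part of the question), here is a minimal Python sketch; the distributions p and q below are arbitrary examples chosen for illustration.

```python
import math

def rel_entropy(p, q):
    # h(p||q) = sum_i p_i (ln p_i - ln q_i), assuming strictly positive entries
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q))

p = [0.5, 0.3, 0.2]    # arbitrary example distribution
q = [0.25, 0.25, 0.5]  # another arbitrary example distribution

print(rel_entropy(p, q))  # positive (about 0.218 for these particular p, q)
print(rel_entropy(p, p))  # 0.0 when the two distributions coincide
```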

















probability
















asked Oct 4 '11 at 20:15









anon





  • This is called Gibbs' inequality.
    – Srivatsan
    Oct 4 '11 at 20:24














2 Answers

Assume that the random variable $X$ is such that $X=\dfrac{p_i}{q_i}$ with probability $q_i$, for every $i$. Then,
$$
h(p\mid\mid q)=\sum\limits_i q_i\frac{p_i}{q_i}\ln\left(\frac{p_i}{q_i}\right)=\mathrm E(X\ln X).
$$
Since the function $x\mapsto x\ln x$ is convex, $\mathrm E(X\ln X)\geqslant \mathrm E(X)\ln\mathrm E(X)$ by Jensen's inequality.

To complete the proof one must simply compute $\mathrm E(X)$, and I will let you do that.
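As a sketch of this argument (not from the answer; the distributions below are arbitrary examples), one can check numerically that $\mathrm E(X)=1$, that $\mathrm E(X\ln X)$ equals $h(p||q)$, and that Jensen's bound holds:

```python
import math

p = [0.5, 0.3, 0.2]    # arbitrary example distributions (strictly positive)
q = [0.25, 0.25, 0.5]

# X takes the value p_i/q_i with probability q_i
E_X    = sum(qi * (pi / qi) for pi, qi in zip(p, q))                      # = sum_i p_i ≈ 1
E_XlnX = sum(qi * (pi / qi) * math.log(pi / qi) for pi, qi in zip(p, q))  # = h(p||q)

print(E_X)                            # ≈ 1.0
print(E_XlnX)                         # ≈ 0.218, the relative entropy of these p, q
print(E_XlnX >= E_X * math.log(E_X))  # True: Jensen, and E(X) ln E(X) = 1 · ln 1 = 0
```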






answered Oct 4 '11 at 20:27

Did









  • @anon There's another, but similar, way. Show that $x \mapsto \ln x$ is concave, so that $\mathbf E[\ln Y] \leq \ln (\mathbf E[Y])$. Now, let $Y$ be a random variable that takes the value $(q_i/p_i)$ with probability $p_i$. Apply Jensen to $Y$.
    – Srivatsan
    Oct 4 '11 at 20:35








  • @Srivatsan, thanks, this might be considered as a (slightly...) simpler approach.
    – Did
    Oct 4 '11 at 20:43






















I think the proof in the textbook is an awesome one:

$$D(p||q) = \sum p(x) \log \frac{p(x)}{q(x)}
= -\sum p(x) \log \frac{q(x)}{p(x)} \ge -\log \sum p(x) \frac{q(x)}{p(x)} = -\log \sum q(x) = -\log 1 = 0,$$

where the inequality is Jensen's inequality applied to the concave $\log$. Then $D(p||q) \ge 0$.
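A sketch checking the inequality step numerically (not from the answer; the example distributions and the use of the natural logarithm are assumptions for illustration):

```python
import math

p = [0.5, 0.3, 0.2]    # arbitrary example distributions (strictly positive)
q = [0.25, 0.25, 0.5]

# Jensen step for the concave log (natural log used here):
lhs = sum(pi * math.log(qi / pi) for pi, qi in zip(p, q))    # sum_x p(x) log(q(x)/p(x))
rhs = math.log(sum(pi * (qi / pi) for pi, qi in zip(p, q)))  # log sum_x q(x) = log 1 = 0

print(lhs <= rhs)   # True, hence D(p||q) = -lhs >= 0
print(-lhs)         # ≈ 0.218 for these p, q
```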






answered Dec 16 '18 at 6:53 (edited Dec 16 '18 at 7:32)

Lerner Zhang












