How to quantify the confidence on a Bayesian posterior probability?












Consider a physical system that depends on a parameter $0\leq \phi <\infty$. I want to (i) find the probability that this parameter is smaller than a critical value, $\phi\leq \phi_c$, and (ii) quantify my confidence in the probability that I produce in (i) (see below). To this aim, I use a Bayesian approach. A priori, the probability distribution of this parameter is $p(\phi)$. I measure some observable, say $X$. By repeating this measurement $m$ times, I collect a dataset of outcomes of $X$, which I denote by the vector ${\bf x} = \{x_1,x_2,\dots,x_m\}$.
From Bayes' rule, I can update the probability distribution $p(\phi) \to p(\phi|{\bf x})$ as follows:
$$ p(\phi|{\bf x}) = \frac{p({\bf x}|\phi)\,p(\phi)}{\int d\phi \, p({\bf x}|\phi)\,p(\phi)}.$$
Thus, the answer to (i) is simply $p(\phi\leq\phi_c) = \int_0^{\phi_c} d\phi\, p(\phi|{\bf x})$.
It remains to answer (ii), i.e., to quantify how reliable our answer to (i) is. More precisely, I want to quantify the error made in estimating $p(\phi\leq\phi_c)$. This error should depend on the number of measurements $m$ and vanish as $m\to \infty$ (most likely it scales as $1/\sqrt{m}$).
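In general, (i) can be evaluated numerically on a grid once $p(\phi)$ and $p({\bf x}|\phi)$ are specified. A minimal Python sketch; the Exp(1) prior, the Poisson likelihood, and the data here are purely illustrative assumptions, not part of the question:

```python
import numpy as np

# Grid over phi > 0; Exp(1) prior and Poisson likelihood are illustrative.
dphi = 1e-3
phi = np.arange(dphi, 20, dphi)
prior = np.exp(-phi)

x = np.array([1, 0, 2, 1, 1])                    # hypothetical measurements
loglik = x.sum() * np.log(phi) - len(x) * phi    # Poisson log-likelihood, up to a constant

# Posterior p(phi | x), normalized on the grid (Riemann sum).
post = prior * np.exp(loglik - loglik.max())
post /= post.sum() * dphi

phi_c = 1.0
p_below = post[phi <= phi_c].sum() * dphi        # approximates p(phi <= phi_c)
```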



I would appreciate any hints or references on quantifying the described error. I hope I was clear enough about the problem; if not, please let me know.










Tags: bayesian, bayes-theorem, parameter-estimation, mean-square-error






edited Dec 18 '18 at 13:24 by Moha

asked Dec 18 '18 at 11:51 by Moha






















1 Answer



















As you stated, using the data that you have, you can obtain a posterior distribution and calculate the integral
$$I = \int_0^{\phi_c} p(\phi \mid \boldsymbol{x})\, d\phi$$
explicitly. This doesn't come with an uncertainty estimate, but what we would like to know is: thinking about the data sets that could have been drawn, what would our estimates for $I$ look like? We want to use bootstrapping here. Bootstrapping involves repeatedly resampling the data, calculating an estimate for $I$ under each resampling, and then aggregating our estimates for $I$ at the end. The distribution of these estimates of $I$ allows us to estimate our uncertainty in the estimate.



          You can use the Bayesian bootstrap specifically (there is a good description of it here) to approximate $I$ in a Bayesian fashion.
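The mechanism can be sketched by hand: instead of resampling with replacement, draw Dirichlet$(1, \dots, 1)$ weights over the observed points and recompute the statistic under each weighting. A stdlib-Python illustration; the toy data and the Gamma-Poisson statistic mirror the worked example below and are assumptions of this sketch:

```python
import math
import random

def gamma_cdf(a, x, terms=200):
    """Regularized lower incomplete gamma P(a, x) via its power series
    (stdlib only; adequate here since x is moderate)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= x / (a + k)
        total += term
    return math.exp(a * math.log(x) - x - math.lgamma(a + 1)) * total

random.seed(1)
x = [1, 0, 2, 1, 0, 1, 1, 0, 2, 0]   # toy Poisson-like counts, m = 10

def post_prob_below_one(x, w):
    # Weighted analogue of the conjugate Gamma-Poisson update: the weights
    # sum to 1, so len(x) * sum(w_i * x_i) plays the role of sum(x).
    shape = 1 + len(x) * sum(wi * xi for wi, xi in zip(w, x))
    rate = 1 + len(x)
    return gamma_cdf(shape, rate * 1.0)   # P(phi <= phi_c) with phi_c = 1

def dirichlet_weights(n):
    # Dirichlet(1, ..., 1) via normalized exponentials.
    e = [random.expovariate(1.0) for _ in range(n)]
    s = sum(e)
    return [v / s for v in e]

# Bayesian bootstrap: the spread of these draws measures uncertainty in I.
draws = sorted(post_prob_below_one(x, dirichlet_weights(len(x)))
               for _ in range(1000))
lo, hi = draws[25], draws[974]           # central 95% band
```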



I'll make a specific example here. Suppose that the prior for $\phi$ is a gamma distribution with shape $\alpha$ and rate $\beta$ (I'll use $\alpha = \beta = 1$ in my analysis), so
$$p(\phi) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\phi^{\alpha - 1}e^{-\beta\phi}, \quad \phi > 0.$$
The likelihood will be Poisson with parameter $\phi$, so the likelihood of $\boldsymbol{x}$ given $\phi$ is
$$\frac{e^{-\phi m}\, \phi^{\sum_i x_i}}{\prod_i x_i !},$$
where each $x_i \in \{0, 1, 2, \dots\}.$



Hence the posterior distribution is again a gamma distribution:
$$p(\phi \mid \boldsymbol{x}) = \frac{(\beta + m)^{\alpha + \sum_i x_i}}{\Gamma(\alpha + \sum_i x_i)}\, \phi^{\alpha + \sum_i x_i - 1}e^{-(\beta + m)\phi}, \quad \phi > 0,$$
and we can calculate $I$ from the gamma CDF (lookup tables, or `pgamma` in R).
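For instance (a minimal stdlib-Python sketch; the counts below are hypothetical), when the posterior shape is an integer, the gamma CDF reduces to a Poisson tail sum:

```python
import math

alpha, beta = 1, 1               # prior shape and rate, as in the example
x = [1, 0, 2, 1, 1]              # hypothetical Poisson counts
m, s = len(x), sum(x)

shape = alpha + s                # posterior Gamma shape (integer here)
rate = beta + m                  # posterior Gamma rate
phi_c = 1.0

# For integer shape, P(Gamma(shape, rate) <= phi_c) = P(Poisson(rate*phi_c) >= shape).
lam = rate * phi_c
I = 1.0 - sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(shape))
```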



I then simulated this setup with $\phi = 0.95$ and $\phi_c = 1$ (so $I$ should get close to $1$ as we get more information about $\phi$). Running this in R with $m = 10$ I got an estimate of $I = 0.54$. Running $1{,}000$ resamples using the bayesboot library, I got the following output:

[histogram: BB distribution of $I$, $m = 10$]

We can see that the distribution is heavier towards $1$, but it's reasonably uniform.
Now, using a sample of size $m = 100$ I got an estimate of $I = 0.32$, and the following BB output:

[histogram: BB distribution of $I$, $m = 100$]

The distribution has not changed much.
Now I try running this with $m = 10{,}000$ data points. I got an estimate of $I = 0.96$ and the following BB output:

[histogram: BB distribution of $I$, $m = 10{,}000$]

Now we are really certain that $I$ is close to $1$!



The estimates have the properties that you suggested they would: they depend on the sample size $m$, and the uncertainty decreases as $m$ gets larger. I don't think you will be able to find a general formula for how quickly the uncertainty will decrease. For example, suppose that the data were generated from a uniform $U(0, \phi)$ distribution. Then if we ever observe some $\phi_c < x_i$, we know that $P(\phi < \phi_c) = 0$ immediately, since it's impossible to observe $\phi < \phi_c < x_i$.
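That uniform case is easy to verify numerically on a grid. A small Python sketch; the Exp(1) prior and the data are illustrative assumptions:

```python
import numpy as np

dphi = 1e-3
phi = np.arange(dphi, 10, dphi)           # grid over phi > 0
prior = np.exp(-phi)                      # illustrative Exp(1) prior
x = np.array([0.3, 1.7, 0.9])             # one observation exceeds phi_c = 1

# U(0, phi) likelihood: each x_i contributes 1/phi when phi >= x_i, else 0,
# so the joint likelihood vanishes for all phi < max(x).
lik = np.where(phi >= x.max(), phi ** (-float(len(x))), 0.0)
post = prior * lik
post /= post.sum() * dphi                 # normalize on the grid

phi_c = 1.0
p_below = post[phi <= phi_c].sum() * dphi  # no posterior mass below max(x)
```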



          Here is the R code that I used to generate the plots:



install.packages("bayesboot")
library(bayesboot)

phi <- 0.95
priorShape <- 1
priorRate <- 1

# Posterior P(phi <= 1) under the Gamma prior / Poisson likelihood above
postProbLessThanOne <- function(x) {
  postShape <- priorShape + sum(x)
  postRate <- priorRate + length(x)
  postProb <- pgamma(1, shape = postShape, rate = postRate)
  return(postProb)
}

set.seed(1)
x10 <- rpois(10, phi)
postProbLessThanOne(x10)  # 0.54
bootstraps10 <- bayesboot(data = x10, statistic = postProbLessThanOne,
                          R = 1000, R2 = 10, use.weights = FALSE, .progress = "text")
hist(bootstraps10$V1, main = "BB Distribution of I: m = 10",
     xlab = "Bootstrap Values of I")

x100 <- rpois(100, phi)
postProbLessThanOne(x100)  # 0.32
bootstraps100 <- bayesboot(data = x100, statistic = postProbLessThanOne,
                           R = 1000, R2 = 100, use.weights = FALSE, .progress = "text")
hist(bootstraps100$V1, main = "BB Distribution of I: m = 100",
     xlab = "Bootstrap Values of I")

x10000 <- rpois(10000, phi)
postProbLessThanOne(x10000)  # 0.96
bootstraps10000 <- bayesboot(data = x10000, statistic = postProbLessThanOne,
                             R = 1000, R2 = 10000, use.weights = FALSE, .progress = "text")
hist(bootstraps10000$V1, main = "BB Distribution of I: m = 10,000",
     xlab = "Bootstrap Values of I")





• That makes a lot of sense, Alex. Two questions: 1) The Poisson likelihood sounds a bit strange to me. I expect that, for a given $\phi$, the integral over all $x$ is normalized to one; however, it seems to be the other way around, that is, for a given $x$ the integral over $\phi$ is normalized to one. 2) Regarding the scaling with $m$: if rather than the initial estimator $I$ we use another estimator $J = \frac{1}{m} \sum_{i=1}^m \int_0^{\phi_c} p(\phi|x_i)\, d\phi$, one can verify that $\mathrm{Var}(J)\propto 1/m$. Whether $J$ is the best estimator, or whether another estimator, say $I$, can outperform it, remains unclear to me.
  – Moha
  Dec 23 '18 at 10:43










• 1) The Poisson likelihood does sum to $1$ for a given $\phi$. Consider the case where $m = 1$ to start with. Then the likelihood is $e^{-\phi}\phi^{x_1}/x_1!$, and if we sum over $x_1$ from $0$ to $\infty$ we get $1$ (because $e^{\phi} = \sum_{i = 0}^\infty \phi^i / i!$). For multiple independent observations $x_1, \dots, x_m$ we are just multiplying the probabilities together, and so if we sum over all possible combinations of $(x_1, \dots, x_m)$, where each $x_i \geq 0$, we will get $1$.
  – Alex
  Dec 23 '18 at 11:29










• 2) The estimator $J$ will be over-influenced by the prior distribution. Each integral $\int_{0}^{\phi_c} p(\phi \mid x_i)\, d\phi$ is going to be greatly affected by the prior, no matter how large $m$ is. As you say, the estimate will converge as $m$ gets larger; it will converge to the expected value of $I$ when $m = 1$.
  – Alex
  Dec 23 '18 at 11:35










• Maybe the Poisson likelihood was confusing because I didn't specify that the $x_i$ values had to be non-negative integers; I'll correct that in my answer.
  – Alex
  Dec 23 '18 at 11:53










• I see your point. In fact, I was just simulating the estimator $J$, and it turns out to be very bad. I'll keep on with the initial $I$. Thanks a lot for your useful answer and comments.
  – Moha
  Dec 23 '18 at 12:21
















          share|cite|improve this answer











          $endgroup$



          As you stated, using the data that you have, you can obtain a posterior distribution and calculate the integral
          $$I = \int_0^{\phi_c} p(\phi \mid \boldsymbol{x})\, d\phi$$
          explicitly. This doesn't come with an uncertainty estimate, but what we would like to know is: over the data sets that could have been drawn, how would our estimates of $I$ vary? This is what bootstrapping answers. Bootstrapping involves repeatedly resampling the data, calculating an estimate of $I$ on each resample, and then aggregating those estimates at the end. The spread of the distribution of these estimates quantifies our uncertainty in the estimate of $I$.
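          As a minimal sketch of that resampling loop (in Python rather than R, with a hypothetical helper name and the statistic left abstract, so not the exact procedure used below):

```python
import numpy as np

def bootstrap_estimates(x, statistic, n_boot=1000, rng=None):
    """Ordinary bootstrap: resample x with replacement and
    recompute the statistic on each resample."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x)
    return np.array([statistic(rng.choice(x, size=len(x), replace=True))
                     for _ in range(n_boot)])

# Example with a simple statistic (the sample mean):
x = np.array([0, 1, 1, 2, 0, 1, 3, 0, 1, 2])
ests = bootstrap_estimates(x, np.mean, n_boot=500, rng=1)
```

The histogram (or standard deviation) of `ests` then plays the role of the uncertainty estimate.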



          You can use the Bayesian bootstrap specifically (there is a good description of it here) to approximate $I$ in a Bayesian fashion.
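          A sketch of the Bayesian bootstrap idea (again in Python, with a hypothetical function name; this mimics what bayesboot does with use.weights = FALSE, but is not its implementation): instead of resampling counts, draw uniform Dirichlet weights over the observations and resample according to them.

```python
import numpy as np

def bayesian_bootstrap(x, statistic, n_boot=1000, rng=None):
    """Bayesian bootstrap: draw Dirichlet(1, ..., 1) weights over the
    observations, resample according to those weights, and recompute
    the statistic on each weighted resample."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x)
    out = []
    for _ in range(n_boot):
        w = rng.dirichlet(np.ones(len(x)))          # uniform Dirichlet weights
        resample = rng.choice(x, size=len(x), replace=True, p=w)
        out.append(statistic(resample))
    return np.array(out)

x = np.array([0, 1, 2, 1, 0, 1, 1, 2, 0, 1])
ests = bayesian_bootstrap(x, np.mean, n_boot=300, rng=0)
```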



          I'll make a specific example here. Suppose that the prior for $\phi$ is a gamma distribution with shape $\alpha$ and rate $\beta$ (I'll use $\alpha = \beta = 1$ in my analysis), so
          $$p(\phi) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\phi^{\alpha - 1}e^{-\beta\phi}, \quad \phi > 0.$$
          The likelihood will be Poisson with parameter $\phi$, so the likelihood of $\boldsymbol{x}$ given $\phi$ is
          $$\frac{e^{-\phi m} \phi^{\sum_i x_i}}{\prod_i x_i!},$$
          where each $x_i \in \{0, 1, 2, \dots\}$.



          Hence the posterior distribution is again a gamma distribution, with shape $\alpha + \sum_i x_i$ and rate $\beta + m$:
          $$p(\phi \mid \boldsymbol{x}) = \frac{(\beta + m)^{\alpha + \sum_i x_i}}{\Gamma(\alpha + \sum_i x_i)} \phi^{\alpha + \sum_i x_i - 1}e^{-(\beta + m)\phi}, \quad \phi > 0,$$
          and we can calculate $I$ from the gamma CDF (pgamma in R).
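          As a sketch of this calculation in Python (assuming, as in the example, that $\alpha$ is a positive integer, so the posterior shape is an integer and the Erlang closed form of the gamma CDF applies; function names are mine):

```python
import math

def gamma_cdf_int_shape(x, shape, rate):
    """CDF of a Gamma(shape, rate) at x for integer shape (Erlang):
    P(Gamma <= x) = 1 - exp(-rate*x) * sum_{k < shape} (rate*x)^k / k!"""
    lam = rate * x
    return 1.0 - math.exp(-lam) * sum(lam**k / math.factorial(k)
                                      for k in range(shape))

def posterior_prob(xs, phi_c, alpha=1, beta=1):
    """I = P(phi <= phi_c | xs) under the gamma-Poisson conjugate model:
    the posterior is Gamma(shape = alpha + sum(xs), rate = beta + m)."""
    return gamma_cdf_int_shape(phi_c, alpha + sum(xs), beta + len(xs))

# Sanity check: with no data the posterior equals the Gamma(1, 1) prior,
# so P(phi <= 1) = 1 - exp(-1), roughly 0.632.
print(posterior_prob([], 1.0))
```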



          I then simulated this setup with $\phi = 0.95$ and $\phi_c = 1$ (so $I$ should get close to $1$ as we get more information about $\phi$). Running this in R with $m = 10$ I got an estimate of $I = 0.54$. Running $1{,}000$ resamples using the bayesboot library, I got the following output:
          [Histogram: BB distribution of $I$, $m = 10$]
          We can see that the distribution is heavier towards $1$, but it's reasonably uniform.
          Now, using a sample of size $m = 100$ I got an estimate of $I = 0.32$, and the following BB output:
          [Histogram: BB distribution of $I$, $m = 100$]
          The distribution has not changed much.
          Now I try running this with $m = 10{,}000$ data points. I got an estimate of $I = 0.96$ and the following BB output:
          [Histogram: BB distribution of $I$, $m = 10{,}000$]
          Now we are really certain that $I$ is close to $1$!



          The estimates have the properties that you suggested they would: they depend on the sample size $m$, and the uncertainty decreases as $m$ gets larger. I don't think you will be able to find a general formula for how quickly the uncertainty decreases, though. For example, suppose that the data were generated from a uniform $U(0, \phi)$ distribution. Then if we ever observe some $x_i > \phi_c$, we know immediately that $P(\phi \leq \phi_c \mid \boldsymbol{x}) = 0$, since observing $x_i$ forces $\phi \geq x_i > \phi_c$.
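          A small numerical illustration of this uniform case (a Python sketch under my own modeling assumptions: an Exp(1) prior on $\phi$, which makes the posterior proportional to $\phi^{-m} e^{-\phi}$ on $\phi \geq \max_i x_i$, integrated on a crude grid):

```python
import math

def uniform_posterior_prob(xs, phi_c, n_grid=100_000, upper=50.0):
    """P(phi <= phi_c | xs) for U(0, phi) data with an Exp(1) prior.
    The likelihood is phi^(-m) for phi >= max(xs) and 0 otherwise, so
    the posterior probability is exactly 0 once max(xs) > phi_c."""
    m = len(xs)
    lo = max(xs)
    def post(phi):  # unnormalized posterior density
        return phi**(-m) * math.exp(-phi) if phi >= lo else 0.0
    h = upper / n_grid
    grid = [i * h for i in range(1, n_grid + 1)]
    z = sum(post(p) for p in grid) * h            # normalizing constant
    num = sum(post(p) for p in grid if p <= phi_c) * h
    return num / z

# One observation above phi_c = 1 kills the probability outright:
print(uniform_posterior_prob([0.2, 1.4, 0.9], 1.0))  # 0.0
```

So here the uncertainty can collapse discontinuously, rather than shrinking smoothly like $1/\sqrt{m}$.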



          Here is the R code that I used to generate the plots:



          install.packages("bayesboot")
          library(bayesboot)

          phi = 0.95
          priorShape = 1
          priorRate = 1

          # Posterior P(phi <= 1 | x) under the gamma-Poisson conjugate model
          postProbLessThanOne = function(x) {
            postShape = priorShape + sum(x)
            postRate = priorRate + length(x)
            postProb = pgamma(1, shape = postShape, rate = postRate)
            return(postProb)
          }

          set.seed(1)
          x10 = rpois(10, phi)
          postProbLessThanOne(x10) # 0.54
          bootstraps10 = bayesboot(data = x10, statistic = postProbLessThanOne, R = 1000,
                                   R2 = 10, use.weights = FALSE, .progress = "text")
          hist(bootstraps10$V1, main = "BB Distribution of I: m = 10",
               xlab = "Bootstrap Values of I")

          x100 = rpois(100, phi)
          postProbLessThanOne(x100) # 0.32
          bootstraps100 = bayesboot(data = x100, statistic = postProbLessThanOne, R = 1000,
                                    R2 = 100, use.weights = FALSE, .progress = "text")
          hist(bootstraps100$V1, main = "BB Distribution of I: m = 100",
               xlab = "Bootstrap Values of I")

          x10000 = rpois(10000, phi)
          postProbLessThanOne(x10000) # 0.96
          bootstraps10000 = bayesboot(data = x10000, statistic = postProbLessThanOne, R = 1000,
                                      R2 = 10000, use.weights = FALSE, .progress = "text")
          hist(bootstraps10000$V1, main = "BB Distribution of I: m = 10,000",
               xlab = "Bootstrap Values of I")






          edited Dec 23 '18 at 11:54

























          answered Dec 19 '18 at 23:40
          – Alex
          • That makes a lot of sense Alex. Two questions: 1) The Poisson likelihood sounds a bit strange to me. I expect that, for a given $\phi$, the sum over all $x$ be normalized to one; however, it seems the other way around, that is, for a given $x$ the integral over $\phi$ is normalized to one. 2) Regarding the scaling with $m$: if rather than the initial estimator $I$ we use another estimator $J = \frac{1}{m}\sum_{i=1}^m \int_0^{\phi_c} p(\phi \mid x_i)\, d\phi$, one can verify $\mathrm{Var}(J) \propto 1/m$. Whether $J$ is the best estimator, or another estimator such as $I$ can outperform it, remains unclear to me.
            – Moha, Dec 23 '18 at 10:43

          • 1) The Poisson likelihood does sum to $1$ for a given $\phi$. Consider the case where $m = 1$ to start with. Then the likelihood is $e^{-\phi}\phi^{x_1}/x_1!$, and if we sum over $x_1$ from $0$ to $\infty$ we get $1$ (because $e^{\phi} = \sum_{i=0}^\infty \phi^i/i!$). For multiple independent observations $x_1, \dots, x_m$ we are just multiplying the probabilities together, and so if we sum over all possible combinations of $(x_1, \dots, x_m)$, where each $x_i \geq 0$, we will get $1$.
            – Alex, Dec 23 '18 at 11:29

          • 2) The estimator $J$ will be over-influenced by the prior distribution. Each integral $\int_0^{\phi_c} p(\phi \mid x_i)\, d\phi$ is going to be greatly affected by the prior no matter how large $m$ is. As you say, the estimate will converge as $m$ gets larger; it will converge to the expected value of $I$ when $m = 1$.
            – Alex, Dec 23 '18 at 11:35

          • Maybe the Poisson likelihood was confusing because I didn't define that the $x_i$ values had to be non-negative integers. I'll correct that in my answer.
            – Alex, Dec 23 '18 at 11:53

          • I see your point. In fact I was just simulating the estimator $J$, and it turns out to be very bad. I'll keep on with the initial $I$. Thanks a lot for your useful answer and comments.
            – Moha, Dec 23 '18 at 12:21

















