Is there an analytic form for the squared error of the difference of two univariate Gaussians?
$begingroup$
Using anchored ensembling it is possible to estimate the mean $\mu$ and the variance $\sigma^2$ of an output. I had the insight that if I then sampled from $N(\mu,\sigma^2)$, I could estimate both the aleatoric (data) and epistemic (model) uncertainty in one shot and use it to guide exploration. I could do this optimization using samples, but it has occurred to me that propagating uncertainty through the TD error might result in much less variance in the gradients.
Thanks to the uncertainty propagation page I understand how that might be done, but I am missing the last step, which is the cost function.
At first glance it might seem that the sixth example formula ($f=aA^b$) should be enough to derive the squared error, but it is obvious to me that this would just pull the variance downward if I used it as the cost directly.
How do I integrate $E_{a \sim N(\mu_1,\sigma_1),\, b \sim N(\mu_2,\sigma_2)}(a - b)^2$? I am hoping for something neat that I could backprop through.
calculus probability integration gaussian-integral
$endgroup$
$begingroup$
No. I am assuming that p(a) and p(b) are univariate Gaussian distributions, but have different means and variances. I've changed the question to better reflect what I am asking.
$endgroup$
– Marko Grdinic
Dec 24 '18 at 10:49
3
$begingroup$
Then $\mathbb{E}(a-b)^2=(\mu_1-\mu_2)^2+\sigma_1^2+\sigma_2^2$ (if $a,b$ are independent).
$endgroup$
– metamorphy
Dec 24 '18 at 11:00
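For readers who want the missing step, the identity in the comment above can be derived in a few lines. This is a sketch that assumes $\sigma_1$ and $\sigma_2$ denote standard deviations (so the variances are $\sigma_1^2$ and $\sigma_2^2$) and that $a$ and $b$ are independent; the symbol $d$ is introduced here only for illustration.

$$
\begin{aligned}
d &= a - b, \qquad \mathbb{E}[d] = \mu_1 - \mu_2, \qquad \operatorname{Var}(d) = \sigma_1^2 + \sigma_2^2,\\
\mathbb{E}\big[(a-b)^2\big] &= \mathbb{E}[d^2] = \operatorname{Var}(d) + \big(\mathbb{E}[d]\big)^2 = (\mu_1-\mu_2)^2 + \sigma_1^2 + \sigma_2^2,\\
\frac{\partial}{\partial \sigma_i}\,\mathbb{E}\big[(a-b)^2\big] &= 2\sigma_i, \qquad i = 1, 2.
\end{aligned}
$$

The second line uses only $\mathbb{E}[d^2] = \operatorname{Var}(d) + (\mathbb{E}[d])^2$, so the result needs finite variances and uncorrelated $a,b$ rather than Gaussianity as such. The last line shows that the gradient with respect to each $\sigma_i$ always points toward $\sigma_i = 0$.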
$begingroup$
That seems really convenient, but it can't be right. It is obvious to me that to minimize that function you'd just set the two $\sigma$s to zero. This would be the case even if the two $\mu$s were held constant. If I sampled instead and optimized through those then the variances could grow. It is not a matter of them being independent either. Am I asking something wrong here?
$endgroup$
– Marko Grdinic
Dec 24 '18 at 11:34
$begingroup$
By "can't be right", I mean it can't be right as an optimization target for the task I am trying to solve, not that I do not believe the equation itself.
$endgroup$
– Marko Grdinic
Dec 24 '18 at 11:41
1
$begingroup$
The example formula you reference ($f=a A^b$) is in the "nonlinear" section. What you describe is linear, and there is an exact formula, as @metamorphy has given you.
$endgroup$
– JimB
Dec 24 '18 at 18:05
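The points raised in this comment thread can be checked numerically. The following is a minimal sketch, not anyone's actual training code: it assumes independent $a \sim N(\mu_1,\sigma_1^2)$ and $b \sim N(\mu_2,\sigma_2^2)$ with $\sigma_i$ as standard deviations, and all function names are hypothetical. It compares the closed form with a Monte Carlo estimate, and it also estimates the gradient with respect to $\sigma_1$ from reparameterized samples ($a = \mu_1 + \sigma_1 \varepsilon_1$).

```python
import numpy as np


def expected_squared_diff(mu1, sigma1, mu2, sigma2):
    """Closed form of E[(a - b)^2] for independent Gaussians (sigma = std dev)."""
    return (mu1 - mu2) ** 2 + sigma1 ** 2 + sigma2 ** 2


def grad_expected_squared_diff(mu1, sigma1, mu2, sigma2):
    """Analytic gradient of the closed form w.r.t. (mu1, sigma1, mu2, sigma2)."""
    d = mu1 - mu2
    return np.array([2.0 * d, 2.0 * sigma1, -2.0 * d, 2.0 * sigma2])


def monte_carlo_squared_diff(mu1, sigma1, mu2, sigma2, n=1_000_000, seed=0):
    """Sample-based estimate of E[(a - b)^2], used only as a sanity check."""
    rng = np.random.default_rng(seed)
    a = rng.normal(mu1, sigma1, size=n)
    b = rng.normal(mu2, sigma2, size=n)
    return np.mean((a - b) ** 2)


def reparam_grad_sigma1(mu1, sigma1, mu2, sigma2, n=1_000_000, seed=0):
    """Reparameterized sample estimate of d/d(sigma1) E[(a - b)^2].

    With a = mu1 + sigma1 * eps1, the per-sample derivative is 2 * (a - b) * eps1.
    """
    rng = np.random.default_rng(seed)
    eps1 = rng.standard_normal(n)
    eps2 = rng.standard_normal(n)
    a = mu1 + sigma1 * eps1
    b = mu2 + sigma2 * eps2
    return np.mean(2.0 * (a - b) * eps1)


if __name__ == "__main__":
    params = (1.0, 0.5, -0.5, 2.0)  # illustrative values, not from the question
    print("closed form :", expected_squared_diff(*params))
    print("monte carlo :", monte_carlo_squared_diff(*params))
    print("grad (exact):", grad_expected_squared_diff(*params))
    print("d/dsigma1 from samples:", reparam_grad_sigma1(*params))
```

Under these assumptions the sampled gradient with respect to $\sigma_1$ agrees in expectation with the analytic value $2\sigma_1$, so sampling yields a noisier estimate of the same objective rather than a different one. Keeping the variances from collapsing would then require changing the objective itself, which is outside what this sketch covers.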
edited Dec 24 '18 at 11:46
Marko Grdinic
asked Dec 24 '18 at 10:18
Marko Grdinic