Reconciling two interpretations of $E(X^2)$












It's been over 25 years since my last course in probability, so this may be obvious or elementary. However, I've unexpectedly had to think deeply about this stuff for a research project, and I cannot resolve the following issue.



For the sake of simplicity/common ground, let's assume Riemann integrals always suffice, there are no convergence issues, and my random variables take values over all of $\mathbb{R}$. All integrals below are $\int_{\mathbb{R}}\,dx$ whether or not the domain is explicitly written.



Given a r.v. $X$ with density function $f(x)$, we define $E(X) = \int x f(x)\,dx$. Linearity is then proven, $E(aX+b)=aE(X)+b$, and this is completely sensible. In deriving the alternative formula for the variance, we bump into $E(X^2)$. In that derivation, we define
$$
E(X^2) = \int x^2 f(x)\,dx
$$

and proceed. This is fine, but here's my issue.



$X^2$ is itself a random variable, so we could ask for its expected value (in reference to "itself," not $X$). To be clearer, we could set $Y=X^2$ and ask for $E(Y)$. This requires us to know the density function of $Y$, which is not at all obvious and is non-trivial to get your hands on. So, setting $Y=X^2$ with density $h(y)$, why is it true that
$$
\int y\, h(y)\,dy = \int x^2 f(x)\,dx
$$


so that $E(Y) = E(X^2)$? Why do these two very different interpretations agree? I do not think this is as simple as a $u$-substitution.



For example, I can see that this works out fine if $X$ is standard normal. There, $E(X)=0$ and $Y=X^2$ is $\chi^2$ distributed with $1$ degree of freedom. Since $\sigma_X^2=1$ it is clear that $E(X^2)=1$. Chasing the calculations I also see that $E(Y)=E(\chi_1^2)=1$, so they are in fact in agreement. I can even follow the derivations of the $\chi_1^2$ distribution in terms of the $\Gamma$-function and see the connection to the standard normal, but I see no reason for this to play out as nicely no matter the density of $X$. It also seems to get worse when considering $E(X^\alpha)$ in general.
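For what it's worth, a crude numerical check of this example (just a sketch, assuming SciPy is available) does show the two computations landing on the same number:

```python
# Rough check of the standard-normal example (assumes scipy is installed).
from scipy.integrate import quad
from scipy.stats import norm, chi2

# E(X^2) against the density of X ~ N(0,1)
ex2, _ = quad(lambda x: x**2 * norm.pdf(x), -float("inf"), float("inf"))

# E(Y) against the density of Y = X^2, i.e. chi-squared with 1 degree of freedom
ey, _ = quad(lambda y: y * chi2.pdf(y, df=1), 0, float("inf"))

print(ex2, ey)  # both are 1 up to quadrature error
```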



Are these two viewpoints in potential disagreement, or is there a piece of theory that says that there is no ambiguity?




























probability-theory expected-value






asked Dec 17 '18 at 3:13









Randall









  • It's the Law of the Unconscious Statistician – Robert Israel, Dec 17 '18 at 3:45










  • @RobertIsrael Thanks, I'd not heard of that, and am relieved that it is a true subtlety. The wiki article makes sense to me. – Randall, Dec 17 '18 at 3:52














2 Answers
Pretty sure this can be reconciled with a bit of measure theory, but here's a concrete working out that they agree in this particular case.



Let $F$ be the cdf for $X$, so that the probability that $a \le X \le b$ is $F(b)-F(a)$. Then $f=F'$. Now can we use $F$ to work out the cdf for $Y$?



Yes we can! The cdf $H$ for $Y$ will be $H(y) = P(Y\le y)=P(X^2\le y)$. Thus if $y\le 0$, $H(y)=0$. However, if $y > 0$, then $P(X^2 \le y) = P(-\sqrt{y}\le X \le \sqrt{y})=F(\sqrt{y})-F(-\sqrt{y})$.



Taking the derivative, we see that
$$h(y) = \begin{cases} 0 & y \le 0 \\ \frac{f(\sqrt{y})+f(-\sqrt{y})}{2\sqrt{y}} & y > 0\end{cases}.$$



Then
$$\int y\, h(y) \,dy = \int_0^\infty y\,\bigl(f(\sqrt{y})+f(-\sqrt{y})\bigr)\frac{1}{2\sqrt{y}}\,dy.$$
Letting $u=\sqrt{y}$, we have $du=\frac{1}{2\sqrt{y}}\,dy$,
so
$$\int y\, h(y) \,dy = \int_0^\infty u^2 \bigl(f(u)+f(-u)\bigr)\,du=\int_{-\infty}^\infty u^2 f(u)\,du=E[X^2].$$
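If you want to see the formula for $h$ in action numerically, here is a rough check (a sketch assuming NumPy/SciPy are installed; the mixture density below is just an arbitrary asymmetric example, nothing canonical):

```python
# Numerical check that the derived density h of Y = X^2 reproduces E[X^2].
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def f(x):
    # density of X: an asymmetric mixture 0.3*N(-1, 0.5^2) + 0.7*N(2, 1.5^2)
    return 0.3 * norm.pdf(x, -1, 0.5) + 0.7 * norm.pdf(x, 2, 1.5)

def h(y):
    # density of Y = X^2 from the formula above (valid for y > 0)
    return (f(np.sqrt(y)) + f(-np.sqrt(y))) / (2 * np.sqrt(y))

e_y, _ = quad(lambda y: y * h(y), 0, np.inf)              # integral of y*h(y) dy
e_x2, _ = quad(lambda x: x**2 * f(x), -np.inf, np.inf)    # integral of x^2*f(x) dx
print(e_y, e_x2)  # both come out to about 4.75; they agree to quadrature accuracy
```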



And the abstract approach



Let $(A,\Omega,\mu)$ be a probability space. A random variable on $A$ is a measurable function $f:A\to \Bbb{R}$. The random variable $Y=X^2$ on $A$ is simply the measurable function $a\mapsto f(a)^2$, the composite of $x\mapsto x^2$ with $f$. However, let's be a little more general. Let $g:\Bbb{R}\to \Bbb{R}$ be any continuous function on $\Bbb{R}$, and we can consider the random variable $Y=g(X)$, which is the function $g\circ f : A\to \Bbb{R}$.



Now we can take the pushforward of $\mu$ along $f$ to get a measure $f_*\mu$ on $\Bbb{R}$ defined by $f_*\mu(E) = \mu(f^{-1}(E))$. If $f_*\mu$ is absolutely continuous with respect to Lebesgue measure, then its Radon-Nikodym derivative will be the pdf of $X$, but let's not think about distribution functions for now.



We now have a new measure space, $(\Bbb{R},\mathcal{B},f_*\mu)$. Since we obtained this space by pushing $\mu$ forward along $f$, we see that the identity function on this new space determines the same random variable $X$, since the probability that $X$ ends up in some set $B\subseteq\Bbb{R}$ was, by definition, originally $\mu(f^{-1}(B))$, and this is $f_*\mu(\mathrm{id}^{-1}(B))$.



Similarly, $Y=g(X)$ will be described on this new space by simply the function $g:\Bbb{R}\to\Bbb{R}$.



The expected value of $Y$ is then
$$\int g(x)\,f_*\mu(dx);$$
however, we can then push $f_*\mu$ forward along $g$ to get $g_*f_*\mu$.
This gives that the expected value of $Y$ is
$$\int y\, g_*f_*\mu(dy).$$



Thus, rephrased in abstract language, the statement that you want is that these two values are equal, i.e.
$$\int g(x)\,f_*\mu(dx)=\int y\,g_*f_*\mu(dy).$$



This is, however, just a special case of change of variables, which says that
$$\int j \circ k \,d\nu = \int j \,dk_*\nu,$$
with $j=\mathrm{id}$, $k=g$, and $\nu=f_*\mu$.
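If it helps, the change-of-variables identity itself can be verified by the usual bootstrap (standard measure-theory material, nothing specific to this problem). For an indicator function $j=\mathbf{1}_B$,
$$\int \mathbf{1}_B\circ k\,d\nu = \nu(k^{-1}(B)) = k_*\nu(B) = \int \mathbf{1}_B\,dk_*\nu.$$
By linearity the identity then holds for simple functions, by monotone convergence it extends to nonnegative measurable $j$, and writing $j=j^+-j^-$ gives it for all integrable $j$.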






edited Dec 17 '18 at 3:57, answered Dec 17 '18 at 3:36 – jgon

  • Thanks. Though I am glad that measure theory makes everything work out, the concrete approach is what really showed me the light. – Randall, Dec 17 '18 at 14:40
It is easier to view expectation in the following way (it is long, but worth the effort of reading, I hope):



Let us fix the set-up. We have a statistical experiment with sample space $S$, a collection of "events" (subsets of $S$) forming a $\sigma$-algebra, and a probability measure on it.



A random variable is a function that "provides a numerical measurement" for each element $s\in S$ (satisfying a measurability condition).
So the best viewpoint is to think of expectation as a concept associated with the various functions on the SAME sample space, within the SAME statistical experiment and with the fixed probability measure; this is more useful and closer to the application domain. The phrase "expectation of a random variable" is misleading, and delinking it from the underlying probability measure is confusing.



Expectation SHOULD NOT be viewed as something taken with respect to the density of a specific random variable. It is the expectation of some function on the sample space with respect to the underlying probability measure. (Various functions can have wildly different distributions and densities; those should be set aside temporarily. The formula for expectation does involve the density of the random variable, but the formula should not be confused with the concept/definition. An algorithm for computing the GCD of two numbers should not be confused with the definition of the GCD.)
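To make the sample-space viewpoint concrete, here is a tiny discrete sketch (plain Python, meant only as an illustration of the idea, not of anything specific in the question):

```python
# Expectation computed directly over the SAMPLE SPACE versus via the
# distribution of the transformed variable (they agree by construction).
S = range(1, 7)                 # sample space: a fair die
P = {s: 1/6 for s in S}         # the fixed probability measure

X = lambda s: s - 3.5           # a random variable: a function on S
g = lambda x: x**2              # the transformation of interest

# E[g(X)] as a sum over the sample space with the fixed measure P
e_sample_space = sum(g(X(s)) * P[s] for s in S)

# E[Y] via the distribution of Y = g(X): build P(Y = y), then sum y * P(Y = y)
pY = {}
for s in S:
    y = g(X(s))
    pY[y] = pY.get(y, 0) + P[s]
e_distribution = sum(y * p for y, p in pY.items())

print(e_sample_space, e_distribution)   # both equal 35/12, about 2.9167
```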






answered Dec 17 '18 at 4:08 – P Vanchinathan





























