Reconciling two interpretations of $E(X^2)$
It's been over 25 years since my last course in probability, so this may be obvious or elementary. However, I've unexpectedly had to think deeply about this stuff for a research project, and I cannot resolve the issue below.
For the sake of simplicity/common ground, let's assume Riemann integrals always suffice, there are no convergence issues, and my random variables take values over all of $\mathbb{R}$. All integrals below are $\int_{\mathbb{R}} \,dx$ whether or not the domain is explicitly written.
Given a r.v. $X$ with density function $f(x)$, we define $E(X) = \int x f(x)\,dx$. Linearity is then proven, $E(aX+b)=aE(X)+b$, and this is completely sensible. In deriving the alternative formula for the variance, we bump into $E(X^2)$. In that derivation, we define
$$
E(X^2) = \int x^2 f(x)\,dx
$$
and proceed. This is fine, but here's my issue.
$X^2$ is itself a random variable, so we could ask for its expected value (in reference to "itself," not $X$). To be clearer, we could set $Y=X^2$ and ask for $E(Y)$. This requires us to know the density function for $Y$, which to me is not at all obvious and is non-trivial to get your hands on. So, setting $Y=X^2$ with density $h(y)$, why is it true that
$$
\int y\, h(y)\,dy = \int x^2 f(x)\,dx
$$
so that $E(Y) = E(X^2)$? Why do these two very different interpretations agree? I do not think this is as simple as a $u$-substitution.
For example, I can see that this works out fine if $X$ is standard normal. There, $E(X)=0$ and $Y=X^2$ is $\chi^2$ distributed with $1$ degree of freedom. Since $\sigma_X^2=1$ it is clear that $E(X^2)=1$. Chasing the calculations I also see that $E(Y)=E(\chi_1^2)=1$, so they are in fact in agreement. I can even follow the derivations of the $\chi_1^2$ distribution in terms of the $\Gamma$-function and see the connection to the standard normal, but I see no reason for this to play out as nicely no matter the density of $X$. It also seems to get worse when considering $E(X^\alpha)$ in general.
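In fact, a quick numerical check makes the agreement visible in this case (a minimal Python sketch, assuming SciPy is available; the tolerance is just illustrative):

```python
# E(X^2) for a standard normal, computed two ways:
# (1) integrate x^2 * f(x) over R, with f the N(0,1) density;
# (2) integrate y * h(y) over (0, inf), with h the chi-squared(1) density.
import numpy as np
from scipy import integrate, stats

lhs, _ = integrate.quad(lambda x: x**2 * stats.norm.pdf(x), -np.inf, np.inf)
rhs, _ = integrate.quad(lambda y: y * stats.chi2.pdf(y, df=1), 0, np.inf)

print(lhs, rhs)              # both should be (numerically) 1
assert abs(lhs - rhs) < 1e-6
```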
Are these two viewpoints in potential disagreement, or is there a piece of theory that says that there is no ambiguity?
probability-theory expected-value
It's the Law of the Unconscious Statistician. – Robert Israel, Dec 17 '18 at 3:45
@RobertIsrael Thanks, I'd not heard of that, and am relieved that it is a true subtlety. The wiki article makes sense to me. – Randall, Dec 17 '18 at 3:52
2 Answers
Pretty sure this can be reconciled with a bit of measure theory, but here's a concrete working-out showing that the two agree in this particular case.
Let $F$ be the cdf for $X$, so that the probability that $a \le X \le b$ is $F(b)-F(a)$. Then $f=F'$. Now can we use $F$ to work out the cdf for $Y$?
Yes we can! The cdf for $Y$, call it $H$, will be $H(y) = P(Y\le y)=P(X^2\le y)$. Thus if $y\le 0$, $H(y)=0$. However, if $y > 0$, then $P(X^2 \le y) = P(-\sqrt{y}\le X \le \sqrt{y})=F(\sqrt{y})-F(-\sqrt{y})$.
Taking the derivative, we see that
$$h(y) = \begin{cases} 0 & y \le 0 \\ \frac{f(\sqrt{y})+f(-\sqrt{y})}{2\sqrt{y}} & y > 0\end{cases}.$$
Then
$$\int y\, h(y) \,dy = \int_0^\infty y\bigl(f(\sqrt{y})+f(-\sqrt{y})\bigr)\frac{1}{2\sqrt{y}}\,dy.$$
Letting $u=\sqrt{y}$, we have $du=\frac{1}{2\sqrt{y}}\,dy$,
so
$$\int y\, h(y) \,dy = \int_0^\infty u^2 \bigl(f(u)+f(-u)\bigr)\,du=\int_{-\infty}^\infty u^2 f(u)\,du=E[X^2].$$
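For what it's worth, this formula is easy to check numerically for a density that is not symmetric about $0$ (a minimal Python sketch, assuming SciPy; the choice of a $N(1,2)$ density is purely illustrative):

```python
# Check: int y*h(y) dy, with h from the formula above, equals int x^2 f(x) dx.
# Illustrative density: f = pdf of a normal with mean 1 and standard deviation 2,
# so E[X^2] = mu^2 + sigma^2 = 1 + 4 = 5.
import numpy as np
from scipy import integrate, stats

f = stats.norm(loc=1, scale=2).pdf

def h(y):
    # density of Y = X^2 as derived above (valid for y > 0)
    return (f(np.sqrt(y)) + f(-np.sqrt(y))) / (2 * np.sqrt(y))

e_x2, _ = integrate.quad(lambda x: x**2 * f(x), -np.inf, np.inf)
e_y, _ = integrate.quad(lambda y: y * h(y), 0, np.inf)

print(e_x2, e_y)   # both should be about 5.0
```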
And the abstract approach
Let $(A,\Omega,\mu)$ be a probability space. A random variable on $A$ is a measurable function $f:A\to \Bbb{R}$. The random variable $Y=X^2$ on $A$ is simply the measurable function $a\mapsto f(a)^2$, the composite of $x\mapsto x^2$ with $f$. However, let's be a little more general. Let $g:\Bbb{R}\to \Bbb{R}$ be any continuous function on $\Bbb{R}$, and we can consider the random variable $Y=g(X)$, which is the function $g\circ f : A\to \Bbb{R}$.
Now we can take the pushforward of $\mu$ along $f$ to get a measure $f_*\mu$ on $\Bbb{R}$ defined by $f_*\mu(E) = \mu(f^{-1}(E))$. If $f_*\mu$ is absolutely continuous with respect to Lebesgue measure, then its Radon-Nikodym derivative will be the pdf of $X$, but let's not think about distribution functions for now.
We now have a new measure space, $(\Bbb{R},\mathcal{B},f_*\mu)$. Since we obtained this space by pushing $\mu$ forward along $f$, we see that the identity function on this new space determines the same random variable $X$: the probability that $X$ ends up in some set $B\subseteq\Bbb{R}$ was originally, by definition, $\mu(f^{-1}(B))$, but this is $f_*\mu(\mathrm{id}^{-1}(B))$.
Similarly, $Y=g(X)$ will be described on this new space by simply the function $g:\Bbb{R}\to\Bbb{R}$.
The expected value of $Y$ is then
$$\int g(x)\,f_*\mu(dx);$$
however, we can then push $f_*\mu$ forward along $g$ to get $g_*f_*\mu$.
This gives that the expected value of $Y$ is
$$\int y\, g_*f_*\mu(dy).$$
Thus, rephrased in abstract language, the statement that you want is that these two values are equal, i.e.
$$\int g(x)\,f_*\mu(dx)=\int y\,g_*f_*\mu(dy).$$
This, however, is just a special case of change of variables, which says that
$$\int j \circ k \,d\nu = \int j \,dk_*\nu,$$
with $j=\mathrm{id}$, $k=g$, and $\nu=f_*\mu$.
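At the level of samples the identity is almost a tautology, which is arguably why the Law of the Unconscious Statistician feels "unconscious": a draw from $g_*\nu$ is literally $g$ applied to a draw from $\nu$. A minimal Monte Carlo sketch (Python with NumPy; the distribution $N(1,2)$ and the sample size are just illustrative):

```python
# Monte Carlo view of  int (j o k) dnu = int j d(k_* nu)  with j = id, k = g.
# Samples from the pushforward g_* nu are g applied to samples from nu,
# so the two sample averages are the same number.
import numpy as np

rng = np.random.default_rng(0)
g = lambda x: x**2

x_samples = rng.normal(loc=1.0, scale=2.0, size=1_000_000)  # draws from nu
y_samples = g(x_samples)                                     # draws from g_* nu

print(np.mean(g(x_samples)))  # estimate of  int g dnu
print(np.mean(y_samples))     # estimate of  int y d(g_* nu), identical by construction
# both converge to E[X^2] = 1^2 + 2^2 = 5 as the sample size grows
```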
Thanks. Though I am glad that measure theory makes everything work out, the concrete approach is what really showed me the light. – Randall, Dec 17 '18 at 14:40
It is easier to view expectation in the following way (it is long, but worth the effort of reading, I hope):
Let us fix the set-up. We have a statistical experiment with sample space $S$, a collection of "events" (subsets of $S$) forming a $\sigma$-algebra, and a probability measure on it.
A random variable is a function that assigns a numerical measurement to each element $s\in S$ (satisfying a measurability condition).
So the best viewpoint, the one that is more useful and closer to the application domain, is to think of expectation as a concept attached to the various functions defined on the SAME sample space, within the SAME statistical experiment and with the same fixed probability measure. The phrase "expectation of a random variable" is misleading if it delinks the expectation from that underlying probability measure.
Expectation SHOULD NOT be viewed as something taken with respect to the density of a specific random variable. It is the expectation of some function on the sample space with respect to the underlying probability measure. (Various functions could have wildly different distributions and densities; set those aside temporarily. The computational formula for expectation does involve the density of the random variable, but the formula should not be confused with the concept/definition. An algorithm for computing the GCD of two numbers should not be confused with the definition of the GCD.)
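To make this viewpoint concrete, here is a small sketch (Python with SciPy; the choice of sample space and of a standard normal $X$ are purely illustrative) in which $E(X^2)$ is computed as an integral over the sample space itself, without ever writing down the density of $X^2$:

```python
# Take the sample space S = (0, 1) with the uniform (Lebesgue) probability measure,
# and realize X by the inverse-cdf map X(s) = F^{-1}(s), here for a standard normal.
# Then E[g(X)] is the integral of g(X(s)) over S; no density of g(X) is needed.
import numpy as np
from scipy import integrate, stats

X = stats.norm.ppf              # X(s) = F^{-1}(s), a function on the sample space
g = lambda x: x**2

e_sample_space, _ = integrate.quad(lambda s: g(X(s)), 0, 1)
e_density, _ = integrate.quad(lambda x: g(x) * stats.norm.pdf(x), -np.inf, np.inf)

print(e_sample_space, e_density)   # both should be (numerically) 1
```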