How is causation defined mathematically?
What is the mathematical definition of a causal relationship between two random variables?
Given a sample from the joint distribution of two random variables $X$ and $Y$, when would we say $X$ causes $Y$?
For context, I am reading this paper about causal discovery.
Tags: machine-learning, causality
Comments:

– mdewey (Dec 8 at 14:53): As far as I can see, causality is a scientific, not a mathematical, concept. Can you edit to clarify?

– Kodiologist (Dec 8 at 14:55): @mdewey I disagree. Causality can be cashed out in entirely formal terms. See e.g. my answer.
asked Dec 8 at 14:01 by Jane · edited Dec 8 at 22:07
3 Answers
What is the mathematical definition of a causal relationship between two random variables?
Mathematically, a causal model consists of functional relationships between variables. For instance, consider the system of structural equations below:
$$
x = f_x(\epsilon_x) \\
y = f_y(x, \epsilon_y)
$$
This means that $x$ functionally determines the value of $y$ (if you intervene on $x$, this changes the values of $y$) but not the other way around. Graphically, this is usually represented by $x \rightarrow y$, which means that $x$ enters the structural equation of $y$. As an addendum, you can also express a causal model in terms of joint distributions of counterfactual variables, which is mathematically equivalent to functional models.
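To make this concrete, here is a minimal simulation sketch in Python (the functional forms and noise distributions are illustrative choices of this write-up, not from any particular paper): intervening on $x$ shifts $y$, while intervening on $y$ leaves the distribution of $x$ untouched.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

def simulate(do_x=None, do_y=None):
    # Structural equations: x = f_x(eps_x), y = f_y(x, eps_y).
    # An intervention replaces one variable's equation and nothing else.
    eps_x = rng.normal(size=n)
    eps_y = rng.normal(size=n)
    x = eps_x if do_x is None else np.full(n, do_x)
    y = 2.0 * x + eps_y if do_y is None else np.full(n, do_y)
    return x, y

_, y_do0 = simulate(do_x=0.0)
_, y_do3 = simulate(do_x=3.0)
x_nat, _ = simulate()
x_doy, _ = simulate(do_y=5.0)

print(y_do0.mean(), y_do3.mean())  # y responds to interventions on x
print(x_nat.mean(), x_doy.mean())  # x does not respond to interventions on y
```

The asymmetry is exactly the one encoded by the arrow $x \rightarrow y$: only $y$'s equation reads the other variable.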
Given a sample from the joint distribution of two random variables $X$ and $Y$, when would we say $X$ causes $Y$?
Often (indeed, most of the time) you do not have knowledge about the shape of the structural equations $f_x$, $f_y$, nor even whether $x \rightarrow y$ or $y \rightarrow x$. The only information you have is the joint probability distribution $p(y, x)$ (or samples from it).
This leads to your question: when can I recover the direction of causality just from the data? Or, more precisely: when can I recover whether $x$ enters the structural equation of $y$, or vice versa, just from the data?
Of course, without some fundamentally untestable assumptions about the causal model, this is impossible. The problem is that several different causal models can entail the same joint probability distribution of the observed variables. The most common example is a causal linear system with Gaussian noise.
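As a quick numerical illustration of this non-identifiability (a sketch added here, with arbitrary coefficients, not part of the original answer): with linear-Gaussian data an OLS fit is equally "clean" in both directions, so nothing in the second moments points to the true direction.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
x = rng.normal(size=n)
y = 0.8 * x + rng.normal(size=n)   # true model: x -> y, Gaussian noise

def ols_residuals(a, b):
    # Regress b on a (with intercept) and return the residuals.
    slope, intercept = np.polyfit(a, b, 1)
    return b - (slope * a + intercept)

r_fwd = ols_residuals(x, y)   # fit in the true direction
r_rev = ols_residuals(y, x)   # fit in the wrong direction

# OLS residuals are uncorrelated with the regressor by construction, and
# for jointly Gaussian data uncorrelated implies independent -- so both
# factorizations are equally valid linear-Gaussian models of p(x, y).
print(np.corrcoef(x, r_fwd)[0, 1], np.corrcoef(y, r_rev)[0, 1])
```

Both correlations are (numerically) zero, which is exactly why the Gaussian case is the standard counterexample.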
But under some causal assumptions, this might be possible---and this is what the causal discovery literature works on. If you have no prior exposure to this topic, you might want to start with Elements of Causal Inference by Peters, Janzing and Schölkopf, as well as chapter 2 of Causality by Judea Pearl. We have a topic here on CV for references on causal discovery, but we don't have many references listed there yet.
Therefore, there isn't just one answer to your question, since it depends on the assumptions one makes. The paper you mention cites some examples, such as assuming a linear model with non-Gaussian noise. This case is known as LiNGAM (short for Linear Non-Gaussian Acyclic Model); here is an example in R:
library(pcalg)
set.seed(1234)
n <- 500
eps1 <- sign(rnorm(n)) * sqrt(abs(rnorm(n)))  # non-Gaussian noise
eps2 <- runif(n) - 0.5                        # non-Gaussian (uniform) noise
x2 <- 3 + eps2
x1 <- 0.9 * x2 + 7 + eps1                     # true model: x2 -> x1
# run LiNGAM on the observed sample
X <- cbind(x1, x2)
res <- lingam(X)
as(res, "amat")
# Adjacency Matrix 'amat' (2 x 2) of type 'pag':
#      [,1] [,2]
# [1,]    .    .
# [2,] TRUE    .
Notice that here we have a linear causal model with non-Gaussian noise where $x_2$ causes $x_1$, and LiNGAM correctly recovers the causal direction. However, note that this depends critically on the LiNGAM assumptions.
For the case of the paper you cite, they make this specific assumption (see their "postulate"):
If $x \rightarrow y$, the minimal description length of the mechanism mapping $X$ to $Y$ is independent of the value of $X$, whereas the minimal description length of the mechanism mapping $Y$ to $X$ is dependent on the value of $Y$.
Note this is an assumption---it is what we would call their "identification condition". Essentially, the postulate imposes restrictions on the joint distribution $p(x, y)$: it says that if $x \rightarrow y$, certain restrictions hold in the data, and if $y \rightarrow x$, other restrictions hold. These kinds of restrictions have testable implications (they impose constraints on $p(y, x)$), and that is what allows one to recover directionality from observational data.
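To see how an identification condition yields something testable, here is a sketch using a related but different asymmetry (the additive-noise idea, with a $y = x^3 + \epsilon$ example in the spirit of the paper's motivation; this is not the MDL postulate itself, and the polynomial degrees and thresholds are illustrative choices of this write-up):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3000
x = rng.uniform(-1, 1, size=n)
y = x**3 + 0.1 * rng.normal(size=n)   # true model: x -> y, additive noise

def spearman(a, b):
    # Rank correlation (no ties expected with continuous data).
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    return np.corrcoef(ra, rb)[0, 1]

def residual_dependence(cause, effect, deg):
    # Fit a polynomial regression, then check whether the residual
    # *magnitude* still depends on the regressor. Under the true
    # direction the residual is just the noise, hence independent.
    coef = np.polyfit(cause, effect, deg)
    resid = effect - np.polyval(coef, cause)
    return abs(spearman(np.abs(cause), np.abs(resid)))

d_fwd = residual_dependence(x, y, 3)  # near zero: residual looks like pure noise
d_rev = residual_dependence(y, x, 5)  # clearly nonzero: residual scale varies with y
print(d_fwd, d_rev)
```

The direction with the (approximately) independent residual is the one consistent with the assumed additive-noise model; that is the testable restriction on $p(y, x)$ in this particular scheme.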
As a final remark, causal discovery results are still very limited and depend on strong assumptions; be careful when applying them in real-world contexts.
Comments:

– usεr11852 (Dec 8 at 23:09): Is there a chance you could augment your answer to include some simple examples with fake data, please? For example, having read a bit of Elements of Causal Inference and viewed some of Peters' lectures, a regression framework is commonly used to motivate the need for understanding the problem in detail (I am not even touching on their ICP work). I have the (maybe mistaken) impression that in your effort to move away from the RCM, your answers leave out all the actual tangible modelling machinery.

– Carlos Cinelli (Dec 8 at 23:16): @usεr11852 I'm not sure I understand the context of your questions; do you want examples of causal discovery? There are several examples in the very paper Jane has provided. Also, I'm not sure I understand what you mean by "avoiding RCM and leaving out actual tangible modeling machinery". What tangible machinery are we missing in the causal discovery context here?

– usεr11852 (Dec 8 at 23:30): Apologies for the confusion; I do not care about examples from papers. I can cite other papers myself (for example, Lopez-Paz et al., CVPR 2017, about their neural causation coefficient). What I care about is a simple numerical example with fake data that someone can run in R (or your favourite language) to see what you mean. If you cite, for example, Peters et al.'s book, they have small code snippets that are hugely helpful (and occasionally use just lm). We cannot all work around the Tuebingen datasets' observational samples to get an idea of causal discovery! :)

– Carlos Cinelli (Dec 8 at 23:50): @usεr11852 sure, including a fake example is trivial; I can include one using LiNGAM in R. But would you care to explain what you meant by "avoiding RCM and leaving out actual tangible modeling machinery"?

– Carlos Cinelli (Dec 9 at 0:07): @usεr11852 ok, thanks for the feedback, I will try to include more code when appropriate. As a final remark, causal discovery results are still very limited, so people need to be very careful when applying them, depending on context.
There are a variety of approaches to formalizing causality (which is in keeping with substantial philosophical disagreement about causality that has been around for centuries). A popular one is in terms of potential outcomes. The potential-outcomes approach, called the Rubin causal model, supposes that for each causal state of affairs, there's a different random variable. So, $Y_1$ might be the random variable of possible outcomes from a clinical trial if a subject takes the study drug, and $Y_2$ might be the random variable if he takes the placebo. The causal effect is the difference between $Y_1$ and $Y_2$. If in fact $Y_1 = Y_2$, we could say that the treatment has no effect. Otherwise, we could say that the treatment condition causes the outcome.
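A small simulation (with illustrative numbers, added here as a sketch) makes the potential-outcomes bookkeeping concrete: each subject reveals only one of $Y_1$, $Y_2$, yet randomized assignment lets the difference in observed means recover the causal effect.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

y2 = rng.normal(50, 10, size=n)   # potential outcomes under placebo (Y_2)
y1 = y2 + 2.0                     # potential outcomes under the drug (Y_1): constant effect of 2

# Each subject reveals only ONE potential outcome, chosen by random assignment.
t = rng.integers(0, 2, size=n)
observed = np.where(t == 1, y1, y2)

# Randomization makes the treated and control groups comparable, so the
# difference in observed group means estimates the causal effect.
ate_hat = observed[t == 1].mean() - observed[t == 0].mean()
print(ate_hat)   # close to 2
```

The catch, of course, is that with purely observational data the assignment is not randomized, which is exactly the difficulty the question is about.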
Causal relationships between variables can also be represented with directed acyclic graphs, which have a very different flavor but turn out to be mathematically equivalent to the Rubin model (Wasserman, 2004, section 17.8).
Wasserman, L. (2004). All of statistics: A concise course in statistical inference. New York, NY: Springer. ISBN 978-0-387-40272-7.
Comments:

– Jane (Dec 8 at 15:33): Thank you. What would be a test for it, given a set of samples from the joint distribution?

– Jane (Dec 8 at 16:30): I am reading arxiv.org/abs/1804.04622. I haven't read its references. I am trying to understand what one means by causality based on observational data.

– Carlos Cinelli (Dec 8 at 21:29): I'm sorry (-1), this is not what is being asked; you don't observe $Y_1$ nor $Y_2$, you observe a sample of factual variables $X$, $Y$. See the paper Jane has linked.

– Jane (Dec 8 at 21:49): @Vimal: I understand the case where we have "interventional distributions". We don't have interventional distributions in this setting, and that is what makes it harder to understand. In the motivating example in the paper they give something like $(x, y = x^3 + \epsilon)$. The conditional distribution of $y$ given $x$ is essentially the distribution of the noise $\epsilon$ plus some translation, while that doesn't hold for the conditional distribution of $x$ given $y$. I intuitively understand the example; I am trying to understand what the general definition is for observational discovery of causality.

– Vimal (Dec 8 at 21:53): @Jane for the observational case (your question), in general you cannot infer the direction of causality purely mathematically, at least in the two-variable case. For more variables, under additional (untestable) assumptions you could make a claim, but the conclusion can still be questioned. This discussion is getting very long in the comments. :)
There are two ways to determine whether $X$ is the cause of $Y$. The first is standard, while the second is my own claim.

1. There exists an intervention on $X$ such that the value of $Y$ is changed.

An intervention is a surgical change to a variable that does not affect the variables it depends on. Interventions have been formalized rigorously in structural equations and causal graphical models, but as far as I know, there is no definition that is independent of a particular model class.

2. The simulation of $Y$ requires the simulation of $X$.

To make this rigorous requires formalizing a model over $X$ and $Y$, and in particular the semantics that define how it is simulated.

In modern approaches to causation, intervention is taken as the primitive object which defines causal relationships (definition 1). In my opinion, however, intervention is a reflection of, and necessarily consistent with, simulation dynamics.
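The simulation notion in definition 2 can be given a toy formalization (a sketch for this discussion, not a standard definition): record which variables each structural equation reads, and derive an order in which the model can be simulated. Since $y$'s equation reads $x$, simulating $Y$ requires simulating $X$ first.

```python
def simulation_order(deps):
    """Return an order in which variables can be simulated, given
    deps[v] = list of variables that v's structural equation reads.
    Assumes the model is acyclic (no cycle detection here)."""
    order, seen = [], set()

    def visit(v):
        if v in seen:
            return
        seen.add(v)
        for parent in deps[v]:   # parents must be simulated before v
            visit(parent)
        order.append(v)

    for v in deps:
        visit(v)
    return order

# x = f_x(eps_x); y = f_y(x, eps_y): simulating y requires simulating x first.
print(simulation_order({"x": [], "y": ["x"]}))   # ['x', 'y']
```

On this view, "$X$ causes $Y$" corresponds to $x$ necessarily preceding $y$ in every valid simulation order.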
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
What is the mathematical definition of a causal relationship between
two random variables?
Mathematically, a causal model consists of functional relationships between variables. For instance, consider the system of structural equations below:
$$
x = f_x(epsilon_{x})\
y = f_y(x, epsilon_{y})
$$
This means that $x$ functionally determines the value of $y$ (if you intervene on $x$ this changes the values of $y$) but not the other way around. Graphically, this is usually represented by $x rightarrow y$, which means that $x$ enters the structural equation of y. As an addendum, you can also express a causal model in terms of joint distributions of counterfactual variables, which is mathematically equivalent to functional models.
Given a sample from the joint distribution of two random variables X
and Y, when would we say X causes Y?
Sometimes (or most of the times) you do not have knowledge about the shape of the structural equations $f_{x}$, $f_y$, nor even whether $xrightarrow y$ or $y rightarrow x$. The only information you have is the joint probability distribution $p(y,x)$ (or samples from this distribution).
This leads to your question: when can I recover the direction of causality just from the data? Or, more precisely, when can I recover whether $x$ enters the structural equation of $y$ or vice-versa, just from the data?
Of course, without any fundamentally untestable assumptions about the causal model, this is impossible. The problem is that several different causal models can entail the same joint probability distribution of observed variables. The most common example is a causal linear system with gaussian noise.
But under some causal assumptions, this might be possible---and this is what the causal discovery literature works on. If you have no prior exposure to this topic, you might want to start from Elements of Causal Inference by Peters, Janzing and Scholkopf, as well as chapter 2 from Causality by Judea Pearl. We have a topic here on CV for references on causal discovery, but we don't have that many references listed there yet.
Therefore, there isn't just one answer to your question, since it depends on the assumptions one makes. The paper you mention cites some examples, such as assuming a linear model with non-gaussian noise. This case is known as LINGAN (short for linear non-gaussian acyclic model), here is an example in R
:
library(pcalg)
set.seed(1234)
n <- 500
eps1 <- sign(rnorm(n)) * sqrt(abs(rnorm(n)))
eps2 <- runif(n) - 0.5
x2 <- 3 + eps2
x1 <- 0.9*x2 + 7 + eps1
# runs lingam
X <- cbind(x1, x2)
res <- lingam(X)
as(res, "amat")
# Adjacency Matrix 'amat' (2 x 2) of type ‘pag’:
# [,1] [,2]
# [1,] . .
# [2,] TRUE .
Notice here we have a linear causal model with non-gaussian noise where $x_2$ causes $x_1$ and lingam correctly recovers the causal direction. However, notice this depends critically on the LINGAM assumptions.
For the case of the paper you cite, they make this specific assumption (see their "postulate"):
If $xrightarrow y$ , the minimal description length of the mechanism mapping X to Y is independent of the value of X, whereas the minimal description length of the mechanism mapping Y to X is dependent on the value of Y.
Note this is an assumption. This is what we would call their "identification condition". Essentially, the postulate imposes restrictions on the joint distribution $p(x,y)$. That is, the postulate says that if $x rightarrow y$ certain restrictions holds in the data, and if $y rightarrow x$ other restrictions hold. These types of restrictions that have testable implications (impose constraints on $p(y,x)$) is what allows one to recover directionally from observational data.
As a final remark, causal discovery results are still very limited, and depend on strong assumptions, be careful when applying these on real world context.
1
Is there a chance you augment your answer to somehow include some simple examples with fake data please? For example, having read a bit of Elements of Causal Inference and viewed some of Peters' lectures, and a regression framework is commonly used to motivate the need for understanding the problem in detail (I am not even touching on their ICP work). I have the (maybe mistaken) impression that in your effort to move away from the RCM, your answers leave out all the actual tangible modelling machinery.
– usεr11852
Dec 8 at 23:09
1
@usεr11852 I'm not sure I understand the context of your questions, do you want examples of causal discovery? There are several examples in the very paper Jane has provided. Also, I'm not sure I understand what you mean by "avoiding RCM and leaving out actual tangible modeling machinery", what tangible machinery are we missing in the causal discovery context here?
– Carlos Cinelli
Dec 8 at 23:16
1
Apologies for the confusion, I do not care about examples from papers. I can cite other papers myself. (For example, Lopez-Paz et al. CVPR 2017 about their neural causation coefficient) What I care is for a simple numerical example with fake data that someone run in R (or your favourite language) and see what you mean. If you cite for example Peters' et al. book and they have small code snippets that hugely helpful (and occasionally use justlm
) . We cannot all work around the Tuebingen datasets observational samples to get an idea of causal discovery! :)
– usεr11852
Dec 8 at 23:30
1
@usεr11852 sure, including a fake example is trivial, I can include one using lingam in R. But would you care to explain what you meant by "avoiding RCM and leaving out actual tangible modeling machinery"?
– Carlos Cinelli
Dec 8 at 23:50
2
@usεr11852 ok thanks for the feedback, I will try to include more code when appropriate. As a final remark, causal discovery results are still very limited, so people need to be very careful when applying these depending on context.
– Carlos Cinelli
Dec 9 at 0:07
|
show 2 more comments
What is the mathematical definition of a causal relationship between
two random variables?
Mathematically, a causal model consists of functional relationships between variables. For instance, consider the system of structural equations below:
$$
x = f_x(epsilon_{x})\
y = f_y(x, epsilon_{y})
$$
This means that $x$ functionally determines the value of $y$ (if you intervene on $x$ this changes the values of $y$) but not the other way around. Graphically, this is usually represented by $x rightarrow y$, which means that $x$ enters the structural equation of y. As an addendum, you can also express a causal model in terms of joint distributions of counterfactual variables, which is mathematically equivalent to functional models.
Given a sample from the joint distribution of two random variables X
and Y, when would we say X causes Y?
Sometimes (or most of the times) you do not have knowledge about the shape of the structural equations $f_{x}$, $f_y$, nor even whether $xrightarrow y$ or $y rightarrow x$. The only information you have is the joint probability distribution $p(y,x)$ (or samples from this distribution).
This leads to your question: when can I recover the direction of causality just from the data? Or, more precisely, when can I recover whether $x$ enters the structural equation of $y$ or vice-versa, just from the data?
Of course, without any fundamentally untestable assumptions about the causal model, this is impossible. The problem is that several different causal models can entail the same joint probability distribution of observed variables. The most common example is a causal linear system with gaussian noise.
But under some causal assumptions, this might be possible---and this is what the causal discovery literature works on. If you have no prior exposure to this topic, you might want to start from Elements of Causal Inference by Peters, Janzing and Scholkopf, as well as chapter 2 from Causality by Judea Pearl. We have a topic here on CV for references on causal discovery, but we don't have that many references listed there yet.
Therefore, there isn't just one answer to your question, since it depends on the assumptions one makes. The paper you mention cites some examples, such as assuming a linear model with non-gaussian noise. This case is known as LINGAN (short for linear non-gaussian acyclic model), here is an example in R
:
library(pcalg)
set.seed(1234)
n <- 500
eps1 <- sign(rnorm(n)) * sqrt(abs(rnorm(n)))
eps2 <- runif(n) - 0.5
x2 <- 3 + eps2
x1 <- 0.9*x2 + 7 + eps1
# runs lingam
X <- cbind(x1, x2)
res <- lingam(X)
as(res, "amat")
# Adjacency Matrix 'amat' (2 x 2) of type ‘pag’:
# [,1] [,2]
# [1,] . .
# [2,] TRUE .
Notice here we have a linear causal model with non-gaussian noise where $x_2$ causes $x_1$ and lingam correctly recovers the causal direction. However, notice this depends critically on the LINGAM assumptions.
For the case of the paper you cite, they make this specific assumption (see their "postulate"):
If $xrightarrow y$ , the minimal description length of the mechanism mapping X to Y is independent of the value of X, whereas the minimal description length of the mechanism mapping Y to X is dependent on the value of Y.
Note this is an assumption. This is what we would call their "identification condition". Essentially, the postulate imposes restrictions on the joint distribution $p(x,y)$. That is, the postulate says that if $x rightarrow y$ certain restrictions holds in the data, and if $y rightarrow x$ other restrictions hold. These types of restrictions that have testable implications (impose constraints on $p(y,x)$) is what allows one to recover directionally from observational data.
As a final remark, causal discovery results are still very limited, and depend on strong assumptions, be careful when applying these on real world context.
1
Is there a chance you augment your answer to somehow include some simple examples with fake data please? For example, having read a bit of Elements of Causal Inference and viewed some of Peters' lectures, and a regression framework is commonly used to motivate the need for understanding the problem in detail (I am not even touching on their ICP work). I have the (maybe mistaken) impression that in your effort to move away from the RCM, your answers leave out all the actual tangible modelling machinery.
– usεr11852
Dec 8 at 23:09
1
@usεr11852 I'm not sure I understand the context of your questions, do you want examples of causal discovery? There are several examples in the very paper Jane has provided. Also, I'm not sure I understand what you mean by "avoiding RCM and leaving out actual tangible modeling machinery", what tangible machinery are we missing in the causal discovery context here?
– Carlos Cinelli
Dec 8 at 23:16
1
Apologies for the confusion, I do not care about examples from papers. I can cite other papers myself. (For example, Lopez-Paz et al. CVPR 2017 about their neural causation coefficient) What I care is for a simple numerical example with fake data that someone run in R (or your favourite language) and see what you mean. If you cite for example Peters' et al. book and they have small code snippets that hugely helpful (and occasionally use justlm
) . We cannot all work around the Tuebingen datasets observational samples to get an idea of causal discovery! :)
– usεr11852
Dec 8 at 23:30
1
@usεr11852 sure, including a fake example is trivial, I can include one using lingam in R. But would you care to explain what you meant by "avoiding RCM and leaving out actual tangible modeling machinery"?
– Carlos Cinelli
Dec 8 at 23:50
2
@usεr11852 ok thanks for the feedback, I will try to include more code when appropriate. As a final remark, causal discovery results are still very limited, so people need to be very careful when applying these depending on context.
– Carlos Cinelli
Dec 9 at 0:07
|
show 2 more comments
What is the mathematical definition of a causal relationship between
two random variables?
Mathematically, a causal model consists of functional relationships between variables. For instance, consider the system of structural equations below:
$$
x = f_x(epsilon_{x})\
y = f_y(x, epsilon_{y})
$$
This means that $x$ functionally determines the value of $y$ (if you intervene on $x$ this changes the values of $y$) but not the other way around. Graphically, this is usually represented by $x rightarrow y$, which means that $x$ enters the structural equation of y. As an addendum, you can also express a causal model in terms of joint distributions of counterfactual variables, which is mathematically equivalent to functional models.
Given a sample from the joint distribution of two random variables X
and Y, when would we say X causes Y?
Sometimes (or most of the times) you do not have knowledge about the shape of the structural equations $f_{x}$, $f_y$, nor even whether $xrightarrow y$ or $y rightarrow x$. The only information you have is the joint probability distribution $p(y,x)$ (or samples from this distribution).
This leads to your question: when can I recover the direction of causality just from the data? Or, more precisely, when can I recover whether $x$ enters the structural equation of $y$ or vice-versa, just from the data?
Of course, without any fundamentally untestable assumptions about the causal model, this is impossible. The problem is that several different causal models can entail the same joint probability distribution of observed variables. The most common example is a causal linear system with gaussian noise.
But under some causal assumptions, this might be possible---and this is what the causal discovery literature works on. If you have no prior exposure to this topic, you might want to start from Elements of Causal Inference by Peters, Janzing and Scholkopf, as well as chapter 2 from Causality by Judea Pearl. We have a topic here on CV for references on causal discovery, but we don't have that many references listed there yet.
Therefore, there isn't just one answer to your question, since it depends on the assumptions one makes. The paper you mention cites some examples, such as assuming a linear model with non-gaussian noise. This case is known as LINGAN (short for linear non-gaussian acyclic model), here is an example in R
:
library(pcalg)
set.seed(1234)
n <- 500
eps1 <- sign(rnorm(n)) * sqrt(abs(rnorm(n)))
eps2 <- runif(n) - 0.5
x2 <- 3 + eps2
x1 <- 0.9*x2 + 7 + eps1
# runs lingam
X <- cbind(x1, x2)
res <- lingam(X)
as(res, "amat")
# Adjacency Matrix 'amat' (2 x 2) of type ‘pag’:
# [,1] [,2]
# [1,] . .
# [2,] TRUE .
Notice here we have a linear causal model with non-gaussian noise where $x_2$ causes $x_1$ and lingam correctly recovers the causal direction. However, notice this depends critically on the LINGAM assumptions.
For the case of the paper you cite, they make this specific assumption (see their "postulate"):
If $xrightarrow y$ , the minimal description length of the mechanism mapping X to Y is independent of the value of X, whereas the minimal description length of the mechanism mapping Y to X is dependent on the value of Y.
Note this is an assumption. This is what we would call their "identification condition". Essentially, the postulate imposes restrictions on the joint distribution $p(x,y)$. That is, the postulate says that if $x rightarrow y$ certain restrictions holds in the data, and if $y rightarrow x$ other restrictions hold. These types of restrictions that have testable implications (impose constraints on $p(y,x)$) is what allows one to recover directionally from observational data.
As a final remark, causal discovery results are still very limited, and depend on strong assumptions, be careful when applying these on real world context.
What is the mathematical definition of a causal relationship between
two random variables?
Mathematically, a causal model consists of functional relationships between variables. For instance, consider the system of structural equations below:
$$
x = f_x(epsilon_{x})\
y = f_y(x, epsilon_{y})
$$
This means that $x$ functionally determines the value of $y$ (if you intervene on $x$ this changes the values of $y$) but not the other way around. Graphically, this is usually represented by $x rightarrow y$, which means that $x$ enters the structural equation of y. As an addendum, you can also express a causal model in terms of joint distributions of counterfactual variables, which is mathematically equivalent to functional models.
Given a sample from the joint distribution of two random variables X
and Y, when would we say X causes Y?
Sometimes (or most of the times) you do not have knowledge about the shape of the structural equations $f_{x}$, $f_y$, nor even whether $xrightarrow y$ or $y rightarrow x$. The only information you have is the joint probability distribution $p(y,x)$ (or samples from this distribution).
This leads to your question: when can I recover the direction of causality just from the data? Or, more precisely, when can I recover whether $x$ enters the structural equation of $y$ or vice-versa, just from the data?
Of course, without any fundamentally untestable assumptions about the causal model, this is impossible. The problem is that several different causal models can entail the same joint probability distribution of observed variables. The most common example is a causal linear system with gaussian noise.
But under some causal assumptions, this might be possible---and this is what the causal discovery literature works on. If you have no prior exposure to this topic, you might want to start from Elements of Causal Inference by Peters, Janzing and Scholkopf, as well as chapter 2 from Causality by Judea Pearl. We have a topic here on CV for references on causal discovery, but we don't have that many references listed there yet.
Therefore, there isn't just one answer to your question, since it depends on the assumptions one makes. The paper you mention cites some examples, such as assuming a linear model with non-Gaussian noise. This case is known as LiNGAM (short for linear non-Gaussian acyclic model); here is an example in R:
library(pcalg)  # provides lingam()
set.seed(1234)
n <- 500
# Non-Gaussian noise terms (non-Gaussianity is what LiNGAM exploits)
eps1 <- sign(rnorm(n)) * sqrt(abs(rnorm(n)))
eps2 <- runif(n) - 0.5
# True model: x2 causes x1
x2 <- 3 + eps2
x1 <- 0.9 * x2 + 7 + eps1
# run lingam on the observed sample
X <- cbind(x1, x2)
res <- lingam(X)
as(res, "amat")
# Adjacency Matrix 'amat' (2 x 2) of type 'pag':
#      [,1] [,2]
# [1,] .    .
# [2,] TRUE .
Notice that here we have a linear causal model with non-Gaussian noise where $x_2$ causes $x_1$, and LiNGAM correctly recovers the causal direction. However, note that this depends critically on the LiNGAM assumptions holding.
For the case of the paper you cite, they make this specific assumption (see their "postulate"):
If $x \rightarrow y$, the minimal description length of the mechanism mapping $X$ to $Y$ is independent of the value of $X$, whereas the minimal description length of the mechanism mapping $Y$ to $X$ is dependent on the value of $Y$.
Note this is an assumption---this is what we would call their "identification condition". Essentially, the postulate imposes restrictions on the joint distribution $p(x, y)$: it says that if $x \rightarrow y$ certain restrictions hold in the data, and if $y \rightarrow x$ other restrictions hold. These types of restrictions have testable implications (they impose constraints on $p(y, x)$), which is what allows one to recover the direction of causality from observational data.
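A related, often-cited identification condition is the additive-noise asymmetry: if $y = f(x) + \epsilon$ with $f$ nonlinear and $\epsilon$ independent of $x$, then in the forward direction the residuals look independent of the regressor, while in the backward direction they do not. A crude Python sketch (invented toy numbers, with a simple correlation of residual magnitudes standing in for a proper independence test):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Additive-noise toy model (invented for illustration): x -> y = x^3 + noise
x = rng.uniform(0.0, 1.0, n)
y = x**3 + 0.1 * rng.normal(size=n)

def dependence(pred, resp, deg):
    """Polynomial fit of resp on pred; crude dependence score between
    the residual magnitude and the predictor (0 ~= independent)."""
    resid = resp - np.polyval(np.polyfit(pred, resp, deg), pred)
    return abs(np.corrcoef(np.abs(resid), pred)[0, 1])

forward = dependence(x, y, 3)  # residuals look independent of x
reverse = dependence(y, x, 7)  # residual spread clearly varies with y
print(forward < reverse)       # the asymmetry points to x -> y
```

This is only a sketch of the idea; real additive-noise methods use proper independence tests (e.g., HSIC) and flexible regressors rather than fixed-degree polynomials.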
As a final remark, causal discovery results are still very limited and depend on strong assumptions; be careful when applying them in real-world contexts.
edited Dec 9 at 0:12
answered Dec 8 at 22:26
Carlos Cinelli
Is there a chance you could augment your answer to include some simple examples with fake data, please? For example, having read a bit of Elements of Causal Inference and viewed some of Peters' lectures, I see that a regression framework is commonly used to motivate the need for understanding the problem in detail (I am not even touching on their ICP work). I have the (maybe mistaken) impression that in your effort to move away from the RCM, your answers leave out all the actual tangible modelling machinery.
– usεr11852
Dec 8 at 23:09
@usεr11852 I'm not sure I understand the context of your questions, do you want examples of causal discovery? There are several examples in the very paper Jane has provided. Also, I'm not sure I understand what you mean by "avoiding RCM and leaving out actual tangible modeling machinery", what tangible machinery are we missing in the causal discovery context here?
– Carlos Cinelli
Dec 8 at 23:16
Apologies for the confusion, I do not care about examples from papers. I can cite other papers myself (for example, Lopez-Paz et al. CVPR 2017 about their neural causation coefficient). What I care about is a simple numerical example with fake data that someone can run in R (or your favourite language) and see what you mean. If you cite for example Peters et al.'s book, they have small code snippets that are hugely helpful (and occasionally use just lm). We cannot all work around the Tuebingen datasets' observational samples to get an idea of causal discovery! :)
– usεr11852
Dec 8 at 23:30
@usεr11852 sure, including a fake example is trivial, I can include one using lingam in R. But would you care to explain what you meant by "avoiding RCM and leaving out actual tangible modeling machinery"?
– Carlos Cinelli
Dec 8 at 23:50
@usεr11852 ok thanks for the feedback, I will try to include more code when appropriate. As a final remark, causal discovery results are still very limited, so people need to be very careful when applying these depending on context.
– Carlos Cinelli
Dec 9 at 0:07
There are a variety of approaches to formalizing causality (which is in keeping with substantial philosophical disagreement about causality that has been around for centuries). A popular one is in terms of potential outcomes. The potential-outcomes approach, called the Rubin causal model, supposes that for each causal state of affairs, there's a different random variable. So, $Y_1$ might be the random variable of possible outcomes from a clinical trial if a subject takes the study drug, and $Y_2$ might be the random variable if he takes the placebo. The causal effect is the difference between $Y_1$ and $Y_2$. If in fact $Y_1 = Y_2$, we could say that the treatment has no effect. Otherwise, we could say that the treatment condition causes the outcome.
Causal relationships between variables can also be represented with directed acyclic graphs, which have a very different flavor but turn out to be mathematically equivalent to the Rubin model (Wasserman, 2004, section 17.8).
Wasserman, L. (2004). All of statistics: A concise course in statistical inference. New York, NY: Springer. ISBN 978-0-387-40272-7.
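The potential-outcomes idea can be sketched with fake data (a hypothetical simulation, not from any of the cited works): because we can simulate both $Y_1$ and $Y_0$ for every subject, we can compute the true causal effect and compare it with the naive observed difference under confounded treatment assignment.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Potential outcomes (hypothetical -- never both observed for one subject):
# y0 = outcome under placebo, y1 = outcome under the study drug.
u = rng.normal(size=n)            # unobserved health status (a confounder)
y0 = u + rng.normal(size=n)
y1 = y0 + 2.0                     # true causal effect: +2 for everyone

# Non-randomized assignment: healthier subjects are more likely treated.
t = (u + rng.normal(size=n) > 0).astype(int)

true_ate = np.mean(y1 - y0)                     # knowable only in simulation
naive = y1[t == 1].mean() - y0[t == 0].mean()   # what the observed data give

print(true_ate)  # 2.0
print(naive)     # noticeably larger: causal effect + confounding bias
```

Randomizing `t` instead would make the naive difference a consistent estimate of the true effect, which is exactly why randomized experiments are the gold standard in this framework.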
Thank you. What would be a test for it, given a set of samples from the joint distribution?
– Jane
Dec 8 at 15:33
I am reading arxiv.org/abs/1804.04622. I haven't read its references. I am trying to understand what one means by causality based on observational data.
– Jane
Dec 8 at 16:30
I'm sorry (-1), this is not what is being asked, you don't observe $Y_1$ nor $Y_2$, you observe a sample of factual variables $X$, $Y$. See the paper Jane has linked.
– Carlos Cinelli
Dec 8 at 21:29
@Vimal: I understand the case where we have "interventional distributions". We don't have interventional distributions in this setting, and that is what makes it harder to understand. In the motivating example in the paper they give something like $(x, y = x^3 + \epsilon)$. The conditional distribution of $y$ given $x$ is essentially the distribution of the noise $\epsilon$ plus some translation, while that doesn't hold for the conditional distribution of $x$ given $y$. I intuitively understand the example. I am trying to understand what the general definition is for observational discovery of causality.
– Jane
Dec 8 at 21:49
@Jane For the observational case (your question), in general you cannot infer the direction of causality purely mathematically, at least in the two-variable case. With more variables, under additional (untestable) assumptions you could make a claim, but the conclusion can still be questioned. This discussion is getting very long in the comments. :)
– Vimal
Dec 8 at 21:53
edited Dec 11 at 19:22
answered Dec 8 at 14:54
Kodiologist
There are two ways to determine whether $X$ is the cause of $Y$. The first is standard while the second is my own claim.
- There exists an intervention on $X$ such that the value of $Y$ is changed
An intervention is a surgical change to a variable that does not affect variables it depends on. Interventions have been formalized rigorously in structural equations and causal graphical models, but as far as I know, there is no definition which is independent of a particular model class.
- The simulation of $Y$ requires the simulation of $X$
To make this rigorous requires formalizing a model over $X$ and $Y$, and in particular the semantics which define how it is simulated.
In modern approaches to causation, intervention is taken as the primitive object which defines causal relationships (definition 1). In my opinion, however, intervention is a reflection of, and necessarily consistent with, simulation dynamics.
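The intervention asymmetry in definition 1 can be sketched with a toy simulation (the structural model and numbers here are invented for illustration): forcing $X$ to a value shifts the distribution of $Y$, while forcing $Y$ leaves the distribution of $X$ unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Toy structural model (an assumption for illustration): x -> y
def simulate(do_x=None, do_y=None):
    x = rng.normal(size=n) if do_x is None else np.full(n, float(do_x))
    y = 2.0 * x + rng.normal(size=n)
    if do_y is not None:
        y = np.full(n, float(do_y))  # overriding y does not touch x
    return x, y

x_obs, y_obs = simulate()
_, y_do = simulate(do_x=3.0)         # intervening on x shifts y ...
x_do, _ = simulate(do_y=3.0)         # ... intervening on y leaves x alone

print(round(y_obs.mean(), 1), round(y_do.mean(), 1))  # ~0.0 vs ~6.0
print(round(x_obs.mean(), 1), round(x_do.mean(), 1))  # ~0.0 vs ~0.0
```

Note how the asymmetry lives in the simulation semantics: the `do` arguments surgically overwrite one assignment while the rest of the generating code runs unchanged, which is also what the second (simulation-based) criterion above is pointing at.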
answered Dec 12 at 14:13
zenna