How is causation defined mathematically?












16














What is the mathematical definition of a causal relationship between two random variables?



Given a sample from the joint distribution of two random variables $X$ and $Y$, when would we say $X$ causes $Y$?



For context, I am reading this paper about causal discovery.






























  • 2




    As far as I can see causality is a scientific not mathematical concept. Can you edit to clarify?
    – mdewey
    Dec 8 at 14:53






  • 2




    @mdewey I disagree. Causality can be cashed out in entirely formal terms. See e.g. my answer.
    – Kodiologist
    Dec 8 at 14:55

























machine-learning causality














edited Dec 8 at 22:07

























asked Dec 8 at 14:01









Jane





















3 Answers


















9















> What is the mathematical definition of a causal relationship between two random variables?




Mathematically, a causal model consists of functional relationships between variables. For instance, consider the system of structural equations below:



$$
\begin{aligned}
x &= f_x(\epsilon_{x})\\
y &= f_y(x, \epsilon_{y})
\end{aligned}
$$



This means that $x$ functionally determines the value of $y$ (if you intervene on $x$, this changes the values of $y$) but not the other way around. Graphically, this is usually represented by $x \rightarrow y$, which means that $x$ enters the structural equation of $y$. As an addendum, you can also express a causal model in terms of joint distributions of counterfactual variables, which is mathematically equivalent to functional models.
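The asymmetry described above can be sketched with a small simulation. This is only an illustration under assumed functional forms: the helper `simulate_scm`, the choice $f_y(x, \epsilon_y) = 2x + \epsilon_y$, and the standard normal noise terms are hypothetical and not part of the original answer.

```r
# Minimal sketch of the structural model  x = f_x(eps_x),  y = f_y(x, eps_y).
# The functional forms and the helper name `simulate_scm` are illustrative assumptions.
set.seed(1)

simulate_scm <- function(n, do_x = NULL, do_y = NULL) {
  eps_x <- rnorm(n)
  eps_y <- rnorm(n)
  # Structural equation for x, unless an intervention overrides it
  x <- if (is.null(do_x)) eps_x else rep(do_x, n)
  # Structural equation for y: x enters it, so y responds to x
  y <- if (is.null(do_y)) 2 * x + eps_y else rep(do_y, n)
  data.frame(x = x, y = y)
}

n <- 10000
obs    <- simulate_scm(n)            # no intervention
do_x_3 <- simulate_scm(n, do_x = 3)  # intervention do(x = 3)
do_y_3 <- simulate_scm(n, do_y = 3)  # intervention do(y = 3)

mean_y_obs  <- mean(obs$y)     # close to 0
mean_y_do_x <- mean(do_x_3$y)  # close to 6: y responds to do(x)
mean_x_obs  <- mean(obs$x)     # close to 0
mean_x_do_y <- mean(do_y_3$x)  # still close to 0: x ignores do(y)
```

Setting $x$ by intervention shifts the distribution of $y$, while setting $y$ leaves $x$ untouched, which is what the arrow $x \rightarrow y$ encodes.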




> Given a sample from the joint distribution of two random variables X and Y, when would we say X causes Y?




Sometimes (or most of the time) you do not know the shape of the structural equations $f_{x}$, $f_y$, nor even whether $x \rightarrow y$ or $y \rightarrow x$. The only information you have is the joint probability distribution $p(y,x)$ (or samples from this distribution).



This leads to your question: when can I recover the direction of causality just from the data? Or, more precisely, when can I recover whether $x$ enters the structural equation of $y$ or vice-versa, just from the data?



Of course, without any fundamentally untestable assumptions about the causal model, this is impossible. The problem is that several different causal models can entail the same joint probability distribution of observed variables. The most common example is a causal linear system with Gaussian noise.
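To make the linear Gaussian case concrete, here is a short worked sketch. The standardized cause, the coefficient $b$, and the noise variance $\sigma^2$ are assumptions introduced for illustration; they do not appear in the original answer.

$$
\begin{aligned}
\text{Forward model: } & x \sim N(0,1),\quad y = b\,x + \epsilon_y,\quad \epsilon_y \sim N(0,\sigma^2),\quad \epsilon_y \perp x \\
\text{Implied joint: } & (x,y) \sim N\!\left(0,\ \begin{pmatrix} 1 & b \\ b & b^2+\sigma^2 \end{pmatrix}\right) \\
\text{Backward model: } & y \sim N(0,\, b^2+\sigma^2),\quad x = \tilde b\, y + \tilde\epsilon_x,\quad \tilde b = \frac{b}{b^2+\sigma^2},\quad \tilde\epsilon_x \sim N\!\left(0,\ \frac{\sigma^2}{b^2+\sigma^2}\right),\quad \tilde\epsilon_x \perp y
\end{aligned}
$$

The backward model reproduces the same covariance matrix, since $\tilde b^2 (b^2+\sigma^2) + \sigma^2/(b^2+\sigma^2) = 1$ and $\tilde b\,(b^2+\sigma^2) = b$. Both directions therefore describe the same joint distribution with independent Gaussian noise, and the data alone cannot distinguish them.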



But under some causal assumptions, this might be possible, and this is what the causal discovery literature works on. If you have no prior exposure to this topic, you might want to start with Elements of Causal Inference by Peters, Janzing and Schölkopf, as well as chapter 2 of Causality by Judea Pearl. We have a topic here on CV for references on causal discovery, but we don't have that many references listed there yet.



Therefore, there isn't just one answer to your question, since it depends on the assumptions one makes. The paper you mention cites some examples, such as assuming a linear model with non-Gaussian noise. This case is known as LiNGAM (short for linear non-Gaussian acyclic model). Here is an example in R:



    library(pcalg)
    set.seed(1234)
    n <- 500

    # Two non-Gaussian noise terms
    eps1 <- sign(rnorm(n)) * sqrt(abs(rnorm(n)))
    eps2 <- runif(n) - 0.5

    # True model: x2 causes x1
    x2 <- 3 + eps2
    x1 <- 0.9*x2 + 7 + eps1

    # Run LiNGAM on the observed data
    X <- cbind(x1, x2)
    res <- lingam(X)
    as(res, "amat")

    # Adjacency Matrix 'amat' (2 x 2) of type ‘pag’:
    #      [,1] [,2]
    # [1,]    .    .
    # [2,] TRUE    .


Notice that here we have a linear causal model with non-Gaussian noise in which $x_2$ causes $x_1$, and lingam correctly recovers the causal direction. However, this depends critically on the LiNGAM assumptions.
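For some intuition about why non-Gaussian noise helps, the following base-R sketch regresses in both directions and checks whether the residuals still depend on the regressor. This is not the LiNGAM algorithm itself (which, as far as I know, relies on independent component analysis); the uniform distributions and the crude dependence measure (correlation of squared values) are assumptions chosen only for illustration.

```r
# Sketch: in a linear model with non-Gaussian noise, residuals are independent
# of the regressor only in the causal direction. Setup is an illustrative assumption.
set.seed(42)
n   <- 5000
x   <- runif(n, -1, 1)   # cause, uniform (non-Gaussian)
eps <- runif(n, -1, 1)   # noise, uniform and independent of x
y   <- x + eps           # true model: x causes y

# Fit both directions by ordinary least squares
res_forward  <- resid(lm(y ~ x))  # residuals of the causal direction
res_backward <- resid(lm(x ~ y))  # residuals of the anticausal direction

# Crude dependence check: least-squares residuals are always uncorrelated with
# the regressor, so compare squared values to pick up higher-order dependence
dep_forward  <- cor(res_forward^2,  x^2)  # near zero
dep_backward <- cor(res_backward^2, y^2)  # clearly nonzero
```

In the causal direction the residual is essentially the noise term, so the check finds nothing. In the reverse direction the residual is uncorrelated with $y$ but not independent of it, and the squared values reveal that. With Gaussian noise both checks would come out near zero, which matches the non-identifiability of the linear Gaussian case.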



For the case of the paper you cite, they make this specific assumption (see their "postulate"):



> If $x \rightarrow y$, the minimal description length of the mechanism mapping X to Y is independent of the value of X, whereas the minimal description length of the mechanism mapping Y to X is dependent on the value of Y.



Note that this is an assumption. It is what we would call their "identification condition". Essentially, the postulate imposes restrictions on the joint distribution $p(x,y)$. That is, the postulate says that if $x \rightarrow y$ certain restrictions hold in the data, and if $y \rightarrow x$ other restrictions hold. These types of restrictions, which have testable implications (they impose constraints on $p(y,x)$), are what allow one to recover the direction from observational data.



As a final remark, causal discovery results are still very limited and depend on strong assumptions, so be careful when applying them in real-world contexts.

























  • 1




    Is there a chance you could augment your answer to include some simple examples with fake data, please? For example, having read a bit of Elements of Causal Inference and viewed some of Peters' lectures, I see that a regression framework is commonly used to motivate the need for understanding the problem in detail (I am not even touching on their ICP work). I have the (maybe mistaken) impression that in your effort to move away from the RCM, your answers leave out all the actual tangible modelling machinery.
    – usεr11852
    Dec 8 at 23:09






  • 1




    @usεr11852 I'm not sure I understand the context of your questions, do you want examples of causal discovery? There are several examples in the very paper Jane has provided. Also, I'm not sure I understand what you mean by "avoiding RCM and leaving out actual tangible modeling machinery", what tangible machinery are we missing in the causal discovery context here?
    – Carlos Cinelli
    Dec 8 at 23:16








  • 1




    Apologies for the confusion, I do not care about examples from papers. I can cite other papers myself (for example, Lopez-Paz et al., CVPR 2017, about their neural causation coefficient). What I care about is a simple numerical example with fake data that someone can run in R (or your favourite language) to see what you mean. If you cite, for example, the book by Peters et al., they have small code snippets that are hugely helpful (and occasionally use just lm). We cannot all work around the Tuebingen datasets observational samples to get an idea of causal discovery! :)
    – usεr11852
    Dec 8 at 23:30






  • 1




    @usεr11852 sure, including a fake example is trivial, I can include one using lingam in R. But would you care to explain what you meant by "avoiding RCM and leaving out actual tangible modeling machinery"?
    – Carlos Cinelli
    Dec 8 at 23:50






  • 2




    @usεr11852 ok thanks for the feedback, I will try to include more code when appropriate. As a final remark, causal discovery results are still very limited, so people need to be very careful when applying these depending on context.
    – Carlos Cinelli
    Dec 9 at 0:07



















4














There are a variety of approaches to formalizing causality (which is in keeping with substantial philosophical disagreement about causality that has been around for centuries). A popular one is in terms of potential outcomes. The potential-outcomes approach, called the Rubin causal model, supposes that for each causal state of affairs, there's a different random variable. So, $Y_1$ might be the random variable of possible outcomes from a clinical trial if a subject takes the study drug, and $Y_2$ might be the random variable if he takes the placebo. The causal effect is the difference between $Y_1$ and $Y_2$. If in fact $Y_1 = Y_2$, we could say that the treatment has no effect. Otherwise, we could say that the treatment condition causes the outcome.
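A small simulation may make the potential-outcomes idea more tangible. The sketch below is hypothetical: the outcome distributions, the average effect of 2, and the coin-flip assignment are all assumptions chosen for illustration. Each subject has both potential outcomes, but only the one matching the assigned condition is observed.

```r
# Sketch of the potential-outcomes setup with a randomized treatment.
# All distributions and the effect size are illustrative assumptions.
set.seed(7)
n <- 10000

Y2 <- rnorm(n, mean = 0, sd = 1)          # potential outcome under placebo
Y1 <- Y2 + rnorm(n, mean = 2, sd = 0.5)   # potential outcome under the drug

true_ate <- mean(Y1 - Y2)                 # average causal effect, about 2

# Randomized assignment: each subject reveals only one potential outcome
treated    <- rbinom(n, size = 1, prob = 0.5)
Y_observed <- ifelse(treated == 1, Y1, Y2)

# Under randomization, the difference in observed group means estimates the average effect
est_ate <- mean(Y_observed[treated == 1]) - mean(Y_observed[treated == 0])
```

The individual effects $Y_1 - Y_2$ are never observed for any single subject; randomization is what lets the difference in group means stand in for their average.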



Causal relationships between variables can also be represented with directed acyclic graphs, which have a very different flavor but turn out to be mathematically equivalent to the Rubin model (Wasserman, 2004, section 17.8).



Wasserman, L. (2004). All of statistics: A concise course in statistical inference. New York, NY: Springer. ISBN 978-0-387-40272-7.





























  • Thank you. What would be a test for it, given a set of samples from the joint distribution?
    – Jane
    Dec 8 at 15:33






  • 3




    I am reading arxiv.org/abs/1804.04622. I haven't read its references. I am trying to understand what one means by causality based on observational data.
    – Jane
    Dec 8 at 16:30








  • 1




    I'm sorry (-1), this is not what is being asked, you don't observe $Y_1$ nor $Y_2$, you observe a sample of factual variables $X$, $Y$. See the paper Jane has linked.
    – Carlos Cinelli
    Dec 8 at 21:29






  • 2




    @Vimal: I understand the case where we have "interventional distributions". We don't have "interventional distributions" in this setting, and that is what makes it harder to understand. In the motivating example in the paper they give something like $(x, y=x^3+\epsilon)$. The conditional distribution of $y$ given $x$ is essentially the distribution of the noise $\epsilon$ plus some translation, while that doesn't hold for the conditional distribution of $x$ given $y$. I intuitively understand the example. I am trying to understand what the general definition is for observational discovery of causality.
    – Jane
    Dec 8 at 21:49








  • 2




    @Jane for the observational case (for your question), in general you cannot infer the direction of causality purely mathematically, at least for the two-variable case. For more variables, under additional (untestable) assumptions you could make a claim, but the conclusion can still be questioned. This discussion is getting very long for comments. :)
    – Vimal
    Dec 8 at 21:53





















0














There are two ways to determine whether $X$ is the cause of $Y$. The first is standard while the second is my own claim.




  1. There exists an intervention on $X$ such that the value of $Y$ is changed


An intervention is a surgical change to a variable that does not affect variables it depends on. Interventions have been formalized rigorously in structural equations and causal graphical models, but as far as I know, there is no definition which is independent of a particular model class.




  2. The simulation of $Y$ requires the simulation of $X$


To make this rigorous requires formalizing a model over $X$ and $Y$, and in particular the semantics which define how it is simulated.



In modern approaches to causation, intervention is taken as the primitive object which defines causal relationships (definition 1). In my opinion, however, intervention is a reflection of, and necessarily consistent with, simulation dynamics.



























    Your Answer





    StackExchange.ifUsing("editor", function () {
    return StackExchange.using("mathjaxEditing", function () {
    StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
    StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
    });
    });
    }, "mathjax-editing");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "65"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f380962%2fhow-is-causation-defined-mathematically%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    3 Answers
    3






    active

    oldest

    votes








    3 Answers
    3






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    9















    What is the mathematical definition of a causal relationship between
    two random variables?




    Mathematically, a causal model consists of functional relationships between variables. For instance, consider the system of structural equations below:



    $$
    x = f_x(epsilon_{x})\
    y = f_y(x, epsilon_{y})
    $$



    This means that $x$ functionally determines the value of $y$ (if you intervene on $x$ this changes the values of $y$) but not the other way around. Graphically, this is usually represented by $x rightarrow y$, which means that $x$ enters the structural equation of y. As an addendum, you can also express a causal model in terms of joint distributions of counterfactual variables, which is mathematically equivalent to functional models.




    Given a sample from the joint distribution of two random variables X
    and Y, when would we say X causes Y?




    Sometimes (or most of the times) you do not have knowledge about the shape of the structural equations $f_{x}$, $f_y$, nor even whether $xrightarrow y$ or $y rightarrow x$. The only information you have is the joint probability distribution $p(y,x)$ (or samples from this distribution).



    This leads to your question: when can I recover the direction of causality just from the data? Or, more precisely, when can I recover whether $x$ enters the structural equation of $y$ or vice-versa, just from the data?



    Of course, without any fundamentally untestable assumptions about the causal model, this is impossible. The problem is that several different causal models can entail the same joint probability distribution of observed variables. The most common example is a causal linear system with gaussian noise.



    But under some causal assumptions, this might be possible---and this is what the causal discovery literature works on. If you have no prior exposure to this topic, you might want to start from Elements of Causal Inference by Peters, Janzing and Scholkopf, as well as chapter 2 from Causality by Judea Pearl. We have a topic here on CV for references on causal discovery, but we don't have that many references listed there yet.



    Therefore, there isn't just one answer to your question, since it depends on the assumptions one makes. The paper you mention cites some examples, such as assuming a linear model with non-gaussian noise. This case is known as LINGAN (short for linear non-gaussian acyclic model), here is an example in R:



    library(pcalg)
    set.seed(1234)
    n <- 500
    eps1 <- sign(rnorm(n)) * sqrt(abs(rnorm(n)))
    eps2 <- runif(n) - 0.5
    x2 <- 3 + eps2
    x1 <- 0.9*x2 + 7 + eps1

    # runs lingam
    X <- cbind(x1, x2)
    res <- lingam(X)
    as(res, "amat")

    # Adjacency Matrix 'amat' (2 x 2) of type ‘pag’:
    # [,1] [,2]
    # [1,] . .
    # [2,] TRUE .


    Notice here we have a linear causal model with non-gaussian noise where $x_2$ causes $x_1$ and lingam correctly recovers the causal direction. However, notice this depends critically on the LINGAM assumptions.



    For the case of the paper you cite, they make this specific assumption (see their "postulate"):



    If $xrightarrow y$ , the minimal description length of the mechanism mapping X to Y is independent of the value of X, whereas the minimal description length of the mechanism mapping Y to X is dependent on the value of Y.



    Note this is an assumption. This is what we would call their "identification condition". Essentially, the postulate imposes restrictions on the joint distribution $p(x,y)$. That is, the postulate says that if $x rightarrow y$ certain restrictions holds in the data, and if $y rightarrow x$ other restrictions hold. These types of restrictions that have testable implications (impose constraints on $p(y,x)$) is what allows one to recover directionally from observational data.



    As a final remark, causal discovery results are still very limited, and depend on strong assumptions, be careful when applying these on real world context.






    share|cite|improve this answer



















    • 1




      Is there a chance you augment your answer to somehow include some simple examples with fake data please? For example, having read a bit of Elements of Causal Inference and viewed some of Peters' lectures, and a regression framework is commonly used to motivate the need for understanding the problem in detail (I am not even touching on their ICP work). I have the (maybe mistaken) impression that in your effort to move away from the RCM, your answers leave out all the actual tangible modelling machinery.
      – usεr11852
      Dec 8 at 23:09






    • 1




      @usεr11852 I'm not sure I understand the context of your questions, do you want examples of causal discovery? There are several examples in the very paper Jane has provided. Also, I'm not sure I understand what you mean by "avoiding RCM and leaving out actual tangible modeling machinery", what tangible machinery are we missing in the causal discovery context here?
      – Carlos Cinelli
      Dec 8 at 23:16








    • 1




      Apologies for the confusion, I do not care about examples from papers. I can cite other papers myself. (For example, Lopez-Paz et al. CVPR 2017 about their neural causation coefficient) What I care is for a simple numerical example with fake data that someone run in R (or your favourite language) and see what you mean. If you cite for example Peters' et al. book and they have small code snippets that hugely helpful (and occasionally use just lm) . We cannot all work around the Tuebingen datasets observational samples to get an idea of causal discovery! :)
      – usεr11852
      Dec 8 at 23:30






    • 1




      @usεr11852 sure, including a fake example is trivial, I can include one using lingam in R. But would you care to explain what you meant by "avoiding RCM and leaving out actual tangible modeling machinery"?
      – Carlos Cinelli
      Dec 8 at 23:50






    • 2




      @usεr11852 ok thanks for the feedback, I will try to include more code when appropriate. As a final remark, causal discovery results are still very limited, so people need to be very careful when applying these depending on context.
      – Carlos Cinelli
      Dec 9 at 0:07
















    9















    What is the mathematical definition of a causal relationship between
    two random variables?




    Mathematically, a causal model consists of functional relationships between variables. For instance, consider the system of structural equations below:



    $$
    x = f_x(epsilon_{x})\
    y = f_y(x, epsilon_{y})
    $$



    This means that $x$ functionally determines the value of $y$ (if you intervene on $x$ this changes the values of $y$) but not the other way around. Graphically, this is usually represented by $x rightarrow y$, which means that $x$ enters the structural equation of y. As an addendum, you can also express a causal model in terms of joint distributions of counterfactual variables, which is mathematically equivalent to functional models.




    Given a sample from the joint distribution of two random variables X
    and Y, when would we say X causes Y?




    Sometimes (or most of the times) you do not have knowledge about the shape of the structural equations $f_{x}$, $f_y$, nor even whether $xrightarrow y$ or $y rightarrow x$. The only information you have is the joint probability distribution $p(y,x)$ (or samples from this distribution).



    This leads to your question: when can I recover the direction of causality just from the data? Or, more precisely, when can I recover whether $x$ enters the structural equation of $y$ or vice-versa, just from the data?



    Of course, without any fundamentally untestable assumptions about the causal model, this is impossible. The problem is that several different causal models can entail the same joint probability distribution of observed variables. The most common example is a causal linear system with gaussian noise.



    But under some causal assumptions, this might be possible---and this is what the causal discovery literature works on. If you have no prior exposure to this topic, you might want to start from Elements of Causal Inference by Peters, Janzing and Scholkopf, as well as chapter 2 from Causality by Judea Pearl. We have a topic here on CV for references on causal discovery, but we don't have that many references listed there yet.



    Therefore, there isn't just one answer to your question, since it depends on the assumptions one makes. The paper you mention cites some examples, such as assuming a linear model with non-gaussian noise. This case is known as LINGAN (short for linear non-gaussian acyclic model), here is an example in R:



    library(pcalg)
    set.seed(1234)
    n <- 500
    eps1 <- sign(rnorm(n)) * sqrt(abs(rnorm(n)))
    eps2 <- runif(n) - 0.5
    x2 <- 3 + eps2
    x1 <- 0.9*x2 + 7 + eps1

    # runs lingam
    X <- cbind(x1, x2)
    res <- lingam(X)
    as(res, "amat")

    # Adjacency Matrix 'amat' (2 x 2) of type ‘pag’:
    # [,1] [,2]
    # [1,] . .
    # [2,] TRUE .


    Notice here we have a linear causal model with non-gaussian noise where $x_2$ causes $x_1$ and lingam correctly recovers the causal direction. However, notice this depends critically on the LINGAM assumptions.



    For the case of the paper you cite, they make this specific assumption (see their "postulate"):



    If $xrightarrow y$ , the minimal description length of the mechanism mapping X to Y is independent of the value of X, whereas the minimal description length of the mechanism mapping Y to X is dependent on the value of Y.



    Note this is an assumption. This is what we would call their "identification condition". Essentially, the postulate imposes restrictions on the joint distribution $p(x,y)$. That is, the postulate says that if $x rightarrow y$ certain restrictions holds in the data, and if $y rightarrow x$ other restrictions hold. These types of restrictions that have testable implications (impose constraints on $p(y,x)$) is what allows one to recover directionally from observational data.



    As a final remark, causal discovery results are still very limited, and depend on strong assumptions, be careful when applying these on real world context.






    share|cite|improve this answer



















    • 1




      Is there a chance you augment your answer to somehow include some simple examples with fake data please? For example, having read a bit of Elements of Causal Inference and viewed some of Peters' lectures, and a regression framework is commonly used to motivate the need for understanding the problem in detail (I am not even touching on their ICP work). I have the (maybe mistaken) impression that in your effort to move away from the RCM, your answers leave out all the actual tangible modelling machinery.
      – usεr11852
      Dec 8 at 23:09






    • 1




      @usεr11852 I'm not sure I understand the context of your questions, do you want examples of causal discovery? There are several examples in the very paper Jane has provided. Also, I'm not sure I understand what you mean by "avoiding RCM and leaving out actual tangible modeling machinery", what tangible machinery are we missing in the causal discovery context here?
      – Carlos Cinelli
      Dec 8 at 23:16








    • 1




      Apologies for the confusion, I do not care about examples from papers. I can cite other papers myself. (For example, Lopez-Paz et al. CVPR 2017 about their neural causation coefficient) What I care is for a simple numerical example with fake data that someone run in R (or your favourite language) and see what you mean. If you cite for example Peters' et al. book and they have small code snippets that hugely helpful (and occasionally use just lm) . We cannot all work around the Tuebingen datasets observational samples to get an idea of causal discovery! :)
      – usεr11852
      Dec 8 at 23:30






    • 1




      @usεr11852 sure, including a fake example is trivial, I can include one using lingam in R. But would you care to explain what you meant by "avoiding RCM and leaving out actual tangible modeling machinery"?
      – Carlos Cinelli
      Dec 8 at 23:50






    • 2




      @usεr11852 ok thanks for the feedback, I will try to include more code when appropriate. As a final remark, causal discovery results are still very limited, so people need to be very careful when applying these depending on context.
      – Carlos Cinelli
      Dec 9 at 0:07














    9












    9








    9







    What is the mathematical definition of a causal relationship between
    two random variables?




    Mathematically, a causal model consists of functional relationships between variables. For instance, consider the system of structural equations below:



    $$
    x = f_x(epsilon_{x})\
    y = f_y(x, epsilon_{y})
    $$



    This means that $x$ functionally determines the value of $y$ (if you intervene on $x$ this changes the values of $y$) but not the other way around. Graphically, this is usually represented by $x rightarrow y$, which means that $x$ enters the structural equation of y. As an addendum, you can also express a causal model in terms of joint distributions of counterfactual variables, which is mathematically equivalent to functional models.




    Given a sample from the joint distribution of two random variables X
    and Y, when would we say X causes Y?




    Sometimes (or most of the times) you do not have knowledge about the shape of the structural equations $f_{x}$, $f_y$, nor even whether $xrightarrow y$ or $y rightarrow x$. The only information you have is the joint probability distribution $p(y,x)$ (or samples from this distribution).



    This leads to your question: when can I recover the direction of causality just from the data? Or, more precisely, when can I recover whether $x$ enters the structural equation of $y$ or vice-versa, just from the data?



    Of course, without any fundamentally untestable assumptions about the causal model, this is impossible. The problem is that several different causal models can entail the same joint probability distribution of observed variables. The most common example is a causal linear system with gaussian noise.



    What is the mathematical definition of a causal relationship between
    two random variables?




    Mathematically, a causal model consists of functional relationships between variables. For instance, consider the system of structural equations below:



    $$
    \begin{aligned}
    x &= f_x(\epsilon_{x}) \\
    y &= f_y(x, \epsilon_{y})
    \end{aligned}
    $$



    This means that $x$ functionally determines the value of $y$ (if you intervene on $x$, this changes the values of $y$) but not the other way around. Graphically, this is usually represented by $x \rightarrow y$, which means that $x$ enters the structural equation of $y$. As an addendum, you can also express a causal model in terms of joint distributions of counterfactual variables, which is mathematically equivalent to functional models.
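To make the asymmetry concrete, here is a minimal simulation sketch. The specific functions (an identity $f_x$ and a linear $f_y$ with slope 2) and the helper name `simulate_scm` are assumptions chosen purely for illustration; they do not come from the question or the paper.

```r
# Hypothetical structural causal model (illustrative choices, not from the source):
#   x = f_x(eps_x)    = eps_x
#   y = f_y(x, eps_y) = 2 * x + eps_y
simulate_scm <- function(n, do_x = NULL, do_y = NULL) {
  eps_x <- rnorm(n)
  eps_y <- rnorm(n)
  # An intervention replaces a variable's structural equation with a constant.
  x <- if (is.null(do_x)) eps_x else rep(do_x, n)
  y <- if (is.null(do_y)) 2 * x + eps_y else rep(do_y, n)
  data.frame(x = x, y = y)
}

set.seed(1)
n <- 10000
obs <- simulate_scm(n)             # no intervention
dox <- simulate_scm(n, do_x = 3)   # force x to 3 for every unit
doy <- simulate_scm(n, do_y = 3)   # force y to 3 for every unit

mean_y_obs <- mean(obs$y)
mean_y_dox <- mean(dox$y)   # shifts, because x enters the equation of y
mean_x_obs <- mean(obs$x)
mean_x_doy <- mean(doy$x)   # does not shift, because y does not enter the equation of x
```

In this sketch, intervening on `x` moves the mean of `y` to roughly 6, while intervening on `y` leaves the distribution of `x` as it was.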




    Given a sample from the joint distribution of two random variables X
    and Y, when would we say X causes Y?




    Often (perhaps most of the time) you do not know the shape of the structural equations $f_{x}$, $f_y$, nor even whether $x \rightarrow y$ or $y \rightarrow x$. The only information you have is the joint probability distribution $p(y,x)$ (or samples from this distribution).



    This leads to your question: when can I recover the direction of causality just from the data? Or, more precisely, when can I recover whether $x$ enters the structural equation of $y$ or vice-versa, just from the data?



    Of course, without any fundamentally untestable assumptions about the causal model, this is impossible. The problem is that several different causal models can entail the same joint probability distribution of observed variables. The most common example is a causal linear system with Gaussian noise.
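To see why the linear Gaussian case cannot be resolved from data alone, consider this worked sketch. The coefficient $b$ and the noise variance $\sigma^2$ are illustrative symbols, not taken from the paper, and the noise terms within each model are assumed independent. Model A has $x \rightarrow y$:

$$
x = \epsilon_x, \qquad y = b\,x + \epsilon_y, \qquad \epsilon_x \sim N(0, 1), \quad \epsilon_y \sim N(0, \sigma^2).
$$

Model B has $y \rightarrow x$:

$$
y = \eta_y, \qquad x = c\,y + \eta_x, \qquad \eta_y \sim N(0,\, b^2 + \sigma^2), \quad \eta_x \sim N\!\left(0,\, \frac{\sigma^2}{b^2 + \sigma^2}\right), \quad c = \frac{b}{b^2 + \sigma^2}.
$$

Both models imply a zero-mean bivariate Gaussian with the same second moments:

$$
\mathrm{Var}(y) = b^2 + \sigma^2, \qquad \mathrm{Cov}(x, y) = c\,(b^2 + \sigma^2) = b, \qquad \mathrm{Var}(x) = c^2 (b^2 + \sigma^2) + \frac{\sigma^2}{b^2 + \sigma^2} = 1.
$$

Since a Gaussian is fully determined by its mean and covariance, the two models entail exactly the same $p(x, y)$, so no amount of observational data can tell them apart.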



    But under some causal assumptions this might be possible, and this is what the causal discovery literature works on. If you have no prior exposure to this topic, you might want to start with Elements of Causal Inference by Peters, Janzing and Schölkopf, as well as chapter 2 of Causality by Judea Pearl. We have a topic here on CV for references on causal discovery, but we don't have that many references listed there yet.



    Therefore, there isn't just one answer to your question, since it depends on the assumptions one makes. The paper you mention cites some examples, such as assuming a linear model with non-Gaussian noise. This case is known as LiNGAM (short for linear non-Gaussian acyclic model). Here is an example in R:



    library(pcalg)  # provides lingam()
    set.seed(1234)
    n <- 500
    # two independent non-Gaussian noise terms
    eps1 <- sign(rnorm(n)) * sqrt(abs(rnorm(n)))
    eps2 <- runif(n) - 0.5
    # true model: x2 causes x1
    x2 <- 3 + eps2
    x1 <- 0.9*x2 + 7 + eps1

    # runs lingam
    X <- cbind(x1, x2)
    res <- lingam(X)
    as(res, "amat")

    # Adjacency Matrix 'amat' (2 x 2) of type ‘pag’:
    #      [,1] [,2]
    # [1,] .    .
    # [2,] TRUE .


    Notice that here we have a linear causal model with non-Gaussian noise in which $x_2$ causes $x_1$, and lingam correctly recovers the causal direction. However, this depends critically on the LiNGAM assumptions.
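For some intuition about what makes the direction recoverable, here is a crude base-R sketch that uses only `lm`. It is not the LiNGAM algorithm itself, and the helper `dep_stat` and the choice of a cubed regressor as a dependence probe are assumptions made for illustration. The idea is that in the causal direction the regression residual is independent of the regressor, whereas in the anticausal direction it is merely uncorrelated with it, so a higher-order statistic can pick up the leftover dependence.

```r
set.seed(42)
n <- 100000  # large sample so the higher-order statistic is stable
eps1 <- sign(rnorm(n)) * sqrt(abs(rnorm(n)))  # same noise construction as in the example above
eps2 <- runif(n) - 0.5
x2 <- 3 + eps2              # cause
x1 <- 0.9 * x2 + 7 + eps1   # effect

# Correlation between the OLS residual and the cubed, centered regressor.
# OLS forces cor(residual, regressor) to zero in both directions, so a
# higher-order moment is needed to reveal any remaining dependence.
dep_stat <- function(response, regressor) {
  r <- resid(lm(response ~ regressor))
  z <- regressor - mean(regressor)
  cor(r, z^3)
}

stat_causal     <- dep_stat(x1, x2)  # regress effect on cause
stat_anticausal <- dep_stat(x2, x1)  # regress cause on effect
print(c(causal = stat_causal, anticausal = stat_anticausal))
```

Under these assumptions the causal statistic should sit close to zero while the anticausal one should be noticeably away from zero. With Gaussian noise both would be near zero, which is another way of seeing why the non-Gaussianity assumption matters.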



    For the case of the paper you cite, they make this specific assumption (see their "postulate"):



    If $x \rightarrow y$, the minimal description length of the mechanism mapping X to Y is independent of the value of X, whereas the minimal description length of the mechanism mapping Y to X is dependent on the value of Y.



    Note this is an assumption. This is what we would call their "identification condition". Essentially, the postulate imposes restrictions on the joint distribution $p(x,y)$. That is, the postulate says that if $x \rightarrow y$ certain restrictions hold in the data, and if $y \rightarrow x$ other restrictions hold. Restrictions of this type, which have testable implications (they impose constraints on $p(y,x)$), are what allow one to recover directionality from observational data.



    As a final remark, causal discovery results are still very limited and depend on strong assumptions, so be careful when applying them in real-world contexts.







    edited Dec 9 at 0:12
    answered Dec 8 at 22:26
    Carlos Cinelli








    • 1




      Is there a chance you could augment your answer to include some simple examples with fake data, please? For example, having read a bit of Elements of Causal Inference and viewed some of Peters' lectures, I see that a regression framework is commonly used to motivate the need for understanding the problem in detail (I am not even touching on their ICP work). I have the (maybe mistaken) impression that in your effort to move away from the RCM, your answers leave out all the actual tangible modelling machinery.
      – usεr11852
      Dec 8 at 23:09






    • 1




      @usεr11852 I'm not sure I understand the context of your questions, do you want examples of causal discovery? There are several examples in the very paper Jane has provided. Also, I'm not sure I understand what you mean by "avoiding RCM and leaving out actual tangible modeling machinery", what tangible machinery are we missing in the causal discovery context here?
      – Carlos Cinelli
      Dec 8 at 23:16








    • 1




      Apologies for the confusion, I do not care about examples from papers. I can cite other papers myself. (For example, Lopez-Paz et al. CVPR 2017 about their neural causation coefficient.) What I care about is a simple numerical example with fake data that someone can run in R (or your favourite language) to see what you mean. If you cite, for example, the book by Peters et al., they have small code snippets that are hugely helpful (and occasionally use just lm). We cannot all work around the Tuebingen datasets observational samples to get an idea of causal discovery! :)
      – usεr11852
      Dec 8 at 23:30






    • 1




      @usεr11852 sure, including a fake example is trivial, I can include one using lingam in R. But would you care to explain what you meant by "avoiding RCM and leaving out actual tangible modeling machinery"?
      – Carlos Cinelli
      Dec 8 at 23:50






    • 2




      @usεr11852 ok thanks for the feedback, I will try to include more code when appropriate. As a final remark, causal discovery results are still very limited, so people need to be very careful when applying these depending on context.
      – Carlos Cinelli
      Dec 9 at 0:07














    4














    There are a variety of approaches to formalizing causality (which is in keeping with substantial philosophical disagreement about causality that has been around for centuries). A popular one is in terms of potential outcomes. The potential-outcomes approach, called the Rubin causal model, supposes that for each causal state of affairs, there's a different random variable. So, $Y_1$ might be the random variable of possible outcomes from a clinical trial if a subject takes the study drug, and $Y_2$ might be the random variable if he takes the placebo. The causal effect is the difference between $Y_1$ and $Y_2$. If in fact $Y_1 = Y_2$, we could say that the treatment has no effect. Otherwise, we could say that the treatment condition causes the outcome.
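A small simulation sketch may help fix ideas. The outcome distributions, the sample size, and the randomized assignment below are assumptions made for illustration only. Both potential outcomes are generated for every subject, which is possible only in a simulation; in real data just one of them is ever observed per subject.

```r
set.seed(7)
n <- 20000
# Hypothetical potential outcomes (never jointly observed in practice):
y1 <- rnorm(n, mean = 1)   # outcome if the subject takes the study drug
y2 <- rnorm(n, mean = 0)   # outcome if the subject takes the placebo
true_ate <- mean(y1 - y2)  # average causal effect in this simulated population

# Randomized assignment reveals exactly one potential outcome per subject.
treated <- rbinom(n, 1, 0.5) == 1
y_obs <- ifelse(treated, y1, y2)

# Under randomization, the difference in observed group means estimates
# the average causal effect.
est_ate <- mean(y_obs[treated]) - mean(y_obs[!treated])
print(c(true = true_ate, estimated = est_ate))
```

The estimate is trustworthy here only because assignment is randomized. With purely observational data, as in the question, the same difference in means need not have a causal interpretation.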



    Causal relationships between variables can also be represented with directed acyclic graphs, which have a very different flavor but turn out to be mathematically equivalent to the Rubin model (Wasserman, 2004, section 17.8).



    Wasserman, L. (2004). All of statistics: A concise course in statistical inference. New York, NY: Springer. ISBN 978-0-387-40272-7.







    edited Dec 11 at 19:22
    answered Dec 8 at 14:54
    Kodiologist












    • thank you. what would be a test for it given a set of samples from joint distribution?
      – Jane
      Dec 8 at 15:33






    • 3




      I am reading arxiv.org/abs/1804.04622. I haven't read its references. I am trying to understand what one means by causality based on observational data.
      – Jane
      Dec 8 at 16:30








    • 1




      I'm sorry (-1), this is not what is being asked, you don't observe $Y_1$ nor $Y_2$, you observe a sample of factual variables $X$, $Y$. See the paper Jane has linked.
      – Carlos Cinelli
      Dec 8 at 21:29






    • 2




      @Vimal: I understand the case where we have "interventional distributions". We don't have "interventional distributions" in this setting and that is what makes it harder to understand. In the motivating example in the paper they give something like $(x, y=x^3+\epsilon)$. The conditional distribution of $y$ given $x$ is essentially the distribution of the noise $\epsilon$ plus some translation, while that doesn't hold for the conditional distribution of $x$ given $y$. I intuitively understand the example. I am trying to understand what the general definition is for observational discovery of causality.
      – Jane
      Dec 8 at 21:49








    • 2




      @Jane for observational case (for your question), in general you cannot infer direction of causality purely mathematically, at least for the two variable case. For more variables, under additional (untestable) assumptions you could make a claim, but the conclusion can still be questioned. This discussion is very long in comments. :)
      – Vimal
      Dec 8 at 21:53




















    0














        There are two ways to determine whether $X$ is the cause of $Y$. The first is standard while the second is my own claim.




        1. There exists an intervention on $X$ such that the value of $Y$ is changed


        An intervention is a surgical change to a variable that does not affect variables it depends on. Interventions have been formalized rigorously in structural equations and causal graphical models, but as far as I know, there is no definition which is independent of a particular model class.




        2. The simulation of $Y$ requires the simulation of $X$


        To make this rigorous requires formalizing a model over $X$ and $Y$, and in particular the semantics which define how it is simulated.



        In modern approaches to causation, intervention is taken as the primitive object which defines causal relationships (definition 1). In my opinion, however, intervention is a reflection of, and necessarily consistent with simulation dynamics.







        share|cite|improve this answer












        share|cite|improve this answer



        share|cite|improve this answer










        answered Dec 12 at 14:13









        zenna






























