How to find the conditional CDF based on observed data in R [on hold]









































Suppose we have two samples whose distributions are, in general, unknown; for example, $X \sim N(0,1)$ and $Y \mid X \sim N(X, X^2/2)$. Can we recover the conditional CDF of $Y \mid X$ from the observed samples in R?

    n <- 1000
    x <- rnorm(n)
    y <- rnorm(n, x, x^2/2)  # note: rnorm()'s third argument is the standard deviation
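For checking any estimate against the truth, it may help to note that, with the simulation exactly as coded, the conditional CDF is available in closed form: given $X = x$, the draw is normal with mean $x$ and standard deviation $x^2/2$. A minimal sketch follows; the helper name `true_cond_cdf` is hypothetical and introduced here only for illustration.

```r
# True conditional CDF of Y given X = x for the simulation as coded:
# Y | X = x is normal with mean x and standard deviation x^2 / 2.
# (true_cond_cdf is a hypothetical helper name, not part of any package.)
true_cond_cdf <- function(y, x) {
  pnorm(y, mean = x, sd = x^2 / 2)
}

# P(Y <= 1 | X = 1): the conditional median equals x, so this is 0.5
true_cond_cdf(1, x = 1)

# P(Y <= 1.5 | X = 1): one conditional standard deviation (0.5) above the mean
true_cond_cdf(1.5, x = 1)
```

If the second parameter in $N(X, X^2/2)$ is meant as a variance rather than a standard deviation, the `sd` argument would instead be `abs(x) / sqrt(2)`.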






r distributions conditional-probability cdf
















asked Apr 19 at 4:43 by J.Mike




put on hold as off-topic by Nick Cox, Peter Flom Apr 19 at 11:26


This question appears to be off-topic. The users who voted to close gave this specific reason:


  • "This question appears to be off-topic because EITHER it is not about statistics, machine learning, data analysis, data mining, or data visualization, OR it focuses on programming, debugging, or performing routine operations within a statistical computing platform. If the latter, you could try the support links we maintain." – Nick Cox, Peter Flom

If this question can be reworded to fit the rules in the help center, please edit the question.



















  • Conditional PDFs and CDFs can be estimated nonparametrically. There is supposedly at least one package available for that purpose. – Gary Moore, Apr 19 at 5:05

  • Thank you for your kind comment! Do you have a reference for the nonparametric estimation method and the R package? – J.Mike, Apr 19 at 5:26




















2 Answers























You can't determine the CDF exactly from samples, but you can easily get an empirical estimate:

    set.seed(1359)
    n <- 1000
    x <- rnorm(n)
    y <- rnorm(n, x, x^2/2)

    LM <- lm(y ~ 0 + x) # no intercept because you know this, though usually you won't
    coef(LM)

This gives me $\hat{\beta} = 1.076$, i.e. $\hat{y} = 1.076 \cdot x$ (about $1 \times x$, as you specified).

Similarly, you can get an empirical estimate of the residual standard deviation with `sd(resid(LM))`. Note that the standard deviation you supplied, $x^2/2$, varies with $x$, so this is a single pooled value.
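The slope and residual standard deviation can be combined into a plug-in estimate of the conditional CDF. The sketch below assumes normal errors with constant variance, which is a simplification: the simulated data have a standard deviation that changes with $x$, so this is only an approximation. The names `beta_hat`, `sigma_hat` and `plugin_cond_cdf` are hypothetical and chosen for illustration.

```r
set.seed(1359)
n <- 1000
x <- rnorm(n)
y <- rnorm(n, x, x^2 / 2)

LM <- lm(y ~ 0 + x)
beta_hat  <- unname(coef(LM))   # estimated slope
sigma_hat <- sd(resid(LM))      # single pooled residual standard deviation

# Plug-in estimate of P(Y <= y0 | X = x0) under a normal, constant-variance model
# (an assumption of this sketch, not a property of the simulated data)
plugin_cond_cdf <- function(y0, x0) {
  pnorm(y0, mean = beta_hat * x0, sd = sigma_hat)
}

plugin_cond_cdf(1, x0 = 1)          # plug-in estimate
pnorm(1, mean = 1, sd = 1^2 / 2)    # value under the simulated model: 0.5
```

Because the pooled standard deviation ignores how the spread depends on $x$, the plug-in curve will tend to be too flat where $|x|$ is small and too steep where $|x|$ is large.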





If you don't know anything about their distributions, you could try a non-parametric approach.
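One simple non-parametric option, sketched here in base R, is a kernel-weighted empirical CDF (a Nadaraya-Watson type estimator): observations whose $x$ lies close to the point of interest get more weight. This is an illustration rather than a recommended implementation; the function name `kernel_cond_cdf` is hypothetical and the bandwidth `h = 0.2` is an arbitrary choice that would normally need tuning.

```r
set.seed(1359)
n <- 1000
x <- rnorm(n)
y <- rnorm(n, x, x^2 / 2)

# Kernel-weighted estimate of P(Y <= y0 | X = x0):
# a weighted empirical CDF of y, with weights that decay as x moves away from x0.
# h is the bandwidth (tuning parameter); 0.2 is an arbitrary illustrative default.
kernel_cond_cdf <- function(y0, x0, x, y, h = 0.2) {
  w <- dnorm((x - x0) / h)       # Gaussian kernel weights
  sum(w * (y <= y0)) / sum(w)    # weighted share of observations with y <= y0
}

# Estimates at x0 = 1 for a few y0 values, next to the values under the simulated model
y_grid <- c(0.5, 1, 1.5)
est <- vapply(y_grid,
              function(y0) kernel_cond_cdf(y0, x0 = 1, x = x, y = y),
              numeric(1))
truth <- pnorm(y_grid, mean = 1, sd = 1^2 / 2)
round(cbind(y0 = y_grid, estimate = est, truth = truth), 3)
```

A smaller bandwidth reduces bias from mixing different values of $x$ but leaves fewer effective observations, so the estimate becomes noisier.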

















answered Apr 19 at 5:04 by Frans Rodenburg












  • Thank you for your answer! Do you have a reference for the nonparametric approach? – J.Mike, Apr 19 at 5:28





















Finding the conditional distribution of a variable $Y$ given another observed variable $X$ is the entire subject matter of regression analysis (construed in its wide sense to include linear and nonlinear regression models, GLMs, GLMMs, etc.). This is a huge subject and a core part of statistical education. If you would like to learn more about it, I would recommend starting with some material on linear regression analysis and then building up from there.

















answered Apr 19 at 10:13 by Ben














