How to find the conditional CDF based on observed data in R [on hold]












3












Suppose we have two samples whose distributions are generally not known, say $X \sim N(0,1)$ and $Y|X \sim N(X, X^2/2)$. Can we recover the conditional CDF of $Y|X$ from the observed samples in R?

n <- 1000
x <- rnorm(n)
y <- rnorm(n, x, x^2/2)  # note: the third argument of rnorm() is the standard deviation



put on hold as off-topic by Nick Cox, Peter Flom Apr 19 at 11:26


This question appears to be off-topic. The users who voted to close gave this specific reason:


  • "This question appears to be off-topic because EITHER it is not about statistics, machine learning, data analysis, data mining, or data visualization, OR it focuses on programming, debugging, or performing routine operations within a statistical computing platform. If the latter, you could try the support links we maintain." – Nick Cox, Peter Flom

If this question can be reworded to fit the rules in the help center, please edit the question.
















  • Conditional PDFs and CDFs can be estimated nonparametrically. There is supposedly at least one package available for those purposes. – Gary Moore, Apr 19 at 5:05










  • Thank you for your kind comment! Do you have a reference for the nonparametric estimation method and the R package? – J.Mike, Apr 19 at 5:26


























r distributions conditional-probability cdf
















asked Apr 19 at 4:43









J.Mike













2 Answers


















2












You can't determine the exact CDF from a finite sample, but you can easily get an empirical estimate:

set.seed(1359)
n <- 1000
x <- rnorm(n)
y <- rnorm(n, x, x^2/2)

LM <- lm(y ~ 0 + x) # no intercept because you know this, though usually you won't
coef(LM)

This gives me $\hat{\beta} = 1.076$, or $\hat{y} = 1.076 \cdot x + \epsilon$ (about $1 \times x$, as you specified).

Similarly, you can get an empirical estimate of the standard deviation you supplied with sd(resid(LM)). Note that this is a single pooled value, whereas the standard deviation you supplied, x^2/2, changes with x.

If you don't know anything about their distributions, you could try a non-parametric approach.
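To make the non-parametric route a bit more concrete, here is a minimal sketch of one common option, a kernel-weighted (Nadaraya-Watson type) empirical CDF. It is not taken from the answer above: the helper name cond_cdf, the Gaussian kernel, the default bandwidth from bw.nrd0() and the evaluation point x0 = 1 are all assumptions made for illustration.

```r
# Kernel-weighted estimate of F(y0 | X = x0).
# cond_cdf and its arguments are illustrative names, not from any package.
cond_cdf <- function(y0, x0, x, y, h = bw.nrd0(x)) {
  w <- dnorm((x - x0) / h)                 # Gaussian kernel weights around x0
  sapply(y0, function(v) sum(w * (y <= v)) / sum(w))
}

set.seed(1359)
n <- 1000
x <- rnorm(n)
y <- rnorm(n, x, x^2 / 2)                  # third argument of rnorm() is the sd

y_grid <- seq(-3, 5, length.out = 200)
F_hat  <- cond_cdf(y_grid, x0 = 1, x, y)   # estimated conditional CDF at x = 1
F_true <- pnorm(y_grid, mean = 1, sd = 1^2 / 2)  # known here only because we simulated

max_abs_err <- max(abs(F_hat - F_true))    # rough check of the fit on the grid
```

Plotting F_hat against F_true over y_grid gives a quick visual check. The bandwidth h controls the usual bias-variance trade-off, and the rule-of-thumb default used here is only a starting point. Dedicated packages for kernel conditional distribution estimation (np, for instance) also offer data-driven bandwidth selection.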













  • Thank you for your answer! Do you have a reference for the nonparametric approach? – J.Mike, Apr 19 at 5:28



















1












Finding the conditional distribution of a variable $Y$ conditional on another observed variable $X$ is the entire subject matter of regression analysis (construed in its wide sense to include linear and nonlinear regression models, GLMs, GLMMs, etc.). This is a huge subject and a core part of statistical education. If you would like to learn more about it, I would recommend starting with some material on linear regression analysis, and then building up from there.
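As a rough illustration of the regression route in its simplest form, the sketch below fits a normal linear model and plugs the fitted mean and residual standard error into pnorm() to get an estimated conditional CDF. The helper name cond_cdf_lm is hypothetical, and the model assumes normal errors with constant variance, which the simulated data in the question deliberately violates, so treat it as a starting point rather than a good fit for that example.

```r
set.seed(1359)
n <- 1000
x <- rnorm(n)
y <- rnorm(n, x, x^2 / 2)

fit   <- lm(y ~ x)
sigma <- summary(fit)$sigma      # residual standard error (constant-variance assumption)

# Plug-in estimate of F(y0 | X = x0) under a normal linear model.
# cond_cdf_lm is an illustrative helper name, not part of any package.
cond_cdf_lm <- function(y0, x0) {
  mu <- predict(fit, newdata = data.frame(x = x0))  # fitted conditional mean at x0
  pnorm(y0, mean = mu, sd = sigma)
}

p <- cond_cdf_lm(c(0, 1, 2), x0 = 1)   # estimated P(Y <= 0, 1, 2 | X = 1)
```

Richer models (for example, ones that also model the variance as a function of x) follow the same pattern: estimate the parameters of the conditional distribution, then evaluate its CDF.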






























answered Apr 19 at 5:04 by Frans Rodenburg






















answered Apr 19 at 10:13 by Ben














