A random number from any distribution?

ivan popivanov

Hello,
 
I have some data, and I want to generate random numbers following the distribution of this data (in other words, to generate a synthetic data set sharing the same stats as a given data set). Reading an old thread I found the following text:
 
>If you can compute the quantile function of the distribution (i.e., the
>inverse of the integral of the pdf), then you can use the probability
>integral transform: If U is a U(0,1) random variable and Q is the quantile
>function of the distribution F, then Q(U) is a random variable distributed
>as F.
 
That sounds good, but is there a quick way to do this in R? Let's say my data is contained in "ee", I can get the quantiles using:
 
qq = quantile(ee, probs=seq(0,1,0.25))
           0%           25%           50%           75%          100%
-0.2573385519 -0.0041451053  0.0004538924  0.0049276991  0.1037823292
 
Then I "know" how to use the above method to generate Q(U) (by looking up U in the first row, and then mapping it to a number using the second row), but is there an R function that does that? Otherwise I need to write my own function to look up the table.
 
Thanks in advance,
Ivan
     

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: A random number from any distribution?

Peter Dalgaard

Q <- approxfun(x, sort(ee)) with x = (0:(n-1))/(n-1), where n = length(ee), is your friend, I think.

Beware the details of the interpolation, though: in some variants you
end up reinventing the bootstrap. Also, the fact that your generated
variables tend to be constrained to the range of ee should at least be
noted.
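
A minimal sketch of the approxfun() suggestion, for illustration only. The data here are simulated stand-ins for the poster's "ee" (an assumption, since the real data are not shown), and the names u and synthetic are made up for the example.

```r
set.seed(1)
# Stand-in for the poster's data `ee` (assumption: simulated heavy-tailed values)
ee <- rt(500, df = 4) * 0.01
n <- length(ee)

# Empirical quantile function: linear interpolation through the order statistics
x <- (0:(n - 1)) / (n - 1)
Q <- approxfun(x, sort(ee))

# Probability integral transform: Q(U) with U ~ U(0, 1)
u <- runif(1000)
synthetic <- Q(u)

# The draws never leave the range of the original data
range(ee)
range(synthetic)
```

With the default settings, quantile(ee, probs = u, names = FALSE) interpolates the order statistics in the same way, so it should return the same values as Q(u) up to rounding.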

--
    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - ([hidden email])              FAX: (+45) 35327907

Re: A random number from any distribution?

Greg Snow-2
In reply to this post by ivan popivanov
Look at the logspline package for an alternative.
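
A possible sketch of this route, assuming the logspline package is installed: fit a smooth density with logspline() and draw from the fit with rlogspline(). The data are simulated stand-ins for "ee", and the default fitting options are used without any tuning.

```r
set.seed(2)
# Stand-in for the poster's data `ee` (assumption: simulated values)
ee <- rt(500, df = 4) * 0.01

if (requireNamespace("logspline", quietly = TRUE)) {
  # Fit a logspline density estimate to the data
  fit <- logspline::logspline(ee)
  # Draw new values from the fitted density
  synthetic <- logspline::rlogspline(1000, fit)
} else {
  # Package not available: nothing to draw from
  synthetic <- NULL
}
```

Unlike interpolating the empirical quantiles, draws from a fitted density are not restricted to the observed range of the data.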

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[hidden email]
801.408.8111


Re: A random number from any distribution?

Bert Gunter
Sounds like the poster might be interested in bootstrap sampling ...
As usual, what's the question of interest?
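
For comparison, plain bootstrap sampling in base R is a single call to sample() with replacement. This is only a sketch on simulated stand-in data (the name boot is made up for the example); every draw is one of the observed values, with no interpolation between them.

```r
set.seed(3)
# Stand-in for the poster's data `ee` (assumption: simulated values)
ee <- rt(500, df = 4) * 0.01

# Bootstrap sample: draw observed values with replacement
boot <- sample(ee, size = 1000, replace = TRUE)

# Every bootstrap draw is one of the original observations
all(boot %in% ee)
```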

Bert Gunter
Genentech Nonclinical Biostatistics
 
 
Re: A random number from any distribution?

ivan popivanov

:) I might be trying to do something stupid so let me try again:

1) I have a large sample - daily percentage movement for a stock
2) I want to generate a synthetic stock which has daily movements from the same distribution as the original

The solution I was planning to implement (following the old post cited) is:

1) Compute the quantiles over the known data
2) Generate a uniformly distributed U in (0,1)
3) Find the quantile corresponding to U
4) Using U's offset from the left end of the quantile, compute the daily movement for the synthetic stock

Below is the function I came up with. Two questions:
1) Is there an existing R function to do that?
2) Is this a sound approach?

Thanks in advance!

rsample = function(s, n, step=0.01)
{
   qs = quantile(s, probs=seq(0, 1, step))

   res = rep(0, n)

   unif = runif(n)

   for(i in 1:n)
   {
      uu = unif[i]

      # find uu's quantile
      qid = ceiling(uu / step)
      qleft = (qid - 1)*step

      # compute the result using uu's offset within the quantile
      res[i] = as.numeric(qs[qid]) + ((uu - qleft)/step)*(as.numeric(qs[qid+1]) - as.numeric(qs[qid]))
   }

   return(res)
}
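
On the first question, one hedged observation: the loop above is linear interpolation on a grid of quantiles, which approx() can do in one vectorised call. The sketch below keeps a condensed copy of the posted function and compares it with a hypothetical rsample_vec(); the test data are simulated.

```r
# Condensed copy of the function posted above (same arithmetic)
rsample <- function(s, n, step = 0.01) {
  qs <- quantile(s, probs = seq(0, 1, step), names = FALSE)
  unif <- runif(n)
  res <- numeric(n)
  for (i in seq_len(n)) {
    uu <- unif[i]
    qid <- ceiling(uu / step)       # index of the quantile bin containing uu
    qleft <- (qid - 1) * step       # left edge of that bin
    res[i] <- qs[qid] + ((uu - qleft) / step) * (qs[qid + 1] - qs[qid])
  }
  res
}

# Hypothetical vectorised variant: interpolate the same grid with approx()
rsample_vec <- function(s, n, step = 0.01) {
  grid <- seq(0, 1, step)
  qs <- quantile(s, probs = grid, names = FALSE)
  approx(grid, qs, xout = runif(n))$y
}

set.seed(4)
s <- rt(2000, df = 4) * 0.01        # simulated stand-in data

# Same seed, so both functions see the same uniform draws
set.seed(42); a <- rsample(s, 1000)
set.seed(42); b <- rsample_vec(s, 1000)
all.equal(a, b)                     # agreement up to floating-point rounding
```

With step = 0.01 the tails beyond the 1% and 99% quantiles are each summarised by a single straight segment, so a finer grid, or interpolating the order statistics directly, may matter for heavy-tailed data.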

Re: A random number from any distribution?

Bert Gunter
Questionable. Doesn't this implicitly assume that the log(stock prices) form an AR(1) series? If so, is this reasonable? And what about the occasional shocks?

Appropriate simulation of time series like stock prices is a tricky business, I believe. I would question whether your naïve approach is going to capture enough of the real dynamics to give meaningful answers. As I'm far from an expert on this sort of thing, I'll just leave it at that.
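
One way to make the concern about dynamics concrete, as a sketch only: draws from the marginal distribution are independent, so any serial dependence in the original series is not reproduced. The series below is a simulated AR(1) with coefficient 0.7, chosen purely for illustration and not meant as a model of real stock data; lag1() is a helper defined here.

```r
set.seed(5)
# Illustrative series with serial dependence (assumption: AR(1), coefficient 0.7)
x <- as.numeric(arima.sim(model = list(ar = 0.7), n = 2000))

# Independent draws from the empirical distribution of x
y <- quantile(x, probs = runif(2000), names = FALSE)

# Lag-1 sample autocorrelation
lag1 <- function(z) cor(z[-1], z[-length(z)])

lag1(x)  # close to the AR coefficient
lag1(y)  # close to zero: the dependence is gone
```

The two series share nearly the same marginal distribution, yet only the first has any memory from one observation to the next.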


Bert Gunter
Genentech Nonclinical Biostatistics

http://devo.gene.com/groups/devo/depts/ncb/home.shtml

 

 

