difference between rnorm(1000, 0, 1) and running rnorm(500, 0, 1) twice

difference between rnorm(1000, 0, 1) and running rnorm(500, 0, 1) twice

Taka Matzmoto
Hi R users

This looks like a simple question.

Is there any difference between rnorm(1000,0,1) and running
rnorm(500,0,1) twice in terms of outcome?

TM

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: difference between rnorm(1000, 0, 1) and running rnorm(500, 0, 1) twice

Romain François
On 08.02.2006 04:21, Taka Matzmoto wrote:

>Hi R users
>
>This looks like a simple question.
>
>Is there any difference between rnorm(1000,0,1) and running
>rnorm(500,0,1) twice in terms of outcome?
>
>TM
>
Not here:

R> set.seed(1)
R> x <- rnorm(1000, 0, 1)
R> set.seed(1)
R> y <- rnorm(500, 0, 1)
R> z <- rnorm(500, 0, 1)
R> all(x == c(y,z))
[1] TRUE

Romain

--
visit the R Graph Gallery : http://addictedtor.free.fr/graphiques
mixmod 1.7 is released : http://www-math.univ-fcomte.fr/mixmod/index.php
+---------------------------------------------------------------+
| Romain FRANCOIS - http://francoisromain.free.fr               |
| Doctorant INRIA Futurs / EDF                                  |
+---------------------------------------------------------------+


Re: difference between rnorm(1000, 0, 1) and running rnorm(500, 0, 1) twice

Philippe Grosjean
Romain Francois wrote:

> On 08.02.2006 04:21, Taka Matzmoto wrote:
>
>>Hi R users
>>
>>This looks like a simple question.
>>
>>Is there any difference between rnorm(1000,0,1) and running
>>rnorm(500,0,1) twice in terms of outcome?
>>
>>TM
>
> Not here:
>
> R> set.seed(1)
> R> x <- rnorm(1000, 0, 1)
> R> set.seed(1)
> R> y <- rnorm(500, 0, 1)
> R> z <- rnorm(500, 0, 1)
> R> all(x == c(y,z))
> [1] TRUE
>
> Romain

Indeed! In both cases the pseudo-random number generator is initialized
to the same state, and thus returns the same 1000 pseudo-random numbers.
So, no difference.
Best,

Philippe Grosjean
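
A minimal sketch of the point about generator state, assuming R's default
settings (Mersenne-Twister with normal.kind "Inversion"); the variable names
are illustrative only. It compares the saved state .Random.seed after one
call of 1000 draws with the state after two calls of 500 draws:

```r
# Draw 1000 normals in one call and record the generator state afterwards.
set.seed(1)
invisible(rnorm(1000, 0, 1))
state_single <- .Random.seed

# Reset to the same seed, draw 2 x 500, and record the state again.
set.seed(1)
invisible(rnorm(500, 0, 1))
invisible(rnorm(500, 0, 1))
state_split <- .Random.seed

# If both routes consumed the underlying stream in the same way,
# the two recorded states are identical.
print(identical(state_single, state_split))
```

Matching states only show that both routes consumed the stream equally for
this seed and these sizes; they do not amount to a documented guarantee.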


Re: difference between rnorm(1000, 0, 1) and running rnorm(500, 0, 1) twice

Bjørn-Helge Mevik
In reply to this post by Taka Matzmoto
Why don't you test it yourself?

E.g.,

set.seed(42)
bob1 <- rnorm(1000,0,1)
set.seed(42)
bob2 <- rnorm(500,0,1)
bob3 <- rnorm(500,0,1)
identical(bob1, c(bob2, bob3))

I won't tell you the answer. :-)

--
Bjørn-Helge Mevik


Re: difference between rnorm(1000, 0, 1) and running rnorm(500, 0, 1) twice

Duncan Murdoch
On 2/8/2006 4:53 AM, Bjørn-Helge Mevik wrote:

> Why don't you test it yourself?
>
> E.g.,
>
> set.seed(42)
> bob1 <- rnorm(1000,0,1)
> set.seed(42)
> bob2 <- rnorm(500,0,1)
> bob3 <- rnorm(500,0,1)
> identical(bob1, c(bob2, bob3))
>
> I won't tell you the answer. :-)

This isn't really something that can be proved by a test.  Perhaps the
current implementation makes those equal only because 500 is even, or
divisible by 5, or whatever...

I think the intention is that those should be equal, but in a quick
search I've been unable to find a documented guarantee of that.  So I
would take a defensive stance and assume that there may be conditions
where c(rnorm(m), rnorm(n)) is not equal to rnorm(m+n).

If someone can point out the document I missed, I'd appreciate it.

Duncan Murdoch
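
One way to make such a test less dependent on 500 being even or divisible
by 5 is to try several split sizes, including zero and odd lengths. This is
only a sketch: check_split is a hypothetical helper, the sizes are arbitrary,
and passing every case is still evidence rather than proof.

```r
# Hypothetical helper: does c(rnorm(m), rnorm(n)) equal rnorm(m + n)
# when both start from the same seed?
check_split <- function(m, n, seed = 42) {
  set.seed(seed)
  whole <- rnorm(m + n)
  set.seed(seed)
  parts <- c(rnorm(m), rnorm(n))
  identical(whole, parts)
}

# Arbitrary mix of zero, odd, even and "round" sizes.
sizes <- expand.grid(m = c(0, 1, 2, 3, 7, 500),
                     n = c(0, 1, 2, 5, 11, 500))
sizes$same <- mapply(check_split, sizes$m, sizes$n)

# Show any combinations that differ (an empty data frame if none do).
print(sizes[!sizes$same, ])
```

The check runs under whatever RNGkind() is currently active, so it says
nothing about other generators or other normal.kind settings.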


Re: difference between rnorm(1000, 0, 1) and running rnorm(5

Ted.Harding
On 08-Feb-06 Duncan Murdoch wrote:

> On 2/8/2006 4:53 AM, Bjørn-Helge Mevik wrote:
>> Why don't you test it yourself?
>>
>> E.g.,
>>
>> set.seed(42)
>> bob1 <- rnorm(1000,0,1)
>> set.seed(42)
>> bob2 <- rnorm(500,0,1)
>> bob3 <- rnorm(500,0,1)
>> identical(bob1, c(bob2, bob3))
>>
>> I won't tell you the answer. :-)
>
> This isn't really something that can be proved by a test.  Perhaps the
> current implementation makes those equal only because 500 is even, or
> divisible by 5, or whatever...
>
> I think the intention is that those should be equal, but in a quick
> search I've been unable to find a documented guarantee of that.  So I
> would take a defensive stance and assume that there may be conditions
> where c(rnorm(m), rnorm(n)) is not equal to rnorm(m+n).
>
> If someone can point out the document I missed, I'd appreciate it.
>
> Duncan Murdoch

As I understand it, once the seed is set, the sequence generated
by the underlying RNG is determined, whether it is the result of
a single call to produce a long sequence or multiple calls to
generate many shorter sequences. Example:

> set.seed(42)
> multi<-numeric(20)
> set.seed(42)
> single<-rnorm(20)
> set.seed(42)
> for(i in (1:20)) multi[i]<-rnorm(1)
> print(max(multi-single),digits=22)
[1] 0
> print(min(multi-single),digits=22)
[1] 0

In other words: identical!

Whether there are possible exceptions, in some implementations
of r<dist> where <dist> is other than "norm", has to be answered
by people who are familiar with the internals of these functions.

Best wishes to all,
Ted.
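
The same comparison can be tried empirically for a few other r<dist>
functions. The sketch below is an assumption-laden probe, not an answer about
the internals: split_invariant is a hypothetical helper, and the choice of
distributions, parameters and sizes is arbitrary.

```r
# Hypothetical helper: compare one call of m + n draws with two calls
# of m and n draws, all starting from the same seed.
split_invariant <- function(rfun, m = 7, n = 13, seed = 42) {
  set.seed(seed)
  whole <- rfun(m + n)
  set.seed(seed)
  parts <- c(rfun(m), rfun(n))
  identical(whole, parts)
}

# A small, arbitrary selection of generators with fixed parameters.
generators <- list(
  runif  = function(k) runif(k),
  rexp   = function(k) rexp(k),
  rgamma = function(k) rgamma(k, shape = 2),
  rbinom = function(k) rbinom(k, size = 10, prob = 0.3),
  rpois  = function(k) rpois(k, lambda = 4)
)

result <- sapply(generators, split_invariant)
print(result)
```

A TRUE here covers only the tested parameters and sizes; whether it holds in
general is still a question about the source code.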




--------------------------------------------------------------------
E-Mail: (Ted Harding) <[hidden email]>
Fax-to-email: +44 (0)870 094 0861
Date: 08-Feb-06                                       Time: 13:26:10
------------------------------ XFMail ------------------------------


Re: difference between rnorm(1000, 0, 1) and running rnorm(500, 0, 1) twice

Duncan Murdoch
In reply to this post by Taka Matzmoto
On 2/8/2006 8:30 AM, Brian D Ripley wrote:

> On Wed, 8 Feb 2006, Duncan Murdoch wrote:
>
>> On 2/8/2006 4:53 AM, Bjørn-Helge Mevik wrote:
>> > Why don't you test it yourself?
>> >
>> > E.g.,
>> >
>> > set.seed(42)
>> > bob1 <- rnorm(1000,0,1)
>> > set.seed(42)
>> > bob2 <- rnorm(500,0,1)
>> > bob3 <- rnorm(500,0,1)
>> > identical(bob1, c(bob2, bob3))
>> >
>> > I won't tell you the answer. :-)
>>
>> This isn't really something that can be proved by a test.  Perhaps the
>> current implementation makes those equal only because 500 is even, or
>> divisible by 5, or whatever...
>>
>> I think the intention is that those should be equal, but in a quick
>> search I've been unable to find a documented guarantee of that.  So I
>> would take a defensive stance and assume that there may be conditions
>> where c(rnorm(m), rnorm(n)) is not equal to rnorm(m+n).
>>
>> If someone can point out the document I missed, I'd appreciate it.
>
> It's various source files in R_HOME/src/main.
>
> Barring bugs, they will be the same.  As you know
>
> R is free software and comes with ABSOLUTELY NO WARRANTY.

I didn't mean guarantee in the sense of warranty, just guarantee in the
sense that if someone found a situation where they weren't equal, we
would consider it a bug and fix it or document it as an exception.

Should we add a statement to the RNG man page or manuals somewhere that
says this is the intention?

For others who aren't as familiar with the issues as Brian: this isn't
necessarily a good idea.  We have a lot of RNGs, and it's fairly easy to
write one so that this isn't true.  For example, the Box-Muller method
naturally generates pairs of normals; a naive implementation would just
throw one away at the end if asked for an odd number.  (Ours doesn't do
that.)

Duncan Murdoch
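
To illustrate the Box-Muller remark, here is a sketch of such a naive
implementation. naive_bm is a hypothetical function written for this example
and is not how R's own "Box-Muller" normal.kind works; it builds normals in
pairs from pairs of uniforms and simply drops the surplus value when n is odd.

```r
# Hypothetical naive Box-Muller generator: each pair of uniforms gives a
# pair of normals; for odd n the last normal of the final pair is discarded.
naive_bm <- function(n) {
  pairs <- ceiling(n / 2)
  u <- matrix(runif(2 * pairs), nrow = 2)   # column k holds the k-th pair
  r <- sqrt(-2 * log(u[1, ]))
  theta <- 2 * pi * u[2, ]
  z <- as.vector(rbind(r * cos(theta), r * sin(theta)))
  z[seq_len(n)]                             # drops the surplus value if n is odd
}

set.seed(42); whole      <- naive_bm(6)
set.seed(42); even_split <- c(naive_bm(4), naive_bm(2))
set.seed(42); odd_split  <- c(naive_bm(3), naive_bm(3))

even_ok <- identical(whole, even_split)  # even pieces waste nothing
odd_ok  <- identical(whole, odd_split)   # each odd piece discards one normal
print(c(even = even_ok, odd = odd_ok))
```

With even piece sizes the split calls reproduce the single call, but with odd
sizes the discarded value shifts everything after it, which is exactly the
kind of case a test using 500 + 500 would never reveal.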


Re: difference between rnorm(1000, 0, 1) and running rnorm(500, 0, 1) twice

Peter Dalgaard
In reply to this post by Duncan Murdoch
Duncan Murdoch <[hidden email]> writes:

> This isn't really something that can be proved by a test.  Perhaps the
> current implementation makes those equal only because 500 is even, or
> divisible by 5, or whatever...
>
> I think the intention is that those should be equal, but in a quick
> search I've been unable to find a documented guarantee of that.  So I
> would take a defensive stance and assume that there may be conditions
> where c(rnorm(m), rnorm(n)) is not equal to rnorm(m+n).
>
> If someone can point out the document I missed, I'd appreciate it.

I think it's a fair assumption that *uniform* random numbers have the
property, since these are engines that produce a continuous stream of
values, of which we select the next n and m values.

As long as the normal.kind (see ?RNGkind) is "Inversion", we can be
sure that the property carries to rnorm, but it might not be the case
for other methods. In particular the ones that generate normal
variates in batches are suspect. However, empirically, I can't seem to
provoke the effect with any of R's built-in generators. One *could* of
course check the source code and see whether there is state
information being kept between invocations...
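
An empirical check along these lines might look like the sketch below. It
assumes the normal.kind values listed in ?RNGkind (leaving out "Buggy
Kinderman-Ramage" and "user-supplied"), uses an arbitrary odd/even split, and
restores the previous settings afterwards; split_ok is just an illustrative
name.

```r
# normal.kind values to try, taken from ?RNGkind.
kinds <- c("Inversion", "Kinderman-Ramage", "Ahrens-Dieter", "Box-Muller")

old <- RNGkind()   # remember the current settings so they can be restored

split_ok <- sapply(kinds, function(k) {
  RNGkind(normal.kind = k)
  set.seed(42)
  whole <- rnorm(11)
  set.seed(42)
  parts <- c(rnorm(5), rnorm(6))   # odd first piece on purpose
  identical(whole, parts)
})

RNGkind(kind = old[1], normal.kind = old[2])   # restore previous settings
print(split_ok)
```

As noted above, a run like this can fail to provoke the effect without
establishing that it cannot occur.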

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - ([hidden email])                  FAX: (+45) 35327907


Re: difference between rnorm(1000, 0, 1) and running rnorm(500, 0, 1) twice

Brian Ripley
In reply to this post by Duncan Murdoch
On Wed, 8 Feb 2006, Duncan Murdoch wrote:

> On 2/8/2006 8:30 AM, Brian D Ripley wrote:
>> On Wed, 8 Feb 2006, Duncan Murdoch wrote:
>>> On 2/8/2006 4:53 AM, Bjørn-Helge Mevik wrote:
>>>> Why don't you test it yourself?
>>>>
>>>> E.g.,
>>>>
>>>> set.seed(42)
>>>> bob1 <- rnorm(1000,0,1)
>>>> set.seed(42)
>>>> bob2 <- rnorm(500,0,1)
>>>> bob3 <- rnorm(500,0,1)
>>>> identical(bob1, c(bob2, bob3))
>>>>
>>>> I won't tell you the answer. :-)
>>>
>>> This isn't really something that can be proved by a test.  Perhaps the
>>> current implementation makes those equal only because 500 is even, or
>>> divisible by 5, or whatever...
>>>
>>> I think the intention is that those should be equal, but in a quick
>>> search I've been unable to find a documented guarantee of that.  So I
>>> would take a defensive stance and assume that there may be conditions
>>> where c(rnorm(m), rnorm(n)) is not equal to rnorm(m+n).
>>>
>>> If someone can point out the document I missed, I'd appreciate it.
>>
>> It's various source files in R_HOME/src/main.
>>
>> Barring bugs, they will be the same.  As you know
>>
>> R is free software and comes with ABSOLUTELY NO WARRANTY.

> I didn't mean guarantee in the sense of warranty, just guarantee in the
> sense that if someone found a situation where they weren't equal, we
> would consider it a bug and fix it or document it as an exception.

> Should we add a statement to the RNG man page or manuals somewhere that
> says this is the intention?

I think that is part of the sense of `no warranty': we allow ourselves to
change anything which is not documented, and so things are as a result
deliberately not documented.

> For others who aren't as familiar with the issues as Brian: this isn't
> necessarily a good idea.  We have a lot of RNGs, and it's fairly easy to
> write one so that this isn't true.  For example, the Box-Muller method
> naturally generates pairs of normals; a naive implementation would just
> throw one away at the end if asked for an odd number.  (Ours doesn't do
> that.)

I think we should allow future methods to do things like that, and
preferably document that they do them.

--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595