Problem with sample(...,size = 1000000000,...)

Problem with sample(...,size = 1000000000,...)

Huy Nguyễn
When I ran this code:
"
x<-sample(1:5,1000000000,TRUE,c(0.1,0.2,0.4,0.2,0.1))
print(table(x)/1000000000)
plot(table(x)/1000000000,type="h",xlab="x",ylab="P(x)")
"
My laptop froze and stopped responding. I used ctrl+alt+del to try to
terminate the R program, but the laptop still did nothing, and I had to
restart it right away for fear of damaging it.
So I think that in the future R should have some way to limit the memory
and time a computation uses, so that it can be terminated if necessary.


______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Problem with sample(...,size = 1000000000,...)

jholtman
You are trying to create a vector with 1 billion entries, so this will
take some time and a lot of memory.  How much memory do you have on
your computer?

Here are some timings for generating increasing sample sizes.  I have 16GB
on my computer; generating the 1-billion-entry sample took only about 30
seconds, and the session ended up using almost 12GB of memory
(memory.size() reports MB).

> system.time(x<-sample(1:5,100000,TRUE,c(0.1,0.2,0.4,0.2,0.1)))
   user  system elapsed
      0       0       0
> system.time(x<-sample(1:5,1000000,TRUE,c(0.1,0.2,0.4,0.2,0.1)))
   user  system elapsed
   0.03    0.00    0.03
> system.time(x<-sample(1:5,10000000,TRUE,c(0.1,0.2,0.4,0.2,0.1)))
   user  system elapsed
   0.47    0.02    0.49
> system.time(x<-sample(1:5,100000000,TRUE,c(0.1,0.2,0.4,0.2,0.1)))
   user  system elapsed
   3.09    0.24    3.33
> system.time(x<-sample(1:5,1000000000,TRUE,c(0.1,0.2,0.4,0.2,0.1)))
   user  system elapsed
  30.76    1.70   32.92
> memory.size()
[1] 11502.52
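
As a rough cross-check on those numbers, the size of the full vector can
be estimated from a small sample before committing to the big one.  This
is only a sketch: the names n_small, x_small, bytes_per_elem and est_gib
are made up for illustration, and it measures the vector alone.  As far as
I know, table(x) builds additional full-length intermediate objects, so
peak usage will be noticeably higher than this estimate.

```r
# Draw a small sample with the same call, then scale its size up.
n_small <- 1e6
x_small <- sample(1:5, n_small, TRUE, c(0.1, 0.2, 0.4, 0.2, 0.1))

# sample() on 1:5 returns an integer vector.
bytes_per_elem <- as.numeric(object.size(x_small)) / n_small

# Estimated size of the 1-billion-entry vector, in GiB.
est_gib <- bytes_per_elem * 1e9 / 1024^3
print(bytes_per_elem)
print(est_gib)
```

That works out to about 4 bytes per element, so a little under 4GB for x
itself before table() and plot() make their own copies.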

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


On Sat, Oct 15, 2016 at 12:19 PM, Huy Nguyễn <[hidden email]> wrote:

> When I ran this code:
> "
> x<-sample(1:5,1000000000,TRUE,c(0.1,0.2,0.4,0.2,0.1))
> print(table(x)/1000000000)
> plot(table(x)/1000000000,type="h",xlab="x",ylab="P(x)")
> "
> My laptop was frozen and didn't respond. Although I used ctrl+alt+del to
> terminate r program, my laptop still did nothing. And I must restart my
> laptop immediately or my laptop might be broken down.
> Thus, I think in the future the program should have something to control
> the memory and time when it is running and can be terminated if necessary.

Re: Problem with sample(...,size = 1000000000,...)

jholtman
I forgot to add that if you have less than 16GB of memory, you were
probably paging memory to disk, and that would have taken a much, much
longer time.  When you are trying to do something BIG, do it in
smaller steps first and look at the resources each step takes (memory,
CPU, ...).
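
One way to apply the "smaller steps" advice to this particular problem is
to sample in chunks and keep only the running counts, so the full vector
never has to exist in memory.  This is a minimal sketch, not the only way
to do it: the chunk size and the names probs, n_total, chunk, counts and
p_hat are illustrative choices, and n_total is deliberately set to 1e7
here so you can check the timing and memory use before raising it.

```r
probs   <- c(0.1, 0.2, 0.4, 0.2, 0.1)
n_total <- 1e7   # start small; raise toward 1e9 once this behaves
chunk   <- 1e6   # number of draws held in memory at any one time

counts    <- numeric(5)   # running counts for the values 1..5
remaining <- n_total
while (remaining > 0) {
  n <- min(chunk, remaining)
  x <- sample(1:5, n, TRUE, probs)
  # tabulate() counts the integers 1..nbins without building a factor
  counts <- counts + tabulate(x, nbins = 5)
  remaining <- remaining - n
}

p_hat <- counts / n_total
names(p_hat) <- 1:5
print(p_hat)
plot(1:5, p_hat, type = "h", xlab = "x", ylab = "P(x)")
```

Memory use is then set by chunk rather than by n_total.  If only the
counts are needed, rmultinom(1, n_total, probs) draws them directly,
which is a different technique from the sample() call in the original
post.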



On Sat, Oct 15, 2016 at 4:06 PM, jim holtman <[hidden email]> wrote:

> Do you realize you are trying to create a vector with 1 billion
> entries, so this will take some time.  How much memory do you have on
> your computer?
>
> Here are some times to generate increasing sample sizes.  I have 16GB
> on my computer and it took only 30 seconds to generate the data and
> used almost 12GB of memory.