Increasing the maximum number of rows


Alex Ruiz Euler
Dear R helpers,

I created a somewhat big database (206,700+ rows) in MySQL and have
exported it into a CSV file, but I can't open the whole thing in R. I am
using:

> base<-read.csv("/path/to/file.csv", header=F, sep=",", nrows=206720)

R doesn't complain but it only opens 128,328 observations (the number of
columns corresponds to the original database):

> dim(base)
[1] 128328    134

In case it's useful, my system's profile:

[~]$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 30
file size               (blocks, -f) unlimited
pending signals                 (-i) 31547
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 100
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

I also tried
[~]$ R --max-vsize=500M


I haven't found any obvious way to expand the number of rows. Is this a
system-wide issue, or is it R? Where should I look for a solution? Any
pointers are greatly appreciated.

I'm using Fedora 12, 64-bit (x86_64), on a laptop with 4 GB of RAM.
 
Thank you,
Alex

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Increasing the maximum number of rows

Erik Iverson-3
Have you verified that file.csv does indeed contain the number of rows you think
it does? Can you go to line 128328 of the CSV file and see whether it looks any
different from the others?
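
One way to run this check from inside R is sketched below. It is only an illustration: the temporary file and the `suspect` line number are made up for the example. In the thread's case the path would be the exported CSV and the suspect line would be 128328.

```r
# Hypothetical stand-in for the exported file: 10 well-formed lines.
path <- tempfile(fileext = ".csv")
writeLines(sprintf("%d,%s", 1:10, letters[1:10]), path)

# Count physical lines without any CSV parsing.
raw_lines <- readLines(path)
n_lines <- length(raw_lines)

# Count rows as read.csv sees them.
dat <- read.csv(path, header = FALSE)
n_rows <- nrow(dat)

cat("lines in file:", n_lines, " rows read:", n_rows, "\n")

# Print the raw text around a suspect line (made-up value here).
suspect <- 5
print(raw_lines[(suspect - 1):(suspect + 1)])
```

If the two counts differ, the file has more physical lines than read.csv turned into rows, which points at a parsing problem rather than at a row limit.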


Re: Increasing the maximum number of rows

jholtman
You might also try setting the following parameters on read.csv:

comment.char='', quote=''

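If you have a "#" in the data, this might cause missing data when the
file is read with read.table, whose default comment.char is "#"
(read.csv already defaults to comment.char=''); also, an unbalanced
quote will cause missing lines, because everything up to the next quote
character is read as part of a single field.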
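
A small sketch of the quote effect, using a made-up five-line file in which two fields contain a stray double quote (the file contents and variable names are assumptions for illustration, not the original data):

```r
# Hypothetical file: lines 2 and 4 each contain one stray double quote.
path <- tempfile(fileext = ".csv")
writeLines(c('1,a', '2,5" pipe', '3,c', '4,6" pipe', '5,e'), path)

# Default read.csv treats " as a quote character, so the text between the
# two stray quotes (line breaks included) is read as part of one field.
default_read <- suppressWarnings(read.csv(path, header = FALSE))

# With quoting disabled, every physical line becomes one row.
literal_read <- read.csv(path, header = FALSE, quote = "", comment.char = "")

cat("default:", nrow(default_read), "rows; quote = '':",
    nrow(literal_read), "rows\n")
```

With quoting disabled the row count matches the number of lines in the file, while the default call returns fewer rows, without necessarily raising an error.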




--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


Re: Increasing the maximum number of rows

Alex Ruiz Euler

Hi, thank you both for your answers.

I did verify that the number of rows in the csv is actually ~207,000,
both in the MySQL output and then directly in the csv file. The line
128328 looks exactly as all others above and below.

I tried the comment.char and quote parameters and it turns out that the
quote parameter worked. I appreciate the pointer.

Regards,
Alex
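
For anyone hitting the same symptom, a quick way to see which lines carry quote characters is to search the raw text. The sketch below uses a made-up file and made-up variable names. Note that quote='' is only appropriate when no field is legitimately quoted, since commas inside quoted fields would then be treated as separators.

```r
# Hypothetical file: lines 2 and 4 each contain one stray double quote.
path <- tempfile(fileext = ".csv")
writeLines(c('1,a', '2,5" pipe', '3,c', '4,6" pipe', '5,e'), path)

# Read the file as plain text, with no CSV parsing involved.
raw_lines <- readLines(path)

# Find the physical lines that contain a double quote anywhere.
has_quote <- grepl('"', raw_lines, fixed = TRUE)
quote_lines <- which(has_quote)

print(quote_lines)             # line numbers to inspect
print(raw_lines[quote_lines])  # the raw text of those lines
```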





Re: Increasing the maximum number of rows

Wu Gong
In reply to this post by Alex Ruiz Euler
Might there be a limit?

> c <- matrix(1:100000000, ncol=200)
> dim(c)
[1] 500000    200
> c <- matrix(1:1000000000, ncol=200)
Error: cannot allocate vector of size 3.7 Gb

Re: Increasing the maximum number of rows

jholtman
You are trying to create an object with 1G elements.  Given that these
are integers (4 bytes each), this will require about 4GB of space.  If you
are running on a 32-bit system, where a single process can address only
2-3GB depending on what options you are running (at least on Windows), then
you have exceeded the limits.  It is a good idea to limit your largest
object to about 25% of physical memory in case copies have to be made
during some of the analysis.
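
The arithmetic behind that estimate can be sketched as follows. The second half assumes, purely for illustration, that all 134 columns of the original poster's data are stored as doubles; the real column types are not stated in the thread.

```r
# Size of the integer vector 1:1000000000.
bytes_per_integer <- 4      # R stores an integer in 4 bytes
n_elements <- 1e9
size_gib <- n_elements * bytes_per_integer / 2^30
cat(sprintf("%.1f Gb\n", size_gib))  # same figure as in the error message

# Rough size of a 206720 x 134 table.
# Assumption: every column is a double (8 bytes per value).
rows <- 206720
cols <- 134
size_mib <- rows * cols * 8 / 2^20
cat(sprintf("about %.0f Mb\n", size_mib))
```

Under that assumption the CSV from the original question is far below the 25% guideline on a 4GB machine, which is consistent with the row loss being a parsing issue rather than a memory limit.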






Re: Increasing the maximum number of rows

Tal Galili
Hello Jim,
It sounds like a good time to go read about the packages bigmemory
and/or ff.

Best,
Tal


----------------Contact Details:----------------
Contact me: [hidden email] | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
------------------------------------------------




