quantile regression: out of memory error


quantile regression: out of memory error

Prew, Paul
Hello, I’m wondering if anyone can offer advice on the out-of-memory error I’m getting. I’m using R 2.12.2 on Windows XP, Platform: i386-pc-mingw32/i386 (32-bit).

I am using the quantreg package, trying to perform a quantile regression on a data frame that has 11,253 rows and 5 columns.

> object.size(subsetAudit.dat)
450832 bytes

> str(subsetAudit.dat)
'data.frame':   11253 obs. of  5 variables:
 $ Satisfaction     : num  0.64 0.87 0.78 0.75 0.83 0.75 0.74 0.8 0.89 0.78 ...
 $ Return           : num  0.84 0.92 0.91 0.89 0.95 0.81 0.9 0.87 0.95 0.88 ...
 $ Recommend        : num  0.53 0.64 0.58 0.58 0.62 0.6 0.56 0.7 0.64 0.65 ...
 $ Cust.Clean       : num  0.75 0.85 0.72 0.72 0.81 0.79 0.79 0.8 0.78 0.75 ...
 $ CleanScore.Cycle1: num  96.7 83.3 93.3 86.7 96.7 96.7 90 80 81.7 86.7 ...

rq(subsetAudit.dat$Satisfaction ~ subsetAudit.dat$CleanScore.Cycle1, tau = -1)

ERROR:  cannot allocate vector of size 2.8 Gb

I don’t know much about computers – software, hardware, algorithms – but does this mean that the quantreg package is creating some massive intermediate vector as it runs the rq function? Quantile regression is something I’m just starting to explore, but I believe it involves ordering the data prior to the regression, which could be a huge job with 11,000 records. Does bigmemory have functionality to help me with this?

Thank you,
Paul






Paul Prew   ▪  Statistician
651-795-5942   ▪   fax 651-204-7504
Ecolab Research Center   ▪  Mail Stop ESC-F4412-A
655 Lone Oak Drive   ▪   Eagan, MN 55121-1560




CONFIDENTIALITY NOTICE:
This e-mail communication and any attachments may contain proprietary and privileged information for the use of the designated recipients named above.
Any unauthorized review, use, disclosure or distribution is prohibited.
If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: quantile regression: out of memory error

RKoenker
Paul,

Yours is NOT a large problem, but it becomes a large problem when you ask for ALL the distinct
QR solutions by specifying tau = -1.  You probably don't want to see all these solutions; I suspect
that tau = 1:19/20 or so would suffice.  Try this, and see how it goes.
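
A minimal sketch of that suggestion follows. The real subsetAudit.dat is not available here, so the data frame `audit` and its simulated values are assumptions made purely for illustration; only the column names, the row count, and the tau grid come from the thread.

```r
# Simulated stand-in for the poster's data (hypothetical values, not the real audit data).
set.seed(1)
n <- 11253
audit <- data.frame(CleanScore.Cycle1 = runif(n, min = 60, max = 100))
audit$Satisfaction <- 0.4 + 0.004 * audit$CleanScore.Cycle1 + rnorm(n, sd = 0.05)

# A fixed grid of 19 quantiles: 0.05, 0.10, ..., 0.95.
taus <- 1:19 / 20

# Fit only if quantreg is installed; one regression is solved per tau.
if (requireNamespace("quantreg", quietly = TRUE)) {
  fit <- quantreg::rq(Satisfaction ~ CleanScore.Cycle1, tau = taus, data = audit)
  print(dim(coef(fit)))  # rows = coefficients, columns = quantiles
}
```

Passing `data = audit` with bare column names fits the same model as the `subsetAudit.dat$...` form in the original call and gives shorter coefficient labels.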

Roger

url:    www.econ.uiuc.edu/~roger            Roger Koenker
email    [hidden email]            Department of Economics
vox:     217-333-4558                University of Illinois
fax:       217-244-6678                Urbana, IL 61801

On Jul 11, 2011, at 12:39 PM, Prew, Paul wrote:

> [...]


Re: quantile regression: out of memory error

Prew, Paul
Thank you, Roger, that was my problem.  Specifying tau = 1:19/20 worked fine.  Regards, Paul



-----Original Message-----
From: Roger Koenker [mailto:[hidden email]]
Sent: Monday, July 11, 2011 12:48 PM
To: Prew, Paul
Cc: [hidden email] help
Subject: Re: [R] quantile regression: out of memory error

[...]

Re: quantile regression: out of memory error

Cade, Brian
In reply to this post by Prew, Paul
Using tau = -1 is causing rq() to try to estimate all possible quantiles
and store the results.  With 11253 observations this would be a formidable
feat.   Try estimating the model with, say, tau = 1:99/100 to give a more
tractable number of estimates.
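
As a rough check on the scale involved, the arithmetic below shows that a single n-by-3n array of doubles would account for the reported 2.8 Gb. That array shape is an assumption about quantreg's internal storage which this thread does not confirm; it is merely consistent with the error message.

```r
n <- 11253              # observations reported by str()
bytes_per_double <- 8   # R stores numeric values as 8-byte doubles

# Assumption (not confirmed in the thread): one n-by-3n working array of doubles.
size_gib <- n * (3 * n) * bytes_per_double / 2^30
print(round(size_gib, 1))

# For comparison: 2 coefficients at each of 99 quantiles.
grid_bytes <- 2 * 99 * bytes_per_double
print(grid_bytes)
```

On a 32-bit build of R, a single allocation of that size cannot succeed regardless of installed RAM, whereas the coefficient matrix for a 99-point grid is negligible.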

Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  [hidden email]
tel:  970 226-9326


