Re: [R] RuleFit & quantreg: partial dependence plots; showing an effect


Mark Difford
Dear List,

I would greatly appreciate help on the following matter:

The RuleFit program of Professor Friedman uses partial dependence plots
to explore the effect of an explanatory variable on the response
variable, after accounting for the average effects of the other
variables.  The plot method [plot(summary(rq(y ~ x1 + x2,
t=seq(.1,.9,.05))))] of Professor Koenker's quantreg program appears to
do the same thing.
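
For readers following the thread, the idea behind a partial dependence plot can be sketched in a few lines of base R. This is an illustration only: the simulated data, the lm() stand-in model, and the helper name partial_dependence() are assumptions made here, and nothing below calls RuleFit itself.

```r
# Partial dependence computed by hand for one predictor.
# The fitted model is a plain lm() used as a stand-in; it is NOT RuleFit.
set.seed(1)
n   <- 200
x1  <- runif(n)
x2  <- runif(n)
y   <- 2 * x1 + sin(2 * pi * x2) + rnorm(n, sd = 0.2)
dat <- data.frame(y, x1, x2)

fit <- lm(y ~ x1 + poly(x2, 3), data = dat)

# Hypothetical helper: for each grid value, set `var` to that value in
# every row, predict, and average over the observed values of the
# remaining covariates.
partial_dependence <- function(model, data, var, grid) {
  sapply(grid, function(v) {
    tmp <- data
    tmp[[var]] <- v
    mean(predict(model, newdata = tmp))
  })
}

grid  <- seq(0, 1, length.out = 25)
pd_x1 <- partial_dependence(fit, dat, "x1", grid)
plot(grid, pd_x1, type = "l", xlab = "x1", ylab = "partial dependence on x1")
```

Because the stand-in model is additive and linear in x1, the curve is a straight line whose slope is the x1 coefficient; with a more flexible mean model the same averaging step would trace out a nonlinear curve.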


Question:
Is there a difference between these two types of plot in the manner in which they depict the relationship between explanatory variables and the response variable?

Thank you in advance for your help.

Regards,
Mark Difford.

-------------------------------------------------------------
Mark Difford
Ph.D. candidate, Botany Department,
Nelson Mandela Metropolitan University,
Port Elizabeth, SA.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Mark Difford (Ph.D.)
Research Associate
Botany Department
Nelson Mandela Metropolitan University
Port Elizabeth, South Africa

RKoenker
They are entirely different: RuleFit is a fiendishly clever combination
of decision-tree formulation of models and L1-regularization, intended
to select parsimonious fits to very complicated responses, yielding
e.g. piecewise constant functions. RuleFit estimates the conditional
mean of the response over the covariate space, but permits a very
flexible, though linear-in-parameters, specification of the covariate
effects on the conditional mean. The quantile regression plotting you
refer to adopts a fixed, linear specification for the conditional
quantile functions and, given that specification, depicts how the
covariates influence the various conditional quantiles of the response.
Thus, roughly speaking, RuleFit is focused on flexibility in the
x-space, maintaining the classical conditional mean objective, while
QR is trying to be more flexible in the y-direction while maintaining
a fixed, linear-in-parameters specification for the covariate effects
at each quantile.
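
The quantile regression side of this comparison can be illustrated with a small simulated example. This is a sketch under stated assumptions: the data are invented so that the noise spread grows with x1, and the call spells out the tau argument (the t= in the original post appears to reach the same argument through R's partial argument matching). It requires the quantreg package to be installed.

```r
library(quantreg)  # assumes quantreg is installed

set.seed(1)
n   <- 500
x1  <- runif(n)
x2  <- runif(n)
# Simulated data: the error scale increases with x1, so the x1 slope
# should differ from one conditional quantile to the next.
y   <- 1 + 2 * x1 + x2 + (0.5 + x1) * rnorm(n)
dat <- data.frame(y, x1, x2)

taus <- seq(0.1, 0.9, by = 0.05)
fit  <- rq(y ~ x1 + x2, tau = taus, data = dat)

# One column of coefficients per quantile; each column is a separate
# linear fit. Columns 1, 9 and 17 correspond to tau = 0.1, 0.5, 0.9.
print(round(coef(fit)[, c(1, 9, 17)], 2))

# One panel per coefficient, drawn as a function of tau.
plot(summary(fit))
```

Each panel of the resulting plot has tau on the horizontal axis, not a covariate, which is the sense in which the display is flexible in the y-direction while the covariate effects stay linear.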


url:    www.econ.uiuc.edu/~roger            Roger Koenker
email    [hidden email]            Department of Economics
vox:     217-333-4558                University of Illinois
fax:       217-244-6678                Champaign, IL 61820


On Dec 20, 2006, at 4:17 AM, Mark Difford wrote:

> Question:
> Is there a difference between these two types of plot in the manner
> in which they depict the relationship between explanatory variables
> and the response variable?


Ravi Varadhan
Dear Roger,

Is it possible to combine the two ideas that you mentioned: (1) algorithmic
approaches of Breiman, Friedman, and others that achieve flexibility in the
predictor space, and (2) robust and flexible regression like QR that achieves
flexibility in the response space, so as to achieve complete flexibility?
If it is possible, are you or anyone else in the R community working on
this?

Thanks,
Ravi.

-----------------------------------------------------------------------
Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619
Fax: (410) 614-9625
Email: [hidden email]
Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
-----------------------------------------------------------------------

-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of roger koenker
Sent: Wednesday, December 20, 2006 8:57 AM
To: Mark Difford
Cc: R-help list
Subject: Re: [R] RuleFit & quantreg: partial dependence plots; showing an
effect

They are entirely different: [...]

RKoenker


On Dec 20, 2006, at 8:43 AM, Ravi Varadhan wrote:

> Dear Roger,
>
> Is it possible to combine the two ideas that you mentioned: (1)
> algorithmic approaches of Breiman, Friedman, and others that achieve
> flexibility in the predictor space, and (2) robust and flexible
> regression like QR that achieves flexibility in the response space,
> so as to achieve complete flexibility? If it is possible, are you or
> anyone else in the R community working on this?
>
There are some tentative steps in this direction. One is the rqss()
fitting in my quantreg package, which does QR fitting with additive
models using total variation as a roughness penalty for nonlinear
terms. Another, along more tree-structured lines, is Nicolai
Meinshausen's quantregForest package.


Ravi Varadhan
Thanks, Roger.  These should be very useful tools.

Ravi.


-----Original Message-----
From: roger koenker [mailto:[hidden email]]
Sent: Wednesday, December 20, 2006 10:59 AM
To: Ravi Varadhan
Cc: 'Mark Difford'; 'R-help list'
Subject: Re: [R] RuleFit & quantreg: partial dependence plots; showing an
effect



There are some tentative steps in this direction. [...]

Mark Difford
Dear Professors Koenker and Varadhan,

Thank you for your detailed and engaging replies.  The (very) muddy waters clear slowly, but only if I keep moving my hands!

Kind regards,
Mark Difford.
 
Mark Difford
Ph.D. candidate, Botany Department,
Nelson Mandela Metropolitan University,
Port Elizabeth, SA.
