Package to remove collinear variables

Roberto
Hi,
I need to remove collinear variables from my table of near-infrared spectra.

Which package can I use?

Something simple, please, because I am a novice in statistics.

Thank you.

Best regards,
Roberto

Re: Package to remove collinear variables

Uwe Ligges-3


On 05.08.2012 05:27, Roberto wrote:
> Hi,
> I need to remove collinear variables from my table of near-infrared spectra.
>
> Which package can I use?
>
> Something simple, please, because I am a novice in statistics.


Remove those where

isTRUE(all.equal(cor(x, y), 1))

is TRUE?
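
A minimal sketch of that check applied to every pair of columns, assuming the spectra sit in a numeric matrix with one column per wavelength. The toy matrix X and its column names are made up for illustration. It uses abs() so that a perfect negative correlation is caught as well, and a small assumed tolerance in place of all.equal():

```r
# Hypothetical toy data: x3 is an exact linear function of x1.
set.seed(1)
X <- cbind(x1 = rnorm(20), x2 = rnorm(20))
X <- cbind(X, x3 = 2 * X[, "x1"] + 1)

R <- cor(X)

# Test each pair once (upper triangle only); the tolerance is an assumption.
hits <- which(upper.tri(R) & abs(R) > 1 - 1e-8, arr.ind = TRUE)

# Drop the later column of each perfectly correlated pair.
redundant <- unique(colnames(X)[hits[, "col"]])
X_reduced <- X[, setdiff(colnames(X), redundant), drop = FALSE]

print(redundant)
print(colnames(X_reduced))
```

Dropping the later column of each pair is an arbitrary choice made only to keep the sketch short.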

Uwe Ligges


> --
> View this message in context: http://r.789695.n4.nabble.com/Package-to-remove-collinear-variables-tp4639200.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Re: Package to remove collinear variables

Roberto
I am not sure. I tried the rfe function (backwards feature selection, caret package) to select wavelengths useful for a prediction model, but rfe returns a lot of warning messages about collinearity between variables.

So I do not know whether your script would help.
I also tried VIF regression to select variables, but rfe still gives me the same warning messages.

What do you think about that?

Thank you very much for your help.

Best,
Roberto

Re: Package to remove collinear variables

Jeff Newmiller
There is no "magic bullet" (package) for your problem. You must either learn enough statistics to understand how to analyze your data, or consult with someone who does.

FWIW, collinearity is not in general amenable to automatic removal. However, you can identify which inputs are collinear with each other and omit the redundant ones in the next iteration of your analysis, using (for example) the approach suggested by Uwe. Deciding WHICH of the redundant inputs is most appropriate to keep is the part computers are not so good at; that is where you must be smarter or more creative than the computer.
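
As a rough illustration of the "identify, then decide yourself" step, the sketch below lists the highly correlated pairs without dropping anything. Neighbouring wavelengths in spectra are often strongly but not perfectly correlated, so it uses a cutoff below 1. The toy data, the column names, and the 0.95 cutoff are all assumptions:

```r
# Hypothetical toy "spectra": three neighbouring wavelengths that share one
# underlying signal, plus one unrelated column.
set.seed(42)
signal <- rnorm(50)
spectra <- cbind(
  w1000 = signal + rnorm(50, sd = 0.01),
  w1002 = signal + rnorm(50, sd = 0.01),
  w1004 = signal + rnorm(50, sd = 0.01),
  w1500 = rnorm(50)
)

cutoff <- 0.95  # assumed threshold; choose it to suit the analysis
R <- cor(spectra)
hits <- which(upper.tri(R) & abs(R) > cutoff, arr.ind = TRUE)

# One row per highly correlated pair, for a human to review.
report <- data.frame(
  var1 = colnames(spectra)[hits[, "row"]],
  var2 = colnames(spectra)[hits[, "col"]],
  r    = R[hits],
  stringsAsFactors = FALSE
)
print(report)
```

The report only says which columns move together; which member of each group to keep is still the analyst's decision.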

Also, it would help you get responses if you included the context (earlier discussion) in your replies; most people here do not use Nabble. Reading and following the requests in the footer of every message will also help.
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<[hidden email]>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.

Re: Package to remove collinear variables

Roberto
Hi,
thank you for your help. I know that I need to learn enough statistics to
understand how to process my data. The reason I am writing to this
forum is to ask people how I can learn it.
I am a postharvest researcher and statistics is not my main field, so I
try to do my best.

Do you know a book (or other literature) that can help me?

Thank you very much for your time and suggestions.

Best regards,
Roberto

Re: Package to remove collinear variables

Gabor Grothendieck
In reply to this post by Roberto
On Sat, Aug 4, 2012 at 11:27 PM, Roberto <[hidden email]> wrote:
> Hi,
> I need to remove collinear variables from my table of near-infrared spectra.
>
> Which package can I use?
>
> Something simple, please, because I am a novice in statistics.
>


There are many methods of assessing multicollinearity, but to pick one
that has a good help page, try vif in the HH package. (There are also
other packages that have implemented vif or variations of it.)
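
To show what a variance inflation factor measures, here is a base R sketch of the textbook definition, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on all the other predictors. The helper vif_sketch and the toy data are hypothetical and are not the HH implementation:

```r
# Hypothetical helper (not the HH implementation): VIF_j = 1 / (1 - R_j^2),
# where R_j^2 comes from regressing column j on all the other columns.
vif_sketch <- function(X) {
  X <- as.data.frame(X)
  sapply(names(X), function(v) {
    fit <- lm(reformulate(setdiff(names(X), v), response = v), data = X)
    1 / (1 - summary(fit)$r.squared)
  })
}

# Toy predictors: d is nearly a linear combination of a and b.
set.seed(7)
a <- rnorm(100)
b <- rnorm(100)
d <- a + b + rnorm(100, sd = 0.1)

vifs <- vif_sketch(data.frame(a = a, b = b, d = d))
print(round(vifs, 1))
```

A common rule of thumb treats values above about 10 as a sign of trouble. Note that this definition needs more observations than predictors, which a table of spectra with many wavelengths may not provide.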


--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

Re: Package to remove collinear variables

John C Frain
In reply to this post by Roberto
For background have a look at http://en.wikipedia.org/wiki/Multicollinearity.

I have also used

Regression Diagnostics: Identifying Influential Data and Sources of
Collinearity (Wiley Series in Probability and Statistics) by David A.
Belsley, Edwin Kuh and Roy E. Welsch

Sections 1.9 to 1.12 of
Hands-On Intermediate Econometrics Using R: Templates for Extending
Dozens of Practical Examples [With CDROM] by Hrishikesh D. Vinod
(2008)

Basically how you proceed depends a lot on what you are trying to achieve.
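
For a flavour of the diagnostics discussed in the Belsley, Kuh and Welsch book, the sketch below computes condition indices: each column of the model matrix (intercept included) is scaled to unit length, and the largest singular value is divided by each singular value in turn. The function name condition_indices and the toy data are assumptions made for illustration:

```r
# Sketch of Belsley-style condition indices: scale each column of the model
# matrix (intercept included) to unit length, then compare singular values.
condition_indices <- function(X) {
  M <- cbind(Intercept = 1, as.matrix(X))
  M <- sweep(M, 2, sqrt(colSums(M^2)), "/")
  d <- svd(M)$d
  max(d) / d
}

# Toy predictors with a hypothetical near-dependency: w is almost u - v.
set.seed(3)
u <- rnorm(60)
v <- rnorm(60)
w <- u - v + rnorm(60, sd = 0.01)

ci <- condition_indices(cbind(u, v, w))
print(round(ci, 1))
```

Indices above roughly 30 are often read as a sign of a strong near-dependency among the columns; whether that matters depends, as said above, on what the analysis is for.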

Best Regards John

--
John C Frain
Economics Department
Trinity College Dublin
Dublin 2
Ireland
www.tcd.ie/Economics/staff/frainj/home.html
mailto:[hidden email]
mailto:[hidden email]

Re: Package to remove collinear variables

Roberto
Thank you very much for your help.
Really appreciated.

Best regards,
Roberto

--
Roberto Moscetti
PhD Student
University of Tuscia
Viterbo, Italy
----------------
Mobile +39 346 8041267
Phone  +39 0761 357415
