Fitting data and removing outliers

Fitting data and removing outliers

ccm
CONTENTS DELETED
The author has deleted this message.

Re: Fitting data and removing outliers

David Carlson
I didn't actually see any question in this posting, but instead of removing the outliers, consider using a robust linear model.

library(MASS)
?rlm
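
A minimal sketch of what that could look like, on simulated data. The data, the seed, and the 0.3 weight cutoff are assumptions for illustration and do not come from the thread. rlm() fits by iterated re-weighted least squares, so stray points are downweighted rather than deleted, and the final weights point to observations worth a closer look:

```r
library(MASS)  # provides rlm()

set.seed(42)
# Simulated data (an assumption for illustration): true line y = 2 + 3x
x <- seq(1, 20, length.out = 40)
y <- 2 + 3 * x + rnorm(40, sd = 1)

# Shift a few points so they stay inside the overall range of y
# but sit well off the line, as described in the original question
bad <- c(5, 18, 30)
y[bad] <- y[bad] + c(25, 20, -20)

ols_fit    <- lm(y ~ x)   # ordinary least squares, pulled by the shifted points
robust_fit <- rlm(y ~ x)  # Huber M-estimation is the rlm() default

print(coef(ols_fit))
print(coef(robust_fit))

# Final weights from the re-weighting iterations; a small weight means
# the point was largely ignored by the robust fit
w <- robust_fit$w
flagged <- which(w < 0.3)  # 0.3 is an arbitrary cutoff for this sketch
print(flagged)
```

The flagged indices are candidates for inspection, not an automatic deletion list.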

The TeachingDemos package has a data set called outliers to show what can happen when you iteratively remove "outliers" in the way you suggest.
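
To illustrate the risk without the TeachingDemos data, here is a sketch on simulated data that contains no real outliers at all. The cutoff of 2 on the standardized residuals and the cap of 20 rounds are hypothetical choices for illustration:

```r
set.seed(1)
# Simulated, well-behaved data with no true outliers (an assumption
# for illustration; this is not the TeachingDemos data set)
n <- 200
x <- runif(n, 0, 10)
y <- 1 + 0.5 * x + rnorm(n, sd = 2)
dat <- data.frame(x = x, y = y)

n_kept <- integer(0)  # number of points used in each round
sigma  <- numeric(0)  # residual standard error in each round

for (round in 0:20) {  # cap the number of rounds as a safeguard
  fit <- lm(y ~ x, data = dat)
  n_kept <- c(n_kept, nrow(dat))
  sigma  <- c(sigma, summary(fit)$sigma)

  # Hypothetical rule: flag standardized residuals larger than 2
  out <- abs(rstandard(fit)) > 2
  if (!any(out)) break
  dat <- dat[!out, , drop = FALSE]  # drop flagged points and refit
}

history <- data.frame(round = seq_along(n_kept) - 1, n = n_kept, sigma = sigma)
print(history)
```

Every round that flags something discards perfectly ordinary points, and the residual standard error shrinks with each pass, so the final fit looks tighter than the data justify.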

-------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352


----- Original Message -----

From: "Lauren Vogric" <[hidden email]>
To: [hidden email]
Sent: Friday, July 13, 2012 1:36:43 PM
Subject: [R] Fitting data and removing outliers

What I'm trying to do is create a best-fit line in R for a set of data points and then remove all the outliers to re-create the best fit. I can't use IQR because the outliers I have in mind are easily within the range, but way out of line for the best fit, which is ruining the fit. I'd rather throw out those points altogether.

Thanks!

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html 
and provide commented, minimal, self-contained, reproducible code.


Re: Fitting data and removing outliers

ssefick
Do you have a good reason to throw these points out?


--
Stephen Sefick
**************************************************
Auburn University
Biological Sciences
331 Funchess Hall
Auburn, Alabama
36849
**************************************************
[hidden email]
http://www.auburn.edu/~sas0025
**************************************************

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods.  We are mammals, and have not exhausted the
annoying little problems of being mammals.

                                -K. Mullis

"A big computer, a complex algorithm and a long time does not equal science."

                              -Robert Gentleman


Re: Fitting data and removing outliers

ssefick
Are they due to measurement error, a sample from a different population, or
... ?  What is the unusual event?  Does it explain something important
about the system that you are working on?  I am not telling you not to
do what you are doing, but just writing down things that I consider when I am
doing regression modelling.
FWIW,

Stephen

On 07/13/2012 02:26 PM, Lauren Vogric wrote:

> Yes, they are unusual events that occurred that affected my data. They have no positive effect in shaping a strong model.


Re: Fitting data and removing outliers

David Carlson
If not linear, then perhaps nlrob() in package robustbase.
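
A rough sketch of the nonlinear case, assuming the robustbase package is installed. The exponential-decay model, the simulated data, and the starting values are assumptions for illustration only:

```r
library(robustbase)  # provides nlrob()

set.seed(7)
# Simulated data (an assumption): y = a * exp(-b * x) with a = 5, b = 0.4
x <- seq(0, 10, length.out = 50)
y <- 5 * exp(-0.4 * x) + rnorm(50, sd = 0.1)

# Two points shifted well off the curve
y[c(10, 25)] <- y[c(10, 25)] + 2
dat <- data.frame(x = x, y = y)

start <- c(a = 4, b = 0.3)  # rough starting values for both fits

# Ordinary nonlinear least squares, for comparison
nls_fit <- nls(y ~ a * exp(-b * x), data = dat, start = start)

# Robust nonlinear fit; the shifted points are downweighted, not removed
rob_fit <- nlrob(y ~ a * exp(-b * x), data = dat, start = start)

print(coef(nls_fit))
print(coef(rob_fit))
```

As with rlm(), all observations stay in the data set and the fitting procedure decides how much weight each one gets.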
