|
Hi,
I am new to R and I would like your help in finding 'outliers'. I have the mvoutlier package installed on my system and have loaded it, but I am not able to find a function in 'mvoutlier' that will identify 'outliers'.

This is the sample list of data I have, which contains one outlier:

11489 11008 11873 80000000 9558 8645 8024 8371

It would be a great help if somebody has an example script for this.

Thanks & Regards,
Thomas

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
|
Hi
I have not seen any answer yet, but maybe nobody wants to touch the elusive subject of "outliers". Neither do I, but here are some ideas on how one can proceed.

First of all, it is always up to you what is considered an outlier and how you will deal with it.

I usually call an outlier any item which does not fit the pattern, and the pattern is usually best observed with some plotting function. You can identify outlying points, inspect the data source and correct typing mistakes. Only if the value was really measured and you cannot find any reason why it has such a value is it a real outlier. Then ***you*** need to decide what to do with it: discard it, or accept that it may come from some long-tailed distribution, ...

So here are my 0.02$ on the outlier theme.

Regards
Petr
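As a rough illustration of the "look at the pattern first" advice above, here is a minimal sketch in base R using the sample values from the original post. The variable names (`dat`, `suspect`) and the choice of a log-scale index plot are assumptions made for illustration, not something prescribed in the thread.

```r
# Sample values from the original post
dat <- c(11489, 11008, 11873, 80000000, 9558, 8645, 8024, 8371)

# Index plot; a log-scale y axis keeps the ordinary values distinguishable
# when one value is several orders of magnitude larger than the rest
plot(dat, log = "y", xlab = "Index", ylab = "Value (log scale)")

# Sorting shows where the unusual value sits relative to the others
sort(dat)

# Position of the largest value, so it can be traced back to the data source
suspect <- which.max(dat)
dat[suspect]
```

The plot only draws attention to the value; whether it is a typing mistake or a genuine measurement still has to be checked against the source.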
|
Petr et al.:
FWIW (probably not much). As you know, tens of thousands of pages about "outliers" have been written by statisticians. IMHO, it is another of the really terrible ideas of our discipline and has led to much scientific abuse, as indicated by this posting. For this reason, I have eliminated it from my vocabulary, using instead "unusual" or "unexpected" values, whose meaning and purpose are pretty much as you described -- to bring the user's attention to data issues that may require investigation and intervention. Eliminating the term, I feel, excises the notion that there can somehow be statistical tests (alone) that can, irrespective of scientific context, statistically identify "illegitimate" data. A really dangerous and pernicious idea, IMHO.

Best,
Bert

On Thu, May 17, 2012 at 6:44 AM, Petr PIKAL <[hidden email]> wrote:
> [...]

--
Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
|
In reply to this post by Prakash Thomas
Petr and Bert offer sound advice. At the risk of getting completely ostracized, here's how you could find outliers using the definition of 'outlier' used by R's boxplot function and at the same time see your data.

> dat = c(11489, 11008, 11873, 80000000, 9558, 8645, 8024, 8371)
> a = boxplot(dat)
> a$out
[1] 8e+07

-----Original Message-----
From: Prakash Thomas
Sent: Tuesday, May 15, 2012 7:00 AM
To: [hidden email]
Subject: [R] how to find outliers from the list of values

[...]

------------------------------------------
Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A. T. Still University of Health Sciences
800 W. Jefferson St.
Kirksville, MO 63501
660-626-2322
FAX 660-626-2965
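For readers who want the flagged values without drawing the plot, the same boxplot rule can be sketched with `boxplot.stats()` from base R (package grDevices), which `boxplot()` relies on by default. The variable names below are illustrative assumptions; the multiplier 1.5 is the default `coef`, and the fences are computed from the hinges returned by `fivenum()`.

```r
# Sample values from the original post
dat <- c(11489, 11008, 11873, 80000000, 9558, 8645, 8024, 8371)

# Same rule as boxplot(), without drawing anything
bs <- boxplot.stats(dat, coef = 1.5)
flagged <- bs$out                       # values beyond the whiskers
flagged_idx <- which(dat %in% flagged)  # their positions in dat

# The rule spelled out: fences at 1.5 times the distance between the hinges
h <- fivenum(dat)                       # min, lower hinge, median, upper hinge, max
spread <- h[4] - h[2]
lower_fence <- h[2] - 1.5 * spread
upper_fence <- h[4] + 1.5 * spread
manual <- dat[dat < lower_fence | dat > upper_fence]
```

With only eight observations the fences are rough. In line with the advice earlier in the thread, the rule flags values for inspection; it does not decide whether they are wrong.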
|
Thank you Robert, Petr and Bert.
I know my question is a generic one, which does not specify the range of values that counts as an "unusual" value. I started with R a week ago and am still trying to learn it.

Can somebody tell me how to set the range (say >7000000) for the outlier value?

Thanks & Regards,
Prakash

On Thu, May 17, 2012 at 9:38 PM, Robert Baer <[hidden email]> wrote:
> [...]
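One way to apply a fixed cutoff like the one asked about is plain logical indexing in base R. This is only a minimal sketch, assuming the threshold 7000000 mentioned in the question; the names `threshold`, `is_unusual` and `kept` are made up for illustration.

```r
# Sample values from the original post
dat <- c(11489, 11008, 11873, 80000000, 9558, 8645, 8024, 8371)

threshold <- 7000000           # cutoff chosen by the analyst, not by a statistical test

is_unusual <- dat > threshold  # logical vector, TRUE where a value exceeds the cutoff
which(is_unusual)              # positions of the flagged values
dat[is_unusual]                # the flagged values themselves

kept <- dat[!is_unusual]       # remaining values, if the flagged ones are set aside
```

Whether a flagged value should actually be dropped is still a judgment call that depends on where the data came from.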
