# drop levels problem


## drop levels problem

Hi all:

I am having trouble dropping levels; I got a few hints online, without success. Please consider the dataset below. I was under the impression that subset(......drop=TRUE) would work, but it doesn't.

    library(ggplot2)
    library(Hmisc)      # package names are case-sensitive: Hmisc, not hmisc
    library(reshape2)   # provides melt(), which current ggplot2 versions do not attach

    x <- structure(list(first = c(38.2086, 43.1768, 43.146, 41.8044, 42.4232,
    46.3646, 38.0813, 40.0745, 40.4889, 38.6246, 40.2826, 41.6056,
    34.5353, 40.0768), second = c(43.3295, 42.4326, 38.8994, 37.0894,
    42.3218, 46.1726, 39.1206, 41.2072, 42.4874, 40.2657, 38.7766,
    40.8822, 42.0165, 49.2055), third = c(42.24, 42.992, 37.7419,
    42.3448, 41.9131, 44.385, 42.7811, 44.1963, 40.8088, 43.9634,
    38.7079, 38.0791, 44.3136, 39.5333)), .Names = c("first", "second",
    "third"), class = "data.frame", row.names = c(NA, -14L))

    head(x); str(x)
    xmelt <- melt(x)
    names(xmelt) <- c("year", "fatPerc")

    # Year variable is a factor with three levels
    # Subset to plot only 'first' year
    firstyear <- subset(xmelt, year == 'first'); str(firstyear)
    # Plot still shows three levels after I made the subset
    ggplot(firstyear, aes(year, fatPerc)) + geom_boxplot() + geom_jitter()

    # Try to drop the levels, but dropUnusedLevels() doesn't seem to work here
    dropUnusedLevels()
    ggplot(firstyear, aes(year, fatPerc)) + geom_boxplot() + geom_jitter()

    # code below also should drop levels but it doesn't
    #data.frame(lapply(firstyear, function(x) if (is.factor(x)){ factor(x)} else{x}))
    str(firstyear)

Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

## Re: drop levels problem

Hi Felipe,

On Mon, Nov 29, 2010 at 11:01 AM, Felipe Carrillo <[hidden email]> wrote:
> Hi all:
> I am having trouble dropping levels, got a few hints online without success.
> Please consider the dataset below:
> I was under the impression that subset(......drop=TRUE) would work but it
> doesn't

Here drop is referring to:

    data.frame(1:10)[, 1]
    data.frame(1:10)[, 1, drop = FALSE]

not to levels of a factor.

>     library(ggplot2)
>     library(Hmisc)
>
>     x <- structure(list(first = c(38.2086, 43.1768, 43.146, 41.8044, 42.4232,
>     [snip]

Thanks for the nice example!

>     head(x);str(x)
>     xmelt <- melt(x)
>     names(xmelt) <- c("year","fatPerc")
>
>     # Year variable is a factor with three levels
>     # Subset to plot only 'first' year
>     firstyear <- subset(xmelt,year=='first');str(firstyear)
>     # Plot showing three levels still after I made the subset
>     ggplot(firstyear,aes(year,fatPerc)) + geom_boxplot() + geom_jitter()

Right, because it is possible to have levels of a factor that have no observations. Sometimes these are the most interesting (e.g., if you subset by smoking and found that there were no instances of lung cancer in non-smokers; not that extreme, but you get the point).

>     # Try to drop the levels but dropUnusedLevels() doesn't seem to work here
>     dropUnusedLevels()

Sorry, I have had some difficulty installing Hmisc on my Linux system and never got around to working it out.

>     ggplot(firstyear,aes(year,fatPerc)) + geom_boxplot() + geom_jitter()
>
>     # code below also should drop levels but it doesn't
>     #data.frame(lapply(firstyear, function(x) if (is.factor(x)){ factor(x)}
>     else{x}))

It would if you assigned it back to firstyear. As written, the result is only printed to the screen and the changed data goes off to oblivion.

    firstyear <- data.frame(lapply(firstyear, function(x) if (is.factor(x)) {factor(x)} else {x}))
    str(firstyear) # should now just have one level

Cheers,

Josh

-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
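
The two points in this reply (what `drop` means for a data frame, and why the `lapply()` result has to be assigned back) can be checked with a minimal base-R sketch. The data frame `toy` and its values are made up for illustration and are not the data from the thread.

```r
# Toy data (hypothetical); 'year' is a factor with three levels, as in the post
toy <- data.frame(
  year    = factor(rep(c("first", "second", "third"), each = 2)),
  fatPerc = c(38.2, 43.2, 43.3, 42.4, 42.2, 43.0)
)

# subset() keeps only the requested rows, but the factor keeps all its levels
firstyear <- subset(toy, year == "first")
nlevels(firstyear$year)                  # still 3

# For a data frame, 'drop' in '[' controls dimension dropping, not factor levels
is.data.frame(toy[, "fatPerc"])                 # FALSE: collapsed to a vector
is.data.frame(toy[, "fatPerc", drop = FALSE])   # TRUE: still a data frame

# Re-creating each factor column with factor() drops the unused levels,
# but only takes effect because the result is assigned back to firstyear
firstyear <- data.frame(lapply(firstyear, function(col) {
  if (is.factor(col)) factor(col) else col
}))
levels(firstyear$year)                   # "first"
```

Without the assignment on the last step, `firstyear` would be left unchanged and a later `str(firstyear)` would still report three levels.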

## Re: drop levels problem

Take a look at the droplevels() function (R >= 2.12).

On Mon, Nov 29, 2010 at 5:01 PM, Felipe Carrillo <[hidden email]> wrote:
> Hi all:
> I am having trouble dropping levels, got a few hints online without
> success.
> [snip]

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
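
As a rough illustration of the `droplevels()` suggestion, here is a short sketch on made-up data (`toy` is hypothetical, not the thread's data set). It assumes R 2.12.0 or later, where `droplevels()` has methods for both factors and data frames.

```r
# Toy data (hypothetical) with a three-level factor
toy <- data.frame(
  year    = factor(c("first", "first", "second", "third")),
  fatPerc = c(38.2, 43.2, 43.3, 42.2)
)

# On a data frame: drops unused levels from every factor column
firstyear <- droplevels(subset(toy, year == "first"))
levels(firstyear$year)        # "first"

# On a single factor: same idea
not_third <- droplevels(toy$year[toy$year != "third"])
levels(not_third)             # "first" "second"
```

Wrapping the `subset()` call this way means the plot only gets the levels that actually occur in the subset.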

## Re: drop levels problem

Thanks Joshua, I get it now, levels sometimes drive me loco...

Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA

----- Original Message ----
> From: Joshua Wiley <[hidden email]>
> To: Felipe Carrillo <[hidden email]>
> Cc: [hidden email]
> Sent: Mon, November 29, 2010 11:18:45 AM
> Subject: Re: [R] drop levels problem
>
> [snip]

## Re: drop levels problem

Just to follow up on my own post a bit:

    xmelt$year[xmelt$year == "first", drop = TRUE]

will do what you want. I think that because the subset has multiple columns, not all of which are factors, the '[' method being used is not the factor one that would drop unused levels. I did not make that clear at all the first time around (and probably still butchered it, so some knowledgeable soul may correct me). Also, I did get Hmisc installed, but I think dropUnusedLevels() does not work in this case for a similar reason.

Henrique's solution is, as usual, the shortest :)

Josh

[snip]
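
A small sketch of the difference described here, on a made-up factor (`year` and `toy` below are illustrative only). Indexing a factor goes through the factor method of `[`, where `drop = TRUE` removes unused levels; indexing the rows of a data frame goes through the data frame method, which leaves the levels of its columns alone.

```r
# Hypothetical three-level factor
year <- factor(c("first", "first", "second", "third"))

kept    <- year[year == "first"]                # factor method, default drop = FALSE
dropped <- year[year == "first", drop = TRUE]   # factor method, unused levels removed
nlevels(kept)      # 3
nlevels(dropped)   # 1

# Row-subsetting a data frame does not touch the levels of its factor columns
toy <- data.frame(year = year, fatPerc = c(38.2, 43.2, 43.3, 42.2))
firstyear <- toy[toy$year == "first", ]
nlevels(firstyear$year)    # 3

# One way to fix a single column afterwards: rebuild it and assign it back
firstyear$year <- factor(firstyear$year)
nlevels(firstyear$year)    # 1
```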