How to select a subset data to do a barplot in ggplot2


Yao He
Hi everybody,

I have a data frame like this:

FID   IID    STATUS
1     4621   live
1     4628   dead
2     4631   live
2     4632   live
2     4633   live
2     4634   live
6     4675   live
6     4679   dead
10    4716   dead
10    4719   live
10    4721   dead
11    4726   live
11    4728   nosperm
11    4730   nosperm
12    4732   live
17    4783   live
17    4783   live
17    4784   live

I just want a barplot that counts "live" and "dead" within every FID, with
the bars filled in a different colour for each status.

I tried this code:

p <- ggplot(data, aes(x = FID))
p + geom_bar(aes(x = factor(FID), y = ..count.., fill = STATUS))

But how can I exclude "nosperm" (or other levels) within the ggplot2 call
itself, without creating another data frame?

Thanks a lot

Yao He
—————————————————————————
Master candidate in 2nd year
Department of Animal Genetics & Breeding
Room 436, College of Animal Science & Technology,
China Agriculture University,Beijing,100193
E-mail: [hidden email] <[hidden email]>
——————————————————————————

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: How to select a subset data to do a barplot in ggplot2

arun kirshna
Hi,

Maybe this:

p <- ggplot(subset(dat1, STATUS != "nosperm"), aes(x = FID))
p + geom_bar(aes(x = factor(FID), y = ..count.., fill = STATUS))

A.K.
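
A variant of the subset() idea that creates no named copy of the data: a ggplot2 layer's data argument can be a function, which is then called with the plot's data. The following is a minimal sketch, assuming a ggplot2 version that supports function-valued layer data. The name dat1 is taken from the reply above, and the IID column is left out for brevity.

```r
library(ggplot2)

# Cut-down copy of the posted data (FID and STATUS only).
dat1 <- data.frame(
  FID = c(1, 1, 2, 2, 2, 2, 6, 6, 10, 10, 10, 11, 11, 11, 12, 17, 17, 17),
  STATUS = factor(c("live", "dead", "live", "live", "live", "live",
                    "live", "dead", "dead", "live", "dead", "live",
                    "nosperm", "nosperm", "live", "live", "live", "live"))
)

# The layer filters the plot data itself; dat1 is left unchanged.
p <- ggplot(dat1, aes(x = factor(FID), fill = STATUS)) +
  geom_bar(data = function(d) subset(d, STATUS != "nosperm"))
p
```

The filter applies to this layer only, so any other layer added to p would still see all 18 rows unless it is given the same function.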





Re: How to select a subset data to do a barplot in ggplot2

djmuseR
Hi:

The simplest way to do it is to modify the input data frame by taking
out the records not having status live or dead and then redefining the
factor in the new data frame to get rid of the removed levels. Calling
your input data frame DF rather than data,

DF <- structure(list(FID = c(1L, 1L, 2L, 2L, 2L, 2L, 6L, 6L, 10L, 10L,
10L, 11L, 11L, 11L, 12L, 17L, 17L, 17L), IID = c(4621L, 4628L,
4631L, 4632L, 4633L, 4634L, 4675L, 4679L, 4716L, 4719L, 4721L,
4726L, 4728L, 4730L, 4732L, 4783L, 4783L, 4784L), STATUS = structure(c(2L,
1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 3L, 3L, 2L, 2L, 2L,
2L), .Label = c("dead", "live", "nosperm"), class = "factor")),
.Names = c("FID", "IID", "STATUS"), class = "data.frame",
row.names = c(NA, -18L))

# The right hand side above came from dput(DF), where DF was created by
# DF <- read.table(textConnection("<your posted data>"), header = TRUE)
# Consider using dput() to represent your data in the future.

# Retain the records with status live or dead only
DF2 <- DF[DF$STATUS %in% c("live", "dead"), ]

# This does not get rid of the original levels...
levels(DF2$STATUS)
# ...so redefine the factor
DF2$STATUS <- factor(DF2$STATUS)

> str(DF2)
'data.frame':   16 obs. of  3 variables:
 $ FID   : int  1 1 2 2 2 2 6 6 10 10 ...
 $ IID   : int  4621 4628 4631 4632 4633 4634 4675 4679 4716 4719 ...
 $ STATUS: Factor w/ 2 levels "dead","live": 2 1 2 2 2 2 2 1 1 2 ...
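
The filter-then-redefine steps above can also be collapsed into one expression with base R's droplevels(), which discards factor levels that no longer occur. This is a small sketch rather than a replacement for the code above; DF is rebuilt here from the posted FID and STATUS columns only, so that the block runs on its own.

```r
# Rebuild a reduced version of the posted data (IID omitted).
DF <- data.frame(
  FID = c(1L, 1L, 2L, 2L, 2L, 2L, 6L, 6L, 10L, 10L, 10L,
          11L, 11L, 11L, 12L, 17L, 17L, 17L),
  STATUS = factor(c("live", "dead", "live", "live", "live", "live",
                    "live", "dead", "dead", "live", "dead", "live",
                    "nosperm", "nosperm", "live", "live", "live", "live"))
)

# Keep live/dead rows and drop the now-unused "nosperm" level in one step.
DF2 <- droplevels(DF[DF$STATUS %in% c("live", "dead"), ])

levels(DF2$STATUS)
nrow(DF2)
```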

# now plot:

# (1) FID numeric
ggplot(DF2, aes(x = FID, fill = STATUS)) + geom_bar()

# (2) FID factor
ggplot(DF2, aes(x = factor(FID), fill = STATUS)) + geom_bar()

The second one makes more sense to me, but you may have reasons to
prefer the first.
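
To check the bar heights against the data, the same counts can be tabulated directly with base R's table(). A quick sketch, with the reduced data frame rebuilt from the posted FID and STATUS columns (the name counts is only illustrative):

```r
# Reduced copy of the posted data (IID omitted).
DF <- data.frame(
  FID = c(1L, 1L, 2L, 2L, 2L, 2L, 6L, 6L, 10L, 10L, 10L,
          11L, 11L, 11L, 12L, 17L, 17L, 17L),
  STATUS = factor(c("live", "dead", "live", "live", "live", "live",
                    "live", "dead", "dead", "live", "dead", "live",
                    "nosperm", "nosperm", "live", "live", "live", "live"))
)
DF2 <- droplevels(DF[DF$STATUS %in% c("live", "dead"), ])

# Rows are FID values, columns are the remaining STATUS levels;
# each cell is the height of one coloured bar segment.
counts <- table(FID = DF2$FID, STATUS = DF2$STATUS)
counts
```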

Dennis

On Thu, Dec 13, 2012 at 4:38 AM, Yao He <[hidden email]> wrote:

> FID IID STATUS
> 1    4621    live
> 1    4628    dead
> 2    4631    live
> 2    4632    live
> 2    4633    live
> 2    4634    live
> 6    4675    live
> 6    4679    dead
> 10    4716    dead
> 10    4719    live
> 10    4721    dead
> 11    4726    live
> 11    4728    nosperm
> 11    4730    nosperm
> 12    4732    live
> 17    4783    live
> 17    4783    live
> 17    4784    live
