Dear R Users,
I would like to know whether there is any package that can create a plot like this:

http://dl.dropbox.com/u/5409929/cs1160521f01.gif

The X axis is categorical, and the positions of the points correspond to the frequency (similar to a violin plot).

Thank you.

Regards,

CH

--
CH Chan

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Hi,
I don't understand what is being plotted here. Can you describe what you want in more detail?

Best,
Ista

On Tue, Mar 8, 2011 at 7:27 AM, C.H. <[hidden email]> wrote:
> I would like to know is there any package to create a plot like this?
>
> http://dl.dropbox.com/u/5409929/cs1160521f01.gif
> [...]

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org
Sorry for being ambiguous.
The data are a continuous variable (Y-axis) categorized into 3 groups (Controls, Depressed EF and Preserved EF; X-axis).

The band of dots on the plot is the data point. The density of the dots and the "fatness" of the band represent the frequency of a particular value on the Y-axis. This property is similar to the violin plot, which shows the probability density of the data at different values. Instead of showing an outline shape as a violin plot does, this plot shows the actual distribution of the data points.

Thank you.

Regards,

CH

On Tue, Mar 8, 2011 at 11:14 PM, Ista Zahn <[hidden email]> wrote:
> Hi,
> I don't understand what is being plotted here. Can you describe what
> you want in more detail?
> [...]

--
CH Chan
In reply to this post by Ista Zahn-2
I'm afraid I still don't get it (see inline below).

On Wed, Mar 9, 2011 at 1:49 AM, C.H. <[hidden email]> wrote:
> Sorry for being ambiguous.
>
> The data are some continuous variable (Y-axis) categorized into 3
> groups (Controls, Depressed EF and Preserved EF, X-axis).

OK. I'm having trouble way before this though. Even taking just one group, I still don't understand what this graphic is showing.

> The band of dots on the plot is the data point.

This is where I'm getting lost. You have a *band* of dots for each data point (singular)? My head is exploding thinking about this. Why do you have multiple dots for each point?

> The density of dots
> and the "fatness" of the band present the frequency of a particular
> value in Y-axis.

But the Y-axis variable is continuous. In order to have counts at different values of y you will need to bin it somehow. And even then you will still have only *one* value for the count in each bin. So again, how and why are there multiple dots in each band (your term) / bin (my term)?

> This property is similar to the violin plot: showing
> the probability density of the data at different values. Instead of
> showing a shape in violin plot, this plot shows the actual
> distribution of the data points.

OK, so let's say we bin y and get the counts in each bin. We can plot a point for each observation that goes into that count, but what determines the position of each point on the x-axis?

It's entirely possible (even probable) that my poor little brain just can't wrap itself around the idea. Maybe someone else can chime in and help me out.

Best,
Ista

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org
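The binning step raised in the message above can be sketched roughly as follows. This is only an illustration on simulated data: the rnorm() sample, the variable names, and the bin width of 0.5 are all assumptions and have nothing to do with the data behind the original figure.

```r
set.seed(1)
y <- rnorm(200)                    # simulated values, purely illustrative

# Bin the continuous variable; the bin width (0.5) is an arbitrary choice
breaks <- seq(floor(min(y)), ceiling(max(y)), by = 0.5)
bin    <- cut(y, breaks = breaks, include.lowest = TRUE)

# Exactly one count per bin
counts <- table(bin)
print(counts)

# Each observation belongs to exactly one bin, so a bin with a count of k
# can be drawn as k separate dots at (roughly) the same height.
```

Under this reading, the count fixes how many dots a bin holds, while the horizontal placement of those dots is a free layout choice (jittered, stacked, or spread symmetrically).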
Hi CH,
It sounds like what you are trying to do is plot continuous Y values at categorical X values and, rather than simply overplotting, spread out the points that would overlap along the X axis.

I have been working on implementing something like this in R based on an algorithm from Leland Wilkinson. However, it is not yet release ready and does not handle multiple groups (that is on my list down the line, but I have other problems to sort out, not the least of which is my own lack of knowledge). The attached PDF shows an example of what I mean; it is the "mpg" column from the "mtcars" dataset. If you want, you can get the source for my development code here:

http://joshuawiley.com/R.aspx

In any case, what about doing something like a jittered scatter plot + violin plot for now? The jittering will let you see the data points and the violin plot will get you the distribution. It should be relatively straightforward to do in ggplot2 or lattice.

Cheers,

Josh

On Tue, Mar 8, 2011 at 6:51 PM, Ista Zahn <[hidden email]> wrote:
> I'm afraid I still don't get it (see in line below).
> [...]

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

Attachment: stackeddotplot.pdf (6K)
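A rough sketch of the "jittered points plus violin outline" idea, using only base graphics rather than the ggplot2 or lattice route mentioned in the message. The choice of mtcars, grouping mpg by cyl, the outline half-width of 0.4, and the jitter amount of 0.1 are all assumptions made for illustration.

```r
# Split the response by group (built-in mtcars data, grouped by cyl)
groups <- split(mtcars$mpg, mtcars$cyl)

# Empty plot with one x position per group
plot(NULL, xlim = c(0.5, length(groups) + 0.5), ylim = range(mtcars$mpg),
     xaxt = "n", xlab = "cyl", ylab = "mpg")
axis(1, at = seq_along(groups), labels = names(groups))

for (i in seq_along(groups)) {
  # Kernel density restricted to the observed range (cut = 0)
  d <- density(groups[[i]], cut = 0)
  w <- 0.4 * d$y / max(d$y)        # scale the half-width to at most 0.4

  # Mirror the density around x = i to get a violin-like outline
  polygon(c(i - w, rev(i + w)), c(d$x, rev(d$x)), border = "grey40")

  # Raw data on top, jittered horizontally so ties remain visible
  points(jitter(rep(i, length(groups[[i]])), amount = 0.1), groups[[i]],
         pch = 16, cex = 0.7)
}
```

The outline conveys the estimated density while the jittered dots keep every observation visible, which is close in spirit to the requested figure, although the dots are placed randomly rather than in neat rows.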
In reply to this post by Ista Zahn-2
On Mar 8, 2011, at 9:51 PM, Ista Zahn wrote:
> [...]
> OK, so let's say we bin y and get the counts in each bin. We can plot
> a point for each point that goes into that count, but what determines
> the position of each point on the x-axis?
>
> It's entirely possible (even probable) that my poor little brain just
> can't wrap itself around the idea. Maybe someone else can chime in and
> help me out.

I was thinking that one might use jitter on the x coordinate, which is going to be a discrete value; if it's a factor, then:

    plot(jitter(as.numeric(...)), y)

It's not going to be in neat bands at each pseudo-continuous level. I have a fair amount of data that has values rounded to the nearest tenth, so my plots might look like the example graphic offered, if there were such a function.

(I have avoided taking on this challenge so far because there was no example data.)

    y <- round(rnorm(300), digits=1)
    plot(jitter(rep(1,300), 5), y, xlim=c(0,2))

Or I suppose you could table() the y values, set up a blank plot with ylim set to range(y), and then "walk through" the entries with the points() function.

--
David.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
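The jitter-on-a-factor suggestion might look like this for several groups. The group labels are borrowed from the thread, but the y values are simulated and the jitter amount of 0.15 is an arbitrary assumption.

```r
set.seed(42)   # simulated data; only the group names come from the thread
grp <- factor(rep(c("Controls", "Depressed EF", "Preserved EF"), each = 100))
y   <- round(rnorm(300), digits = 1)

# Convert the factor to its integer codes (1, 2, 3) and jitter those
xj <- jitter(as.numeric(grp), amount = 0.15)

plot(xj, y, xaxt = "n", xlab = "", xlim = c(0.5, 3.5), cex = 0.7)
axis(1, at = seq_along(levels(grp)), labels = levels(grp))
```

Base R's stripchart(y ~ grp, method = "jitter", vertical = TRUE) gives a similar picture in one call. Either way the horizontal positions are random, so the bands will not be as tidy as in the example graphic.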
On Mar 8, 2011, at 10:19 PM, David Winsemius wrote:
> [...]
> Or I suppose you could table() the y values, set up a blank plot
> with ylim set to range(y), and then "walk through" the entries with
> the points() function.

    x <- round(rnorm(300), digits=1)
    tab <- table(x)                  # count of each distinct (rounded) value
    plot(NULL, type="n", xlim=c(0,2), ylim=range(x))
    # one row of points per distinct value, spaced 0.02 apart and
    # roughly centred on the horizontal position 1
    points(x = unlist(lapply(tab, function(z)
                 seq(1.05 - z/100, length.out=z, by=0.02))),
           y = rep(as.numeric(names(tab)), tab),
           cex = 0.8)

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
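One way the table-and-walk idea above could be extended to several groups is sketched below. The helper stack_x() is hypothetical (it is not from the thread or from any package), and the data, the step of 0.02, and the group labels on the axis are assumptions for illustration.

```r
# Hypothetical helper: for one group, return x positions so that tied
# y values sit side by side, centred on the group's position `at`.
stack_x <- function(y, at = 1, step = 0.02) {
  rank_in_tie <- ave(seq_along(y), y, FUN = seq_along)  # 1, 2, ... within each tied value
  n_in_tie    <- ave(seq_along(y), y, FUN = length)     # number of points sharing that value
  at + (rank_in_tie - (n_in_tie + 1) / 2) * step
}

set.seed(1)
grp <- rep(1:3, each = 100)                 # three simulated groups
y   <- round(rnorm(300), digits = 1)        # rounding creates ties

# Compute positions group by group
x <- numeric(length(y))
for (g in unique(grp)) x[grp == g] <- stack_x(y[grp == g], at = g)

plot(x, y, xaxt = "n", xlab = "", xlim = c(0.5, 3.5), cex = 0.6)
axis(1, at = 1:3, labels = c("Controls", "Depressed EF", "Preserved EF"))
```

Because positions are derived from tie counts rather than random jitter, the width of each row reflects the frequency of that value directly. This only works when y has repeated values, for example after rounding or binning.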
In reply to this post by C.H.
On 03/08/2011 11:27 PM, C.H. wrote:
> Dear R Users,
>
> I would like to know is there any package to create a plot like this?
>
> http://dl.dropbox.com/u/5409929/cs1160521f01.gif
>
> X axis is categorical. And the positions of the points are
> corresponding to the frequency. (similar to violinplot)
>
Hi CH,
This looks like it uses the same method as in the function brkdn.plot in the plotrix package. What this does is to offset each value in a particular stack by a small amount to the right, then twice that amount to the left, then three times that amount to the right, and so on. This would spread your data points out for each bin of values. The plot looks as though the data points in each bin are sorted before spreading, so that the "branches" in each bin slant upward. Let's see:

    x <- list(runif(90), runif(100), runif(80))

    dendroPlot <- function(x, breaks=NA, nudge=NA) {
      allx <- unlist(x)
      if(is.na(breaks[1]))
        breaks <- seq(min(allx, na.rm=TRUE), max(allx, na.rm=TRUE),
                      length.out=10)
      plot(c(0, length(x)+1), range(allx, na.rm=TRUE), type="n")
      if(is.na(nudge)) nudge <- strwidth("o")/2
      for(list_element in seq_along(x)) {
        # include.lowest=TRUE so the minimum value is not dropped
        binvar <- cut(x[[list_element]], breaks=breaks, include.lowest=TRUE)
        for(bin in seq_along(levels(binvar))) {
          thisbin <- which(as.numeric(binvar) == bin)
          if(length(thisbin) == 0) next      # skip empty bins
          offset <- (seq_along(thisbin) - 1) * nudge
          # flip every second offset to the other side
          evens <- seq_along(offset) %% 2 == 0
          offset[evens] <- -offset[evens]
          points(list_element + offset, sort(x[[list_element]][thisbin]))
        }
      }
    }

    dendroPlot(x)

I think this might make it into plotrix. Let me know if it does what you want.

Jim
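The alternating offsets at the heart of the function above can be isolated in a few lines. The helper name alt_offsets() is hypothetical and exists only for this illustration; it follows the pattern the posted code produces (zero first, then one step to one side, two steps to the other, and so on).

```r
# Hypothetical helper: offsets 0, -1, +2, -3, ... times `nudge`
# for n points falling in one bin.
alt_offsets <- function(n, nudge = 1) {
  k <- seq_len(n) - 1                         # 0, 1, 2, ...
  direction <- ifelse(k %% 2 == 1, -1, 1)     # odd steps go left
  k * direction * nudge
}

alt_offsets(5)        # 0 -1  2 -3  4
alt_offsets(0)        # numeric(0): an empty bin yields no offsets
```

Pairing these offsets with the sorted values of a bin is what produces the upward-slanting "branches": the smallest value sits in the middle and later values fan out to alternating sides.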
Hello C.H.
I was also curious about this kind of plot a while ago. Your question today inspired me to collect everything I've come across in one post, which I published here:

http://www.r-statistics.com/2011/03/scatter-dot-beeswarm-box-violin-plot-and-plotting-it-with-r/

I hope it might help you or others in the future.

Best,
Tal

----------------Contact Details:----------------
Contact me: [hidden email] | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)
------------------------------------------------

On Wed, Mar 9, 2011 at 9:27 AM, Jim Lemon <[hidden email]> wrote:
> This looks like it uses the same method as in the function brkdn.plot in
> the plotrix package.
> [...]