Marc Schwartz (via MN) wrote:
> On Tue, 2005-12-13 at 10:53 +0000, Dan Bolser wrote:
>
>>Hi, I am plotting a distribution of (ordered) values as a barplot. I
>>would like to label groups of bars together to highlight aspects of the
>>distribution. The label for the group should be the range of values in
>>those bars.
>>
>>As this is hard to describe, here is an example;
>>
>>x <- rlnorm(50)*2
>>
>>barplot(sort(x,decreasing=T))
>>
>>y <- quantile(x, seq(0, 1, 0.2))
>>
>>y
>>
>>plot(diff(y))
>>
>>That last plot is to highlight that I want to label lots of the small
>>columns together, and have a few more labels for the bigger columns
>>(more densely labeled). I guess I will have to turn out my own labels
>>using low level plotting functions, but I am stumped as to how to
>>perform the calculation for label placement.
>>
>>I imagine drawing several line segments, one for each group of bars to
>>be labeled together, and putting the range under each line segment as
>>the label. Each line segment will sit under the group of bars that it
>>covers.
>>
>>Thanks for any help with the above!
>>
>>Cheers,
>>Dan.
>
> Dan,
>
> Here is a hint.
>
> barplot() returns the bar midpoints:
>
> mp <- barplot(sort(x, decreasing = TRUE))
>
> > head(mp)
>
>      [,1]
> [1,]  0.7
> [2,]  1.9
> [3,]  3.1
> [4,]  4.3
> [5,]  5.5
> [6,]  6.7
>
> There will be one value in 'mp' for each bar in your series.
>
> You can then use those values along the x axis to draw your line
> segments under the bars as you require, based upon the cut points you
> want to highlight.
>
> To get the center of a given group of bars, you can use:
>
> mean(mp[start:end])
>
> where 'start' and 'end' are the extreme bars in each of your groups.
>
> Two other things that might be helpful. See ?cut and ?hist, noting the
> output in the latter when 'plot = FALSE'.
>
> HTH,

Thanks all for help on this question, including those who emailed me off
list.

I went with the suggestion of Marc above, because I could follow through
how to implement the code (other more complete solutions were hard for
me to 'reverse engineer').

Here is my solution in full, which I feel gives rather nice output :)

## Approximate my data for you to try
x <- sort((runif(70)*100)^3, decreasing=TRUE)

## Plot the barplot
mp <-
  barplot(x,
          # Remove default label names
          names.arg=rep('',70)
          )

## Break data range, and count bars per break
my.hist <-
  hist(x, plot=FALSE,
       ## Pick the (approximate) number of labels
       ## NB: using quantiles is incorrect here
       breaks=4
       )

## Check for sanity
## points(mp[length(mp)],x[length(mp)],col=2)

## Counts become new 'breaks'
my.new.breaks <-
  my.hist$counts

## Some formatting stuff
my.names <-
  sprintf("%.1d",my.hist$breaks)

# Prepare to add labels (allow drawing outside the plot region)
op <- par(xpd=TRUE)

i <- length(mp) # Note we label from right to left
q <- 1          # Index of the lower break of the current group
for(j in my.new.breaks){
  st <- i       # Rightmost bar of the group
  en <- i-j+1   # Leftmost bar of the group
  ## Line under the group
  segments(mp[st],-50000,
           mp[en],-50000,lwd=2,col=2)
  ## Range label, centred under the line
  text(mean(mp[st:en]),-100000,pos=1,
       paste(paste(my.names[q],"-",sep=" "),
             my.names[q+1],sep="\n"),cex=0.6)
  ## Move on to the next group to the left
  i <- i-j
  q <- q+1
}

# Restore the previous graphics settings
par(op)

You should see that the density of labels corresponds to the range of
data (hopefully not too dense), giving more labels to regions of the
plot with bigger ranges.

> Marc Schwartz

Cheers,
Dan.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
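The step "Counts become new 'breaks'" relies on what hist() returns when
plot = FALSE. The following is a minimal sketch of that output, using a
small made-up vector (the values and the name 'bins' are illustrative
assumptions, not data from this thread):

```r
## Illustrative values only; not data from the thread
x <- sort(c(1, 2, 3, 12, 25, 38), decreasing = TRUE)

## Same call shape as in the solution above: no plot, about 4 breaks
h <- hist(x, plot = FALSE, breaks = 4)

## One row per value range: its limits and how many bars fall in it
bins <- data.frame(lower = h$breaks[-length(h$breaks)],
                   upper = h$breaks[-1],
                   bars  = h$counts)
print(bins)

## Every value lands in exactly one bin, so the counts add up to the
## number of bars
stopifnot(sum(bins$bars) == length(x))

## x is sorted in decreasing order, so the lowest bin corresponds to the
## rightmost run of bars in the barplot
tail(x, bins$bars[1])
```

Since the bars are sorted, each count is the width (in bars) of one
contiguous group, which is why the loop can walk from right to left
using the counts alone.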
Note that if I follow this correctly then you could remove the loop. In
particular note that

1. st is just the cumulative sum of my.new.breaks, but summed from the
   end:

   st <- rev(cumsum(rev(my.new.breaks)))

2. segments and text both take vector arguments, and

3. averaging over the groups can be done by defining a factor g whose
   levels are the groups using cut, and then performing the averaging
   with tapply:

   g <- cut(seq(mp), c(1, st), include.lowest = TRUE)
   tapply(mp, g, mean)

On 12/14/05, Dan Bolser <[hidden email]> wrote:
> [snip]
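Putting the three points together, a loop-free version might look like
the sketch below. This is one possible reading of the suggestion, not
code posted in the thread. The function name label_bar_groups, its
arguments, and the returned data frame are hypothetical. It also makes
three assumptions of its own: empty histogram bins are dropped, the cut
points are c(0, st) rather than c(1, st) so that a top group of a single
bar does not produce a duplicated break, and the vertical offsets are
taken as fractions of max(x) instead of the hard-coded -50000 and
-100000.

```r
## Hypothetical helper (not from the thread): label groups of bars
## without an explicit loop.
label_bar_groups <- function(x, n_breaks = 4) {
  x  <- sort(x, decreasing = TRUE)
  mp <- as.vector(barplot(x, names.arg = rep("", length(x))))

  h <- hist(x, plot = FALSE, breaks = n_breaks)

  ## Assumption: skip empty bins, which would give duplicated cut points
  keep   <- h$counts > 0
  counts <- h$counts[keep]
  lower  <- h$breaks[-length(h$breaks)][keep]
  upper  <- h$breaks[-1][keep]

  ## Point 1: rightmost bar of each group, summing counts from the end
  st <- rev(cumsum(rev(counts)))
  en <- st - counts + 1              # leftmost bar of each group

  ## Point 3: group factor via cut, group centres via tapply.
  ## cut() sorts its breaks, so the levels run left to right; reverse
  ## the result to match the right-to-left order of st and en.
  g      <- cut(seq_along(mp), c(0, st))
  center <- rev(as.vector(tapply(mp, g, mean)))

  ## Assumption: offsets below the axis scale with the data
  y_seg <- -0.05 * max(x)
  y_txt <- -0.10 * max(x)

  op <- par(xpd = TRUE)              # allow drawing below the plot region
  on.exit(par(op))

  ## Point 2: segments() and text() are vectorised
  segments(mp[st], y_seg, mp[en], y_seg, lwd = 2, col = 2)
  labs <- paste(paste(format(lower, scientific = FALSE, trim = TRUE), "-"),
                format(upper, scientific = FALSE, trim = TRUE),
                sep = "\n")
  text(center, y_txt, labs, pos = 1, cex = 0.6)

  invisible(data.frame(left = en, right = st, center = center,
                       lower = lower, upper = upper))
}

## Usage with data shaped like the example above (the seed is arbitrary)
set.seed(1)
groups <- label_bar_groups((runif(70) * 100)^3)
print(groups)
```

The returned table is only there to make the grouping easy to inspect;
the drawing itself needs nothing beyond st, en and the tapply result.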