|
Dear All,

Apologies for this simple question and thanks in advance for any help given.

Suppose I want to plot 1 million observations using the command

    plot(rnorm(1000000))

The labels on the x-axis are 0e+00, 2e+05, etc. These are clearly not very attractive (the plots are for a PhD thesis).

I would like the axis labels to be 0, 2, 4, 6, 8, 10, with a *10^5 on the right-hand side.

Is there a simple command for this?

Best Wishes

Roger

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
|
|
Here is one way to do it with a smaller set of data, but the 'range' is the same:

    > x <- c(1, 1000, 1000000)
    > y <- pretty(range(x))
    > y
    [1] 0e+00 2e+05 4e+05 6e+05 8e+05 1e+06
    > plot(x, 1:3, xaxt = 'n', xlab = "X * 10^5")
    > axis(1, at = y, labels = y / 100000)

--
Jim Holtman
Cincinnati, OH
|
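The division by 100000 above is hard-coded for this particular range. As a minimal sketch, the power of ten could instead be derived from the spacing of the ticks that pretty() returns. The helper name `scaled_ticks` is made up for illustration and is not part of base R; the plotting calls are guarded so that the block also runs non-interactively.

```r
# Hypothetical helper (not part of base R): pick tick positions with
# pretty() and rescale the labels by the power of ten of the tick spacing.
scaled_ticks <- function(x) {
  at   <- pretty(range(x))
  step <- at[2] - at[1]            # spacing between neighbouring ticks
  k    <- floor(log10(step))       # e.g. a spacing of 2e5 gives k = 5
  list(at = at, labels = at / 10^k, exponent = k)
}

x  <- c(1, 1000, 1000000)
tk <- scaled_ticks(x)

if (interactive()) {
  plot(x, 1:3, xaxt = "n", xlab = paste0("X * 10^", tk$exponent))
  axis(1, at = tk$at, labels = tk$labels)
}
```

Using the tick spacing rather than the largest tick keeps the labels as whole numbers (0, 2, ..., 10) instead of fractions of the maximum.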
|
In reply to this post by R.C.GILL
Try this:

    x <- c(1, 1e6); y <- 0:1
    par(mar = c(5, 4, 4, 5) + 0.1)  # make room at the right
    plot(x, y, axes = FALSE)
    box()
    axis(2)
    axis(1, at = 0:5 * 2 * 1e5, labels = 0:5 * 2)
    mtext(text = expression(phantom(0) %*% 10^5),
          side = 1, line = 1, at = 11.0 * 1e5)

Peter Ehlers
|
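The exponent in the mtext() call above is fixed at 5. If it has to come from a variable, bquote() can splice the value into the plotmath expression. This is only a sketch: `mult_label` and `k` are illustrative names, and the plot itself is guarded so the block runs in a non-interactive session too.

```r
# Build the plotmath label "x 10^k" for an exponent stored in a variable.
# 'mult_label' is an illustrative name, not an existing function.
mult_label <- function(k) bquote(phantom(0) %*% 10^.(k))

k   <- 5
lab <- mult_label(k)

if (interactive()) {
  x <- c(1, 1e6); y <- 0:1
  par(mar = c(5, 4, 4, 5) + 0.1)   # make room at the right
  plot(x, y, axes = FALSE)
  box()
  axis(2)
  axis(1, at = 0:5 * 2 * 10^k, labels = 0:5 * 2)
  mtext(text = lab, side = 1, line = 1, at = 11 * 10^k)
}
```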
|
In reply to this post by R.C.GILL
On Wed, 2005-12-28 at 20:15 +0000, [hidden email] wrote:
> ...
> I would like the axes to be 0,2,4,6,8,10 with a *10^5 on the right
> hand side.
>
> Is there a simple command for this?

See ?plotmath for some additional examples; there are some others in the list archives.

    set.seed(1)
    x <- rnorm(1000000)

    # Now do the plot, but leave the x axis blank
    plot(x, xaxt = "n")

    # Set the x axis tick mark positions
    x.at <- seq(0, 10, 2) * 10 ^ 5

    # Create the expressions for the tick mark labels.
    # Using parse() takes the character vectors from paste()
    # and converts them to expressions for use in plotmath
    x.lab <- parse(text = paste(seq(0, 10, 2), "%*% 10^5"))

    # Now do the axis labels
    axis(1, at = x.at, labels = x.lab)

HTH,

Marc Schwartz
|
|
>>>>> "Marc" == Marc Schwartz (via MN) <[hidden email]>
>>>>>     on Wed, 28 Dec 2005 15:46:37 -0600 writes:

    Marc> See ?plotmath for some additional examples and there
    Marc> are some others in the list archives.

Yes, I think this one is there too:
It has the "* 10^k" after each number;
the nice thing about it is that it works for all kinds of data
-- and of course, in principle it could be built into R ...

    ###----------------- Do "a 10^k" labeling instead of "a e<k>" ---

    axTexpr <- function(side, at = axTicks(side, axp = axp, usr = usr, log = log),
                        axp = NULL, usr = NULL, log = NULL)
    {
        ## Purpose: Do "a 10^k" labeling instead of "a e<k>"
        ##          this auxiliary should return 'at' and 'label' (expression)
        ## ----------------------------------------------------------------------
        ## Arguments: as for axTicks()
        ## ----------------------------------------------------------------------
        ## Author: Martin Maechler, Date:  7 May 2004, 18:01
        eT <- floor(log10(abs(at)))  # at == 0 case is dealt with below
        mT <- at / 10^eT
        ss <- lapply(seq_along(at),
                     function(i) if(at[i] == 0) quote(0) else
                         substitute(A %*% 10^E, list(A = mT[i], E = eT[i])))
        do.call("expression", ss)
    }

    x <- 1e7 * (-10:50)
    y <- dnorm(x, mean = 10e7, sd = 20e7)
    plot(x, y)
    ## ^^^^^^ not so nice; ok, try

    par(mar = .1 + c(5,5,4,1))  ## << For the horizontal y-axis labels, need more space
    plot(x, y, axes = FALSE, frame.plot = TRUE)
    aX <- axTicks(1); axis(1, at = aX, labels = axTexpr(1, aX))
    if(FALSE) # rather the next one
    { aY <- axTicks(2); axis(2, at = aY, labels = axTexpr(2, aY)) }
    ## or rather (horizontal labels on y-axis):
    aY <- axTicks(2); axis(2, at = aY, labels = axTexpr(2, aY), las = 2)
|
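The core of axTexpr() is the split of each tick position into a mantissa and a base-10 exponent. The sketch below isolates just that arithmetic so it can be inspected without drawing a plot. `split_sci` is an illustrative name that does not appear in the post, and the sample tick values are made up.

```r
# Split tick positions into mantissa and base-10 exponent.
# 'split_sci' is an illustrative helper, not part of axTexpr() itself.
split_sci <- function(at) {
  e <- floor(log10(abs(at)))   # -Inf where at == 0
  m <- at / 10^e               # NaN where at == 0 (0 / 0)
  zero <- at == 0              # handle zero separately, as axTexpr() does
  e[zero] <- 0
  m[zero] <- 0
  data.frame(at = at, mantissa = m, exponent = e)
}

res <- split_sci(c(0, 2.5e6, 5e7, 2e8))
res
```

The zero case needs special treatment because log10(0) is -Inf, which is why axTexpr() substitutes a plain 0 label for that tick.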
|
On Thu, 2005-12-29 at 22:06 +0100, Martin Maechler wrote:
> Yes, I think this one is there too:
> It has the "* 10^k" after each number;
> the nice thing about it is that it works for all kind of data
> -- and of course, in principle it could be built into R ...
> ...

Nice, Martin! I do like that.

I also like the handling of zero, which I thought of only after sending my reply. I should have used:

    x <- rnorm(1000000)
    plot(x, xaxt = "n")
    x.at <- seq(0, 10, 2) * 10 ^ 5

    # Handle the zero here this time
    x.lab <- parse(text = paste(seq(2, 10, 2), "%*% 10^5"))
    axis(1, at = x.at, labels = c(0, x.lab))

BTW, your approach was posted here:

    http://finzi.psych.upenn.edu/R/Rhelp02a/archive/36462.html

and more recently, here:

    http://finzi.psych.upenn.edu/R/Rhelp02a/archive/57011.html

:-)

Best regards and Happy New Year,

Marc
|
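For anyone wondering why c(0, x.lab) is acceptable as a labels argument: combining a number with an expression vector gives an expression vector, so the zero ends up as a plain label next to the plotmath ones. A short check of that behaviour, using base R only and the same label construction as in the post:

```r
# parse() turns each string into one unevaluated expression
x.lab <- parse(text = paste(seq(2, 10, 2), "%*% 10^5"))

# c() promotes the number into the expression vector
labs <- c(0, x.lab)

is.expression(labs)  # the combined object is still an expression vector
length(labs)         # one label per tick position
labs[[1]]            # the plain zero
deparse(labs[[2]])   # the first plotmath label, shown as text
```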
|
In reply to this post by Martin Maechler
[hidden email] wrote:
> ...
> I would like the axes to be 0,2,4,6,8,10 with a *10^5 on
> the right hand side.
>
> Is there a simple command for this?

This post so interested me that I wrote the following function:

    axis.mult <- function(side = 1, at = NULL, labels, mult = 1, mult.label,
                          mult.line, mult.labelpos = NULL, ...) {
      if (is.null(at)) at <- axTicks(side)
      if (missing(labels)) labels <- at / mult
      axis(side, at, labels, ...)
      if (missing(mult.label)) mult.label <- paste("x", mult, collapse = "")
      # multiplier position defaults to centered on the outside
      if (is.null(mult.labelpos)) mult.labelpos <- side
      edges <- par("usr")
      if (side %% 2) {
        # either top or bottom
        if (mult.labelpos %% 2) {
          adj <- 0.5
          at <- (edges[1] + edges[2]) / 2
          if (missing(mult.line)) mult.line <- ifelse(mult.labelpos == side, 3, 0)
        } else {
          adj <- ifelse(mult.labelpos == 2, 1, 0)
          at <- ifelse(mult.labelpos == 2, edges[1], edges[2])
          if (missing(mult.line)) mult.line <- 1
        }
      } else {
        # either left or right
        if (mult.labelpos %% 2) {
          adj <- ifelse(mult.labelpos == 1, 1, 0)
          at <- ifelse(mult.labelpos == 1, edges[3], edges[4])
          if (missing(mult.line)) mult.line <- 1
        } else {
          adj <- 0.5
          at <- (edges[3] + edges[4]) / 2
          if (missing(mult.line)) mult.line <- ifelse(mult.labelpos == side, 3, 0)
        }
      }
      mtext(mult.label, side, mult.line, at = at, adj = adj)
    }

which will be in the next (v2.0.1) version of plotrix.

Jim
|
|
Hi,
I am having difficulty with the bootstrap procedure in the boot package. How can I specify the size of the sample drawn at each bootstrap replicate? I use

    > myboot <- boot(data, boot.fun, R = 300)

When I display myboot$t, I get a vector with just one value, the same as the statistic computed on my data file. After reading about the bootstrap in the MASS book, I added

    > set.seed(101)

but I get the same result. That means that at each bootstrap replicate the sample drawn is exactly the data file. How can I resolve this?

Sincerely!
|
|
On Fri, 2005-12-30 at 18:37 +0100, justin bem wrote:
> I have a difficulty with the bootstrap procedure in boot package.
> How can I specify the size of sample at each bootstrap ?
> ...

Justin,

First, when creating a new post, please do not do so by replying to an existing post. This breaks the threading in the list archive and makes it difficult for people to find your post and its replies, because the thread ends up embedded in another, unrelated one.

Second, in bootstrapping, the bootstrap samples are drawn with replacement from the original sample. Thus, the bootstrap replicates are, by default, the same size as the original sample. You might want to look at the examples in ?sample, specifically the resample() function created there, for some additional insight.

Using set.seed() simply makes it possible to repeat the same sequence of random samples. It has no effect on the replicate sample size.

As for getting only a single statistic value, I suspect that there is a problem in how you have defined your statistic function.

Using the example from page 134 in MASS4:

    library(MASS)
    library(boot)
    set.seed(101)
    gal.boot <- boot(galaxies, function(x, i) median(x[i]), R = 1000)

    > str(gal.boot$t)
     num [1:1000, 1] 20930 21492 20196 20179 21184 ...

Here 't' has 1000 rows, the same as R. The statistic function is passed the galaxies data as 'x' and a set of sampled indices as 'i'. Thus, x[i] is the randomly selected replicate passed to median(). Each replicate has 82 elements (the length of galaxies), and this is repeated 1000 times.

Note that by default, the 'statistic' function in boot() is expected to take two arguments, 'data' and 'i'. Since median() does not have that signature, we wrap it in an anonymous function, as above. We could have done it as:

    my.median <- function(data, i)
    {
      median(data[i])
    }

    set.seed(101)
    my.boot <- boot(galaxies, my.median, R = 1000)

    # Now let's compare the resultant boot objects
    > all.equal(gal.boot, my.boot)
    [1] "Component 6: target, current don't match when deparsed"
    [2] "Component 8: target, current don't match when deparsed"

They are identical except for the 'statistic' and 'call' elements, which simply reflects the different statistic function expressions used.

The key is that the statistic function (in a routine bootstrap such as these) takes two arguments. It could take more (see below).

If you want to get a feel for how each of the 1000 replicates looks, you could use the boot.array() function:

    boot.array(gal.boot, indices = TRUE)

boot.array() with 'indices = TRUE' will return a matrix in which each row represents one replicate (1000 rows in this case) and holds the _indices_ of the values from galaxies (1:82) that were used in that replicate.

With 'indices = FALSE' (the default here), boot.array() will tell you how frequently each of the 82 elements in galaxies was used in each replicate. Keep in mind that this is sampling WITH replacement, so some values will be used more than once in each replicate.

For example:

    > table(boot.array(gal.boot, indices = FALSE))

        0     1     2     3     4     5     6     7
    30027 30317 15134  4998  1248   233    37     6

This shows that the 82 values in galaxies were used anywhere from 0 to 7 times in any given replicate.

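To make the role of the index argument concrete, here is a base-R sketch of the resampling loop. It is an illustration only and not how boot() is implemented; `manual_boot` is a made-up helper, and the data are simulated rather than the galaxies set. The last lines show how one replicate's indices relate to the frequency counts discussed above.

```r
# Base-R illustration of bootstrap resampling (not boot's actual code).
# 'manual_boot' is a hypothetical helper; 'x' is simulated stand-in data.
manual_boot <- function(data, statistic, R) {
  n <- length(data)
  vapply(seq_len(R), function(r) {
    i <- sample.int(n, n, replace = TRUE)  # n indices drawn with replacement
    statistic(data, i)                     # the statistic must subset with 'i'
  }, numeric(1))
}

set.seed(101)
x <- rnorm(82, mean = 20000, sd = 4000)

good <- manual_boot(x, function(d, i) median(d[i]), R = 1000)
bad  <- manual_boot(x, function(d, i) median(d),    R = 1000)  # ignores 'i'

length(unique(good))  # many distinct values across replicates
length(unique(bad))   # a single value: median(x) repeated R times

# Indices versus frequencies for one replicate
i    <- sample.int(82, 82, replace = TRUE)
freq <- tabulate(i, nbins = 82)  # times each observation was drawn
sum(freq)                        # equals the replicate size
```

A statistic function that ignores its index argument reproduces the symptom described in the question: every replicate returns the value computed on the full data.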
The sum() of the above is 82000, of course, and each replicate does have 82 elements:

    # Get the sum of each row from the boot.array() call
    # above and create a table
    > table(rowSums(boot.array(gal.boot, indices = FALSE)))

      82
    1000

If you actually wanted to generate the replicates used, you could do something like:

    idx <- boot.array(gal.boot, indices = TRUE)
    matrix(galaxies[idx], nrow = nrow(idx))

which would return a 1000 x 82 matrix with the actual galaxies data as sampled in the 1000 replicates, one replicate per row.

In terms of defining the sample size in each replicate, there is an example in Davison and Hinkley's book (cited in ?boot), for which the boot package is the supporting software. The example is on page 528 and provides an approach for specifying the sample size for the replicates, if that is what you truly want to do.

This example uses the 'city' data:

    > city
         u   x
    1  138 143
    2   93 104
    3   61  69
    4  179 260
    5   48  75
    6   37  63
    7   29  50
    8   23  48
    9   30 111
    10   2  50

    city.subset <- function(data, i, n = 10)
    {
      # This step takes the sampled indices
      # and returns the first 'n' of them.
      # Then subsets 'data' using these, in
      # this case, using the first 'n' rows of
      # all of the sampled rows in the city dataset
      d <- data[i[1:n], ]

      # Now the statistic is calculated on a
      # replicate with a sample size of 'n'
      mean(d[, 2]) / mean(d[, 1])
    }

    # Let's do the boot with 200 replicates
    # each with a sample size of 5. 'n' is passed as
    # part of the "..." argument in boot().
    city.boot <- boot(city, city.subset, R = 200, n = 5)

Note that the 'statistic' function here takes three arguments:

    data: this will be 'city'
    i:    sampled indices
    n:    replicate sample size

As pointed out in D&H, the boot.array() call would need to be adjusted here to work properly:

    boot.array(city.boot, indices = TRUE)[, 1:5]

so that you only return the first 5 values for each replicate.

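The subsampling idea can be tried without the boot package. In the sketch below the city data are typed in from the printout above, and replicate() with sample.int() stands in for boot(). That stand-in is an assumption made for illustration, not the package's mechanism.

```r
# 'city' data copied from the printout in the post
city <- data.frame(
  u = c(138, 93, 61, 179, 48, 37, 29, 23, 30, 2),
  x = c(143, 104, 69, 260, 75, 63, 50, 48, 111, 50)
)

# Same statistic as in the post: ratio of means over the first 'n'
# of the sampled rows
city.subset <- function(data, i, n = 10) {
  d <- data[i[1:n], ]
  mean(d[, 2]) / mean(d[, 1])
}

set.seed(1)
N <- nrow(city)

# 200 replicates; each draws N indices but uses only the first 5
ratios <- replicate(200,
                    city.subset(city, sample.int(N, N, replace = TRUE), n = 5))

length(ratios)            # one ratio per replicate
city.subset(city, 1:N)    # ratio on the full, unresampled data
```

Because each index is drawn independently, keeping the first 5 of the 10 sampled indices is equivalent to drawing 5 indices with replacement directly.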
There are important considerations when your replicates are smaller than the original sample, so I would not do this lightly, and I would recommend securing a copy of D&H to cover some of this area.

HTH,

Marc Schwartz
|
|
Thank you very much for your answer. The mistake was in the statistic function: I did not specify the indices.

Best wishes to all R users and the R core team.

Sincerely.
|
