Systematic resampling (in sequential Monte Carlo)

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Systematic resampling (in sequential Monte Carlo)

Giovanni Petris


Dear all,

Here is a little coding problem. It falls in the category of "how can I do
this efficiently?" rather than "how can I do this?" (I know how to do it
inefficiently). So, if you want to take the challenge, keep reading, otherwise
just skip to the next post - I won't be offended by that ;-)

I am coding a particle filter and I want to use systematic resampling to
"regenerate" particles. Basically, what this means is the following.

1. Start with a vector of "labels", x, say, of length N and a vector of
   probabilities, w, say - w[i] is the probability attached to x[i].

2. Draw N equally spaced random number in (0,1) as u = (1 : N + runif(1)) / N.

3. Define a new sample of "labels", y, say, by inverting the cdf defined by
   the probabilities w. That is, set
   y[i] = x[k] if cumsum(w)[k-1] <= u[i] < cumsum(w)[k],  i=1, ..., N.

The following is a piece of R code that implements the procedure.

> ### Systematic resampling
> set.seed(2)
> x <- LETTERS[1 : 6] # labels
> w <- rexp(6)
> w <- w / sum(w) # probabilities
> W <- c(0, cumsum(w)) # cdf
> u <- (1 : 6 + runif(1)) / 6
> indNew <- sapply(seq_along(u),
+                  function(i) (sum(W[i] <= u & u < W[i + 1])))
> (y <- rep(x, indNew))
[1] "A" "B" "D" "D" "F"
> ## graphically...
> plot(stepfun(seq_along(u), W), xaxt = "n")
> axis(1, at = seq_along(u), x)
> abline(h = u, lty = "dotted")
   
The question is the following. I believe the way I compute "newInd" is of
order N^2. Is there a smart way of doing it which is of order N? (I know there
is one, I just cannot find it...)

Thanks in advance,
Giovanni

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Systematic resampling

Giovanni Petris


Dear all,

This is the sequel to a question of mine a couple of days ago about how to
implement systematic resampling from a discrete distribution efficiently. (See
below the original question and description of the problem - the problem, in a
nutshell, is how to draw N points from a discrete distribution on N points).

I did my homework and found an implementation of systematic resampling which
is O(N) - my initial one was O(N^2). The code is given below. However,
compared with multinomial resampling, the difference in execution time is
huge. Is there anybody out there who sees a way of improving the efficiency of
my code?? Is coding it in C my only option?

Thank you in advance for your suggestions and insight,
Giovanni

######### R example #########
> ### Timing systematic resampling
> set.seed(2)
> N <- 1e6
> x <- LETTERS[1 : N] # labels
> w <- rexp(N)
> w <- w / sum(w) # probabilities
> W <- cumsum(w) # cdf
> system.time({
+ u <- (0 : (N - 1) + runif(1)) / N
+ indexNew <- rep(0, N)
+ h <- 1
+ k <- 1
+ for (i in 1 : N)
+ {
+     j <- h
+     while (u[h] < W[k] && h <= N) h <- h + 1
+     indexNew[i] <- h - j
+     k <- k + 1
+ }
+ (y <- rep(x, indexNew))
+ })
   user  system elapsed
 12.791   0.073  12.887
> ### ...while using multinomial resampling:
> system.time(
+ index <- sample(N, N, replace = TRUE, prob = w)
+ )
   user  system elapsed
  0.344   0.020   0.366

> Date: Wed, 29 Jul 2009 13:32:14 -0500 (CDT)
> From: Giovanni Petris <[hidden email]>
> Sender: [hidden email]
> Precedence: list
>
>
>
> Dear all,
>
> Here is a little coding problem. It falls in the category of "how can I do
> this efficiently?" rather than "how can I do this?" (I know how to do it
> inefficiently). So, if you want to take the challenge, keep reading, otherwise
> just skip to the next post - I won't be offended by that ;-)
>
> I am coding a particle filter and I want to use systematic resampling to
> "regenerate" particles. Basically, what this means is the following.
>
> 1. Start with a vector of "labels", x, say, of length N and a vector of
>    probabilities, w, say - w[i] is the probability attached to x[i].
>
> 2. Draw N equally spaced random number in (0,1) as u = (1 : N + runif(1)) / N.
>
> 3. Define a new sample of "labels", y, say, by inverting the cdf defined by
>    the probabilities w. That is, set
>    y[i] = x[k] if cumsum(w)[k-1] <= u[i] < cumsum(w)[k],  i=1, ..., N.
>
> The following is a piece of R code that implements the procedure.
>
> > ### Systematic resampling
> > set.seed(2)
> > x <- LETTERS[1 : 6] # labels
> > w <- rexp(6)
> > w <- w / sum(w) # probabilities
> > W <- c(0, cumsum(w)) # cdf
> > u <- (1 : 6 + runif(1)) / 6
> > indNew <- sapply(seq_along(u),
> +                  function(i) (sum(W[i] <= u & u < W[i + 1])))
> > (y <- rep(x, indNew))
> [1] "A" "B" "D" "D" "F"
> > ## graphically...
> > plot(stepfun(seq_along(u), W), xaxt = "n")
> > axis(1, at = seq_along(u), x)
> > abline(h = u, lty = "dotted")
>    
> The question is the following. I believe the way I compute "newInd" is of
> order N^2. Is there a smart way of doing it which is of order N? (I know there
> is one, I just cannot find it...)
>
> Thanks in advance,
> Giovanni
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.