# Percentiles/Quantiles with Weighting Classic List Threaded 8 messages Open this post in threaded view
|

## Percentiles/Quantiles with Weighting

 Hi All, I am looking at applications of percentiles to time sequenced data.  I had just been using the quantile function to get percentiles over various periods, but am more interested in if there is an accepted (and/or R-implemented) method to apply weighting to the data so as to weigh recent data more heavily. I wrote the following function, but it seems quite inefficient, and not really very flexible in its applications - so if anyone has any suggestions on how to look at quantiles/percentiles within R while also using a weighting schema, I would be very interested. Note - this function supposes the data in X is time-sequenced, with the most recent (and thus heaviest weighted) data at the end of the vector WtPercentile <- function(X=rnorm(100), pctile=seq(.1,1,.1)) {   Xprime <- NA   for(i in 1:length(X))   {     Xprime <- c(Xprime, rep(X[i], times=i))   }   print("Percentiles:")   print(quantile(X, pctile))   print("Weighted:")   print(Xprime)   print("Weighted Percentiles:")   print(quantile(Xprime, pctile, na.rm=TRUE)) } WtPercentile(1:10) WtPercentile(rnorm(10))         [[alternative HTML version deleted]] ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.
Open this post in threaded view
|

## Re: Percentiles/Quantiles with Weighting

 I do know that Harrell's Quantile function in the Hmisc package will   allow quantile estimates from models. Whether it is general enough to   extend to time series, I have no experience and cannot say. -- David Winsemius On Feb 17, 2009, at 11:57 AM, Brigid Mooney wrote: > Hi All, > > I am looking at applications of percentiles to time sequenced data.   > I had > just been using the quantile function to get percentiles over various > periods, but am more interested in if there is an accepted (and/or > R-implemented) method to apply weighting to the data so as to weigh   > recent > data more heavily. > > I wrote the following function, but it seems quite inefficient, and   > not > really very flexible in its applications - so if anyone has any   > suggestions > on how to look at quantiles/percentiles within R while also using a > weighting schema, I would be very interested. > > Note - this function supposes the data in X is time-sequenced, with   > the most > recent (and thus heaviest weighted) data at the end of the vector > > WtPercentile <- function(X=rnorm(100), pctile=seq(.1,1,.1)) > { >  Xprime <- NA > >  for(i in 1:length(X)) >  { >    Xprime <- c(Xprime, rep(X[i], times=i)) >  } > >  print("Percentiles:") >  print(quantile(X, pctile)) >  print("Weighted:") >  print(Xprime) >  print("Weighted Percentiles:") >  print(quantile(Xprime, pctile, na.rm=TRUE)) > } > > WtPercentile(1:10) > WtPercentile(rnorm(10)) > > [[alternative HTML version deleted]] > > ______________________________________________ > [hidden email] mailing list > https://stat.ethz.ch/mailman/listinfo/r-help> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html> and provide commented, minimal, self-contained, reproducible code. ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.
Open this post in threaded view
|

## Re: Percentiles/Quantiles with Weighting

 In reply to this post by Brigid Mooney Thanks for pointing me to the quantreg package as a resource.  I was hoping to ask be able to address one quick follow-up question... I get slightly different variants between using the rq funciton with formula = mydata ~ 1 as I would if I ran the same data using the quantile function. Example: mydata <- (1:10)^2/2 pctile <- seq(.59, .99, .1) quantile(mydata, pctile) 59%    69%    79%    89%    99% 20.015 26.075 32.935 40.595 49.145 rq(mydata~1, tau=pctile) Call: rq(formula = mydata ~ 1, tau = pctile) Coefficients:             tau= 0.59 tau= 0.69 tau= 0.79 tau= 0.89 tau= 0.99 (Intercept)        18      24.5        32      40.5        50 Degrees of freedom: 10 total; 9 residual Is it correct to assume this is due to the different accepted methods of calculating quantiles?  If so, do you know where I would be able to see the algorithms used in these functions?  I'm not finding it in the documentation for function rq, and am new enough to R that I don't know where those references would generally be. On Tue, Feb 17, 2009 at 12:29 PM, roger koenker <[hidden email]> wrote: > http://www.nabble.com/weighted-quantiles-to19864562.html#a19865869> > gives one possibility... > > url:    www.econ.uiuc.edu/~roger            Roger Koenker > email    [hidden email]            Department of Economics > vox:     217-333-4558                University of Illinois > fax:       217-244-6678                Champaign, IL 61820 > > > > > On Feb 17, 2009, at 10:57 AM, Brigid Mooney wrote: > >   Hi All, >> >> I am looking at applications of percentiles to time sequenced data.  I had >> just been using the quantile function to get percentiles over various >> periods, but am more interested in if there is an accepted (and/or >> R-implemented) method to apply weighting to the data so as to weigh recent >> data more heavily. >> >> I wrote the following function, but it seems quite inefficient, and not >> really very flexible in its applications - so if anyone has any >> suggestions >> on how to look at quantiles/percentiles within R while also using a >> weighting schema, I would be very interested. >> >> Note - this function supposes the data in X is time-sequenced, with the >> most >> recent (and thus heaviest weighted) data at the end of the vector >> >> WtPercentile <- function(X=rnorm(100), pctile=seq(.1,1,.1)) >> { >>  Xprime <- NA >> >>  for(i in 1:length(X)) >>  { >>   Xprime <- c(Xprime, rep(X[i], times=i)) >>  } >> >>  print("Percentiles:") >>  print(quantile(X, pctile)) >>  print("Weighted:") >>  print(Xprime) >>  print("Weighted Percentiles:") >>  print(quantile(Xprime, pctile, na.rm=TRUE)) >> } >> >> WtPercentile(1:10) >> WtPercentile(rnorm(10)) >> >>        [[alternative HTML version deleted]] >> >> ______________________________________________ >> [hidden email] mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help>> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > >         [[alternative HTML version deleted]] ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.
Open this post in threaded view
|

## Re: Percentiles/Quantiles with Weighting

 url:    www.econ.uiuc.edu/~roger            Roger Koenker email    [hidden email]            Department of Economics vox:     217-333-4558                University of Illinois fax:       217-244-6678                Champaign, IL 61820 On Feb 17, 2009, at 1:58 PM, Brigid Mooney wrote: > Thanks for pointing me to the quantreg package as a resource.  I was   > hoping to ask be able to address one quick follow-up question... > > I get slightly different variants between using the rq funciton with   > formula = mydata ~ 1 as I would if I ran the same data using the   > quantile function. > Example: > > mydata <- (1:10)^2/2 > pctile <- seq(.59, .99, .1) > > quantile(mydata, pctile) > 59%    69%    79%    89%    99% > 20.015 26.075 32.935 40.595 49.145 > > rq(mydata~1, tau=pctile) > Call: > rq(formula = mydata ~ 1, tau = pctile) > Coefficients: >             tau= 0.59 tau= 0.69 tau= 0.79 tau= 0.89 tau= 0.99 > (Intercept)        18      24.5        32      40.5        50 > Degrees of freedom: 10 total; 9 residual > > Is it correct to assume this is due to the different accepted   > methods of calculating quantiles?  If so, do you know where I would   > be able to see the algorithms used in these functions?  I'm not   > finding it in the documentation for function rq, and am new enough   > to R that I don't know where those references would generally be. > Yes,  quantile() in base R documents 9 varieties of quantiles, 2 more   than William Empson's famous 7 Types of Ambiguity.  In quantreg the function rq() finds a solution to an underlying optimization problem and   doesn't ask any further into the nature of the ambiguity -- it does often produce a warning indicating that there may be more than one solution.  The   default base R quantile is interpolated, while the default rq() with method =   "br" using the simplex algorithm finds an order statistic, typically.  If   you prefer something more like interpolation, you can try rq() with method = "fn" which is using an interior point algorithm and when there are multiple solutions it tends to produce something more like the centroid of the solution set.  I hope that this helps. > > > > On Tue, Feb 17, 2009 at 12:29 PM, roger koenker <[hidden email]>   > wrote: > http://www.nabble.com/weighted-quantiles-to19864562.html#a19865869> > gives one possibility... > > url:    www.econ.uiuc.edu/~roger            Roger Koenker > email    [hidden email]            Department of Economics > vox:     217-333-4558                University of Illinois > fax:       217-244-6678                Champaign, IL 61820 > > > > > On Feb 17, 2009, at 10:57 AM, Brigid Mooney wrote: > > Hi All, > > I am looking at applications of percentiles to time sequenced data.   > I had > just been using the quantile function to get percentiles over various > periods, but am more interested in if there is an accepted (and/or > R-implemented) method to apply weighting to the data so as to weigh   > recent > data more heavily. > > I wrote the following function, but it seems quite inefficient, and   > not > really very flexible in its applications - so if anyone has any   > suggestions > on how to look at quantiles/percentiles within R while also using a > weighting schema, I would be very interested. > > Note - this function supposes the data in X is time-sequenced, with   > the most > recent (and thus heaviest weighted) data at the end of the vector > > WtPercentile <- function(X=rnorm(100), pctile=seq(.1,1,.1)) > { >  Xprime <- NA > >  for(i in 1:length(X)) >  { >   Xprime <- c(Xprime, rep(X[i], times=i)) >  } > >  print("Percentiles:") >  print(quantile(X, pctile)) >  print("Weighted:") >  print(Xprime) >  print("Weighted Percentiles:") >  print(quantile(Xprime, pctile, na.rm=TRUE)) > } > > WtPercentile(1:10) > WtPercentile(rnorm(10)) > >        [[alternative HTML version deleted]] > > ______________________________________________ > [hidden email] mailing list > https://stat.ethz.ch/mailman/listinfo/r-help> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html> and provide commented, minimal, self-contained, reproducible code. > > ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.
Open this post in threaded view
|

## Re: Percentiles/Quantiles with Weighting

 In reply to this post by Brigid Mooney Here is one kind of weighted quantile function. The basic idea is very simple: wquantile <- function( v, w, p )   {     v <- v[order(v)]     w <- w[order(v)]     v [ which.max( cumsum(w) / sum(w) >= p ) ]   } With some more error-checking and general clean-up, it looks like this: # Simple weighted quantile # # v  A numeric vector of observations # w  A numeric vector of positive weights # p  The probability 0<=p<=1 # # Nothing fancy: no interpolation etc. # Basic idea wquantile <- function( v, w, p )   {     v <- v[order(v)]     w <- w[order(v)]     v [ which.max( cumsum(w) / sum(w) >= p ) ]   } # Simple weighted quantile # # v  A numeric vector of observations # w  A numeric vector of positive weights # p  The probability 0<=p<=1 # # Nothing fancy: no interpolation etc. wquantile <- function(v,w=rep(1,length(v)),p=.5)    {      if (!is.numeric(v) || !is.numeric(w) || length(v) != length(w))        stop("Values and weights must be equal-length numeric vectors")      if ( !is.numeric(p) || any( p<0 | p>1 ) )        stop("Quantiles must be 0<=p<=1")      ranking <- order(v)      sumw <- cumsum(w[ranking])      if ( is.na(w) || w<0 ) stop("Weights must be non-negative numbers")      plist <- sumw/sumw[length(sumw)]      sapply(p, function(p) v [ ranking [ which.max( plist >= p ) ] ])    } I would appreciate any comments people have on this -- whether correctness, efficiency, style, ....               -s On Tue, Feb 17, 2009 at 11:57 AM, Brigid Mooney <[hidden email]> wrote: > Hi All, > > I am looking at applications of percentiles to time sequenced data.  I had > just been using the quantile function to get percentiles over various > periods, but am more interested in if there is an accepted (and/or > R-implemented) method to apply weighting to the data so as to weigh recent > data more heavily. > > I wrote the following function, but it seems quite inefficient, and not > really very flexible in its applications - so if anyone has any suggestions > on how to look at quantiles/percentiles within R while also using a > weighting schema, I would be very interested. > > Note - this function supposes the data in X is time-sequenced, with the most > recent (and thus heaviest weighted) data at the end of the vector > > WtPercentile <- function(X=rnorm(100), pctile=seq(.1,1,.1)) > { >  Xprime <- NA > >  for(i in 1:length(X)) >  { >    Xprime <- c(Xprime, rep(X[i], times=i)) >  } > >  print("Percentiles:") >  print(quantile(X, pctile)) >  print("Weighted:") >  print(Xprime) >  print("Weighted Percentiles:") >  print(quantile(Xprime, pctile, na.rm=TRUE)) > } > > WtPercentile(1:10) > WtPercentile(rnorm(10)) > >        [[alternative HTML version deleted]] > > ______________________________________________ > [hidden email] mailing list > https://stat.ethz.ch/mailman/listinfo/r-help> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html> and provide commented, minimal, self-contained, reproducible code. > ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.
Open this post in threaded view
|

## Fwd: Percentiles/Quantiles with Weighting

 Some minor improvements and corrections below # Simple weighted quantile # # v  A vector of sortable observations # w A numeric vector of positive weights # p  The quantile 0<=p<=1 # # Nothing fancy: no interpolation etc.; NA cases not thought through  wquantile <- function(v,w=rep(1,length(v)),p=.5)   {     if ( !is.numeric(w) || length(v) != length(w) )       stop("Values and weights must be equal-length vectors")     if ( !is.numeric(p) || any( p<0 | p>1) )       stop("Quantiles must be 0<=p<=1")     if ( min(w) < 0 ) stop("Weights must be non-negative numbers")     ranking <- order(v)     sumw <- cumsum(w[ranking])     plist <- sumw / sumw[ length(sumw) ]     sapply(p, function(p) v [ ranking [ which.max( plist >= p ) ] ] )   } I would appreciate any comments people have on this -- whether correctness, efficiency, style, ....              -s ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.