# How to rank matrix data by deciles?


## How to rank matrix data by deciles?

Hi R users,

I have a matrix of data similar to:

    > y=matrix(rnorm(55),ncol=5)

I would like to know which decile each number belongs to, compared with the other numbers in its column.

Say y[1,1] is in the third decile among y[1:11,1] and y[2,1] is in the second decile. I would like to get a matrix that returns their decile ranks, i.e.,

    y[1,1] -> 3
    y[2,1] -> 2

Your help is much appreciated!

## Re: How to rank matrix data by deciles?

Vincent -

I think

    apply(y,2,function(x)
                cut(x,quantile(x,(0:10)/10),label=FALSE,include.lowest=TRUE))

will give you what you want (although you didn't use set.seed so I can't verify it against your example.)

- Phil Spector
  Statistical Computing Facility
  Department of Statistics
  UC Berkeley
  [hidden email]

On Thu, 6 May 2010, vincent.deluard wrote:

> Hi R users,
>
> I have a matrix of data similar to:
>
>> y=matrix(rnorm(55),ncol=5)
>
> I would like to know to which decile each number belongs compared to the
> numbers in its column.
>
> Say y[1,1] is the third decile among y[1:11,1] and y[2,1] is in the second
> decile
> I would like get a matrix that would return their ranks in decile, i.e.,
>
> y[1,1] -> 3
> y[2,1] -> 2
>
> Your help is much appreciated!
> --
> View this message in context: http://r.789695.n4.nabble.com/How-to-rank-matrix-data-by-deciles-tp2133496p2133496.html
> Sent from the R help mailing list archive at Nabble.com.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
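For readers who want to try the suggestion above on reproducible data, here is a minimal sketch. The seed value and the helper name `decile_rank` are assumptions added for illustration and do not come from the thread. The sketch also spells out `labels`, which the call above abbreviates as `label` through R's partial argument matching.

```r
set.seed(1)  # arbitrary seed (an assumption), only so the example is reproducible
y <- matrix(rnorm(55), ncol = 5)

# Hypothetical helper: compute the decile break points of one column,
# then bin each value of that column into intervals 1..10.
decile_rank <- function(x) {
  breaks <- quantile(x, probs = (0:10) / 10)
  cut(x, breaks = breaks, labels = FALSE, include.lowest = TRUE)
}

# Apply the helper column by column (MARGIN = 2).
deciles <- apply(y, 2, decile_rank)

dim(deciles)    # 11 5, the same shape as y
range(deciles)  # 1 10
```

`include.lowest = TRUE` matters here: without it the column minimum sits exactly on the first break point and would come back as `NA`.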

## Re: How to rank matrix data by deciles?

Dear Phil,

You helped me with a request to rank matrix columns by deciles two weeks ago. This really unblocked me on this project, but I found a little bug.

As before, my data is in a matrix:

    > madebt[1:16,1:2]
           X4.19.2010  X4.16.2010
     [1,] 26.61197531 26.58950617
     [2,]  5.72765432  5.73074074
     [3,]  5.95839506  5.96222222
     [4,]  5.64333333  5.64777778
     [5,] 20.93814815 20.95728395
     [6,]  0.00000000  0.00000000
     [7,]  0.07000000  0.07000000
     [8,] 12.87802469 12.86888889
     [9,]  3.64407407  3.64543210
    [10,]  0.05037037  0.05049383
    [11,] 25.59024691 25.60888889
    [12,]  3.47987654  3.53246914
    [13,]  0.00000000  0.00000000
    [14,] 31.39037037 31.39049383
    [15,]  3.78296296  3.77641975
    [16,] 13.17876543 13.19617284

The apply function will work for this sample of my data:

    debtdeciles = apply(madebt[1:16,1:2],2,function(x)
                cut(x,quantile(x,(0:10)/10, na.rm=TRUE),label=FALSE,include.lowest=TRUE))
    debtdeciles
         X4.19.2010 X4.16.2010
     [1,]         10         10
     [2,]          6          6
     [3,]          6          6
     [4,]          5          5
     [5,]          8          8
     [6,]          1          1
     [7,]          2          2
     [8,]          7          7
     [9,]          4          4
    [10,]          2          2
    [11,]          9          9
    [12,]          3          3
    [13,]          1          1
    [14,]         10         10
    [15,]          4          4
    [16,]          8          8

However, it will fail for

    > madebt[1:17,1:2]
           X4.19.2010  X4.16.2010
     [1,] 26.61197531 26.58950617
     [2,]  5.72765432  5.73074074
     [3,]  5.95839506  5.96222222
     [4,]  5.64333333  5.64777778
     [5,] 20.93814815 20.95728395
     [6,]  0.00000000  0.00000000
     [7,]  0.07000000  0.07000000
     [8,] 12.87802469 12.86888889
     [9,]  3.64407407  3.64543210
    [10,]  0.05037037  0.05049383
    [11,] 25.59024691 25.60888889
    [12,]  3.47987654  3.53246914
    [13,]  0.00000000  0.00000000
    [14,] 31.39037037 31.39049383
    [15,]  3.78296296  3.77641975
    [16,] 13.17876543 13.19617284
    [17,]  0.00000000  0.00000000
    > debtdeciles = apply(madebt[1:17,1:2],2,function(x)
    +             cut(x,quantile(x,(0:10)/10, na.rm=TRUE),label=FALSE,include.lowest=TRUE))
    Error in cut.default(x, quantile(x, (0:10)/10, na.rm = TRUE), label = FALSE, :
      'breaks' are not unique

My guess is that we now have three zeros in each column. With a total of 17 numbers in each column, a decile cannot hold more than two elements, and I believe R cannot determine where to put the third zero.

Do you have any solution for this problem?

Many thanks,

--------------------------------------------
Vincent Deluard
[hidden email]
Global Equity Strategist, CFA Charter Award Pending
TrimTabs Investment Research
40 Wall Street, 28th Floor
New York, NY 10005
Phone: (+1) 646-512-5616

-----Original Message-----
From: Phil Spector [mailto:[hidden email]]
Sent: Thursday, May 06, 2010 7:46 PM
To: vincent.deluard
Cc: [hidden email]
Subject: Re: [R] How to rank matrix data by deciles?

> Vincent -
>
> I think
>
>     apply(y,2,function(x)
>                 cut(x,quantile(x,(0:10)/10),label=FALSE,include.lowest=TRUE))
>
> will give you what you want (although you didn't use set.seed so I can't
> verify it against your example.)
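The error can be reproduced from the first column of the posted data. With 17 values, the three smallest are tied at 0, so the 0% and 10% quantiles are both 0 and `cut()` receives two identical break points. The sketch below shows that, followed by one possible workaround. The helper name `decile_rank_ties` and the switch to `findInterval()` are assumptions for illustration, not the solution the thread settled on. `findInterval()` accepts repeated break points, so tied values share the lower decile.

```r
# First column of madebt[1:17, ], copied from the post above.
x <- c(26.61197531, 5.72765432, 5.95839506, 5.64333333, 20.93814815,
       0, 0.07, 12.87802469, 3.64407407, 0.05037037, 25.59024691,
       3.47987654, 0, 31.39037037, 3.78296296, 13.17876543, 0)

breaks <- quantile(x, probs = (0:10) / 10, na.rm = TRUE)
breaks[1:2]                # the 0% and 10% break points are both 0
anyDuplicated(breaks) > 0  # TRUE, which is what cut() rejects

# Hypothetical helper: count how many interior decile breaks (10%..90%)
# lie strictly below each value, then add 1 to get a decile from 1 to 10.
# left.open = TRUE gives right-closed intervals, matching cut()'s default.
decile_rank_ties <- function(x) {
  inner <- quantile(x, probs = (1:9) / 10, na.rm = TRUE)
  findInterval(x, inner, left.open = TRUE) + 1L
}

decile_rank_ties(x)  # all three zeros land in decile 1
```

On data without duplicated break points, such as the first 16 rows, this should return the same ranks as the `cut()` call. Passing `unique(breaks)` to `cut()` also avoids the error, but it leaves fewer than ten bins, so the resulting codes no longer correspond to decile numbers.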