# Find the 50 highest values in a matrix

6 messages

## Find the 50 highest values in a matrix

Hi,

I have a huge matrix (4000 * 2000 data points) and I would like to retrieve the coordinates (column and row) for the top 50 (or x) values. Some positions in the matrix have NA as a value. These should be discarded.

My current method is to replace all NAs by 0, then rank all the values and then extract the positions with the 50 highest ranks. It is very time-consuming!

Is there a simpler way to do this?

Thank you,
Ulrich

## Re: Find the 50 highest values in a matrix

In reply to uschlecht's post of Jun 18, 2010, 1:41 AM (http://r.789695.n4.nabble.com/Find-the-50-highest-values-in-a-matrix-tp2259721p2259721.html, sent from the R help mailing list archive at Nabble.com):

A matrix is just a vector with dimensions, so order should work. I haven't verified the following code.

    a <- matrix(rnorm(4000*2000), 4000, 2000)
    b <- order(a, na.last=TRUE, decreasing=TRUE)[1:50]

Use %% or %/% to get the row and column numbers, along the lines of

    b - b%%nrow(a)

Nikhil Kaza
Asst. Professor, City and Regional Planning
University of North Carolina
[hidden email]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
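The hint about %% and %/% can be sketched on a small matrix. This is only an illustration, not Nikhil's code: the seed, the toy dimensions, and the names `k`, `rows`, `cols` and `same` are assumptions made for the example. Because R stores matrices column by column and indexes from 1, the linear index has to be shifted by one before and after the modulo and integer division.

```r
set.seed(42)                        # assumed seed, only for reproducibility
a <- matrix(rnorm(12), nrow = 3, ncol = 4)
a[2, 3] <- NA                       # one missing cell, as in the question

k <- 5                              # number of top values wanted (illustrative)
b <- order(a, na.last = TRUE, decreasing = TRUE)[1:k]   # linear indices

# Column-major storage: index = (col - 1) * nrow + row
rows <- (b - 1) %% nrow(a) + 1
cols <- (b - 1) %/% nrow(a) + 1

# Base R's arrayInd() performs the same conversion
same <- arrayInd(b, dim(a))
```

With `na.last = TRUE` the NA positions sort to the end, so they only show up in `b` if fewer than `k` non-missing values exist.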

## Re: Find the 50 highest values in a matrix

In reply to uschlecht's post of Thu, Jun 17, 2010, 10:41 PM:

Hi:

Here's a faked up example:

    a <- matrix(rnorm(4000*2000), 4000, 2000)
    # Generate some NAs in the matrix
    nr <- sample(1:4000, 50)
    nc <- sample(1:2000, 50)
    a[nr, nc] <- NA

    # convert to data frame:
    b <- data.frame(row = rep(1:4000, 2000), col = rep(1:2000, each = 4000),
                    x = as.vector(a))
    # relatively time consuming...about 13.5 s on my machine
    # (ascending order with NAs first, then reversed: largest first, NAs last)
    bb <- b[rev(order(b$x, na.last = FALSE)), ]

    > bb[1:10, ]
             row  col        x
    691269  3269  173 5.103704
    7815076 3076 1954 4.961544
    4999621 3621 1250 4.953265
    500469   469  126 4.937655
    5878224 2224 1470 4.929150
    4287270 3270 1072 4.913791
    4442521 2521 1111 4.896869
    4668867  867 1168 4.863504
    5716575  575 1430 4.760778
    3055274 3274  764 4.758995

HTH,
Dennis
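A possible variation on the data frame idea, sketched here under assumptions: order the matrix first and build the data frame only for the rows that are needed, instead of reordering all 8 million rows. The seed, the small 20 x 10 matrix, and the names `k`, `top` and `bb_top` are illustrative and not from the original post. `na.last = NA` makes `order` drop the NA positions, and `row()` and `col()` return matrices of row and column indices.

```r
set.seed(1)                              # assumed seed, only for reproducibility
a <- matrix(rnorm(20 * 10), 20, 10)
a[sample(20, 3), sample(10, 3)] <- NA    # a 3 x 3 block of cells set to NA

k <- 10                                  # number of top values wanted
# Linear indices of the k largest non-missing values
top <- order(a, decreasing = TRUE, na.last = NA)[seq_len(k)]

# Build the data frame for these k cells only; the row names carry
# the linear index, as in the printed output of the full version
bb_top <- data.frame(row = row(a)[top], col = col(a)[top],
                     x = a[top], row.names = top)
```

The result should match the first `k` rows of the fully sorted data frame whenever the values have no ties.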

## Re: Find the 50 highest values in a matrix

In reply to Dennis Murphy's post of 2010-06-18 5:13:

    m <- matrix(round(rnorm(4000 * 2000), 4), nrow = 4000)
    is.na(m) <- sample(8e6, 1e6)   # set 1e6 random cells to NA
    # sort() drops NAs by default, so they never enter the top 50
    system.time(
      idx <- which(
        matrix(m %in% head(sort(m, decreasing = TRUE), 50),
               nrow = nrow(m)), arr.ind = TRUE))
    #   user  system elapsed
    #   3.12    0.19    3.18

-Peter Ehlers
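One further variation, offered as a sketch rather than as anything posted in the thread: find the 50th-largest value with a partial sort, keep only the cells at or above that cutoff, and order just those. The function name `top_k_positions` and its arguments are hypothetical. Because values rounded to four decimals can tie, a match on values alone may return more than `k` positions; this version truncates to exactly `k`.

```r
# Hypothetical helper: positions of the k largest non-NA values of a matrix
top_k_positions <- function(m, k = 50) {
  v <- m[!is.na(m)]                  # discard NAs up front
  n <- length(v)
  stopifnot(n > 0, k >= 1)
  k <- min(k, n)
  p <- n - k + 1                     # rank of the k-th largest value
  cutoff <- sort(v, partial = p)[p]  # partial sort, no full ordering needed
  idx <- which(m >= cutoff)          # which() ignores the NA comparisons
  # Order the few candidates and cut ties down to exactly k
  idx <- idx[order(m[idx], decreasing = TRUE)][seq_len(k)]
  rc <- arrayInd(idx, dim(m))
  data.frame(row = rc[, 1], col = rc[, 2], value = m[idx])
}

# Small usage example with a tie for the largest value and two NAs
m <- matrix(c(5, NA, 3, 9, 9, 1, NA, 7, 2, 4, 6, 8), nrow = 3)
res <- top_k_positions(m, 3)
```

Whether this is faster than a full `sort` or `order` on a 4000 x 2000 matrix would need to be timed on the actual data.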