# Is there an efficient way to find the overlapping, upstream and downstream ranges for a set of ranges?


I have a set of genes (~50000) from the whole genome, read into a GenomicRanges (GRanges) object. Each range (gene) can be seen as an observation with three columns, chromosome, start and end, like this:

          seqnames start end width strand
    gene1     chr1     1   5     5      +
    gene2     chr1    10  15     6      +
    gene3     chr1    12  17     6      +
    gene4     chr1    20  25     6      +
    gene5     chr1    30  40    11      +

I am wondering whether there is an efficient way to find the overlapping, upstream and downstream genes for each gene in the GRanges object.

For example, assuming all_genes_gr is a GRanges object of ~50000 genes, the result I want looks like this:

| gene_name | upstream_gene | downstream_gene | overlapped_gene |
|-----------|---------------|-----------------|-----------------|
| gene1     | NA            | gene2           | NA              |
| gene2     | gene1         | gene4           | gene3           |
| gene3     | gene1         | gene4           | gene2           |
| gene4     | gene3         | gene5           | NA              |

Currently, the strategy I use is this:

    library(GenomicRanges)

    find_overlapped_gene <- function(idx, all_genes_gr) {
      # cat(idx, "\n")
      curr_gene   <- all_genes_gr[idx]
      other_genes <- all_genes_gr[-idx]
      # number of other genes that overlap the current gene
      n <- countOverlaps(curr_gene, other_genes)
      # the other genes that overlap the current gene
      # (query and subject swapped so that the partners are returned,
      # not the current gene itself)
      gene <- subsetByOverlaps(other_genes, curr_gene)
      return(list(n, gene))
    }

    system.time(lapply(1:100, function(idx) find_overlapped_gene(idx, all_genes_gr)))

However, for 100 genes this takes ~8 s according to system.time(). That means that for 50000 genes it would take roughly an hour (about 500 × 8 s) just to find the overlapping genes. Is there any algorithm or strategy to do this more efficiently, say 50000 genes in ~10 min or even less?
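One possible direction, offered as a sketch rather than a definitive answer: the per-gene `lapply()` loop calls the overlap machinery once for every gene, whereas `findOverlaps()`, `follow()` and `precede()` from GenomicRanges are vectorized and can process the whole object in one call each. The example below rebuilds the five toy genes from the post (the construction of `all_genes_gr`, the comma-separated format for multiple overlap partners, and the `result` data frame are assumptions made for illustration) and assumes that "upstream" and "downstream" mean the nearest non-overlapping gene, which is how `follow()` and `precede()` behave.

```r
library(GenomicRanges)

# Toy data copied from the post; the real all_genes_gr would hold ~50000 genes.
all_genes_gr <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = c(1, 10, 12, 20, 30),
                     end   = c(5, 15, 17, 25, 40)),
  strand   = "+"
)
names(all_genes_gr) <- paste0("gene", 1:5)

# One call finds every overlapping pair; drop the self-hits afterwards.
hits <- findOverlaps(all_genes_gr, all_genes_gr)
hits <- hits[queryHits(hits) != subjectHits(hits)]

# Collapse the overlap partners of each gene into one comma-separated string.
# The factor levels keep genes without any partner in the result.
partners <- split(names(all_genes_gr)[subjectHits(hits)],
                  factor(queryHits(hits), levels = seq_along(all_genes_gr)))
overlapped <- vapply(partners, function(p) {
  if (length(p) == 0) NA_character_ else paste(p, collapse = ",")
}, character(1))

# follow() / precede() return, for each gene, the index of the nearest
# non-overlapping range that it follows / precedes (NA when there is none).
# Both are strand-aware by default.
up_idx   <- follow(all_genes_gr, all_genes_gr)
down_idx <- precede(all_genes_gr, all_genes_gr)

result <- data.frame(
  gene_name        = names(all_genes_gr),
  upstream_gene    = names(all_genes_gr)[up_idx],
  downstream_gene  = names(all_genes_gr)[down_idx],
  overlapped_gene  = unname(overlapped),
  stringsAsFactors = FALSE
)
print(result)
```

On the toy data this should give the same upstream, downstream and overlapped genes as the table in the post, plus a row for gene5. Because `follow()` and `precede()` take strand into account, genes on the minus strand have their upstream and downstream sides flipped relative to genomic coordinates; passing `ignore.strand = TRUE` is one way to get a purely coordinate-based answer if that is what is wanted.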