make a shortlist of data based on blocks of binary values in one column

baboon2010
My question is twofold.

Part 1:
My data looks like this:

(example set, real data has 2*10^6 rows)
binary<-c(1,1,1,0,0,0,1,1,1,0,0)
Chromosome<-c(1,1,1,1,1,1,2,2,2,2,2)
start<-c(12,17,18,20,25,36,12,15,16,17,19)
Table<-cbind(Chromosome,start,binary)
      Chromosome start binary
 [1,]          1    12      1
 [2,]          1    17      1
 [3,]          1    18      1
 [4,]          1    20      0
 [5,]          1    25      0
 [6,]          1    36      0
 [7,]          2    12      1
 [8,]          2    15      1
 [9,]          2    16      1
[10,]          2    17      0
[11,]          2    19      0

As output I need a shortlist with one row per binary block, giving the starting and ending position of each block.
For this example it would look like this:
     Chromosome2 position_start position_end binary2
[1,]           1             12           18       1
[2,]           1             20           36       0
[3,]           2             12           16       1
[4,]           2             17           19       0
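
One possible way to build this shortlist is sketched below. It is not a tested solution for the full 2*10^6-row data; it assumes the rows are already sorted by Chromosome and then by start, and that a block ends whenever either the chromosome or the binary value changes. The helper names (n, new_block, first, last, shortlist) are made up for illustration.

```r
binary     <- c(1,1,1,0,0,0,1,1,1,0,0)
Chromosome <- c(1,1,1,1,1,1,2,2,2,2,2)
start      <- c(12,17,18,20,25,36,12,15,16,17,19)

# Assumption: rows are sorted by Chromosome, then by start.
n <- length(binary)

# A new block begins at row 1 and wherever the chromosome
# or the binary value differs from the previous row.
new_block <- c(TRUE, Chromosome[-1] != Chromosome[-n] | binary[-1] != binary[-n])

first <- which(new_block)        # first row of each block
last  <- c(first[-1] - 1, n)     # last row of each block

shortlist <- data.frame(
  Chromosome2    = Chromosome[first],
  position_start = start[first],
  position_end   = start[last],
  binary2        = binary[first]
)
print(shortlist)
```

Everything here is vectorised, so it should scale to a few million rows without an explicit loop, though that has not been measured on the real data.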

Part 2:
Based on the output of part 1, I need to assign the binary value to the rows of another data set. If the position value in this second data set falls within one of the blocks defined in the shortlist from part 1, the binary value of that block should be written to an extra column for this row. This would look something like this:
     Chromosome3 position Value binary3
 [1,] "1"         "12"     "a"   "1"    
 [2,] "1"         "13"     "b"   "1"    
 [3,] "1"         "14"     "c"   "1"    
 [4,] "1"         "15"     "d"   "1"    
 [5,] "1"         "16"     "e"   "1"    
 [6,] "1"         "18"     "f"   "1"    
 [7,] "1"         "20"     "g"   "0"    
 [8,] "1"         "21"     "h"   "0"    
 [9,] "1"         "22"     "i"   "0"    
[10,] "1"         "23"     "j"   "0"    
[11,] "1"         "25"     "k"   "0"    
[12,] "1"         "35"     "l"   "0"    
[13,] "2"         "12"     "m"   "1"    
[14,] "2"         "13"     "n"   "1"    
[15,] "2"         "14"     "o"   "1"    
[16,] "2"         "15"     "p"   "1"    
[17,] "2"         "16"     "q"   "1"    
[18,] "2"         "17"     "s"   "0"    
[19,] "2"         "18"     "d"   "0"    
[20,] "2"         "19"     "f"   "0"    
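
A minimal sketch of one way to do this lookup follows, using findInterval() per chromosome. It assumes the blocks within a chromosome do not overlap, and it leaves binary3 as NA for positions that fall in a gap between blocks (for example position 19 on chromosome 1), which is a choice the question does not specify. The shortlist is typed in by hand from the part 1 output, and the data frame name second is hypothetical.

```r
# Shortlist as shown in the expected output of part 1.
shortlist <- data.frame(
  Chromosome2    = c(1, 1, 2, 2),
  position_start = c(12, 20, 12, 17),
  position_end   = c(18, 36, 16, 19),
  binary2        = c(1, 0, 1, 0)
)

# Second data set from the example ("second" is a made-up name).
second <- data.frame(
  Chromosome3 = rep(c(1, 2), times = c(12, 8)),
  position    = c(12,13,14,15,16,18,20,21,22,23,25,35,
                  12,13,14,15,16,17,18,19),
  Value       = c("a","b","c","d","e","f","g","h","i","j","k","l",
                  "m","n","o","p","q","s","d","f")
)

second$binary3 <- NA_real_
for (chr in unique(second$Chromosome3)) {
  rows   <- which(second$Chromosome3 == chr)
  blocks <- shortlist[shortlist$Chromosome2 == chr, ]
  blocks <- blocks[order(blocks$position_start), ]

  # Index of the last block starting at or before each position (0 if none).
  idx <- findInterval(second$position[rows], blocks$position_start)
  idx[idx == 0] <- NA

  # Keep the match only if the position is not past the end of that block.
  inside <- !is.na(idx) & second$position[rows] <= blocks$position_end[idx]
  second$binary3[rows[inside]] <- blocks$binary2[idx[inside]]
}
print(second)
```

The loop runs once per chromosome rather than once per row, so the per-row work stays inside findInterval(), which does a binary search on the sorted block starts.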

Many thanks in advance,

Niels