

Dear all,
could you please advise on the R code I could use to do the
following operation:
1. I have 2 lists of "genome coordinates": each list is composed of
numbers that represent genome coordinates;
let's say list N:
n1
n2
n3
n4
and a list M:
m1
m2
m3
m4
m5
2. and a data frame C, where for some pairs of coordinates (n,m) from the
lists above, we have a numerical intensity;
for example :
n1; m1; 100
n1; m2; 300
The question would be: what is the most efficient R code I could use to
combine the list N, the list M, and the data frame C into a DATA FRAME with:
- list N as the column names
- list M as the row names
- the values in the cells of N * M corresponding to the numerical values
in the data frame C.
A little example would be:
     n1   n2   n3   n4
m1  100
m2  300
m3
m4
m5
I wrote a script in Perl, although I would like to do this in R.
Many thanks ;)
 bogdan
Reproducible example, please.  In particular, what exactly does C look like?
(You should know this by now).
 Bert
Bert Gunter
"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
Dear Bert,
thank you for your response. Here is the piece of R code. Given the 3 data
frames below:
N <- data.frame(N=c("n1","n2","n3","n4"))
M <- data.frame(M=c("m1","m2","m3","m4","m5"))
C <- data.frame(n=c("n1","n2","n3"), m=c("m1","m1","m3"), I=c(100,300,400))
how shall I integrate N, M, and C in such a way that at the end we have
a data frame with:
- list N as the column names
- list M as the row names
- the values in the cells of N * M corresponding to the numerical
values in the data frame C.
More precisely, the result shall be:
     n1   n2   n3   n4
m1  100  300
m2
m3            400
m4
m5
thank you !
Hi Bogdan,
Kinda messy, but:
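N <- data.frame(N=c("n1","n2","n3","n4"))
M <- data.frame(M=c("m1","m2","m3","m4","m5"))
C <- data.frame(n=c("n1","n2","n3"), m=c("m1","m1","m3"), I=c(100,300,400))
# empty data frame: one row per M, one column per N, as requested
MN <- as.data.frame(matrix(NA,nrow=length(M[,1]),ncol=length(N[,1])))
names(MN) <- N[,1]
rownames(MN) <- M[,1]
# make sure the keys are character (not factor) before indexing by name
C[,1] <- as.character(C[,1])
C[,2] <- as.character(C[,2])
# fill each (m, n) cell with its intensity
for(row in 1:dim(C)[1]) MN[C[row,2],C[row,1]] <- C[row,3]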
Jim
Thank you Jim !
Here's another approach:
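N <- data.frame(N=c("n1","n2","n3","n4"))
M <- data.frame(M=c("m1","m2","m3","m4","m5"))
C <- data.frame(n=c("n1","n2","n3"), m=c("m1","m1","m3"), I=c(100,300,400))
# Rebuild the factors using M and N, so that empty rows/columns are kept.
# as.character() is used for the levels because, under R >= 4.0, M$M and N$N
# are plain character vectors and levels() would return NULL for them.
C$m <- factor(as.character(C$m), levels=as.character(M$M))
C$n <- factor(as.character(C$n), levels=as.character(N$N))
MN <- xtabs(I~m+n, C)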
print(MN, zero.print="")
#     n
# m     n1  n2  n3  n4
#   m1 100 300
#   m2
#   m3         400
#   m4
#   m5
class(MN)
# [1] "xtabs" "table"
# MN is a table. If you want a data.frame
MN <- as.data.frame.matrix(MN)
class(MN)
# [1] "data.frame"
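
One point that may be worth keeping in mind with the xtabs() route: pairs that do not occur in C come back as 0 rather than NA, and repeated (m, n) pairs are summed. A minimal sketch of both behaviours, assuming the same toy data as above; the extra duplicated row and the names C2 and tab are made up purely for illustration:

```r
# Toy data from the thread, plus one hypothetical duplicated (n1, m1) pair
N <- data.frame(N = c("n1", "n2", "n3", "n4"))
M <- data.frame(M = c("m1", "m2", "m3", "m4", "m5"))
C2 <- data.frame(n = c("n1", "n2", "n3", "n1"),
                 m = c("m1", "m1", "m3", "m1"),
                 I = c(100, 300, 400, 50))

# Explicit levels so that unused coordinates (n4, m2, m4, m5) are kept
C2$m <- factor(as.character(C2$m), levels = as.character(M$M))
C2$n <- factor(as.character(C2$n), levels = as.character(N$N))

tab <- xtabs(I ~ m + n, C2)
tab["m1", "n1"]   # 150: the two (n1, m1) intensities are added together
tab["m5", "n4"]   # 0, not NA: absent pairs are filled with zero
```

If a true 0 intensity has to be distinguishable from "no measurement", an NA-initialised matrix or data frame is probably the safer container.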

David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
Thank you David for the code, as I am learning about xtabs operation. That
works great too ;)
Thank you David. Using xtabs operation simplifies the code very much, many
thanks ;)
Simple matrix indexing suffices without any fancier functionality.
## First convert M and N to character vectors -- which they should
## have been in the first place!
M <- sort(as.character(M[,1]))
N <- sort(as.character(N[,1]))
## This could be a one-liner, but I'll split it up for clarity.
res <- matrix(NA, length(M),length(N),dimnames = list(M,N))
res[as.matrix(C[,2:1])] <- C$I ## matrix indexing
res
Cheers,
Bert
Thank you Bert for your suggestion ;).
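
For anyone reading the matrix-indexing suggestion above for the first time: when a matrix that has dimnames is indexed by a two-column character matrix, each row of the index is read as a (row name, column name) pair. A small self-contained sketch of that mechanism, which also converts the result to a data frame as originally asked; the names rows, cols, idx, vals, res2 and df are illustrative only and not part of the thread's data:

```r
# Hypothetical toy dimensions, just to show the mechanism
rows <- c("m1", "m2", "m3")
cols <- c("n1", "n2")
res2 <- matrix(NA_real_, length(rows), length(cols),
               dimnames = list(rows, cols))

# Each row of idx is one (row name, column name) pair
idx  <- cbind(c("m1", "m3"), c("n2", "n1"))
vals <- c(10, 20)
res2[idx] <- vals          # assigns res2["m1","n2"] and res2["m3","n1"]

res2["m1", "n2"]           # 10
res2["m3", "n1"]           # 20

# Cells that were never assigned stay NA; convert if a data frame is needed
df <- as.data.frame(res2)
```

The whole assignment is a single vectorised step, so no explicit loop over the rows of the index is needed.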
