integrating 2 lists and a data frame in R

Bogdan Tanasa
 Dear all,

could you please advise on the R code I could use to do the
following operation:

a. -- I have 2 lists of "genome coordinates"; each list is composed of
numbers that represent genome coordinates;

let's say list N :

n1

n2

n3

n4

and a list M:

m1

m2

m3

m4

m5

b. -- and a data frame C, where for some pairs of coordinates (n,m) from the
lists above, we have a numerical intensity;

for example :

n1; m1; 100

n1; m2; 300

The question is: what is the most efficient R code I could use to
integrate the list N, the list M, and the data frame C, in order
to obtain a DATA FRAME with:

-- list N as the column names
-- list M as the row names
-- the values in the cells of N * M corresponding to the numerical values
in the data frame C.

A little example would be:

       n1   n2   n3   n4
m1    100    -    -    -
m2    300    -    -    -
m3      -    -    -    -
m4      -    -    -    -
m5      -    -    -    -

I wrote a script in Perl, although I would like to do this in R.
Many thanks ;)
-- bogdan

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: integrating 2 lists and a data frame in R

Bert Gunter-2
Reproducible example, please. In particular, what exactly does C look like?

(You should know this by now).

-- Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )



Re: integrating 2 lists and a data frame in R

Bogdan Tanasa
Dear Bert,

thank you for your response. Here is the piece of R code: given the 3 data
frames below,

N <- data.frame(N=c("n1","n2","n3","n4"))

M <- data.frame(M=c("m1","m2","m3","m4","m5"))

C <- data.frame(n=c("n1","n2","n3"), m=c("m1","m1","m3"), I=c(100,300,400))

how should I integrate N, M, and C in such a way that at the end we have
a data frame with:


   - list N as the column names
   - list M as the row names
   - the values in the cells of N * M corresponding to the numerical
   values in the data frame C.

More precisely, the result should be:

      n1   n2   n3   n4
m1   100  300    -    -
m2     -    -    -    -
m3     -    -  400    -
m4     -    -    -    -
m5     -    -    -    -

thank you !



Re: integrating 2 lists and a data frame in R

Jim Lemon-4
Hi Bogdan,
Kinda messy, but:

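# stringsAsFactors=FALSE keeps the names as character, so no as.character() step is needed
N <- data.frame(N=c("n1","n2","n3","n4"), stringsAsFactors=FALSE)
M <- data.frame(M=c("m1","m2","m3","m4","m5"), stringsAsFactors=FALSE)
C <- data.frame(n=c("n1","n2","n3"), m=c("m1","m1","m3"), I=c(100,300,400), stringsAsFactors=FALSE)
# M as rows and N as columns, as requested
MN<-as.data.frame(matrix(NA,nrow=nrow(M),ncol=nrow(N)))
names(MN)<-N[,1]
rownames(MN)<-M[,1]
# fill in each (m, n) cell listed in C
for(row in seq_len(nrow(C))) MN[C[row,"m"],C[row,"n"]]<-C[row,"I"]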

Jim
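
For larger inputs, the row-by-row loop can probably be replaced by a single vectorized assignment. The following is only a minimal sketch: it assumes N and M are held as plain character vectors (a simplification of the one-column data frames used above) and relies on base R's indexing of a matrix by a two-column character matrix of (row name, column name) pairs.

```r
# Assumption: N and M are character vectors rather than one-column data frames
N <- c("n1", "n2", "n3", "n4")
M <- c("m1", "m2", "m3", "m4", "m5")
C <- data.frame(n = c("n1", "n2", "n3"), m = c("m1", "m1", "m3"),
                I = c(100, 300, 400), stringsAsFactors = FALSE)

# Empty M x N matrix: M as row names, N as column names
MN <- matrix(NA_real_, nrow = length(M), ncol = length(N),
             dimnames = list(M, N))

# Each row of the index matrix is one (row name, column name) pair
MN[cbind(C$m, C$n)] <- C$I

MN <- as.data.frame(MN)
```

Cells with no entry in C stay NA, and no explicit loop over the rows of C is needed.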


Re: integrating 2 lists and a data frame in R

Bogdan Tanasa
Thank you Jim !


Re: integrating 2 lists and a data frame in R

David Carlson
In reply to this post by Jim Lemon-4
Here's another approach:

N <- data.frame(N=c("n1","n2","n3","n4"))
M <- data.frame(M=c("m1","m2","m3","m4","m5"))
C <- data.frame(n=c("n1","n2","n3"), m=c("m1","m1","m3"), I=c(100,300,400))

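# Rebuild the factors using M and N
# (as.character() rather than levels(), so this also works when M$M and N$N
# are character vectors, the data.frame() default since R 4.0)
C$m <- factor(as.character(C$m), levels=as.character(M$M))
C$n <- factor(as.character(C$n), levels=as.character(N$N))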
MN <- xtabs(I~m+n, C)
print(MN, zero.print="-")
#     n
# m     n1  n2  n3 n4
#   m1 100 300   -  -
#   m2   -   -   -  -
#   m3   -   - 400  -
#   m4   -   -   -  -
#   m5   -   -   -  -

class(MN)
# [1] "xtabs" "table"
# MN is a table. If you want a data.frame
MN <- as.data.frame.matrix(MN)
class(MN)
# [1] "data.frame"

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
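
Two properties of the xtabs approach are worth keeping in mind: cells with no entry in C come back as 0 rather than NA, and repeated (m, n) pairs are summed. Here is a small sketch of both, using a hypothetical extra row that repeats the (m1, n1) pair (not part of the original data), with tapply shown as one base R alternative that leaves empty cells as NA.

```r
# Hypothetical data: the last row repeats the (m1, n1) pair
C <- data.frame(n = c("n1", "n2", "n3", "n1"),
                m = c("m1", "m1", "m3", "m1"),
                I = c(100, 300, 400, 50), stringsAsFactors = FALSE)
C$m <- factor(C$m, levels = c("m1", "m2", "m3", "m4", "m5"))
C$n <- factor(C$n, levels = c("n1", "n2", "n3", "n4"))

# xtabs sums the duplicated pair and reports empty cells as 0
summed <- xtabs(I ~ m + n, C)

# tapply also sums per cell, but leaves empty cells as NA
with_na <- tapply(C$I, list(m = C$m, n = C$n), sum)
```

If a true 0 intensity has to be told apart from "no measurement", the tapply form (or the NA-filled matrix) may be the safer choice.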


Re: integrating 2 lists and a data frame in R

Bogdan Tanasa
Thank you David for the code, as I am learning about the xtabs operation. That
works great too ;)


Re: integrating 2 lists and a data frame in R

David Winsemius
In reply to this post by Jim Lemon-4

`xtabs` offers another route:

C$m <- factor(C$m, levels=M$M)
C$n <- factor(C$n, levels=N$N)

Option 1:  Zeroes in the empty positions:
> (X <- xtabs(I ~ m+n , C, addNA=TRUE))
    n
m     n1  n2  n3  n4
  m1 100 300   0   0
  m2   0   0   0   0
  m3   0   0 400   0
  m4   0   0   0   0
  m5   0   0   0   0

Option 2: Sparse matrix
> (X <- xtabs(I ~ m+n , C, sparse=TRUE))
5 x 4 sparse Matrix of class "dgCMatrix"
    n
m     n1  n2  n3 n4
  m1 100 300   .  .
  m2   .   .   .  .
  m3   .   . 400  .
  m4   .   .   .  .
  m5   .   .   .  .

I wasn't sure if the sparse results of xtabs would make a distinction between 0 and NA, but happily they do: the NA entry (n3, m4) prints as an empty ".", while the explicit 0 (n4, m5) prints as 0:

> C <- data.frame(n=c("n1","n2","n3", "n3", "n4"), m=c("m1","m1","m3", "m4", "m5"), I=c(100,300,400, NA, 0))
> C
   n  m   I
1 n1 m1 100
2 n2 m1 300
3 n3 m3 400
4 n3 m4  NA
5 n4 m5   0
> (X <- xtabs(I ~ m+n , C, sparse=TRUE))
4 x 4 sparse Matrix of class "dgCMatrix"
    n
m     n1  n2  n3 n4
  m1 100 300   .  .
  m3   .   . 400  .
  m4   .   .   .  .
  m5   .   .   .  0

(In the example I forgot to repeat the lines that augmented the factor levels, so m2 is not seen.)

--
David

David Winsemius
Alameda, CA, USA


Re: integrating 2 lists and a data frame in R

Bogdan Tanasa
Thank you David. Using xtabs operation simplifies the code very much, many
thanks ;)
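
For reference, the xtabs route can be sketched end to end as below. This is a minimal illustration rather than a definitive implementation: the names `N_names`, `M_names`, `C_df`, `X` and `X_df` are chosen here for the example, and the final conversion with `as.data.frame.matrix()` is one possible way (an assumption, not something proposed in the thread) to get the data frame that was originally requested.

```r
# Illustrative data mirroring the example posted in the thread.
N_names <- c("n1", "n2", "n3", "n4")
M_names <- c("m1", "m2", "m3", "m4", "m5")
C_df <- data.frame(n = c("n1", "n2", "n3"),
                   m = c("m1", "m1", "m3"),
                   I = c(100, 300, 400),
                   stringsAsFactors = FALSE)

# Without explicit factor levels, xtabs only sees the values present in C_df.
X_unleveled <- xtabs(I ~ m + n, data = C_df)

# Declaring the full set of levels makes every m and n appear in the table,
# including those that have no entry in C_df.
C_df$m <- factor(C_df$m, levels = M_names)
C_df$n <- factor(C_df$n, levels = N_names)
X <- xtabs(I ~ m + n, data = C_df)

# Convert the two-way table to a data frame with M as rows and N as columns.
X_df <- as.data.frame.matrix(X)
print(X_df)
```

Note that cells with no entry in `C_df` come out as 0 here, so this form cannot tell "no measurement" apart from a measured intensity of 0.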

On Tue, Jun 6, 2017 at 7:44 AM, David Winsemius <[hidden email]>
wrote:

>
> > On Jun 6, 2017, at 4:01 AM, Jim Lemon <[hidden email]> wrote:
> >
> > Hi Bogdan,
> > Kinda messy, but:
> >
> > N <- data.frame(N=c("n1","n2","n3","n4"))
> > M <- data.frame(M=c("m1","m2","m3","m4","m5"))
> > C <- data.frame(n=c("n1","n2","n3"), m=c("m1","m1","m3"), I=c(100,300,400))
> > MN<-as.data.frame(matrix(NA,nrow=length(N[,1]),ncol=length(M[,1])))
> > names(MN)<-M[,1]
> > rownames(MN)<-N[,1]
> > C[,1]<-as.character(C[,1])
> > C[,2]<-as.character(C[,2])
> > for(row in 1:dim(C)[1]) MN[C[row,1],C[row,2]]<-C[row,3]
>
> `xtabs` offers another route:
>
> C$m <- factor(C$m, levels=M$M)
> C$n <- factor(C$n, levels=N$N)
>
> Option 1:  Zeroes in the empty positions:
> > (X <- xtabs(I ~ m+n , C, addNA=TRUE))
>     n
> m     n1  n2  n3  n4
>   m1 100 300   0   0
>   m2   0   0   0   0
>   m3   0   0 400   0
>   m4   0   0   0   0
>   m5   0   0   0   0
>
> Option 2: Sparse matrix
> > (X <- xtabs(I ~ m+n , C, sparse=TRUE))
> 5 x 4 sparse Matrix of class "dgCMatrix"
>     n
> m     n1  n2  n3 n4
>   m1 100 300   .  .
>   m2   .   .   .  .
>   m3   .   . 400  .
>   m4   .   .   .  .
>   m5   .   .   .  .
>
> I wasn't sure if the sparse results of xtabs would make a distinction
> between 0 and NA, but happily it does:
>
> > C <- data.frame(n=c("n1","n2","n3", "n3", "n4"), m=c("m1","m1","m3", "m4", "m5"), I=c(100,300,400, NA, 0))
> > C
>    n  m   I
> 1 n1 m1 100
> 2 n2 m1 300
> 3 n3 m3 400
> 4 n3 m4  NA
> 5 n4 m5   0
> > (X <- xtabs(I ~ m+n , C, sparse=TRUE))
> 4 x 4 sparse Matrix of class "dgCMatrix"
>     n
> m     n1  n2  n3 n4
>   m1 100 300   .  .
>   m3   .   . 400  .
>   m4   .   .   .  .
>   m5   .   .   .  0
>
> (In the example I forgot to repeat the lines that augmented the factor
> levels, so m2 is not seen.)
>
> --
> David
> >
> >
> > Jim
> >
> David Winsemius
> Alameda, CA, USA
>
>


Re: integrating 2 lists and a data frame in R

Bert Gunter-2
Simple matrix indexing suffices without any fancier functionality.

## First convert M and N to character vectors -- which they should
## have been in the first place!

M <- sort(as.character(M[,1]))
N <- sort(as.character(N[,1]))

## This could be a one-liner, but I'll split it up for clarity.

res <- matrix(NA, length(M), length(N), dimnames = list(M, N))

res[as.matrix(C[,2:1])] <- C$I ## matrix indexing by (row name, column name) pairs

res
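
As a self-contained illustration of the matrix-indexing idea, here is a minimal sketch that can be pasted into a fresh session. The names `N_names`, `M_names`, `C_df`, `idx` and `res_df` are made up for this example, and the final `as.data.frame()` step is an assumption added only because the original question asked for a data frame.

```r
# Illustrative data mirroring the example posted in the thread.
N_names <- c("n1", "n2", "n3", "n4")
M_names <- c("m1", "m2", "m3", "m4", "m5")
C_df <- data.frame(n = c("n1", "n2", "n3"),
                   m = c("m1", "m1", "m3"),
                   I = c(100, 300, 400),
                   stringsAsFactors = FALSE)

# Empty numeric matrix with M as row names and N as column names.
res <- matrix(NA_real_, nrow = length(M_names), ncol = length(N_names),
              dimnames = list(M_names, N_names))

# A two-column character matrix used as an index is matched against the
# dimnames: column 1 picks the row, column 2 picks the column.
idx <- cbind(C_df$m, C_df$n)
res[idx] <- C_df$I

# Convert to a data frame; row and column names carry over.
res_df <- as.data.frame(res)
print(res_df)
```

Unlike the xtabs result, pairs that never occur in `C_df` stay `NA` here, which keeps them distinct from a recorded intensity of 0.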

Cheers,
Bert




On Tue, Jun 6, 2017 at 7:46 AM, Bogdan Tanasa <[hidden email]> wrote:

> Thank you David. Using xtabs operation simplifies the code very much, many
> thanks ;)

Re: integrating 2 lists and a data frame in R

Bogdan Tanasa
Thank you Bert for your suggestion ;).

On Tue, Jun 6, 2017 at 8:19 AM, Bert Gunter <[hidden email]> wrote:

> Simple matrix indexing suffices without any fancier functionality.