Merge two dataframe with "by", and problems with the common field

miao
Hi,

   From time to time I merge two data frames that have a common field besides
the one I merge by. After the merge, the common field is no longer present
under its own name; what I get instead are fieldname.x and fieldname.y. How
can I fix this so that I can still refer to the column by its original
field name? Please see the example below.

   Thanks

Miao


> d1
  a b c
1 1 4 5
2 2 5 6
3 3 6 7
> d2
  d a  f b
1 6 1  8 4
2 7 2  9 5
3 8 3 10 6
> d3<-merge(d1, d2, by="b")
> d3
  b a.x c d a.y  f
1 4   1 5 6   1  8
2 5   2 6 7   2  9
3 6   3 7 8   3 10
> d3["a"]
Error in `[.data.frame`(d3, "a") : undefined columns selected
> d3["a.x"]
  a.x
1   1
2   2
3   3
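
For anyone who wants to reproduce the output above, here is a minimal sketch that rebuilds d1 and d2. It assumes the columns are plain integer vectors; the post shows only the printed data frames, not how they were created.

```r
# Assumed construction: the post only shows the printed data frames.
d1 <- data.frame(a = 1:3, b = 4:6, c = 5:7)
d2 <- data.frame(d = 6:8, a = 1:3, f = 8:10, b = 4:6)

# "b" is the only key, so the shared non-key column "a" gets suffixed.
d3 <- merge(d1, d2, by = "b")
names(d3)            # "b" "a.x" "c" "d" "a.y" "f"
"a" %in% names(d3)   # FALSE, which is why d3["a"] fails
```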

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Merge two dataframe with "by", and problems with the common field

Jim Lemon
On 05/07/2013 04:33 PM, jpm miao wrote:

> How can I fix the problem so that I can still call by the
> original fieldname?

Hi jpm miao,
Because both data frames have a column named "a" that is not used in "by",
the merge function appends ".x" and ".y" to the two copies to keep the
column names unique. You could change the name of one column, for example,
change the name of the "a" column in d2 to "e". You could also drop one of
the "a" columns, since in this case the two columns are identical.

d3<-merge(d1, d2[,c("d","f","b")], by="b")
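
The renaming option mentioned above could look roughly like this. It is a sketch that assumes d1 and d2 are built from integer columns as below; d2r is just an illustrative name for the renamed copy.

```r
# Assumed construction of the example data.
d1 <- data.frame(a = 1:3, b = 4:6, c = 5:7)
d2 <- data.frame(d = 6:8, a = 1:3, f = 8:10, b = 4:6)

# Rename "a" to "e" in a copy of d2 so the names no longer clash.
d2r <- d2
names(d2r)[names(d2r) == "a"] <- "e"

d3 <- merge(d1, d2r, by = "b")
names(d3)   # "b" "a" "c" "d" "e" "f"
d3["a"]     # works: "a" now refers to the column from d1
```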

Jim

Re: Merge two dataframe with "by", and problems with the common field

Rainer Schuermann
Not sure whether this really helps you but at least it works for your sample:

d3 <- merge(d1, d2, by = c("a", "b"))

> d3
  a b c d  f
1 1 4 5 6  8
2 2 5 6 7  9
3 3 6 7 8 10
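
One caveat with merging on both columns, sketched under the assumption that d1 and d2 are built from integer columns as below. d2x is a made-up variant of d2 in which one "a" value differs. With both columns in 'by', a row matches only when "a" and "b" both agree, so by default the disagreeing row is dropped.

```r
# Assumed construction of the example data.
d1 <- data.frame(a = 1:3, b = 4:6, c = 5:7)
d2 <- data.frame(d = 6:8, a = 1:3, f = 8:10, b = 4:6)

# Hypothetical variant: the third "a" value no longer agrees with d1.
d2x <- d2
d2x$a[3] <- 9L

m <- merge(d1, d2x, by = c("a", "b"))
nrow(m)       # 2: the row with b == 6 has no match on (a, b)

# all = TRUE keeps the unmatched rows from both sides, padded with NA.
m_all <- merge(d1, d2x, by = c("a", "b"), all = TRUE)
nrow(m_all)   # 4
```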

Rgds,
Rainer


Re: Merge two dataframe with "by", and problems with the common field

Jeff Newmiller
Either d1$a and d2$a are always the same, or they are not.

If they are already the same, you can either omit one of them in the merge:

merge(d1, d2[,-2], by="b")

or you can use a set of columns for your by:

merge(d1,d2, by=c("a","b"))

If the "a" columns are distinct, then at least one of them needs a new name in the merged table, and the simplest option is to rename the columns appropriately in d1 and d2 (since they apparently represent different data anyway).
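
A rough way to find out which of the two cases applies, assuming d1 and d2 are built from integer columns as below: merge on "b" first, compare the two suffixed copies, then pick a strategy. The check only covers rows whose "b" values match, and the new names a1 and a2 are purely illustrative.

```r
# Assumed construction of the example data.
d1 <- data.frame(a = 1:3, b = 4:6, c = 5:7)
d2 <- data.frame(d = 6:8, a = 1:3, f = 8:10, b = 4:6)

# Merge on "b" alone and compare the two copies of "a".
m <- merge(d1, d2, by = "b")
a_agree <- isTRUE(all.equal(m$a.x, m$a.y))

if (a_agree) {
  # Same data: use both columns as keys, so "a" appears only once.
  d3 <- merge(d1, d2, by = c("a", "b"))
} else {
  # Different data: give each column its own name before merging.
  names(d1)[names(d1) == "a"] <- "a1"
  names(d2)[names(d2) == "a"] <- "a2"
  d3 <- merge(d1, d2, by = "b")
}
names(d3)   # "a" "b" "c" "d" "f" for the example data
```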

Re: Merge two dataframe with "by", and problems with the common field

William Dunlap
> If the "a" columns are distinct, then at least one of them needs a new name in the
> merged table, and the simplest option is to rename the columns appropriately in d1 and
> d2 (since they apparently represent different data anyway).

You can also use the 'suffixes' argument of merge to control how common
column names that are not used as 'by' columns are renamed:
  > merge(d1,d2,by="b", suffixes=c("", ".y"))
    b a c d a.y  f
  1 4 1 5 6   1  8
  2 5 2 6 7   2  9
  3 6 3 7 8   3 10

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

