strange output with merge function

anikaM
Hi,

I have two data frames like this:

> head(a)
                V1    V2     V3     V4 V5   V6      V7          V8    V9    V10
1: ENSG00000272636 chr17 181637 181636  - 4924 -769472   rs7216126 chr17 951108
2: ENSG00000273172 chr17 191588 191587  - 4978   40553  rs62053745 chr17 151035
3: ENSG00000273172 chr17 191588 191587  - 4978   39501  rs77383171 chr17 152087
4: ENSG00000273172 chr17 191588 191587  - 4978   38817  rs34245596 chr17 152771
5: ENSG00000273172 chr17 191588 191587  - 4978   38580 rs112513622 chr17 153008
6: ENSG00000273172 chr17 191588 191587  - 4978   37794   rs8069278 chr17 153794
      V11         V12       V13 V14          V15  V16
1: 951108 4.03837e-05  0.429720   1 6.967229e-05 TRUE
2: 151035 1.42190e-08 -0.488594   0 7.139876e-05 TRUE
3: 152087 1.76913e-09 -0.664469   0 7.139876e-05 TRUE
4: 152771 2.04442e-08 -0.479176   0 7.139876e-05 TRUE
5: 153008 6.46268e-07 -0.768610   0 7.139876e-05 TRUE
6: 153794 1.95944e-08 -0.480011   0 7.139876e-05 TRUE


> head(f)
            V8
1:  rs12940868
2:   rs4383187
3:   rs4404112
4:   rs7214091
5:  rs35871790
6: rs112532541

I am trying to merge with:

m=merge(a,f,by="V8")

but in the output, V8 (the column I am merging on) is replaced with the
number 17:

> head(m)
   V8              V1    V2     V3     V4 V5   V6    V7    V9    V10    V11
1: 17 ENSG00000273172 chr17 191588 191587  - 4978  1108 chr17 190480 190480
2: 17 ENSG00000273172 chr17 191588 191587  - 4978  1108 chr17 190480 190480
3: 17 ENSG00000273172 chr17 191588 191587  - 4978  1108 chr17 190480 190480
4: 17 ENSG00000262061 chr17 331206 331205  + 5858 24653 chr17 355858 355858
5: 17 ENSG00000262061 chr17 331206 331205  + 5858 24653 chr17 355858 355858
6: 17 ENSG00000262061 chr17 331206 331205  + 5858 24653 chr17 355858 355858
           V12       V13 V14          V15  V16
1: 8.23430e-09 -0.511644   0 7.139876e-05 TRUE
2: 8.23430e-09 -0.511644   0 7.139876e-05 TRUE
3: 8.23430e-09 -0.511644   0 7.139876e-05 TRUE
4: 5.46479e-05 -0.329785   0 5.962828e-05 TRUE
5: 5.46479e-05 -0.329785   0 5.962828e-05 TRUE
6: 5.46479e-05 -0.329785   0 5.962828e-05 TRUE

Please advise,
Thanks
Ana
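
One way to narrow down a result like the one in the question is to check how the key column is stored in each table and how many keys the two tables actually share before merging. The sketch below uses small made-up stand-ins for a and f (the values and the use of data.frame() are assumptions, not the poster's data):

```r
# Made-up stand-ins for the poster's tables `a` and `f`
a <- data.frame(V8 = c("rs7216126", "rs62053745", "rs77383171"),
                V2 = "chr17",
                stringsAsFactors = FALSE)
f <- data.frame(V8 = c("rs62053745", "rs4383187"),
                stringsAsFactors = FALSE)

# How is the key stored in each table (character, factor, integer, ...)?
str(a$V8)
str(f$V8)

# Which keys occur in both tables?
shared <- intersect(a$V8, f$V8)
length(shared)

# After merging, every key in the result should be one of the shared keys
m <- merge(a, f, by = "V8")
all(m$V8 %in% shared)
```

If the key types differ between the tables, or the merged keys are not among the shared ones, the problem lies in how the key columns are stored rather than in the merge() call itself.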

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: strange output with merge function

Jeff Newmiller
My guess would be that your key columns are factors with different levels. You have to be very careful when manipulating factors; I recommend using character data until you reach analysis steps that need factors. You may be able to resolve this by importing your data with as.is=TRUE or stringsAsFactors=FALSE, depending on which input function you are using (read its help page).
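
To illustrate the factor point: a factor stores integer codes plus a table of labels, which is one way an ID column can end up showing numbers instead of names. This is a minimal sketch with made-up miniature tables standing in for a and f (the values are assumptions, and stringsAsFactors = TRUE is set only to force factor keys for the demonstration):

```r
# Hypothetical miniature tables; the values are made up for illustration
a <- data.frame(V8  = c("rs7216126", "rs62053745", "rs77383171"),
                V13 = c(0.43, -0.49, -0.66),
                stringsAsFactors = TRUE)   # force factor keys
f <- data.frame(V8 = c("rs62053745", "rs7216126"),
                stringsAsFactors = TRUE)

# The two key columns are factors with different level sets
levels(a$V8)
levels(f$V8)

# as.integer() on a factor returns the internal codes, not the labels
as.integer(a$V8)

# Convert the keys to character before merging
a$V8 <- as.character(a$V8)
f$V8 <- as.character(f$V8)

m <- merge(a, f, by = "V8")
m
```

With character keys, merge() compares the ID strings directly, so the result does not depend on how each table happened to number its factor levels.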

On December 3, 2019 10:02:45 AM PST, Ana Marija <[hidden email]> wrote:

>[...]

--
Sent from my phone. Please excuse my brevity.


Re: strange output with merge function

anikaM
Hi Jeff,

The issue was resolved after I converted the data with as.data.frame().

Thank you for the useful input!

Ana
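
A sketch of what such a conversion might look like, combined with the character-key advice from the earlier reply. The helper name to_plain_df is hypothetical, and the tables are made-up stand-ins; the thread does not show the exact calls that were used:

```r
# Hypothetical helper: drop any extra table classes and turn factor
# columns into character columns
to_plain_df <- function(x) {
  x <- as.data.frame(x)
  x[] <- lapply(x, function(col) {
    if (is.factor(col)) as.character(col) else col
  })
  x
}

# Made-up stand-ins for the poster's tables
a <- data.frame(V8  = c("rs7216126", "rs62053745"),
                V14 = c(1, 0),
                stringsAsFactors = TRUE)
f <- data.frame(V8 = "rs62053745", stringsAsFactors = TRUE)

a_df <- to_plain_df(a)
f_df <- to_plain_df(f)

m <- merge(a_df, f_df, by = "V8")
str(m)
```

Non-factor columns pass through unchanged, so the numeric columns keep their types.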

On Tue, Dec 3, 2019 at 12:46 PM Jeff Newmiller <[hidden email]>
wrote:

> [...]
