Merging two data frames, but keeping NAs

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Merging two data frames, but keeping NAs

Rainer Krug-3
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi

My brain is giving up on this...

I have the following two data.frames:

  x <-  data.frame(ref=c(NA, NA, NA, 10:5, NA, 1:5))
  y <-  data.frame(id = c(2, 3, 4, 6, 7, 9, 8), val = 101:107)

Which look as follow:

> x
   ref
1   NA
2   NA
3   NA
4   10
5    9
6    8
7    7
8    6
9    5
10  NA
11   1
12   2
13   3
14   4
15   5
> y
  id val
1  2 101
2  3 102
3  4 103
4  6 104
5  7 105
6  9 106
7  8 107
>

Now I want to merge y into x, but that

a) the sort order of x stays the same (sort=FALSE in merge()) and
b) the NAs stay

The result should look as follow (column id only here for clarity):

> result
   ref  id  val
1   NA  NA  NA
2   NA  NA  NA
3   NA  NA  NA
4   10  NA  NA
5    9   9   106
6    8   8   107
7    7   7   105
8    6   6   104
9    5  NA  NA
10  NA  NA  NA
11   1  NA  NA
12   2   2  101
13   3   3  102
14   4   4  103
15   5  NA  NA

merge(x, y, by.x="ref", by.y="id", sort=FALSE) leaves out the NA, but
otherwise it works:

> merge(x, y, by.x=1, by.y="id", sort=FALSE)
  ref val
1   9 106
2   8 107
3   7 105
4   6 104
5   2 101
6   3 102
7   4 103

Is there any way that I can tell merge() to keep the NA, or how can I
achieve what I want?

Thanks,

Rainer

- --
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :       +33 - (0)9 53 10 27 44
Cell:       +33 - (0)6 85 62 59 98
Fax :       +33 - (0)9 58 10 27 44

Fax (D):    +49 - (0)3 21 21 25 22 44

email:      [hidden email]

Skype:      RMkrug
-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.22 (Darwin)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJSoJQpAAoJENvXNx4PUvmCW2oH/A9s2HbZ16PZRmFcQsxH3uYC
T20b1HXQu8iVqVkfD1D1tbPnogU5QJF1+tJMVzwkg+enhDtop6qpS5Vm5RV9KFnk
eJxmwdIQI3sZOkpReH9cPCnG0bHGO5f+iW3fA7mx95jQTm8WHaU+7zo7Ueb62oeX
/Toc4cVKI2qljzRfJkJCyKZclXbTe0YRv/EKqHDjyI1k/1/1jYVxALm/CqvSZTQQ
SE7nhDVvKHbuBfvrH4A5iy0X/TyHTYgP5eVV7/W4D4OcBYgQDJMwm1z0JTKeF37Z
e1gaUEkDbbJrNIOvB5Bl9EWaym0FFyv7w9XUV/FtqJy7QgQ6qoNxUAo1CfXkx/s=
=hX0S
-----END PGP SIGNATURE-----

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Merging two data frames, but keeping NAs

Sarah Goslee
Adding the argument all.x=TRUE to merge() will retain the NA values,
but the only reliable way I've found to preserve order with NA values
in a merge is to add an index column to x, merge the data, sort on the
index column, then delete it.

Sarah

On Thu, Dec 5, 2013 at 9:56 AM, Rainer M Krug <[hidden email]> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi
>
> My brain is giving up on this...
>
> I have the following two data.frames:
>
>   x <-  data.frame(ref=c(NA, NA, NA, 10:5, NA, 1:5))
>   y <-  data.frame(id = c(2, 3, 4, 6, 7, 9, 8), val = 101:107)
>
> Which look as follow:
>
>> x
>    ref
> 1   NA
> 2   NA
> 3   NA
> 4   10
> 5    9
> 6    8
> 7    7
> 8    6
> 9    5
> 10  NA
> 11   1
> 12   2
> 13   3
> 14   4
> 15   5
>> y
>   id val
> 1  2 101
> 2  3 102
> 3  4 103
> 4  6 104
> 5  7 105
> 6  9 106
> 7  8 107
>>
>
> Now I want to merge y into x, but that
>
> a) the sort order of x stays the same (sort=FALSE in merge()) and
> b) the NAs stay
>
> The result should look as follow (column id only here for clarity):
>
>> result
>    ref  id  val
> 1   NA  NA  NA
> 2   NA  NA  NA
> 3   NA  NA  NA
> 4   10  NA  NA
> 5    9   9   106
> 6    8   8   107
> 7    7   7   105
> 8    6   6   104
> 9    5  NA  NA
> 10  NA  NA  NA
> 11   1  NA  NA
> 12   2   2  101
> 13   3   3  102
> 14   4   4  103
> 15   5  NA  NA
>
> merge(x, y, by.x="ref", by.y="id", sort=FALSE) leaves out the NA, but
> otherwise it works:
>
>> merge(x, y, by.x=1, by.y="id", sort=FALSE)
>   ref val
> 1   9 106
> 2   8 107
> 3   7 105
> 4   6 104
> 5   2 101
> 6   3 102
> 7   4 103
>
> Is there any way that I can tell merge() to keep the NA, or how can I
> achieve what I want?
>
> Thanks,
>
> Rainer
>

--
Sarah Goslee
http://www.functionaldiversity.org

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Merging two data frames, but keeping NAs

arun kirshna
In reply to this post by Rainer Krug-3
Hi,
Try ?join()

library(plyr)
y$ref <- y$id
> join(x,y,by="ref")
   ref id val
1   NA NA  NA
2   NA NA  NA
3   NA NA  NA
4   10 NA  NA
5    9  9 106
6    8  8 107
7    7  7 105
8    6  6 104
9    5 NA  NA
10  NA NA  NA
11   1 NA  NA
12   2  2 101
13   3  3 102
14   4  4 103
15   5 NA  NA


A.K.


On Thursday, December 5, 2013 9:58 AM, Rainer M Krug <[hidden email]> wrote:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi

My brain is giving up on this...

I have the following two data.frames:

  x <-  data.frame(ref=c(NA, NA, NA, 10:5, NA, 1:5))
  y <-  data.frame(id = c(2, 3, 4, 6, 7, 9, 8), val = 101:107)

Which look as follow:

> x
   ref
1   NA
2   NA
3   NA
4   10
5    9
6    8
7    7
8    6
9    5
10  NA
11   1
12   2
13   3
14   4
15   5
> y
  id val
1  2 101
2  3 102
3  4 103
4  6 104
5  7 105
6  9 106
7  8 107
>

Now I want to merge y into x, but that

a) the sort order of x stays the same (sort=FALSE in merge()) and
b) the NAs stay

The result should look as follow (column id only here for clarity):

> result
   ref  id  val
1   NA  NA  NA
2   NA  NA  NA
3   NA  NA  NA
4   10  NA  NA
5    9   9   106
6    8   8   107
7    7   7   105
8    6   6   104
9    5  NA  NA
10  NA  NA  NA
11   1  NA  NA
12   2   2  101
13   3   3  102
14   4   4  103
15   5  NA  NA

merge(x, y, by.x="ref", by.y="id", sort=FALSE) leaves out the NA, but
otherwise it works:

> merge(x, y, by.x=1, by.y="id", sort=FALSE)
  ref val
1   9 106
2   8 107
3   7 105
4   6 104
5   2 101
6   3 102
7   4 103

Is there any way that I can tell merge() to keep the NA, or how can I
achieve what I want?

Thanks,

Rainer

- --
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :       +33 - (0)9 53 10 27 44
Cell:       +33 - (0)6 85 62 59 98
Fax :       +33 - (0)9 58 10 27 44

Fax (D):    +49 - (0)3 21 21 25 22 44

email:      [hidden email]

Skype:      RMkrug
-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.22 (Darwin)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJSoJQpAAoJENvXNx4PUvmCW2oH/A9s2HbZ16PZRmFcQsxH3uYC
T20b1HXQu8iVqVkfD1D1tbPnogU5QJF1+tJMVzwkg+enhDtop6qpS5Vm5RV9KFnk
eJxmwdIQI3sZOkpReH9cPCnG0bHGO5f+iW3fA7mx95jQTm8WHaU+7zo7Ueb62oeX
/Toc4cVKI2qljzRfJkJCyKZclXbTe0YRv/EKqHDjyI1k/1/1jYVxALm/CqvSZTQQ
SE7nhDVvKHbuBfvrH4A5iy0X/TyHTYgP5eVV7/W4D4OcBYgQDJMwm1z0JTKeF37Z
e1gaUEkDbbJrNIOvB5Bl9EWaym0FFyv7w9XUV/FtqJy7QgQ6qoNxUAo1CfXkx/s=
=hX0S
-----END PGP SIGNATURE-----

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Merging two data frames, but keeping NAs

Rainer Krug-3
In reply to this post by Sarah Goslee
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 12/05/13, 16:11 , Sarah Goslee wrote:
> Adding the argument all.x=TRUE to merge() will retain the NA
> values, but the only reliable way I've found to preserve order with
> NA values in a merge is to add an index column to x, merge the
> data, sort on the index column, then delete it.

Thanks Sarah - that works nicely, although it is a not so nice
workaround 0 there should be an argument in merge to keep NA...

Cheers,

Rainer

>
> Sarah
>
> On Thu, Dec 5, 2013 at 9:56 AM, Rainer M Krug <[hidden email]>
> wrote:
>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
>>
>> Hi
>>
>> My brain is giving up on this...
>>
>> I have the following two data.frames:
>>
>> x <-  data.frame(ref=c(NA, NA, NA, 10:5, NA, 1:5)) y <-
>> data.frame(id = c(2, 3, 4, 6, 7, 9, 8), val = 101:107)
>>
>> Which look as follow:
>>
>>> x
>> ref 1   NA 2   NA 3   NA 4   10 5    9 6    8 7    7 8    6 9
>> 5 10  NA 11   1 12   2 13   3 14   4 15   5
>>> y
>> id val 1  2 101 2  3 102 3  4 103 4  6 104 5  7 105 6  9 106 7  8
>> 107
>>>
>>
>> Now I want to merge y into x, but that
>>
>> a) the sort order of x stays the same (sort=FALSE in merge())
>> and b) the NAs stay
>>
>> The result should look as follow (column id only here for
>> clarity):
>>
>>> result
>> ref  id  val 1   NA  NA  NA 2   NA  NA  NA 3   NA  NA  NA 4   10
>> NA  NA 5    9   9   106 6    8   8   107 7    7   7   105 8    6
>> 6   104 9    5  NA  NA 10  NA  NA  NA 11   1  NA  NA 12   2   2
>> 101 13   3   3  102 14   4   4  103 15   5  NA  NA
>>
>> merge(x, y, by.x="ref", by.y="id", sort=FALSE) leaves out the NA,
>> but otherwise it works:
>>
>>> merge(x, y, by.x=1, by.y="id", sort=FALSE)
>> ref val 1   9 106 2   8 107 3   7 105 4   6 104 5   2 101 6   3
>> 102 7   4 103
>>
>> Is there any way that I can tell merge() to keep the NA, or how
>> can I achieve what I want?
>>
>> Thanks,
>>
>> Rainer
>>
>

- --
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :       +33 - (0)9 53 10 27 44
Cell:       +33 - (0)6 85 62 59 98
Fax :       +33 - (0)9 58 10 27 44

Fax (D):    +49 - (0)3 21 21 25 22 44

email:      [hidden email]

Skype:      RMkrug
-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.22 (Darwin)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJSoYwnAAoJENvXNx4PUvmCTjwH/2s8NdixLDI7uWvZ0p90wFxK
OMq9IcOTQ/VEK6ksYzN5e8Q6ukGCgMPW2OKqrLkqr9xhtt49toWR64CgXGgqnKYu
Vu5BT8MldwvtLYLWjyGGlrsz4VXFBixTQxfPPltSXakT742Wno7T0OLIL7V8FBgk
AqdRZpN6+QfBiQGFO7doXWndvnvXXD3uOqEAe89xwV3PBNHLCNDcMKY74HQ+t4F+
RrBzKZRvBOrwyfHFGFGfvEluewpcsPY2ooR/TqcO1XaLz94A5F2RcHdedqkIcdln
tEcOWZq9j9RWQo/9Af4pdxv9CClt8molP3rG4JRYA4x9JiSj4GNYNNF5wnofTAw=
=nxjF
-----END PGP SIGNATURE-----

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Merging two data frames, but keeping NAs

Rainer Krug-3
In reply to this post by arun kirshna
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



On 12/05/13, 16:37 , arun wrote:
> Hi, Try ?join()
>
> library(plyr)

Well - what would we do without Hadley ...

He solved many problems we didn't know we would have soon...

Cheers,

Rainer

> y$ref <- y$id
>> join(x,y,by="ref")
> ref id val 1   NA NA  NA 2   NA NA  NA 3   NA NA  NA 4   10 NA  NA
> 5    9  9 106 6    8  8 107 7    7  7 105 8    6  6 104 9    5 NA
> NA 10  NA NA  NA 11   1 NA  NA 12   2  2 101 13   3  3 102 14   4
> 4 103 15   5 NA  NA
>
>
> A.K.
>
>
> On Thursday, December 5, 2013 9:58 AM, Rainer M Krug
> <[hidden email]> wrote: Hi
>
> My brain is giving up on this...
>
> I have the following two data.frames:
>
> x <-  data.frame(ref=c(NA, NA, NA, 10:5, NA, 1:5)) y <-
> data.frame(id = c(2, 3, 4, 6, 7, 9, 8), val = 101:107)
>
> Which look as follow:
>
>> x
> ref 1   NA 2   NA 3   NA 4   10 5    9 6    8 7    7 8    6 9    5
> 10  NA 11   1 12   2 13   3 14   4 15   5
>> y
> id val 1  2 101 2  3 102 3  4 103 4  6 104 5  7 105 6  9 106 7  8
> 107
>
>
> Now I want to merge y into x, but that
>
> a) the sort order of x stays the same (sort=FALSE in merge()) and
> b) the NAs stay
>
> The result should look as follow (column id only here for
> clarity):
>
>> result
> ref  id  val 1   NA  NA  NA 2   NA  NA  NA 3   NA  NA  NA 4   10
> NA  NA 5    9   9   106 6    8   8   107 7    7   7   105 8    6
> 6   104 9    5  NA  NA 10  NA  NA  NA 11   1  NA  NA 12   2   2
> 101 13   3   3  102 14   4   4  103 15   5  NA  NA
>
> merge(x, y, by.x="ref", by.y="id", sort=FALSE) leaves out the NA,
> but otherwise it works:
>
>> merge(x, y, by.x=1, by.y="id", sort=FALSE)
> ref val 1   9 106 2   8 107 3   7 105 4   6 104 5   2 101 6   3
> 102 7   4 103
>
> Is there any way that I can tell merge() to keep the NA, or how can
> I achieve what I want?
>
> Thanks,
>
> Rainer
>
>
> ______________________________________________ [hidden email]
> mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do
> read the posting guide http://www.R-project.org/posting-guide.html 
> and provide commented, minimal, self-contained, reproducible code.
>

- --
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :       +33 - (0)9 53 10 27 44
Cell:       +33 - (0)6 85 62 59 98
Fax :       +33 - (0)9 58 10 27 44

Fax (D):    +49 - (0)3 21 21 25 22 44

email:      [hidden email]

Skype:      RMkrug
-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.22 (Darwin)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJSoYxtAAoJENvXNx4PUvmC8JMIANWUXBhCFgKv+wZs2oKv1jMm
qGLcd31a55j8NSoZZRf5v6coG+UEdVGhBu4cLlt1+0BRAhYIK9AnLvV9KXbt5zbI
PKySevB3box1ILbwsr8JH2YyOtlgjjint4LcGuEr4doNy0uo7a3G9J3ctxZgDFeE
QrmDH8EFc55lX76gzp41xUaAxvBP72GlgwK9O4jyO4f19LFcJ87C68s7Gwm2Qs4x
Ysc3JmZ8tC4BlD4H5FV/Pf6cLCxoX3CgQERGD+NNe5HCW/XSXOYsKzreamPr7ayd
bAuTDLRpPqUSYKG/nbcvjj0HMs06YNTYP4LTnwp08QUJ2VH98viQkTBF8OxDGgI=
=mK8w
-----END PGP SIGNATURE-----

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.