problems when merging two data sets


SasaK
Dear All,

I would like to merge two data sets, but I am doing something wrong.
The first data set contains two columns: 'species occurrence' in Germany
(column 1) and 'species names' (column 2).
The second one contains the names of 'Red list species' (column 1) and
'species status' (column 2).
I would like to match the Red list species against the species names from
the first table and assign the species status.
I have tried the merge function but got this error: "'by' must specify
a uniquely valid column".
I also tried the function left_join, without success.

The two data sets also differ in size: the first table has 7189 rows
and the second just 426 rows, as we do not have many Red list species.

I would appreciate your help.

Kind regards,
Sasha


Dr Sasha Kosanic
Ecology Lab (Biology Department)
Room M842
University of Konstanz
Universitätsstraße 10
D-78464 Konstanz
Phone: +49 7531 883321 & +49 (0)175 9172503

http://cms.uni-konstanz.de/vkleunen/
https://tinyurl.com/y8u5wyoj
https://tinyurl.com/cgec6tu

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: problems when merging two data sets

Jeff Newmiller
There are many examples of how to do this properly on the web, and many ways you could have failed to follow those examples. You need to be much more specific (using actual R code) about what you did in order for us to help you get past your specific error. [1][2][3]

You will also avoid the what-we-see-is-different-than-what-you-saw problems with your email if you read the Posting Guide and ensure that your email client is configured to send plain-text rather than HTML-format email to the mailing list.

[1] http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

[2] http://adv-r.had.co.nz/Reproducibility.html

[3] https://cran.r-project.org/web/packages/reprex/index.html (read the vignette)
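
For illustration only, here is a rough sketch of what a minimal reproducible example for a merge question might contain. The data frame `occ`, its column names, and its values are invented stand-ins, not the original poster's data; the idea is simply to post the output of `dput()` on a small slice so that others can recreate the object exactly.

```r
# Hypothetical stand-in for the occurrence table (names and values invented).
occ <- data.frame(
  species.occurrence = c("Bavaria", "Saxony", "Hesse", "Bavaria"),
  species.names      = c("Arnica montana", "Bellis perennis",
                         "Adonis vernalis", "Bellis perennis"),
  stringsAsFactors = FALSE
)

# Show the structure: column names and types are what a merge depends on.
str(occ)

# Print R code that recreates the first few rows; paste this into the email
# together with the exact merge() call that fails.
dput(head(occ, 3))
```

Posting the `dput()` output for a few rows of both tables, plus the failing call, is usually enough for someone to reproduce the error.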


On February 5, 2019 9:56:37 AM PST, sasa kosanic <[hidden email]> wrote:

>Dear All,
>
>I would like to merge two data sets however I am doing something
>wrong...
>[...]

--
Sent from my phone. Please excuse my brevity.


Re: problems when merging two data sets

Bert Gunter-2
Show us your code! (as the posting guide below requests. Please read the
posting guide).


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )



Re: problems when merging two data sets

François Collin-2
I quite agree with Jeff Newmiller and Bert Gunter.

The error you get ("'by' must specify a uniquely valid column") is
very common when the function merge is misused. That said, merge is
the right choice here. Have you read the manual for the function by
running the command `?merge`? That is always a good start.

Here is what the function call looks like:

`merge(x, y, by = intersect(names(x), names(y)), by.x = by, by.y = by,
all = FALSE, all.x = all, all.y = all, sort = TRUE, suffixes =
c(".x",".y"), no.dups = TRUE, incomparables = NULL, ...)`

For your case, you probably need only 4 arguments:

`merge(x = dataset1, y = dataset2, by.x = "key1", by.y = "key2")`

In this example, key1 is the name of the column in dataset1 whose values
should match those of the column named key2 in dataset2.
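
As a minimal sketch of such a by.x/by.y call, here it is on toy data. The data frames `occ` and `redlist`, the column names (`species.names`, `Red.list.species`, `species.status`) and all values are assumptions made up for illustration; the real tables may well be named differently.

```r
# Toy stand-ins for the two tables; every name and value here is hypothetical.
occ <- data.frame(
  species.occurrence = c("Bavaria", "Saxony", "Hesse", "Bavaria"),
  species.names      = c("Arnica montana", "Bellis perennis",
                         "Adonis vernalis", "Bellis perennis"),
  stringsAsFactors = FALSE
)
redlist <- data.frame(
  Red.list.species = c("Arnica montana", "Adonis vernalis"),
  species.status   = c("vulnerable", "endangered"),
  stringsAsFactors = FALSE
)

# The key columns are named differently in the two tables, so they are
# given separately through by.x and by.y.
# With the default all = FALSE, only species found in both tables are kept.
merged <- merge(x = occ, y = redlist,
                by.x = "species.names", by.y = "Red.list.species")
print(merged)
```

The key column keeps its name from the first table (`species.names` here), and the rows for species absent from the Red list are dropped.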

Again, read the manual to understand the other arguments. I would
especially advise you to look at the arguments suffixes, all.x and all.y,
which will help you do exactly what you want.
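
To illustrate what all.x changes, here is a small sketch under the same kind of assumptions (made-up data frames `occ` and `redlist` with invented column names and values). Setting all.x = TRUE keeps every row of the first table and fills the status with NA where the species is not on the Red list, which is probably what is wanted when the large table should stay complete.

```r
# Hypothetical toy tables; names and values are invented for illustration.
occ <- data.frame(
  species.names      = c("Arnica montana", "Bellis perennis", "Adonis vernalis"),
  species.occurrence = c("Bavaria", "Saxony", "Hesse"),
  stringsAsFactors = FALSE
)
redlist <- data.frame(
  Red.list.species = c("Arnica montana", "Adonis vernalis"),
  species.status   = c("vulnerable", "endangered"),
  stringsAsFactors = FALSE
)

# A result of 0 means every Red list species appears only once; duplicated
# keys in the second table would multiply the matching rows of the first.
anyDuplicated(redlist$Red.list.species)

# all.x = TRUE keeps all rows of x; unmatched species get NA as status.
with_status <- merge(occ, redlist,
                     by.x = "species.names", by.y = "Red.list.species",
                     all.x = TRUE)
print(with_status)
```

As long as the keys in the second table are unique, the result has as many rows as the first table.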

Cheers,

Francois COLLIN

On 05/02/2019 19:49, Bert Gunter wrote:

> Show us your code! (as the posting guide below requests. Please read the
> posting guide).

Re: problems when merging two data sets

Jim Lemon-4
Hi Sasha,
I'll take a wild guess that your column names have periods (.)
replacing the spaces in the names you use:

species occurrence -> species.occurrence

The error message means that R can't find the variable name you have
used in the "by" argument. The second wild guess is that your column
names for the species names are different and you must use the "by.x"
and "by.y" arguments instead of just "by".
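
Both guesses can be checked quickly. The sketch below uses an invented two-line CSV and an invented `redlist` table, so the names are assumptions only; it shows how read.csv() rewrites a header containing spaces, how a "by" name that is not among the column names triggers an error, and how the call works once the actual names are used.

```r
# read.csv() uses check.names = TRUE by default, so spaces in the header
# are replaced by periods. The CSV text is a made-up example.
csv_text <- "species occurrence,species names\nBavaria,Arnica montana\n"
occ <- read.csv(text = csv_text, stringsAsFactors = FALSE)
names(occ)

# make.names() performs the same conversion on a single name.
make.names("species names")  # "species.names"

# Hypothetical Red list table with a differently named key column.
redlist <- data.frame(
  Red.list.species = "Arnica montana",
  species.status   = "vulnerable",
  stringsAsFactors = FALSE
)

# Using the name with a space fails, because no column has that name.
# The error message is captured rather than stopping the script.
err <- tryCatch(
  merge(occ, redlist, by = "species names"),
  error = function(e) conditionMessage(e)
)
print(err)

# Using the names that names() actually reports, via by.x and by.y, works.
fixed <- merge(occ, redlist,
               by.x = "species.names", by.y = "Red.list.species")
print(fixed)
```

Running names() on both data frames before calling merge is the quickest way to see which strings "by", "by.x" and "by.y" will accept.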

Jim
