Joining tables with different order and matched values

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Joining tables with different order and matched values

abo dalash
Hi All ..,


I have 2 tables and I'm trying to have some information from the 1st table to appear in the second table with different order.


For Example, let's say this is my 1st table :-



Drug name           indications

 Ibuprofen                Pain

 Simvastatin            hyperlipidemia

losartan                   hypertension



my 2nd table is in different order for the 1st column :-


Drug name       indications


Simvastatin

losartan

Ibuprofen

Metformin


I wish to see the indication of each drug in my 2nd table subsisted from the information in my 1st table so the final table

would be like this


Drug name       indications


Simvastatin     hyperlipidemia

losartan           hypertension

Ibuprofen       pain

Metformin    N/A


I have been trying to use Sqldf package and right join function but not able to formulate the correct syntax.


I'm also trying to identify rows contain at least one shared value  in a dataset called 'Values":


>Values

A             B

1,2,5       3,8,7

2,4,6       7,6,3



Columns A & B in the first row do not share any value while in the 2nd row they have a single shared value which is 6.

The result I wish to see :-


A             B             shared values

1,2,5       3,8,7             N/A

2,4,6       7,6,3               6


I tried this syntax : SharedValues <- Values$A == Values$B but this returns logical results and what I wish to have

is a new data frame including the new vector "shared values" showing the information exactly as above.




Kind Regards






        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Joining tables with different order and matched values

Ulrik Stervbo-2
Hi Abo,

?merge

or the join functions from dplyr.

HTH
Ulrik

On Tue, 9 May 2017 at 06:44 abo dalash <[hidden email]> wrote:

> Hi All ..,
>
>
> I have 2 tables and I'm trying to have some information from the 1st table
> to appear in the second table with different order.
>
>
> For Example, let's say this is my 1st table :-
>
>
>
> Drug name           indications
>
>  Ibuprofen                Pain
>
>  Simvastatin            hyperlipidemia
>
> losartan                   hypertension
>
>
>
> my 2nd table is in different order for the 1st column :-
>
>
> Drug name       indications
>
>
> Simvastatin
>
> losartan
>
> Ibuprofen
>
> Metformin
>
>
> I wish to see the indication of each drug in my 2nd table subsisted from
> the information in my 1st table so the final table
>
> would be like this
>
>
> Drug name       indications
>
>
> Simvastatin     hyperlipidemia
>
> losartan           hypertension
>
> Ibuprofen       pain
>
> Metformin    N/A
>
>
> I have been trying to use Sqldf package and right join function but not
> able to formulate the correct syntax.
>
>
> I'm also trying to identify rows contain at least one shared value  in a
> dataset called 'Values":
>
>
> >Values
>
> A             B
>
> 1,2,5       3,8,7
>
> 2,4,6       7,6,3
>
>
>
> Columns A & B in the first row do not share any value while in the 2nd row
> they have a single shared value which is 6.
>
> The result I wish to see :-
>
>
> A             B             shared values
>
> 1,2,5       3,8,7             N/A
>
> 2,4,6       7,6,3               6
>
>
> I tried this syntax : SharedValues <- Values$A == Values$B but this
> returns logical results and what I wish to have
>
> is a new data frame including the new vector "shared values" showing the
> information exactly as above.
>
>
>
>
> Kind Regards
>
>
>
>
>
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Joining tables with different order and matched values

Ulrik Stervbo-2
Hi Abo,

Please keep the list in cc.

I think the function documentation is pretty straight forward - two
data.frames are required, and if you wish to keep elements that are not
present in both data.frames, you set the flag all = TRUE. You also have the
option to specify which columns to join by.

If you need more assistance with joining two data.frames, you should
provide a reproducible example, and if you have trouble with a function you
should provide an example of what you have tried so far.

Best wishes,
Ulrik



On Tue, 9 May 2017 at 10:00 abo dalash <[hidden email]> wrote:

> Could you please teach me about the correct formation of the syntax?. I
> have read the help page and other online resources about inner,left,&right
> join but wasn't able to formulate the correct syntax.
>
>
> Sent from my Samsung device
>
>
> -------- Original message --------
> From: Ulrik Stervbo <[hidden email]>
> Date: 09/05/2017 7:42 a.m. (GMT+00:00)
> To: abo dalash <[hidden email]>, "[hidden email]" <
> [hidden email]>
> Subject: Re: [R] Joining tables with different order and matched values
>
> Hi Abo,
>
> ?merge
>
> or the join functions from dplyr.
>
> HTH
> Ulrik
>
> On Tue, 9 May 2017 at 06:44 abo dalash <[hidden email]> wrote:
>
>> Hi All ..,
>>
>>
>> I have 2 tables and I'm trying to have some information from the 1st
>> table to appear in the second table with different order.
>>
>>
>> For Example, let's say this is my 1st table :-
>>
>>
>>
>> Drug name           indications
>>
>>  Ibuprofen                Pain
>>
>>  Simvastatin            hyperlipidemia
>>
>> losartan                   hypertension
>>
>>
>>
>> my 2nd table is in different order for the 1st column :-
>>
>>
>> Drug name       indications
>>
>>
>> Simvastatin
>>
>> losartan
>>
>> Ibuprofen
>>
>> Metformin
>>
>>
>> I wish to see the indication of each drug in my 2nd table subsisted from
>> the information in my 1st table so the final table
>>
>> would be like this
>>
>>
>> Drug name       indications
>>
>>
>> Simvastatin     hyperlipidemia
>>
>> losartan           hypertension
>>
>> Ibuprofen       pain
>>
>> Metformin    N/A
>>
>>
>> I have been trying to use Sqldf package and right join function but not
>> able to formulate the correct syntax.
>>
>>
>> I'm also trying to identify rows contain at least one shared value  in a
>> dataset called 'Values":
>>
>>
>> >Values
>>
>> A             B
>>
>> 1,2,5       3,8,7
>>
>> 2,4,6       7,6,3
>>
>>
>>
>> Columns A & B in the first row do not share any value while in the 2nd
>> row they have a single shared value which is 6.
>>
>> The result I wish to see :-
>>
>>
>> A             B             shared values
>>
>> 1,2,5       3,8,7             N/A
>>
>> 2,4,6       7,6,3               6
>>
>>
>> I tried this syntax : SharedValues <- Values$A == Values$B but this
>> returns logical results and what I wish to have
>>
>> is a new data frame including the new vector "shared values" showing the
>> information exactly as above.
>>
>>
>>
>>
>> Kind Regards
>>
>>
>>
>>
>>
>>
>>         [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Joining tables with different order and matched values

abo dalash
I'm repeating my question and hope to find someone to help.


I have been trying for hours but without results, I have done previous suggestions but still struggling.


I believe that join functions in dplyr will do the work but I'm confusing with the correct syntax.


I have 2 tables and I'm trying to have some information from the 1st table to appear in the 2nd table.


let's say this is my 1st table :-


>df1
Drug name           indications

 Ibuprofen                Pain

 Simvastatin            hyperlipidemia

losartan                   hypertension



my 2nd table contains the same list of drugs under the first column BUT with different order :-

>df2
Drug name       indications


Simvastatin

losartan

Ibuprofen

Metformin

Simply, I want to produce a table like df1 but in the order of the 1st column of my df2.

This would be like this

>joined tables
Drug name       indications


Simvastatin     hyperlipidemia

losartan           hypertension

Ibuprofen       pain

Metformin    N/A


Please note that it is important to keep the order of drugs in df2 as it and to see the appropriate indication of each drug(which is withdrawn from df1) next to it under "indications" column.



________________________________
From: Ulrik Stervbo <[hidden email]>
Sent: 09 May 2017 06:31 PM
To: abo dalash
Subject: Re: [R] Joining tables with different order and matched values

Hi Abo,

Please keep the list in cc - 1) the comments are accessible to everyone, 2) there is a chance that someone else might reply.

If the merge does what you intend, but you are unhappy with the order, you can arrange the resulting data.frame:

df <- data.frame(x = c(5, 4,2,3,6, 1), y = letters[1:6])

df
df[order(df$x), ]

HTH
Ulrik



On Tue, 9 May 2017 at 16:17 abo dalash <[hidden email]<mailto:[hidden email]>> wrote:


I still cannot produce the table I wish. I tried the following with the same results.


A <-merge(dt1, dt2, by = "Drug name", all.x = TRUE)


A <-join_query(dt1, dt2, by = "Drug name")

This returns a table showing results with changing the order of drugs in the 2nd data frame. I want to see the results under
"indications" column without changing the order of drugs in my 2nd data frame. I have been trying for many hours, so please
help me to know what is the mistake I have done and what is the correct syntax.


Regards
________________________________
From: Ulrik Stervbo <[hidden email]<mailto:[hidden email]>>
Sent: 09 May 2017 12:22 PM
To: abo dalash; R-help

Subject: Re: [R] Joining tables with different order and matched values
Hi Abo,

Please keep the list in cc.

I think the function documentation is pretty straight forward - two data.frames are required, and if you wish to keep elements that are not present in both data.frames, you set the flag all = TRUE. You also have the option to specify which columns to join by.

If you need more assistance with joining two data.frames, you should provide a reproducible example, and if you have trouble with a function you should provide an example of what you have tried so far.

Best wishes,
Ulrik



On Tue, 9 May 2017 at 10:00 abo dalash <[hidden email]<mailto:[hidden email]>> wrote:
Could you please teach me about the correct formation of the syntax?. I hav
n but wasn't able to formulate the correct syntax.


Sent from my Samsung device


-------- Original message --------
From: Ulrik Stervbo <[hidden email]<mailto:[hidden email]>>
Date: 09/05/2017 7:42 a.m. (GMT+00:00)
To: abo dalash <[hidden email]<mailto:[hidden email]>>, "[hidden email]" <[hidden email]<mailto:[hidden email]>>
Subject: Re: [R] Joining tables with different order and matched values

Hi Abo,

?merge

or the join functions from dplyr.

HTH
Ulrik

On Tue, 9 May 2017 at 06:44 abo dalash <[hidden email]<mailto:[hidden email]>> wrote:
Hi All ..,


I have 2 tables and I'm trying to have some information from the 1st table to appear in the second table with different order.


For Example, let's say this is my 1st table :-



Drug name           indications

 Ibuprofen                Pain

 Simvastatin            hyperlipidemia

losartan                   hypertension



my 2nd table is in different order for the 1st column :-


Drug name       indications


Simvastatin

losartan

Ibuprofen

Metformin


I wish to see the indication of each drug in my 2nd table subsisted from the information in my 1st table so the final table

would be like this


Drug name       indications


Simvastatin     hyperlipidemia

losartan           hypertension

Ibuprofen       pain

Metformin    N/A


I have been trying to use Sqldf package and right join function but not able to formulate the correct syntax.


I'm also trying to identify rows contain at least one shared value  in a dataset called 'Values":


>Values

A             B

1,2,5       3,8,7

2,4,6       7,6,3



Columns A & B in the first row do not share any value while in the 2nd row they have a single shared value which is 6.

The result I wish to see :-


A             B             shared values

1,2,5       3,8,7             N/A

2,4,6       7,6,3               6


I tried this syntax : SharedValues <- Values$A == Values$B but this returns logical results and what I wish to have

is a new data frame including the new vector "shared values" showing the information exactly as above.




Kind Regards






        [[alternative HTML version deleted]]

______________________________________________
[hidden email]<mailto:[hidden email]> mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Joining tables with different order and matched values

Boris Steipe
myDf1 <- data.frame(drugs = c("Ibuprofen", "Simvastatin", "Losartan"),
                    indications = c("pain", "hyperlipidemia", "hypertension"),
                    stringsAsFactors = FALSE)

myDf2 <- data.frame(drugs = c("Simvastatin", "Losartan", "Ibuprofen", "Metformin"),
                    stringsAsFactors = FALSE)

myDf3 <- merge(myDf2, myDf1, all = TRUE, sort = FALSE)


R > myDf3
        drugs    indications
1 Simvastatin hyperlipidemia
2    Losartan   hypertension
3   Ibuprofen           pain
4   Metformin           <NA>


R > str(myDf3)
'data.frame': 4 obs. of  2 variables:
 $ drugs      : chr  "Simvastatin" "Losartan" "Ibuprofen" "Metformin"
 $ indications: chr  "hyperlipidemia" "hypertension" "pain" NA



-----

Minimum working example!
Don't post in HTML!
... you should know better by now.



> On May 9, 2017, at 1:21 PM, abo dalash <[hidden email]> wrote:
>
> I'm repeating my question and hope to find someone to help.
>
>
> I have been trying for hours but without results, I have done previous suggestions but still struggling.
>
>
> I believe that join functions in dplyr will do the work but I'm confusing with the correct syntax.
>
>
> I have 2 tables and I'm trying to have some information from the 1st table to appear in the 2nd table.
>
>
> let's say this is my 1st table :-
>
>
>> df1
> Drug name           indications
>
> Ibuprofen                Pain
>
> Simvastatin            hyperlipidemia
>
> losartan                   hypertension
>
>
>
> my 2nd table contains the same list of drugs under the first column BUT with different order :-
>
>> df2
> Drug name       indications
>
>
> Simvastatin
>
> losartan
>
> Ibuprofen
>
> Metformin
>
> Simply, I want to produce a table like df1 but in the order of the 1st column of my df2.
>
> This would be like this
>
>> joined tables
> Drug name       indications
>
>
> Simvastatin     hyperlipidemia
>
> losartan           hypertension
>
> Ibuprofen       pain
>
> Metformin    N/A
>
>
> Please note that it is important to keep the order of drugs in df2 as it and to see the appropriate indication of each drug(which is withdrawn from df1) next to it under "indications" column.
>
>
>
> ________________________________
> From: Ulrik Stervbo <[hidden email]>
> Sent: 09 May 2017 06:31 PM
> To: abo dalash
> Subject: Re: [R] Joining tables with different order and matched values
>
> Hi Abo,
>
> Please keep the list in cc - 1) the comments are accessible to everyone, 2) there is a chance that someone else might reply.
>
> If the merge does what you intend, but you are unhappy with the order, you can arrange the resulting data.frame:
>
> df <- data.frame(x = c(5, 4,2,3,6, 1), y = letters[1:6])
>
> df
> df[order(df$x), ]
>
> HTH
> Ulrik
>
>
>
> On Tue, 9 May 2017 at 16:17 abo dalash <[hidden email]<mailto:[hidden email]>> wrote:
>
>
> I still cannot produce the table I wish. I tried the following with the same results.
>
>
> A <-merge(dt1, dt2, by = "Drug name", all.x = TRUE)
>
>
> A <-join_query(dt1, dt2, by = "Drug name")
>
> This returns a table showing results with changing the order of drugs in the 2nd data frame. I want to see the results under
> "indications" column without changing the order of drugs in my 2nd data frame. I have been trying for many hours, so please
> help me to know what is the mistake I have done and what is the correct syntax.
>
>
> Regards
> ________________________________
> From: Ulrik Stervbo <[hidden email]<mailto:[hidden email]>>
> Sent: 09 May 2017 12:22 PM
> To: abo dalash; R-help
>
> Subject: Re: [R] Joining tables with different order and matched values
> Hi Abo,
>
> Please keep the list in cc.
>
> I think the function documentation is pretty straight forward - two data.frames are required, and if you wish to keep elements that are not present in both data.frames, you set the flag all = TRUE. You also have the option to specify which columns to join by.
>
> If you need more assistance with joining two data.frames, you should provide a reproducible example, and if you have trouble with a function you should provide an example of what you have tried so far.
>
> Best wishes,
> Ulrik
>
>
>
> On Tue, 9 May 2017 at 10:00 abo dalash <[hidden email]<mailto:[hidden email]>> wrote:
> Could you please teach me about the correct formation of the syntax?. I hav
> n but wasn't able to formulate the correct syntax.
>
>
> Sent from my Samsung device
>
>
> -------- Original message --------
> From: Ulrik Stervbo <[hidden email]<mailto:[hidden email]>>
> Date: 09/05/2017 7:42 a.m. (GMT+00:00)
> To: abo dalash <[hidden email]<mailto:[hidden email]>>, "[hidden email]" <[hidden email]<mailto:[hidden email]>>
> Subject: Re: [R] Joining tables with different order and matched values
>
> Hi Abo,
>
> ?merge
>
> or the join functions from dplyr.
>
> HTH
> Ulrik
>
> On Tue, 9 May 2017 at 06:44 abo dalash <[hidden email]<mailto:[hidden email]>> wrote:
> Hi All ..,
>
>
> I have 2 tables and I'm trying to have some information from the 1st table to appear in the second table with different order.
>
>
> For Example, let's say this is my 1st table :-
>
>
>
> Drug name           indications
>
> Ibuprofen                Pain
>
> Simvastatin            hyperlipidemia
>
> losartan                   hypertension
>
>
>
> my 2nd table is in different order for the 1st column :-
>
>
> Drug name       indications
>
>
> Simvastatin
>
> losartan
>
> Ibuprofen
>
> Metformin
>
>
> I wish to see the indication of each drug in my 2nd table subsisted from the information in my 1st table so the final table
>
> would be like this
>
>
> Drug name       indications
>
>
> Simvastatin     hyperlipidemia
>
> losartan           hypertension
>
> Ibuprofen       pain
>
> Metformin    N/A
>
>
> I have been trying to use Sqldf package and right join function but not able to formulate the correct syntax.
>
>
> I'm also trying to identify rows contain at least one shared value  in a dataset called 'Values":
>
>
>> Values
>
> A             B
>
> 1,2,5       3,8,7
>
> 2,4,6       7,6,3
>
>
>
> Columns A & B in the first row do not share any value while in the 2nd row they have a single shared value which is 6.
>
> The result I wish to see :-
>
>
> A             B             shared values
>
> 1,2,5       3,8,7             N/A
>
> 2,4,6       7,6,3               6
>
>
> I tried this syntax : SharedValues <- Values$A == Values$B but this returns logical results and what I wish to have
>
> is a new data frame including the new vector "shared values" showing the information exactly as above.
>
>
>
>
> Kind Regards
>
>
>
>
>
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email]<mailto:[hidden email]> mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Joining tables with different order and matched values

jholtman
In reply to this post by abo dalash
Here is a solution to the "shared values" question

> library(stringr)
> input <- read.table(text = "A             B
+
+ 1,2,5       3,8,7
+
+ 2,4,6       7,6,3  ",
+     header = TRUE,
+     as.is = TRUE
+     )
>
> input$'shared values' <- apply(input, 1, function(x){
+     toString(intersect(str_extract_all(x[1], "[^,]")[[1]],
+               str_extract_all(x[2], "[^,]")[[1]]
+               ))
+ })
>
> input
      A     B shared values
1 1,2,5 3,8,7
2 2,4,6 7,6,3             6



Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Mon, May 8, 2017 at 10:56 AM, abo dalash <[hidden email]> wrote:

> Hi All ..,
>
>
> I have 2 tables and I'm trying to have some information from the 1st table
> to appear in the second table with different order.
>
>
> For Example, let's say this is my 1st table :-
>
>
>
> Drug name           indications
>
>  Ibuprofen                Pain
>
>  Simvastatin            hyperlipidemia
>
> losartan                   hypertension
>
>
>
> my 2nd table is in different order for the 1st column :-
>
>
> Drug name       indications
>
>
> Simvastatin
>
> losartan
>
> Ibuprofen
>
> Metformin
>
>
> I wish to see the indication of each drug in my 2nd table subsisted from
> the information in my 1st table so the final table
>
> would be like this
>
>
> Drug name       indications
>
>
> Simvastatin     hyperlipidemia
>
> losartan           hypertension
>
> Ibuprofen       pain
>
> Metformin    N/A
>
>
> I have been trying to use Sqldf package and right join function but not
> able to formulate the correct syntax.
>
>
> I'm also trying to identify rows contain at least one shared value  in a
> dataset called 'Values":
>
>
> >Values
>
> A             B
>
> 1,2,5       3,8,7
>
> 2,4,6       7,6,3
>
>
>
> Columns A & B in the first row do not share any value while in the 2nd row
> they have a single shared value which is 6.
>
> The result I wish to see :-
>
>
> A             B             shared values
>
> 1,2,5       3,8,7             N/A
>
> 2,4,6       7,6,3               6
>
>
> I tried this syntax : SharedValues <- Values$A == Values$B but this
> returns logical results and what I wish to have
>
> is a new data frame including the new vector "shared values" showing the
> information exactly as above.
>
>
>
>
> Kind Regards
>
>
>
>
>
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.