Dataframe by Serial ID

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Dataframe by Serial ID

Thomas Subia-2
Colleagues,

I have two data frames which look like this.

Data frame 1

  Serial Pre.Hole Pre.flow   Pre.Date
1   3036        1     0.24 19-Nov-19
2   3036        2     0.212 19-Nov-19
3   3036        3     1.292 19-Nov-19
4   3036        4     0.262 19-Nov-19
5   3036        5     1.291 19-Nov-19
6   3036        6     0.26 19-Nov-19

Data frame 2

      Serial Post.Hole Post.flow Post.Date
62323  11024        44     -0.678 11-Dec-19
62324  11024        45     -0.659 11-Dec-19
62325  11024        46   -0.654 11-Dec-19
62326  11024        47     -0.699 11-Dec-19
62327  11024        48   -0.671 11-Dec-19
62328  11024        49     -0.687 11-Dec-19

What I want is to create a data frame whose serials numbers are common to data frames 1 and 2.
The resulting data frame 1st row would look like this.

Serial Post.Hole Post.flow Post.Date Pre.Hole Pre.flow Pre.Date

Any ideas on how to do this would be appreciated.

Thomas Subia
Statistician / Senior Quality Engineer

IMG Companies 
225 Mountain Vista Parkway
Livermore, CA 94551
T. (925) 273-1106
F. (925) 273-1111
E. [hidden email]


Precision Manufacturing for Emerging Technologies
imgprecision.com 

The contents of this message, together with any attachments, are intended only for the use of the individual or entity to which they are addressed and may contain information that is legally privileged, confidential and exempt from disclosure. If you are not the intended recipient, you are hereby notified that any dissemination, distribution, or copying of this message, or any attachment, is strictly prohibited. If you have received this message in error, please notify the original sender or IMG Companies, LLC at Tel: 925-273-1100 immediately by telephone or by return E-mail and delete this message, along with any attachments, from your computer. Thank you.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Dataframe by Serial ID

Jeff Newmiller
"merge" is generally the base R answer to this question, and there are equivalent functions in various contributed packages.

However, it is necessary to identify which columns in each table uniquely identify each row ("primary key"). If your Serial 3036 shows up 10 times in the first table and 10 times in the second table then you will end up with all combinations of those rows (100 rows) in a merge that uses only Serial to match rows. Is that what you want?


On January 8, 2020 6:52:18 AM PST, Thomas Subia <[hidden email]> wrote:

>Colleagues,
>
>I have two data frames which look like this.
>
>Data frame 1
>
>   Serial Pre.Hole Pre.flow   Pre.Date
>1   3036        1     0.24 19-Nov-19
>2   3036        2     0.212 19-Nov-19
>3   3036        3     1.292 19-Nov-19
>4   3036        4     0.262 19-Nov-19
>5   3036        5     1.291 19-Nov-19
>6   3036        6     0.26 19-Nov-19
>
>Data frame 2
>
>       Serial Post.Hole Post.flow Post.Date
>62323  11024        44     -0.678 11-Dec-19
>62324  11024        45     -0.659 11-Dec-19
>62325  11024        46   -0.654 11-Dec-19
>62326  11024        47     -0.699 11-Dec-19
>62327  11024        48   -0.671 11-Dec-19
>62328  11024        49     -0.687 11-Dec-19
>
>What I want is to create a data frame whose serials numbers are common
>to data frames 1 and 2.
>The resulting data frame 1st row would look like this.
>
>Serial Post.Hole Post.flow Post.Date Pre.Hole Pre.flow Pre.Date
>
>Any ideas on how to do this would be appreciated.
>
>Thomas Subia
>Statistician / Senior Quality Engineer
>
>IMG Companies 
>225 Mountain Vista Parkway
>Livermore, CA 94551
>T. (925) 273-1106
>F. (925) 273-1111
>E. [hidden email]
>
>
>Precision Manufacturing for Emerging Technologies
>imgprecision.com 
>
>The contents of this message, together with any attachments, are
>intended only for the use of the individual or entity to which they are
>addressed and may contain information that is legally privileged,
>confidential and exempt from disclosure. If you are not the intended
>recipient, you are hereby notified that any dissemination,
>distribution, or copying of this message, or any attachment, is
>strictly prohibited. If you have received this message in error, please
>notify the original sender or IMG Companies, LLC at Tel: 925-273-1100
>immediately by telephone or by return E-mail and delete this message,
>along with any attachments, from your computer. Thank you.
>
>______________________________________________
>[hidden email] mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

--
Sent from my phone. Please excuse my brevity.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Dataframe by Serial ID

David Winsemius
In reply to this post by Thomas Subia-2


> On Jan 8, 2020, at 6:52 AM, Thomas Subia <[hidden email]> wrote:
>
> Colleagues,
>
> I have two data frames which look like this.
>
> Data frame 1
>
>   Serial Pre.Hole Pre.flow   Pre.Date
> 1   3036        1     0.24 19-Nov-19
> 2   3036        2     0.212 19-Nov-19
> 3   3036        3     1.292 19-Nov-19
> 4   3036        4     0.262 19-Nov-19
> 5   3036        5     1.291 19-Nov-19
> 6   3036        6     0.26 19-Nov-19
>
> Data frame 2
>
>       Serial Post.Hole Post.flow Post.Date
> 62323  11024        44     -0.678 11-Dec-19
> 62324  11024        45     -0.659 11-Dec-19
> 62325  11024        46   -0.654 11-Dec-19
> 62326  11024        47     -0.699 11-Dec-19
> 62327  11024        48   -0.671 11-Dec-19
> 62328  11024        49     -0.687 11-Dec-19
>
> What I want is to create a data frame whose serials numbers are common to data frames 1 and 2.
> The resulting data frame 1st row would look like this.
>
> Serial Post.Hole Post.flow Post.Date Pre.Hole Pre.flow Pre.Date

Simply try:

new.df <- merge(df1,df2)

You should be aware that those columns with dates are actually factor or character values. R does have a Date class and you will likely want to look at using as.Date to make them more useful.


--
David.

>
> Any ideas on how to do this would be appreciated.
>
> Thomas Subia
> Statistician / Senior Quality Engineer
>
> IMG Companies
> 225 Mountain Vista Parkway
> Livermore, CA 94551
> T. (925) 273-1106
> F. (925) 273-1111
> E. [hidden email]
>
>
> Precision Manufacturing for Emerging Technologies
> imgprecision.com
>
> The contents of this message, together with any attachments, are intended only for the use of the individual or entity to which they are addressed and may contain information that is legally privileged, confidential and exempt from disclosure. If you are not the intended recipient, you are hereby notified that any dissemination, distribution, or copying of this message, or any attachment, is strictly prohibited. If you have received this message in error, please notify the original sender or IMG Companies, LLC at Tel: 925-273-1100 immediately by telephone or by return E-mail and delete this message, along with any attachments, from your computer. Thank you.
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Dataframe by Serial ID

Eric Berger
In reply to this post by Jeff Newmiller
Hi Thomas,
Jeff is correct that this can be handled via merge, e.g.
df3 <- merge( df2, df1, by="Serial", all=FALSE )

This operation is called an "inner join", and you could use other tools,
such as the dplyr package to accomplish the same thing

df3 <- dplyr::inner_join( df2, df1, by="Serial" )

HTH,
Eric


On Wed, Jan 8, 2020 at 5:05 PM Jeff Newmiller <[hidden email]>
wrote:

> "merge" is generally the base R answer to this question, and there are
> equivalent functions in various contributed packages.
>
> However, it is necessary to identify which columns in each table uniquely
> identify each row ("primary key"). If your Serial 3036 shows up 10 times in
> the first table and 10 times in the second table then you will end up with
> all combinations of those rows (100 rows) in a merge that uses only Serial
> to match rows. Is that what you want?
>
>
> On January 8, 2020 6:52:18 AM PST, Thomas Subia <[hidden email]>
> wrote:
> >Colleagues,
> >
> >I have two data frames which look like this.
> >
> >Data frame 1
> >
> >       Serial  Pre.Hole        Pre.flow        Pre.Date
> >1      3036        1                   0.24            19-Nov-19
> >2      3036        2                   0.212           19-Nov-19
> >3      3036        3                   1.292           19-Nov-19
> >4      3036        4                   0.262           19-Nov-19
> >5      3036        5                   1.291           19-Nov-19
> >6      3036        6                   0.26            19-Nov-19
> >
> >Data frame 2
> >
> >       Serial   Post.Hole      Post.flow       Post.Date
> >62323  11024        44                 -0.678          11-Dec-19
> >62324  11024        45                 -0.659          11-Dec-19
> >62325  11024        46                  -0.654                 11-Dec-19
> >62326  11024        47                 -0.699          11-Dec-19
> >62327  11024        48                  -0.671                 11-Dec-19
> >62328  11024        49                 -0.687          11-Dec-19
> >
> >What I want is to create a data frame whose serials numbers are common
> >to data frames 1 and 2.
> >The resulting data frame 1st row would look like this.
> >
> >Serial         Post.Hole       Post.flow       Post.Date       Pre.Hole
>       Pre.flow        Pre.Date
> >
> >Any ideas on how to do this would be appreciated.
> >
> >Thomas Subia
> >Statistician / Senior Quality Engineer
> >
> >IMG Companies
> >225 Mountain Vista Parkway
> >Livermore, CA 94551
> >T. (925) 273-1106
> >F. (925) 273-1111
> >E. [hidden email]
> >
> >
> >Precision Manufacturing for Emerging Technologies
> >imgprecision.com
> >
> >The contents of this message, together with any attachments, are
> >intended only for the use of the individual or entity to which they are
> >addressed and may contain information that is legally privileged,
> >confidential and exempt from disclosure. If you are not the intended
> >recipient, you are hereby notified that any dissemination,
> >distribution, or copying of this message, or any attachment, is
> >strictly prohibited. If you have received this message in error, please
> >notify the original sender or IMG Companies, LLC at Tel: 925-273-1100
> >immediately by telephone or by return E-mail and delete this message,
> >along with any attachments, from your computer. Thank you.
> >
> >______________________________________________
> >[hidden email] mailing list -- To UNSUBSCRIBE and more, see
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
>
> --
> Sent from my phone. Please excuse my brevity.
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.