Filter according to the latest data

Mat
Hello everyone,

I have a data.frame like this one:
     No.    Change       Date
A    123    final        2013-01-15
B    123    error        2013-01-16
C    123    bug fixed    2013-01-17
D    111    final        2013-01-12

Now I want a new data.frame that contains only the newest entry for each number.
The result should look like this:
     No.    Change       Date
C    123    bug fixed    2013-01-17
D    111    final        2013-01-12

Is there any way to filter my data.frame down to the latest data, perhaps with "max"?

Thanks.

Mat

Re: Filter according to the latest data

nalluri pratap
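library(sqldf)

# Rebuild the example data; ID holds the row labels A-D
k1 <- data.frame(ID = LETTERS[1:4],
                 No = c(rep(123, 3), 111),
                 Change = c("final", "error", "bug fixed", "final"),
                 Date = c("2013-01-15", "2013-01-16", "2013-01-17", "2013-01-12"))

# Convert the text dates to Date class (no tz argument is needed for plain dates)
k1$Date <- as.Date(as.character(k1$Date))

# Note: "having max(Date)" is not a reliable row filter; see the reply further down the thread
sqldf("select *
from k1
group by No
having max(Date)")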


--
View this message in context: http://r.789695.n4.nabble.com/Filter-according-to-the-latest-data-tp4657248.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: Filter according to the latest data

jholtman
In reply to this post by Mat
Try this:

> x <- read.table(text = "                 No.          Change           Date
+ A              123           final                2013-01-15
+ B              123           error               2013-01-16
+ C              123           'bug fixed'       2013-01-17
+ D              111           final                2013-01-12"
+     , header = TRUE
+     , as.is = TRUE
+     )
> do.call(rbind, lapply(split(x, x$No.), function(.sec){
+     .sec[which(.sec$Date == max(.sec$Date))[1L], ]
+ }))
    No.    Change       Date
111 111     final 2013-01-12
123 123 bug fixed 2013-01-17
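
A variation on the same idea, as a minimal sketch: convert Date to Date class so the comparison does not rely on text ordering, sort newest first, and keep the first row per No. with duplicated(). The names x_sorted and latest are only illustrative.

```r
# Example data from the original post (row names A-D)
x <- data.frame(
  No.    = c(123, 123, 123, 111),
  Change = c("final", "error", "bug fixed", "final"),
  Date   = as.Date(c("2013-01-15", "2013-01-16", "2013-01-17", "2013-01-12")),
  row.names = c("A", "B", "C", "D")
)

# Sort by number, then newest date first within each number
x_sorted <- x[order(x$No., -as.numeric(x$Date)), ]

# duplicated() flags every row after the first within each No.,
# so negating it keeps exactly one row per number: the newest
latest <- x_sorted[!duplicated(x_sorted$No.), ]
latest
```

If two rows of the same number share the newest date, this keeps only one of them, as the split/lapply version above does.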





--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


Re: Filter according to the latest data

arun kirshna
In reply to this post by Mat
Hi,
Perhaps (untested; this assumes the rows of dat1 are already sorted by Date within each No):
do.call(rbind, lapply(split(dat1, dat1$No), function(x) tail(x, 1)))

# or
library(plyr)
ddply(dat1, .(No), function(x) x[nrow(x), ])
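
Both calls take the last row of each group, so they return the newest entry only when the rows are ordered by date within each number. A small sketch of sorting first, using a deliberately shuffled copy of the example data; dat1 and its column names follow the code above, and res is just an illustrative name.

```r
# Example data, deliberately out of date order
dat1 <- data.frame(
  ID     = c("C", "A", "D", "B"),
  No     = c(123, 123, 111, 123),
  Change = c("bug fixed", "final", "final", "error"),
  Date   = as.Date(c("2013-01-17", "2013-01-15", "2013-01-12", "2013-01-16"))
)

# Order by Date so the last row of each group is the newest one
dat1 <- dat1[order(dat1$Date), ]

# Split by number and keep the last (newest) row of each piece
res <- do.call(rbind, lapply(split(dat1, dat1$No), function(x) tail(x, 1)))
res
```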

A.K.



Re: Filter according to the latest data

Gabor Grothendieck
In reply to this post by nalluri pratap
On Fri, Feb 1, 2013 at 8:05 AM, nalluri pratap <[hidden email]> wrote:
> library(sqldf)
>

> sqldf("select *
> from k1
> group by No
> having max(Date)")
>

HAVING only selects which groups to keep; it does not choose a row within
each group. The query works only by chance in this example and would likely
fail if the data changed.

Try this instead. It relies on an SQLite-specific feature: when max() is
used in a GROUP BY query, the other columns are guaranteed to come from the
same row as the maximum:

> sqldf("select ID, No, Change, max(Date) Date from k1 group by No")
  ID  No    Change       Date
1  D 111     final 2013-01-12
2  C 123 bug fixed 2013-01-17
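
If depending on the SQLite-specific behaviour is a concern, the same "rows whose Date equals the maximum for their No" logic can be sketched in base R with ave(). This is an illustrative alternative, not part of the sqldf answer; k1 is rebuilt as in the earlier post, and day, newest_day and res are made-up names.

```r
k1 <- data.frame(
  ID     = LETTERS[1:4],
  No     = c(rep(123, 3), 111),
  Change = c("final", "error", "bug fixed", "final"),
  Date   = as.Date(c("2013-01-15", "2013-01-16", "2013-01-17", "2013-01-12"))
)

# Newest date within each No, repeated on every row of that group
# (dates are compared as day counts so ave() works on plain numbers)
day <- as.numeric(k1$Date)
newest_day <- ave(day, k1$No, FUN = max)

# Keep the rows whose date equals their group's maximum
res <- k1[day == newest_day, ]
res
```

Unlike the GROUP BY query, this keeps every tied row when several entries of one number share the newest date.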


--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
