sort

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

sort

Val-17
HI All,
I have a sample of data frame
DF1<-read.table(text="name ddate
  A  2019-10-28
  A  2018-01-25
  A  2020-01-12
  A  2017-10-20
  B  2020-11-20
  B  2019-10-20
  B  2017-05-20
  B  2020-01-20
  c  2009-10-01  ",header=TRUE)

1. I want sort by name and ddate on decreasing order and the output
should like as follow
   A  2020-01-12
   A  2019-01-12
   A  2018-01-25
   A  2017-10-20
   B  2020-11-21
  B  2020-11-01
  B  2019-10-20
  B  2017-05-20
  c  2009-10-01

2.  Take the top two rows by group( names) and the out put should like
   A  2020-01-12
   A  2019-01-12
   B  2020-11-21
   B  2020-11-01
    c  2009-10-01

3.  Within each group (name) get the date difference  between the
first and second rows dates. If a group has only one row then the
difference should be 0

The final out put is
Name diff
   A  365
    B  20
    C  0

Here is my attempt and have an issue at the sorting
DF1$DTime <- as.POSIXct(DF1$ddate , format = "%Y-%m-%d")
DF2 <- DF1[order(DF1$name, ((as.Date(DF1$DTime, decreasing = TRUE)))), ]

not working
Any help?

Thank you

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: [External] sort

Richard M. Heiberger
## the data you gave us

DF1 <- read.table(text="name ddate
  A  2019-10-28
  A  2018-01-25
  A  2020-01-12
  A  2017-10-20
  B  2020-11-20
  B  2019-10-20
  B  2017-05-20
  B  2020-01-20
  c  2009-10-01  ",
  header=TRUE, colClasses=c("character", "POSIXct"))

DF1

D2 <- split(DF1, DF1$name)
D2

sapply(D2, function(x) {
  DD <- c(sort(x$ddate, decreasing=TRUE), min(x$ddate))
  DD[1]-DD[2]
})


## the data that your intended answer is based on

DF1 <- read.table(text="name ddate
  A  2019-01-12  ## intended value
  A  2018-01-25
  A  2020-01-12
  A  2017-10-20
  B  2020-11-20
  B  2019-10-20
  B  2017-05-20
  B  2020-01-20
  c  2009-10-01  ",
  header=TRUE, colClasses=c("character", "POSIXct"))

DF1

D2 <- split(DF1, DF1$name)
D2

sapply(D2, function(x) {
  DD <- c(sort(x$ddate, decreasing=TRUE), min(x$ddate))
  DD[1]-DD[2]
})

On Thu, May 14, 2020 at 11:00 PM Val <[hidden email]> wrote:

> HI All,
> I have a sample of data frame
> DF1<-read.table(text="name ddate
>   A  2019-10-28
>   A  2018-01-25
>   A  2020-01-12
>   A  2017-10-20
>   B  2020-11-20
>   B  2019-10-20
>   B  2017-05-20
>   B  2020-01-20
>   c  2009-10-01  ",header=TRUE)
>
> 1. I want sort by name and ddate on decreasing order and the output
> should like as follow
>    A  2020-01-12
>    A  2019-01-12
>    A  2018-01-25
>    A  2017-10-20
>    B  2020-11-21
>   B  2020-11-01
>   B  2019-10-20
>   B  2017-05-20
>   c  2009-10-01
>
> 2.  Take the top two rows by group( names) and the out put should like
>    A  2020-01-12
>    A  2019-01-12
>    B  2020-11-21
>    B  2020-11-01
>     c  2009-10-01
>
> 3.  Within each group (name) get the date difference  between the
> first and second rows dates. If a group has only one row then the
> difference should be 0
>
> The final out put is
> Name diff
>    A  365
>     B  20
>     C  0
>
> Here is my attempt and have an issue at the sorting
> DF1$DTime <- as.POSIXct(DF1$ddate , format = "%Y-%m-%d")
> DF2 <- DF1[order(DF1$name, ((as.Date(DF1$DTime, decreasing = TRUE)))), ]
>
> not working
> Any help?
>
> Thank you
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: sort

Jim Lemon-4
In reply to this post by Val-17
Hi Val,
Your problem is keeping the orders straight. You want dates decreasing
and names increasing:

DF1<-read.table(text="name ddate
  A  2019-10-28
  A  2018-01-25
  A  2020-01-12
  A  2017-10-20
  B  2020-11-20
  B  2019-10-20
  B  2017-05-20
  B  2020-01-20
  C  2009-10-01  ",header=TRUE)
DF1$Time<-as.POSIXct(DF1$ddate , format = "%Y-%m-%d")
# get dates in decreasing order
DF2<-DF1[order(DF1$Time,decreasing=TRUE),]
# create the final output with the names in increasing order
DF3<-data.frame(name=sort(unique(DF2$name)))
# create a function that returns the first diff of a vector
getdate1diff<-function(x)
 return(ifelse(length(x)>1,abs(diff(x))[1],0))
# apply the function by the unique names in DF2
DF3$date1diff<-by(DF2$Time,DF2$name,getdate1diff)
DF3

Jim

On Fri, May 15, 2020 at 1:00 PM Val <[hidden email]> wrote:

>
> HI All,
> I have a sample of data frame
> DF1<-read.table(text="name ddate
>   A  2019-10-28
>   A  2018-01-25
>   A  2020-01-12
>   A  2017-10-20
>   B  2020-11-20
>   B  2019-10-20
>   B  2017-05-20
>   B  2020-01-20
>   c  2009-10-01  ",header=TRUE)
>
> 1. I want sort by name and ddate on decreasing order and the output
> should like as follow
>    A  2020-01-12
>    A  2019-01-12
>    A  2018-01-25
>    A  2017-10-20
>    B  2020-11-21
>   B  2020-11-01
>   B  2019-10-20
>   B  2017-05-20
>   c  2009-10-01
>
> 2.  Take the top two rows by group( names) and the out put should like
>    A  2020-01-12
>    A  2019-01-12
>    B  2020-11-21
>    B  2020-11-01
>     c  2009-10-01
>
> 3.  Within each group (name) get the date difference  between the
> first and second rows dates. If a group has only one row then the
> difference should be 0
>
> The final out put is
> Name diff
>    A  365
>     B  20
>     C  0
>
> Here is my attempt and have an issue at the sorting
> DF1$DTime <- as.POSIXct(DF1$ddate , format = "%Y-%m-%d")
> DF2 <- DF1[order(DF1$name, ((as.Date(DF1$DTime, decreasing = TRUE)))), ]
>
> not working
> Any help?
>
> Thank you
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: sort

Rui Barradas
In reply to this post by Val-17
Hello,

Here is a dplyr solution. arrange() sorts by name and desc(ddate) and
top_n keeps the first 2 after grouping by ddate. Then it's a matter of
being careful with diff() in the summarise instruction.


library(dplyr)

DF1 %>%
   mutate(ddate = as.Date(ddate)) %>%
   arrange(name, desc(ddate)) %>%
   group_by(name) %>%
   top_n(2) %>%
   summarise(diff = ifelse(n() > 1, diff(rev(ddate)), 0))
#Selecting by ddate
## A tibble: 3 x 2
#  name   diff
#  <chr> <dbl>
#1 A        76
#2 B       305
#3 c         0


Hope this helps,

Rui Barradas

Às 03:58 de 15/05/20, Val escreveu:

> HI All,
> I have a sample of data frame
> DF1<-read.table(text="name ddate
>    A  2019-10-28
>    A  2018-01-25
>    A  2020-01-12
>    A  2017-10-20
>    B  2020-11-20
>    B  2019-10-20
>    B  2017-05-20
>    B  2020-01-20
>    c  2009-10-01  ",header=TRUE)
>
> 1. I want sort by name and ddate on decreasing order and the output
> should like as follow
>     A  2020-01-12
>     A  2019-01-12
>     A  2018-01-25
>     A  2017-10-20
>     B  2020-11-21
>    B  2020-11-01
>    B  2019-10-20
>    B  2017-05-20
>    c  2009-10-01
>
> 2.  Take the top two rows by group( names) and the out put should like
>     A  2020-01-12
>     A  2019-01-12
>     B  2020-11-21
>     B  2020-11-01
>      c  2009-10-01
>
> 3.  Within each group (name) get the date difference  between the
> first and second rows dates. If a group has only one row then the
> difference should be 0
>
> The final out put is
> Name diff
>     A  365
>      B  20
>      C  0
>
> Here is my attempt and have an issue at the sorting
> DF1$DTime <- as.POSIXct(DF1$ddate , format = "%Y-%m-%d")
> DF2 <- DF1[order(DF1$name, ((as.Date(DF1$DTime, decreasing = TRUE)))), ]
>
> not working
> Any help?
>
> Thank you
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.