DATA SUMMARIZING and REPORTING

abhinabaroy09
Hi R-helpers,

I have a data frame like

  ID_CASE  YEAR_MTH  ATT_1   A1  A2  A3
  CB26A      201302      1  146  42  74
  CB26A      201302      0  140  50  77
  CB26A      201303      0  128  36  77
  CB26A      201304      1  146  36  72
  CB26A      201305      1  134  36  80
  CB26A      201305      0  148  30  80
  CB26A      201306      0  134  20  72
  CB26A      201307      1  125  48  79
  CB26A      201309      0  122  44  74
  CB26A      201310      1  126  37  72
  CB26A      201310      1  107  43  75

I want a final data frame which will look like

  ID_CASE  Period         No.ofChange  %Paid
  CB26A    201302-201304            2  0.414365
  CB26A    201303-201305            2  0.445245
  CB26A    201304-201306            1  0.444444
  CB26A    201305-201307            2  0.460741
  CB26A    201306-201308            1  0.461774
  CB26A    201307-201309            1  0.451327
  CB26A    201308-201310            1  0.461378

where,
Period = a time period of 3 months which is shifted by 1 month subsequently

No.ofChange = number of times ATT_1 changes value within this period

%Paid = sum(A3)/(sum(A1)+sum(A2)) for this period
E.g. for Period=201302-201304,
%Paid = (74+77+77+72)/((146+140+128+146)+(42+50+36+36))
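
The two definitions above can be checked by hand for the first period. A minimal sketch in R, with the four rows that fall in 201302-201304 typed in directly (the vector names are only for illustration and do not come from the post):

```r
# Rows with YEAR_MTH between 201302 and 201304, in the order posted
att_1 <- c(1, 0, 0, 1)
a1 <- c(146, 140, 128, 146)
a2 <- c(42, 50, 36, 36)
a3 <- c(74, 77, 77, 72)

# No.ofChange: count positions where ATT_1 differs from the previous row
no_of_change <- sum(diff(att_1) != 0)

# %Paid: sum(A3) / (sum(A1) + sum(A2)) = 300 / (560 + 164)
pct_paid <- sum(a3) / (sum(a1) + sum(a2))

print(no_of_change)        # 2
print(round(pct_paid, 6))  # 0.414365
```

Both values agree with the first row of the desired output.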

Period calculation should start from the first YEAR_MTH of each ID_CASE,
i.e., if the first YEAR_MTH of an ID_CASE is 201301 or 201304, then the
periods should be defined accordingly.
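
One way to lay out the periods from an arbitrary first month is sketched below with base R dates, so that windows roll over a year end correctly. The function name make_periods and its arguments are made up for this illustration and are not part of the original post:

```r
# Build "YYYYMM-YYYYMM" labels for 3-month windows shifted by one month.
# first, last: months in YYYYMM form (e.g. 201311), with first <= last
make_periods <- function(first, last, width = 3) {
  to_date <- function(ym) as.Date(paste0(ym, "01"), format = "%Y%m%d")
  months <- seq(to_date(first), to_date(last), by = "month")
  n <- length(months) - width + 1
  if (n < 1) return(character(0))   # fewer than `width` months: no window
  starts <- months[seq_len(n)]
  ends <- months[seq_len(n) + width - 1]
  paste(format(starts, "%Y%m"), format(ends, "%Y%m"), sep = "-")
}

make_periods(201302, 201305)  # windows starting 201302 and 201303
make_periods(201311, 201402)  # windows that cross the year end
```

Calling it once per ID_CASE with that case's first and last YEAR_MTH would give periods that start wherever the case starts.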

I have a data frame with 400 unique ID_CASE values and need to do this for every one of them.

How can I do it in R?

Regards,
Abhinaba

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: DATA SUMMARIZING and REPORTING

Bert Gunter
Is this homework? There is a no homework policy here.

And stop posting in HTML (plain text only), and learn to use ?dput
to post example data.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
Clifford Stoll




Re: DATA SUMMARIZING and REPORTING

PIKAL Petr
In reply to this post by abhinabaroy09
Hi

Maybe

?aggregate
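
A small sketch of what aggregate() does with data of this shape, using three of the posted rows. It is an illustration only, not code from the thread: it gives one row of sums per ID_CASE and month, which would still have to be combined into the 3-month windows.

```r
# Three rows of the posted example (two of them in the same month)
d <- data.frame(ID_CASE  = c("CB26A", "CB26A", "CB26A"),
                YEAR_MTH = c(201302L, 201302L, 201303L),
                A1 = c(146, 140, 128),
                A2 = c(42, 50, 36),
                A3 = c(74, 77, 77))

# Sum A1, A2, A3 within each ID_CASE / YEAR_MTH combination
monthly <- aggregate(cbind(A1, A2, A3) ~ ID_CASE + YEAR_MTH, data = d, FUN = sum)
print(monthly)
```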

Use dput for data presentation and no HTML as everything gets scrambled with it.
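
For readers unfamiliar with dput(): it prints an R expression that recreates the object, so the data survives plain-text mail intact. A minimal sketch with a made-up two-row data frame (the object names are hypothetical):

```r
small <- data.frame(ID_CASE = c("CB26A", "CB26A"),
                    YEAR_MTH = c(201302L, 201303L),
                    stringsAsFactors = FALSE)

# dput() writes the object as R code; paste that text into the mail
dput(small)

# The recipient can rebuild the object by evaluating the pasted text
txt <- paste(capture.output(dput(small)), collapse = "\n")
rebuilt <- eval(parse(text = txt))
all.equal(small, rebuilt)
```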

Regards
Petr

Re: DATA SUMMARIZING and REPORTING

abhinabaroy09
Hi,

> x <- read.csv("log1.csv", header=TRUE, as.is=TRUE)
> dput(x=x)
structure(list(ID_CASE = c("CB26A", "CB26A", "CB26A", "CB26A",
"CB26A", "CB26A", "CB26A", "CB26A", "CB26A", "CB26A", "CB26A"
), MTH_SUPPORT = c(201302L, 201302L, 201303L, 201304L, 201305L,
201305L, 201306L, 201307L, 201309L, 201310L, 201310L), ATT_1 = c(1L,
0L, 0L, 1L, 1L, 0L, 0L, 1L, 0L, 1L, 1L), A1 = c(146L, 140L, 128L,
146L, 134L, 148L, 134L, 125L, 122L, 126L, 107L), A2 = c(42L,
50L, 36L, 36L, 36L, 30L, 20L, 48L, 44L, 37L, 43L), A3 = c(74L,
77L, 77L, 72L, 80L, 80L, 72L, 79L, 74L, 72L, 75L)), .Names = c("ID_CASE",
"MTH_SUPPORT", "ATT_1", "A1", "A2", "A3"), class = "data.frame",
row.names = c(NA, -11L))


Looking forward to a solution to the problem.


Thank you


Regards



Re: DATA SUMMARIZING and REPORTING

arun kirshna
In reply to this post by abhinabaroy09
For the example you gave:

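x ## the dataset posted via dput()

## Start and end month of each 3-month window. This is plain integer
## arithmetic on YYYYMM, so it assumes all months lie in one calendar year.
indx <- t(sapply(min(x$MTH_SUPPORT):(max(x$MTH_SUPPORT) - 2),
                 function(m) c(m, m + 2)))

res <- do.call(rbind, apply(indx, 1, function(.indx) {
    ## rows that fall inside the current window
    x1 <- x[x$MTH_SUPPORT >= .indx[1] & x$MTH_SUPPORT <= .indx[2], ]
    Period <- paste(.indx[1], .indx[2], sep = "-")
    ## count rows whose ATT_1 differs from the previous row
    No.ofChange <- sum(x1$ATT_1[-1] != x1$ATT_1[-length(x1$ATT_1)])
    Paid <- with(x1, sum(A3)/(sum(A1) + sum(A2)))
    data.frame(ID_CASE = x$ID_CASE[1L], Period, No.ofChange, Paid,
               stringsAsFactors = FALSE)
}))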


> res
  ID_CASE        Period No.ofChange      Paid
1   CB26A 201302-201304           2 0.4143646
2   CB26A 201303-201305           2 0.4452450
3   CB26A 201304-201306           1 0.4444444
4   CB26A 201305-201307           2 0.4607407
5   CB26A 201306-201308           1 0.4617737
6   CB26A 201307-201309           1 0.4513274
7   CB26A 201308-201310           1 0.4613779


With multiple ID_CASE values, either split the dataset by ID_CASE or use one of the grouping functions before applying this.


A.K.




Re: DATA SUMMARIZING and REPORTING

arun kirshna


With >1 ID_CASE, you may try:

xN <- x
xN$ID_CASE <- "CB27A"  # create a second ID_CASE with otherwise identical data
x <- rbind(x, xN)

## split by ID_CASE, then build the rolling windows within each piece
res1 <- do.call(rbind, lapply(split(x, x$ID_CASE), function(.x) {
    ## window start/end months, starting at this ID_CASE's first month
    indx <- with(.x, t(sapply(min(MTH_SUPPORT):(max(MTH_SUPPORT) - 2),
                              function(y) c(y, y + 2))))
    do.call(rbind, apply(indx, 1, function(.indx) {
        x1 <- .x[with(.x, MTH_SUPPORT >= .indx[1] & MTH_SUPPORT <= .indx[2]), ]
        Period <- paste(.indx[1], .indx[2], sep = "-")
        ## both summaries are scalars, repeated on every row of x2
        x2 <- within(x1, {
            Paid <- sum(A3)/(sum(A1) + sum(A2))
            No.ofChange <- sum(ATT_1[-1] != ATT_1[-length(ATT_1)])
        })
        data.frame(ID_CASE = .x$ID_CASE[1L], Period, No.ofChange = x2$No.ofChange[1L],
                   Paid = x2$Paid[1L], stringsAsFactors = FALSE)
    }))
}))

row.names(res1) <- seq_len(nrow(res1))
> res1
   ID_CASE        Period No.ofChange      Paid
1    CB26A 201302-201304           2 0.4143646
2    CB26A 201303-201305           2 0.4452450
3    CB26A 201304-201306           1 0.4444444
4    CB26A 201305-201307           2 0.4607407
5    CB26A 201306-201308           1 0.4617737
6    CB26A 201307-201309           1 0.4513274
7    CB26A 201308-201310           1 0.4613779
8    CB27A 201302-201304           2 0.4143646
9    CB27A 201303-201305           2 0.4452450
10   CB27A 201304-201306           1 0.4444444
11   CB27A 201305-201307           2 0.4607407
12   CB27A 201306-201308           1 0.4617737
13   CB27A 201307-201309           1 0.4513274
14   CB27A 201308-201310           1 0.4613779
A.K.
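
Both versions above build the windows by adding 2 to a YYYYMM integer. That works for the example, but it would produce invalid months (201313, 201314, ...) if an ID_CASE ran past December. A possible variant, offered as a sketch and not as part of the posted answer, converts each month to a running month count first. The function names ym_to_index, index_to_ym and rolling_summary are invented for this illustration, and the toy data are made up:

```r
# Convert YYYYMM to a running month count and back
ym_to_index <- function(ym) (ym %/% 100) * 12 + (ym %% 100) - 1
index_to_ym <- function(i) (i %/% 12) * 100 + (i %% 12) + 1

# Rolling summary for one ID_CASE; d needs the columns
# ID_CASE, MTH_SUPPORT, ATT_1, A1, A2, A3, in chronological order
rolling_summary <- function(d, width = 3) {
  m <- ym_to_index(d$MTH_SUPPORT)
  n_win <- max(m) - min(m) - width + 2
  if (n_win < 1) return(NULL)          # span shorter than one window
  starts <- min(m) + seq_len(n_win) - 1
  do.call(rbind, lapply(starts, function(s) {
    w <- d[m >= s & m <= s + width - 1, ]
    data.frame(ID_CASE = d$ID_CASE[1L],
               Period = paste(index_to_ym(s), index_to_ym(s + width - 1), sep = "-"),
               No.ofChange = sum(diff(w$ATT_1) != 0),
               Paid = sum(w$A3) / (sum(w$A1) + sum(w$A2)),
               stringsAsFactors = FALSE)
  }))
}

# Toy data crossing a year end (values made up)
toy <- data.frame(ID_CASE = "X1",
                  MTH_SUPPORT = c(201311L, 201312L, 201401L, 201402L),
                  ATT_1 = c(1L, 0L, 0L, 1L),
                  A1 = c(100, 100, 100, 100),
                  A2 = c(50, 50, 50, 50),
                  A3 = c(60, 75, 90, 75),
                  stringsAsFactors = FALSE)

out <- do.call(rbind, lapply(split(toy, toy$ID_CASE), rolling_summary))
row.names(out) <- NULL
print(out)
```

A window that contains no rows at all would give Paid = NaN here, as it would in the versions above; whether such periods should be dropped or kept is left open in the thread.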



