Computing growth rate


Computing growth rate

Brijesh Mishra
Hi,

I am trying to calculate the growth rate (of sales, say, though it is to
be computed for many variables) in a panel data set. The problem is that
data are missing for many firms in many years. To put it simply, I
have created this short data frame (the original df is much bigger)

df1<-data.frame(co_code1=rep(c(1100, 1200, 1300), each=7),
fyear1=rep(1990:1996, 3), sales1=rep(seq(1000,1600, by=100),3))

# this gives me
   co_code1 fyear1 sales1
1      1100   1990   1000
2      1100   1991   1100
3      1100   1992   1200
4      1100   1993   1300
5      1100   1994   1400
6      1100   1995   1500
7      1100   1996   1600
8      1200   1990   1000
9      1200   1991   1100
10     1200   1992   1200
11     1200   1993   1300
12     1200   1994   1400
13     1200   1995   1500
14     1200   1996   1600
15     1300   1990   1000
16     1300   1991   1100
17     1300   1992   1200
18     1300   1993   1300
19     1300   1994   1400
20     1300   1995   1500
21     1300   1996   1600

# I am now removing a couple of rows
df1<-df1[-c(5, 8), ]
# the result is
   co_code1 fyear1 sales1
1      1100   1990   1000
2      1100   1991   1100
3      1100   1992   1200
4      1100   1993   1300
6      1100   1995   1500
7      1100   1996   1600
9      1200   1991   1100
10     1200   1992   1200
11     1200   1993   1300
12     1200   1994   1400
13     1200   1995   1500
14     1200   1996   1600
15     1300   1990   1000
16     1300   1991   1100
17     1300   1992   1200
18     1300   1993   1300
19     1300   1994   1400
20     1300   1995   1500
21     1300   1996   1600
# so 1994 for co_code1 1100 and 1990 for co_code1 1200 have been
# removed. If I try,
library(plyr)
d<-ddply(df1,"co_code1",transform, growth=c(NA,exp(diff(log(sales1)))-1)*100)

# this gives a misleading result for 1995 in co_code1 1100 (shown
# below): diff() compares adjacent rows, so the 15.38 is growth over
# the two years from 1993 to 1995, not a yearly rate.

   co_code1 fyear1 sales1    growth
1      1100   1990   1000        NA
2      1100   1991   1100 10.000000
3      1100   1992   1200  9.090909
4      1100   1993   1300  8.333333
5      1100   1995   1500 15.384615
6      1100   1996   1600  6.666667
7      1200   1991   1100        NA
8      1200   1992   1200  9.090909
9      1200   1993   1300  8.333333
10     1200   1994   1400  7.692308
11     1200   1995   1500  7.142857
12     1200   1996   1600  6.666667
13     1300   1990   1000        NA
14     1300   1991   1100 10.000000
15     1300   1992   1200  9.090909
16     1300   1993   1300  8.333333
17     1300   1994   1400  7.692308
18     1300   1995   1500  7.142857
19     1300   1996   1600  6.666667
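
The 15.38 in the 1995 row can be reproduced in isolation. The following is a minimal base-R sketch, assuming only the six rows left for firm 1100; the vector names `years`, `sales` and `gap` are illustrative and do not come from the thread.

```r
# Firm 1100 after the 1994 row was dropped
years <- c(1990, 1991, 1992, 1993, 1995, 1996)
sales <- c(1000, 1100, 1200, 1300, 1500, 1600)

# Row-to-row growth in percent, same formula as in the ddply() call
growth <- c(NA, exp(diff(log(sales))) - 1) * 100

# Number of years between each row and the row before it
gap <- c(NA, diff(years))

# The 1995 row has gap == 2, so its growth value spans 1993 -> 1995
print(data.frame(years, sales, gap, growth = round(growth, 2)))
```

Any row whose `gap` is not 1 carries a growth figure that is not a one-year rate, which is the situation the rest of the thread tries to handle.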
# I thought of applying the formula only where fyear1 increases by
# exactly 1 within a co_code1, like this

d<-ddply(df1,
         "co_code1",
         transform,
         if(diff(fyear1)==1){
           growth=(exp(diff(log(df1$sales1)))-1)*100
         } else{
           growth=NA
         })

But this doesn't work. I get the following warning (repeated a few times):

In if (diff(fyear1) == 1) { :
  the condition has length > 1 and only the first element will be used

# I have searched for a solution, but somehow couldn't get one. Hope
that some kind soul will guide me here.

Regards,

Brijesh K Mishra
Indian Institute of Management, Indore
India

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Computing growth rate

Rui Barradas
Hello,

That is a very common mistake. if() accepts only a single TRUE/FALSE;
for a vectorized version you need ifelse() (see ?ifelse). Something like
the following (untested).

growth <- ifelse(diff(fyear1)==1, (exp(diff(log(df1$sales1)))-1)*100, NA)
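
A small self-contained sketch of the same idea, assuming a single firm with one missing year. The vector names are illustrative. Two details are worth noting: diff() returns one element fewer than its input, so the result needs a leading NA to line up with the rows, and inside ddply(..., transform, ...) the column would be written as `sales1` rather than `df1$sales1`, because `df1$sales1` always refers to the full data frame and not to the current group.

```r
fyear <- c(1990, 1991, 1992, 1993, 1995, 1996)
sales <- c(1000, 1100, 1200, 1300, 1500, 1600)

# diff() returns one element fewer than its input
step  <- diff(fyear)             # length 5
ratio <- exp(diff(log(sales)))   # length 5

# ifelse() tests every element; if() would use a single value only
growth <- ifelse(step == 1, (ratio - 1) * 100, NA)

# Pad with a leading NA so the result has one value per input row
growth <- c(NA, growth)
print(round(growth, 2))
```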

Hope this helps,

Rui Barradas

On 15-12-2016 03:40, Brijesh Mishra wrote:

> # I thought of using the formula only when the increment of fyear1 is
> only 1 while in a co_code1, by using this formula
>
> d<-ddply(df1,
>           "co_code1",
>           transform,
>           if(diff(fyear1)==1){
>             growth=(exp(diff(log(df1$sales1)))-1)*100
>           } else{
>             growth=NA
>           })
>
> But, this doesn't work. I am getting the following error.
>
> In if (diff(fyear1) == 1) { :
>    the condition has length > 1 and only the first element will be used
> (repeated a few times).

Re: Computing growth rate

Berend Hasselman
In reply to this post by Brijesh Mishra

> On 15 Dec 2016, at 04:40, Brijesh Mishra <[hidden email]> wrote:
>
> # I thought of using the formula only when the increment of fyear1 is
> only 1 while in a co_code1, by using this formula
>
> d<-ddply(df1,
>         "co_code1",
>         transform,
>         if(diff(fyear1)==1){
>           growth=(exp(diff(log(df1$sales1)))-1)*100
>         } else{
>           growth=NA
>         })
>
> But, this doesn't work. I am getting the following error.
>
> In if (diff(fyear1) == 1) { :
>  the condition has length > 1 and only the first element will be used
> (repeated a few times).
>
> # I have searched for a solution, but somehow couldn't get one. Hope
> that some kind soul will guide me here.
>

In your case use ifelse() as explained by Rui.
But it can be done more easily since the fyear1 and co_code1 are synchronized.
Add a new column to df1 like this

df1$growth <- c(NA,
         ifelse(diff(df1$fyear1)==1,
                    (exp(diff(log(df1$sales1)))-1)*100,
                    NA
                    )
        )

and display df1. From your request I cannot determine if this is what you want.
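
Because the column above is computed over the whole data frame, it relies on the rows being sorted by firm and year, and it compares the first row of each firm with the last row of the previous one. A hedged sketch of one way to guard against that without grouping is to also require that the previous row belongs to the same firm. The names `same_firm` and `next_year` are illustrative, not from the thread.

```r
df1 <- data.frame(co_code1 = rep(c(1100, 1200, 1300), each = 7),
                  fyear1   = rep(1990:1996, 3),
                  sales1   = rep(seq(1000, 1600, by = 100), 3))
df1 <- df1[-c(5, 8), ]

# Assumption: rows must be sorted by firm, then by year
df1 <- df1[order(df1$co_code1, df1$fyear1), ]

same_firm <- diff(df1$co_code1) == 0   # previous row is the same firm
next_year <- diff(df1$fyear1) == 1     # previous row is one year earlier

df1$growth <- c(NA, ifelse(same_firm & next_year,
                           (exp(diff(log(df1$sales1))) - 1) * 100,
                           NA))
print(df1)
```

With the sample data this leaves NA in the first row of each firm and in the 1995 row of firm 1100.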

regards,

Berend Hasselman


Re: Computing growth rate

Brijesh Mishra
In reply to this post by Rui Barradas
Dear Mr. Barradas,

Thanks a lot for pointing that out. I tried it in a few steps:
1. When I evaluated

d<-ddply(df1,"co_code1",transform, growth <- ifelse(diff(fyear1)==1,
(exp(diff(log(df1$sales1)))-1)*100, NA))

I got the following, i.e., no growth column was added.

   co_code1 fyear1 sales1
1      1100   1990   1000
2      1100   1991   1100
3      1100   1992   1200
4      1100   1993   1300
5      1100   1995   1500
6      1100   1996   1600
7      1200   1991   1100
8      1200   1992   1200
9      1200   1993   1300
10     1200   1994   1400
11     1200   1995   1500
12     1200   1996   1600
13     1300   1990   1000
14     1300   1992   1200
15     1300   1993   1300
16     1300   1994   1400
17     1300   1995   1500
18     1300   1996   1600

2. When, just for the heck of it, the assignment operator (<-) was
changed to '=' as done previously,

d<-ddply(df1,"co_code1",transform, growth = ifelse(diff(fyear1)==1,
(exp(diff(log(df1$sales1)))-1)*100, NA))

It no longer ran; the error was

"Error in data.frame(list(co_code1 = c(1100, 1100, 1100, 1100, 1100, 1100 :
  arguments imply differing number of rows: 6, 5"

3. The following gives the desired result

df1$growth<-c(NA, ifelse(diff(df1$fyear1)==1,
(exp(diff(log(df1$sales1)))-1)*100, NA))

But now the calculation is no longer restricted to each 'co_code1'.
Hypothetically, if the last row of one co_code1 is followed by the first
row of another whose 'fyear1' is exactly 1 higher, growth will be
computed across the two firms.

Is there a better and more elegant way of doing it?
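
One possible answer, sketched in base R under stated assumptions: wrap the calculation in a small helper and apply it within each firm. The helper `yoy_growth` and the data frame `toy` are hypothetical names introduced here for illustration; `toy` is built so that firm 1 ends in 1995 and firm 2 starts in 1996, which is exactly the boundary case described above.

```r
# Hypothetical helper: year-on-year growth in percent, NA wherever the
# previous row is not exactly one year earlier
yoy_growth <- function(value, year) {
  c(NA, ifelse(diff(year) == 1, (exp(diff(log(value))) - 1) * 100, NA))
}

# Toy data: firm 1 ends in 1995, firm 2 starts in 1996
toy <- data.frame(co_code1 = c(1, 1, 2, 2),
                  fyear1   = c(1994, 1995, 1996, 1997),
                  sales1   = c(100, 110, 500, 550))

# Ungrouped: row 3 wrongly compares firm 2 with firm 1
toy$growth_all <- yoy_growth(toy$sales1, toy$fyear1)

# Grouped: split by firm, compute within each piece, then reassemble
pieces <- lapply(split(toy, toy$co_code1), function(d) {
  d$growth <- yoy_growth(d$sales1, d$fyear1)
  d
})
toy <- unsplit(pieces, toy$co_code1)
print(toy)
```

With plyr loaded, the same helper should also fit the original call, along the lines of ddply(df1, "co_code1", transform, growth = yoy_growth(sales1, fyear1)); note the bare column names, which refer to the current group rather than to the whole of df1.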

Thanks and regards,

Brijesh

On Thu, Dec 15, 2016 at 5:02 PM, Rui Barradas <[hidden email]> wrote:

> Hello,
>
> That is a very common mistake. if() accepts only one TRUE/FALSE, for a
> vectorized version you need ?ifelse. Something like the following
> (untested).
>
> growth <- ifelse(diff(fyear1)==1, (exp(diff(log(df1$sales1)))-1)*100, NA)
>
> Hope this helps,
>
> Rui Barradas

Re: Computing growth rate

Brijesh Mishra
In reply to this post by Berend Hasselman
Dear Mr Hasselman,

I missed your mail while I was typing my own reply to Mr. Barradas's
suggestion. In fact, I implemented your suggestion even before reading
it. But I have a concern, which I noted in that reply (though it is only
hypothetical; such a scenario is very unlikely to occur). Is
there a way to restrict such calculations to each co_code1?

Many thanks,

Brijesh

On Thu, Dec 15, 2016 at 5:48 PM, Berend Hasselman <[hidden email]> wrote:

> In your case use ifelse() as explained by Rui.
> But it can be done more easily since the fyear1 and co_code1 are synchronized.
> Add a new column to df1 like this
>
> df1$growth <- c(NA,
>          ifelse(diff(df1$fyear1)==1,
>                     (exp(diff(log(df1$sales1)))-1)*100,
>                     NA
>                     )
>         )
>
> and display df1. From your request I cannot determine if this is what you want.
>
> regards,
>
> Berend Hasselman
>


Re: Computing growth rate

Brijesh Mishra
This was ensured while using ddply()...

On Thu, Dec 15, 2016 at 6:04 PM, Brijesh Mishra
<[hidden email]> wrote:

> Dear Mr Hasselman,
>
> I missed you mail, while I was typing my own mail as a reply to Mr.
> Barradas suggestion. In fact, I implemented your suggestion even
> before reading it. But, I have a concern that I have noted (though its
> only hypothetical- such a scenario is very unlikely to occur). Is
> there a way to restrict such calculations co_code1 wise?
>
> Many thanks,
>
> Brijesh

Re: Computing growth rate

PIKAL Petr
In reply to this post by Berend Hasselman
Hi

Maybe you do not need if or ifelse at all; just divide by the difference in years.

d2<-ddply(df1,"co_code1",transform, growth=c(NA,exp(diff(log(sales1))/diff(fyear1))- 1)*100)
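
Dividing the log difference by the gap in years gives the average compound rate per year over that gap: exp(diff(log(s))/k) - 1 is the same as (s_t / s_{t-k})^(1/k) - 1, and it reduces to the ordinary one-year rate when k is 1. A minimal base-R sketch of that identity, using three rows of firm 1100 and illustrative vector names:

```r
fyear <- c(1993, 1995, 1996)
sales <- c(1300, 1500, 1600)

k <- diff(fyear)   # gaps in years: 2, 1

# Growth per year implied by each pair of adjacent rows, in percent
per_year <- (exp(diff(log(sales)) / k) - 1) * 100

# The same quantity written as a k-th root of the sales ratio
ratio <- sales[-1] / sales[-length(sales)]
check <- (ratio^(1 / k) - 1) * 100

print(round(per_year, 3))
```

This is a different design choice from the ifelse() approach: instead of NA, a row after a gap gets an averaged yearly rate (about 7.42 for 1993 to 1995 here), which is only appropriate if smoothing over the missing year is acceptable for the analysis.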

Cheers
Petr

> -----Original Message-----
> From: R-help [mailto:[hidden email]] On Behalf Of Berend
> Hasselman
> Sent: Thursday, December 15, 2016 1:18 PM
> To: Brijesh Mishra <[hidden email]>
> Cc: r-help mailing list <[hidden email]>
> Subject: Re: [R] Computing growth rate
>
> In your case use ifelse() as explained by Rui.
> But it can be done more easily since the fyear1 and co_code1 are
> synchronized.
> Add a new column to df1 like this
>
> df1$growth <- c(NA,
>          ifelse(diff(df1$fyear1)==1,
>                     (exp(diff(log(df1$sales1)))-1)*100,
>                     NA
>                     )
>         )
>
> and display df1. From your request I cannot determine if this is what you
> want.
>
> regards,
>
> Berend Hasselman
>

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Computing growth rate

Berend Hasselman
In reply to this post by Brijesh Mishra

> On 15 Dec 2016, at 13:34, Brijesh Mishra <[hidden email]> wrote:
>
> Dear Mr Hasselman,
>
> I missed your mail while I was typing my own reply to Mr. Barradas's
> suggestion. In fact, I implemented your suggestion even before reading
> it. But I have a concern that I have noted (though it is only
> hypothetical; such a scenario is very unlikely to occur). Is there a
> way to restrict such calculations co_code1-wise?

Like this?

df2 <- ddply(df1,"co_code1", transform,
    growth=c(NA, ifelse(diff(fyear1)==1, (exp(diff(log(sales1)))-1)*100,NA))
    )


But do also look at Petr Pikal's solution. Which of the two solutions you prefer depends on what you want in your special case.
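
For readers without plyr installed, the per-firm calculation above can also be sketched in base R. This is only an illustration under stated assumptions: growth_one() is a hypothetical helper name, not something from the thread, and split()/lapply()/rbind() stand in for ddply(). Working firm by firm matters because an ungrouped diff() could pair the last row of one firm with the first row of the next whenever their years happened to differ by exactly one.

```r
# Rebuild the example data from the original post, including the two removed rows.
df1 <- data.frame(co_code1 = rep(c(1100, 1200, 1300), each = 7),
                  fyear1   = rep(1990:1996, 3),
                  sales1   = rep(seq(1000, 1600, by = 100), 3))
df1 <- df1[-c(5, 8), ]

# growth_one() is a hypothetical helper: percentage growth within one firm,
# set to NA wherever the previous available year is not exactly one year back.
growth_one <- function(d) {
  d <- d[order(d$fyear1), ]
  d$growth <- c(NA, ifelse(diff(d$fyear1) == 1,
                           (exp(diff(log(d$sales1))) - 1) * 100,
                           NA))
  d
}

# Split by firm, compute growth inside each piece, and stack the pieces again.
df2 <- do.call(rbind, lapply(split(df1, df1$co_code1), growth_one))
rownames(df2) <- NULL
print(df2)
```

With the example data, the 1995 row of firm 1100 should come out as NA rather than 15.38, because 1994 is missing for that firm.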

Berend

Re: Computing growth rate

dkStevens
In reply to this post by Berend Hasselman
Berend - Unless you need the change in sales year by year, you might
consider looking at each company's sales over the years and using
regression or another type of trend analysis to get an overall trend.
Or, if not, simply divide diff(sales1) by diff(fyear1) for each company
so that you at least get the average change over the missing years.
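
Both suggestions can be sketched in base R as follows. The names avg_change() and trend are hypothetical, and fitting log(sales1) against fyear1 is only one assumed choice of trend model; the post does not say which kind of regression is meant.

```r
# Rebuild the example data from the original post, including the two removed rows.
df1 <- data.frame(co_code1 = rep(c(1100, 1200, 1300), each = 7),
                  fyear1   = rep(1990:1996, 3),
                  sales1   = rep(seq(1000, 1600, by = 100), 3))
df1 <- df1[-c(5, 8), ]

# avg_change() is a hypothetical helper: change in sales per elapsed year,
# so a two-year gap yields the average yearly change across that gap.
avg_change <- function(d) {
  d <- d[order(d$fyear1), ]
  d$avg_change <- c(NA, diff(d$sales1) / diff(d$fyear1))
  d
}
df3 <- do.call(rbind, lapply(split(df1, df1$co_code1), avg_change))
rownames(df3) <- NULL
print(df3)

# One overall trend per firm (assumed model: log-linear in the year).
# The slope is converted to an approximate percentage growth per year.
trend <- sapply(split(df1, df1$co_code1), function(d) {
  fit <- lm(log(sales1) ~ fyear1, data = d)
  (exp(coef(fit)[["fyear1"]]) - 1) * 100
})
print(trend)
```

The first part keeps one value per row, while the regression collapses each firm to a single number, so which one fits depends on whether year-by-year figures are needed.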

David


On 12/15/2016 7:18 AM, Berend Hasselman wrote:

> In your case use ifelse() as explained by Rui.
> But it can be done more easily since the fyear1 and co_code1 are synchronized.
> Add a new column to df1 like this
>
> df1$growth <- c(NA,
>           ifelse(diff(df1$fyear1)==1,
>                      (exp(diff(log(df1$sales1)))-1)*100,
>                      NA
>                      )
>          )
>
> and display df1. From your request I cannot determine if this is what you want.
>
> regards,
>
> Berend Hasselman
>

--
David K Stevens, P.E., Ph.D.
Professor and Head, Environmental Engineering
Civil and Environmental Engineering
Utah Water Research Laboratory
8200 Old Main Hill
Logan, UT  84322-8200
435 797 3229 - voice
435 797 1363 - fax
[hidden email]


Re: Computing growth rate

Brijesh Mishra
In reply to this post by PIKAL Petr
Wow, Mr Petr. The placement of diff(fyear1) was very clever indeed. Just
to check that I understand the steps you intended:

exp(diff(log(sales1))/diff(fyear1)) - 1
= exp(log(sales1(t)/sales1(t-1))/(fyear1(t)-fyear1(t-1))) - 1
= exp(log((sales1(t)/sales1(t-1))^(1/delta(fyear1)))) - 1
= (sales1(t)/sales1(t-1))^(1/delta(fyear1)) - 1

This gives the CAGR, which saves some precious data-points (in my
dataset, it may prove a big boon). I spent a significant amount of
time today to figure out something like this, which you did so easily.
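
The equivalence of the log form and the power form can be checked numerically. The sketch below uses the one gap in the example data (firm 1100, 1993 to 1995); the variable names are chosen here for illustration only.

```r
# Sales of firm 1100 on either side of the missing year 1994.
sales <- c(1300, 1500)
years <- c(1993, 1995)

# Log form, as in the ddply() call, and the equivalent power (CAGR) form.
via_logs  <- exp(diff(log(sales)) / diff(years)) - 1
via_power <- (sales[2] / sales[1])^(1 / diff(years)) - 1
print(c(via_logs = via_logs, via_power = via_power) * 100)

# Compounding the annualised rate over the two-year gap recovers the 1995 value.
recovered <- sales[1] * (1 + via_logs)^diff(years)
print(recovered)
```

Both forms give roughly 7.4 percent per year, compared with the 15.38 percent that results when the two-year change is treated as a single year.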

Many Thanks,

Brijesh

On Thu, Dec 15, 2016 at 7:21 PM, PIKAL Petr <[hidden email]> wrote:

> Hi
>
> Maybe you do not need if or ifelse; just divide by the difference in years.
>
> d2<-ddply(df1,"co_code1",transform, growth=c(NA,exp(diff(log(sales1))/diff(fyear1))- 1)*100)
>
> Cheers
> Petr
>


Re: Computing growth rate

Brijesh Mishra
In reply to this post by Berend Hasselman
Yes, Mr Hasselman. This works like a charm now. I also realise where I
was making an error. Now I have two very good options to choose from.
Spoilt for choice...

Many Many Thanks,

Brijesh

On Thu, Dec 15, 2016 at 7:53 PM, Berend Hasselman <[hidden email]> wrote:

>
>> On 15 Dec 2016, at 13:34, Brijesh Mishra <[hidden email]> wrote:
>>
>> Dear Mr Hasselman,
>>
>> I missed your mail while I was typing my own reply to Mr. Barradas's
>> suggestion. In fact, I implemented your suggestion even before reading
>> it. But I have a concern that I have noted (though it is only
>> hypothetical; such a scenario is very unlikely to occur). Is there a
>> way to restrict such calculations co_code1-wise?
>
> Like this?
>
> df2 <- ddply(df1,"co_code1", transform,
>     growth=c(NA, ifelse(diff(fyear1)==1, (exp(diff(log(sales1)))-1)*100,NA))
>     )
>
>
> But do also look at Petr Pikal's solution. Which of the two solutions you prefer depends on what you want in your special case.
>
> Berend
