Capturing positive and negative changes using R


F86
Dear R-users,

I have country-year data for 180 countries from 1970 to 2010. I’m interested in capturing positive and negative changes in some of the variables. Some of these variables are continuous (0,25, 0,33, 1, 1,5, etc.); others are ordered (0, 1, 2).

To do this, I use this code:

data$X1_change <- +c(FALSE, diff(data$X1))

My data looks something like this (please see below).

There are some problems with this code:

(1) I can’t capture the smaller changes, say from 0,25 to 0,33 (I get weird numbers). I would love to get the exact difference (for example +1, -0,22, +4, -2, etc.).
(2) It can’t distinguish between countries. That is, it takes the difference across countries when it should only do so within each country (for example, when the US series ends in 2011 and Canada starts, it counts this as a difference, but it shouldn’t; see below).
(3) An NA (missing value) is neither a positive nor a negative change, yet the code treats whatever comes after the NA as a difference.

So, I wonder whether anyone here can help me adjust this code. I appreciate all comments.
 

Year  Country        X1  X2
1990  United States  0   0,22
1991  United States  0   0,22
1992  United States  0   0,22
1993  United States  0   0,22
1994  United States  0   0,22
1995  United States  0   0,22
1996  United States  0   0,22
1997  United States  0   0,5
1998  United States  0   0,5
1999  United States  0   0,5
2000  United States  0   0,5
2001  United States  0   0,5
2002  United States  2   NA
2003  United States  2   0,5
2004  United States  2   1
2005  United States  1   1
2006  United States  1   1
2007  United States  1   1
2008  United States  1   1
2009  United States  1   1
2010  United States  1   0,5
2011  United States  0   0,5
1990  Canada         1   1,5
1991  Canada         1   1,5
1992  Canada         1   NA
1993  Canada         1   1,5
1994  Canada         1   1,5
1995  Canada         1   1,5
1996  Canada         1   1,5
1997  Canada         1   1,5
1998  Canada         1   2
1999  Canada         2   2
2000  Canada         2   2
2001  Canada         2   2
2002  Canada         2   2
2003  Canada         1   2
2004  Canada         2   0,5
2005  Canada         1   0,5
2006  Canada         0   0,5
2007  Canada         1   0,5
2008  Canada         0   0,5
2009  Canada         1   0,5
2010  Canada         1   0,5
2011  Canada         0   1
        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Capturing positive and negative changes using R

Rui Barradas
Hello,

Please don't post in HTML, the data is unreadable.

Two ideas:
1) Why c(FALSE, diff()) if diff() returns numeric values? Use c(0, diff(x))
instead, and you won't need the coercion to numeric with the plus sign.
2) There are many ways to group by a variable, in this case 'Country'.
See in base R

?aggregate
?tapply
?ave
?by

There are also contributed packages, such as dplyr and data.table.
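
As a rough sketch of the two ideas above, here is a small made-up data frame (the name toy and its values are assumptions for illustration, not the poster's data). It compares an ungrouped c(0, diff()) with a grouped version using ave(), one of the base R options listed:

```r
# Toy data, invented for illustration only
toy <- data.frame(
  Country = c("A", "A", "A", "B", "B", "B"),
  Year    = c(1990, 1991, 1992, 1990, 1991, 1992),
  X1      = c(0, 2, 1, 5, 5, 7)
)

# Ungrouped: the first row of country B is compared with the last row of A
toy$naive <- c(0, diff(toy$X1))

# Grouped: diff() is computed separately within each Country,
# and the first year of every country is set to 0
toy$X1_change <- ave(toy$X1, toy$Country, FUN = function(x) c(0, diff(x)))

print(toy)
```

In the naive column the first row of B carries a spurious change of 4 (5 minus 1); in the grouped column it is 0. This assumes the rows are already sorted by year within each country.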


If you can repost your data, setting your e-mail client to plain text,
we will be able to say more.

Hope this helps,

Rui Barradas

On 20/07/19 at 20:33, Faradj Koliev wrote:


Re: Capturing positive and negative changes using R

Jim Lemon-4
In reply to this post by F86
Hi Faradj,
Rui's advice is correct; here's a way to do it. Note that I have
replaced the decimal commas with full stops for my convenience:

fkdf<-read.csv(text="Year,Country,X1,X2
1990,United States,0,0.22
1991,United States,0,0.22
1992,United States,0,0.22
1993,United States,0,0.22
1994,United States,0,0.22
1995,United States,0,0.22
1996,United States,0,0.22
1997,United States,0,0.5
1998,United States,0,0.5
1999,United States,0,0.5
2000,United States,0,0.5
2001,United States,0,0.5
2002,United States,2,NA
2003,United States,2,0.5
2004,United States,2,1
2005,United States,1,1
2006,United States,1,1
2007,United States,1,1
2008,United States,1,1
2009,United States,1,1
2010,United States,1,0.5
2011,United States,0,0.5
1990,Canada,1,1.5
1991,Canada,1,1.5
1992,Canada,1,NA
1993,Canada,1,1.5
1994,Canada,1,1.5
1995,Canada,1,1.5
1996,Canada,1,1.5
1997,Canada,1,1.5
1998,Canada,1,2
1999,Canada,2,2
2000,Canada,2,2
2001,Canada,2,2
2002,Canada,2,2
2003,Canada,1,2
2004,Canada,2,0.5
2005,Canada,1,0.5
2006,Canada,0,0.5
2007,Canada,1,0.5
2008,Canada,0,0.5
2009,Canada,1,0.5
2010,Canada,1,0.5
2011,Canada,0,1",
header=TRUE,stringsAsFactors=FALSE)
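# diff() within each country; aggregate() returns one row per country with
# the differences stored as a matrix column named x (this relies on every
# country covering the same years)
diffX1<-aggregate(fkdf$X1,by=list(fkdf[,2]),FUN=diff)
diffX2<-aggregate(fkdf$X2,by=list(fkdf[,2]),FUN=diff)
# each difference is labelled with the later year of its pair
diffyears<-unique(fkdf$Year)[-1]
# flatten to ordinary data frames: Country plus one column per year
diffX1<-data.frame(diffX1$Group.1,diffX1$x)
names(diffX1)<-c("Country",diffyears)
diffX2<-data.frame(diffX2$Group.1,diffX2$x)
names(diffX2)<-c("Country",diffyears)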

Jim

On Sun, Jul 21, 2019 at 5:34 AM Faradj Koliev <[hidden email]> wrote:


Re: Capturing positive and negative changes using R

Jeff Newmiller
It is possible that part of the original problem was that Faradj expected R to recognise the comma as the decimal mark, and so read that column in as a factor without realizing it. Factors are discrete, not continuous.

He should use the str() function to identify the column types in his data frame.
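
A minimal sketch of that check, assuming a made-up semicolon-separated input (the names txt, d, codes and x2 are hypothetical). stringsAsFactors = TRUE is set explicitly to mimic the default of R versions before 4.0.0; newer versions would report the column as character instead:

```r
# Made-up input with comma decimals (not the poster's file)
txt <- "Year;X2\n1990;0,25\n1991;0,33\n1992;1,5"

# The default dec = "." cannot parse "0,25", so X2 is not read as a number
d <- read.table(text = txt, header = TRUE, sep = ";", stringsAsFactors = TRUE)
str(d)            # X2 is listed as a Factor rather than num

# as.numeric() on a factor returns the integer level codes, not the values,
# which is one way to end up with "weird numbers" after diff()
codes <- as.numeric(d$X2)

# One possible repair: swap the comma for a full stop, then convert
x2 <- as.numeric(sub(",", ".", as.character(d$X2), fixed = TRUE))
diff(x2)
```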

On July 20, 2019 6:17:19 PM CDT, Jim Lemon <[hidden email]> wrote:


Re: Capturing positive and negative changes using R

Richard O'Keefe-2
If "Faradj was expecting R to recognise the comma as the decimal"
then it might be worth mentioning the 'dec' argument of
read.table and its friends (the default is dec = "."; dec = "," reads
comma decimals).
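
For illustration, a short sketch of reading comma decimals directly, assuming a semicolon-separated input invented for this example (txt, viaCsv2 and viaTable are hypothetical names). read.csv2() is the read.table() variant whose defaults are sep = ";" and dec = ",":

```r
# Made-up semicolon-separated input with comma decimals
txt <- "Year;Country;X2\n1990;Canada;1,5\n1991;Canada;2\n1992;Canada;0,5"

# read.csv2() defaults to sep = ";" and dec = ","
viaCsv2 <- read.csv2(text = txt, stringsAsFactors = FALSE)

# The same settings spelled out with read.table()
viaTable <- read.table(text = txt, header = TRUE, sep = ";", dec = ",",
                       stringsAsFactors = FALSE)

str(viaCsv2)       # X2 is now num
diff(viaCsv2$X2)   # differences on the real values
```

If the file is comma-separated as well as comma-decimal, the fields would need quoting or a different separator; that case is not covered here.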


On Sun, 21 Jul 2019 at 12:48, Jeff Newmiller <[hidden email]>
wrote:


Re: Capturing positive and negative changes using R

Daniel Nordlund-3
In reply to this post by Jim Lemon-4
Here is one more option using the ave() function, with Jim's data and
naming convention:

fkdf$X1_change <- ave(fkdf[,'X1'], fkdf$Country, FUN=function(x) c(0,diff(x)))
fkdf$X2_change <- ave(fkdf[,'X2'], fkdf$Country, FUN=function(x) c(0,diff(x)))
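
If only the direction of each change is needed, one possible follow-up is to take sign() of the change column. This is a sketch on made-up data (toy, X2_change and X2_dir are hypothetical names), not part of the approach above:

```r
# Toy data with a missing value, invented for illustration only
toy <- data.frame(
  Country = c("A", "A", "A", "A", "B", "B", "B"),
  X2      = c(0.22, 0.5, NA, 0.5, 1.5, 1.5, 1)
)

# Within-country change, first year of each country set to 0
toy$X2_change <- ave(toy$X2, toy$Country, FUN = function(x) c(0, diff(x)))

# sign() gives 1 for an increase, -1 for a decrease, 0 for no change;
# any difference that involves an NA stays NA
toy$X2_dir <- sign(toy$X2_change)

print(toy)
```

Note that both the NA row and the row after it come out as NA, since neither difference can be computed. Whether those should instead be recoded as 0 depends on the analysis.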

hope this is helpful,

Dan

--
Daniel Nordlund
Port Townsend, WA  USA


On 7/20/2019 4:17 PM, Jim Lemon wrote:


Re: Capturing positive and negative changes using R

F86
In reply to this post by Richard O'Keefe-2
Thank you very much, this was very helpful. Jim’s and Daniel’s code was spot on.

Indeed, as Jeff Newmiller pointed out, one of the problems was that I assumed R would recognise the comma as the decimal mark. I did as Richard O’Keefe suggested and the problem was gone.

Thanks again!

Faradj

> On 21 Jul 2019, at 03:36, Richard O'Keefe <[hidden email]> wrote: