Case statement in sqldf

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Case statement in sqldf

Mangalani Peter Makananisa
Hi all,

I am trying to create a new  variable called Fiscal Year (FY) using case expression in sqldf  and I am getting a null FY , see the code below .

Please advise me as to how I can do this mutation.

  library(zoo)
  library(lubridate)
  library(stringr)
  library(RH2)
  library(sqldf)

cr$ReportDate = as.Date(cr$ReportDate, format ='%Y-%m-%d')

> cr2 =  sqldf(" select ReportDate
+                      ,  case
+                        when ReportDate between  '2012-04-01'  and  '2013-03-31'
+                        then '2012_13'
+                        when  ReportDate between '2013-04-01'  and  '2014-03-31'
+                        then '2013_14'
+                        when  ReportDate between  '2014-04-01'  and  '2015-03-31'
+                        then'201415'
+                        when ReportDate between '2015-04-01'  and  '2016-03-31'
+                        then '2015_16'
+                        when ReportDate between '2016-04-01'  and  '2017-03-31'
+                        then '2016_17'
+                        when ReportDate between '2017-04-01'  and  '2018-03-3'
+                        then '2017_18' else null
+                        end as FY
+               from cr
+              where  ReportDate  >=  '2012-04-01'
+              ")

Thanking you in advance

Kind regards,

Mangalani Peter Makananisa (0005786)
South African Revenue Service (SARS)
Specialist: Statistical Support
TCEI_OR (Head Office)
Tell: +272 422 7357, Cell: +2782 456 4669

Please Note: This email and its contents are subject to our email legal notice which can be viewed at http://www.sars.gov.za/Pages/Email-disclaimer.aspx

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Case statement in sqldf

David Winsemius

> On Sep 11, 2017, at 1:05 AM, Mangalani Peter Makananisa <[hidden email]> wrote:
>
> Hi all,
>
> I am trying to create a new  variable called Fiscal Year (FY) using case expression in sqldf  and I am getting a null FY , see the code below .
>
> Please advise me as to how I can do this mutation.
>
>  library(zoo)
>  library(lubridate)
>  library(stringr)
>  library(RH2)
>  library(sqldf)
>
> cr$ReportDate = as.Date(cr$ReportDate, format ='%Y-%m-%d')
>
>> cr2 =  sqldf(" select ReportDate
> +                      ,  case
> +                        when ReportDate between  '2012-04-01'  and  '2013-03-31'
> +                        then '2012_13'
> +                        when  ReportDate between '2013-04-01'  and  '2014-03-31'
> +                        then '2013_14'
> +                        when  ReportDate between  '2014-04-01'  and  '2015-03-31'
> +                        then'201415'
> +                        when ReportDate between '2015-04-01'  and  '2016-03-31'
> +                        then '2015_16'
> +                        when ReportDate between '2016-04-01'  and  '2017-03-31'
> +                        then '2016_17'
> +                        when ReportDate between '2017-04-01'  and  '2018-03-3'
> +                        then '2017_18' else null
> +                        end as FY
> +               from cr
> +              where  ReportDate  >=  '2012-04-01'
> +              ")

There was no cr object in any of the package I loaded although `lubridate` and `stringr` appear unnecessary and were omitted. I get no error with your code using this test object:

 cr <- data.frame(ReportDate = seq(as.Date("1970-01-01"), as.Date("2020-01-01"), by="1 year" ))

> cr2 =  sqldf(" select ReportDate
+                      ,  case
+                        when ReportDate between  '2012-04-01'  and  '2013-03-31'
+                        then '2012_13'
+                        when  ReportDate between '2013-04-01'  and  '2014-03-31'
+                        then '2013_14'
+                        when  ReportDate between  '2014-04-01'  and  '2015-03-31'
+                        then'201415'
+                        when ReportDate between '2015-04-01'  and  '2016-03-31'
+                        then '2015_16'
+                        when ReportDate between '2016-04-01'  and  '2017-03-31'
+                        then '2016_17'
+                        when ReportDate between '2017-04-01'  and  '2018-03-3'
+                        then '2017_18' else null
+                        end as FY
+               from cr
+              where  ReportDate  >=  '2012-04-01'
+              ")
>
> str(cr2)
'data.frame': 8 obs. of  2 variables:
 $ ReportDate: Date, format: "2013-01-01" "2014-01-01" "2015-01-01" ...
 $ FY        : chr  "2012_13" "2013_14" "201415" "2015_16" ...

>
>
> Thanking you in advance
>
> Kind regards,
>
> Mangalani Peter Makananisa (0005786)
> South African Revenue Service (SARS)
> Specialist: Statistical Support
> TCEI_OR (Head Office)
> Tell: +272 422 7357, Cell: +2782 456 4669
>
> Please Note: This email and its contents are subject to our email legal notice which can be viewed at http://www.sars.gov.za/Pages/Email-disclaimer.aspx
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'   -Gehm's Corollary to Clarke's Third Law

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Case statement in sqldf

Gabor Grothendieck
In reply to this post by Mangalani Peter Makananisa
2018-03-3 in your code should be 2018-03-31.

The line
    then'201415'
needs to be fixed.

When posting please provide minimal self-contained examples. There was
no input provided and library statements not relevant to the posted
code were included.

Fixing the invalid date and bad line, getting rid of those library
statements that are unnecessary and providing some test input, it
works for me for the input shown.

(Note that it would NOT work if we omitted library(RH2) since the
default sqlite back end does not have date types and does not know
that an R date -- which is sent to sqlite as the number of days since
1970-01-01 -- corresponds to a particular character string; however,
the H2 database does have date types.  See FAQ #4 on the sqldf github
home page for more info.
    https://github.com/ggrothendieck/sqldf
)

This works:

library(sqldf)
library(RH2)

cr <- data.frame(ReportDate = as.Date("2017-09-11")) # input

cr2 =  sqldf(" select ReportDate
                     ,  case
                       when ReportDate between  '2012-04-01'  and  '2013-03-31'
                       then '2012_13'
                       when  ReportDate between '2013-04-01'  and  '2014-03-31'
                       then '2013_14'
                       when  ReportDate between  '2014-04-01'  and  '2015-03-31'
                       then '2014_15'
                       when ReportDate between '2015-04-01'  and  '2016-03-31'
                       then '2015_16'
                       when ReportDate between '2016-04-01'  and  '2017-03-31'
                       then '2016_17'
                       when ReportDate between '2017-04-01'  and  '2018-03-31'
                      then '2017_18' else null
             end as FY
             from cr
             where  ReportDate  >=  '2012-04-01'
             ")

giving:

  > cr2
    ReportDate      FY
  1 2017-09-11 2017_18

Note that using as.yearqtr from zoo this alternative could be used:

library(zoo)
cr <- data.frame(ReportDate = as.Date("2017-09-11")) # input

fy <- as.integer(as.yearqtr(cr$ReportDate) + 3/4)
transform(cr, FY = paste0(fy-1, "_", fy %% 100))

giving:

  ReportDate      FY
1 2017-09-11 2017_18


On Mon, Sep 11, 2017 at 4:05 AM, Mangalani Peter Makananisa
<[hidden email]> wrote:

> Hi all,
>
>
>
> I am trying to create a new  variable called Fiscal Year (FY) using case
> expression in sqldf  and I am getting a null FY , see the code below .
>
>
>> +                        then '2017_18' else null>> South African Revenue Service (SARS)>> Specialist: Statistical Support>> TCEI_OR (Head Office)>> Tell: +272 422 7357, Cell: +2782 456 4669>> http://www.sars.gov.za/Pages/Email-disclaimer.aspxemail: ggrothendieck at gmail.with
> Please advise me as to how I can do this mutation.
>
>
>
>   library(zoo)
>
>   library(lubridate)
>
>   library(stringr)
>
>   library(RH2)
>
>   library(sqldf)
>
>
>
> cr$ReportDate = as.Date(cr$ReportDate, format ='%Y-%m-%d')
>
>
>
>> cr2 =  sqldf(" select ReportDate
>
> +                      ,  case
>
> +                        when ReportDate between  '2012-04-01'  and
> '2013-03-31'
>
> +                        then '2012_13'
>
> +                        when  ReportDate between '2013-04-01'  and
> '2014-03-31'
>
> +                        then '2013_14'
>
> +                        when  ReportDate between  '2014-04-01'  and
> '2015-03-31'
>
> +                        then'201415'
>
> +                        when ReportDate between '2015-04-01'  and
> '2016-03-31'
>
> +                        then '2015_16'
>
> +                        when ReportDate between '2016-04-01'  and
> '2017-03-31'
>
> +                        then '2016_17'
>
> +                        when ReportDate between '2017-04-01'  and
> '2018-03-3'
>


> +                        end as FY
>
> +               from cr
>
> +              where  ReportDate  >=  '2012-04-01'
>
> +              ")
>
>
>
> Thanking you in advance
>
>
>
> Kind regards,
>
>
>
> Mangalani Peter Makananisa (0005786)
>








>
>
>
>
> Disclaimer
>
> Please Note: This email and its contents are subject to our email legal
> notice which can be viewed at




--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Case statement in sqldf

Mangalani Peter Makananisa
In reply to this post by David Winsemius
Thanks D,

I will work on the solution you gave and give feedback.

-----Original Message-----
From: David Winsemius [mailto:[hidden email]]
Sent: 11 September 2017 05:19 PM
To: Mangalani Peter Makananisa
Cc: [hidden email]
Subject: Re: [R] Case statement in sqldf


> On Sep 11, 2017, at 1:05 AM, Mangalani Peter Makananisa <[hidden email]> wrote:
>
> Hi all,
>
> I am trying to create a new  variable called Fiscal Year (FY) using case expression in sqldf  and I am getting a null FY , see the code below .
>
> Please advise me as to how I can do this mutation.
>
>  library(zoo)
>  library(lubridate)
>  library(stringr)
>  library(RH2)
>  library(sqldf)
>
> cr$ReportDate = as.Date(cr$ReportDate, format ='%Y-%m-%d')
>
>> cr2 =  sqldf(" select ReportDate
> +                      ,  case
> +                        when ReportDate between  '2012-04-01'  and  '2013-03-31'
> +                        then '2012_13'
> +                        when  ReportDate between '2013-04-01'  and  '2014-03-31'
> +                        then '2013_14'
> +                        when  ReportDate between  '2014-04-01'  and  '2015-03-31'
> +                        then'201415'
> +                        when ReportDate between '2015-04-01'  and  '2016-03-31'
> +                        then '2015_16'
> +                        when ReportDate between '2016-04-01'  and  '2017-03-31'
> +                        then '2016_17'
> +                        when ReportDate between '2017-04-01'  and  '2018-03-3'
> +                        then '2017_18' else null
> +                        end as FY
> +               from cr
> +              where  ReportDate  >=  '2012-04-01'
> +              ")

There was no cr object in any of the package I loaded although `lubridate` and `stringr` appear unnecessary and were omitted. I get no error with your code using this test object:

 cr <- data.frame(ReportDate = seq(as.Date("1970-01-01"), as.Date("2020-01-01"), by="1 year" ))

> cr2 =  sqldf(" select ReportDate
+                      ,  case
+                        when ReportDate between  '2012-04-01'  and  '2013-03-31'
+                        then '2012_13'
+                        when  ReportDate between '2013-04-01'  and  '2014-03-31'
+                        then '2013_14'
+                        when  ReportDate between  '2014-04-01'  and  '2015-03-31'
+                        then'201415'
+                        when ReportDate between '2015-04-01'  and  '2016-03-31'
+                        then '2015_16'
+                        when ReportDate between '2016-04-01'  and  '2017-03-31'
+                        then '2016_17'
+                        when ReportDate between '2017-04-01'  and  '2018-03-3'
+                        then '2017_18' else null
+                        end as FY
+               from cr
+              where  ReportDate  >=  '2012-04-01'
+              ")
>
> str(cr2)
'data.frame': 8 obs. of  2 variables:
 $ ReportDate: Date, format: "2013-01-01" "2014-01-01" "2015-01-01" ...
 $ FY        : chr  "2012_13" "2013_14" "201415" "2015_16" ...

>
>
> Thanking you in advance
>
> Kind regards,
>
> Mangalani Peter Makananisa (0005786)
> South African Revenue Service (SARS)
> Specialist: Statistical Support
> TCEI_OR (Head Office)
> Tell: +272 422 7357, Cell: +2782 456 4669
>
> Please Note: This email and its contents are subject to our email
> legal notice which can be viewed at
> http://www.sars.gov.za/Pages/Email-disclaimer.aspx
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'   -Gehm's Corollary to Clarke's Third Law

Please Note: This email and its contents are subject to our email legal notice which can be viewed at http://www.sars.gov.za/Pages/Email-disclaimer.aspx

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Case statement in sqldf

Mangalani Peter Makananisa
In reply to this post by Gabor Grothendieck
Thank you very much,

I will work on it

-----Original Message-----
From: Gabor Grothendieck [mailto:[hidden email]]
Sent: 11 September 2017 06:50 PM
To: Mangalani Peter Makananisa
Cc: [hidden email]
Subject: Re: Case statement in sqldf

2018-03-3 in your code should be 2018-03-31.

The line
    then'201415'
needs to be fixed.

When posting please provide minimal self-contained examples. There was no input provided and library statements not relevant to the posted code were included.

Fixing the invalid date and bad line, getting rid of those library statements that are unnecessary and providing some test input, it works for me for the input shown.

(Note that it would NOT work if we omitted library(RH2) since the default sqlite back end does not have date types and does not know that an R date -- which is sent to sqlite as the number of days since
1970-01-01 -- corresponds to a particular character string; however, the H2 database does have date types.  See FAQ #4 on the sqldf github home page for more info.
    https://github.com/ggrothendieck/sqldf
)

This works:

library(sqldf)
library(RH2)

cr <- data.frame(ReportDate = as.Date("2017-09-11")) # input

cr2 =  sqldf(" select ReportDate
                     ,  case
                       when ReportDate between  '2012-04-01'  and  '2013-03-31'
                       then '2012_13'
                       when  ReportDate between '2013-04-01'  and  '2014-03-31'
                       then '2013_14'
                       when  ReportDate between  '2014-04-01'  and  '2015-03-31'
                       then '2014_15'
                       when ReportDate between '2015-04-01'  and  '2016-03-31'
                       then '2015_16'
                       when ReportDate between '2016-04-01'  and  '2017-03-31'
                       then '2016_17'
                       when ReportDate between '2017-04-01'  and  '2018-03-31'
                      then '2017_18' else null
             end as FY
             from cr
             where  ReportDate  >=  '2012-04-01'
             ")

giving:

  > cr2
    ReportDate      FY
  1 2017-09-11 2017_18

Note that using as.yearqtr from zoo this alternative could be used:

library(zoo)
cr <- data.frame(ReportDate = as.Date("2017-09-11")) # input

fy <- as.integer(as.yearqtr(cr$ReportDate) + 3/4) transform(cr, FY = paste0(fy-1, "_", fy %% 100))

giving:

  ReportDate      FY
1 2017-09-11 2017_18


On Mon, Sep 11, 2017 at 4:05 AM, Mangalani Peter Makananisa <[hidden email]> wrote:

> Hi all,
>
>
>
> I am trying to create a new  variable called Fiscal Year (FY) using
> case expression in sqldf  and I am getting a null FY , see the code below .
>
>
>> +                        then '2017_18' else null>> South African
>> + Revenue Service (SARS)>> Specialist: Statistical Support>> TCEI_OR
>> + (Head Office)>> Tell: +272 422 7357, Cell: +2782 456 4669>>
>> + http://www.sars.gov.za/Pages/Email-disclaimer.aspxemail: 
>> + ggrothendieck at gmail.with
> Please advise me as to how I can do this mutation.
>
>
>
>   library(zoo)
>
>   library(lubridate)
>
>   library(stringr)
>
>   library(RH2)
>
>   library(sqldf)
>
>
>
> cr$ReportDate = as.Date(cr$ReportDate, format ='%Y-%m-%d')
>
>
>
>> cr2 =  sqldf(" select ReportDate
>
> +                      ,  case
>
> +                        when ReportDate between  '2012-04-01'  and
> '2013-03-31'
>
> +                        then '2012_13'
>
> +                        when  ReportDate between '2013-04-01'  and
> '2014-03-31'
>
> +                        then '2013_14'
>
> +                        when  ReportDate between  '2014-04-01'  and
> '2015-03-31'
>
> +                        then'201415'
>
> +                        when ReportDate between '2015-04-01'  and
> '2016-03-31'
>
> +                        then '2015_16'
>
> +                        when ReportDate between '2016-04-01'  and
> '2017-03-31'
>
> +                        then '2016_17'
>
> +                        when ReportDate between '2017-04-01'  and
> '2018-03-3'
>


> +                        end as FY
>
> +               from cr
>
> +              where  ReportDate  >=  '2012-04-01'
>
> +              ")
>
>
>
> Thanking you in advance
>
>
>
> Kind regards,
>
>
>
> Mangalani Peter Makananisa (0005786)
>








>
>
>
>
> Disclaimer
>
> Please Note: This email and its contents are subject to our email
> legal notice which can be viewed at




--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP

Please Note: This email and its contents are subject to our email legal notice which can be viewed at http://www.sars.gov.za/Pages/Email-disclaimer.aspx

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.