Lahman Baseball Data Using R DBI Package

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Lahman Baseball Data Using R DBI Package

Philip
I’m trying to pull data from one table (batting) in the Lahman Baseball database.  Notice X2B for doubles and X3B for triples – fourth and fifth from the right.  

The dbGetQuery function runs fine when I leave there two out but I get error messages (in red) when I include 2B/3B or X2B/X3B.

Can anyone give me some direction?

Thanks,
Philip Heinrich
***************************************************************************************************************************************************
tail(dbReadTable(Lahman,"batting"))

ID                       playerID      yearID    stint teamID team_ID lgID   G        G_batting   AB     R     H       X2B    X3B   HR   RBI   SB
107414 107414 yastrmi01      2019       1       SFN       2920      NL     107        NA          371   64  101      22       3     21     55    2
107416 107416 yelicch01      2019        1       MIL       2911      NL     130        NA          489 100  161      29       3     44    97   30
107419 107419 youngal01     2019       1       ARI        2896      NL       17        NA            25     1      1        0       0       0      0     0
107420 107420 zagunma01   2019        1      CHN       2901      NL      30        NA             36     2      9        3       0      0       5     0
107422 107422 zavalse01      2019        1      CHA       2900      AL        5        NA             12     1      1        0       0      0       0     0
107427 107427 zimmery01     2019        1      WAS      2925      NL      52        NA           171   20    44        9      0      6      27     0
107428 107428 zobribe01       2019        1      CHN      2901      NL      47        NA           150   24   39         5      0      1      17     0
107429 107429 zuninmi01       2019        1      TBA       2922      AL      90        NA           266    30  44       10      1      9      32     0


Hack11Batting <- dbGetQuery(Lahman,"SELECT playerID,yearID,AB,R,H,2B,3B,HR,
                            RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP FROM batting
                            WHERE yearID = 2018 AND AB >99")
Error: unrecognized token: "2B"

 Hack11Batting <- dbGetQuery(Lahman,"SELECT playerID,yearID,AB,R,H,X2B,X3B,HR,
                            RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP FROM batting
                            WHERE yearID = 2018 AND AB >99")
Error: no such column: X2B

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Lahman Baseball Data Using R DBI Package

Bill Dunlap-2
Have you tried putting double quotes around 2B and 3B:  "...2B, 3B, ..." ->
"...\"2B\",\"3B\",..."?

-Bill

On Fri, Oct 2, 2020 at 3:49 PM Philip <[hidden email]> wrote:

> I’m trying to pull data from one table (batting) in the Lahman Baseball
> database.  Notice X2B for doubles and X3B for triples – fourth and fifth
> from the right.
>
> The dbGetQuery function runs fine when I leave there two out but I get
> error messages (in red) when I include 2B/3B or X2B/X3B.
>
> Can anyone give me some direction?
>
> Thanks,
> Philip Heinrich
>
> ***************************************************************************************************************************************************
> tail(dbReadTable(Lahman,"batting"))
>
> ID                       playerID      yearID    stint teamID team_ID
> lgID   G        G_batting   AB     R     H       X2B    X3B   HR   RBI   SB
> 107414 107414 yastrmi01      2019       1       SFN       2920      NL
>  107        NA          371   64  101      22       3     21     55    2
> 107416 107416 yelicch01      2019        1       MIL       2911      NL
>  130        NA          489 100  161      29       3     44    97   30
> 107419 107419 youngal01     2019       1       ARI        2896      NL
>    17        NA            25     1      1        0       0       0      0
>    0
> 107420 107420 zagunma01   2019        1      CHN       2901      NL
> 30        NA             36     2      9        3       0      0       5
>  0
> 107422 107422 zavalse01      2019        1      CHA       2900      AL
>     5        NA             12     1      1        0       0      0
>  0     0
> 107427 107427 zimmery01     2019        1      WAS      2925      NL
> 52        NA           171   20    44        9      0      6      27     0
> 107428 107428 zobribe01       2019        1      CHN      2901      NL
>   47        NA           150   24   39         5      0      1      17     0
> 107429 107429 zuninmi01       2019        1      TBA       2922      AL
>   90        NA           266    30  44       10      1      9      32     0
>
>
> Hack11Batting <- dbGetQuery(Lahman,"SELECT playerID,yearID,AB,R,H,2B,3B,HR,
>                             RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP FROM batting
>                             WHERE yearID = 2018 AND AB >99")
> Error: unrecognized token: "2B"
>
>  Hack11Batting <- dbGetQuery(Lahman,"SELECT
> playerID,yearID,AB,R,H,X2B,X3B,HR,
>                             RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP FROM batting
>                             WHERE yearID = 2018 AND AB >99")
> Error: no such column: X2B
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Lahman Baseball Data Using R DBI Package

Bill Dunlap-2
The double quotes are required by SQL if a name is not of the form
letter-followed-by-any-number-of-letters-or-numbers or if the name is a SQL
keyword like 'where' or 'select'.  If you are doing this from a function,
you may as well quote all the names.

-Bill

On Fri, Oct 2, 2020 at 6:18 PM Philip <[hidden email]> wrote:

> The \”2B\” worked.  Have no idea why.  Can you point me somewhere that can
> explain this to me.
>
> Thanks,
> Philip
>
> *From:* Bill Dunlap
> *Sent:* Friday, October 2, 2020 3:54 PM
> *To:* Philip
> *Cc:* r-help
> *Subject:* Re: [R] Lahman Baseball Data Using R DBI Package
>
> Have you tried putting double quotes around 2B and 3B:  "...2B, 3B, ..."
> -> "...\"2B\",\"3B\",..."?
>
> -Bill
>
> On Fri, Oct 2, 2020 at 3:49 PM Philip <[hidden email]> wrote:
>
>> I’m trying to pull data from one table (batting) in the Lahman Baseball
>> database.  Notice X2B for doubles and X3B for triples – fourth and fifth
>> from the right.
>>
>> The dbGetQuery function runs fine when I leave there two out but I get
>> error messages (in red) when I include 2B/3B or X2B/X3B.
>>
>> Can anyone give me some direction?
>>
>> Thanks,
>> Philip Heinrich
>>
>> ***************************************************************************************************************************************************
>> tail(dbReadTable(Lahman,"batting"))
>>
>> ID                       playerID      yearID    stint teamID team_ID
>> lgID   G        G_batting   AB     R     H       X2B    X3B   HR   RBI   SB
>> 107414 107414 yastrmi01      2019       1       SFN       2920
>> NL     107        NA          371   64  101      22       3     21
>> 55    2
>> 107416 107416 yelicch01      2019        1       MIL       2911
>> NL     130        NA          489 100  161      29       3     44    97   30
>> 107419 107419 youngal01     2019       1       ARI        2896
>> NL       17        NA            25     1      1        0       0
>> 0      0     0
>> 107420 107420 zagunma01   2019        1      CHN       2901      NL
>> 30        NA             36     2      9        3       0      0
>> 5     0
>> 107422 107422 zavalse01      2019        1      CHA       2900
>> AL        5        NA             12     1      1        0       0
>> 0       0     0
>> 107427 107427 zimmery01     2019        1      WAS      2925      NL
>> 52        NA           171   20    44        9      0      6      27     0
>> 107428 107428 zobribe01       2019        1      CHN      2901
>> NL      47        NA           150   24   39         5      0      1
>> 17     0
>> 107429 107429 zuninmi01       2019        1      TBA       2922
>> AL      90        NA           266    30  44       10      1      9
>> 32     0
>>
>>
>> Hack11Batting <- dbGetQuery(Lahman,"SELECT
>> playerID,yearID,AB,R,H,2B,3B,HR,
>>                             RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP FROM
>> batting
>>                             WHERE yearID = 2018 AND AB >99")
>> Error: unrecognized token: "2B"
>>
>> Hack11Batting <- dbGetQuery(Lahman,"SELECT
>> playerID,yearID,AB,R,H,X2B,X3B,HR,
>>                             RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP FROM
>> batting
>>                             WHERE yearID = 2018 AND AB >99")
>> Error: no such column: X2B
>>
>>         [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Lahman Baseball Data Using R DBI Package

R help mailing list-2
Hi Philip,

You've probably realized by now that R doesn't like column names that
start with a number. If you try to access an R-dataframe column named
2B or 3B with the familiar "$" notation, you'll get an error:

> library(DBI)
> library(RSQLite)
> con2 <- dbConnect(SQLite(), "~/R_Dir/lahmansbaseballdb.sqlite")
> Hack12Batting <- dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600 ORDER BY AB DESC")
> Hack12Batting$AB
 [1] 664 661 639 632 632 632 626 623 620 618 617 613 606 605 602
> Hack12Batting$3B
Error: unexpected numeric constant in "Hack12Batting$3"

How to handle? You can rename columns on-the-fly by piping. See
reference [1] and use either library(magrittr) or library(dplyr) or a
combination thereof:

library(magrittr)
dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
ORDER BY AB DESC") %>% set_colnames(make.names(colnames(.)))

#OR one of the following:

library(dplyr)
dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
ORDER BY AB DESC") %>% rename(X2B = `2B`, X3B = `3B`)

library(dplyr)
dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
ORDER BY AB DESC") %>% `colnames<-`(make.names(colnames(.)))

library(dplyr)
dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
ORDER BY AB DESC") %>% magrittr::set_colnames(make.names(colnames(.)))

Best, Bill.

W. Michels, Ph.D.

[1] https://stackoverflow.com/questions/28100780/use-with-replacement-functions-like-colnames










On Fri, Oct 2, 2020 at 7:34 PM Bill Dunlap <[hidden email]> wrote:

>
> The double quotes are required by SQL if a name is not of the form
> letter-followed-by-any-number-of-letters-or-numbers or if the name is a SQL
> keyword like 'where' or 'select'.  If you are doing this from a function,
> you may as well quote all the names.
>
> -Bill
>
> On Fri, Oct 2, 2020 at 6:18 PM Philip <[hidden email]> wrote:
>
> > The \”2B\” worked.  Have no idea why.  Can you point me somewhere that can
> > explain this to me.
> >
> > Thanks,
> > Philip
> >
> > *From:* Bill Dunlap
> > *Sent:* Friday, October 2, 2020 3:54 PM
> > *To:* Philip
> > *Cc:* r-help
> > *Subject:* Re: [R] Lahman Baseball Data Using R DBI Package
> >
> > Have you tried putting double quotes around 2B and 3B:  "...2B, 3B, ..."
> > -> "...\"2B\",\"3B\",..."?
> >
> > -Bill
> >
> > On Fri, Oct 2, 2020 at 3:49 PM Philip <[hidden email]> wrote:
> >
> >> I’m trying to pull data from one table (batting) in the Lahman Baseball
> >> database.  Notice X2B for doubles and X3B for triples – fourth and fifth
> >> from the right.
> >>
> >> The dbGetQuery function runs fine when I leave there two out but I get
> >> error messages (in red) when I include 2B/3B or X2B/X3B.
> >>
> >> Can anyone give me some direction?
> >>
> >> Thanks,
> >> Philip Heinrich
> >>
> >> ***************************************************************************************************************************************************
> >> tail(dbReadTable(Lahman,"batting"))
> >>
> >> ID                       playerID      yearID    stint teamID team_ID
> >> lgID   G        G_batting   AB     R     H       X2B    X3B   HR   RBI   SB
> >> 107414 107414 yastrmi01      2019       1       SFN       2920
> >> NL     107        NA          371   64  101      22       3     21
> >> 55    2
> >> 107416 107416 yelicch01      2019        1       MIL       2911
> >> NL     130        NA          489 100  161      29       3     44    97   30
> >> 107419 107419 youngal01     2019       1       ARI        2896
> >> NL       17        NA            25     1      1        0       0
> >> 0      0     0
> >> 107420 107420 zagunma01   2019        1      CHN       2901      NL
> >> 30        NA             36     2      9        3       0      0
> >> 5     0
> >> 107422 107422 zavalse01      2019        1      CHA       2900
> >> AL        5        NA             12     1      1        0       0
> >> 0       0     0
> >> 107427 107427 zimmery01     2019        1      WAS      2925      NL
> >> 52        NA           171   20    44        9      0      6      27     0
> >> 107428 107428 zobribe01       2019        1      CHN      2901
> >> NL      47        NA           150   24   39         5      0      1
> >> 17     0
> >> 107429 107429 zuninmi01       2019        1      TBA       2922
> >> AL      90        NA           266    30  44       10      1      9
> >> 32     0
> >>
> >>
> >> Hack11Batting <- dbGetQuery(Lahman,"SELECT
> >> playerID,yearID,AB,R,H,2B,3B,HR,
> >>                             RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP FROM
> >> batting
> >>                             WHERE yearID = 2018 AND AB >99")
> >> Error: unrecognized token: "2B"
> >>
> >> Hack11Batting <- dbGetQuery(Lahman,"SELECT
> >> playerID,yearID,AB,R,H,X2B,X3B,HR,
> >>                             RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP FROM
> >> batting
> >>                             WHERE yearID = 2018 AND AB >99")
> >> Error: no such column: X2B
> >>
> >>         [[alternative HTML version deleted]]
> >>
> >> ______________________________________________
> >> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Lahman Baseball Data Using R DBI Package

Bill Dunlap-2
This is really a feature of SQL, not R.  SQL requires that you double quote
column names that start with numbers, include spaces, etc., or that are SQL
key words.  E.g.,

> d <- data.frame(Order=c("sit","stay","heel"),
Where=c("here","there","there"), From=c("me","me","you"))
> sqldf::sqldf("select Order,Where,From from d WHERE From=\"me\"")
Error: near "Order": syntax error
> sqldf::sqldf("select \"Order\",\"Where\",\"From\" from d Where
\"From\"=\"me\"")
  Order Where From
1   sit  here   me
2  stay there   me

You may as well double quote all column names in SQL queries.

-Bill

On Wed, Oct 7, 2020 at 9:57 PM William Michels <[hidden email]>
wrote:

> Hi Philip,
>
> You've probably realized by now that R doesn't like column names that
> start with a number. If you try to access an R-dataframe column named
> 2B or 3B with the familiar "$" notation, you'll get an error:
>
> > library(DBI)
> > library(RSQLite)
> > con2 <- dbConnect(SQLite(), "~/R_Dir/lahmansbaseballdb.sqlite")
> > Hack12Batting <- dbGetQuery(con2,"SELECT * FROM batting WHERE yearID =
> 2018 AND AB >600 ORDER BY AB DESC")
> > Hack12Batting$AB
>  [1] 664 661 639 632 632 632 626 623 620 618 617 613 606 605 602
> > Hack12Batting$3B
> Error: unexpected numeric constant in "Hack12Batting$3"
>
> How to handle? You can rename columns on-the-fly by piping. See
> reference [1] and use either library(magrittr) or library(dplyr) or a
> combination thereof:
>
> library(magrittr)
> dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
> ORDER BY AB DESC") %>% set_colnames(make.names(colnames(.)))
>
> #OR one of the following:
>
> library(dplyr)
> dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
> ORDER BY AB DESC") %>% rename(X2B = `2B`, X3B = `3B`)
>
> library(dplyr)
> dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
> ORDER BY AB DESC") %>% `colnames<-`(make.names(colnames(.)))
>
> library(dplyr)
> dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
> ORDER BY AB DESC") %>% magrittr::set_colnames(make.names(colnames(.)))
>
> Best, Bill.
>
> W. Michels, Ph.D.
>
> [1]
> https://stackoverflow.com/questions/28100780/use-with-replacement-functions-like-colnames
>
>
>
>
>
>
>
>
>
>
> On Fri, Oct 2, 2020 at 7:34 PM Bill Dunlap <[hidden email]>
> wrote:
> >
> > The double quotes are required by SQL if a name is not of the form
> > letter-followed-by-any-number-of-letters-or-numbers or if the name is a
> SQL
> > keyword like 'where' or 'select'.  If you are doing this from a function,
> > you may as well quote all the names.
> >
> > -Bill
> >
> > On Fri, Oct 2, 2020 at 6:18 PM Philip <[hidden email]> wrote:
> >
> > > The \”2B\” worked.  Have no idea why.  Can you point me somewhere that
> can
> > > explain this to me.
> > >
> > > Thanks,
> > > Philip
> > >
> > > *From:* Bill Dunlap
> > > *Sent:* Friday, October 2, 2020 3:54 PM
> > > *To:* Philip
> > > *Cc:* r-help
> > > *Subject:* Re: [R] Lahman Baseball Data Using R DBI Package
> > >
> > > Have you tried putting double quotes around 2B and 3B:  "...2B, 3B,
> ..."
> > > -> "...\"2B\",\"3B\",..."?
> > >
> > > -Bill
> > >
> > > On Fri, Oct 2, 2020 at 3:49 PM Philip <[hidden email]> wrote:
> > >
> > >> I’m trying to pull data from one table (batting) in the Lahman
> Baseball
> > >> database.  Notice X2B for doubles and X3B for triples – fourth and
> fifth
> > >> from the right.
> > >>
> > >> The dbGetQuery function runs fine when I leave there two out but I get
> > >> error messages (in red) when I include 2B/3B or X2B/X3B.
> > >>
> > >> Can anyone give me some direction?
> > >>
> > >> Thanks,
> > >> Philip Heinrich
> > >>
> > >>
> ***************************************************************************************************************************************************
> > >> tail(dbReadTable(Lahman,"batting"))
> > >>
> > >> ID                       playerID      yearID    stint teamID team_ID
> > >> lgID   G        G_batting   AB     R     H       X2B    X3B   HR
>  RBI   SB
> > >> 107414 107414 yastrmi01      2019       1       SFN       2920
> > >> NL     107        NA          371   64  101      22       3     21
> > >> 55    2
> > >> 107416 107416 yelicch01      2019        1       MIL       2911
> > >> NL     130        NA          489 100  161      29       3     44
> 97   30
> > >> 107419 107419 youngal01     2019       1       ARI        2896
> > >> NL       17        NA            25     1      1        0       0
> > >> 0      0     0
> > >> 107420 107420 zagunma01   2019        1      CHN       2901      NL
> > >> 30        NA             36     2      9        3       0      0
> > >> 5     0
> > >> 107422 107422 zavalse01      2019        1      CHA       2900
> > >> AL        5        NA             12     1      1        0       0
> > >> 0       0     0
> > >> 107427 107427 zimmery01     2019        1      WAS      2925      NL
> > >> 52        NA           171   20    44        9      0      6      27
>    0
> > >> 107428 107428 zobribe01       2019        1      CHN      2901
> > >> NL      47        NA           150   24   39         5      0      1
> > >> 17     0
> > >> 107429 107429 zuninmi01       2019        1      TBA       2922
> > >> AL      90        NA           266    30  44       10      1      9
> > >> 32     0
> > >>
> > >>
> > >> Hack11Batting <- dbGetQuery(Lahman,"SELECT
> > >> playerID,yearID,AB,R,H,2B,3B,HR,
> > >>                             RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP FROM
> > >> batting
> > >>                             WHERE yearID = 2018 AND AB >99")
> > >> Error: unrecognized token: "2B"
> > >>
> > >> Hack11Batting <- dbGetQuery(Lahman,"SELECT
> > >> playerID,yearID,AB,R,H,X2B,X3B,HR,
> > >>                             RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP FROM
> > >> batting
> > >>                             WHERE yearID = 2018 AND AB >99")
> > >> Error: no such column: X2B
> > >>
> > >>         [[alternative HTML version deleted]]
> > >>
> > >> ______________________________________________
> > >> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> > >> https://stat.ethz.ch/mailman/listinfo/r-help
> > >> PLEASE do read the posting guide
> > >> http://www.R-project.org/posting-guide.html
> > >> and provide commented, minimal, self-contained, reproducible code.
> > >>
> > >
> >
> >         [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Lahman Baseball Data Using R DBI Package

Rui Barradas
Hello,


Às 17:26 de 08/10/20, Bill Dunlap escreveu:
> This is really a feature of SQL, not R.  SQL requires that you double quote
> column names that start with numbers, include spaces, etc., or that are SQL
> key words.  


Right, but there's no need to escape the double quotes, just put the SQL
statement between single quotes and it becomes more readable.

sqldf::sqldf('select "Order","Where","From" from d WHERE "From"="me"')


Hope this helps,

Rui Barradas

E.g.,

>
>> d <- data.frame(Order=c("sit","stay","heel"),
> Where=c("here","there","there"), From=c("me","me","you"))
>> sqldf::sqldf("select Order,Where,From from d WHERE From=\"me\"")
> Error: near "Order": syntax error
>> sqldf::sqldf("select \"Order\",\"Where\",\"From\" from d Where
> \"From\"=\"me\"")
>    Order Where From
> 1   sit  here   me
> 2  stay there   me
>
> You may as well double quote all column names in SQL queries.
>
> -Bill
>
> On Wed, Oct 7, 2020 at 9:57 PM William Michels <[hidden email]>
> wrote:
>
>> Hi Philip,
>>
>> You've probably realized by now that R doesn't like column names that
>> start with a number. If you try to access an R-dataframe column named
>> 2B or 3B with the familiar "$" notation, you'll get an error:
>>
>>> library(DBI)
>>> library(RSQLite)
>>> con2 <- dbConnect(SQLite(), "~/R_Dir/lahmansbaseballdb.sqlite")
>>> Hack12Batting <- dbGetQuery(con2,"SELECT * FROM batting WHERE yearID =
>> 2018 AND AB >600 ORDER BY AB DESC")
>>> Hack12Batting$AB
>>   [1] 664 661 639 632 632 632 626 623 620 618 617 613 606 605 602
>>> Hack12Batting$3B
>> Error: unexpected numeric constant in "Hack12Batting$3"
>>
>> How to handle? You can rename columns on-the-fly by piping. See
>> reference [1] and use either library(magrittr) or library(dplyr) or a
>> combination thereof:
>>
>> library(magrittr)
>> dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
>> ORDER BY AB DESC") %>% set_colnames(make.names(colnames(.)))
>>
>> #OR one of the following:
>>
>> library(dplyr)
>> dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
>> ORDER BY AB DESC") %>% rename(X2B = `2B`, X3B = `3B`)
>>
>> library(dplyr)
>> dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
>> ORDER BY AB DESC") %>% `colnames<-`(make.names(colnames(.)))
>>
>> library(dplyr)
>> dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
>> ORDER BY AB DESC") %>% magrittr::set_colnames(make.names(colnames(.)))
>>
>> Best, Bill.
>>
>> W. Michels, Ph.D.
>>
>> [1]
>> https://stackoverflow.com/questions/28100780/use-with-replacement-functions-like-colnames
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On Fri, Oct 2, 2020 at 7:34 PM Bill Dunlap <[hidden email]>
>> wrote:
>>>
>>> The double quotes are required by SQL if a name is not of the form
>>> letter-followed-by-any-number-of-letters-or-numbers or if the name is a
>> SQL
>>> keyword like 'where' or 'select'.  If you are doing this from a function,
>>> you may as well quote all the names.
>>>
>>> -Bill
>>>
>>> On Fri, Oct 2, 2020 at 6:18 PM Philip <[hidden email]> wrote:
>>>
>>>> The \”2B\” worked.  Have no idea why.  Can you point me somewhere that
>> can
>>>> explain this to me.
>>>>
>>>> Thanks,
>>>> Philip
>>>>
>>>> *From:* Bill Dunlap
>>>> *Sent:* Friday, October 2, 2020 3:54 PM
>>>> *To:* Philip
>>>> *Cc:* r-help
>>>> *Subject:* Re: [R] Lahman Baseball Data Using R DBI Package
>>>>
>>>> Have you tried putting double quotes around 2B and 3B:  "...2B, 3B,
>> ..."
>>>> -> "...\"2B\",\"3B\",..."?
>>>>
>>>> -Bill
>>>>
>>>> On Fri, Oct 2, 2020 at 3:49 PM Philip <[hidden email]> wrote:
>>>>
>>>>> I’m trying to pull data from one table (batting) in the Lahman
>> Baseball
>>>>> database.  Notice X2B for doubles and X3B for triples – fourth and
>> fifth
>>>>> from the right.
>>>>>
>>>>> The dbGetQuery function runs fine when I leave there two out but I get
>>>>> error messages (in red) when I include 2B/3B or X2B/X3B.
>>>>>
>>>>> Can anyone give me some direction?
>>>>>
>>>>> Thanks,
>>>>> Philip Heinrich
>>>>>
>>>>>
>> ***************************************************************************************************************************************************
>>>>> tail(dbReadTable(Lahman,"batting"))
>>>>>
>>>>> ID                       playerID      yearID    stint teamID team_ID
>>>>> lgID   G        G_batting   AB     R     H       X2B    X3B   HR
>>   RBI   SB
>>>>> 107414 107414 yastrmi01      2019       1       SFN       2920
>>>>> NL     107        NA          371   64  101      22       3     21
>>>>> 55    2
>>>>> 107416 107416 yelicch01      2019        1       MIL       2911
>>>>> NL     130        NA          489 100  161      29       3     44
>> 97   30
>>>>> 107419 107419 youngal01     2019       1       ARI        2896
>>>>> NL       17        NA            25     1      1        0       0
>>>>> 0      0     0
>>>>> 107420 107420 zagunma01   2019        1      CHN       2901      NL
>>>>> 30        NA             36     2      9        3       0      0
>>>>> 5     0
>>>>> 107422 107422 zavalse01      2019        1      CHA       2900
>>>>> AL        5        NA             12     1      1        0       0
>>>>> 0       0     0
>>>>> 107427 107427 zimmery01     2019        1      WAS      2925      NL
>>>>> 52        NA           171   20    44        9      0      6      27
>>     0
>>>>> 107428 107428 zobribe01       2019        1      CHN      2901
>>>>> NL      47        NA           150   24   39         5      0      1
>>>>> 17     0
>>>>> 107429 107429 zuninmi01       2019        1      TBA       2922
>>>>> AL      90        NA           266    30  44       10      1      9
>>>>> 32     0
>>>>>
>>>>>
>>>>> Hack11Batting <- dbGetQuery(Lahman,"SELECT
>>>>> playerID,yearID,AB,R,H,2B,3B,HR,
>>>>>                              RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP FROM
>>>>> batting
>>>>>                              WHERE yearID = 2018 AND AB >99")
>>>>> Error: unrecognized token: "2B"
>>>>>
>>>>> Hack11Batting <- dbGetQuery(Lahman,"SELECT
>>>>> playerID,yearID,AB,R,H,X2B,X3B,HR,
>>>>>                              RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP FROM
>>>>> batting
>>>>>                              WHERE yearID = 2018 AND AB >99")
>>>>> Error: no such column: X2B
>>>>>
>>>>>          [[alternative HTML version deleted]]
>>>>>
>>>>> ______________________________________________
>>>>> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>>
>>>
>>>          [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.