Creating a simple function


Zachary Lim
Hi,

I'm trying to create a simple function that takes a dataframe as its only argument. I've been using gmodels::CrossTable, but it requires a lot of arguments, e.g.:

#this runs fine
CrossTable(data$col1, data$col2, prop.chisq = FALSE, prop.c = FALSE, prop.t = FALSE, format = "SPSS")

Moreover, I wanted to make it compatible with piping, so I decided to create the following function:

ctab <- function(data) {
  CrossTable(data[,1], data[,2], prop.chisq = FALSE, prop.c = FALSE, prop.t = FALSE, format = "SPSS")
}

When I try to use this function, however, I get the following error:

#this results in 'Error: Must use a vector in `[`, not an object of class matrix.'
data %>% select(col1, col2) %>% ctab()

I tried searching online but couldn't find much about that error (except for in specific and unrelated cases). Moreover, when I created a very simple dataset, it turns out there's no problem:

#this runs fine
data.frame(C1 = c('x','y','x','y'), C2 = c('a','a','b','b')) %>% ctab()


Is this a problem with my function or the data? If it's the data, why does directly calling CrossTable work?

Thanks!

Best,
Zach

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Creating a simple function

Duncan Murdoch-2
On 20/09/2019 11:30 a.m., Zachary Lim wrote:

> Is this a problem with my function or the data? If it's the data, why does directly calling CrossTable work?

Presumably  data %>% select(col1, col2)  isn't giving you a dataframe.
However, you haven't given us a reproducible example, so I can't tell
you what it's doing.  But that's where you should look.
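
One way to follow this suggestion is to look at the class of the object that actually reaches the function. The lines below are only a sketch: the built-in mtcars data stands in for the poster's data, which was never shown, the names piped and piped_tbl are made up for illustration, and the dplyr and tibble packages are assumed to be installed.

```r
library(dplyr)

# mtcars is a stand-in for the original poster's data (assumption)
piped <- mtcars %>% select(cyl, gear)
class(piped)

# The same pipeline, but starting from a tibble instead of a plain data frame
piped_tbl <- tibble::as_tibble(mtcars) %>% select(cyl, gear)
class(piped_tbl)

# inherits() gives a direct yes/no answer about tibble-ness
inherits(piped, "tbl_df")
inherits(piped_tbl, "tbl_df")
```

If the second pattern matches what the real data shows, the input to the pipeline was already a tibble before select() was called.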

Duncan Murdoch


Re: Creating a simple function

Rui Barradas
In reply to this post by Zachary Lim
Hello,

Something like this?


ctab <- function(data) {
   gmodels::CrossTable(as.matrix(data), prop.chisq = FALSE, prop.c = FALSE, prop.t = FALSE, format = "SPSS")
}

mtcars %>% select(cyl, gear) %>% ctab()


Hope this helps,

Rui Barradas


Re: Creating a simple function

Jeff Newmiller
In reply to this post by Duncan Murdoch-2
The dplyr::select function returns a special variety of data.frame called a tibble. The tibble has certain features designed to make it behave consistently when indexing is used. Specifically, the `[` operator always returns a tibble regardless of how many columns are indicated by the column index. This is unlike the conventional data frame which returns a vector when exactly one column is indicated by the column index, or a data.frame if more than one is indicated.

A syntax that consistently yields a column vector with both tibbles and data.frames is

data[[1]]

so

ctab <- function(data) {
   CrossTable(data[[1]], data[[2]], prop.chisq = FALSE, prop.c = FALSE, prop.t = FALSE, format = "SPSS")
}

should work.
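
The indexing difference described above can be seen in a few lines. This is a small sketch rather than the poster's actual data: df and tb are made-up objects, and the tibble package is assumed to be installed.

```r
library(tibble)

# Two small illustrative objects holding the same columns
df <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
tb <- as_tibble(df)

# Single-column `[` indexing: the data frame drops to a bare vector,
# while the tibble stays a one-column tibble
is.data.frame(df[, 1])   # FALSE
is.data.frame(tb[, 1])   # TRUE

# `[[` extracts the column itself from both kinds of object
identical(df[[1]], tb[[1]])   # TRUE
```

Because `[[` behaves the same way for both classes, a function written with it does not need to know which kind of object it was handed.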


Re: Creating a simple function

Duncan Murdoch-2
On 21/09/2019 7:38 a.m., Jeff Newmiller wrote:
> The dplyr::select function returns a special variety of data.frame called a tibble.

I don't think that's always true.  The docs say it returns "An object of
the same class as .data.", and that's what I'm seeing:

 > str(data.frame(a=c(1,1,2,2), b=1:4) %>% subset(a == 1))
'data.frame': 2 obs. of  2 variables:
  $ a: num  1 1
  $ b: int  1 2

But I believe there are other dplyr functions that take dataframes as
input and return tibbles; I just don't know which ones.

Duncan Murdoch


Re: Creating a simple function

Jeff Newmiller
Your use of subset instead of select does not help, but a corrected example does indeed confirm your point.

library(dplyr)

str(data.frame(a=c(1,1,2,2), b=1:4) %>% select(b,a))
## 'data.frame': 4 obs. of 2 variables:
## $ b: int 1 2 3 4
## $ a: num 1 1 2 2

However, the `[` issue is still worth addressing. If that does not fix the problem, then a dput(head(troublesomedata)) from Zachary will be needed to figure out what is actually going on.
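
For anyone unfamiliar with dput, a rough sketch of what is being asked for follows. The troublesomedata object here is invented for illustration, since the real data was never posted; the point is only that dput prints R code that recreates an object, including its class.

```r
# Made-up stand-in for the data that triggers the error (assumption)
troublesomedata <- data.frame(col1 = c("x", "y", "x", "y"),
                              col2 = c("a", "a", "b", "b"))

# Prints an R expression that rebuilds the first rows, class attribute included;
# this printed text is what would be pasted into a reply to the list
dput(head(troublesomedata))

# Round trip: capture the printed expression and evaluate it again
txt <- capture.output(dput(head(troublesomedata)))
rebuilt <- eval(parse(text = txt))
class(rebuilt)
all.equal(rebuilt, head(troublesomedata))
```

The class shown in the dput output would also reveal whether the object is a plain data frame or a tibble.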


Re: Creating a simple function

Duncan Murdoch-2
On 21/09/2019 9:05 a.m., Jeff Newmiller wrote:
> Your use of subset instead of select does not help,

Whoops, sorry.  Thanks for doing the real check.

Duncan
