R equivalent for SQL query

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

R equivalent for SQL query

Steven Raemaekers
Hi,

I have a query which I would like to translate into R, but I do not know how to do it in an easy way.
Assume a data frame has columns A, B and C:

A B C
1 1 3
1 1 4
1 1 5
1 2 6
1 2 7
1 3 8

The query is as follows:

select A, B, count(*)
from data.frame
group by A, B
order by count(*) desc

How do I translate this into R statements in such way that the result is a data frame structured as follows:

A B count(*)
1 1 3
1 2 2
1 3 1

Thanks,

Steven
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: R equivalent for SQL query

Steven Raemaekers
I'm sorry, but I think that's just plain rude. This is exactly the reason why people do not like to ask for help on mailing lists and internet forums.

The help file is full of valuable information, if you know exactly where to look. For someone who doesn't know this, it is rather more complicated.
There are several options in the help file of table, and I do not know exactly how to apply this function or which options to choose.

So, again, can anyone help me to translate this query to R? Thanks!

Regards,

Steven

On 3 apr. 2012, at 21:07, Bert Gunter wrote:

> If reading the help file for the function that does what I think you
> want is not helpful enough, you'll have to seek elsewhere. I expect
> some minimal learning effort.
>
> (?table means type this at the prompt in the GUI).
>
> -- Bert
>
> On Tue, Apr 3, 2012 at 12:04 PM, Steven Raemaekers <[hidden email]> wrote:
>> Thanks. Excuse me for saying this, but that is not very helpful. Can you provide **one**, since that would be enough…:P
>>
>> On 3 apr. 2012, at 21:01, Bert Gunter wrote:
>>
>>> ?table
>>>
>>> There are **many** other ways.
>>>
>>> -- Bert
>>>
>>> On Tue, Apr 3, 2012 at 11:26 AM, Steven Raemaekers <[hidden email]> wrote:
>>>> Hi,
>>>>
>>>> I have a query which I would like to translate into R, but I do not know how to do it in an easy way.
>>>> Assume a data frame has columns A, B and C:
>>>>
>>>> A       B       C
>>>> 1       1       3
>>>> 1       1       4
>>>> 1       1       5
>>>> 1       2       6
>>>> 1       2       7
>>>> 1       3       8
>>>>
>>>> The query is as follows:
>>>>
>>>> select A, B, count(*)
>>>> from data.frame
>>>> group by A, B
>>>> order by count(*) desc
>>>>
>>>> How do I translate this into R statements in such way that the result is a data frame structured as follows:
>>>>
>>>> A       B       count(*)
>>>> 1       1       3
>>>> 1       2       2
>>>> 1       3       1
>>>>
>>>> Thanks,
>>>>
>>>> Steven
>>>> ______________________________________________
>>>> [hidden email] mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>>
>>> --
>>>
>>> Bert Gunter
>>> Genentech Nonclinical Biostatistics
>>>
>>> Internal Contact Info:
>>> Phone: 467-7374
>>> Website:
>>> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
>>
>>
>
>
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: R equivalent for SQL query

andrija djurovic
In reply to this post by Steven Raemaekers
Hi,

here are some solutions:

DF <- read.table(textConnection("
A       B       C
1       1       3
1       1       4
1       1       5
1       2       6
1       2       7
1       3       8 "), header=TRUE)

#using sqldf package
library(sqldf)
sqldf("select A, B, count(*)
      from DF
      group by A, B
      order by count(*) desc")

#using function table
as.data.frame(table(DF$A, DF$B))

As you can see, you can use sqldf package for performing sql queries
on R data frames.

Andrija



On Tue, Apr 3, 2012 at 8:26 PM, Steven Raemaekers <[hidden email]> wrote:

> Hi,
>
> I have a query which I would like to translate into R, but I do not know how to do it in an easy way.
> Assume a data frame has columns A, B and C:
>
> A       B       C
> 1       1       3
> 1       1       4
> 1       1       5
> 1       2       6
> 1       2       7
> 1       3       8
>
> The query is as follows:
>
> select A, B, count(*)
> from data.frame
> group by A, B
> order by count(*) desc
>
> How do I translate this into R statements in such way that the result is a data frame structured as follows:
>
> A       B       count(*)
> 1       1       3
> 1       2       2
> 1       3       1
>
> Thanks,
>
> Steven
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: R equivalent for SQL query

jholtman
Here is a solution using the data.table package:

> x <- read.table(text = "A       B       C
+ 1       1       3
+ 1       1       4
+ 1       1       5
+ 1       2       6
+ 1       2       7
+ 1       3       8", header = TRUE)
> require(data.table)
> x <- data.table(x)  # convert to a data.table
> # query
> result <- x[
+           , list(count = length(C))
+           , by = list(A,B)
+           ]
> # unsorted result
> result
     A B count
[1,] 1 1     3
[2,] 1 2     2
[3,] 1 3     1
> # sorted result
> result[order(result$count, decreasing = TRUE), ]
     A B count
[1,] 1 1     3
[2,] 1 2     2
[3,] 1 3     1
>
>


On Tue, Apr 3, 2012 at 3:50 PM, andrija djurovic <[hidden email]> wrote:

> Hi,
>
> here are some solutions:
>
> DF <- read.table(textConnection("
> A       B       C
> 1       1       3
> 1       1       4
> 1       1       5
> 1       2       6
> 1       2       7
> 1       3       8 "), header=TRUE)
>
> #using sqldf package
> library(sqldf)
> sqldf("select A, B, count(*)
>      from DF
>      group by A, B
>      order by count(*) desc")
>
> #using function table
> as.data.frame(table(DF$A, DF$B))
>
> As you can see, you can use sqldf package for performing sql queries
> on R data frames.
>
> Andrija
>
>
>
> On Tue, Apr 3, 2012 at 8:26 PM, Steven Raemaekers <[hidden email]> wrote:
>> Hi,
>>
>> I have a query which I would like to translate into R, but I do not know how to do it in an easy way.
>> Assume a data frame has columns A, B and C:
>>
>> A       B       C
>> 1       1       3
>> 1       1       4
>> 1       1       5
>> 1       2       6
>> 1       2       7
>> 1       3       8
>>
>> The query is as follows:
>>
>> select A, B, count(*)
>> from data.frame
>> group by A, B
>> order by count(*) desc
>>
>> How do I translate this into R statements in such way that the result is a data frame structured as follows:
>>
>> A       B       count(*)
>> 1       1       3
>> 1       2       2
>> 1       3       1
>>
>> Thanks,
>>
>> Steven
>> ______________________________________________
>> [hidden email] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: R equivalent for SQL query

atyre2
In reply to this post by Steven Raemaekers
As you can see there are lots of ways to get this done! But in vanilla R:
test = data.frame(A=rep(1,6),B=c(1,1,1,2,2,3),C=1:6)
with(test,aggregate(C,list(A,B),length))
does what you've asked, I think.
On Tue, Apr 3, 2012 at 1:26 PM, Steven Raemaekers <[hidden email]>wrote:

> Hi,
>
> I have a query which I would like to translate into R, but I do not know
> how to do it in an easy way.
> Assume a data frame has columns A, B and C:
>
> A       B       C
> 1       1       3
> 1       1       4
> 1       1       5
> 1       2       6
> 1       2       7
> 1       3       8
>
> The query is as follows:
>
> select A, B, count(*)
> from data.frame
> group by A, B
> order by count(*) desc
>
> How do I translate this into R statements in such way that the result is a
> data frame structured as follows:
>
> A       B       count(*)
> 1       1       3
> 1       2       2
> 1       3       1
>
> Thanks,
>
> Steven
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



--
Drew Tyre

School of Natural Resources
University of Nebraska-Lincoln
416 Hardin Hall, East Campus
3310 Holdrege Street
Lincoln, NE 68583-0974

phone: +1 402 472 4054
fax: +1 402 472 2946
email: [hidden email]
http://snr.unl.edu/tyre
http://aminpractice.blogspot.com
http://www.flickr.com/photos/atiretoo

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: R equivalent for SQL query

David Winsemius
In reply to this post by Steven Raemaekers

On Apr 3, 2012, at 3:26 PM, Steven Raemaekers wrote:

> I'm sorry, but I think that's just plain rude. This is exactly the  
> reason why people do not like to ask for help on mailing lists and  
> internet forums.

The usual expectation on this list is for a questioner to demonstrate  
what was tried after advice is offered.  To this point you have been  
given two earlier answers and do not seem to acknowledge those  
efforts. Admittedly both of those solutions did require that you  
install and load a non-base package to get the resutls you wanted.  
However,  did you show any further effort.

>
> The help file is full of valuable information, if you know exactly  
> where to look. For someone who doesn't know this, it is rather more  
> complicated.

I would have expected the 'table' function to have been demonstrated  
in "Introduction to R".

> There are several options in the help file of table, and I do not  
> know exactly how to apply this function or which options to choose.
>

 > dput(dat)
structure(list(A = c(1L, 1L, 1L, 1L, 1L, 1L), B = c(1L, 1L, 1L,
2L, 2L, 3L), C = 3:8), .Names = c("A", "B", "C"), class =  
"data.frame", row.names = c(NA,
-6L))


 > table(A=dat$A, B=dat$B)
    B
A   1 2 3
   1 3 2 1

Now I happen to know (after reading the help pages and following links  
and running examples) that 'as.data.frame' is a useful way of coercing  
a contingency table to a dataframe, and that function _is_ mentioned  
in the Value section on the ?table help page ....  which is actually  
the function called when the following code is executed:

You are expected to read
as.data.frame(table(A=dat$A, B=dat$B))
   A B Freq
1 1 1    3
2 1 2    2
3 1 3    1

?ftable is also a useful function (and it was listed on ?table)

--
David.

> So, again, can anyone help me to translate this query to R? Thanks!
>
> Regards,
>
> Steven
>
> On 3 apr. 2012, at 21:07, Bert Gunter wrote:
>
>> If reading the help file for the function that does what I think you
>> want is not helpful enough, you'll have to seek elsewhere. I expect
>> some minimal learning effort.
>>
>> (?table means type this at the prompt in the GUI).
>>
>> -- Bert
>>
>> On Tue, Apr 3, 2012 at 12:04 PM, Steven Raemaekers <[hidden email]
>> > wrote:
>>> Thanks. Excuse me for saying this, but that is not very helpful.  
>>> Can you provide **one**, since that would be enough…:P
>>>
>>> On 3 apr. 2012, at 21:01, Bert Gunter wrote:
>>>
>>>> ?table
>>>>
>>>> There are **many** other ways.
>>>>
>>>> -- Bert
>>>>
>>>> On Tue, Apr 3, 2012 at 11:26 AM, Steven Raemaekers <[hidden email]
>>>> > wrote:
>>>>> Hi,
>>>>>
>>>>> I have a query which I would like to translate into R, but I do  
>>>>> not know how to do it in an easy way.
>>>>> Assume a data frame has columns A, B and C:
>>>>>
>>>>> A       B       C
>>>>> 1       1       3
>>>>> 1       1       4
>>>>> 1       1       5
>>>>> 1       2       6
>>>>> 1       2       7
>>>>> 1       3       8
>>>>>
>>>>> The query is as follows:
>>>>>
>>>>> select A, B, count(*)
>>>>> from data.frame
>>>>> group by A, B
>>>>> order by count(*) desc
>>>>>
>>>>> How do I translate this into R statements in such way that the  
>>>>> result is a data frame structured as follows:
>>>>>
>>>>> A       B       count(*)
>>>>> 1       1       3
>>>>> 1       2       2
>>>>> 1       3       1
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Steven
>>>>> _____________

>

David Winsemius, MD
West Hartford, CT

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: R equivalent for SQL query

Henrique Dallazuanna
In reply to this post by Steven Raemaekers
Here is another way:

aggregate(.~ A + B, FUN = length, x)


On Tue, Apr 3, 2012 at 3:26 PM, Steven Raemaekers <[hidden email]>wrote:

> Hi,
>
> I have a query which I would like to translate into R, but I do not know
> how to do it in an easy way.
> Assume a data frame has columns A, B and C:
>
> A       B       C
> 1       1       3
> 1       1       4
> 1       1       5
> 1       2       6
> 1       2       7
> 1       3       8
>
> The query is as follows:
>
> select A, B, count(*)
> from data.frame
> group by A, B
> order by count(*) desc
>
> How do I translate this into R statements in such way that the result is a
> data frame structured as follows:
>
> A       B       count(*)
> 1       1       3
> 1       2       2
> 1       3       1
>
> Thanks,
>
> Steven
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

        [[alternative HTML version deleted]]


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.