selecting certain rows from data frame

Hrithik R
Hi,
Suppose I have a data frame such as

ID  Time  Earn
1   1     10
1   2     50
1   3     68
2   1     40
2   2     78
2   4     88
3   1     50
3   2     60
3   3     98
4   1     33
4   2     48
4   4     58
.....
....
.....

Now, if I have to select all the rows of the data frame except those with
certain IDs, say for example (prime) IDs 2 and 3, how do I do it?


Thanks

Rith


     

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: selecting certain rows from data frame

steven mosher
Hi,
Next time, give folks code to produce a toy sample of your problem:

 DF <- data.frame(ID = rep(1:5, each = 3), Data = rnorm(15), Stuff = 1:15)
 DF
   ID       Data Stuff
1   1  2.0628225     1
2   1  0.6599165     2
3   1  0.5672595     3
4   2 -0.5308823     4
5   2 -0.5358471     5
6   2 -0.1414992     6
7   3 -0.1679643     7
8   3  0.9220922     8
9   3  0.8863018     9
10  4 -0.7255916    10
11  4 -1.2446753    11
12  4  0.8165567    12
13  5  0.0925008    13
14  5 -0.8534803    14
15  5 -0.6535016    15

# Now select the rows where ID is 2 or 5
# and assign them to DF2

 DF2 <- DF[which(DF$ID == 2 | DF$ID == 5), ]
 DF2
   ID       Data Stuff
4   2 -0.5308823     4
5   2 -0.5358471     5
6   2 -0.1414992     6
13  5  0.0925008    13
14  5 -0.8534803    14
15  5 -0.6535016    15

On Tue, Dec 14, 2010 at 10:10 PM, Hrithik R <[hidden email]> wrote:

> Hi,
> if I have a dataframe such that
>
> ID Time  Earn
> 1        1        10
> 1        2        50
> 1        3        68
> 2        1        40
> 2        2        78
> 2        4       88
> 3        1        50
> 3        2        60
> 3        3        98
> 4        1        33
> 4        2        48
> 4        4       58
> .....
> ....
> .....
>
> Now, if I have to select all the rows of the data frame except those with
> certain IDs, say for example (prime) IDs 2 and 3, how do I do it?
>
>
> Thanks
>
> Rith

Re: selecting certain rows from data frame

Peter Ehlers
On 2010-12-14 23:57, steven mosher wrote:

> Hi,
> Next time give folks code to produce a toy sample of your problem
>
>   DF<-data.frame(ID=rep(1:5,each=3),Data=rnorm(15),Stuff=seq(1:15))
>    DF
>     ID       Data Stuff
> 1   1  2.0628225     1
> 2   1  0.6599165     2
> 3   1  0.5672595     3
> 4   2 -0.5308823     4
> 5   2 -0.5358471     5
> 6   2 -0.1414992     6
> 7   3 -0.1679643     7
> 8   3  0.9220922     8
> 9   3  0.8863018     9
> 10  4 -0.7255916    10
> 11  4 -1.2446753    11
> 12  4  0.8165567    12
> 13  5  0.0925008    13
> 14  5 -0.8534803    14
> 15  5 -0.6535016    15
>
> # now I want to select rows where ID = 2 or 5
> # Assign DF2 to those elements of DF where the ID variable=2 or 5
>
>   DF2<- DF[which(DF$ID==2 | DF$ID==5), ]

Or use subset():

  DF2 <- subset(DF, ID %in% c(2,5))

Peter Ehlers
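
For readers following along, here is a minimal sketch of how the %in% form behaves when the wanted IDs are stored in a vector. The name keep_ids is made up for illustration, and set.seed() is added only so the toy data is reproducible; neither appears in the posts above.

```r
# Toy data in the same shape as Steven's example (seed added for reproducibility)
set.seed(1)
DF <- data.frame(ID = rep(1:5, each = 3), Data = rnorm(15), Stuff = 1:15)

# Hypothetical vector of wanted IDs; it could just as well hold hundreds of values
keep_ids <- c(2, 5)

# subset() looks up ID inside DF
DF2_subset <- subset(DF, ID %in% keep_ids)

# The bracket form needs DF$ID spelled out, but selects the same rows
DF2_bracket <- DF[DF$ID %in% keep_ids, ]

nrow(DF2_subset)
```

The list of IDs then lives in one place, so growing it does not mean writing a longer chain of == comparisons joined by |.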

Re: selecting certain rows from data frame

Ivan Calandra
In reply to this post by steven mosher
Hi,

Just to note that which() is unnecessary here:
DF2 <- DF[DF$ID==2 | DF$ID==5, ]

Ivan

--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
[hidden email]

**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php

Re: selecting certain rows from data frame

David Winsemius

On Dec 15, 2010, at 4:18 AM, Ivan Calandra wrote:

> Hi,
>
> Just to note that which() is unnecessary here:
> DF2 <- DF[DF$ID==2 | DF$ID==5, ]

And to further note that it is only unnecessary if you have no NAs in
that ID column.

 > DF[4,1] <- NA
 > DF[8,1] <- NA
 > DF2 <- DF[DF$ID==2 | DF$ID==5, ]

(DF2 now contains an all-NA row for each of rows 4 and 8. These NA rows
would not appear if which() were used.)
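
As a rough illustration of that point, the sketch below compares three ways of indexing once the ID column contains NAs. The object names (with_na, no_na_which, no_na_in) are invented here, and the Data column is left out to keep the example deterministic.

```r
# Toy data with two missing IDs, as in the snippet above
DF <- data.frame(ID = rep(1:5, each = 3), Stuff = 1:15)
DF[4, 1] <- NA
DF[8, 1] <- NA

# A logical index containing NA: each NA produces a row filled with NA
with_na <- DF[DF$ID == 2 | DF$ID == 5, ]

# which() keeps only the positions where the condition is TRUE
no_na_which <- DF[which(DF$ID == 2 | DF$ID == 5), ]

# %in% gives FALSE rather than NA for a missing ID
no_na_in <- DF[DF$ID %in% c(2, 5), ]

c(nrow(with_na), nrow(no_na_which), nrow(no_na_in))
```

So %in% sidesteps the NA rows without needing which(), at least for this kind of membership test.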

--
David.

Re: selecting certain rows from data frame

Hrithik R
In reply to this post by Peter Ehlers
Hi Steven and Peter,
I apologise for not providing the code for the sample.

I now realise that what I need may be a bit tricky...
My data frame has hundreds of IDs, in which case Steven's solution will not be
optimal.

Peter's solution seems best, but how do I reverse it and use it to select the
data frame rows which do not contain particular IDs, say for example IDs 2 and 5
in this case?

Thanks again for your time,
Rith


Re: selecting certain rows from data frame

Peter Ehlers
On 2010-12-15 09:44, Hrithik R wrote:
> Hi Steven and Peter,
> I apologise for not providing the code for the sample
> I now realise what I need may be a bit tricky...
> my dataframe has hundreds of IDs in which case Steven's solution will
> not be optimum
> Peter's solution seems best, but how do I reverse this and use it to
> select the dataframe rows which do not contain particular IDs say
> for example IDs 2 and 5 in this case.

That's easy; use the 'NOT' operator ('!' in R):

  DF2 <- subset(DF, !(ID %in% c(2,5)))

Peter Ehlers
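
Applied to the sample data from the original question (excluding IDs 2 and 3), the same idea might look like the sketch below. The names earn and drop_ids are invented for illustration, and only the twelve rows shown in the question are used.

```r
# The twelve rows shown in the original question
earn <- data.frame(
  ID   = rep(1:4, each = 3),
  Time = c(1, 2, 3, 1, 2, 4, 1, 2, 3, 1, 2, 4),
  Earn = c(10, 50, 68, 40, 78, 88, 50, 60, 98, 33, 48, 58)
)

# Hypothetical vector of IDs to exclude
drop_ids <- c(2, 3)

# Negate the membership test to keep everything else
kept <- subset(earn, !(ID %in% drop_ids))

# Equivalent bracket form
kept_bracket <- earn[!(earn$ID %in% drop_ids), ]

unique(kept$ID)
```

The parentheses around the %in% test are not strictly required, but they make the scope of the negation obvious.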

Re: selecting certain rows from data frame

marymurphy
I have a problem similar to this one, but on a larger scale.

I have a big dataset of prescription use, with about 5 million rows of data per month. Each row is one prescription. The person IDs are held in a column, so there could be 5 million rows of data for, say, 1 million people. Other columns hold data on date obtained, drug type, sex, age, etc.

I want to generate demographics for each UNIQUE id. I do this by making a subset:

justdemos<-aggregate(age~id*sex*insurancetype, data=unique(df[c("age", "id", "sex", "insurancetype")]),FUN=mean)

This works fine for that purpose. But now I want to find the mean prescription use for each unique id. Each row has data on the date the drug went out and what drug it was, so I want to find the number of dates per unique id.

Here is an example of the data:

drugs<-c("drug x", "drug y", "drug z")
DF<-data.frame(ID=c(1,2,3,1,1,1,2,5,3,4,4,3),drugs,month=seq(1:3))
DF

   ID  drugs month
1   1 drug x     1
2   2 drug y     2
3   3 drug z     3
4   1 drug x     1
5   1 drug y     2
6   1 drug z     3
7   2 drug x     1
8   5 drug y     2
9   3 drug z     3
10  4 drug x     1
11  4 drug y     2
12  3 drug z     3

Does anyone know a good way of isolating unique IDs (there will be up to a million of them) to find out their mean use of prescriptions (RXs) per month?

Thank you,
SJ

Re: selecting certain rows from data frame

arun kirshna
In reply to this post by Hrithik R


Hi,
You can use split() (see ?split):
lst1 <- split(DF, DF$ID)
lst1[1:2]
#$`1`
#  ID  drugs month
#1  1 drug x     1
#4  1 drug x     1
#5  1 drug y     2
#6  1 drug z     3
#
#$`2`
#  ID  drugs month
#2  2 drug y     2
#7  2 drug x     1

mean(sapply(lst1, nrow))
#[1] 2.4
# or
library(plyr)
mean(ddply(DF, .(ID), nrow)[, 2])
#[1] 2.4
# or
mean(with(DF, tapply(ID, ID, FUN = length)))
#[1] 2.4
A.K.
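
One more base R option, offered only as a sketch: table() counts the rows per ID directly, without building a list of sub-data-frames first, which may matter with around a million IDs. The names rows_per_id and mean_rows are invented for this example.

```r
# Example data from the question above
DF <- data.frame(ID = c(1, 2, 3, 1, 1, 1, 2, 5, 3, 4, 4, 3),
                 drugs = c("drug x", "drug y", "drug z"),
                 month = 1:3)

# Number of rows (prescriptions) for each unique ID
rows_per_id <- table(DF$ID)

# Mean number of rows per person
mean_rows <- mean(as.vector(rows_per_id))
mean_rows
```

Whether this is fast enough on 5 million rows per month has not been checked here.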




________________________________
From: Sarah Jo Sinnott <[hidden email]>
To: arun <[hidden email]>
Sent: Friday, May 3, 2013 4:35 PM
Subject: Re: selecting certain rows from data frame



Yes, but if I can count the number of rows for each ID, this equates to the number of drugs for each ID. That way I can get a mean number of rows (drugs).

e.g.,

ID 1 = 4 rows (approx. 4 drugs)
ID 2 = 2 rows
ID 3 = 3 rows
ID 4 = 2 rows
ID 5 = 1 row

12 rows / 5 people = 2.4 rows/person

That is 2.4 drugs per person.

Do you think it is possible to isolate the number of rows per unique ID? It would be great if you could! I've tried reorganising my data into wide format, but it doesn't work very well, so I'm left with this option really!

Thank you for your help thus far.
