Finding unique terms


Finding unique terms

roslinazairimah zakaria
Dear r-users,

I have this data:

structure(list(STUDENT_ID = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L), .Label = c("AA15285", "AA15286"), class = "factor"),
    COURSE_CODE = structure(c(1L, 2L, 5L, 6L, 7L, 8L, 2L, 3L,
    4L, 5L, 6L), .Label = c("BAA1113", "BAA1322", "BAA2113",
    "BAA2513", "BAA2713", "BAA2921", "BAA4273", "BAA4513"), class =
"factor"),
    PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65,
    82.35, NA), PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60,
    100, NA), PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA,
    41), PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50),
    X = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), X.1 = c(NA,
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("STUDENT_ID",
"COURSE_CODE", "PO1M", "PO1T", "PO2M", "PO2T", "X", "X.1"), class =
"data.frame", row.names = c(NA,
-11L))

I want to combine the same Student ID and add up all the values for PO1M,
PO1T,...,PO2T obtained by the same ID.

How do I do that?
Thank you for any help given.

--
*Roslinazairimah Zakaria*
*Tel: +609-5492370; Fax. No.+609-5492766*

Faculty of Industrial Sciences & Technology
University Malaysia Pahang
Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia


______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Finding unique terms

Dénes Tóth-2


On 10/12/2018 12:12 AM, roslinazairimah zakaria wrote:

> I want to combine the same Student ID and add up all the values for PO1M,
> PO1T,...,PO2T obtained by the same ID.

dat <- structure(list(STUDENT_ID = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L), .Label = c("AA15285", "AA15286"), class = "factor"),
     COURSE_CODE = structure(c(1L, 2L, 5L, 6L, 7L, 8L, 2L, 3L,
     4L, 5L, 6L), .Label = c("BAA1113", "BAA1322", "BAA2113",
     "BAA2513", "BAA2713", "BAA2921", "BAA4273", "BAA4513"), class =
"factor"),
     PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65,
     82.35, NA), PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60,
     100, NA), PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA,
     41), PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50),
     X = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), X.1 = c(NA,
     NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("STUDENT_ID",
"COURSE_CODE", "PO1M", "PO1T", "PO2M", "PO2T", "X", "X.1"), class =
"data.frame", row.names = c(NA,
-11L))

# I assume you would like to add up the values with na.rm = TRUE
meanFn <- function(x) mean(x, na.rm = TRUE)

# see ?aggregate
aggregate(dat[, c("PO1M", "PO1T", "PO2M")],
           by = dat["STUDENT_ID"],
           FUN = meanFn)

# if you have largish or large data
library(data.table)
dat2 <- as.data.table(dat)
dat2[, lapply(.SD, meanFn),
      by = STUDENT_ID,
      .SDcols = c("PO1M", "PO1T", "PO2M")]


Regards,
Denes




Re: Finding unique terms

roslinazairimah zakaria
Hi Denes,

It works perfectly as I want!

Thanks a lot.


Re: Finding unique terms

Jeff Newmiller
You said "add up"... so you did not mean to say that? Denes computed the mean...


Re: Finding unique terms

Jim Lemon-4
Yes, I thought so too. I had worked this out but didn't send it:

# rzdf holds the data frame from the original post
add_Pscores <- function(x) {
  # one grand total per student, pooled over all four score columns
  return(sum(unlist(x), na.rm = TRUE))
}
by(rzdf[, c("PO1M", "PO1T", "PO2M", "PO2T")], rzdf$STUDENT_ID, FUN = add_Pscores)
rzdf$STUDENT_ID: AA15285
[1] 724.8
------------------------------------------------------------
rzdf$STUDENT_ID: AA15286
[1] 661.45
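
The grand totals above pool all four score columns into one number per student. As a hedged sketch of how those totals relate to per-column sums, the block below rebuilds only the score columns of the posted data under the name rzdf (COURSE_CODE and the empty X columns are left out, which is an assumption of this sketch) and uses base R's rowsum() followed by rowSums(). The names score_cols, per_column and grand_total are illustrative only.

```r
# Compact rebuild of the posted data (score columns only; an assumption of this sketch)
rzdf <- data.frame(
  STUDENT_ID = factor(rep(c("AA15285", "AA15286"), times = c(6, 5))),
  PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65, 82.35, NA),
  PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60, 100, NA),
  PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA, 41),
  PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50)
)

score_cols <- c("PO1M", "PO1T", "PO2M", "PO2T")

# Per-student sum of each column, ignoring NA (one row per student)
per_column <- rowsum(rzdf[, score_cols], group = rzdf$STUDENT_ID, na.rm = TRUE)

# Adding across the columns gives one grand total per student
grand_total <- rowSums(per_column)
print(per_column)
print(grand_total)
```

If per-column sums are what is wanted, per_column is the object to keep; grand_total reproduces the two numbers printed by by() above.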

Jim

Re: Finding unique terms

Dénes Tóth-2
In reply to this post by Jeff Newmiller


On 10/12/2018 04:36 AM, Jeff Newmiller wrote:
> You said "add up"... so you did not mean to say that? Denes computed the mean...

Nice catch, Jeff. Of course I wanted to use 'sum' instead of 'mean'.
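
For reference, a minimal sketch of the earlier aggregate() call with sum in place of mean. It rebuilds only the score columns of the posted data so that it runs on its own (an assumption of this sketch); sumFn and score_cols are illustrative names. PO2T is included here because the question lists it, although the earlier call left it out.

```r
# Compact rebuild of the posted data (score columns only; an assumption of this sketch)
dat <- data.frame(
  STUDENT_ID = factor(rep(c("AA15285", "AA15286"), times = c(6, 5))),
  PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65, 82.35, NA),
  PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60, 100, NA),
  PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA, 41),
  PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50)
)

score_cols <- c("PO1M", "PO1T", "PO2M", "PO2T")

# Sum that skips missing values instead of returning NA
sumFn <- function(x) sum(x, na.rm = TRUE)

# One row per student, one summed column per score
totals <- aggregate(dat[, score_cols],
                    by = dat["STUDENT_ID"],
                    FUN = sumFn)
print(totals)
```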



Re: Finding unique terms

Dénes Tóth-2


On 10/12/2018 08:58 AM, Dénes Tóth wrote:
>
>
> On 10/12/2018 04:36 AM, Jeff Newmiller wrote:
>> You said "add up"... so you did not mean to say that? Denes computed
>> the mean...
>
> Nice catch, Jeff. Of course I wanted to use 'sum' instead of 'mean'.

Oh, and one more note: If you have NAs in your columns, 'sum' is rarely
the aggregate statistic that you are after. Probably this is why my
subconscious statistician suggested 'mean'.
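
One possible reading of this caveat, sketched under that assumption: with na.rm = TRUE, a sum over values that are all missing comes back as 0, which is indistinguishable from a genuine score of zero, whereas the mean comes back as NaN. Sums are also not comparable between groups with different numbers of observed values. The helper sum_or_na below is hypothetical and only illustrates one way to keep "nothing observed" visible.

```r
all_missing <- c(NA_real_, NA_real_, NA_real_)

# With na.rm = TRUE the sum of nothing is 0, the mean of nothing is NaN
s <- sum(all_missing, na.rm = TRUE)
m <- mean(all_missing, na.rm = TRUE)
print(s)
print(m)

# Hypothetical helper: return NA when a group has no observed values at all
sum_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)
}

print(sum_or_na(all_missing))
print(sum_or_na(c(1, NA, 2)))
```

Passing sum_or_na as FUN to aggregate() would leave NA in any student/column cell that had no data, rather than a 0.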


--
Dr. Tóth Dénes, managing director
Kogentum Kft.
Tel.: 06-30-2583723
Web: www.kogentum.hu


Re: Finding unique terms

Robert Baer
In reply to this post by roslinazairimah zakaria


# load data

# Enter dataframe by hand
dat <- structure(list(STUDENT_ID = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L), .Label = c("AA15285", "AA15286"), class = "factor"),
     COURSE_CODE = structure(c(1L, 2L, 5L, 6L, 7L, 8L, 2L, 3L,
     4L, 5L, 6L), .Label = c("BAA1113", "BAA1322", "BAA2113",
     "BAA2513", "BAA2713", "BAA2921", "BAA4273", "BAA4513"), class =
"factor"),
     PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65,
     82.35, NA), PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60,
     100, NA), PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA,
     41), PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50),
     X = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), X.1 = c(NA,
     NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("STUDENT_ID",
"COURSE_CODE", "PO1M", "PO1T", "PO2M", "PO2T", "X", "X.1"), class =
"data.frame", row.names = c(NA,
-11L))

# Create sums by student ID

library(dplyr)
dat %>%
   group_by(STUDENT_ID) %>%
   summarize(sum.PO1M = sum(PO1M, na.rm = TRUE),
             sum.PO1T = sum(PO1M, na.rm = TRUE),
             sum.PO2M = sum(PO1M, na.rm = TRUE),
             sum.PO2T = sum(PO1M, na.rm = TRUE))


Re: Finding unique terms

Robert Baer
In reply to this post by roslinazairimah zakaria


Oops! I forgot to clean up after my cut and paste: my previous version
summed PO1M for every output column. The dplyr solution should look like this:
# Create sums by student ID
library(dplyr)
dat %>%
   group_by(STUDENT_ID) %>%
   summarize(sum.PO1M = sum(PO1M, na.rm = TRUE),
             sum.PO1T = sum(PO1T, na.rm = TRUE),
             sum.PO2M = sum(PO2M, na.rm = TRUE),
             sum.PO2T = sum(PO2T, na.rm = TRUE))
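
One way to avoid this kind of copy-and-paste slip is to name the columns once and let dplyr apply the same function to each. The sketch below assumes dplyr 1.0.0 or later (where across() is available) and rebuilds only the score columns of the posted data so that it runs on its own; the object name sums is illustrative.

```r
library(dplyr)

# Compact rebuild of the posted data (score columns only; an assumption of this sketch)
dat <- data.frame(
  STUDENT_ID = factor(rep(c("AA15285", "AA15286"), times = c(6, 5))),
  PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65, 82.35, NA),
  PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60, 100, NA),
  PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA, 41),
  PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50)
)

# Apply the same NA-skipping sum to every listed column;
# .names builds the output names sum.PO1M, sum.PO1T, ...
sums <- dat %>%
  group_by(STUDENT_ID) %>%
  summarize(across(c(PO1M, PO1T, PO2M, PO2T),
                   ~ sum(.x, na.rm = TRUE),
                   .names = "sum.{.col}"))
print(sums)
```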


Re: Finding unique terms

Bert Gunter-2
Here is a base R solution:
"dat" is the data frame as in Robert's solution.

> aggregate(dat[,3:6], by= dat[1], FUN = sum, na.rm = TRUE)
  STUDENT_ID   PO1M PO1T PO2M PO2T
1    AA15285 287.80  350   37   50
2    AA15286 240.45  330   41   50
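
The same table can also be written with the formula interface of aggregate(), sketched here with one caveat: by default the formula method uses na.action = na.omit, which discards every row that has an NA in any of the selected columns before FUN is called. In this data every row has at least one NA among the four score columns, so na.action = na.pass is needed for na.rm = TRUE to have anything to work on. The block rebuilds only the score columns so that it runs on its own (an assumption of this sketch); res is an illustrative name.

```r
# Compact rebuild of the posted data (score columns only; an assumption of this sketch)
dat <- data.frame(
  STUDENT_ID = factor(rep(c("AA15285", "AA15286"), times = c(6, 5))),
  PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65, 82.35, NA),
  PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60, 100, NA),
  PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA, 41),
  PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50)
)

# na.pass keeps rows containing NA; na.rm = TRUE is forwarded to sum()
res <- aggregate(cbind(PO1M, PO1T, PO2M, PO2T) ~ STUDENT_ID,
                 data = dat,
                 FUN = sum, na.rm = TRUE,
                 na.action = na.pass)
print(res)
```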

Cheers,
Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )



Re: Finding unique terms

roslinazairimah zakaria
In reply to this post by Jeff Newmiller
Yes, you are right, I want the sum. I will change the formula accordingly.
