Hi All

I have a dataset of 200 columns and 1000 rows; each column contains 3 repeated values (7, 8, 10). I want to calculate the frequency of each value in each column and then apply the function maf() to those frequencies. I can do the analysis step by step like this:

> Values
       A   B   C  ...  200
1      7  10   7
2      7   8   7
3     10   8   7
4      8   7  10
.
.
.
1000

For column A, I calculate the frequency of the 3 values as follows:

count7 <- length(which(Values$A == 7))
count8 <- length(which(Values$A == 8))
count10 <- length(which(Values$A == 10))

which gives count7 = 2, count8 = 1, count10 = 1.

Then I create a vector and type the frequencies manually:

Freq <- c(count7 = 2, count8 = 1, count10 = 1)

Then I apply the function maf():

maf(Freq)

This gives me the result I need for column A. Could you please help me perform the analysis for all 200 columns at once?

Regards
Allahisone

[[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
This is not a good way to do things! R has many powerful built-in functions to do this sort of thing for you. Searching -- e.g. at rseek.org or even a plain old Google search -- can help you find them. Also, it looks like you need to go through a tutorial or two to learn more about R's basic functionality.

In this case, something like (no reproducible example given, so can't confirm):

apply(Values, 2, function(x) maf(tabulate(x)))

should be close to what you want.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Thu, Nov 9, 2017 at 11:44 AM, Allaisone 1 <[hidden email]> wrote:
> could you please help me to perform the analysis for all of the 200 columns at once ?
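As a rough check of the apply() suggestion above, the sketch below runs tabulate() over a small made-up data frame. The three columns and four rows are an assumption based on the rows shown in the original post, not the poster's real data. nbins = 10 is passed explicitly so that every column returns the same number of counts, and maf() is left out because it is not defined anywhere in the thread.

```r
# Toy stand-in for the poster's data: 3 columns instead of 200.
Values <- data.frame(A = c(7, 7, 10, 8),
                     B = c(10, 8, 8, 7),
                     C = c(7, 7, 7, 10))

# tabulate() keeps one bin per integer from 1 to nbins, so each column
# yields 10 counts; apply() binds them into a 10 x 3 matrix.
counts <- apply(Values, 2, tabulate, nbins = 10)

# Only rows 7, 8 and 10 correspond to values that occur in the data.
wanted <- counts[c(7, 8, 10), ]
print(wanted)
```

Rows 7, 8 and 10 hold the counts of interest. The remaining rows are zero, which is worth keeping in mind if the function applied afterwards expects exactly three frequencies.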
Always reply to the list. I am not a free, private consultant!

"For example, if I have the values : 1 , 2 , 3 in each column, applying Tabulate () would calculate the frequency of 1 and 2 without 3"

Huh??

> x <- sample(1:3,10,TRUE)
> x
 [1] 1 3 1 1 1 3 2 3 2 1
> tabulate(x)
[1] 5 2 3

Cheers,
Bert

On Thu, Nov 9, 2017 at 3:44 PM, Allaisone 1 <[hidden email]> wrote:
> Thank you so much for your reply.
>
> Actually, I tried the apply() function but struggled with writing the
> appropriate function inside it to calculate the frequency of the 3
> values. The tabulate() function is a good start, but the problem is that it
> calculates the frequency of only two values per column, which means that
> when I apply the maf() function, the maf value will be calculated using the
> frequency of these 2 values only, without considering the frequency of the
> 3rd value. For example, if I have the values 1, 2, 3 in each column,
> applying tabulate() would calculate the frequency of 1 and 2 without 3. I
> need a way to calculate the frequencies of all 3 values so that the
> calculation of maf is correct, as it will then consider all 3
> frequencies and not only 2.
>
> Regards
>
> Allahisone
Thank you for your effort, Bert.

I see what the problem is now: the values (1, 2, 3) were only an example. The values I actually have are 0, 1, 2. The tabulate() function seems to ignore the 0 values and does not count them. This is exactly my problem, as the frequency of the 0 values must also be counted for the maf to be calculated correctly.

________________________________
From: Bert Gunter <[hidden email]>
Sent: 09 November 2017 23:51:35
To: Allaisone 1; R-help
Subject: Re: [R] Calculating frequencies of multiple values in 200 colomns

> x <- sample(1:3,10,TRUE)
> x
 [1] 1 3 1 1 1 3 2 3 2 1
> tabulate(x)
[1] 5 2 3
|> x <- sample(0:2, 10, replace = TRUE)
|> x
 [1] 1 0 2 1 0 2 2 0 2 1
|> tabulate(x)
[1] 3 4
|> table(x)
x
0 1 2
3 3 4

B.

> On Nov 10, 2017, at 4:32 AM, Allaisone 1 <[hidden email]> wrote:
>
> The values I have are 0 , 1, 2 . Tabulate () function seem to ignore calculating the frequency of 0 values
Hi,

To clarify the default behavior that Boris is referencing below, note the definition of the 'bin' argument to the tabulate() function:

  bin: a numeric vector ***(of positive integers)***, or a factor. Long vectors are supported.

I added the asterisks for emphasis.

This is also noted in the examples used for the function in ?tabulate at the bottom of the help page.

The second argument, 'nbins', which defaults to max(1, bin, na.rm = TRUE), also affects the output:

> tabulate(c(2, 3, 5))
[1] 0 1 1 0 1

Here, each element in the returned vector indicates how many 1's, 2's, 3's, 4's and 5's are present in the source vector.

Compare that to:

> tabulate(c(2, 3, 5), nbins = 3)
[1] 0 1 1

In the above example, 5 is ignored.

Note also that tabulate(), unlike table(), does not return a named vector, just the frequencies.

While tabulate() is used within the table() function, reviewing the code for the latter reveals how the default behavior of tabulate() is modified and preceded/wrapped in other code for use there.

Regards,

Marc Schwartz

> On Nov 10, 2017, at 8:43 AM, Boris Steipe <[hidden email]> wrote:
>
> |> x <- sample(0:2, 10, replace = TRUE)
> |> x
>  [1] 1 0 2 1 0 2 2 0 2 1
> |> tabulate(x)
> [1] 3 4
> |> table(x)
> x
> 0 1 2
> 3 3 4
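The point about tabulate() returning unnamed counts can be illustrated with a short sketch. It reuses the c(2, 3, 5) vector from the message above; attaching the names by hand is only one possible approach, shown here for comparison with table().

```r
x <- c(2, 3, 5)

# tabulate() returns bare counts for the integers 1..nbins.
counts <- tabulate(x, nbins = 5)

# Label each bin with the integer it counts, then drop the empty bins.
names(counts) <- seq_len(5)
nonzero <- counts[counts > 0]
print(nonzero)

# table() labels its counts on its own and only lists observed values.
print(table(x))
```

After naming, the non-empty bins carry the same labels that table() produces, which makes it easier to see which count belongs to which value.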
How about this workaround: add 1 to the vector, so that the values 0, 1, 2 become the positive integers 1, 2, 3.

x <- c(1,0,2,1,0,2,2,0,2,1)
tabulate(x)
# [1] 3 4
tabulate(x+1)
# [1] 3 3 4

On Fri, Nov 10, 2017 at 4:34 PM, Marc Schwartz <[hidden email]> wrote:
> bin: a numeric vector ***(of positive integers)***, or a factor. Long
> vectors are supported.
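A minimal sketch of how the add-1 workaround above might be extended to several columns, assuming the values are always 0, 1 or 2. The data frame is invented for illustration, and column B deliberately contains no 2s. Fixing nbins = 3 is an addition to the workaround: without it, a column that lacks the largest value would return fewer than three counts.

```r
# Invented example data with values 0, 1, 2; column B has no 2s.
Values <- data.frame(A = c(1, 0, 2, 1, 0),
                     B = c(0, 0, 1, 1, 0))

# Shift 0/1/2 to 1/2/3 and fix nbins so every column yields three counts.
counts <- sapply(Values, function(x) tabulate(x + 1, nbins = 3))

# Label the rows with the original (unshifted) values.
rownames(counts) <- c("0", "1", "2")
print(counts)
```

The result is a matrix with one column per variable and one row per value, so a function that expects three frequencies could then be applied column by column.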
In reply to this post by Allaisone 1
Use table(factor(x, levels=your3values))
Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, Nov 10, 2017 at 1:32 AM, Allaisone 1 <[hidden email]> wrote:
> The values I have are 0 , 1, 2 . Tabulate () function seem to ignore
> calculating the frequency of 0 values
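The table(factor(x, levels=your3values)) suggestion above could be applied to every column roughly as follows. This is a sketch under stated assumptions: the data frame is a toy version of the one in the original post, and maf_stub() is a hypothetical placeholder (smallest count divided by the total), since the real maf() is never shown in the thread.

```r
# Toy version of the poster's data: 3 columns instead of 200.
Values <- data.frame(A = c(7, 7, 10, 8),
                     B = c(10, 8, 8, 7),
                     C = c(7, 7, 7, 10))
your3values <- c(7, 8, 10)

# Hypothetical placeholder; substitute the real maf() here.
maf_stub <- function(freq) min(freq) / sum(freq)

# Declaring the levels up front keeps a count (possibly 0) for every value,
# so each column contributes exactly three frequencies.
freqs <- sapply(Values, function(x) table(factor(x, levels = your3values)))
print(freqs)

# Apply the placeholder to each column of frequencies.
result <- apply(freqs, 2, maf_stub)
print(result)
```

Unlike tabulate(), this works for any set of values, including 0, and the rows of freqs are labelled with the values themselves.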