# small sample techniques

## small sample techniques

If my sample size is small, is there a particular switch option that I need to use with t.test so that it calculates the t ratio correctly?

Here is a dummy example:

α = 0.05

Mean pain reduction for A = 27 and for B = 31; the SDs are SDA = 9 and SDB = 12.

    drgA.p <- rnorm(5, 27, 9)
    drgB.p <- rnorm(5, 31, 12)
    t.test(drgA.p, drgB.p)  # what do I need to give as additional parameter here?

I can do it manually, but I was looking for a switch option that I need to specify for t.test.

Thanks ../Murli

[[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

## Re: small sample techniques

Hi Nair,

If the two populations are normal, the t-test gives you the exact result whatever the sample size is (the sample size only affects the number of degrees of freedom). When the populations are not normal and the sample size is large, it is still OK to use the t-test (because of the Central Limit Theorem), but this is not necessarily true for small sample sizes. You could use simulation to find the relevant probabilities.
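The simulation suggested above could look roughly like the following sketch. It assumes, purely for illustration, that both groups are drawn from the same exponential population (so the null hypothesis is true) and estimates how often the default t.test rejects at the nominal 5% level with n = 5 per group. The seed, the number of replications, and the choice of distribution are assumptions, not part of the original post.

```r
set.seed(1)        # arbitrary seed, only for reproducibility
n_sim <- 2000      # number of simulated experiments (assumed value)
n     <- 5         # per-group sample size, as in the dummy example

# Both groups come from the same skewed population, so H0 is true
# and every rejection is a false positive.
p_values <- replicate(n_sim, t.test(rexp(n), rexp(n))$p.value)

# Estimated actual size of the nominal 5% two-sided test
actual_size <- mean(p_values < 0.05)
actual_size
```

Comparing `actual_size` with 0.05 shows how far the test is from its nominal level for that population and sample size; swapping `rexp` for another generator checks other shapes.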

## Re: small sample techniques

Indeed, I understand what you say. The degrees of freedom for the dummy example are n1+n2-2 = 8. But when I run t.test I get 5.08. Am I missing something?

## Re: small sample techniques

On Wed, 8 Aug 2007, Nair, Murlidharan T wrote:

> Indeed, I understand what you say. The df of freedom for the dummy example is n1+n2-2 = 8. But when I run the t.test I get it as 5.08, am I missing something?

Yes. You are probably looking for the version of the t-test that assumes equal variances (the original one), so you need var.equal=TRUE.

-thomas

Thomas Lumley, Assoc. Professor, Biostatistics, University of Washington, Seattle
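As a small illustration of this switch, the sketch below runs the dummy example both ways and compares the reported degrees of freedom. The seed is an arbitrary assumption, so the Welch value will generally differ from the 5.08 quoted in the thread.

```r
set.seed(42)                      # arbitrary seed (assumption)
drgA.p <- rnorm(5, 27, 9)
drgB.p <- rnorm(5, 31, 12)

# Default: Welch test, no equal-variance assumption, fractional df
welch  <- t.test(drgA.p, drgB.p)

# Classical pooled-variance test: df = n1 + n2 - 2
pooled <- t.test(drgA.p, drgB.p, var.equal = TRUE)

welch$parameter    # somewhere between 4 and 8, depends on the sample
pooled$parameter   # 8
```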

## Re: small sample techniques

About using t tests and confidence intervals for "large" samples: "large" may need to be very large. The old pre-computer-age rule of n >= 30 is inadequate. For example, for an exponential distribution, the actual size of a nominal 2.5% one-sided t-test is not accurate to within 10% (i.e. between 2.25% and 2.75%) until n is around 5000. The error (actual - nominal size) decreases very slowly, at the rate 1/sqrt(n).

In practice, real distributions may be even more skewed than the exponential distribution, even though they appear less skewed, if they have long tails. In this case the sample size would need to be even larger for t procedures to be reasonably accurate.

An alternative is to use bootstrapping. Bootstrap procedures whose error decreases at the rate 1/n include bootstrap t, BCa, and bootstrap tilting.

Tim Hesterberg, Senior Research Scientist, Insightful Corp.
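For readers unfamiliar with the bootstrap t mentioned above, here is a minimal base-R sketch of a bootstrap-t confidence interval for a mean. The helper name `boot_t_ci`, the number of resamples, and the exponential test data are all assumptions made for illustration; this is not the implementation from any particular package.

```r
# Hypothetical helper: bootstrap-t confidence interval for the mean of x
boot_t_ci <- function(x, B = 2000, conf = 0.95) {
  n    <- length(x)
  xbar <- mean(x)
  se   <- sd(x) / sqrt(n)

  # Studentized statistic recomputed on each resample
  t_star <- replicate(B, {
    xs <- sample(x, n, replace = TRUE)
    (mean(xs) - xbar) / (sd(xs) / sqrt(n))
  })

  alpha <- 1 - conf
  q <- quantile(t_star, c(1 - alpha / 2, alpha / 2), names = FALSE)

  # The upper quantile of t* sets the lower limit, and vice versa
  c(lower = xbar - q[1] * se, upper = xbar - q[2] * se)
}

set.seed(123)          # arbitrary seed (assumption)
x  <- rexp(30)         # skewed sample, true mean 1
ci <- boot_t_ci(x)
ci
```

Unlike the ordinary t interval, this interval is typically asymmetric around the sample mean for skewed data, which is the source of its better coverage accuracy.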

## Re: small sample techniques

As Thomas Lumley noted, there exist several versions of the t-test.

If you use t1 <- t.test(x,y) then no assumption is made of x and y having equal variance or of the two sample sizes being equal; an approximate t-test is used with an approximate number of degrees of freedom (and this is what you got).

If you use t2 <- t.test(x,y,var.equal=TRUE) then equal variance is assumed and you get 8 degrees of freedom.

If you use t3 <- t.test(x,y,paired=TRUE) then equal sample sizes are assumed and the number of degrees of freedom is 4 (5-1).
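The "approximate number of degrees of freedom" in the default test is the Welch-Satterthwaite approximation. The sketch below computes it by hand and checks it against what t.test reports; the seed and variable names are assumptions for illustration.

```r
set.seed(42)                      # arbitrary seed (assumption)
x <- rnorm(5, 27, 9)
y <- rnorm(5, 31, 12)

# Squared standard errors of the two sample means
vx <- var(x) / length(x)
vy <- var(y) / length(y)

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (vx + vy)^2 /
  (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

df_welch
all.equal(df_welch, unname(t.test(x, y)$parameter))
```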

## Re: small sample techniques

On 9/08/2007, at 2:57 PM, Moshe Olshansky wrote:

> As Thomas Lumley noted, there exist several versions of t-test.

> If you use t3 <- t.test(x,y,paired=TRUE) then equal sample sizes are assumed and the number of degrees of freedom is 4 (5-1).

This is seriously misleading. The assumption is not that the sample sizes are equal, but rather that there is ***just one sample***, namely the sample of differences.

More explicitly, the assumptions are that

x_i - y_i

are i.i.d. Gaussian with mean mu and variance sigma^2. One is trying to conduct inference about mu, of course.

It should also be noted that it is a crucial assumption for the "non-paired" t-test that the two samples be ***independent*** of each other, as well as being Gaussian.

None of this is, however, germane to Nair's original question; it is clear that he is interested in a two-independent-sample t-test.

cheers,

Rolf Turner
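The "just one sample" point can be checked directly: a paired t-test gives the same statistic, degrees of freedom, and p-value as a one-sample t-test on the differences. The data below are simulated only to have something to compute on (an assumption); they are not genuinely paired measurements.

```r
set.seed(7)                       # arbitrary seed (assumption)
x <- rnorm(5, 27, 9)
y <- rnorm(5, 31, 12)

paired     <- t.test(x, y, paired = TRUE)
one_sample <- t.test(x - y)       # one sample: the differences x_i - y_i

# Same t statistic, same df (n - 1 = 4), same p-value
all.equal(unname(paired$statistic), unname(one_sample$statistic))
all.equal(unname(paired$parameter), unname(one_sample$parameter))
all.equal(paired$p.value, one_sample$p.value)
```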

## Re: small sample techniques

Well, this is an explanation of what is done in the paired t-test (and why the number of df is as it is). I was too lazy to write all this. It is nice that some list members are less lazy!

## Re: small sample techniques

Thanks, that discussion was helpful.

Well, I have another question. I am comparing two proportions to see whether their difference deviates from the hypothesized difference of zero. My manually calculated z ratio is 1.94. But when I calculate it using prop.test, it uses Pearson's chi-squared test, and the X-squared value that it gives is 0.74. Is there a function in R with which I can calculate the z ratio? That is,

$$ z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{S_{\hat{p}_1 - \hat{p}_2}} $$

where $S_{\hat{p}_1 - \hat{p}_2}$ is the standard error estimate of the difference between two independent proportions.

Dummy example; this is how I use it:

    prop.test(c(30,23),c(300,300))

Cheers ../Murli

## Re: small sample techniques

Murli,

I think you need to recheck your computations. You can run a t-test on your data in a variety of ways. Here is one:

    > x<-c(rep(1,30),rep(0,270))
    > y<-c(rep(1,23),rep(0,277))
    > t.test(x,y)

            Welch Two Sample t-test

    data:  x and y
    t = 1.0062, df = 589.583, p-value = 0.3147
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -0.02221086  0.06887752
    sample estimates:
     mean of x  mean of y
    0.10000000 0.07666667

Hope this is helpful,

Dan

Daniel J. Nordlund, Research and Data Analysis, Washington State Department of Social and Health Services, Olympia, WA 98504-5204

## Re: small sample techniques

n = 300 per group; 30% of those taking A get relief from pain; 23% of those taking B get relief from pain.

Question: If there is no difference, are we likely to get a 7% difference?

Hypotheses:

H0: p1-p2 = 0

H1: p1-p2 != 0 (not equal to)

1> Weighted average of the two sample proportions

    (300(0.30) + 300(0.23)) / (300 + 300) = 0.265

2> Std error estimate of the difference between two independent proportions

    sqrt((0.265*0.735)*((1/300)+(1/300))) = 0.03603

3> Evaluation of the difference between the sample proportions as a deviation from the hypothesized difference of zero

    ((0.30-0.23)-(0))/0.03603 = 1.94

z did not reach 1.96, hence H0 is not rejected.

This is what I was trying to do using prop.test:

    prop.test(c(30,23),c(300,300))

What function should I use?
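The three hand-calculation steps above translate directly into a few lines of R. This is only a sketch of the pooled two-proportion z statistic as described in the post; the variable names are assumptions, and the counts 90 and 69 are 30% and 23% of 300.

```r
x <- c(90, 69)        # successes: 30% and 23% of 300
n <- c(300, 300)      # group sizes

p_hat  <- x / n                   # sample proportions
p_pool <- sum(x) / sum(n)         # step 1: weighted average, 0.265

# Step 2: standard error of the difference under H0
se <- sqrt(p_pool * (1 - p_pool) * sum(1 / n))

# Step 3: z ratio for a hypothesized difference of zero
z <- (p_hat[1] - p_hat[2]) / se
z                                 # about 1.94

# Two-sided p-value from the standard normal distribution
p_two_sided <- 2 * pnorm(-abs(z))
p_two_sided
```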

## Re: small sample techniques

The proportion test above indicates that p1=0.1 and p2=0.07666667. But in your manual calculation you specify p1=0.3 and p2=0.23. Which is correct? If p1=0.3 and p2=0.23, then use

    prop.test(c(.30*300,.23*300),c(300,300))

Hope this is helpful,

Dan

Daniel J. Nordlund
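To connect prop.test with the z ratio asked about in this thread: for two groups, the X-squared statistic without continuity correction is the square of the pooled z statistic, while the default (correct = TRUE) applies Yates' correction and gives a smaller value. The sketch below uses the counts from the thread; the object names are assumptions.

```r
# Counts actually passed in the thread: 30 and 23 successes out of 300.
# The default applies Yates' continuity correction.
res_thread <- prop.test(c(30, 23), c(300, 300))
unname(res_thread$statistic)      # about 0.745, the 0.74 reported above

# Counts matching the hand calculation: 30% and 23% of 300
res <- prop.test(c(90, 69), c(300, 300), correct = FALSE)

# Without the correction, X-squared equals z^2
z_from_chisq <- sqrt(unname(res$statistic))
z_from_chisq                      # about 1.94
res$p.value
```

Taking the square root loses the sign of the difference, so the direction has to be read from the estimated proportions in `res$estimate`.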