Paired t-tests

R Help
Hello List,

I'm trying to do a paired t-test, and I'm wondering whether the formula
interface gives consistent results.  I have a dataset that has a response and two
treatments (here's an example):

   ID trt order          resp
17  1   0     1  0.0037513592
18  2   0     1  0.0118723051
19  4   0     1  0.0002610251
20  5   0     1 -0.0077951450
21  6   0     1  0.0022339952
22  7   0     2  0.0235195453

The subjects were randomized and assigned to receive either the
treatment or the placebo first, then the other.  I know I'll
eventually have to move on to a GLM or something that incorporates the
order, but for now I wanted to start with a simple t.test.  My problem
is that, if I get the responses into two vectors x and y (sorted by
ID) and do a t.test, and then compare that to a formula t.test, they
aren't the same.

> t.test(x,y,paired=TRUE)

        Paired t-test

data:  x and y
t = -0.3492, df = 15, p-value = 0.7318
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.010446921  0.007505966
sample estimates:
mean of the differences
           -0.001470477

> t.test(resp~trt,data=dat1[[3]],paired=TRUE)

        Paired t-test

data:  resp by trt
t = -0.3182, df = 15, p-value = 0.7547
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.007096678  0.005253173
sample estimates:
mean of the differences
          -0.0009217521

What I'm assuming is that the formula method isn't retaining the inherent
order of the dataset, so the pairing isn't matching up (even though
the dataset is ordered by ID).  Is there a way to make the t.test
retain the correct ordering?

Thanks,
Sam
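
As a hedged aside on the "GLM or something that incorporates the order" remark: one possible next step is an ordinary linear model with a term per subject. The sketch below uses simulated data; the column names ID, trt, order and resp follow the post, but the values, the sample size and the effect sizes are invented. It first checks that, without the order term, the model reproduces the paired t-test, and then adds order as a period effect.

```r
set.seed(123)
n <- 16
first <- rep(c(0, 1), length.out = n)   # hypothetical: treatment code given in period 1

# Two rows per subject, already sorted by ID.
dat <- data.frame(ID    = rep(seq_len(n), each = 2),
                  trt   = as.vector(rbind(first, 1 - first)),
                  order = rep(1:2, times = n))
subj <- rnorm(n, 0, 0.01)               # invented subject-level effect
dat$resp <- subj[dat$ID] + 0.002 * dat$trt + 0.001 * (dat$order == 2) +
  rnorm(2 * n, 0, 0.005)

# Paired t-test on vectors that are matched because rows are in ID order.
x <- dat$resp[dat$trt == 0]
y <- dat$resp[dat$trt == 1]
tt <- t.test(y, x, paired = TRUE)

# Same estimate and t statistic from a linear model with one term per subject.
fit0 <- lm(resp ~ trt + factor(ID), data = dat)
summary(fit0)$coefficients["trt", ]

# Adding 'order' adjusts the treatment estimate for a period effect.
fit1 <- lm(resp ~ trt + factor(order) + factor(ID), data = dat)
coef(fit1)[c("trt", "factor(order)2")]
```

With complete pairs, the trt coefficient of fit0 equals the mean of the within-subject differences, so the paired t-test is a special case of this model rather than a separate analysis.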

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Paired t-tests

Marc Schwartz-3
On Aug 15, 2010, at 9:05 AM, R Help wrote:

> What I'm assuming is that the formula method isn't retaining the inherent
> order of the dataset, so the pairing isn't matching up (even though
> the dataset is ordered by ID).  Is there a way to make the t.test
> retain the correct ordering?
>
> Thanks,
> Sam


See this thread from just 2 days ago:

  https://stat.ethz.ch/pipermail/r-help/2010-August/249068.html

perhaps focusing on Thomas' reply, which is the next post in the thread.

Bottom line, don't use the formula method for a paired t test.

HTH,

Marc Schwartz
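
A minimal sketch of what avoiding the formula method can look like: build the two vectors explicitly, matched on ID, and pass them to t.test(). The data frame below is invented (only the column names ID, trt and resp come from the post), and its rows are deliberately shuffled so that the matching step matters. To my knowledge, recent R versions (4.4.0 and later) reject paired = TRUE in the formula method outright, which points the same way.

```r
set.seed(10)
n <- 16

# Invented long-format data: one row per subject and treatment.
long <- data.frame(ID  = rep(seq_len(n), times = 2),
                   trt = rep(c(0, 1), each = n))
long$resp <- rnorm(n)[long$ID] * 0.01 + 0.002 * long$trt + rnorm(2 * n, 0, 0.005)
long <- long[sample(nrow(long)), ]   # shuffle rows: pairing cannot rely on row order

# Split by treatment, then join on ID so each row holds one subject's two responses.
placebo <- long[long$trt == 0, c("ID", "resp")]
treated <- long[long$trt == 1, c("ID", "resp")]
wide <- merge(placebo, treated, by = "ID", suffixes = c(".placebo", ".treated"))

res <- t.test(wide$resp.placebo, wide$resp.treated, paired = TRUE)
res

# The same test written as a one-sample test on the within-subject differences.
res_diff <- t.test(wide$resp.placebo - wide$resp.treated)
```

Joining on ID makes the pairing explicit in the data, so it survives any reordering of the rows.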


Re: Paired t-tests

Peter Dalgaard-2
Marc Schwartz wrote:

> See this thread from just 2 days ago:
>
>   https://stat.ethz.ch/pipermail/r-help/2010-August/249068.html
>
> perhaps focusing on Thomas' reply, which is the next post in the thread.
>
> Bottom line, don't use the formula method for a paired t test.

Yes. I'm not sure the same problem is afoot here, though. In particular,
I'm puzzled by the fact that there are 15 df in both cases, but a different
average difference. This kind of suggests to me that maybe the x and y
are not computed correctly. (If only the ordering was scrambled, the
average difference should be the same, but the variance typically
inflated.)

--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: [hidden email]  Priv: [hidden email]
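
Peter's parenthetical can be checked with a small simulation. This is a sketch on invented data, not the poster's: the pairs are made strongly correlated on purpose, so that breaking the pairing has a visible effect on the spread of the differences.

```r
set.seed(42)
n <- 16
x <- rnorm(n, 1, 1)
y <- x + rnorm(n, 0.5, 0.3)     # y is strongly tied to its partner in x

x_scrambled <- sample(x)        # same values, wrong partners
d_matched   <- x - y
d_scrambled <- x_scrambled - y

# The mean difference is mean(x) - mean(y) either way, so scrambling cannot change it.
c(matched = mean(d_matched), scrambled = mean(d_scrambled))

# The spread of the differences is what changes; it is typically larger when scrambled.
c(matched = sd(d_matched), scrambled = sd(d_scrambled))

t_matched   <- t.test(x, y, paired = TRUE)
t_scrambled <- t.test(x_scrambled, y, paired = TRUE)
```

Both tests report the same df and the same estimate, so two outputs with equal df but different mean differences, as in the original post, cannot come from a reordering alone.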


Re: Paired t-tests

David Winsemius

On Aug 15, 2010, at 3:31 PM, Peter Dalgaard wrote:

> Marc Schwartz wrote:
>> On Aug 15, 2010, at 9:05 AM, R Help wrote:
>>>> t.test(resp~trt,data=dat1[[3]],paired=TRUE)

Since neither resp nor trt would be in dat1[[3]], wouldn't the fact that
no error was reported imply either that dat1 had been attached (and we
were not informed of that prior attach()-ment) or that resp and trt
are also object names besides being column names inside dat1?


--

David Winsemius, MD
West Hartford, CT
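
To illustrate the question David raises, here is a hedged sketch. Whether dat1 was a data frame or a list of data frames is not stated in the thread, and every object below is invented. It shows two things: `[[3]]` returns a single column when applied to a data frame but a whole data frame when applied to a list of data frames, and model.frame(), which the formula method of t.test() relies on, looks up variables that are absent from 'data' in the formula's environment.

```r
# Hypothetical objects, named after the thread but invented for illustration.
df <- data.frame(ID = 1:4, trt = c(0, 0, 1, 1), order = c(1, 2, 1, 2),
                 resp = c(0.1, 0.2, 0.3, 0.4))

col3 <- df[[3]]            # on a data frame, [[3]] is the third column
is.data.frame(col3)        # FALSE: a plain numeric vector (the 'order' column)

lst <- list(a = df, b = df, c = df)
is.data.frame(lst[[3]])    # TRUE: on a list of data frames, [[3]] is a data frame

# Variables missing from 'data' are taken from the formula's environment instead.
resp  <- c(10, 20, 30, 40) # workspace objects with the same names as the columns
trt   <- c(0, 0, 1, 1)
other <- data.frame(z = 1:4)            # contains neither resp nor trt
mf <- model.frame(resp ~ trt, data = other)
mf$resp                    # 10 20 30 40, from the workspace, not from 'other'
```

Either situation would let a formula call run without error while silently using data other than the intended columns, so inspecting str(dat1[[3]]) and ls() is a cheap first check.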


Re: Paired t-tests

Marc Schwartz-3
On Aug 15, 2010, at 2:48 PM, David Winsemius wrote:

>
> On Aug 15, 2010, at 3:31 PM, Peter Dalgaard wrote:
>>>>> t.test(resp~trt,data=dat1[[3]],paired=TRUE)
>
> Since neither resp nor trt would be in dat1[[3]], wouldn't the fact that no error was reported imply either that dat1 had been attached (and we were not informed of that prior attach()-ment) or that resp and trt are also object names besides being column names inside dat1?
>
>> Yes. I'm not sure the same problem is afoot here, though. In particular,
>> I'm puzzled by the fact that there are 15 df in both cases, but a different
>> average difference. This kind of suggests to me that maybe the x and y
>> are not computed correctly. (If only the ordering was scrambled, the
>> average difference should be the same, but the variance typically
>> inflated.)
>>


I suspect that David is correct here. Good catch.


set.seed(1)
x <- rnorm(16, 1, 1)
y <- rnorm(16, 1.5, 1)

grp <- rep(c("A", "B"), each = 16)


> t.test(x, y, paired = TRUE)

        Paired t-test

data:  x and y
t = -1.595, df = 15, p-value = 0.1316
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -1.2841549  0.1848776
sample estimates:
mean of the differences
             -0.5496387


> t.test(x-y)

        One Sample t-test

data:  x - y
t = -1.595, df = 15, p-value = 0.1316
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 -1.2841549  0.1848776
sample estimates:
 mean of x
-0.5496387



> t.test(c(x, y) ~ grp, paired = TRUE)

        Paired t-test

data:  c(x, y) by grp
t = -1.595, df = 15, p-value = 0.1316
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -1.2841549  0.1848776
sample estimates:
mean of the differences
             -0.5496387



# Scramble the pairings, as Peter notes

set.seed(2)

> t.test(c(sample(x), y) ~ grp, paired = TRUE)

        Paired t-test

data:  c(sample(x), y) by grp
t = -1.8166, df = 15, p-value = 0.0893
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -1.19453037  0.09525302
sample estimates:
mean of the differences
             -0.5496387



The behavior in the prior thread was due to the handling of missing data, which compromised the pairings.

So, to the OP: check your working environment and how you pass the 'data' argument to the formula method. In any case, avoid using the formula method for paired t tests.

Regards,

Marc
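
The remark about missing data compromising the pairings can be sketched as follows. The data are invented and the example is only meant to show the mechanism: once a row with a missing response is dropped, the two groups no longer line up by position, so pairs have to be matched on ID and incomplete subjects removed as a whole.

```r
# Invented long-format data: 6 subjects, 2 treatments, one missing response.
long <- data.frame(ID   = rep(1:6, times = 2),
                   trt  = rep(c(0, 1), each = 6),
                   resp = c(0.4, 0.1, 0.3, 0.2, 0.5, 0.6,
                            0.7, 0.2, NA,  0.4, 0.9, 0.8))

# Dropping rows with NA leaves unequal groups, so position no longer identifies a pair.
complete_rows <- na.omit(long)
table(complete_rows$trt)   # 6 observations for trt 0, 5 for trt 1

# Reshape to one row per subject, keep only complete pairs, then test.
wide <- reshape(long, idvar = "ID", timevar = "trt", direction = "wide")
wide <- wide[complete.cases(wide), ]
res <- t.test(wide$resp.0, wide$resp.1, paired = TRUE)
res
```

Here subject 3 is dropped entirely, leaving five complete pairs and 4 degrees of freedom.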
