Bug in t.test?


Bug in t.test?

Johannes W. Dietrich
Hello all,

Due to unexplained differences between statistical results from
collaborators and our lab, arising in the same large proteomics
dataset, we re-evaluated the t.test() function. Here we found a weird
behaviour that is also reproducible in the following small test
dataset:

Suppose we have two vectors of numbers with some missing values that
refer to the same individuals and that should therefore be evaluated
with a paired t-test:

>  testdata.A <- c(1.15, -0.2, NA, 1, -2, -0.5, 0.1, 1.2, -1.4, 0.01);
>  testdata.B <- c(1.2, 1.1, 3, -0.1, 3, 1.1, 0, 1.3, 4, NA);

Then

>  print(t.test(testdata.A, testdata.B, paired=TRUE,
>alternative="two.sided", na.action="na.pass"))

and

>  print(t.test(testdata.A, testdata.B, paired=TRUE,
>alternative="two.sided", na.action="na.exclude"))

deliver the same p-value (0.1162, identical to Excel's result).

However, after combining the two vectors with

>  testdata <- c(testdata.A, testdata.B);

and defining a criterion vector with

>  criterion <- c(0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1);

(that is the type of data layout we have in our proteomics project)
we get a different p-value (0.01453) with

>  print(t.test(testdata ~ criterion, paired=TRUE,
>alternative="two.sided", na.action="na.exclude")) .

The statement

>  print(t.test(testdata ~ criterion, paired=TRUE,
>alternative="two.sided", na.action="na.pass"))

however, delivers a p-value of 0.1162 again.

With

>  print(t.test(testdata[criterion==0], testdata[criterion==1],
>paired=TRUE, alternative="two.sided", na.action="na.exclude"))

which imitates the first form, we again get a p-value of 0.1162.

What is the reason for the different p-values? Should not all calls
to t.test that exclude missing values be equivalent and therefore
deliver the same results?

Excel, StatView and KaleidaGraph all display a p-value of 0.1162.

J. W. D.
--
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
-- Dr. Johannes W. Dietrich, M.D.
-- Laboratory XU44, Endocrine Research
-- Medical Hospital I, Bergmannsheil University Hospitals
-- Ruhr University of Bochum
-- Buerkle-de-la-Camp-Platz 1, D-44789 Bochum, NRW, Germany
-- Phone: +49:234:302-6400, Fax: +49:234:302-6403
-- eMail: "[hidden email]"
-- WWW: http://medical-cybernetics.de
-- WWW: http://www.bergmannsheil.de
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Bug in t.test?

Thomas Lumley


Thanks for the clear example. However, if there is a bug it is only that t.test.formula doesn't throw an error when given the paired=TRUE option.

The documentation says "The formula interface is only applicable for the 2-sample tests", but there probably should be an explicit check -- I didn't see that in the documentation until I realized that t.test.formula couldn't work for paired tests because you don't tell it which observations are paired.

    -thomas


On Fri, 13 Aug 2010, Johannes W. Dietrich wrote:

> [Johannes's original message quoted in full; snipped]

Thomas Lumley
Professor of Biostatistics
University of Washington, Seattle


Re: Bug in t.test?

Johannes W. Dietrich
Thank you for the fast reply! Although I have read the help page for
t.test over and over again, I had obviously overlooked the relevant
sentence. The "workaround" that I had planned seems to be the
correct use.

Thanks again,

J. W. D.

At 15:31 -0700, 13.08.2010, Thomas Lumley wrote:

>Thanks for the clear example. However, if there is a bug it is only
>that t.test.formula doesn't throw an error when given the
>paired=TRUE option.
>
>The documentation says "The formula interface is only applicable for
>the 2-sample tests.",  but there probably should be an explicit
>check -- I didn't see that in the documentation until I realized
>that t.test.formula couldn't work for paired tests because you don't
>tell it which observations are paired.
>
>    -thomas
>
>[quoted original message snipped]


Re: Bug in t.test?

Ted.Harding-2
Hi Thomas,
I'm not too sure about your interpretation. Consider:

  set.seed(54321)
  X <- rnorm(10) ; Y <- rnorm(10)
  XY <- c(X,Y); group<-c(rep(0,10),rep(1,10))

  t.test(X,Y,paired=TRUE)
  #         Paired t-test
  # data:  X and Y
  # t = -1.5265, df = 9, p-value = 0.1612
  # 95 percent confidence interval:
  #  -2.1521562  0.4178885
  # sample estimates: mean of the differences
  #                  -0.8671339

  t.test(XY~group,paired=TRUE)
  #         Paired t-test
  # data:  XY by group
  # t = -1.5265, df = 9, p-value = 0.1612
  # 95 percent confidence interval:
  #  -2.1521562  0.4178885
  # sample estimates: mean of the differences
  #                  -0.8671339

  t.test(X,Y,paired=FALSE)
  #         Welch Two Sample t-test
  # data:  X and Y
  # t = -1.4865, df = 17.956, p-value = 0.1545
  # 95 percent confidence interval:
  #  -2.0928836  0.3586159
  # sample estimates: mean of x  mean of y
  #                  -0.2646042  0.6025297

  t.test(XY~group,paired=FALSE)
  #         Welch Two Sample t-test
  # data:  XY by group
  # t = -1.4865, df = 17.956, p-value = 0.1545
  # 95 percent confidence interval:
  #  -2.0928836  0.3586159
  # sample estimates: mean in group 0 mean in group 1
  #                  -0.2646042       0.6025297

So in each case t.test(XY~group,...) has done the same as the
corresponding t.test(X,Y,...). It would therefore not seem that
"The formula interface is only applicable for the 2-sample tests"
excludes specifying the two samples by a formula with a 2-level
factor; and the pairing appears to have been carried out correctly
(i.e. in the formula case t.test assumes pairing by sequential
position within the respective group, without having to be told, as
you suggest). In other words, it acts as though giving a formula with
a 2-group factor on the right, and equal numbers in the two groups,
is equivalent to giving the two samples separately.

Johannes' original query was about differences when there
are NAs, corresponding to different settings of "na.action".
It is perhaps possible that 'na.action="na.pass"' and
'na.action="na.exclude"' result in different pairings in the
case "paired=TRUE". However, it seems to me that the differences
he observed are, shall we say, obscure!

Ted.

On 13-Aug-10 22:31:45, Thomas Lumley wrote:

> Thanks for the clear example. However, if there is a bug it is
> only that t.test.formula doesn't throw an error when given the
> paired=TRUE option.
>
> The documentation says "The formula interface is only applicable
> for the 2-sample tests.",  but there probably should be an explicit
> check -- I didn't see that in the documentation until I realized
> that t.test.formula couldn't work for paired tests because you don't
> tell it which observations are paired.
>     -thomas
>
> [quoted original message snipped]

--------------------------------------------------------------------
E-Mail: (Ted Harding) <[hidden email]>
Fax-to-email: +44 (0)870 094 0861
Date: 14-Aug-10                                       Time: 00:49:57
------------------------------ XFMail ------------------------------


Re: Bug in t.test?

Peter Dalgaard-2
(Ted Harding) wrote:

> Johannes' original query was about differences when there
> are NAs, corresponding to different settings of "na.action".
> It is perhaps possible that 'na.action="na.pass"' and
> 'na.action="na.exclude"' result in different pairings in the
> case "paired=TRUE". However, it seems to me that the differences
> he observed are, shall we say, obscure!

Did you miss the point that the incorrect behaviour with na.exclude is
the default one? (Actually, na.omit is, but the effect is the same.)

Arguably, t.test could just hardcode na.action=na.pass and be done with it.

--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: [hidden email]  Priv: [hidden email]


Re: Bug in t.test?

Thomas Lumley
On Sat, 14 Aug 2010, [hidden email] wrote:

> Hi Thomas,
> I'm not too sure about your interpretation. Consider:

It seems hard to interpret "The formula interface is only applicable for the 2-sample tests" any other way.

>
> Johannes' original query was about differences when there
> are NAs, corresponding to different settings of "na.action".
> It is perhaps possible that 'na.action="na.pass"' and
> 'na.action="na.exclude"' result in different pairings in the
> case "paired=TRUE". However, it seems to me that the differences
> he observed are, shall we say, obscure!

No, they are perfectly straightforward.  Johannes's data had two missing values, one in each group, but not in the same pair.

With na.omit or na.exclude, model.frame() removes the NAs. If there are the same number of NAs in each group, this leaves the same number of observations in each group. t.test.formula() splits these according to the group variable and passes them to t.test.default. Because of the (invalid) paired=TRUE argument, t.test.default assumes these are nine pairs and gets bogus answers.

On the other hand with na.pass, model.frame() does not remove NAs. t.test.formula() passes two sets of ten observations (including missing observations) to t.test.default().  Because of the paired=TRUE argument, t.test.default() assumes these are ten pairs, which happens to be true in this case, and after deleting the two pairs with missing observations it gives the right answer.
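[Editorial aside: for concreteness, the two routes can be imitated with the thread's numbers. The following is a sketch in Python rather than R (None standing in for NA; paired_t is our own helper), and it reproduces the mechanism behind the two p-values quoted in the thread (0.1162 vs 0.01453) via the corresponding t statistics.]

```python
import math
from statistics import mean, stdev

# Johannes's data; None stands in for R's NA.
A = [1.15, -0.2, None, 1, -2, -0.5, 0.1, 1.2, -1.4, 0.01]
B = [1.2, 1.1, 3, -0.1, 3, 1.1, 0, 1.3, 4, None]

def paired_t(d):
    """t statistic of a paired t-test, given the per-pair differences."""
    return mean(d) / (stdev(d) / math.sqrt(len(d)))

# na.pass route: NAs reach t.test.default, which drops incomplete *pairs*,
# leaving 8 correctly matched pairs.
d_pass = [a - b for a, b in zip(A, B) if a is not None and b is not None]

# na.omit/na.exclude route: NAs are dropped group-wise first, so each group
# shrinks to 9 values and everything after A's NA pairs with the wrong B.
A9 = [a for a in A if a is not None]
B9 = [b for b in B if b is not None]
d_excl = [a - b for a, b in zip(A9, B9)]

print(len(d_pass), round(paired_t(d_pass), 4))   # 8 -1.7921
print(len(d_excl), round(paired_t(d_excl), 4))   # 9 -3.1063
```

The same arithmetic underlies the df = 7 vs df = 8 results shown elsewhere in this thread.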

Regardless of the details, however, t.test.formula() can't reliably work with paired=TRUE because the user interface provides no way to specify which observations are paired. It would be possible (though a bad idea in my opinion) to specify that paired=TRUE is allowed and that the pairing is done in the order the observations appear in the data. The minimal change would be to stop doing missing-value removal in t.test.formula, although that would be undesirable if a user wanted to supply some sort of na.impute() option.

I would strongly prefer having an explicit indication of pairing, e.g. paired=variable.name, or even better, paired=~variable.name. Relying on data frame ordering seems a really bad idea.
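[Editorial aside: a rough sketch of what pairing on an explicit variable could give, in Python rather than R. The subject ids, the values, and the dict-based matching are invented for illustration, not a proposed implementation.]

```python
# Hypothetical illustration of paired-by-variable semantics: match
# observations across the two groups on a subject id, then keep only
# subjects with a complete pair. Ids and values are invented; the
# group-1 rows are deliberately in reverse order.
data = [
    # (id, group, value); None stands in for NA
    (1, 0, 1.15), (2, 0, -0.2), (3, 0, None), (4, 0, 1.0),
    (4, 1, -0.1), (3, 1, 3.0),  (2, 1, 1.1),  (1, 1, 1.2),
]

by_id = {}
for sid, g, v in data:
    by_id.setdefault(sid, {})[g] = v

# Valid pairs are determined by the id, not by row order in the data.
pairs = [(d[0], d[1]) for _, d in sorted(by_id.items())
         if d.get(0) is not None and d.get(1) is not None]
print(pairs)   # [(1.15, 1.2), (-0.2, 1.1), (1.0, -0.1)]
```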

    -thomas

Thomas Lumley
Professor of Biostatistics
University of Washington, Seattle


Re: Bug in t.test?

Ted.Harding-2
On 14-Aug-10 16:07:14, Thomas Lumley wrote:

> [Thomas's message of 14-Aug-10 quoted in full; snipped]

Thanks, Thomas, for elucidating the mechanisms of what I had suspected.
Following the earlier correspondence, I did a little experimentation
which helps to visualise what goes on. I had been composing a draft
email about it (see below) when yours arrived. It ends with a view
on how things might be arranged to work more transparently.
Ted.

==============================================================
I previously wrote (in connection with how the pairings take place):
> Johannes' original query was about differences when there are NAs,
> corresponding to different settings of "na.action". It is perhaps
> possible that 'na.action="na.pass"' and 'na.action="na.exclude"'
> result in different pairings in the case "paired=TRUE".

I have now tested the latter using Johannes Dietrich's example data.
Results below.

# Johannes' data:
testdata.A  <- c(1.15, -0.2, NA,  1  , -2, -0.5, 0.1, 1.2, -1.4, 0.01)
testdata.B  <- c(1.2 ,  1.1, 3 , -0.1,  3,  1.1, 0  , 1.3,  4  , NA  )

# Pairwise complete data:
testdata.A2 <- c(1.15, -0.2,      1  , -2, -0.5, 0.1, 1.2, -1.4)
testdata.B2 <- c(1.2 ,  1.1,     -0.1,  3,  1.1, 0  , 1.3,  4  )

# Groupwise complete data:
testdata.A3 <- c(1.15, -0.2,      1  , -2, -0.5, 0.1, 1.2, -1.4, 0.01)
testdata.B3 <- c(1.2 ,  1.1, 3 , -0.1,  3,  1.1, 0  , 1.3,  4        )

## Johannes:
t.test(testdata.A, testdata.B, paired=TRUE,      
       alternative="two.sided", na.action="na.exclude")
# Paired t-test
# data:  testdata.A and testdata.B
# t = -1.7921, df = 7, p-value = 0.1162
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
#  -3.5517017  0.4892017
# sample estimates: mean of the differences
#                  -1.53125

## Johannes:
testdata  <- c(testdata.A, testdata.B);
criterion <- c(0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1);

## Johannes:
# Formula-based, "na.pass":
print(t.test(testdata ~ criterion, paired=TRUE,
      alternative="two.sided", na.action="na.pass"))
# Paired t-test
# data:  testdata by criterion
# t = -1.7921, df = 7, p-value = 0.1162
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
#  -3.5517017  0.4892017
# sample estimates: mean of the differences
#                  -1.53125

## Pairwise complete:
t.test(testdata.A2,testdata.B2,paired=TRUE)
# Paired t-test
# data:  testdata.A2 and testdata.B2
# t = -1.7921, df = 7, p-value = 0.1162
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
#  -3.5517017  0.4892017
# sample estimates: mean of the differences
#                  -1.53125

## Johannes:
# Formula-based, "na.exclude":
t.test(testdata ~ criterion, paired=TRUE,
       alternative="two.sided", na.action="na.exclude")
# Paired t-test
# data:  testdata by criterion
# t = -3.1063, df = 8, p-value = 0.01453
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
#  -2.9504111 -0.4362555
# sample estimates: mean of the differences
#                  -1.693333

## Groupwise complete:
t.test(testdata.A3,testdata.B3,paired=TRUE)
# Paired t-test
# data:  testdata.A3 and testdata.B3
# t = -3.1063, df = 8, p-value = 0.01453
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
#  -2.9504111 -0.4362555
# sample estimates: mean of the differences
#                  -1.693333

So, with "paired=TRUE":

Formula-based with "na.pass" is the same as using pairwise complete
data and then doing a paired t-test.

Formula-based with "na.exclude" is the same as using groupwise
complete data and then doing a paired t-test.

Otherwise put, in the formula-based specification it seems that:

With "na.pass" the data are given to t.test with NAs present, and
t.test, picking up the parameter "paired=TRUE", then does a pairwise
strikeout of incomplete cases and performs a paired t-test on the
remainder (still paired as in the original data).

With "na.exclude" the data are first filtered of NAs *after the
formula has been read* (so that the corresponding elements of
'criterion' are struck out too), and the resulting data (with no NAs)
are then submitted to the main routine of t.test [i.e., as you point
out, t.test.default], which does a paired t-test on the now complete
data. But the pairing is no longer as in the original data.

Thus these two approaches lead (as I had surmised) to different
pairings of the data -- compare the pairings resulting from
testdata.A2 and testdata.B2 with the pairings resulting from
testdata.A3 and testdata.B3, as shown at the beginning of my code
above.

So one could argue that neither outcome is incorrect -- they
correspond to different approaches to dealing with missing data.
However, that being said, it is difficult to imagine a situation
in which one would want data, originally paired, to be stripped
group-wise of NAs and then submitted to a paired t-test with
different pairings!

Possibly the t.test() function should first look for the 'paired'
parameter before looking at 'na.action'. Then, if paired=TRUE,
it could do pairwise deletion of incomplete cases regardless
of the value of 'na.action' and do a paired t-test on the result,
while if paired=FALSE it could delete NAs in the outcome and
(if given the data as a formula) delete corresponding elements
in the formula's righthand side (the 2-level group indicator),
and, either way, then do an unpaired 2-sample t-test on the
result (where pairing is irrelevant anyway with "paired=FALSE").
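[Editorial aside: the proposed pairwise deletion can be sketched in Python rather than R; paired_complete is an invented helper name. With paired=TRUE, incomplete pairs would be struck out before any other NA handling, so na.action could never re-pair the data.]

```python
# Sketch of the proposed rule: for paired data, drop incomplete *pairs*
# first, regardless of na.action. (paired_complete is an invented name.)
def paired_complete(x, y):
    """Keep only positions where both members of a pair are observed."""
    kept = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    return [a for a, _ in kept], [b for _, b in kept]

# Johannes's data, with None standing in for NA.
A = [1.15, -0.2, None, 1, -2, -0.5, 0.1, 1.2, -1.4, 0.01]
B = [1.2, 1.1, 3, -0.1, 3, 1.1, 0, 1.3, 4, None]

A2, B2 = paired_complete(A, B)
print(len(A2))   # 8 -- the pairwise complete data, pairing preserved
```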
==============================================================


--------------------------------------------------------------------
E-Mail: (Ted Harding) <[hidden email]>
Fax-to-email: +44 (0)870 094 0861
Date: 14-Aug-10                                       Time: 18:55:23
------------------------------ XFMail ------------------------------
