Silent failure with NA results in fligner.test()

Re: Silent failure with NA results in fligner.test()

Kurt Hornik-5
Any preferences?

Best
-k

>>>>> Karolis K writes:

> Hello,
> In certain cases fligner.test() returns a NaN statistic and an NA p-value.
> The issue happens when, after centering each group by its median, all absolute residuals are equal, which then leads to identical ranks.

> Below are a few examples:

> # 2 groups, 2 values each
> # issue is caused by residual values after centering (-0.5, 0.5, -0.5, 0.5)
> # then, after taking the absolute value, all the ranks become identical.
>> fligner.test(c(2,3,4,5), gl(2,2))

>         Fligner-Killeen test of homogeneity of variances

> data:  c(2, 3, 4, 5) and gl(2, 2)
> Fligner-Killeen:med chi-squared = NaN, df = 1, p-value = NA


> # similar situation with more observations and 3 groups
>> fligner.test(c(2,3,2,3,4,4,5,5,8,9,9,8), gl(3,4))

>         Fligner-Killeen test of homogeneity of variances

> data:  c(2, 3, 2, 3, 4, 4, 5, 5, 8, 9, 9, 8) and gl(3, 4)
> Fligner-Killeen:med chi-squared = NaN, df = 2, p-value = NA


> Two simple patches are proposed below. One returns an error, and the other returns a p-value of 1.
> Not sure which one is more appropriate, so submitting both.

> Warm regards,
> Karolis Koncevičius

> ---

> Index: fligner.test.R
> ===================================================================
> --- fligner.test.R (revision 79650)
> +++ fligner.test.R (working copy)
> @@ -59,8 +59,13 @@
>          stop("data are essentially constant")
>
>      a <- qnorm((1 + rank(abs(x)) / (n + 1)) / 2)
> -    STATISTIC <- sum(tapply(a, g, "sum")^2 / tapply(a, g, "length"))
> -    STATISTIC <- (STATISTIC - n * mean(a)^2) / var(a)
> +    if (var(a) > 0) {
> +        STATISTIC <- sum(tapply(a, g, "sum")^2 / tapply(a, g, "length"))
> +        STATISTIC <- (STATISTIC - n * mean(a)^2) / var(a)
> +    }
> +    else {
> +        STATISTIC <- 0
> +    }
>      PARAMETER <- k - 1
>      PVAL <- pchisq(STATISTIC, PARAMETER, lower.tail = FALSE)
>      names(STATISTIC) <- "Fligner-Killeen:med chi-squared"

> ---

> Index: fligner.test.R
> ===================================================================
> --- fligner.test.R (revision 79650)
> +++ fligner.test.R (working copy)
> @@ -57,6 +57,8 @@
>      x <- x - tapply(x,g,median)[g]
>      if (all(x == 0))
>          stop("data are essentially constant")
> +    if (var(abs(x)) == 0)
> +        stop("absolute residuals from the median are essentially constant")
>
>      a <- qnorm((1 + rank(abs(x)) / (n + 1)) / 2)
>      STATISTIC <- sum(tapply(a, g, "sum")^2 / tapply(a, g, "length"))

> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
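
To see where the NaN comes from, the computation can be replayed by hand on the first example from the report. This is only a minimal sketch: it mirrors the lines visible in the patch context (centering by group medians, ranking the absolute residuals, mapping the ranks through qnorm) and reuses the variable names from the patch. The names r, num and p_patched are introduced here for illustration and are not taken from the stats sources.

```r
# Replay the first failing example step by step.
x <- c(2, 3, 4, 5)
g <- gl(2, 2)
n <- length(x)

# Center each group by its median: residuals are -0.5, 0.5, -0.5, 0.5
x <- x - tapply(x, g, median)[g]

# All absolute residuals are 0.5, so every observation gets the tied rank 2.5
r <- rank(abs(x))

# Identical ranks give identical normal scores, hence zero variance
a <- qnorm((1 + r / (n + 1)) / 2)

# Numerator and denominator are both zero, so the statistic is 0/0
num <- sum(tapply(a, g, "sum")^2 / tapply(a, g, "length")) - n * mean(a)^2
STATISTIC <- num / var(a)

print(unname(r))
print(var(a))
print(STATISTIC)

# With the first patch the statistic would be set to 0 instead,
# and a chi-squared statistic of 0 corresponds to a p-value of 1
p_patched <- pchisq(0, df = nlevels(g) - 1, lower.tail = FALSE)
print(p_patched)
```

The second patch catches the same situation one step earlier, by checking var(abs(x)) before the ranks are computed.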

Re: Silent failure with NA results in fligner.test()

Martin Maechler
In reply to this post by Kurt Hornik-5
Not sure...
If all of the variances are zero, they are homogeneous in that sense,
and I would give a p-value of 1.
If only *some* of the variances are zero, it's less easy.

I would still try *not* to give an error in such cases, and would even
prefer an NA statistic and p-value, because yes, these are "not
available" for such data.
But it is not strictly an error to try such a test on data of the
correct format. Consequently, I would personally even try not to
give the current error, but rather return NA values here:
>  if (all(x == 0))
>          stop("data are essentially constant")
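
The "return NA rather than stop" option could look roughly like the following. This is a sketch under stated assumptions, not a proposed patch: fligner_na_sketch is a hypothetical helper, it reuses the computation quoted in the patches, and it skips the argument checking and the "htest" object that fligner.test() itself provides. The second data set is made up purely to exercise the regular branch.

```r
# Hypothetical helper, not part of the stats package: computes the
# Fligner-Killeen statistic but yields NA for degenerate data instead
# of stopping with an error or returning NaN.
fligner_na_sketch <- function(x, g) {
    g <- factor(g)
    n <- length(x)
    k <- nlevels(g)
    x <- x - tapply(x, g, median)[g]
    a <- qnorm((1 + rank(abs(x)) / (n + 1)) / 2)
    if (all(x == 0) || var(a) == 0) {
        # Constant data or constant absolute residuals:
        # the statistic is "not available"
        STATISTIC <- NA_real_
    } else {
        STATISTIC <- sum(tapply(a, g, "sum")^2 / tapply(a, g, "length"))
        STATISTIC <- (STATISTIC - n * mean(a)^2) / var(a)
    }
    # pchisq() propagates NA, so the p-value is NA as well
    PVAL <- pchisq(STATISTIC, k - 1, lower.tail = FALSE)
    c(statistic = STATISTIC, df = k - 1, p.value = PVAL)
}

# Degenerate example from the report
res_degenerate <- fligner_na_sketch(c(2, 3, 4, 5), gl(2, 2))
print(res_degenerate)

# Made-up non-degenerate data
res_regular <- fligner_na_sketch(c(1, 2, 4, 9, 3, 3.5, 4.2, 4.4), gl(2, 4))
print(res_regular)
```

Whether a warning should accompany the NA result is a separate question that this sketch leaves open.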

On Mon, Dec 21, 2020 at 12:22 PM Kurt Hornik <[hidden email]> wrote:

> [...]


--
Martin <[hidden email]>   http://stat.ethz.ch/~maechler
Seminar für Statistik, ETH Zürich     HG G 16       Rämistrasse 101
CH-8092 Zurich, SWITZERLAND           ☎ +41 44 632 3408        <><
