[R] ifelse on data frames

Murray Jorgensen
[Using R 2.2.0 on Windows XP; OK, OK, I will update soon!]

I have noticed some undesirable behaviour when applying
ifelse to a data frame. Here is my code:

A <- scan()
 1.000000 0.000000 0.000000  0 0.00000
 0.027702 0.972045 0.000253  0 0.00000

A <- matrix(A,nrow=2,ncol=5,byrow=T)
A == 0
ifelse(A==0,0,-A*log(A))

A <- as.data.frame(A)
ifelse(A==0,0,-A*log(A))

and this is the output:

> A <- scan()
1:  1.000000 0.000000 0.000000  0 0.00000
6:  0.027702 0.972045 0.000253  0 0.00000
11:
Read 10 items
> A <- matrix(A,nrow=2,ncol=5,byrow=T)
> A == 0
      [,1]  [,2]  [,3] [,4] [,5]
[1,] FALSE  TRUE  TRUE TRUE TRUE
[2,] FALSE FALSE FALSE TRUE TRUE
> ifelse(A==0,0,-A*log(A))
           [,1]       [,2]        [,3] [,4] [,5]
[1,] 0.00000000 0.00000000 0.000000000    0    0
[2,] 0.09934632 0.02756057 0.002095377    0    0
>
> A <- as.data.frame(A)
> ifelse(A==0,0,-A*log(A))
[[1]]
[1] 0.00000000 0.09934632

[[2]]
[1]        NaN 0.02756057

[[3]]
[1] 0

[[4]]
[1] NaN NaN

[[5]]
[1] 0

[[6]]
[1] 0.00000000 0.09934632

[[7]]
[1] 0

[[8]]
[1] 0

[[9]]
[1] 0

[[10]]
[1] 0

>

Is this a bug or a feature? Can the behaviour be explained?

Regards,  Murray Jorgensen
--
Dr Murray Jorgensen      http://www.stats.waikato.ac.nz/Staff/maj.html
Department of Statistics, University of Waikato, Hamilton, New Zealand
Email: [hidden email]                                Fax 7 838 4155
Phone  +64 7 838 4773 wk    Home +64 7 825 0441    Mobile 021 1395 862

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ifelse on data frames

talepanda
It can be explained.

> class(A)
[1] "data.frame"
> length(A)
[1] 5
> class(A==0)
[1] "matrix"
> length(A==0)
[1] 10
> class(-A*log(A))
[1] "data.frame"
> length(-A*log(A))
[1] 5

As you can see, the result of A==0 is a matrix with length 10, while the
result of -A*log(A) is still a data.frame with length 5.

Then, when ifelse( [length=10], 0, [length=5] ) is called, the NO (3rd)
argument is recycled internally by rep(-A*log(A),length.out=10) (try
this).
The result is a "list" of length 10 in which each element is a whole
column, i.e. has 2 sub-elements.

So every element of the result where A==0 is FALSE has 2 sub-elements,
as you got.

I think what is confusing you is the behavior of A==0.

However, when using 'ifelse', I think you should pass matrices as the
arguments, because a data.frame is not consistent with the purpose of
'ifelse'.
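
The recycling step described above can be reproduced directly. This is only a minimal sketch, assuming the same 2 x 5 data as in the original post; the data are entered with c() instead of scan(), and the names A_mat, A_df, test, no_arg and recycled are made up for illustration.

```r
# Same data as in the original post, entered without scan()
A_mat <- matrix(c(1.000000, 0.000000, 0.000000, 0, 0,
                  0.027702, 0.972045, 0.000253, 0, 0),
                nrow = 2, ncol = 5, byrow = TRUE)
A_df <- as.data.frame(A_mat)

test   <- A_df == 0           # logical matrix, one entry per cell
no_arg <- -A_df * log(A_df)   # still a data frame, i.e. a list of columns

length(test)     # number of cells
length(no_arg)   # number of columns

# What ifelse() does to its 'no' argument when the lengths differ
recycled <- rep(no_arg, length.out = length(test))
lengths(recycled)   # every element is a whole column

# Keeping everything a matrix avoids the length mismatch
m <- as.matrix(A_df)
ifelse(m == 0, 0, -m * log(m))
```

Converting with as.matrix() first, as in the last two lines, is one way to follow the advice to give ifelse matrix arguments.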


Re: [R] ifelse on data frames

PIKAL Petr
Hi

You could also use another approach in the case of data frames:

A <- as.data.frame(A)
A0 <- -A*log(A)
A0[is.na(A0)] <- 0

which changes the NaNs to zeroes.
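
A minimal sketch of this approach, checked against the matrix result from the original post. It assumes the data contain no genuine NA values, since is.na() is TRUE for NA as well as NaN and both would be set to zero; the names expected and A_df are made up for illustration.

```r
A <- matrix(c(1.000000, 0.000000, 0.000000, 0, 0,
              0.027702, 0.972045, 0.000253, 0, 0),
            nrow = 2, ncol = 5, byrow = TRUE)

# Matrix version from the original post, used here as the reference
expected <- ifelse(A == 0, 0, -A * log(A))

A_df <- as.data.frame(A)
A0 <- -A_df * log(A_df)   # 0 * log(0) gives NaN in the zero cells
A0[is.na(A0)] <- 0        # logical-matrix subscript, replaces NaN (and NA)

# Compare the values only, ignoring row and column names
all.equal(unname(as.matrix(A0)), expected)
```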

HTH
Petr


On 5 Jan 2007 at 16:38, talepanda wrote:

Date sent:       Fri, 5 Jan 2007 16:38:05 +0900
From:           talepanda <[hidden email]>
To:             [hidden email]
Copies to:       [hidden email]
Subject:         Re: [R] ifelse on data frames

> It can be explained.
>
> > class(A)
> [1] "data.frame"
> > length(A)
> [1] 5
> > class(A==0)
> [1] "matrix"
> > length(A==0)
> [1] 10
> > class(-A*log(A))
> [1] "data.frame"
> > length(-A*log(A))
> [1] 5
>
> as you can see, the result of A==0 is matrix with length=10, while the
> result of -A*log(A) is still data.frame with length=5.
>
> then, when calling ifelse( [length=10], 0, [length=5] ), internally,
> the NO(3rd) argument was repeated by rep(-A*log(A),length.out=10) (try
> this). the result is "list" with length=10 and each element has 2
> sub-elements.
>
> So, the return value of A[(A==0)==FALSE] has 2 sub-elements as you
> get.
>
> I think what confusing you is the behavior of A==0.
>
> However, when using 'ifelse', I think you should use matrix as the
> arguments because data.frame is not consistent with the purpose of
> 'ifelse'.
>

Petr Pikal
[hidden email]


Re: [R] ifelse on data frames

Bugzilla from auxsvr@yahoo.com
In reply to this post by Murray Jorgensen
On Friday 05 January 2007 12:34, Petr Pikal wrote:
> Hi
>
> you could use also another approach in case of data frames
>
> A <- as.data.frame(A)
> A0 <- -A*log(A)
> A0[is.na(A0)] <- 0
I think you meant A0[which(is.na(A0))] <- 0
>
> which changes NaN's to zeroes
>
> HTH
> Petr
Regards


Re: [R] ifelse on data frames

Rolf Turner-2
In reply to this post by Murray Jorgensen
``MrJ Man'' wrote:

> On Friday 05 January 2007 12:34, Petr Pikal wrote:
> > Hi
> >
> > you could use also another approach in case of data frames
> >
> > A <- as.data.frame(A)
> > A0 <- -A*log(A)
> > A0[is.na(A0)] <- 0
> I think you meant A0[which(is.na(A0))] <- 0

        He most certainly DOES NOT mean this!

        You should try things out before offering gratuitous
        advice.

        (a) A0[is.na(A0)] <- 0
                works perfectly.

        (b) A0[which(is.na(A0))] <- 0
                gets it wrong!!!

        I would have thought that

        A0[which(is.na(A0),arr.ind=TRUE)] <- 0

        would work and get it right, but it gives the
        error message

                Error in `[<-.data.frame`(`*tmp*`, which(is.na(A0),
                      arr.ind = TRUE), value = 0) :
        only logical matrix subscripts are allowed in replacement
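
        A likely reason for the difference between (a) and (b), sketched
        under the assumption that the data are those of the original post:
        which() on the logical matrix returns positions counted down the
        columns, whereas a single numeric subscript on a data frame is
        taken to select columns. The sketch only inspects the indices; it
        does not run the assignment in (b). The name idx and the matrix
        copy M are made up for illustration.

```r
A <- as.data.frame(matrix(c(1.000000, 0.000000, 0.000000, 0, 0,
                            0.027702, 0.972045, 0.000253, 0, 0),
                          nrow = 2, ncol = 5, byrow = TRUE))
A0 <- -A * log(A)

idx <- which(is.na(A0))   # cell positions within the 2 x 5 logical matrix
idx
ncol(A0)                  # but a data frame has only this many columns

# On a plain matrix the arr.ind form is accepted as a subscript
M <- as.matrix(A0)
M[which(is.na(M), arr.ind = TRUE)] <- 0
M
```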
> >
> > which changes NaN's to zeroes
> >
> > HTH
> > Petr

                                cheers,

                                        Rolf Turner
                                        [hidden email]


Re: [R] ifelse on data frames

Henric Nilsson
In reply to this post by Murray Jorgensen
[hidden email] said the following on 2007-01-05 04:18:

> [Using R 2.2.0 on Windows XP; OK, OK, I will update soon!]
>
> I have noticed some undesirable behaviour when applying
> ifelse to a data frame. Here is my code:
>
> A <- scan()
>  1.000000 0.000000 0.000000  0 0.00000
>  0.027702 0.972045 0.000253  0 0.00000
>
> A <- matrix(A,nrow=2,ncol=5,byrow=T)
> A == 0
> ifelse(A==0,0,-A*log(A))
>
> A <- as.data.frame(A)
> ifelse(A==0,0,-A*log(A))

How about using

sapply(A, function(x) ifelse(x == 0, 0, -x*log(x)))

?
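
A minimal sketch of the column-wise idea, assuming the data of the original post. sapply() simplifies the result to a matrix; the second variant, which assigns into A[] via lapply(), is an addition not taken from the thread and keeps the data frame structure. The helper name entropy_term is made up for illustration.

```r
A <- as.data.frame(matrix(c(1.000000, 0.000000, 0.000000, 0, 0,
                            0.027702, 0.972045, 0.000253, 0, 0),
                          nrow = 2, ncol = 5, byrow = TRUE))

# Hypothetical helper: applied to one column (a plain numeric vector) at a time
entropy_term <- function(x) ifelse(x == 0, 0, -x * log(x))

res_mat <- sapply(A, entropy_term)    # columns are bound into a numeric matrix
res_mat

res_df <- A
res_df[] <- lapply(A, entropy_term)   # assigning into [] keeps the data frame
res_df
```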


HTH,
Henric


