Modifying dataframe with mutate()

H-2
In a statement like:

df %>% mutate(v1 = as.double(v1))

I expect the variable v1 in dataframe df to have been converted into a double. However, when I do:

str(df)

v1 still shows as int. Do I need to save the modified dataframe after mutating a variable?

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: Modifying dataframe with mutate()

Patrick (Malone Quantitative)
This seems needlessly complicated.

df$v1 <- as.double(df$v1)

Or use as.numeric(), which does the same thing.
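For illustration, here is a minimal base R sketch of that approach. The data frame and its contents are made up for the example; only the column name v1 comes from the thread.

```r
# Hypothetical data frame with an integer column named v1
df <- data.frame(v1 = 1:3)
type_before <- typeof(df$v1)     # "integer"

# The `$<-` form assigns the converted column back into df
df$v1 <- as.double(df$v1)
type_after <- typeof(df$v1)      # "double"
```

The assignment is part of the `df$v1 <- ...` syntax itself, so there is no separate step to forget.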

--
Patrick S. Malone, Ph.D., Malone Quantitative
NEW Service Models: http://malonequantitative.com

He/Him/His

Re: Modifying dataframe with mutate()

Jeff Newmiller
In reply to this post by H-2
R is largely a functional language. You do something to an input and end up with an output that has no effect on the input. This is actually a highly desirable feature.

If you want your df variable to reflect the changes made, then you need to assign the result back into it.

df <- df %>% mutate(v1 = as.double(v1))
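A small sketch of the difference, assuming dplyr is installed. The data frame is invented for the example, and type_before and type_after are just illustrative names.

```r
library(dplyr)

# Hypothetical data frame with an integer column named v1
df <- data.frame(v1 = 1:3)

# Without assignment, mutate() returns a modified copy and df is left alone
df %>% mutate(v1 = as.double(v1))
type_before <- typeof(df$v1)     # "integer"

# Assigning the result back makes df refer to the modified copy
df <- df %>% mutate(v1 = as.double(v1))
type_after <- typeof(df$v1)      # "double"
```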

(Note that the data.table package violates this principle and is controversial as a result.)
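For contrast, a minimal sketch of the by-reference behaviour referred to here, assuming the data.table package is installed. The table contents are made up for the example.

```r
library(data.table)

# Hypothetical data.table with an integer column named v1
dt <- data.table(v1 = 1:3)
type_before <- typeof(dt$v1)     # "integer"

# `:=` updates dt in place; note there is no `dt <-` on this line
dt[, v1 := as.double(v1)]
type_after <- typeof(dt$v1)      # "double"
```

Here the object itself changes without any reassignment, which is the departure from the usual copy-on-modify behaviour.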

--
Sent from my phone. Please excuse my brevity.

Re: Modifying dataframe with mutate()

Patrick (Malone Quantitative)
Jeff,

mutate(), which I think is part of dplyr, also violates this, for what it's
worth. I suspect the breaking point is that mutate() is intended to create
new columns in the dataframe, not alter existing ones.

Re: Modifying dataframe with mutate()

Jeff Newmiller
False. mutate() is similar in structure to the base function `within`, which is why you have to assign the altered data frame back onto itself.
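To illustrate the comparison, a base R sketch using `within`. The data frame and the name converted are made up for the example.

```r
# Hypothetical data frame with an integer column named v1
df <- data.frame(v1 = 1:3)

# within() evaluates the expression against a copy of df and returns that copy
converted <- within(df, v1 <- as.double(v1))

type_original  <- typeof(df$v1)         # "integer": df itself is unchanged
type_converted <- typeof(converted$v1)  # "double"

# To change df, assign the returned data frame back to the same name
df <- within(df, v1 <- as.double(v1))
type_reassigned <- typeof(df$v1)        # "double"
```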

Re: Modifying dataframe with mutate()

Patrick (Malone Quantitative)
Oh, right--I puzzled out my mistake.

Re: Modifying dataframe with mutate()

Duncan Murdoch-2
Were you thinking of the %<>% operator? That's a magrittr thing, where
x %<>% y acts like x <- x %>% y.
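A short sketch of that operator, assuming both magrittr and dplyr are installed. magrittr has to be attached explicitly, since dplyr does not export %<>%. The data frame is made up for the example.

```r
library(magrittr)  # provides %<>%
library(dplyr)     # provides mutate()

# Hypothetical data frame with an integer column named v1
df <- data.frame(v1 = 1:3)

# Pipe df into mutate() and assign the result back to df in one step
df %<>% mutate(v1 = as.double(v1))

type_after <- typeof(df$v1)      # "double"
```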

Duncan Murdoch

On 25/07/2020 4:18 p.m., Patrick (Malone Quantitative) wrote:

> Oh, right--I puzzled out my mistake.

Re: Modifying dataframe with mutate()

H-2
In reply to this post by Jeff Newmiller
On 07/25/2020 04:17 PM, Jeff Newmiller wrote:

> False. Mutate is similar in structure to the base function `within`. Which is why you have to assign the altered data frame back onto itself.
>
Thank you; code corrected and problem solved. I was thrown off by the fact that, after mutating, it looked like the column's data type had been changed. I also tried mutate_at(), which similarly failed...

Re: Modifying dataframe with mutate()

Jeff Newmiller
> I was thrown off by the fact that after mutating it looked like the column data type had been changed.

It was changed... in a new copy of the data frame that, because it was at the top-level interactive prompt and not being saved, was printed and then discarded.
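A sketch of what that looks like, assuming dplyr is installed. The data frame is made up, and class_of_copy and class_of_df are illustrative names.

```r
library(dplyr)

# Hypothetical data frame with an integer column named v1
df <- data.frame(v1 = 1:3)

# str() on the unassigned result reports v1 as num, but that copy is then discarded
df %>% mutate(v1 = as.double(v1)) %>% str()

# str() on df itself still reports v1 as int
str(df)

# The same comparison without relying on printed output
class_of_copy <- class(mutate(df, v1 = as.double(v1))$v1)  # "numeric"
class_of_df   <- class(df$v1)                              # "integer"
```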
