as.data.frame doesn't set col.names


Ed Siefker
Why doesn't this work?

> samples$geno <- as.data.frame(sapply(yo, toupper), col.names="geno")
> samples
                          quant_samples   age sapply(yo, toupper)
E11.5 F20het BA40     E11.5 F20het BA40 E11.5              F20HET
E11.5 F20het BA45     E11.5 F20het BA45 E11.5              F20HET

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: as.data.frame doesn't set col.names

Ed Siefker
Wait.  Now I'm really confused.

>
> head(samples)
                          quant_samples   age sapply(yo, toupper)
E11.5 F20het BA40     E11.5 F20het BA40 E11.5              F20HET
E11.5 F20het BA45     E11.5 F20het BA45 E11.5              F20HET
E11.5 F20het BB84     E11.5 F20het BB84 E11.5              F20HET
E11.5 F9.20DKO KTr3 E11.5 F9.20DKO KTr3 E11.5            F9.20DKO
E11.5 F9.20DKO PEd2 E11.5 F9.20DKO PEd2 E11.5            F9.20DKO
E11.5 F9.20DKO j0J1 E11.5 F9.20DKO j0J1 E11.5            F9.20DKO
> colnames(samples)
[1] "quant_samples" "age"           "geno"

Really, really confused.


Re: as.data.frame doesn't set col.names

David Carlson
You left out all the most important bits of information. What is yo? Are you trying to assign a data frame to a single column in another data frame? Printing head(samples) tells us nothing about what data types you have, especially if the things that look like text are really factors that were created when you used one of the read.*() functions. Use str(samples) to see what you are dealing with.

----------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77843-4352


Re: as.data.frame doesn't set col.names

Peter Dalgaard

> On 24 Oct 2017, at 22:45 , David L Carlson <[hidden email]> wrote:
>
> You left out all the most important bits of information. What is yo? Are you trying to assign a data frame to a single column in another data frame? Printing head(samples) tells us nothing about what data types you have, especially if the things that look like text are really factors that were created when you used one of the read.*() functions. Use str(samples) to see what you are dealing with.

Actually, I think there is enough information to diagnose this. The main issue is, as you point out, the assignment of an entire data frame to a column of another data frame:

> l <- letters[1:5]
> s <- as.data.frame(sapply(l,toupper))
> dput(s)
structure(list(`sapply(l, toupper)` = structure(1:5, .Label = c("A",
"B", "C", "D", "E"), class = "factor")), .Names = "sapply(l, toupper)", row.names = c("a",
"b", "c", "d", "e"), class = "data.frame")

(incidentally, setting col.names has no effect on this; notice that it is only documented as an argument to "list" and "matrix" methods, and sapply() returns a vector)
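
To illustrate the point about col.names, here is a minimal sketch reusing l from above. The column name "geno" is taken from Ed's post; the object names s1 to s4 are made up for illustration. Since toupper() is already vectorized, sapply() is not strictly needed for the alternatives.

```r
l <- letters[1:5]

# col.names is swallowed by '...' in the vector method,
# so the requested name is not applied
s1 <- as.data.frame(sapply(l, toupper), col.names = "geno")
names(s1)

# Option 1: name the column when building the data frame
s2 <- data.frame(geno = toupper(l))

# Option 2: rename after the fact
s3 <- setNames(as.data.frame(toupper(l)), "geno")

# Option 3: go through the list method, which does honour col.names
s4 <- as.data.frame(list(toupper(l)), col.names = "geno")
```

Any of the three options should give a one-column data frame named geno; which one reads best is a matter of taste.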

Now, if we do this:

> dd <- data.frame(A=l)
> dd$B <- s

we end up with a data frame whose B "column" is another data frame

> dput(dd)
structure(list(A = structure(1:5, .Label = c("a", "b", "c", "d",
"e"), class = "factor"), B = structure(list(`sapply(l, toupper)` = structure(1:5, .Label = c("A",
"B", "C", "D", "E"), class = "factor")), .Names = "sapply(l, toupper)", row.names = c("a",
"b", "c", "d", "e"), class = "data.frame")), .Names = c("A",
"B"), row.names = c(NA, -5L), class = "data.frame")

in printing such data frames, the inner frame "wins" the column names, which is sensible if you consider what would happen if it had more than one column:

> dd
  A sapply(l, toupper)
1 a                  A
2 b                  B
3 c                  C
4 d                  D
5 e                  E
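
The mismatch Ed saw between the printed header and colnames() can be checked directly. A minimal sketch, rebuilding dd as above and using only base R functions:

```r
l <- letters[1:5]
s <- as.data.frame(sapply(l, toupper))
dd <- data.frame(A = l)
dd$B <- s

names(dd)     # names of the outer frame: "A" "B"
class(dd$B)   # "data.frame": B holds a whole data frame
names(dd$B)   # the inner name, which print() uses as the header
str(dd)       # shows the nesting explicitly
```

So colnames() reports the outer name while print() shows the inner one, which is why the two disagree.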

To get the effect that Ed probably expected, do

> dd <- data.frame(A=l)
> dd["B"] <- s
> dd
  A B
1 a A
2 b B
3 c C
4 d D
5 e E

(and notice that single-bracket indexing is crucial here)
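
Two other ways to end up with a plain column, sketched under the same setup (the names dd1 and dd2 are made up here): pull the single column out of s before assigning, or skip the intermediate data frame altogether.

```r
l <- letters[1:5]
s <- as.data.frame(sapply(l, toupper))

# Extract the one column of s, then assign it with $
dd1 <- data.frame(A = l)
dd1$B <- s[[1]]

# Or assign the vector directly; toupper() is already vectorized
dd2 <- data.frame(A = l)
dd2$B <- toupper(l)
```

In both cases B is an ordinary column rather than a nested data frame, so the printed header and names() agree.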

-pd


Re: as.data.frame doesn't set col.names

Eric Berger
Hi Peter,
Thanks for contributing such a great answer. Can you please provide a
pointer to the documentation where it explains why dd$B <- s and
dd["B"] <- s have such different behavior?

(I am perfectly happy if you write the explanation but if it saves you time
to point to some reference that works fine for me.)

Regards,
Eric



Re: as.data.frame doesn't set col.names

Duncan Murdoch
On 25/10/2017 8:15 AM, Eric Berger wrote:
> Hi Peter,
> Thanks for contributing such a great answer. Can you please provide a
> pointer to the documentation where it explains why dd$B <- s and dd["B"] <-
> s have such different behavior?

See "An Introduction to R", sections 6.1 (Lists) and 6.3 (Data frames).  Note
that dd$B is nearly the same as dd[["B"]], not dd["B"].
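
A minimal sketch of that distinction, reusing Peter's l and s (the names d1 and d2 are made up): `[[<-` and `$<-` store the value as one list element, while `[<-` treats the right-hand side as a set of columns.

```r
l <- letters[1:5]
s <- as.data.frame(sapply(l, toupper))

# [[<- behaves like $<-: s is stored whole as the single element "B"
d1 <- data.frame(A = l)
d1[["B"]] <- s
is.data.frame(d1[["B"]])    # TRUE, same nesting as d1$B <- s

# [<- replaces columns: the column of s becomes column "B"
d2 <- data.frame(A = l)
d2["B"] <- s
is.data.frame(d2[["B"]])    # FALSE, B is an ordinary column

# Extraction follows the same split
class(d2["B"])              # "data.frame" (a one-column sub-frame)
identical(d2$B, d2[["B"]])  # TRUE
```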

Duncan Murdoch

