dataframes with only one variable

Erich Neuwirth
Subsetting from a dataframe with only one variable
returns a vector, not a dataframe.
This seems somewhat inconsistent.
Wouldn't it be better if subsetting would respect
the structure completely?


v1<-1:4
v2<-4:1
df1<-data.frame(v1)
df2<-data.frame(v1,v2)
sel1<-c(TRUE,TRUE,TRUE,TRUE)

> df1[sel1,]
[1] 1 2 3 4
> df2[sel1,]
  v1 v2
1  1  4
2  2  3
3  3  2
4  4  1

--
Erich Neuwirth
Institute for Scientific Computing and
Didactic Center for Computer Science
University of Vienna
phone: +43-1-4277-39464  fax: +43-1-4277-39459

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: dataframes with only one variable

TEMPL Matthias
> Subsetting from a dataframe with only one variable
> returns a vector, not a dataframe.
> This seems somewhat inconsistent.
> Wouldn't it be better if subsetting would respect
> the structure completely?
>
>
> v1<-1:4
> v2<-4:1
> df1<-data.frame(v1)
> df2<-data.frame(v1,v2)
> sel1<-c(TRUE,TRUE,TRUE,TRUE)
>
> > df1[sel1,]


df1[sel1, , drop=FALSE]

Should do what you want.
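
A minimal sketch of the suggestion above, reusing the objects from the original post. The names `dropped` and `kept` are only illustrative and do not appear in the thread.

```r
# Same objects as in the original post.
v1 <- 1:4
df1 <- data.frame(v1)
sel1 <- c(TRUE, TRUE, TRUE, TRUE)

dropped <- df1[sel1, ]                # default: the single column collapses to a vector
kept    <- df1[sel1, , drop = FALSE]  # structure preserved

is.data.frame(dropped)  # FALSE
is.data.frame(kept)     # TRUE
dim(kept)               # 4 1
```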

Best,
Matthias

> [1] 1 2 3 4
> > df2[sel1,]
>   v1 v2
> 1  1  4
> 2  2  3
> 3  3  2
> 4  4  1


Re: dataframes with only one variable

Prof Brian Ripley
In reply to this post by Erich Neuwirth
On Wed, 11 Jan 2006, Erich Neuwirth wrote:

> Subsetting from a dataframe with only one variable
> returns a vector, not a dataframe.
> This seems somewhat inconsistent.

Not at all.  It is entirely consistent with matrix-like indexing (the form
you used).

> Wouldn't it be better if subsetting would respect
> the structure completely?

It depends how you do it.  [sel1,]  parallels a matrix, and drops
dimensions unless drop = FALSE is supplied.  List-like indexing by
column, e.g. [1], returns a one-column df, and [[1]] returns a vector.

It is just a question of choosing the appropriate tool.  And any changes
to this sort of thing (from the White book) would break a lot of careful
code.
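
To make the distinction concrete, here is a small sketch comparing the indexing forms on the data frame from the original post. It selects the column by name for the list-like forms, which is an assumption about what those forms are meant to index; the variable names `m`, `md`, `l1` and `l2` are illustrative only.

```r
v1 <- 1:4
df1 <- data.frame(v1)
sel1 <- c(TRUE, TRUE, TRUE, TRUE)

m  <- df1[sel1, ]                # matrix-like: drops to an integer vector
md <- df1[sel1, , drop = FALSE]  # matrix-like: stays a data frame
l1 <- df1["v1"]                  # list-like: one-column data frame
l2 <- df1[["v1"]]                # element extraction: the column itself

class(m)   # "integer"
class(md)  # "data.frame"
class(l1)  # "data.frame"
class(l2)  # "integer"
```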

>
>
> v1<-1:4
> v2<-4:1
> df1<-data.frame(v1)
> df2<-data.frame(v1,v2)
> sel1<-c(TRUE,TRUE,TRUE,TRUE)
>
>> df1[sel1,]
> [1] 1 2 3 4
>> df2[sel1,]
>  v1 v2
> 1  1  4
> 2  2  3
> 3  3  2
> 4  4  1

--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: dataframes with only one variable

Richard M. Heiberger
In reply to this post by Erich Neuwirth
> df1
  v1
1  1
2  2
3  3
4  4
> df1[,]
[1] 1 2 3 4
> df1[,1]
[1] 1 2 3 4
> df1[,,drop=F]
  v1
1  1
2  2
3  3
4  4
> df1[,1,drop=F]
  v1
1  1
2  2
3  3
4  4
> df1[1]
  v1
1  1
2  2
3  3
4  4
> df1[[1]]
[1] 1 2 3 4
>


For transfers from Excel to R using the "[put/get] R dataframe" commands,
I think it is important always to use the drop=FALSE argument
(as I assume you are doing in RExcel V1.55).  The reason
for this is to maintain a rigid relationship between the only partially
compatible conventions of Excel and R.

For strictly within R use, the case is less clear.  I have trained myself
always (well 85% on the first try) to use the drop=FALSE argument when I care
about the structure after the copy.
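
One way to see why the structure can matter downstream is sketched below. The helper `count_rows` is hypothetical, introduced only for illustration, and the selector differs from the one in the original post so that the subset is a proper one.

```r
v1 <- 1:4
df1 <- data.frame(v1)
sel1 <- c(TRUE, FALSE, TRUE, FALSE)

# Hypothetical helper, not from the thread: counts the rows of a subset.
count_rows <- function(d) nrow(d)

count_rows(df1[sel1, ])                # NULL: the result is a plain vector
count_rows(df1[sel1, , drop = FALSE])  # 2
```

Code that assumes a data frame comes back (nrow, names, column access by name) may quietly misbehave in the first case, which is where drop=FALSE helps.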

The tension between keeping the structure and demoting the structure predates
data.frames.  This was a design issue in matrices as well.
> tmp <- matrix(1:6,2,3)
> tmp
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> tmp[1,]
[1] 1 3 5
> tmp[1,,drop=FALSE]
     [,1] [,2] [,3]
[1,]    1    3    5
>
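
As a small extension of the matrix example, the sketch below checks the dimensions directly with dim(). It also selects a single column and a single cell, which the transcript above does not show.

```r
tmp <- matrix(1:6, 2, 3)

dim(tmp[, 1])                 # NULL: a single column drops to a vector
dim(tmp[, 1, drop = FALSE])   # 2 1
dim(tmp[1, 1, drop = FALSE])  # 1 1
```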

Rich
