|
I'm trying to add empty columns to data.tables using a variable containing the name of the desired column, but I'm unable to figure out how to dereference the variable value to satisfy the := operator.

Here's a simple example:

    require(data.table);
    dt <- data.table(read.table(text="N1 N2\nA B\nC D\n", header=TRUE));
    new_col_name = "N3";
    dt[, new_col_name := NA];

That creates a column literally named "new_col_name", rather than "N3" as desired.

I can work around it this way:

    dt[, workaround := NA];
    setnames(dt, "workaround", new_col_name);

I have tried wrapping the new_col_name variable in all sorts of functions such as eval(), c(), list(), quote(), etc.; all of these generate an error such as:

    Error in `[.data.table`(dt, , `:=`(quote(new_col_name), NA)) :
      LHS of := must be a single column name when with=TRUE. When with=FALSE the LHS may be a vector of column names or positions.

Surely I am overlooking something trivial; please advise.

Thanks
George
http://stackoverflow.com/users/1313052/gkaupas

_______________________________________________
datatable-help mailing list
[hidden email]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
|
|
Hmmm, I would have liked to see

    set(dt, j=new_col_name, value=NA)

work, but that doesn't. Any reason why necessarily?

On Wed, Jul 25, 2012 at 2:03 PM, Kaupas, George <[hidden email]> wrote:
> dt[, new_col_name := NA];
>
> That creates a column literally named "new_col_name", rather than "N3" as desired.
|
|
There was a clue in the error message:

>> Error in `[.data.table`(dt, , `:=`(quote(new_col_name), NA)) :
>> LHS of := must be a single column name when with=TRUE. When with=FALSE
>> the LHS may be a vector of column names or positions.

Trying with=FALSE:

    DT = data.table(a=1:3,b=4:6)
    newcolname = "FOO"
    DT[,newcolname:=NA,with=FALSE]
       a b FOO
    1: 1 4  NA
    2: 2 5  NA
    3: 3 6  NA

But I'm thinking that wrapping the LHS with eval() or c() [the things George tried] should have worked too, and would be more natural given that's what we do elsewhere. I seem to remember either a TO DO in the source or a feature request to improve that. Will take another look.

Trying set() gives an error, as Chris said:

    newcolname = "BAR"
    set(DT, j=newcolname, value=NA)
    Error in set(DT, j = newcolname, value = NA) :
      'BAR' is not a column name. Cannot add columns with set(), use := instead to add columns by reference.

This is already FR#2077 "Improve set()'s messages to explain why it doesn't add new columns":
https://r-forge.r-project.org/tracker/index.php?func=detail&aid=2077&group_id=240&atid=978

The thinking there was that set()'s purpose is very fast updating of single cells by reference, suitable inside a loop in the rare situations where a loop is needed. Checking whether the column exists, and adding it if not, would add a branch that takes time. That was when character column names weren't acceptable to set() either, though; for speed, to save looking up the same column name over and over. Integer i (already warns if not) and integer j (should warn if not) should be much faster due to the avoidance of small allocations to coerce.

So the short answer is that set() could add new columns, but we need to make sure that is not at the expense of speed in the integer cases, and with suitable warnings to encourage use of integer j, in set() only, for speed inside loops. I'm probably over-egging the problem and it would be simple to achieve that, but those were the worries anyway.
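For anyone reading this on a more recent data.table release, here is a minimal sketch of the two ideas discussed above. It rests on assumptions that do not hold for the version in this thread: that the installed release accepts a parenthesised variable on the LHS of := and that set() adds a missing column when given a character j. The names DT, new_col_name and j_bar are only illustrative.

```r
library(data.table)

DT <- data.table(a = 1:3, b = 4:6)

# Assumption: a recent data.table release. Wrapping the variable in
# parentheses makes := use its value ("FOO") as the column name.
new_col_name <- "FOO"
DT[, (new_col_name) := NA]

# Assumption: a recent release where set() adds a missing column by name.
# In the release discussed in this thread, this call was an error.
set(DT, j = "BAR", value = NA_integer_)

# Inside a loop, resolve the column position once and pass an integer j,
# so the name is not looked up on every iteration.
j_bar <- match("BAR", names(DT))
for (i in seq_len(nrow(DT))) {
  set(DT, i = i, j = j_bar, value = i * 10L)
}

print(DT)
```

Looking up j_bar once before the loop follows the speed concern raised above: set() is meant for fast cell updates, and an integer j avoids repeating the name lookup.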
|
