Dear R experts,
I've realized that it might not be possible to define a negative SELCET statement in a SQL call so now I'm looking for the smoothest way to generate a list of what I would like from my large database by first pulling all the names with a query like this "SELECT top 1 * FROM your_table" (thank you Bart Joosen for the idea) and then subtract the variables I am not allow to pull manually ending up with a 'positive' definition of what I want, something I can use in a SQL SELCT statement (see my email on this list from yesterday for more on that). When I query the database for the variable names I get something similar to 'DBquery' in my working example below, but considerable longer with over 2400 hundred variables. As I only need to remove two or three variables I would like to define a lookup table (like the list 'lookup' in my example) and subtract that from my data base query. Now to my question. Is there a way I can subtract one list from another? Like setoff or alike? I would like to end up with a list like the one shown in my example called 'result.' In short, I would like to subtract 'lookup' from 'DBquery' and end up with 'result,' please note that 'result' is a list fo vecktors and not a list of dataframes. In my real life example DBquery is considerable longer so defining that by hand would make a really really long syntax. Hope someone know some smart function that I can use to solve my problem in an elegant way. Thanks for reading. Erick ###### begin R code ###### DBquery <- list(tableA=data.frame(id = numeric(0), atwin = numeric(0), atrout = numeric(0)), tableB=data.frame(id = numeric(0), mq = numeric(0), z = numeric(0), m = numeric(0)), tableC=data.frame(V1 = numeric(0), mfn = numeric(0), iiff = numeric(0)), tableD=data.frame(id = numeric(0), msf = numeric(0), oom = numeric(0))) lookup <- list(tableA= c('atwin', 'atrout'), tableB= c('m', 'z'), tableC= 'ALL') ### ... result <- list(tableA= c('id'), tableB= c('id', 'mq'), tableC= c('V1', 'mfn', 'iiff')) ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
?setdiff
Sent from my iPad On Mar 29, 2012, at 4:28, "Eric Fail" <[hidden email]> wrote: > Dear R experts, > > I've realized that it might not be possible to define a negative SELCET statement in a SQL call so now I'm looking for the smoothest way to generate a list of what I would like from my large database by first pulling all the names with a query like this "SELECT top 1 * FROM your_table" (thank you Bart Joosen for the idea) and then subtract the variables I am not allow to pull manually ending up with a 'positive' definition of what I want, something I can use in a SQL SELCT statement (see my email on this list from yesterday for more on that). > > When I query the database for the variable names I get something similar to 'DBquery' in my working example below, but considerable longer with over 2400 hundred variables. As I only need to remove two or three variables I would like to define a lookup table (like the list 'lookup' in my example) and subtract that from my data base query. Now to my question. Is there a way I can subtract one list from another? Like setoff or alike? > > I would like to end up with a list like the one shown in my example called 'result.' In short, I would like to subtract 'lookup' from 'DBquery' and end up with 'result,' please note that 'result' is a list fo vecktors and not a list of dataframes. In my real life example DBquery is considerable longer so defining that by hand would make a really really long syntax. > > Hope someone know some smart function that I can use to solve my problem in an elegant way. > > Thanks for reading. > > Erick > > ###### begin R code ###### > > DBquery <- list(tableA=data.frame(id = numeric(0), atwin = numeric(0), atrout = numeric(0)), > tableB=data.frame(id = numeric(0), mq = numeric(0), z = numeric(0), m = numeric(0)), > tableC=data.frame(V1 = numeric(0), mfn = numeric(0), iiff = numeric(0)), > tableD=data.frame(id = numeric(0), msf = numeric(0), oom = numeric(0))) > > lookup <- list(tableA= c('atwin', 'atrout'), > tableB= c('m', 'z'), > tableC= 'ALL') > > ### ... > > result <- list(tableA= c('id'), > tableB= c('id', 'mq'), > tableC= c('V1', 'mfn', 'iiff')) > > ______________________________________________ > [hidden email] mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
a working solution to the problem,
a <- DBquery[names(lookup)] mother.of.lookup <- list() for(string in names(a)) { a[[string]] <- names(a[[string]]) mother.of.lookup[[string]] <- setdiff(a[[string]], lookup[[string]]) } identical(mother.of.lookup, result) It might not be the most elegant solution, but it works. Best, Eric On Thu, Mar 29, 2012 at 4:07 AM, Jim Holtman <[hidden email]> wrote: > > ?setdiff > > Sent from my iPad > > On Mar 29, 2012, at 4:28, "Eric Fail" <[hidden email]> wrote: > > > Dear R experts, > > > > I've realized that it might not be possible to define a negative SELCET statement in a SQL call so now I'm looking for the smoothest way to generate a list of what I would like from my large database by first pulling all the names with a query like this "SELECT top 1 * FROM your_table" (thank you Bart Joosen for the idea) and then subtract the variables I am not allow to pull manually ending up with a 'positive' definition of what I want, something I can use in a SQL SELCT statement (see my email on this list from yesterday for more on that). > > > > When I query the database for the variable names I get something similar to 'DBquery' in my working example below, but considerable longer with over 2400 hundred variables. As I only need to remove two or three variables I would like to define a lookup table (like the list 'lookup' in my example) and subtract that from my data base query. Now to my question. Is there a way I can subtract one list from another? Like setoff or alike? > > > > I would like to end up with a list like the one shown in my example called 'result.' In short, I would like to subtract 'lookup' from 'DBquery' and end up with 'result,' please note that 'result' is a list fo vecktors and not a list of dataframes. In my real life example DBquery is considerable longer so defining that by hand would make a really really long syntax. > > > > Hope someone know some smart function that I can use to solve my problem in an elegant way. > > > > Thanks for reading. > > > > Erick > > > > ###### begin R code ###### > > > > DBquery <- list(tableA=data.frame(id = numeric(0), atwin = numeric(0), atrout = numeric(0)), > > tableB=data.frame(id = numeric(0), mq = numeric(0), z = numeric(0), m = numeric(0)), > > tableC=data.frame(V1 = numeric(0), mfn = numeric(0), iiff = numeric(0)), > > tableD=data.frame(id = numeric(0), msf = numeric(0), oom = numeric(0))) > > > > lookup <- list(tableA= c('atwin', 'atrout'), > > tableB= c('m', 'z'), > > tableC= 'ALL') > > > > ### ... > > > > result <- list(tableA= c('id'), > > tableB= c('id', 'mq'), > > tableC= c('V1', 'mfn', 'iiff')) > > > > ______________________________________________ > > [hidden email] mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
Try this:
mapply('[', DBquery, mapply(setdiff, lapply(DBquery, names), lookup)) On Sat, Mar 31, 2012 at 5:50 PM, Eric Fail <[hidden email]> wrote: > a working solution to the problem, > > a <- DBquery[names(lookup)] > > mother.of.lookup <- list() > for(string in names(a)) { > a[[string]] <- names(a[[string]]) > mother.of.lookup[[string]] <- setdiff(a[[string]], lookup[[string]]) > } > > identical(mother.of.lookup, result) > > It might not be the most elegant solution, but it works. > > Best, > Eric > > On Thu, Mar 29, 2012 at 4:07 AM, Jim Holtman <[hidden email]> wrote: >> >> ?setdiff >> >> Sent from my iPad >> >> On Mar 29, 2012, at 4:28, "Eric Fail" <[hidden email]> wrote: >> >> > Dear R experts, >> > >> > I've realized that it might not be possible to define a negative SELCET statement in a SQL call so now I'm looking for the smoothest way to generate a list of what I would like from my large database by first pulling all the names with a query like this "SELECT top 1 * FROM your_table" (thank you Bart Joosen for the idea) and then subtract the variables I am not allow to pull manually ending up with a 'positive' definition of what I want, something I can use in a SQL SELCT statement (see my email on this list from yesterday for more on that). >> > >> > When I query the database for the variable names I get something similar to 'DBquery' in my working example below, but considerable longer with over 2400 hundred variables. As I only need to remove two or three variables I would like to define a lookup table (like the list 'lookup' in my example) and subtract that from my data base query. Now to my question. Is there a way I can subtract one list from another? Like setoff or alike? >> > >> > I would like to end up with a list like the one shown in my example called 'result.' In short, I would like to subtract 'lookup' from 'DBquery' and end up with 'result,' please note that 'result' is a list fo vecktors and not a list of dataframes. In my real life example DBquery is considerable longer so defining that by hand would make a really really long syntax. >> > >> > Hope someone know some smart function that I can use to solve my problem in an elegant way. >> > >> > Thanks for reading. >> > >> > Erick >> > >> > ###### begin R code ###### >> > >> > DBquery <- list(tableA=data.frame(id = numeric(0), atwin = numeric(0), atrout = numeric(0)), >> > tableB=data.frame(id = numeric(0), mq = numeric(0), z = numeric(0), m = numeric(0)), >> > tableC=data.frame(V1 = numeric(0), mfn = numeric(0), iiff = numeric(0)), >> > tableD=data.frame(id = numeric(0), msf = numeric(0), oom = numeric(0))) >> > >> > lookup <- list(tableA= c('atwin', 'atrout'), >> > tableB= c('m', 'z'), >> > tableC= 'ALL') >> > >> > ### ... >> > >> > result <- list(tableA= c('id'), >> > tableB= c('id', 'mq'), >> > tableC= c('V1', 'mfn', 'iiff')) >> > >> > ______________________________________________ >> > [hidden email] mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> > and provide commented, minimal, self-contained, reproducible code. > > ______________________________________________ > [hidden email] mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49° 16' 22" O ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
In reply to this post by Eric Fail-3
On Thu, Mar 29, 2012 at 4:28 AM, Eric Fail <[hidden email]> wrote:
> Dear R experts, > > I've realized that it might not be possible to define a negative SELCET statement in a SQL call so now I'm looking for the smoothest way to generate a list of what I would like from my large database by first pulling all the names with a query like this "SELECT top 1 * FROM your_table" (thank you Bart Joosen for the idea) and then subtract the variables I am not allow to pull manually ending up with a 'positive' definition of what I want, something I can use in a SQL SELCT statement (see my email on this list from yesterday for more on that). > > When I query the database for the variable names I get something similar to 'DBquery' in my working example below, but considerable longer with over 2400 hundred variables. As I only need to remove two or three variables I would like to define a lookup table (like the list 'lookup' in my example) and subtract that from my data base query. Now to my question. Is there a way I can subtract one list from another? Like setoff or alike? > > I would like to end up with a list like the one shown in my example called 'result.' In short, I would like to subtract 'lookup' from 'DBquery' and end up with 'result,' please note that 'result' is a list fo vecktors and not a list of dataframes. In my real life example DBquery is considerable longer so defining that by hand would make a really really long syntax. > One can use except in sqlite and in a number of other database systems: library(sqldf) # first two rows of BOD BOD12 <- BOD[1:2, ] # all rows of BOD except those in BOD12 sqldf("select * from BOD except (select * from BOD12)") -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
Free forum by Nabble | Edit this page |