using for loop with data frames.

Marcelo Mariano Silva
Hi,

Is it possible to use a loop to process many data frames in the same way?

For example, if I have three data frames, all with the same variables:


df_bs_id1 <- read.csv("test1.csv",header =TRUE)
df_bs_id2 <- read.csv("test2.csv",header =TRUE)
df_bs_id3 <- read.csv("test3.csv",header =TRUE)


How could I implement a loop that, for instance, selects two columns of
interest from each data frame, along the lines of the code below?


# selecting only 2 columns of interest

for (i in 1:3) {
df_selected[i] <- df_bs_id[i][ , c("column1", "column2")]  }


Thanks

MMS

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: using for loop with data frames.

R help mailing list-2
 
Why not just use rbind() to combine them into one data.frame?
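
A minimal sketch of what the rbind() suggestion could look like. The three small data frames stand in for the CSV files from the question, and the `id` column and the names `df_all` and `df_selected` are assumptions added for illustration, not part of the original post.

```r
# Toy stand-ins for the three CSV files (hypothetical data).
df_bs_id1 <- data.frame(column1 = 1:2, column2 = c("a", "b"), column3 = c(TRUE, FALSE))
df_bs_id2 <- data.frame(column1 = 3:4, column2 = c("c", "d"), column3 = c(FALSE, FALSE))
df_bs_id3 <- data.frame(column1 = 5:6, column2 = c("e", "f"), column3 = c(TRUE, TRUE))

# Tag each data frame with its origin so the rows stay distinguishable
# after stacking (the id column is an assumption, not in the question).
df_bs_id1$id <- 1
df_bs_id2$id <- 2
df_bs_id3$id <- 3

# Stack the rows; rbind() matches data frame columns by name.
df_all <- rbind(df_bs_id1, df_bs_id2, df_bs_id3)

# Select the columns of interest once, for all rows.
df_selected <- df_all[, c("id", "column1", "column2")]
```

This only works when all the data frames share the same columns, which the question says is the case here.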
    In reply to Marcelo Mariano Silva's message of Thursday, May 10, 2018, 10:34:19 a.m. EDT.
Re: using for loop with data frames.

MacQueen, Don
In reply to this post by Marcelo Mariano Silva
Evidently, you want your loop to create new data frames, named (in this example)
  df_selected1
  df_selected2
  df_selected3

Yes, it can be done. But to do it you will have to use the get() and assign() functions, and construct the data frame names as character strings. Syntax like
    df_bs_id[3]
does not give you df_bs_id3.
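
For reference, a rough sketch of the get()/assign() route described above. The toy data frames are hypothetical stand-ins for the CSV files, and the helper names `src_name` and `dst_name` are made up for this example.

```r
# Toy stand-ins for the data frames read from test1.csv ... test3.csv.
df_bs_id1 <- data.frame(column1 = 1:2, column2 = 3:4, column3 = 5:6)
df_bs_id2 <- data.frame(column1 = 7:8, column2 = 9:10, column3 = 11:12)
df_bs_id3 <- data.frame(column1 = 13:14, column2 = 15:16, column3 = 17:18)

for (i in 1:3) {
  # Build the variable names as character strings.
  src_name <- paste0("df_bs_id", i)
  dst_name <- paste0("df_selected", i)

  # get() looks the object up by name; assign() creates a new object
  # under the constructed name in the current environment.
  src <- get(src_name)
  assign(dst_name, src[, c("column1", "column2")])
}

# df_selected1, df_selected2 and df_selected3 now exist as separate objects.
```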

R experts typically discourage this kind of approach. A method more consistent with how R is designed to work would be to store the data frames as elements of a list.

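dflst <- list(df_bs_id1, df_bs_id2, df_bs_id3)
nframes <- length(dflst)
newdf <- dflst   # start from a copy; each element is overwritten below

for (id in seq_len(nframes)) {
   # [[ ]] replaces the whole list element with the two-column data frame
   newdf[[id]] <- dflst[[ id ]][ , c("column1", "column2")]
}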

Optionally, you could name the elements of the processed list:
 
    names(newdf) <- paste0('df_selected', seq_along(newdf))

After which you would have the original data frames as elements of dflst, and the processed data frames as elements of newdf. The loop can be simplified a bit if you don't need to keep copies of the original data frames.
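
One possible way to simplify, sketched under the assumption that the list is built from toy data frames rather than the real files. The names `keep` and `selected` are introduced here for illustration only.

```r
# Hypothetical list of data frames with the same columns.
dflst <- list(
  data.frame(column1 = 1:2, column2 = 3:4, column3 = 5:6),
  data.frame(column1 = 7:8, column2 = 9:10, column3 = 11:12)
)
keep <- c("column1", "column2")

# Without an explicit loop: lapply() returns a new list with one
# processed data frame per input element.
selected <- lapply(dflst, function(d) d[, keep])

# Or, if the originals are not needed, overwrite each element in place.
for (id in seq_along(dflst)) {
  dflst[[id]] <- dflst[[id]][, keep]
}
```

Either form avoids the separate copy of the list; which one reads better is largely a matter of taste.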

With this approach, it would be better to create dflst using a loop over the incoming file names, running read.csv() inside the loop. In that case you would never create the separate data frames df_bs_id1, df_bs_id2, etc.
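
A sketch of reading the files straight into a list. To keep it self-contained, it first writes three small CSV files to a temporary directory. Those files, the `tmp` and `files` variables, and the choice to name the elements after the files are all assumptions for the example.

```r
# Create three small CSV files as stand-ins for test1.csv ... test3.csv.
tmp <- tempfile("csvdemo")
dir.create(tmp)
files <- file.path(tmp, paste0("test", 1:3, ".csv"))
for (f in files) {
  write.csv(data.frame(column1 = 1:2, column2 = 3:4, column3 = 5:6),
            f, row.names = FALSE)
}

# Read each file into its own list element; no df_bs_id1, df_bs_id2, ...
# objects are ever created.
dflst <- vector("list", length(files))
for (i in seq_along(files)) {
  dflst[[i]] <- read.csv(files[i], header = TRUE)
}

# Name the elements after their source files for easier lookup.
names(dflst) <- basename(files)
```

With real data, `files` would be the vector of actual file paths, for example from list.files() with a suitable pattern.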

I have used both approaches at various times over the years, and each has pros and cons. In general, I would recommend the list approach, however, especially if you have a large number of files to process.

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
Lab cell 925-724-7509
 
 
Original message posted to R-help by Marcelo Mariano Silva on 5/10/18, 7:33 AM.