For loop with ifelse help


Pele
Hello R users,

I have 2 files (file1 and f2) and I am trying to sum columns 6:10 of a specific row in f2 and append it to
file1 if the "state" variable in file1 equals the rowname in f2.  Below is an example of the code I wrote
using a for loop, but it is not working (i.e., it only works for the last number (10) in the loop). Can someone tell me how to fix it?

Many thanks !

 file1 <- data.frame(ID=seq(1:30), state=sample(1:10, 30, replace=TRUE)); file1
   ID state
1   1     7
2   2     7
3   3     6
4   4     4
5   5     5
6   6     7
7   7    10
8   8     1
9   9     1
10 10     5
............
.........

 file2 <- matrix(seq(1:100),nrow=10)
  f2 <- as.data.frame(file2); f2
   V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
1   1 11 21 31 41 51 61 71 81  91
2   2 12 22 32 42 52 62 72 82  92
3   3 13 23 33 43 53 63 73 83  93
4   4 14 24 34 44 54 64 74 84  94
5   5 15 25 35 45 55 65 75 85  95
6   6 16 26 36 46 56 66 76 86  96
7   7 17 27 37 47 57 67 77 87  97
8   8 18 28 38 48 58 68 78 88  98
9   9 19 29 39 49 59 69 79 89  99
10 10 20 30 40 50 60 70 80 90 100
 
 
 for (i in length(f2)) {
 file1$chksum <- ifelse ((file1$state==rownames(f2)[i]), rowSums(f2[rownames(f2)[i], 6:10]), 0)
                      }
 print(file1)
   ID state chksum
1   1     7      0
2   2     7      0
3   3     6      0
4   4     4      0
5   5     5      0
6   6     7      0
7   7    10    400
8   8     1      0
9   9     1      0
10 10     5      0
11 11    10    400
12 12     9      0
13 13    10    400
14 14     9      0
15 15     5      0
16 16     3      0
17 17     1      0
18 18     7      0
19 19     7      0
20 20     2      0
21 21     3      0
22 22     8      0
23 23     8      0
24 24     4      0
25 25     6      0
26 26     6      0
27 27     3      0
28 28     3      0
29 29     5      0
30 30     5      0

Re: For loop with ifelse help

David Winsemius

On Sep 22, 2010, at 11:42 AM, Pele wrote:

> for (i in length(f2)) {
> file1$chksum <- ifelse ((file1$state==rownames(f2)[i]),
> rowSums(f2[rownames(f2)[i], 6:10]), 0)
>                      }

That looks overly complex and inefficient (not to mention wrong):

Try:
res <- merge(file1, data.frame(chksum = rowSums(file2[, 6:10])),
             by.x = "state", by.y = "row.names", all.x = TRUE)

If you need it sorted by ID then:

res[order(res$ID), ]
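
As for why the original loop only filled in state 10: length(f2) is the number of columns of the data frame (10), so for (i in length(f2)) runs exactly once with i = 10, and each ifelse() call overwrites the whole chksum column anyway. A minimal sketch of a repaired loop, assuming the states are the integers 1 to nrow(f2) as in the example (the set.seed() call and the helper name hit are not in the original post; they are only there for illustration):

```r
set.seed(42)  # assumption: added only so the example is reproducible
file1 <- data.frame(ID = 1:30, state = sample(1:10, 30, replace = TRUE))
f2 <- as.data.frame(matrix(1:100, nrow = 10))

file1$chksum <- 0                          # initialise once, outside the loop
for (i in seq_len(nrow(f2))) {             # visit every row of f2, not just one value
  hit <- file1$state == rownames(f2)[i]    # records whose state matches this row
  file1$chksum[hit] <- sum(f2[i, 6:10])    # update only the matching records
}
```

This still scans file1 once per state, so on millions of records a vectorised lookup is likely to be much faster.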

--
David.

> --
> View this message in context: http://r.789695.n4.nabble.com/For-loop-with-ifelse-help-tp2550547p2550547.html
> Sent from the R help mailing list archive at Nabble.com.

David Winsemius, MD
West Hartford, CT

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: For loop with ifelse help

Pele
Hi David - thanks for your suggestion, but I am trying to avoid doing any merging and sorting for this step because the real file I will be working with has about 20 million records.  If I can get this loop or something similar to work, that will be good enough.

thanks again..


Re: For loop with ifelse help

Steve Lianoglou-6
Hi Pele,

On Wed, Sep 22, 2010 at 12:40 PM, Pele <[hidden email]> wrote:
>
> Hi David - thanks for your suggestion, but I am trying to avoid doing any
> merging and sorting for this step because the real file I will be working
> with has about 20 million records.  If I can get this loop  or something
> similar to work will be good enough.

If that's the case, you might consider looking at the sqldf or
data.table packages.

data.table implements a data.frame-like object, and sqldf lets you run
SQL against ordinary data.frames; both can do subsetting (and merging)
rather quickly since they can build indexes over "keys" (columns) of
the data.

Subsetting "normal" data.frames in the scenario you describe involves
a linear search for every query through the column(s) you are querying
against, which can get slow as your data.frames get large.
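
For illustration, here is a minimal sketch of the data.table route, assuming the data.table package is installed and that the states are the integers 1 to nrow(file2). It uses an update join with on= rather than setkey(), so file1 keeps its original row order. The table name sums and the set.seed() call are made up for this example.

```r
library(data.table)

set.seed(42)  # assumption: only to make the sketch reproducible
file1 <- data.table(ID = 1:30, state = sample(1:10, 30, replace = TRUE))
file2 <- matrix(1:100, nrow = 10)

# One row per state holding the precomputed sum of columns 6:10
sums <- data.table(state = seq_len(nrow(file2)),
                   chksum = rowSums(file2[, 6:10]))

# Update join: match file1$state against sums$state and add chksum
# to file1 by reference, without copying or re-sorting file1
file1[sums, on = "state", chksum := i.chksum]
```

The join adds the column in place, which matters when the table has millions of rows and copies are expensive.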

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


Re: For loop with ifelse help

sayan dasgupta
Hi Pele,

I think this should work

file1$state.sum <- rowSums(file2[file1$state, 6:10])
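
This works because indexing the matrix by file1$state pulls out one row of file2 per record, and rowSums() then sums each of those rows in a single vectorised call. With about 20 million records that intermediate matrix also has 20 million rows, so a variant that may use less memory is to sum each row of file2 once and then index the result. A minimal sketch, assuming the state values are the integers 1 to nrow(file2) (state_sums is a name made up for this illustration, and set.seed() is only there for reproducibility):

```r
set.seed(42)  # assumption: only to make the sketch reproducible
file1 <- data.frame(ID = 1:30, state = sample(1:10, 30, replace = TRUE))
file2 <- matrix(1:100, nrow = 10)

# Sum columns 6:10 once per row of file2 (10 sums in total)
state_sums <- rowSums(file2[, 6:10])

# Look up the precomputed sum for every record by row position
file1$state.sum <- state_sums[file1$state]
```

If the states were labels rather than row positions, match(file1$state, rownames(f2)) would give the positions to index with.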


On Thu, Sep 23, 2010 at 7:46 PM, Steve Lianoglou <
[hidden email]> wrote:

> Hi Pele,
>
> On Wed, Sep 22, 2010 at 12:40 PM, Pele <[hidden email]> wrote:
> >
> > Hi David - thanks for your suggestion, but I am trying to avoid doing any
> > merging and sorting for this step because the real file I will be working
> > with has about 20 million records.  If I can get this loop  or something
> > similar to work will be good enough.
>
> If that's the case, you might consider looking at the sqldf or
> data.table packages.

Re: For loop with ifelse help

Pele
Hi Sayan,

This is exactly what I was looking for - it worked perfectly.

Many thanks!!

Also, thanks to everyone else for their suggestions.

Pele