replacing rows data.frame


evelyne
I create a data.frame using:
alloc <- data.frame(matrix(nrow=length(unique(mid2agi$gene)), ncol=8))
colnames(alloc) <- c('agi', 'hit_len', 'q_len', 'identity', 'ratio', 'e', 'ok' ,'gene')
alloc$gene <- unique(mid2agi$gene)

This results in:
> head(alloc)
agi hit_len q_len identity ratio  e ok           gene
 NA      NA    NA       NA    NA NA NA BrChr1g00001V4
 NA      NA    NA       NA    NA NA NA BrChr1g00002V4

I already have a data frame (mid2agi) containing both integer and factor columns.
In my empty data frame I want to replace rows using:

for (i in (1:nrow(alloc)) ) {
    find <- alloc[i,]$gene
    submid2agi <- subset(mid2agi, gene %in% find)
    max <- which.max(submid2agi$identity * submid2agi$ratio)
    if (length(max) > 0){
        alloc[i,] <- submid2agi[max,]
    }
}

My problem is that all values are now interpreted as integers, so the text in my factor columns is converted to numbers.
Can anyone give me tips on how to solve this?

Output:
  agi hit_len q_len identity   ratio          e ok           gene
18296     344   551   86.919 0.62432 2.1142e-89  2 BrChr1g00001V4

Should be:
        agi hit_len q_len identity   ratio          e   ok           gene
AT4G38360.2     344   551   86.919 0.62432 2.1142e-89 True BrChr1g00001V4

Thank you.
Evelyne


 

Re: replacing rows data.frame

Rui Barradas
Hello,

Your problem comes from the fact that, as new values are inserted in
'alloc', the column alloc$agi keeps changing, so R can't know all the
factor levels beforehand. The column starts out as plain NAs, not as a
factor, so the values inserted are the integer codes of the original
factor rather than its labels. Since your example doesn't run, I've made
up a dataset. Try it to see what happens.


x <- data.frame(A = letters[1:10], B = 1:10,
                stringsAsFactors = TRUE)  # make A a factor (no longer the default since R 4.0.0)
str(x)

y <- data.frame(matrix(nrow=5, ncol=2))  # empty target, all NA
colnames(y) <- colnames(x)

for(i in 1:10){
     if(i %% 2 == 0){  # example condition
         y[i/2, 1] <- as.character(x[i, 1])  # the factor column, copied as text
         y[i/2, -1] <- x[i, -1]  # all but 1st column
     }
}
y$A <- factor(y$A)  # in the end we know the levels.
str(y)


Likewise, you must first fill in the column with character values and
then, at the end, coerce it to factor.
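
A minimal sketch of the same idea applied to a data frame shaped like the one in the question. The toy 'mid2agi' below is invented for illustration (only the column names come from the original post), and the names 'is_fac', 'genes', 'sub' and 'best' are hypothetical. The factor columns are converted to character before the loop, so that row assignment copies the text rather than the integer codes, and are coerced back to factor at the end.

```r
# Hypothetical stand-in for mid2agi; all values are invented for illustration.
mid2agi <- data.frame(
  agi      = c("AT1G01010.1", "AT4G38360.2", "AT2G02020.1"),
  identity = c(80.5, 86.919, 70.1),
  ratio    = c(0.50, 0.62432, 0.90),
  ok       = c("False", "True", "True"),
  gene     = c("BrChr1g00001V4", "BrChr1g00001V4", "BrChr1g00002V4"),
  stringsAsFactors = TRUE
)

# Convert every factor column to character up front.
is_fac <- vapply(mid2agi, is.factor, logical(1))
mid2agi[is_fac] <- lapply(mid2agi[is_fac], as.character)

# Empty target with one row per gene, as in the question.
genes <- unique(mid2agi$gene)
alloc <- data.frame(matrix(nrow = length(genes), ncol = ncol(mid2agi)))
colnames(alloc) <- colnames(mid2agi)
alloc$gene <- genes

for (i in seq_len(nrow(alloc))) {
  sub  <- mid2agi[mid2agi$gene == alloc$gene[i], ]
  best <- which.max(sub$identity * sub$ratio)  # row with the highest score
  if (length(best) > 0) {
    alloc[i, ] <- sub[best, ]
  }
}

# Coerce back to factor at the end, once all levels are known.
alloc[is_fac] <- lapply(alloc[is_fac], factor)
str(alloc)
```

Using 'best' instead of 'max' as a variable name also avoids masking the base function max().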

Hope this helps,

Rui Barradas
(In reply to evelyne's message of 19-10-2012 12:06.)

Re: replacing rows data.frame

evelyne
Thanks a lot.

I also found that using "stringsAsFactors=FALSE" helps.
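
For reference, a small sketch of what that option changes. The thread does not say where the option was set; the sketch assumes it is passed to data.frame() (read.table() and read.csv() accept the same argument). The toy data and the names 'as_factor', 'as_text', 'target', 'by_factor' and 'by_text' are invented for illustration.

```r
# With stringsAsFactors = TRUE, text columns become factors (integer codes plus levels).
as_factor <- data.frame(agi = c("AT4G38360.2", "AT1G01010.1"),
                        stringsAsFactors = TRUE)
# With stringsAsFactors = FALSE, they stay plain character vectors.
as_text <- data.frame(agi = c("AT4G38360.2", "AT1G01010.1"),
                      stringsAsFactors = FALSE)

# A column of NAs that is not a factor, like the columns of 'alloc'.
target <- data.frame(agi = c(NA, NA))

by_factor <- target
by_factor[1, ] <- as_factor[1, , drop = FALSE]  # stores the factor's integer code

by_text <- target
by_text[1, ] <- as_text[1, , drop = FALSE]      # stores the text itself

str(by_factor)
str(by_text)
```

With character columns there are no level codes to fall back on, so row assignment into the empty data frame keeps the text.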