help with error: DV "converted to a factor"

cwladis
I've spent several days putting together the following code. I apologize in advance: it is very inelegant and I'm sure it could be written much more efficiently, but because of my inexperience with R I have stuck with whatever method I could get to run without an error.

My main motivation for writing the code is to assess interaction effects. I want to rotate through the reference values so that I can examine interaction effects other than the default ones reported for the single set of reference values used in a single analysis. The code below generates a lot more than that, but since I am such an R newbie (and therefore struggling to learn many different things at once), I thought that if I could generate one big matrix with all the pairwise comparisons, I could pick out what I actually want later (probably manually at first, and later, I hope, by automating my code to return a matrix with just the comparisons I want).
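As a minimal sketch of what rotating a reference value means (the toy factor below is made up and is not the real dataset), relevel() moves the chosen level to the front of the level order, and the first level is what R's default treatment contrasts use as the baseline:

```r
# Toy factor, made up for illustration only
method <- factor(c("online", "face-to-face", "online", "face-to-face"))
levels(method)          # default alphabetical order, "face-to-face" first

# Make "online" the reference (baseline) level
method2 <- relevel(method, ref = "online")
levels(method2)         # "online" is now the first level
```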

Here is the code:
        DV<-factors[,1]
        IV<-factors[,2]
        int<-factors[,3]
        IV<-IV[!is.na(IV)]
        int<-int[!is.na(int)]
#Limit our modification of reference values to categorical variables only by redefining IV vector as IVcat
        IVcat<-IV
        for(p in 1:length(IV)){
                if((class(dataset[[toString(IV[p])]])=="character")|(class(dataset[[toString(IV[p])]])=="factor")){
                IVcat[p]<-IV[p]
                }
                else{IVcat[p]<-""}
        }
        IVcat<-IVcat[!is.na(IVcat)]
#Create vectors (IVvalslist[n] for nth factor) for each IVcat containing each possible value for that IV
        IVvalslist<-vector('list', length(IVcat))
        for(i in 1:length(IVcat)){
                assign(paste("IVvalues",i,sep=""),unique(dataset[[toString(IVcat[i])]])[!is.na(unique(dataset[[toString(IVcat[i])]]))])
                IVvalslist[i]<-list(get(paste("IVvalues",i,sep="")))
                }
#Create a data frame (refM) with every combination of values for each IVcat
        refM<-expand.grid(IVvalslist)
#Loop through all possible reference values, and then run the model,
#and then compile the model summary output into a single matrix
        #Go through each row of the matrix of possible reference value combinations
        for(j in 1:nrow(refM)){
                #Go through each reference value for each factor in that row, and assign
                #that reference value for that factor
                for(k in 1:length(IVcat)){
                        dataset[[ toString(IVcat[k]) ]]<-relevel(dataset[[ toString(IVcat[k]) ]],ref=toString(refM[j,k]))
                }
                #Run model with new reference values from row j
                model<-paste(paste(DV[1],"~1",sep=""),paste(IV,collapse="+"),paste(int,collapse="+"),sep="+")
                modeloutput<-glm(model,family=binomial(logit),data=dataset)
                #Assign all output from every possible combo of ref values to a single matrix named coeffM
                if(j==1){coeffM<-coef(summary(modeloutput))}
                if(j>1){coeffM<-rbind(coeffM,coef(summary(modeloutput)))}
        }
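To make the intermediate objects easier to picture, here is a small sketch of what refM and the model string look like. The level values are made up, and DV, IV and int are entered here as plain character vectors (an assumption; in the code above they come from the columns of factors):

```r
# Made-up level values standing in for the real ones
IVvalslist <- list(c("online", "face-to-face"), c("LL", "UL"))
refM <- expand.grid(IVvalslist)
nrow(refM)   # one row per combination of reference values: 2 * 2

# The same paste() calls as in the loop, with the names from 'factors'
DV  <- "retention"
IV  <- c("method", "level", "gpa")
int <- "method*level"
model <- paste(paste(DV[1], "~1", sep = ""),
               paste(IV, collapse = "+"),
               paste(int, collapse = "+"),
               sep = "+")
model
```

The resulting string is what gets handed to glm() as the formula on each pass through the loop.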


I tested each step of the code individually, and each individual step runs fine.  However, when I run the whole set of code at once, I get the following error message after the last line of R input above:

Warning in model.matrix.default(mt, mf, contrasts) :
  variable 'retention' converted to a factor
Error in weights * y : non-numeric argument to binary operator

If I then ask R to return coeffM or modeloutput, it tells me that no such objects exist.
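One thing that might be worth ruling out, offered as a sketch on invented data rather than a diagnosis: a response column that is stored as text (character) instead of as numbers. In current versions of R, fitting a binomial glm() to a character response appears to stop with the same kind of "non-numeric argument" error, and converting the column to numeric lets the fit go through. The data frame toy and all of its values are made up for illustration.

```r
set.seed(1)
# Invented data; 'retention' is deliberately stored as character
toy <- data.frame(
    retention = as.character(rbinom(40, 1, 0.6)),
    gpa       = round(runif(40, 1, 4), 2),
    stringsAsFactors = FALSE
)
class(toy$retention)

# Capture the error message instead of stopping
msg <- tryCatch(
    {
        glm(retention ~ gpa, family = binomial(logit), data = toy)
        "no error"
    },
    error = function(e) conditionMessage(e)
)
msg

# After converting the response to numeric the model can be fitted
toy$retention <- as.numeric(toy$retention)
fit <- glm(retention ~ gpa, family = binomial(logit), data = toy)
coef(summary(fit))
```

Running class(dataset$retention) or str(dataset) on the real data would show whether this applies here.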


The object factors is as follows:
> factors
       var1   var2         var3
1 retention method method*level
2      <NA>  level         <NA>
3      <NA>    gpa         <NA>

I can't give the actual dataset (named dataset above) here for human-subjects reasons, but here is a made-up sample of what it looks like:

    id instructor       method success retention level    career
1 1001      NAME1       online       1         1    LL    career
2 1002      NAME2 face-to-face       1         1    UL lib. arts

      STEM  required                 ethnicity gender age finaid  gpa
1 non-STEM  elective Asian or Pacific Islander      M  28        1.97
2 non-STEM maj. req. Asian or Pacific Islander      F  21   none 3.01

      experience credits yrsenrolled       course
1 no online exp.      22           1 NAME1_MAT100
2 no online exp.      33           2 NAME2_ENG100
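For anyone who wants to experiment, a few of the columns from the made-up sample can be typed in as a data frame. This is only a sketch: sample_df is a hypothetical name, only five of the columns are included, and two rows are far too few to fit a model. Checking the column classes shows which columns are stored as text and which as numbers, and factor() turns the text columns into the factors that relevel() works on.

```r
# Hypothetical name; values copied from the made-up sample rows
sample_df <- data.frame(
    id        = c(1001, 1002),
    method    = c("online", "face-to-face"),
    retention = c(1, 1),
    level     = c("LL", "UL"),
    gpa       = c(1.97, 3.01),
    stringsAsFactors = FALSE
)

# Storage class of every column
sapply(sample_df, class)

# relevel() works on factors, so convert the text columns first
sample_df$method <- factor(sample_df$method)
sample_df$level  <- factor(sample_df$level)
levels(relevel(sample_df$method, ref = "online"))
```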

If anyone can help me figure out what is going wrong here, I'd be incredibly grateful! I've searched repeatedly for this error, but the other instances I've found don't seem to apply to this situation. As I am just learning R, though, I may be missing something really obvious.

Thanks again for taking the time to read my post!