|
Hi,
I am wondering if anyone knows of an easy way to fit a saturated model using the sem package on raw data? Say the data were:

mtcars[, c("mpg", "hp", "wt")]

The model would estimate the three means (intercepts) of c("mpg", "hp", "wt"), the variances of c("mpg", "hp", "wt"), the covariances of mpg with hp and wt, and the covariance of hp with wt.

I am interested in this because I want to obtain the MLE mean vector and covariance matrix when there is missing data (i.e., by maximizing the sum of the casewise log-likelihoods, so-called full information maximum likelihood). Here is some example missing data:

dat <- as.matrix(mtcars[, c("mpg", "hp", "wt")])
dat[sample(length(dat), length(dat) * .25)] <- NA
dat <- as.data.frame(dat)

It is not too difficult to write a wrapper that does this in the OpenMx package because you can easily define paths using vectors and get all pairwise combinations using:

combn(c("mpg", "hp", "wt"), 2)

but I would prefer to use the sem package, because OpenMx does not work on 64-bit versions of R for Windows and is not available from CRAN presently. Obviously it is not difficult to write out the model, but I am hoping to bundle this in a function that, for some arbitrary data, will return the FIML-estimated covariance (and correlation) matrix. Alternatively, if there are any functions/packages that just return FIML estimates of a covariance matrix from raw data, that would be great (but googling and using findFn() from the sos package did not turn up good results).

Thanks!

Josh

--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code. |
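For readers unfamiliar with the "sum of the casewise likelihoods" mentioned in the question, the following is a minimal base-R sketch of that objective, assuming multivariate normality. The function name fiml_loglik and its arguments are hypothetical and do not come from the sem, OpenMx, or lavaan packages. Each row contributes a normal log-density over only the variables it actually has observed; FIML estimates are the mean vector and covariance matrix that maximize this sum.

```r
# Casewise (FIML) log-likelihood for a multivariate normal model.
# x:     numeric matrix or data frame, NA marks a missing value
# mu:    candidate mean vector, one element per column of x
# sigma: candidate covariance matrix, ncol(x) by ncol(x)
# NOTE: illustrative helper, not part of any package.
fiml_loglik <- function(x, mu, sigma) {
  x <- as.matrix(x)
  total <- 0
  for (i in seq_len(nrow(x))) {
    obs <- !is.na(x[i, ])
    if (!any(obs)) next                  # a fully missing row carries no information
    d <- x[i, obs] - mu[obs]             # deviations on the observed variables only
    s <- sigma[obs, obs, drop = FALSE]   # matching block of the covariance matrix
    log_det <- as.numeric(determinant(s, logarithm = TRUE)$modulus)
    quad <- drop(crossprod(d, solve(s, d)))
    total <- total - 0.5 * (sum(obs) * log(2 * pi) + log_det + quad)
  }
  total
}
```

With complete data this reduces to the ordinary multivariate normal log-likelihood, so maximizing it gives the usual ML mean and (divisor n) covariance.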
|
Dear Joshua,
If I understand correctly what you want to do, the sem package won't do it. That is, the sem() function won't do what is often called FIML estimation for models with missing data. I've been thinking about implementing this feature, and don't think that it would be too difficult, but I can't promise when and if I'll get to it. You might also take a look at the lavaan package.

As well, I must admit to some skepticism about the FIML estimator, as opposed to approaches such as multiple imputation of missing data. I suspect that the former is more sensitive than the latter to the assumption of multinormality.

Best,
John

--------------------------------
John Fox
Senator William McMaster
Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox

> -----Original Message-----
> From: [hidden email] [mailto:r-help-bounces@r-project.org] On Behalf Of Joshua Wiley
> Sent: July-12-12 2:53 AM
> To: [hidden email]
> Cc: John Fox
> Subject: [R] easy way to fit saturated model in sem package?
>
> Hi,
>
> I am wondering if anyone knows of an easy way to fit a saturated model
> using the sem package on raw data? [...] |
|
Hello,
There's a package, lavaan, that implements FIML as an option of its sem() function. I have never used it, though, so I can't say much about it.

Hope this helps,

Rui Barradas

On 12-07-2012 16:20, John Fox wrote:
> [...] You might also take a look at the lavaan package. |
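As a rough illustration of the suggestion above, here is a sketch of how the saturated model from the original question might be written for lavaan and fitted with its missing = "fiml" option. It assumes lavaan is installed and that the column names are valid lavaan variable names; the wrapper name fiml_cov and the seed are hypothetical, and the model string is built with combn() as the original poster described.

```r
library(lavaan)

# Fit a saturated model (all means, variances, covariances) by FIML and
# return the estimated moments. Illustrative wrapper; needs >= 2 columns.
fiml_cov <- function(data) {
  vars  <- names(data)
  pairs <- combn(vars, 2)                  # all pairwise combinations
  model <- c(
    paste(vars, "~ 1"),                    # means (intercepts)
    paste(vars, "~~", vars),               # variances
    paste(pairs[1, ], "~~", pairs[2, ])    # covariances
  )
  fit <- sem(paste(model, collapse = "\n"), data = data, missing = "fiml")
  est <- fitted(fit)                       # model-implied mean vector and covariance
  sigma <- unclass(est$cov)
  list(mean = unclass(est$mean), cov = sigma, cor = cov2cor(sigma))
}

# Example data from the original post (seed added here for reproducibility).
set.seed(1)
dat <- as.matrix(mtcars[, c("mpg", "hp", "wt")])
dat[sample(length(dat), length(dat) * .25)] <- NA
dat <- as.data.frame(dat)

est <- fiml_cov(dat)
est$cov
est$cor
```

lavaan may warn that the observed variances differ greatly in scale (hp versus wt); rescaling the variables before fitting is one way to address that.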
|
In reply to this post by John Fox
Dear John,
Thanks very much for the reply. Looking at the optimizers, I had thought that objectiveML did what I wanted. I appreciate the clarification.

I think that multiple imputation is more flexible in some ways because you can easily create different models for every variable. At the same time, if the assumptions hold, FIML is equivalent to multiple imputation, and considerably more convenient. Further, I suspect that in many circumstances, either option is equal to or better than listwise deletion.

In my case, I am working on some tools primarily for data exploration in a SEM context (some characteristics of individual variables and then covariance/correlation matrices, clustering, etc.) and hoped to include listwise/pairwise/FIML as options.

I will check out the lavaan package.

Thanks again for your time,

Josh

On Thu, Jul 12, 2012 at 8:20 AM, John Fox <[hidden email]> wrote:
> Dear Joshua,
>
> If I understand correctly what you want to do, the sem package won't do it. [...] |
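For comparison with FIML, the listwise and pairwise options mentioned above are already available in base R through the use argument of cov(). The short sketch below applies both to the example data from the original post; the seed and the object names are additions for illustration only.

```r
# Example data from the original post (seed added here for reproducibility).
set.seed(1)
dat <- as.matrix(mtcars[, c("mpg", "hp", "wt")])
dat[sample(length(dat), length(dat) * .25)] <- NA
dat <- as.data.frame(dat)

# Listwise deletion: only rows with no missing values are used.
cov_listwise <- cov(dat, use = "complete.obs")
n_listwise   <- sum(complete.cases(dat))

# Pairwise deletion: each entry uses all rows where both variables are observed.
cov_pairwise <- cov(dat, use = "pairwise.complete.obs")
n_pairwise   <- crossprod(ifelse(is.na(as.matrix(dat)), 0, 1))  # rows available per pair

cov_listwise
cov_pairwise
n_pairwise
```

Because each entry of the pairwise matrix can rest on a different subset of rows, that matrix is not guaranteed to be positive definite, which is one reason a likelihood-based estimate is attractive.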
|
They look fine to me.
luke

On Fri, 13 Jul 2012, Joshua Wiley wrote:

> Dear John,
>
> Thanks very much for the reply. [...]

--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  [hidden email]
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu |
|
Apologies -- replied to the wrong message.
luke

On Fri, 13 Jul 2012, [hidden email] wrote:

> They look fine to me.
>
> luke |
|
In reply to this post by Joshua Wiley-2
> I will check out the lavaan package.

Dear Joshua,

The lavaan package may help you. The FIML estimator typically starts with the EM algorithm to estimate the moments of the unrestricted model. There is no 'one-shot' function for it at the moment, but if you only need those moments, you can do something like this.

Suppose your data is a data.frame called 'HS.missing'; then the following commands can be used to get the estimated moments:

library(lavaan)

# note: ':::' reaches unexported lavaan internals, which may change between versions
Data <- lavaan:::lavData(HS.missing, ov.names=names(HS.missing),
                         missing="fiml")

# we assume only 1 group
Missing <- lavaan:::getMissingPatternStats(X  = Data@X[[1L]],
                                           Mp = Data@Mp[[1L]])

# compute moments using EM algorithm
Moments <- lavaan:::estimate.moments.EM(X=Data@X[[1L]], M=Missing,
                                        verbose=TRUE)

# estimated covariance matrix
Moments$sigma

# estimated mean vector
Moments$mu

Hope this helps,

Yves Rosseel
http://lavaan.org |
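To make the EM step described above concrete, here is a self-contained base-R sketch of the textbook EM algorithm for the mean vector and covariance matrix of multivariate normal data with missing values. It is not lavaan's internal code; the function name em_moments and the arguments tol and max_iter are hypothetical. In the E-step each missing value is replaced by its conditional expectation given the observed values in that row, and the conditional covariance is added to the cross-product sums; the M-step recomputes the moments with divisor n.

```r
# EM estimates of the mean vector and covariance matrix under multivariate
# normality with missing values. Illustrative sketch, not a package function.
em_moments <- function(x, tol = 1e-8, max_iter = 1000) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)
  # starting values: available-case means and a diagonal covariance matrix
  mu <- colMeans(x, na.rm = TRUE)
  sigma <- diag(apply(x, 2, var, na.rm = TRUE), p)
  for (iter in seq_len(max_iter)) {
    sum_x  <- numeric(p)
    sum_xx <- matrix(0, p, p)
    for (i in seq_len(n)) {
      xi  <- x[i, ]
      obs <- !is.na(xi)
      cond_cov <- matrix(0, p, p)   # conditional covariance of the missing part
      if (!any(obs)) {
        xi <- mu                    # nothing observed: expectation is the mean
        cond_cov <- sigma
      } else if (!all(obs)) {
        # regression coefficients of the missing on the observed variables
        b <- sigma[!obs, obs, drop = FALSE] %*% solve(sigma[obs, obs, drop = FALSE])
        xi[!obs] <- mu[!obs] + b %*% (xi[obs] - mu[obs])
        cond_cov[!obs, !obs] <- sigma[!obs, !obs, drop = FALSE] -
          b %*% sigma[obs, !obs, drop = FALSE]
      }
      sum_x  <- sum_x + xi
      sum_xx <- sum_xx + tcrossprod(xi) + cond_cov
    }
    # M-step: maximum likelihood moments (divisor n, not n - 1)
    mu_new    <- sum_x / n
    sigma_new <- sum_xx / n - tcrossprod(mu_new)
    delta <- max(abs(mu_new - mu), abs(sigma_new - sigma))
    mu <- mu_new
    sigma <- sigma_new
    if (delta < tol) break
  }
  names(mu) <- colnames(x)
  dimnames(sigma) <- list(colnames(x), colnames(x))
  list(mu = mu, sigma = sigma, iterations = iter)
}

# Example data from the original post (seed added here for reproducibility).
set.seed(1)
dat <- as.matrix(mtcars[, c("mpg", "hp", "wt")])
dat[sample(length(dat), length(dat) * .25)] <- NA

res <- em_moments(dat)
res$mu
res$sigma
cov2cor(res$sigma)
```

With no missing values the function simply returns the column means and the divisor-n covariance matrix. The row-by-row loop keeps the logic readable; grouping rows by missingness pattern, as the lavaan code above appears to do, would be faster on large data.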
