

Hi,
I am wondering if anyone knows of an easy way to fit a saturated model
using the sem package on raw data? Say the data were:
mtcars[, c("mpg", "hp", "wt")]
The model would estimate the three means (intercepts) of c("mpg",
"hp", "wt"), their three variances, and the three covariances (mpg
with hp, mpg with wt, and hp with wt).
I am interested in this because I want to obtain the MLE mean vector
and covariance matrix when there is missing data (i.e., by maximizing
the sum of the casewise log-likelihoods, so-called full information
maximum likelihood or FIML). Here is some example data with missing
values:
dat <- as.matrix(mtcars[, c("mpg", "hp", "wt")])
dat[sample(length(dat), length(dat) * .25)] <- NA
dat <- as.data.frame(dat)
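To make "the sum of the casewise likelihoods" concrete, here is a minimal base-R sketch. It is not code from the sem, OpenMx, or lavaan packages: the function names fiml_negloglik() and fiml_moments() are hypothetical, the data are assumed multivariate normal with at least two columns, and a general-purpose optimizer (optim) stands in for whatever algorithm a real SEM package would use.

```r
# Negative FIML log-likelihood: each row contributes the multivariate-normal
# log-density of its observed entries only.
fiml_negloglik <- function(theta, X) {
  p <- ncol(X)
  mu <- theta[seq_len(p)]
  # Covariance via a Cholesky factor with log-scale diagonal, which keeps
  # Sigma positive definite for any real-valued theta.
  L <- matrix(0, p, p)
  L[lower.tri(L, diag = TRUE)] <- theta[-seq_len(p)]
  diag(L) <- exp(diag(L))
  Sigma <- L %*% t(L)
  nll <- 0
  for (i in seq_len(nrow(X))) {
    obs <- !is.na(X[i, ])
    if (!any(obs)) next  # a fully missing row carries no information
    d <- X[i, obs] - mu[obs]
    S <- Sigma[obs, obs, drop = FALSE]
    logdet <- as.numeric(determinant(S, logarithm = TRUE)$modulus)
    nll <- nll + 0.5 * (sum(obs) * log(2 * pi) + logdet + sum(d * solve(S, d)))
  }
  nll
}

# Hypothetical wrapper: returns the estimated mean vector and covariance matrix.
fiml_moments <- function(data) {
  X <- as.matrix(data)
  p <- ncol(X)
  # Standardize with available-case means and SDs to help the optimizer;
  # the ML estimates are transformed back to the original scale afterwards.
  m0 <- colMeans(X, na.rm = TRUE)
  s0 <- apply(X, 2, sd, na.rm = TRUE)
  Z <- sweep(sweep(X, 2, m0, "-"), 2, s0, "/")
  start <- rep(0, p + p * (p + 1) / 2)
  fit <- optim(start, fiml_negloglik, X = Z, method = "BFGS",
               control = list(maxit = 500, reltol = 1e-10))
  mu_z <- fit$par[seq_len(p)]
  L <- matrix(0, p, p)
  L[lower.tri(L, diag = TRUE)] <- fit$par[-seq_len(p)]
  diag(L) <- exp(diag(L))
  Sigma <- diag(s0) %*% (L %*% t(L)) %*% diag(s0)
  dimnames(Sigma) <- list(colnames(X), colnames(X))
  list(mu = m0 + s0 * mu_z, sigma = Sigma, convergence = fit$convergence)
}

# Usage on data with values deleted at random (the seed is arbitrary)
set.seed(123)
dat_demo <- as.matrix(mtcars[, c("mpg", "hp", "wt")])
dat_demo[sample(length(dat_demo), length(dat_demo) * .25)] <- NA
est_demo <- fiml_moments(dat_demo)
est_demo$mu
est_demo$sigma
```

With complete data these estimates should agree with the ordinary ML estimates (the column means and the covariance matrix with divisor n), which gives a quick sanity check.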
It is not too difficult to write a wrapper that does this in the
OpenMx package because you can easily define paths using vectors and
get all pairwise combinations using:
combn(c("mpg", "hp", "wt"), 2)
but I would prefer to use the sem package, because OpenMx does not
work on 64-bit versions of R for Windows and is not available from
CRAN presently. Obviously it is not difficult to write out the model,
but I am hoping to bundle this in a function that, for some arbitrary
data, will return the FIML-estimated covariance (and correlation)
matrix. Alternatively, if there are any functions/packages that just
return FIML estimates of a covariance matrix from raw data, that would
be great (but googling and using findFn() from the sos package did not
turn up good results).
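As an illustration of the "arbitrary data" idea, the following sketch builds the saturated-model specification from a vector of variable names using combn(). The helper name saturated_syntax() is hypothetical, and the output uses lavaan-style model syntax ("~ 1" for intercepts, "~~" for variances and covariances) as an assumption; the sem package uses a different specification format.

```r
# Hypothetical helper: write out every mean, variance, and covariance
# for the given variable names (needs at least two names for combn()).
saturated_syntax <- function(vars) {
  means <- paste0(vars, " ~ 1")
  variances <- paste0(vars, " ~~ ", vars)
  pairs <- combn(vars, 2)  # one column per unordered pair
  covariances <- paste0(pairs[1, ], " ~~ ", pairs[2, ])
  paste(c(means, variances, covariances), collapse = "\n")
}

syntax_demo <- saturated_syntax(c("mpg", "hp", "wt"))
cat(syntax_demo, "\n")
```

For p variables this yields p means, p variances, and p(p - 1)/2 covariances, so three variables give nine free parameters.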
Thanks!
Josh

Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Dear Joshua,
If I understand correctly what you want to do, the sem package won't do it.
That is, the sem() function won't do what often is called FIML estimation
for models with missing data. I've been thinking about implementing this
feature, and don't think that it would be too difficult, but I can't promise
when and if I'll get to it. You might also take a look at the lavaan
package.
As well, I must admit to some skepticism about the FIML estimator, as
opposed to approaches such as multiple imputation of missing data. I suspect
that the former is more sensitive than the latter to the assumption of
multinormality.
Best,
John

John Fox
Senator William McMaster
Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox


Hello,
There's a package, lavaan, that implements FIML as an option of function
sem(). I have never used it, though, so I can't say much about it.
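For reference, a minimal sketch of what that could look like, assuming lavaan is installed and that its sem() accepts missing = "fiml" as in its documentation. The model string lists only the covariances; the variances and intercepts are assumed to be added by sem()'s defaults when FIML is requested. The seed and object names are arbitrary.

```r
# Example data with about 25% of the cells deleted at random
set.seed(1)
dat_fiml <- as.matrix(mtcars[, c("mpg", "hp", "wt")])
dat_fiml[sample(length(dat_fiml), length(dat_fiml) * .25)] <- NA
dat_fiml <- as.data.frame(dat_fiml)

# Saturated model: all pairwise covariances among the three variables
model_fiml <- "
  mpg ~~ hp + wt
  hp  ~~ wt
"

if (requireNamespace("lavaan", quietly = TRUE)) {
  fit_fiml <- lavaan::sem(model_fiml, data = dat_fiml, missing = "fiml")
  # Model-implied moments; for a saturated model these are the FIML estimates
  sigma_fiml <- lavaan::lavInspect(fit_fiml, "cov.ov")
  mu_fiml <- lavaan::lavInspect(fit_fiml, "mean.ov")
  print(sigma_fiml)
  print(mu_fiml)
}
```

lavaan may warn that the variances are on very different scales (hp versus wt); rescaling hp before fitting is one way to avoid that.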
Hope this helps,
Rui Barradas


Dear John,
Thanks very much for the reply. Looking at the optimizers, I had
thought that the objectiveML did what I wanted. I appreciate the
clarification.
I think that multiple imputation is more flexible in some ways because
you can easily create a different model for every variable. At the same
time, if the assumptions hold, FIML is asymptotically equivalent to
multiple imputation, and considerably more convenient. Further, I
suspect that in many circumstances, either option is equal to or
better than listwise deletion.
In my case, I am working on some tools primarily for data exploration,
in a SEM context (some characteristics of individual variables and
then covariance/correlation matrices, clustering, etc.) and hoped to
include listwise/pairwise/FIML as options.
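The listwise and pairwise options are already available in base R through the use argument of cov(); a short sketch (the seed and object names are arbitrary):

```r
# Example data with about 25% of the cells deleted at random
set.seed(42)
dat_lp <- as.matrix(mtcars[, c("mpg", "hp", "wt")])
dat_lp[sample(length(dat_lp), length(dat_lp) * .25)] <- NA
dat_lp <- as.data.frame(dat_lp)

# Listwise deletion: only rows with no missing values are used
cov_listwise <- cov(dat_lp, use = "complete.obs")
n_listwise <- sum(complete.cases(dat_lp))

# Pairwise deletion: each entry uses the rows observed on that pair
cov_pairwise <- cov(dat_lp, use = "pairwise.complete.obs")
obs_lp <- (!is.na(as.matrix(dat_lp))) + 0  # 1 = observed, 0 = missing
n_pairwise <- crossprod(obs_lp)            # jointly observed rows per pair

n_listwise
n_pairwise
cov2cor(cov_listwise)
```

Note that a pairwise covariance matrix is not guaranteed to be positive semi-definite, because its entries are computed from different subsets of rows.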
I will check out the lavaan package.
Thanks again for your time,
Josh


They look fine to me.
luke

Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa
Department of Statistics and Actuarial Science
241 Schaeffer Hall
Iowa City, IA 52242
Phone: 319-335-3386
Fax: 319-335-3017
email: [hidden email]
WWW: http://www.stat.uiowa.edu


Apologies -- replied to the wrong message.
luke


> I will check out the lavaan package.
Dear Joshua,
The lavaan package may help you. The FIML estimator typically starts
with the EM algorithm to estimate the moments of the unrestricted model.
There is no 'one-shot' function for it at the moment, but if you only
need those moments, you can do something like this:
Suppose your data is a data.frame called 'HS.missing', then the
following commands can be used to get the estimated moments:
library(lavaan)
# NOTE: the ':::' calls below reach into lavaan's unexported internals,
# so they may change or disappear between lavaan versions.
Data <- lavaan:::lavData(HS.missing, ov.names = names(HS.missing),
                         missing = "fiml")
# we assume only 1 group
Missing <- lavaan:::getMissingPatternStats(X = Data@X[[1L]],
                                           Mp = Data@Mp[[1L]])
# compute moments using EM algorithm
Moments <- lavaan:::estimate.moments.EM(X = Data@X[[1L]], M = Missing,
                                        verbose = TRUE)
# estimated covariance matrix
Moments$sigma
# estimated mean vector
Moments$mu
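Since the original question also asked for the correlation matrix, an estimated covariance matrix can be rescaled with base R's cov2cor(). The sketch below uses the complete-data covariance of mtcars as a stand-in for an estimated matrix such as Moments$sigma; the object names are arbitrary.

```r
# Stand-in for an estimated covariance matrix
Sigma_demo <- cov(mtcars[, c("mpg", "hp", "wt")])

# Rescale to a correlation matrix: R = D^{-1/2} Sigma D^{-1/2},
# where D is the diagonal matrix of variances
R_demo <- cov2cor(Sigma_demo)
R_demo
```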
Hope this helps,
Yves Rosseel
http://lavaan.org

