Error of Stepwise Regression with number of rows in use has changed: remove missing values?

Kum-Hoe Hwang
Howdy, R gurus,

I have enjoyed using R, but there is one problem I cannot solve easily. Please help.
When I ran my R script, I got the following error. This error
results from an input data file exported from Excel spreadsheet
software.

 Error in step(lm(pop.rate ~ as.numeric(year) + as.factor(policy) +
as.numeric(nation.grant) +  :
  number of rows in use has changed: remove missing values?

Could you point me toward a fix for this error?
Thanks in advance,


> ############### outputs from R console ###############
> pop <- step(
+     lm(pop.rate ~ as.numeric(year) + as.factor(policy) +
+            as.numeric(nation.grant) + as.numeric(do.grant) +
+            as.numeric(city.grant) + as.numeric(DMZ.dist) +
+            as.numeric(Seoul.dist),
+        data = borderI.data, na.action = na.omit)
+ )
Start:  AIC=494.27
pop.rate ~ as.numeric(year) + as.factor(policy) + as.numeric(nation.grant) +
    as.numeric(do.grant) + as.numeric(city.grant) + as.numeric(DMZ.dist) +
    as.numeric(Seoul.dist)
                           Df Sum of Sq    RSS    AIC
- as.numeric(do.grant)      1      0.71 6622.9 492.28
- as.factor(policy)         1      1.21 6623.4 492.29
- as.numeric(DMZ.dist)      1      1.91 6624.1 492.30
- as.numeric(city.grant)    1      5.07 6627.3 492.36
- as.numeric(nation.grant)  1     11.51 6633.7 492.47
- as.numeric(year)          1     29.58 6651.8 492.80
<none>                                  6622.2 494.27
- as.numeric(Seoul.dist)    1    673.22 7295.4 503.79
Step:  AIC=492.28
pop.rate ~ as.numeric(year) + as.factor(policy) + as.numeric(nation.grant) +
    as.numeric(city.grant) + as.numeric(DMZ.dist) + as.numeric(Seoul.dist)
                           Df Sum of Sq    RSS    AIC
- as.factor(policy)         1      1.99 6624.9 490.32
- as.numeric(DMZ.dist)      1      2.09 6625.0 490.32
- as.numeric(city.grant)    1      7.18 6630.1 490.41
- as.numeric(nation.grant)  1     20.08 6643.0 490.64
- as.numeric(year)          1     28.89 6651.8 490.80
<none>                                  6622.9 492.28
- as.numeric(Seoul.dist)    1    697.46 7320.4 502.20
Step:  AIC=490.32
pop.rate ~ as.numeric(year) + as.numeric(nation.grant) + as.numeric(city.grant) +
    as.numeric(DMZ.dist) + as.numeric(Seoul.dist)
                           Df Sum of Sq    RSS    AIC
- as.numeric(DMZ.dist)      1      2.08 6627.0 488.35
- as.numeric(city.grant)    1     10.65 6635.6 488.51
- as.numeric(nation.grant)  1     31.30 6656.2 488.88
- as.numeric(year)          1     31.44 6656.4 488.88
<none>                                  6624.9 490.32
- as.numeric(Seoul.dist)    1    732.88 7357.8 500.80
Step:  AIC=488.35
pop.rate ~ as.numeric(year) + as.numeric(nation.grant) + as.numeric(city.grant) +
    as.numeric(Seoul.dist)
                           Df Sum of Sq    RSS    AIC
- as.numeric(city.grant)    1      9.86 6636.9 486.53
- as.numeric(year)          1     31.42 6658.4 486.92
- as.numeric(nation.grant)  1     33.33 6660.3 486.95
<none>                                  6627.0 488.35
- as.numeric(Seoul.dist)    1    754.40 7381.4 499.18

Error in step(lm(pop.rate ~ as.numeric(year) + as.factor(policy) +
as.numeric(nation.grant) +  :
  number of rows in use has changed: remove missing values?

--
Kum-Hoe Hwang, Ph.D.

Phone : 82-31-250-3516
Email : [hidden email]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Error of Stepwise Regression with number of rows in use has changed: remove missing values?

Mohamed Lajnef

Hi Kum,

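If you look at the source of the step function (type step at the R
console), you will find the check if (length(fit$residuals) != n).
When a refitted model uses a different number of rows from the starting
model, this condition is true and step() stops with the error.
I hope this helps.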
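
To make that check concrete, here is a minimal sketch on made-up data. The data frame d and its columns y and x3 are assumptions for illustration, not the poster's data. The values are chosen so that x3 has no linear relation to y in the complete rows, so step() should decide to drop it; the refit then picks up the two rows where x3 is NA, and the row count no longer matches.

```r
# Toy data (hypothetical): x3 is missing in the last two rows.
d <- data.frame(
  y  = c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2),
  x3 = c(1, 1, 2, 2, 1, 1, 2, 2, NA, NA)
)

# lm() drops the two incomplete rows, so 8 rows are in use.
full <- lm(y ~ x3, data = d)
n_full <- nobs(full)

# Dropping x3 refits y ~ 1 on all 10 rows; step() notices the change and stops.
msg <- tryCatch(
  {
    step(full, trace = 0)
    "no error"
  },
  error = function(e) conditionMessage(e)
)

print(n_full)
print(msg)
```

If this behaves as expected, msg holds the same "number of rows in use has changed" text as in the original post.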

Regards
M


--


Mohamed Lajnef,IE
INSERM U955 eq 15
Pôle de Psychiatrie
Hôpital CHENEVIER
40, rue Mesly
94010 CRETEIL Cedex FRANCE
[hidden email]
tel : 01 49 81 31 31 (poste 18470)
Sec : 01 49 81 32 90
fax : 01 49 81 30 99


Re: Error of Stepwise Regression with number of rows in use has changed: remove missing values?

Peter Ehlers
In reply to this post by Kum-Hoe Hwang
On 2010-02-16 1:24, Kum-Hoe Hwang wrote:

> Howdy, R gurus,
>
> I have enjoyed using R, but there is one problem I cannot solve easily. Please help.
> When I ran my R script, I got the following error. This error
> results from an input data file exported from Excel spreadsheet
> software.
>
>   Error in step(lm(pop.rate ~ as.numeric(year) + as.factor(policy) +
> as.numeric(nation.grant) +  :
>    number of rows in use has changed: remove missing values?
>
> Could you direct me to solve the Error?
> Thanks in advance,

This is a common situation when you use step() on data where
the predictors have missing values.

A case (row) is included in the model only if all the
predictors for that model are non-missing for the case.

As you vary which predictors are to be in the model, the
included cases will vary, resulting in models based on
different data. (Think of your cases as subjects; you want
all your models to be based on the same set of subjects.)
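
A small sketch of this point, on invented data: the helper rows_used() and the columns y, x1 and x2 are hypothetical names used only for illustration. It counts how many rows are complete for each choice of variables, which is the number of cases a model with those variables would be fitted to.

```r
# Hypothetical data with missing values scattered over two predictors.
d <- data.frame(
  y  = c(1, 2, 3, 4, 5, 6),
  x1 = c(1, NA, 3, 4, 5, 6),
  x2 = c(2, 1, NA, NA, 3, 5)
)

# Number of rows with no NA in the given columns.
rows_used <- function(vars, data) {
  sum(complete.cases(data[, vars, drop = FALSE]))
}

n_both <- rows_used(c("y", "x1", "x2"), d)  # rows 1, 5, 6
n_x1   <- rows_used(c("y", "x1"), d)        # rows 1, 3, 4, 5, 6
n_x2   <- rows_used(c("y", "x2"), d)        # rows 1, 2, 5, 6

print(c(both = n_both, x1_only = n_x1, x2_only = n_x2))
```

Each candidate model would see a different set of cases, which is the situation step() refuses to continue with.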

Finally: (Re-)read the help page and note the 'warning'.

  -Peter Ehlers

--
Peter Ehlers
University of Calgary


Re: Error of Stepwise Regression with number of rows in use has changed: remove missing values?

Kum-Hoe Hwang
Thank you to everyone who helped me solve the error in stepwise
regression with missing values.

The solution that worked for me was Andreas's advice.

=====================================================================

Try

data<-na.omit(original database) before you run step() or stepAIC()
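
For readers who want to see the advice in runnable form, here is a minimal sketch. The data frame raw and its columns are simulated and purely hypothetical; only na.omit(), lm(), step() and nobs() are real R functions. It is not the poster's actual script.

```r
# Hypothetical data: 40 rows, with a few NAs in two of the predictors.
set.seed(123)
raw <- data.frame(x1 = rnorm(40), x2 = rnorm(40), x3 = rnorm(40))
raw$y <- 1 + 2 * raw$x1 + rnorm(40)
raw$x2[c(2, 5)] <- NA
raw$x3[c(5, 9, 11)] <- NA

# Remove incomplete rows once, before any model is fitted.
clean <- na.omit(raw)

# Every model step() considers is now fitted to the same rows.
fit <- lm(y ~ x1 + x2 + x3, data = clean)
sel <- step(fit, trace = 0)

print(nrow(clean))   # 36 rows survive (rows 2, 5, 9 and 11 are dropped)
print(nobs(sel))     # same count as nrow(clean)
```

Because the incomplete rows are gone before step() starts, the row count cannot change between candidate models.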

On Tue, Feb 16, 2010 at 8:09 PM, Peter Ehlers <[hidden email]> wrote:

> This is a common situation when you use step() on data where
> the predictors have missing values.
>
> A case (row) is included in the model only if all the
> predictors for that model are non-missing for the case.
>
> As you vary which predictors are to be in the model, the
> included cases will vary, resulting in models based on
> different data. (Think of your cases as subjects; you want
> all your models to be based on the same set of subjects.)
>
> Finally: (Re-)read the help page and note the 'warning'.
>
>  -Peter Ehlers

--
Kum-Hoe Hwang, Ph.D.

Phone : 82-31-250-3516
Email : [hidden email]


Re: Error of Stepwise Regression with number of rows in use has changed: remove missing values?

Kum-Hoe Hwang
In reply to this post by Peter Ehlers
Sorry for my earlier garbled email; here is a corrected version.

Thank you to everyone who helped me solve the error in stepwise
regression with missing values.

The solution that worked for me was Andreas's advice.

=====================================================================

Try

data<-na.omit(original database) before you run step() or stepAIC()


Kum

--
Kum-Hoe Hwang, Ph.D.

Phone : 82-31-250-3516
Email : [hidden email]


Re: Error of Stepwise Regression with number of rows in use has changed: remove missing values?

Greg Snow-2
In reply to this post by Kum-Hoe Hwang
Have you considered the implications of that solution?

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[hidden email]
801.408.8111


> -----Original Message-----
> From: [hidden email] [mailto:r-help-bounces@r-
> project.org] On Behalf Of Kum-Hoe Hwang
> Sent: Wednesday, February 17, 2010 1:41 AM
> To: [hidden email]
> Subject: Re: [R] Error of Stepwise Regression with number of rows in
> use has changed: remove missing values?
>
> I thank those who helped to solve a error in stepwise regression with
> missing values.
>
> A good solution that I have tried was Andreas's advice.
>
> =====================================================================
>
> Try
>
> data<-na.omit(original database) before you run step() or stepAIC()

Re: Error of Stepwise Regression with number of rows in use has changed: remove missing values?

Kum-Hoe Hwang
The solution "data<-na.omit(original database) before you run step()
or stepAIC()" has some limitations, I think. It reduced the number of
data rows, and it increased the R-squared value.

If you have any tips or advice on another solution, I would welcome them.
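
One way to see how much data is being lost, sketched on invented data (the data frame raw and the column unused are hypothetical): na.omit() on the whole data frame also drops rows that are missing only in columns the model never uses, so restricting it to the candidate variables can keep more rows. This is bookkeeping only and does not settle the statistical question of how to treat the missing values.

```r
# Hypothetical data: "unused" is a column that is not part of any model.
raw <- data.frame(
  y      = c(3, 5, 4, 6, 8, 7, 9, 10),
  x1     = c(1, 2, NA, 4, 5, 6, 7, 8),
  x2     = c(2, 1, 3, 2, 4, 3, 5, 4),
  unused = c(NA, NA, 1, NA, 2, NA, 3, 4)
)

# Missing values per column, to see where the rows are being lost.
print(colSums(is.na(raw)))

# Dropping on every column versus only on the model's variables.
all_cols   <- na.omit(raw)
model_cols <- na.omit(raw[, c("y", "x1", "x2")])

print(c(original = nrow(raw),
        all_columns = nrow(all_cols),
        model_columns = nrow(model_cols)))
```

Here the whole-frame version keeps 3 of 8 rows, while the restricted version keeps 7. Note also that R-squared values computed on different subsets of rows are not directly comparable.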

Kum

Urban and Regional Planning, GRI


On Sat, Feb 20, 2010 at 5:57 AM, Greg Snow <[hidden email]> wrote:

> Have you considered the implications of that solution?
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> [hidden email]
> 801.408.8111
>
>
>> -----Original Message-----
>> From: [hidden email] [mailto:r-help-bounces@r-
>> project.org] On Behalf Of Kum-Hoe Hwang
>> Sent: Wednesday, February 17, 2010 1:41 AM
>> To: [hidden email]
>> Subject: Re: [R] Error of Stepwise Regression with number of rows in
>> use has changed: remove missing values?
>>
>> I thank those who helped to solve a error in stepwise regression with
>> missing values.
>>
>>
>> Kum
>>
>> *
>> *
>>
>> A good solution that I have tried was Andreas's advice.
>>
>> =====================================================================
>>
>> Try
>>
>> data<-na.omit(original database) before you run step() or stepAIC()
>>
>> On Tue, Feb 16, 2010 at 8:09 PM, Peter Ehlers <[hidden email]>
>> wrote:
>>
>> > On 2010-02-16 1:24, Kum-Hoe Hwang wrote:
>> >
>> >> Howdy, R Grues
>> >>
>> >> I have enjoyed R, but I cannot solve one problem easily. Please help
>> >> my problem.
>> >> When I tried the R script, I got the following Error. This error
>> >> results from input data file exported through a Excel spreadsheet
>> >> software.
>> >>
>> >>  Error in step(lm(pop.rate ~ as.numeric(year) + as.factor(policy) +
>> >> as.numeric(nation.grant) +  :
>> >>   number of rows in use has changed: remove missing values?
>> >>
>> >> Could you direct me to solve the Error?
>> >> Thanks in advance,
>> >>
>> >
>> > This is a common situation when you use step() on data where
>> > the predictors have missing values.
>> >
>> > A case (row) is included in the model only if all the
>> > predictors for that model are non-missing for the case.
>> >
>> > As you vary which predictors are to be in the model, the
>> > included cases will vary, resulting in models based on
>> > different data. (Think of your cases as subjects; you want
>> > all your models to be based on the same set of subjects.)
>> >
>> > Finally: (Re-)read the help page and note the 'warning'.
>> >
>> >  -Peter Ehlers
>> >
>> >
>> >
>> > --
>> > Peter Ehlers
>> > University of Calgary
>> >
>>
>>
>>
>> --
>> Kum-Hoe Hwang, Ph.D.
>>
>> Phone : 82-31-250-3516
>> Email : [hidden email]
>>
>



--
Kum-Hoe Hwang, Ph.D.

Phone : 82-31-250-3516
Email : [hidden email]
