Different result of multiple regression in R and SPSS


Hi,

I am trying to do a multiple regression analysis that has one nominal variable (gender) and three numeric variables as independent variables and one numeric variable as a dependent variable.

So, I got a formula like this (in R):

summary(out.3 <- lm(scale(DV) ~ gender + scale(IV.1) + scale(IV.2) + scale(IV.3)))

After running the analysis, I tried to compare the outcome in R with the outcome in SPSS and found that the results are different! I found that R and SPSS have the exact same outcome when every variable is numeric; however, whenever I include the "gender (0/1)" variable in the equation, the results become different.

I guess that SPSS automatically treats gender as a numeric variable and standardizes it when running the analysis. So, I tried to change "gender" to a numeric variable and ran the analysis again, but the results were still not identical.

What is the problem here and what is the right way to do this analysis?

Thanks,
Jay Yang

Re: Different result of multiple regression in R and SPSS

Answer: Contrasts, i.e. the parameterization of the categorical variable(s) df.

?contrasts may be of some help, but you really need to do some background studying of the linear models principles involved. Googling may provide tutorials. Also searching the mail archives, e.g.:

https://stat.ethz.ch/pipermail/r-help/2009-February/187479.html

-- Bert

--
"Men by nature long to get on to the ultimate truths, and will often be impatient with elementary studies or fight shy of them. If it were possible to reach the ultimate truths without the elementary studies usually prefixed to them, these would not be preparatory studies but superfluous diversions."
-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
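
As a rough illustration of what ?contrasts refers to, the sketch below inspects how R codes a two-level factor. The `gender` vector here is made-up illustration data, not the poster's, and the first result assumes the default setting of `options("contrasts")`, under which unordered factors get treatment (0/1) coding.

```r
# Hypothetical two-level factor standing in for the poster's gender variable.
gender <- factor(c("female", "male", "male", "female"))

# Under the default options, unordered factors use treatment (dummy) coding:
# the first level is the reference (0) and the second level is coded 1.
treatment_coding <- contrasts(gender)
print(treatment_coding)

# Sum-to-zero (effect) coding uses +1 / -1 for the two levels instead.
contrasts(gender) <- contr.sum(2)
sum_coding <- contrasts(gender)
print(sum_coding)
```

Both codings describe the same two groups; only the meaning of the fitted coefficient changes.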

Re: Different result of multiple regression in R and SPSS

Thanks for the answer.

However, I am still curious about which result I should use: the result from R or the one from SPSS? Why are the results from the two programs different?

Jay

Re: Different result of multiple regression in R and SPSS

In reply to this post by Bert Gunter

I don't think SPSS does anything with the variables you enter there. Have you entered it as numeric? Have you entered gender as numeric in R?

--
Dimitri Liakhovitski
marketfusionanalytics.com

Re: Different result of multiple regression in R and SPSS

In reply to this post by J.

On Jul 19, 2011, at 6:29 PM, J. wrote:

> Thanks for the answer.
>
> However, I am still curious about which result I should use? The result from
> R or the one from SPSS?

It is becoming apparent that you do not know how to use the results from either system. The progress of science would be safer if you get some advice from a person that knows what they are doing.

> Why the results from two programs are different?

Different parametrizations. If I had to guess I would bet that the gender coefficient in R is exactly twice that of the one from SPSS. They are probably both correct in the context of their respective codings.

--
David Winsemius, MD
West Hartford, CT
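
One way a factor of two can arise from different parametrizations is sketched below with simulated data. All variable names and effect sizes are invented for illustration, and nothing here claims to reproduce what SPSS did in the original poster's case. The same model is fitted once with treatment (0/1) coding and once with sum-to-zero (+1/-1) coding.

```r
set.seed(1)
n <- 200

# Simulated stand-ins for the poster's variables (assumed, not from the thread).
gender <- factor(rep(c("female", "male"), each = n / 2))
iv1 <- rnorm(n)
dv <- 0.5 * iv1 + 0.8 * (gender == "male") + rnorm(n)

# Fit 1: treatment (0/1) coding, with "female" as the reference level.
fit_treatment <- lm(dv ~ gender + iv1,
                    contrasts = list(gender = "contr.treatment"))

# Fit 2: sum-to-zero coding, with "female" = +1 and "male" = -1.
fit_sum <- lm(dv ~ gender + iv1,
              contrasts = list(gender = "contr.sum"))

# The second coefficient in each fit is the gender term.
b_treatment <- unname(coef(fit_treatment)[2])
b_sum <- unname(coef(fit_sum)[2])

# The treatment coefficient is the male-minus-female difference; the
# sum-coded coefficient is half of that difference with the sign flipped.
print(c(treatment = b_treatment, sum = b_sum))
```

The fitted values of the two models are identical, which is the sense in which both results can be correct in the context of their respective codings.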

Re: Different result of multiple regression in R and SPSS

On Tue, Jul 19, 2011 at 3:45 PM, David Winsemius <[hidden email]> wrote:

> On Jul 19, 2011, at 6:29 PM, J. wrote:
>
>> However, I am still curious about which result I should use? The result
>> from R or the one from SPSS?
>
> It is becoming apparent that you do not know how to use the results from
> either system. The progress of science would be safer if you get some advice
> from a person that knows what they are doing.

I nominate this for an R fortune.

-- Bert

Re: Different result of multiple regression in R and SPSS

In reply to this post by David Winsemius

@Dimitri: I tried to enter it as numeric and still got the same outcome. I still wonder if there is any way to get the same result from both programs.

@David, Bert: Yes, I found that the gender coefficient in R is exactly twice that of the one from SPSS. I need to study parametrization.

Thanks,
Jay

Re: Different result of multiple regression in R and SPSS

In reply to this post by David Winsemius

> Different parametrizations. If I had to guess I would bet that the
> gender coefficient in R is exactly twice that of the one from SPSS.
> They are probably both correct in the context of their respective
> codings.

I guess I would also suggest, again, that you run some samples with known data sets and see what you get (RSSWKDSASWYG). You would want to do this anyway if you want to ensure your real data is being used reasonably. You still need to have some way to check the opinion you get from the expert mentioned above, and known data will help there too. A factor of 2 often shows up from just looking at pictures once you have some intuition. I've often been wrong on intuition, but chasing it down and proving it helps you learn a lot :)

Re: Different result of multiple regression in R and SPSS

In reply to this post by Bert Gunter

On 7/19/2011 4:04 PM, Bert Gunter wrote:

> On Tue, Jul 19, 2011 at 3:45 PM, David Winsemius <[hidden email]> wrote:
>> It is becoming apparent that you do not know how to use the results from
>> either system. The progress of science would be safer if you get some advice
>> from a person that knows what they are doing.
>
> I nominate this for an R fortune.
>
> -- Bert

None of us ever know what we're doing at some level. We often think we do, and sometimes we get results more in spite of what we've done than because of it. That of course increases our confidence and encourages us to repeat mistakes in contexts where we might not be so lucky.

Spencer

--
Spencer Graves, PE, PhD
President and Chief Technology Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph: 408-655-4567
web: www.structuremonitoring.com

Re: Different result of multiple regression in R and SPSS

In reply to this post by Dimitri Liakhovitski-2

I finally got the same result by converting the "gender" variable to numeric and standardizing it. I guess SPSS automatically does the same thing when running the analysis. But it still is not clear to me how I should interpret a "standardized categorical (dummy coded)" variable. I'd rather stick to using R.

Thanks for all the comments and advice.

Jay
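
The arithmetic behind this observation can be sketched with simulated data. The variable names and effect sizes below are invented, and whether SPSS performs this step internally is not established in this thread. The sketch only shows that leaving a numeric 0/1 predictor unscaled, while scaling everything else, changes its coefficient by a factor of 1 / sd(gender), which is close to 2 when the two groups are of equal size.

```r
set.seed(2)
n <- 200

# Simulated stand-ins for the poster's variables (assumed, not from the thread).
gender_num <- rep(c(0, 1), each = n / 2)
iv1 <- rnorm(n)
dv <- 0.5 * iv1 + 0.8 * gender_num + rnorm(n)

# Standardize by hand; as.numeric() drops the matrix attributes scale() adds.
dv_z <- as.numeric(scale(dv))
iv1_z <- as.numeric(scale(iv1))
gender_z <- as.numeric(scale(gender_num))

# Gender left as raw 0/1, everything else standardized.
fit_raw <- lm(dv_z ~ gender_num + iv1_z)

# Gender standardized as well, giving fully standardized coefficients.
fit_std <- lm(dv_z ~ gender_z + iv1_z)

b_raw <- unname(coef(fit_raw)[2])
b_std <- unname(coef(fit_std)[2])

# The two gender coefficients differ by the standard deviation of the
# 0/1 variable: b_raw / b_std equals 1 / sd(gender_num).
ratio <- b_raw / b_std
print(c(ratio = ratio, one_over_sd = 1 / sd(gender_num)))
```

The coefficient from `fit_raw` is the expected change in the standardized outcome between the two groups, which is usually the easier quantity to interpret for a dummy variable.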

Re: Different result of multiple regression in R and SPSS

First, it would have helped if you had posted the actual results so that we could see how far off they are (and, more specifically, by which factor).

Second, given your epiphany, you will find that this is exactly what David (and others before him) said or suggested. It is not about standardizing a nominal variable, which in theory you cannot do. It is about how the programs encode nominal variables by default.

Daniel

Re: Different result of multiple regression in R and SPSS

In reply to this post by Spencer Graves-2

At 19.07.2011 18:50 -0700, Spencer Graves wrote:

> None of us ever know what we're doing at some level. We often think we do,
> and sometimes we get results more in spite of what we've done than because
> of it. That of course increases our confidence and encourages us to repeat
> mistakes in contexts where we might not be so lucky.
>
> Spencer

Wise!

Heinz