prcomp - principal components in R

prcomp - principal components in R

zubin-2
Hello, I am not understanding the output of prcomp. When I reduce the number of
components, the output continues to show a cumulative 100% of the
variance explained, which can't be the case when dropping from 8 components
to 3.

How do I get the output in terms of the cumulative % of the total
variance, so that when I go from the full solution of 8 (8 variables in the data
set) to a reduced number of components, I can evaluate the % of variance
explained? Or am I missing something?

8 variables in the data set

 > princ = prcomp(df[,-1],rotate="varimax",scale=TRUE)
 > summary(princ)
Importance of components:
                         PC1   PC2   PC3   PC4   PC5   PC6    PC7    PC8
Standard deviation     1.381 1.247 1.211 0.994 0.927 0.764 0.6708 0.4366
Proportion of Variance 0.238 0.194 0.183 0.124 0.107 0.073 0.0562 0.0238
Cumulative Proportion  0.238 0.433 0.616 0.740 0.847 0.920 0.9762 *1.0000*

 > princ = prcomp(df[,-1],rotate="varimax",scale=TRUE,tol=.75)
 > summary(princ)

Importance of components:
                         PC1   PC2   PC3
Standard deviation     1.381 1.247 1.211
Proportion of Variance 0.387 0.316 0.297
Cumulative Proportion  0.387 0.703 *1.000*

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
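
A note on what tol is doing in the two calls above, as a minimal sketch rather than a definitive account: the prcomp documentation describes tol as a cutoff relative to the first component, so a component is omitted when its standard deviation is less than or equal to tol times the standard deviation of PC1. The vector below is copied from the first summary() output in the post; the helper name kept() is made up for illustration.

```r
# Standard deviations reported in the first summary() output above
sdev <- c(1.381, 1.247, 1.211, 0.994, 0.927, 0.764, 0.6708, 0.4366)

# Rule for `tol`: keep a component only if its standard deviation
# exceeds tol times the standard deviation of the first component.
kept <- function(tol) which(sdev > tol * sdev[1])

kept(0.75)   # components 1, 2 and 3 survive
kept(0.95)   # only component 1 survives

# Cumulative share of the TOTAL variance (all eight components),
# reported only for the components kept at tol = 0.75
cum_total <- cumsum(sdev^2) / sum(sdev^2)
round(cum_total[kept(0.75)], 3)   # 0.238 0.433 0.616
```

Dividing by the sum over all eight squared standard deviations, instead of only the retained ones, is what keeps the third value at 0.616 rather than 1.000.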

Re: prcomp - principal components in R

ssefick
Principal components is a data reduction technique.  It looks like
you have three axes that account for 100%.  Make this reproducible.

On Mon, Nov 9, 2009 at 11:37 AM, zubin <[hidden email]> wrote:

> Hello, not understanding the output of prcomp, I reduce the number of
> components and the output continues to show cumulative 100% of the
> variance explained, which can't be the case dropping from 8 components
> to 3.

--
Stephen Sefick

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods.  We are mammals, and have not exhausted the
annoying little problems of being mammals.

                                                                -K. Mullis


Re: prcomp - principal components in R

zubin-2
Okay, an extreme case: only 1 component, and it explains 100%. Something weird
is going on.

 > princ = prcomp(df[,-1],rotate="varimax",scale=TRUE,tol=.95)
 > summary(princ)
Importance of components:
                        PC1
Standard deviation     1.38
Proportion of Variance 1.00
Cumulative Proportion  1.00


Re: prcomp - principal components in R

ssefick
Look at it linearly?

On Mon, Nov 9, 2009 at 11:45 AM, zubin <[hidden email]> wrote:

> okay, an extreme case, only 1 component, explains 100%, something weird
> going on..

Re: prcomp - principal components in R

Daniel Malter
In reply to this post by zubin-2
In the first PCA you ask how much variance of the EIGHT (!) variables is
captured by the first, second, ..., eighth principal component.

In the second PCA you ask how much variance of the THREE (!) variables is
captured by the first, second, and third principal component.

Of course you need only as many PCs as there are variables to capture 100 %
of the variance. Your "problem" thus comes from the fact that you have eight
variables in the first PCA, which requires eight PCs to capture 100%, and
that you have only three variables in the second PCA, which naturally only
requires three PCs to capture 100% of the variance.

So yes, you are missing something in this case; nothing is wrong with the
analyses.

HTH,
Daniel

-------------------------
cuncta stricte discussurus
-------------------------


Re: prcomp - principal components in R

zubin-2
All 8 variables are still in the analysis; I thought I was just reducing the
number of components being estimated.

Example: 1 component, 8 variables. There is no way 1 component explains
100% of the variance of the 8-variable data set.

 > princ = prcomp(df[,-1],rotate="varimax",scale=TRUE,tol=.95)
 > summary(princ)
Importance of components:
                        PC1
Standard deviation     1.38
Proportion of Variance 1.00
Cumulative Proportion  1.00

 > summary(princ)

Rotation:
                PC1
VIX0    -0.08217686
UUP0    -0.18881983
USO0     0.26647346
GLD0     0.26983923
HYG0     0.60674758
term0    0.18220237
spread0  0.61614047
TNX0     0.18111684
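
One thing worth checking in the calls above, offered as a sketch rather than a diagnosis: going by its documented signature, prcomp() has no rotate argument. An unknown argument like rotate="varimax" is absorbed by `...`, so no rotation is applied (whether R warns about this seems to depend on the version), while scale=TRUE works because it partially matches the real argument scale.. The check below inspects only the formal arguments of the default method.

```r
# Formal arguments of the default prcomp method
prcomp_args <- names(formals(stats:::prcomp.default))
prcomp_args

"rotate" %in% prcomp_args   # FALSE: there is no such argument
"scale." %in% prcomp_args   # TRUE: `scale=TRUE` partially matches this one
"tol"    %in% prcomp_args   # TRUE
```

If a varimax rotation is wanted, it has to be applied as a separate step with varimax() on a loadings matrix.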





Re: prcomp - principal components in R

markleeds
In reply to this post by zubin-2

   Hi: I'm not familiar with prcomp, but with the principal components function
   in Bill Revelle's psych package, one can specify the number of components
   one wants to use to build the "closest" covariance matrix. I don't know
   what tol is doing in your example, but it's not doing that.

   mark


Re: prcomp - principal components in R

Tony Plate-3
In reply to this post by zubin-2
The output of summary() for a prcomp object displays the cumulative amount of variance explained relative to the total variance explained by the principal components PRESENT in the object.  So it is always guaranteed to be at 100% for the last principal component present.  You can see this from the code in summary.prcomp() (view it with getAnywhere("summary.prcomp")).

Here's how to get the output you want (the last line in the transcript below):

> set.seed(1)
> summary(pc1 <- prcomp(x))
Importance of components:
                         PC1   PC2   PC3   PC4   PC5
Standard deviation     1.175 1.058 0.976 0.916 0.850
Proportion of Variance 0.275 0.223 0.190 0.167 0.144
Cumulative Proportion  0.275 0.498 0.688 0.856 1.000
> summary(pc2 <- prcomp(x, tol=0.8))
Importance of components:
                        PC1   PC2   PC3
Standard deviation     1.17 1.058 0.976
Proportion of Variance 0.40 0.324 0.276
Cumulative Proportion  0.40 0.724 1.000
> pc2$sdev
[1] 1.1749061 1.0581362 0.9759016
> pc1$sdev
[1] 1.1749061 1.0581362 0.9759016 0.9164905 0.8503122
> svd(scale(x, center=T, scale=F))$d / sqrt(nrow(x)-1)
[1] 1.1749061 1.0581362 0.9759016 0.9164905 0.8503122
> cumsum(pc1$sdev^2) / sum((svd(scale(x, center=T, scale=F))$d / sqrt(nrow(x)-1))^2)
[1] 0.2752317 0.4984734 0.6883643 0.8558386 1.0000000
>
> # output in terms of the cumulative % of the total variance
> cumsum(pc2$sdev^2) / sum((svd(scale(x, center=T, scale=F))$d / sqrt(nrow(x)-1))^2)
[1] 0.2752317 0.4984734 0.6883643
>

It's probably better to get prcomp to compute all the components in the first place, because the SVD is the bulk of the computation anyway (so doing it again will be slower for large matrices).  Then just look at the most important principal components.  However, there may be a shortcut for computing the values of D in the SVD of a matrix; you could look for that if you have demanding computations (e.g., the square roots of the eigenvalues of the covariance matrix of the centered x, which equal D / sqrt(nrow(x)-1): sqrt(eigen(var(scale(x, center=TRUE, scale=FALSE)), only.values=TRUE)$values)).
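
The suggestion above can be sketched in a self-contained form. The data here are hypothetical (a random 200 x 5 matrix, not the x used in the transcript, so the numbers will differ), and k is an arbitrary choice. The sketch computes all components once, reads off the cumulative share of total variance for the first k, and checks the eigenvalue shortcut.

```r
set.seed(1)
x <- matrix(rnorm(200 * 5), ncol = 5)   # hypothetical data for illustration

pc <- prcomp(x)                          # all components, no tol

# Cumulative share of the total variance; subset to the first k components
cum_total <- cumsum(pc$sdev^2) / sum(pc$sdev^2)
k <- 3
cum_total[1:k]

# Shortcut: the component standard deviations are the square roots of the
# eigenvalues of the covariance matrix, so no SVD of x is needed for them.
ev <- eigen(cov(x), symmetric = TRUE, only.values = TRUE)$values
all.equal(sqrt(ev), pc$sdev)

# The total variance is also the trace of the covariance matrix.
all.equal(sum(diag(cov(x))), sum(pc$sdev^2))
```

Because the total variance is just the sum of the column variances, the denominator can be obtained without any decomposition, which helps when only a few components are ever computed.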

-- Tony Plate


Re: prcomp - principal components in R

MikeAmato.WI
This post has NOT been accepted by the mailing list yet.
In reply to this post by markleeds
Hello,
I have similar concerns about tol. I am attempting to do a principal components analysis on 15 survey items, using a sample of 132 people who each responded to all of the survey items. I want to use a varimax rotation on the retained components, but I am dubious of the output I am getting, and so I suspect I am doing something wrong. I proceed in the following steps:

   1) use prcomp() to inspect all 15 components, and decide which to retain
   2) run prcomp() again, using the "tol" parameter to omit unwanted components
   3) pass the output of step 2 to varimax()

My concern is with the reported proportions of variance for the 3 components after varimax rotation. It looks like each of my 3 components explains 1/15 of the total variance, summing to a cumulative proportion of 20% of variance explained. But those 3 components I retained should now be the only components in the analysis, so they should be able to account for 100% of the explained variance.

I am able to get reliable-seeming results using principal() from the "psych" package, in which the total amount of variance explained by my retained components does not differ before and after rotation. But principal() uses varimax(), so I suspect I am either doing something wrong or misinterpreting the output when using the base package functions.

Am I doing something wrong when attempting to retain only 3 components?
Am I using varimax() incorrectly?
Am I misinterpreting the returned values from varimax()?

Thanks for any help,
Mike



Here is a link to the data file I am using: https://www.dropbox.com/s/scypebzy0nnhlwk/pca_sampledata.txt

### step 1 ###
> d1 = read.table("pca_sampledata.txt", header = TRUE)
> m1 = with(d1, ~ g.enjoy + g.look + g.cost + g.fit + g.health + g.resale + b.withstand + b.satisfy + b.vegetated + b.everyone + b.harmed + b.eco + b.ingenuity + b.security + b.proud)
> pca1 = prcomp(m1)
> summary(pca1) #output truncated for this posting
Importance of components:
                          PC1    PC2    PC3     PC4     PC5 ...    PC15
Standard deviation     1.5531 1.3064 1.1695 0.93512 0.92167 ... 0.35500
Proportion of Variance 0.2199 0.1556 0.1247 0.07972 0.07744 ... 0.01149
Cumulative Proportion  0.2199 0.3755 0.5002 0.57988 0.65732 ... 1.00000


### step 2 ###
> pca2 = prcomp(m1, tol=.75)
> summary(pca2) #full output shown
Importance of components:
                          PC1    PC2    PC3
Standard deviation     1.5531 1.3064 1.1695
Proportion of Variance 0.4397 0.3111 0.2493
Cumulative Proportion  0.4397 0.7507 1.0000


### step 3 ###
> pca3 = varimax(pca2$rotation)
> pca3
...
                 PC1   PC2   PC3
SS loadings    1.000 1.000 1.000
Proportion Var 0.067 0.067 0.067
Cumulative Var 0.067 0.133 0.200
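
A possible reading of the 0.067 values, sketched under stated assumptions: the columns of prcomp()$rotation are unit-length eigenvectors, so each has a sum of squares of exactly 1, and the printed "Proportion Var" divides that by the number of variables (1/15 here). One common remedy is to scale each eigenvector by its component's standard deviation before calling varimax(). The example uses six standardized mtcars columns as a stand-in for the survey items, and k = 3 is arbitrary; neither comes from the post.

```r
# Illustrative data only: six standardized mtcars columns stand in for the survey items.
X <- scale(mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")])

pca <- prcomp(X)   # compute ALL components; no tol
k <- 3             # number of components to retain (arbitrary here)

# Columns of pca$rotation are unit-length eigenvectors: sum of squares is 1 each.
unit_ss <- colSums(pca$rotation[, 1:k]^2)

# Scale each eigenvector by its component's standard deviation so that the
# column sums of squares equal the component variances.
load_mat <- pca$rotation[, 1:k] %*% diag(pca$sdev[1:k])
rot <- varimax(load_mat)

ss_before <- colSums(load_mat^2)                # equals pca$sdev[1:k]^2
ss_after  <- colSums(unclass(rot$loadings)^2)   # redistributed by the rotation

# Share of total variance carried by the k retained components
total_var <- sum(pca$sdev^2)
c(before = sum(ss_before) / total_var, after = sum(ss_after) / total_var)
```

The rotation is orthogonal, so it moves variance between the retained components but leaves their combined share of the total unchanged, which matches the behaviour described for principal().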