Rearranging PCA results from R

Rearranging PCA results from R

psycrcyo
Hi!!
I'm having trouble selecting 10 out of 41 attributes of the KDD data set. To identify the components with the highest variance I'm using princomp. The result I get for summary(pca1) is:


                             Comp.1       Comp.2       Comp.3       Comp.4       Comp.5       Comp.6       Comp.7       Comp.8       Comp.9      Comp.10
Standard deviation     9.882181e+05 3.303966e+04 7.083767e+02 3.282215e+02 9.839173e+01 4.642758e+01 2.923245e+01 6.447245e+00 2.689471e+00 1.292525e+00
Proportion of Variance 9.988828e-01 1.116555e-03 5.132601e-07 1.101902e-07 9.902073e-09 2.204758e-09 8.740565e-10 4.251648e-11 7.398482e-12 1.708784e-12
Cumulative Proportion  9.988828e-01 9.999994e-01 9.999999e-01 1.000000e+00 1.000000e+00 1.000000e+00 1.000000e+00 1.000000e+00 1.000000e+00 1.000000e+00

and for the loadings I get a constant 0.024 for the proportion of variance:

                Comp.1  Comp.2  Comp.3  Comp.4  Comp.5  Comp.6  Comp.7  Comp.8  Comp.9 Comp.10
SS loadings      1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000
Proportion Var   0.024   0.024   0.024   0.024   0.024   0.024   0.024   0.024   0.024   0.024
Cumulative Var   0.024   0.048   0.071   0.095   0.119   0.143   0.167   0.190   0.214   0.238

So the questions are: which of the two is the right proportion of variance? And is there a way for R to tell me which attributes the components belong to?

Any help would be much appreciated.

psycrcyo

Re: Rearranging PCA results from R

Uwe Ligges-3
On 22.04.2011 00:36, psycrcyo wrote:
> Hi!!
> I'm having trouble selecting 10 out of 41 attributes of the KDD data set. In
> order to identify the components with the higher variance I'm using
> princomp. the result i get for summary(pca1) is:


Actually, you calculated the first 10 principal components. You have not
selected anything, in particular no "attributes": all "attributes" are
included in your first 10 PCs. I'd suggest reading a textbook about PCA.
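
A minimal sketch of the point above, assuming the built-in USArrests data as a stand-in for the KDD set (the object names pca, prop_var, L and ss_loadings are illustrative and not from the original post). It suggests that the summary() table is the one reporting variance explained, while the "Proportion Var" row printed with the loadings is simply 1 divided by the number of input variables, because every loading vector is scaled to unit length. The rows of the loadings matrix carry the names of the original variables, which is how R shows what each attribute contributes to each component.

```r
# Stand-in data: the built-in USArrests set (4 numeric variables),
# used here only because the KDD data is not part of this sketch.
pca <- princomp(USArrests)

# Proportion of variance per component, as reported by summary(pca).
prop_var <- pca$sdev^2 / sum(pca$sdev^2)

# Loadings as a plain matrix: rows are the original variables,
# columns are the components. Every variable appears in every component.
L <- unclass(loadings(pca))

# Each column has unit sum of squares, so print(loadings(pca)) shows
# "SS loadings" = 1 and "Proportion Var" = 1 / (number of variables).
ss_loadings <- colSums(L^2)

print(round(prop_var, 4))
print(round(L, 3))
print(ss_loadings)
print(1 / nrow(L))
```

In the original post, the cumulative row ending in 0.238 is consistent with 10/42, which would suggest that 42 columns went into princomp.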

Some people like to perform stepwise regression of variables on the
first PC if it explains a lot of the variance, but that should be done
*very* carefully, if at all.
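
Before reading much into a first PC that explains nearly all of the variance, it may be worth checking whether that is only a scale effect. The sketch below is an assumption-laden illustration on the built-in USArrests data, not an analysis of the KDD set: it compares the default covariance-based princomp with the correlation-based variant (cor = TRUE). The helper share() and the other object names are hypothetical.

```r
# Covariance-based PCA (the princomp default) vs. correlation-based PCA.
pca_cov <- princomp(USArrests)
pca_cor <- princomp(USArrests, cor = TRUE)

# Hypothetical helper: proportion of variance per component.
share <- function(p) p$sdev^2 / sum(p$sdev^2)

# Share of variance carried by the first component under each choice.
first_cov <- share(pca_cov)[1]
first_cor <- share(pca_cor)[1]

# Variable with the largest absolute loading on Comp.1 in the covariance fit.
# With unscaled data this tends to be the variable with the largest variance.
load_cov <- unclass(loadings(pca_cov))
top_var <- names(which.max(abs(load_cov[, 1])))

print(c(covariance = unname(first_cov), correlation = unname(first_cor)))
print(top_var)
print(sapply(USArrests, var))
```

If the first component shrinks a lot once the variables are standardized, its dominance in the unscaled fit mostly reflects the measurement units of one or two variables.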

Best,
Uwe Ligges




______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.