Hi Billy,

Thanks for posting your data. First, as Michael pointed out:

> table(complete.cases(Q))

FALSE  TRUE
   12    54

shows that of the 66 rows in your data set, only 54 are complete.
That means when you use na.action = na.omit, you are actually passing
a data frame with only 54 rows. Second, prcomp() will not return more
components than observations. Think of it this way: it is like trying
to connect 54 data points with 84 lines. 53 lines will fit perfectly
(a straight line connects two points); you are trying to go way past
that.
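Here is a minimal sketch of that limit using simulated data rather than your Q. The names X_sim and pca_sim are made up for illustration, and I am assuming 66 rows and 84 columns with 12 rows containing a missing value, just to mirror the counts above:

```r
set.seed(1)

# Simulated stand-in for 'Q': 66 rows, 84 columns (not the real data)
X_sim <- data.frame(matrix(rnorm(66 * 84), nrow = 66, ncol = 84))

# Put one NA in each of 12 distinct rows so that 54 rows stay complete
X_sim[sample(66, 12), 1] <- NA
sum(complete.cases(X_sim))   # 54

# na.omit drops the incomplete rows before the PCA is computed
pca_sim <- prcomp(~ ., data = X_sim, scale. = TRUE, na.action = na.omit)

# The number of components is capped by the smaller dimension
length(pca_sim$sdev)         # 54, not 84
```

The last of those 54 components should have essentially zero variance, because centering uses up one degree of freedom, which is the "53 lines" in the analogy.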

I have not heard (this is just an indicator of my ignorance, not their

lack of existence) of 'S' vs. 'R' mode PCA, but if you want the

columns to be the cases, just t()ranspose the data frame.

> table(complete.cases(t(Q)))

FALSE  TRUE
    9    75

So there are still only 75 possible observations to work with due to
missingness, but that is more than enough for the 66 variables:

test_pca_Q2 <- prcomp(~ ., data = data.frame(t(Q)), scale. = TRUE, retx = FALSE,
                      na.action = na.omit)

> length(test_pca_Q2$sdev)

[1] 66

so there are 66 SDs for the 66 principal components.
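In case the data.frame(t(Q)) step looks odd: t() on a data frame returns a matrix, so it needs converting back before the formula interface can use it. A small sketch with simulated data (Q_sim, Q_t and pca_t are hypothetical names, and there are no missing values here, so all 84 transposed rows are kept):

```r
set.seed(42)

# Simulated stand-in for 'Q': 66 rows, 84 columns (not the real data)
Q_sim <- data.frame(matrix(rnorm(66 * 84), nrow = 66, ncol = 84))

# Transposing swaps rows and columns and returns a matrix
Q_t <- t(Q_sim)
is.matrix(Q_t)   # TRUE
dim(Q_t)         # 84 66

# Convert back to a data frame for the formula interface
pca_t <- prcomp(~ ., data = data.frame(Q_t), scale. = TRUE,
                na.action = na.omit)

# Now there are more observations (84) than variables (66)
length(pca_t$sdev)   # 66
```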

Regarding what the 'sdev' values are, they are the square roots of
the eigenvalues of the correlation matrix (correlation rather than
covariance in your case, since you scaled). You can see this below:

## first ten (1:10) square roots of the eigenvalues of the correlation matrix
## of the complete cases of the transposed data set 'Q'

> sqrt(eigen(cor(na.omit(t(Q))))$values)[1:10]

[1] 7.3267465 2.0349335 1.2913823 1.0750288 0.9035650 0.8301671 0.7370896

[8] 0.7132530 0.6196836 0.5396176

## sdev from prcomp()

> test_pca_Q2$sdev[1:10]

[1] 7.3267465 2.0349335 1.2913823 1.0750288 0.9035650 0.8301671 0.7370896

[8] 0.7132530 0.6196836 0.5396176
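If you want to check the same relationship without my copy of your data, here is a self-contained sketch on simulated values (X_demo, pca_demo, eig_demo and prop_var are names I made up). It also shows one consequence that may help when reading sdev^2: with scaled data the eigenvalues sum to the number of variables, so dividing each by that sum gives the proportion of variance per component.

```r
set.seed(123)

# Simulated data: 75 observations, 10 variables (not the real 'Q')
X_demo <- matrix(rnorm(75 * 10), nrow = 75, ncol = 10)
X_demo[, 2] <- X_demo[, 1] + 0.5 * X_demo[, 2]   # induce some correlation

pca_demo <- prcomp(X_demo, scale. = TRUE)
eig_demo <- eigen(cor(X_demo))$values

# sdev matches the square roots of the correlation-matrix eigenvalues
all.equal(pca_demo$sdev, sqrt(eig_demo))   # TRUE

# With scaling, the eigenvalues add up to the number of variables
sum(pca_demo$sdev^2)                       # 10

# Proportion of total variance carried by each component
prop_var <- pca_demo$sdev^2 / sum(pca_demo$sdev^2)
```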

You can also try the principal() function in package "psych". It has

a lot of nice options, and I tend to use it for all this sort of

stuff.

Cheers,

Josh

On Thu, Aug 4, 2011 at 11:07 AM, William Armstrong <[hidden email]> wrote:

> David and Josh,

>

> Thank you for the suggestions. I have attached a file ('q_values.txt') that

> contains the values of the 'Q' variable.

>

> David -- I am attempting an 'S' mode PCA, where the columns are actually the

> cases (different stream gaging stations) and the rows are the variables (the

> maximum flow at each station for a given year). I think the format you are

> referring to is 'R' mode, but I was under the impression that R (the

> program, not the PCA mode) could handle the analyses in either format. Am I

> mistaken?

>

> My first eigenvalue is:

>

>> unrotated_pca_q$sdev[1]^2

> [1] 17.77812

>

> Does that value seem large enough to explain the reduction in principal

> components from 65 to 54?

>

> Also, the loadings on the first PC are not particularly high:

>

> > max(abs(unrotated_pca_q$rotation[1:84]))

> [1] 0.1794776

>

> Does that suggest that maybe the data are not very highly correlated?

>

> Thank you both very much for your help.

>

> Billy

>
> http://r.789695.n4.nabble.com/file/n3719440/q_values.txt q_values.txt
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Limited-number-of-principal-components-in-PCA-tp3704956p3719440.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--

Joshua Wiley

Ph.D. Student, Health Psychology

Programmer Analyst II, ATS Statistical Consulting Group

University of California, Los Angeles

https://joshuawiley.com/