At 4:10 PM +0100 9/6/11, Lívio Cipriano wrote:

>Hi,
>
>Can anyone explain to me the differences between Q and R mode in
>Principal Component Analysis, as performed by prcomp and princomp
>respectively?

Dear Livio,

The help file of prcomp says it pretty well:

"The calculation is done by a singular value decomposition of the
(centered and possibly scaled) data matrix, not by using eigen on the
covariance matrix. This is generally the preferred method for numerical
accuracy."

Compare that with the help file for princomp:

"princomp only handles so-called R-mode PCA, that is feature extraction
of variables. If a data matrix is supplied (possibly via a formula) it
is required that there are at least as many units as variables. For
Q-mode PCA use prcomp."
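
As a rough illustration of the two help-file passages (a sketch on made-up random data, not anything from the original question), the component standard deviations from prcomp can be reproduced either from the SVD of the centered data or from eigen on the covariance matrix. princomp gives the same values up to a constant factor, because it divides by n rather than n - 1.

```r
set.seed(1)
X <- matrix(rnorm(50 * 4), nrow = 50, ncol = 4)  # illustrative: 50 units, 4 variables

# prcomp: SVD of the centered data matrix
p_svd <- prcomp(X)

# The same standard deviations by hand, from svd() and from eigen() on cov(X)
Xc <- scale(X, center = TRUE, scale = FALSE)
sd_from_svd   <- svd(Xc)$d / sqrt(nrow(X) - 1)
sd_from_eigen <- sqrt(eigen(cov(X), symmetric = TRUE)$values)

all.equal(p_svd$sdev, sd_from_svd)
all.equal(p_svd$sdev, sd_from_eigen)

# princomp uses divisor n, so its sdev is smaller by sqrt((n - 1) / n)
p_eig <- princomp(X)
all.equal(unname(p_eig$sdev), p_svd$sdev * sqrt((nrow(X) - 1) / nrow(X)))
```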

This R and Q (as well as S and T) terminology was introduced (at least
in psychology) by Ray Cattell in his discussion of the "Data Box". It
is the idea that you can consider three dimensions of data (across
subjects, variables, and time). Then there are six different ways to
cut up the data. A typical data matrix has rows for observations and
columns for variables. Typically the number of rows >> columns. If you
are trying to find a structure that reduces the complexity of the
variables, you do the normal analysis (R) of the variables. An
alternative is to do the analysis on the transpose of the data matrix
(Q analysis). That is, to try to reduce the complexity of the rows.
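
A minimal sketch of the two cuts, assuming a small made-up matrix with subjects in rows and variables in columns: the R analysis is prcomp on the matrix as it stands, and the Q analysis is prcomp on its transpose. The object names are just for illustration.

```r
set.seed(2)
X <- matrix(rnorm(6 * 10), nrow = 6, ncol = 10)  # illustrative: 6 subjects, 10 variables

r_mode <- prcomp(X)     # R analysis: components that summarise the 10 variables
q_mode <- prcomp(t(X))  # Q analysis: components that summarise the 6 subjects

dim(r_mode$rotation)    # loadings have one row per variable
dim(q_mode$rotation)    # loadings have one row per subject
```

Note that prcomp centers the columns of whatever it is given, so in the Q analysis each subject is centered across the variables rather than each variable across the subjects.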

Transposing the data is not a problem if you do a singular value
decomposition (which is what prcomp does). It can be a problem if you
do a princomp analysis, which is based upon the covariance matrix of
the data.
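
To make that concrete, here is a hedged sketch on made-up data with fewer units than variables. princomp is expected to stop with an error in this case, so the call is wrapped in tryCatch, while prcomp runs on the same matrix.

```r
set.seed(3)
X <- matrix(rnorm(5 * 8), nrow = 5, ncol = 8)  # illustrative: 5 units, 8 variables

# princomp refuses a data matrix with fewer units than variables;
# capture the error message instead of halting the script
res <- tryCatch(princomp(X), error = function(e) conditionMessage(e))
is.character(res)  # TRUE when princomp stopped with an error
res

# prcomp works from the SVD and accepts the same matrix
pc <- prcomp(X)
length(pc$sdev)
```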

Let nXv represent your original matrix (n observations on v variables).
For an R analysis, using princomp, you are finding the principal
components of the covariance matrix C, which is of size v x v with a
rank of at most the lesser of n and v. But for a Q analysis, if you are
using princomp, you are still trying to find the principal components
of a covariance matrix C*, which has dimensions n x n but a rank of at
most the lesser of n and v. That is, if the number of rows > number of
columns, the rank of the covariance matrix of the transposed matrix
cannot exceed the number of columns, although the size of that
covariance matrix will be n x n.
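
The rank argument can be checked numerically. This is only a sketch on random data, with the rank counted as the number of eigenvalues that are not numerically zero (the 1e-8 relative threshold is my assumption). Because cov() centers the data, the rank of C* in this example comes out one below the number of columns.

```r
set.seed(4)
n <- 12; v <- 3
X <- matrix(rnorm(n * v), nrow = n, ncol = v)  # illustrative: n observations on v variables

C     <- cov(X)     # v x v, used for the R analysis
Cstar <- cov(t(X))  # n x n, used for the Q analysis

# Numerical rank: eigenvalues above a small relative threshold (assumed tolerance)
num_rank <- function(M, tol = 1e-8) {
  ev <- eigen(M, symmetric = TRUE, only.values = TRUE)$values
  sum(ev > tol * max(ev))
}

dim(C);     rank_C     <- num_rank(C)
dim(Cstar); rank_Cstar <- num_rank(Cstar)

rank_C      # cannot exceed min(n, v)
rank_Cstar  # n x n matrix, but rank still bounded by v
```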

Q analysis is looking for patterns of similarity in the subjects over
variables; R analysis is looking for similarity in the variables over
subjects. This then gets generalized to the case of subjects over time,
variables over time, and so on.

"The data box emphasized that we are not limited to correlating tests
over people at one time. In its 1946 formulation, there were six
'designs of covariation using literal measurement' and 12 'designs of
covariation using differential or ratio measurement' (Cattell, 1946c,
p 94-95). Considering Persons, Tests, and Occasions as the fundamental
dimensions, it was possible to generalize the normal correlation of
Tests over Persons design (R analysis) to consider how Persons
correlated over Tests (Q analysis), or Tests over Occasions (P
analysis), etc. Cattell (1966) extended the data box's original three
dimensions to five by adding Background or preceding conditions as well
as Observers (see also Cattell (1977)). Applications of the data box
concept have been seen throughout psychology, but the primary influence
has probably been on those who study personality development and change
over the life span (McArdle & Bell, 2000, Mroczek, 2007, Nesselroade,
1984). Unfortunately, even for the original three dimensions, Cattell
(1978) used a different notation than he did in Cattell (1966, 1977) or
Cattell (1946b)."

British Journal of Psychology (2009), 100, 253-257
© 2009 The British Psychological Society

[1] R. B. Cattell. The data box: Its ordering of total resources in
terms of possible relational systems. In R. B. Cattell, editor,
Handbook of multivariate experimental psychology, pages 67-128.
Rand-McNally, Chicago, 1966.

I suspect this is more than you wanted to know.

Bill
