# How to plot PCA output?

14 messages

## How to plot PCA output?

I have a decent-sized matrix (36 x 11,000) that I have performed a PCA on with prcomp(), but due to the large number of variables I can't plot the result with biplot(). How else can I plot the PCA output?

I tried posting this before but got no responses, so I'm trying again. Surely this is a common problem, but I can't find a solution with Google.

The University of Dundee is a registered Scottish Charity, No: SC015096

______________________________________________

[hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

## Re: How to plot PCA output?

In reply to Christian Cole (07.05.2012, 12:22)

That depends on what you want to plot. Basically, you could just use plot() with pcaResult\$x. You might need to specify which PCs you want to plot, though.

`pcaResult <- prcomp(iris[, 1:4])`

`plot(pcaResult$x)  # gives the first 2 PCs`

`plot(pcaResult$x[, 2:3])  # gives the 2nd vs the 3rd PC`

Or, if you want to see more, you can use pairs():

`pairs(pcaResult$x)`

If you want things colored, there's the col parameter, which works for both functions:

`pairs(pcaResult$x, col = iris[, 5])`

Does this help?
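For a matrix shaped like the one in the question, the same approach can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated, and the names `mat`, `groups`, and `pct` are hypothetical stand-ins, not anything from the original poster's analysis. It plots the scores in `$x` for the first two PCs, labels each of the 36 samples, and puts the share of variance on the axis labels.

```r
# Simulated stand-in for the 36 x 11,000 matrix (hypothetical data).
set.seed(1)
groups <- factor(rep(c("A", "B", "C"), each = 12))   # assumed sample groups
mat <- matrix(rnorm(36 * 11000), nrow = 36)          # rows = samples, columns = variables
mat[groups == "B", 1:500] <- mat[groups == "B", 1:500] + 2   # give one group some signal

pca <- prcomp(mat)

# Percentage of variance explained by each PC, used in the axis labels.
pct <- round(100 * pca$sdev^2 / sum(pca$sdev^2), 1)

# Scores plot: one point per sample, no variable arrows at all.
plot(pca$x[, 1:2], col = as.integer(groups), pch = 19,
     xlab = paste0("PC1 (", pct[1], "%)"),
     ylab = paste0("PC2 (", pct[2], "%)"))
text(pca$x[, 1:2], labels = seq_len(nrow(mat)), pos = 3, cex = 0.7)
legend("topright", legend = levels(groups),
       col = seq_along(levels(groups)), pch = 19)
```

Because only the 36 sample scores are drawn, the number of variables no longer affects readability, which is the problem biplot() runs into with 11,000 arrows.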

## Re: How to plot PCA output?

In reply to Jessica Streicher (07.05.2012, 15:24)

To add: if that's not it, maybe you could be a bit more specific about what you consider the "result" and how you want it visualized.

## Re: How to plot PCA output?


## Re: How to plot PCA output?


## Re: How to plot PCA output?

In reply to this post by Jessica Streicher

Hi Jessica,

Yes, that does help. It confirms what I found by digging around in the prcomp object.

I was plotting \$x, but wasn't sure whether this was appropriate, mainly because the data ranges in \$x are different from those plotted by biplot(), as I mentioned in my reply to Bryan. Do you know if this difference in data range matters?

Many thanks,

Chris

## Re: How to plot PCA output?

In reply to Christian Cole (07.05.2012, 16:01)

biplot(), depending on what parameters you give it, scales the data in a certain way. See http://stat.ethz.ch/R-manual/R-patched/library/stats/html/biplot.princomp.html for the scale argument:

> scale: The variables are scaled by lambda ^ scale and the observations are scaled by lambda ^ (1-scale) where lambda are the singular values as computed by princomp. Normally 0 <= scale <= 1, and a warning will be issued if the specified scale is outside this range.
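The effect of the scale argument on the plotted ranges can be sketched by redoing the rescaling by hand. This is a rough sketch and rests on an assumption: that `biplot()` for a `prcomp` object divides the scores by `(sdev * sqrt(n))^scale` and multiplies the loadings by the same factor. It is worth checking `stats:::biplot.prcomp` in your own R version before relying on it. The names `sc`, `lam`, `obs_scaled`, and `var_scaled` are illustrative only.

```r
pca <- prcomp(iris[, 1:4])
n   <- nrow(pca$x)
sc  <- 1                                   # the default value of biplot()'s scale argument

# Assumed rescaling factor for the first two PCs (see the caveat above).
lam <- (pca$sdev[1:2] * sqrt(n))^sc

obs_scaled <- sweep(pca$x[, 1:2], 2, lam, "/")          # observation coordinates
var_scaled <- sweep(pca$rotation[, 1:2], 2, lam, "*")   # arrow coordinates

range(pca$x[, 1])        # range of the raw PC1 scores
range(obs_scaled[, 1])   # much narrower range after rescaling

# With scale = 0 the factor is 1, so the points should sit at the raw scores.
biplot(pca, scale = 0)
```

Under that assumption, the different data ranges are only a per-axis rescaling: the relative positions of the samples along each PC are unchanged, so plotting `$x` directly shows the same configuration of points.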

## Re: How to plot PCA output?


## Re: How to plot PCA output?


## Re: How to plot PCA output?


## Re: How to plot PCA output?


## Re: How to plot PCA output?

In reply to this post by Bryan Hanson

> -----Original Message-----
> I avoid the biplot at all costs, because IMHO it violates one of the tenets of good graphic design: It has two entirely different scales on axes. These are maximally confusing to the end-user. So I never use it.

I think you're being unnecessarily restrictive there. The confusion that arises when using multiple scales in the same graphical dimension arises from a tendency to read distances and locations on the wrong scale. In a biplot, the PCs have essentially no intuitive physical interpretation (by which I mean a 1:1 mapping onto an identifiable variable), so this doesn't matter much even if it happens (in fact you could probably lose the scales entirely in a biplot without compromising its interpretation much). And the alternative - sticking rigidly to the 'one axis per dimension' rule and plotting them with the _same_ scales - often leads to unreadable plots: invisibly tiny arrows or an invisibly tiny cloud of data points.

But having indicated that I don't see a biplot's multiple scales as particularly likely to confuse or mislead, I'm always interested in alternatives. The interesting question is: 'given the same objective - a qualitative indication of which variables have most influenced the location of particular data points (or vice versa) and in which general direction - what do you suggest instead?'

Steve Ellison
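The point about a single shared scale producing "invisibly tiny arrows" can be illustrated with a small sketch. The dataset (`USArrests`, left unscaled) is my choice for illustration and is not from the thread; `scores` and `loadings` are likewise just convenient names.

```r
pca <- prcomp(USArrests)           # unscaled on purpose, so the scores span a wide range

scores   <- pca$x[, 1:2]
loadings <- pca$rotation[, 1:2]    # columns are unit-length, so every entry lies in [-1, 1]

range(scores)     # wide range, driven by the variable with the largest variance
range(loadings)   # confined to [-1, 1]

# Both drawn on the same axes: the arrows shrink to a speck at the origin.
# R may warn that some arrows are too short to draw.
plot(scores, pch = 19, col = "grey40", asp = 1,
     main = "Scores and loadings on one scale")
arrows(0, 0, loadings[, 1], loadings[, 2], length = 0.05, col = "red")
```

Stretching the arrows by some factor makes them visible again, and at that point the plot has two scales, which is what biplot() does.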

## Re: How to plot PCA output?

In reply to this post by Christian Cole

I think the question on your mind should be: 'What do I want to do with this plot?' Just producing output from the PCA is easy; plotting output\$sdev is probably quite informative. From the sounds of it, though, you want to do clustering with the PCA component loadings? (Since that's mostly what the biplot accomplishes using the first two PCs.)

The first thing to note, then, is that you might not want to plot all 36 PCs! Once you go higher than the first few, your results will likely become remarkably awful in ways that might not be obvious. A biplot with PCs 1 & 2, or 2 & 3, for example, could easily be sufficient.

If you still want to plot many PCs, from an exploratory point of view, something like a parallel coordinates plot might be helpful. Alternatively, you could look at rgl for general plotting of 3D points (so you can do a 3D version of the biplot), or apply more systematic clustering algorithms.

Zhou
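Two of these suggestions can be sketched briefly. This is an illustration rather than Zhou's own code: it uses `iris` as a stand-in dataset and assumes the MASS package (normally installed alongside R) for `parcoord()`. The rgl line is left as a comment because that package has to be installed separately.

```r
library(MASS)   # provides parcoord(); assumed to be available

pca <- prcomp(iris[, 1:4])

# Standard deviation of each PC: shows how quickly later PCs stop mattering.
plot(pca$sdev, type = "b",
     xlab = "Principal component", ylab = "Standard deviation")

# Parallel coordinates plot of the scores: one line per observation,
# one vertical axis per PC, coloured by species.
parcoord(pca$x, col = as.integer(iris$Species))

# A 3D scatter of the first three PCs, if the rgl package is installed:
# rgl::plot3d(pca$x[, 1:3], col = as.integer(iris$Species))
```

For the 36 x 11,000 case it would probably make sense to pass only the first few columns of `$x` to `parcoord()`, in line with the advice above about not plotting all 36 PCs.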