apply pairs function to multiple columns in a data frame

jarvisma
I am very new to R and programming and thank you in advance for your patience and help with a complete novice!

I am working with a large multivariate data set that has 10 explanatory environmental variables (e.g. temp, depth) and over 60 response variables (each is a separate species).  My data frame is set up like the simplified version below:

JulianDay  Temperature  Salinity  Depth  Copepod  Barnacle  Gastropod  Bivalve
222        12.1         33        0.3    500      756       0          178
222        12.3         33.2      1.1    145      111       0          0
223        11.1         33.1      7      752      234       12         0

Where JulianDay, Temperature, Salinity, Depth, Copepod, Barnacle, Gastropod, and Bivalve are the column headers.

I am using the pairs function in R to explore my data.  My data frame is named Sunset.
Using this code:

Z=cbind(Sunset$Copepod,Sunset[,c(1:10)])   #the first 10 columns of my data frame are explanatory variables such as temp
Pairs.Copepod=pairs(Z,main="Copepods vs. explanatory variables", panel=panel.smooth)

I get a great pair plot of Copepods vs. all of the 10 explanatory environmental variables.  I would like to do this for each of my 60+ species.  I can't just make one big pair plot of all of my explanatory variables vs. all of my species because I have too many.  So instead I would like a separate pair plot for each species (each column of data after the first 10 columns of explanatory variables).

I would like to be able to write a loop that creates all of these plots at once, but haven't been able to do so.  Ideally, each pair plot would have the column header (e.g. Copepod) as its main title.  I would also love a way to save each pair plot as a separate jpeg in a folder on my desktop, with a file name that includes the species name (e.g. Copepod).

Again, I am very new to R and to programming, so I would GREATLY appreciate anyone patient and kind enough to respond with lots of detail so I can hopefully follow your answer and suggestions.  I have searched but haven't been able to find a way to write a loop that uses each column of data, so I would love some help!

Thank you!
Marley

Re: apply pairs function to multiple columns in a data frame

Phil Spector
A good way to solve problems like this is to write a function that
will work with one variable, and then use one of the apply family
of functions (in this case sapply) to do it for each of your
variables.  In this case, such a function would be something like
this:

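# Draw one pairs plot of the species column named in var against the
# first 10 (explanatory) columns, saved as <var>.jpg in the working directory.
make1plot <- function(var){
     jpeg(paste(var,'jpg',sep='.'))   # open a jpeg file named after the species
     pairs(Sunset[,c(var,names(Sunset)[1:10])],main=paste(var,'vs. explanatory variables'),
           panel=panel.smooth)
     dev.off()                        # close the file so the plot is written
}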

(Notice that indexing precludes the need for cbind when you're creating a
sub-matrix.)
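
To illustrate the note about indexing, here is a small sketch with a made-up miniature of Sunset (two explanatory columns instead of ten, purely for brevity; the values are copied from the table in the question). Selecting columns by name returns a data frame whose columns keep their headers, and pairs uses those headers as panel labels.

```r
# Hypothetical miniature of Sunset, for illustration only.
Sunset <- data.frame(Temperature = c(12.1, 12.3, 11.1),
                     Salinity    = c(33, 33.2, 33.1),
                     Copepod     = c(500, 145, 752),
                     Barnacle    = c(756, 111, 234))

var <- "Copepod"
# One indexing step: the species column first, then the explanatory columns.
Z <- Sunset[, c(var, names(Sunset)[1:2])]
names(Z)
```

Because the species is picked by its name, the same line works unchanged for any other species column.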

Now suppose the names of your dependent variables are stored in a vector
called "deps".  Then

sapply(deps,make1plot)

should do what you want.
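
A sketch of how the pieces above might fit together, assuming the species columns are every column after the first 10. The toy Sunset data frame, the Env1 to Env10 column names, the image size, and the "pairplots" output folder are all made up for illustration; substitute your own data and the path to the folder you want.

```r
# Toy stand-in for Sunset: 10 explanatory columns followed by 3 species columns.
# (Hypothetical data; replace with your real data frame.)
set.seed(1)
Sunset <- as.data.frame(matrix(runif(20 * 13), nrow = 20))
names(Sunset) <- c(paste0("Env", 1:10), "Copepod", "Barnacle", "Gastropod")

# Names of the response (species) columns: everything after the first 10.
deps <- names(Sunset)[-(1:10)]

# Folder for the plots (hypothetical; a temporary directory is used here).
outdir <- file.path(tempdir(), "pairplots")
dir.create(outdir, showWarnings = FALSE, recursive = TRUE)

make1plot <- function(var) {
  outfile <- file.path(outdir, paste(var, "jpg", sep = "."))
  jpeg(outfile, width = 1200, height = 1200)
  on.exit(dev.off())  # close the device even if pairs() fails
  pairs(Sunset[, c(var, names(Sunset)[1:10])],
        main = paste(var, "vs. explanatory variables"),
        panel = panel.smooth)
  outfile             # return the path of the file that was written
}

# vapply() calls make1plot once per species name and collects the file paths.
saved <- vapply(deps, make1plot, character(1))
```

Returning the file path from the function is a design choice made for this sketch: it gives a quick way to check afterwards which files were created.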

I couldn't test this, since you didn't provide a reproducible example.  If you
use this list often, you'll find that providing a reproducible example goes a
long way towards getting a good answer.
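
One possible way to make a question like this reproducible, sketched here with the three rows from the original post: build the data frame in code, or paste the output of dput from your own session so that others can recreate the object exactly.

```r
# The simplified table from the question, typed in as a data frame.
Sunset <- data.frame(
  JulianDay   = c(222, 222, 223),
  Temperature = c(12.1, 12.3, 11.1),
  Salinity    = c(33, 33.2, 33.1),
  Depth       = c(0.3, 1.1, 7),
  Copepod     = c(500, 145, 752),
  Barnacle    = c(756, 111, 234),
  Gastropod   = c(0, 0, 12),
  Bivalve     = c(178, 0, 0)
)

# dput() prints R code that rebuilds the object; paste that text into your post.
dput(Sunset)
```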
  - Phil Spector
  Statistical Computing Facility
  Department of Statistics
  UC Berkeley
  [hidden email]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: apply pairs function to multiple columns in a data frame

glsnow
In reply to this post by jarvisma
You might find the pairs2 function in the TeachingDemos package useful.
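
As a rough sketch of that suggestion: pairs2 takes two sets of columns and plots each column of the first against each column of the second, so the explanatory variables can go on one axis and a handful of species on the other. This assumes the TeachingDemos package is installed; the toy data, the Env1 to Env10 column names, and the choice of three species per plot are illustrative only.

```r
# Hypothetical stand-in for Sunset: 10 explanatory columns, then 3 species.
set.seed(1)
Sunset <- as.data.frame(matrix(runif(20 * 13), nrow = 20))
names(Sunset) <- c(paste0("Env", 1:10), "Copepod", "Barnacle", "Gastropod")

if (requireNamespace("TeachingDemos", quietly = TRUE)) {
  # Explanatory variables on the x-axes, a few species on the y-axes.
  TeachingDemos::pairs2(Sunset[, 1:10], Sunset[, 11:13])
} else {
  message("Install TeachingDemos first: install.packages('TeachingDemos')")
}
```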

--
Gregory (Greg) L. Snow Ph.D.
[hidden email]
