Statistical analysis

Statistical analysis

chrisli1223
Hi all,

I have got two datasets, one of them is rainfall data and the other one is groundwater level data.

I would like to see whether there is a correlation between these two datasets and if there is, to what extent they are correlated.

My stats background is limited, therefore any advice on which command I should use in R would be greatly appreciated.

Thanks in advance.
Chris

Re: Statistical analysis

Sharpie

Supposing you have two variables, precipitation p and groundwater potential h, a simple first check for a linear relationship is to produce a scatterplot of h vs. p:

plot( h ~ p )

If it looks linear, then it may be worthwhile to have R estimate the coefficient of correlation for the data:

cor( p, h )

If the correlation coefficient is close to +/- 1, then your data exhibit a strong linear trend and a linear model may be appropriate:

linModel <- lm( h ~ p )

abline( linModel )
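
To try these steps end to end without the real data, here is a minimal sketch on simulated values. The numbers used to generate p and h are made up purely for illustration, and cor.test() is an extra step not in the commands above: it attaches a p-value and a confidence interval to the correlation, which speaks to the "to what extent" part of the question.

```r
# Simulated stand-ins for the real data (assumption: paired observations,
# one groundwater value for each rainfall value).
set.seed(42)
p <- rgamma(200, shape = 2, rate = 0.5)   # precipitation (made-up distribution)
h <- 10 + 0.8 * p + rnorm(200, sd = 1)    # groundwater potential (made-up relation)

# Step 1: look at the data.
plot(h ~ p)

# Step 2: estimate the linear (Pearson) correlation coefficient.
r <- cor(p, h)

# Extra step: p-value and confidence interval for that correlation.
ct <- cor.test(p, h)

# Step 3: fit a straight line and draw it on the scatterplot.
linModel <- lm(h ~ p)
abline(linModel)

# For a one-predictor linear model, R-squared equals the squared correlation.
r2 <- summary(linModel)$r.squared
print(c(correlation = r, r_squared = r2))
print(ct$conf.int)
```

If the scatterplot looks curved or fans out, treat both the correlation coefficient and the fitted line with caution.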


Good luck!

-Charlie
Charlie Sharpsteen
Undergraduate-- Environmental Resources Engineering
Humboldt State University

Re: Statistical analysis

Paul Hiemstra
In reply to this post by chrisli1223
Hi,

My advice would be to get an introductory statistics book and start with
that. There is an Introductory stats book by Dalgaard that uses R.
That kills two birds with one stone.

http://www.amazon.com/Introductory-Statistics-R-Peter-Dalgaard/dp/0387954759

cheers,
Paul

--
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone:  +3130 274 3113 Mon-Tue
Phone:  +3130 253 5773 Wed-Fri
http://intamap.geo.uu.nl/~paul

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Statistical analysis

Arien Lam
In reply to this post by chrisli1223
Hi Chris,

If I understand your question correctly, what you want is both easy and hard.
Easy:
# making a reproducible example, as asked in the posting guide
# two vectors
water <- rnorm(1000)
rain <- rgamma(1000,.5)
# the following does everything you mention and more
summary(lm(water~rain))
cor(water,rain)

Hard:
lm() and cor() assume independence of observations, linearity of the relation, and normality of the
residuals. Are these assumptions valid for your problem?
Are your datasets time series? There will be autocorrelation in both datasets (try ??autocorrelation
in R). There may be a lag (see ?lag). Decide whether to estimate and correct for those.
Are there multiple sample locations? There may be dependence.
Would you rather assume rain and change in groundwater level are related?
Etc.
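
A rough sketch of how some of these checks could look in base R, again on simulated data. The delay of two time steps and the persistence factor of 0.5 are invented for the example; the real system may behave quite differently.

```r
# Hypothetical set-up: groundwater responds to rain two time steps earlier
# and decays gradually afterwards. All numbers are made up.
set.seed(1)
n <- 500
rain <- rgamma(n, shape = 0.5)
delayed_rain <- c(0, 0, rain[1:(n - 2)])
water <- as.numeric(stats::filter(delayed_rain, 0.5, method = "recursive")) +
  rnorm(n, sd = 0.1)

# Naive fit that ignores the time structure.
fit <- lm(water ~ rain)

# Independence check: autocorrelation of the residuals.
# Element 1 of $acf is lag 0, element 2 is lag 1.
res_acf <- acf(residuals(fit), plot = FALSE)
lag1_autocorr <- res_acf$acf[2]

# Lag check: cross-correlation between the two series.
# ccf(x, y) at lag k estimates cor(x[t + k], y[t]), so a peak at a
# negative k means rain leads water by |k| steps.
cc <- ccf(rain, water, lag.max = 10, plot = FALSE)
best_lag <- cc$lag[which.max(abs(cc$acf))]

# Alternative view: relate rain to the change in groundwater level.
cc_change <- ccf(rain[-1], diff(water), lag.max = 10, plot = FALSE)

print(c(lag1_autocorr = lag1_autocorr, best_lag = best_lag))
```

Calling acf() and ccf() without plot = FALSE draws the correlograms, which is usually the easier way to read them.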

Cheers,

Arien


--
drs. H.A. (Arien) Lam (Ph.D. student)
Department of Physical Geography
Faculty of Geosciences
Utrecht University, The Netherlands


Re: Statistical analysis

Arun.stat
In reply to this post by Sharpie
Rainfall data is often treated as a random walk process, in which case it is non-stationary. If a correlation or regression coefficient is then computed on the raw data, you may end up with spurious results. I would suggest first checking whether your data has a unit root. If it does, estimate the correlation, or any other statistical measure, on the differenced data.
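
As an illustration of the idea only, the sketch below uses two simulated, independent random walks rather than real rainfall or groundwater data. PP.test() (the Phillips-Perron test in base R's stats package) is used here as one possible unit root test; it is an assumption of this example, and other tests are available in add-on packages.

```r
# Two independent random walks (simulated; any correlation between them
# is spurious by construction).
set.seed(123)
n <- 500
x <- cumsum(rnorm(n))
y <- cumsum(rnorm(n))

# Correlation on the raw series can be far from zero despite independence.
raw_cor <- cor(x, y)

# Unit root check on the raw series: a large p-value means the
# unit root hypothesis cannot be rejected.
p_raw <- PP.test(x)$p.value

# Difference both series and repeat.
dx <- diff(x)
dy <- diff(y)
p_diff <- PP.test(dx)$p.value   # small p-value: no evidence of a unit root
diff_cor <- cor(dx, dy)         # correlation on the differenced data

print(c(raw_cor = raw_cor, diff_cor = diff_cor, p_raw = p_raw, p_diff = p_diff))
```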

Best,



Re: Statistical analysis

Greg Snow-2
In reply to this post by chrisli1223
Since today's groundwater may be influenced by yesterday's rainfall, you may want to look at the dynlm package and possibly lag.plot and the zoo package.
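
A small sketch of the lagged-rainfall idea using base R only, so that it runs without extra packages. The data are simulated and the one-day effect of 0.6 is an invented number. With the dynlm package installed, a formula along the lines of water ~ L(rain, 1) on time series objects should express the same model more compactly.

```r
# Hypothetical data: today's groundwater depends on yesterday's rain.
set.seed(7)
n <- 200
rain <- rgamma(n, shape = 0.5)
water <- 5 + 0.6 * c(0, rain[-n]) + rnorm(n, sd = 0.1)

# Build the lagged predictor by hand; the first day has no "yesterday".
rain_lag1 <- c(NA, rain[-n])
dat <- data.frame(water, rain, rain_lag1)

# Regress on same-day and previous-day rain.
# lm() drops the row with the missing lag by default.
fit <- lm(water ~ rain + rain_lag1, data = dat)
print(coef(fit))

# lag.plot() shows a series against its own lagged values, a quick
# visual check for dependence on the past.
lag.plot(water, lags = 4)
```

In this made-up example the coefficient on rain_lag1 should come out close to 0.6 and the one on same-day rain close to zero; with real data the pattern of lags is what needs to be discovered.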

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[hidden email]
801.408.8111


> -----Original Message-----
> From: [hidden email] [mailto:r-help-bounces@r-project.org] On Behalf Of Chris Li
> Sent: Wednesday, September 23, 2009 5:37 PM
> To: [hidden email]
> Subject: [R] Statistical analysis