

The small bit of script below is an example of what I'm attempting to do 
find the day on which the 'center of mass' occurs. In case that is the
wrong term, I'd like to know the day that essentially cuts the area under
the curve in to two equal parts:
set.seed(4004)
Date < seq(as.Date('20000901'), as.Date('20000930'), by='day')
hyd < ((100*(sin(seq(0.5,4.5,length.out=30))+10) +
seq(45,1,length.out=30)) + rnorm(30)*8)  800
# View the example curve
plot(Date, hyd, las=1)
# By trialanderror, the day on which the center of mass occurs is the
11th day:
# Add up the area under the curve for the first 11 days and compare
# with the last 19 days:
sum(hyd[1:11])
# 3546.364
sum(hyd[12:30])
# 3947.553
# Add up the area under the curve for the first 12 days and compare
# with the last 18 days:
sum(hyd[1:12])
# 3875.753
sum(hyd[13:30])
# 3618.164
By day 12, the halfway point has already been passed, so the answer that
would be returned would be:
Date[11]
# "20000911"
For the larger problem, it'd be handy if the proposed function could
process a multiyear time series (a runoff hydrograph) and return the day
of the center of mass for each year in the time series.
I appreciate any pointers...Eric
[[alternative HTML version deleted]]
______________________________________________
[hidden email] mailing list  To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.


Hi Eric,
How about
match( TRUE, cumsum(hyd/sum(hyd)) > .5 )  1
HTH,
Eric
On Sat, Dec 16, 2017 at 3:18 PM, Morway, Eric < [hidden email]> wrote:
> The small bit of script below is an example of what I'm attempting to do 
> find the day on which the 'center of mass' occurs. In case that is the
> wrong term, I'd like to know the day that essentially cuts the area under
> the curve in to two equal parts:
>
> set.seed(4004)
> Date < seq(as.Date('20000901'), as.Date('20000930'), by='day')
> hyd < ((100*(sin(seq(0.5,4.5,length.out=30))+10) +
> seq(45,1,length.out=30)) + rnorm(30)*8)  800
>
> # View the example curve
> plot(Date, hyd, las=1)
>
> # By trialanderror, the day on which the center of mass occurs is the
> 11th day:
> # Add up the area under the curve for the first 11 days and compare
> # with the last 19 days:
>
> sum(hyd[1:11])
> # 3546.364
> sum(hyd[12:30])
> # 3947.553
>
> # Add up the area under the curve for the first 12 days and compare
> # with the last 18 days:
>
> sum(hyd[1:12])
> # 3875.753
> sum(hyd[13:30])
> # 3618.164
>
> By day 12, the halfway point has already been passed, so the answer that
> would be returned would be:
>
> Date[11]
> # "20000911"
>
> For the larger problem, it'd be handy if the proposed function could
> process a multiyear time series (a runoff hydrograph) and return the day
> of the center of mass for each year in the time series.
>
> I appreciate any pointers...Eric
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list  To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/rhelp> PLEASE do read the posting guide http://www.Rproject.org/> postingguide.html
> and provide commented, minimal, selfcontained, reproducible code.
>
[[alternative HTML version deleted]]
______________________________________________
[hidden email] mailing list  To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.


Slightly faster: sum(cumsum(hyd) <= .5 * sum(hyd))
Best regards,
ir. Thierry Onkelinx
Statisticus / Statistician
Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE
AND FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
[hidden email]
Kliniekstraat 25, B1070 Brussel
www.inbo.be
///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no
more than asking him to perform a postmortem examination: he may be
able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does
not ensure that a reasonable answer can be extracted from a given body
of data. ~ John Tukey
///////////////////////////////////////////////////////////////////////////////////////////
Van 14 tot en met 19 december 2017 verhuizen we uit onze vestiging in
Brussel naar het Herman Teirlinckgebouw op de site Thurn & Taxis.
Vanaf dan ben je welkom op het nieuwe adres: Havenlaan 88 bus 73, 1000 Brussel.
///////////////////////////////////////////////////////////////////////////////////////////
20171216 14:32 GMT+01:00 Eric Berger < [hidden email]>:
> Hi Eric,
> How about
>
> match( TRUE, cumsum(hyd/sum(hyd)) > .5 )  1
>
> HTH,
> Eric
>
>
> On Sat, Dec 16, 2017 at 3:18 PM, Morway, Eric < [hidden email]> wrote:
>
>> The small bit of script below is an example of what I'm attempting to do 
>> find the day on which the 'center of mass' occurs. In case that is the
>> wrong term, I'd like to know the day that essentially cuts the area under
>> the curve in to two equal parts:
>>
>> set.seed(4004)
>> Date < seq(as.Date('20000901'), as.Date('20000930'), by='day')
>> hyd < ((100*(sin(seq(0.5,4.5,length.out=30))+10) +
>> seq(45,1,length.out=30)) + rnorm(30)*8)  800
>>
>> # View the example curve
>> plot(Date, hyd, las=1)
>>
>> # By trialanderror, the day on which the center of mass occurs is the
>> 11th day:
>> # Add up the area under the curve for the first 11 days and compare
>> # with the last 19 days:
>>
>> sum(hyd[1:11])
>> # 3546.364
>> sum(hyd[12:30])
>> # 3947.553
>>
>> # Add up the area under the curve for the first 12 days and compare
>> # with the last 18 days:
>>
>> sum(hyd[1:12])
>> # 3875.753
>> sum(hyd[13:30])
>> # 3618.164
>>
>> By day 12, the halfway point has already been passed, so the answer that
>> would be returned would be:
>>
>> Date[11]
>> # "20000911"
>>
>> For the larger problem, it'd be handy if the proposed function could
>> process a multiyear time series (a runoff hydrograph) and return the day
>> of the center of mass for each year in the time series.
>>
>> I appreciate any pointers...Eric
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> [hidden email] mailing list  To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/rhelp>> PLEASE do read the posting guide http://www.Rproject.org/>> postingguide.html
>> and provide commented, minimal, selfcontained, reproducible code.
>>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list  To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/rhelp> PLEASE do read the posting guide http://www.Rproject.org/postingguide.html> and provide commented, minimal, selfcontained, reproducible code.
______________________________________________
[hidden email] mailing list  To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.


Eric B's response provided just the kind of quick & simple solution I was
hoping for (appears as the function com below). However, I once again
failed to take advantage of the power of R and have reverted back to using
a for loop for the next step of the processing. The example below (which
requires the library EGRET for pulling an example dataset) works, but
probably can be replaced with some version of the apply functionality? So
far, I've been unable to figure out how to enter the arguments to the apply
function. The idea is this: for each unique water year (variable 'wyrs' in
example below) in a 27 year continuous time series of daily values, find
the date of the 'center of mass', and build a vector of those dates.
Thanks, Eric M
library(EGRET)
StartDate < "19901001"
EndDate < "20170930"
siteNumber < "10310000"
QParameterCd < "00060"
Daily < readNWISDaily(siteNumber, QParameterCd, StartDate, EndDate)
# Define 'center of mass' function
com < function(x) {
match(TRUE, cumsum(x/sum(x)) > 0.5)  1
}
wyrs < unique(Daily$waterYear)
for(i in (1:length(wyrs))){
OneYr < Daily[Daily$waterYear==wyrs[i], ]
mid < com(OneYr$Q)
if(i==1){
midPts < as.Date(OneYr$Date[mid])
} else {
midPts < c(midPts, as.Date(OneYr$Date[mid]))
}
}
Eric Morway
Research Hydrologist
Nevada Water Science Center
U.S. Geological Survey
2730 N. Deer Run Rd.
Carson City, NV 89701
(775) 8877668
*orcid*: 0000000285536140 < http://orcid.org/0000000285536140>
On Sat, Dec 16, 2017 at 5:32 AM, Eric Berger < [hidden email]> wrote:
> Hi Eric,
> How about
>
> match( TRUE, cumsum(hyd/sum(hyd)) > .5 )  1
>
> HTH,
> Eric
>
>
> On Sat, Dec 16, 2017 at 3:18 PM, Morway, Eric < [hidden email]> wrote:
>
>> The small bit of script below is an example of what I'm attempting to do 
>> find the day on which the 'center of mass' occurs. In case that is the
>> wrong term, I'd like to know the day that essentially cuts the area under
>> the curve in to two equal parts:
>>
>> set.seed(4004)
>> Date < seq(as.Date('20000901'), as.Date('20000930'), by='day')
>> hyd < ((100*(sin(seq(0.5,4.5,length.out=30))+10) +
>> seq(45,1,length.out=30)) + rnorm(30)*8)  800
>>
>> # View the example curve
>> plot(Date, hyd, las=1)
>>
>> # By trialanderror, the day on which the center of mass occurs is the
>> 11th day:
>> # Add up the area under the curve for the first 11 days and compare
>> # with the last 19 days:
>>
>> sum(hyd[1:11])
>> # 3546.364
>> sum(hyd[12:30])
>> # 3947.553
>>
>> # Add up the area under the curve for the first 12 days and compare
>> # with the last 18 days:
>>
>> sum(hyd[1:12])
>> # 3875.753
>> sum(hyd[13:30])
>> # 3618.164
>>
>> By day 12, the halfway point has already been passed, so the answer that
>> would be returned would be:
>>
>> Date[11]
>> # "20000911"
>>
>> For the larger problem, it'd be handy if the proposed function could
>> process a multiyear time series (a runoff hydrograph) and return the day
>> of the center of mass for each year in the time series.
>>
>> I appreciate any pointers...Eric
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> [hidden email] mailing list  To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/rhelp>> PLEASE do read the posting guide http://www.Rproject.org/posti>> ngguide.html
>> and provide commented, minimal, selfcontained, reproducible code.
>>
>
>
[[alternative HTML version deleted]]
______________________________________________
[hidden email] mailing list  To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.


Hi Eric,
the following works for me.
HTH,
Eric
library(EGRET)
StartDate < "19901001"
EndDate < "20170930"
siteNumber < "10310000"
QParameterCd < "00060"
Daily < readNWISDaily(siteNumber, QParameterCd, StartDate, EndDate)
# Define 'center of mass' function
com < function(x) {
match(TRUE, cumsum(x/sum(x)) > 0.5)  1
}
wyrs < unique(Daily$waterYear)
x < as.Date(sapply( wyrs, function(yr) { Df <
Daily[Daily$waterYear==yr,]; Df$Date[com(Df$Q)] } ), "19700101")
On Mon, Dec 18, 2017 at 4:47 PM, Morway, Eric < [hidden email]> wrote:
> Eric B's response provided just the kind of quick & simple solution I was
> hoping for (appears as the function com below). However, I once again
> failed to take advantage of the power of R and have reverted back to using
> a for loop for the next step of the processing. The example below (which
> requires the library EGRET for pulling an example dataset) works, but
> probably can be replaced with some version of the apply functionality? So
> far, I've been unable to figure out how to enter the arguments to the apply
> function. The idea is this: for each unique water year (variable 'wyrs' in
> example below) in a 27 year continuous time series of daily values, find
> the date of the 'center of mass', and build a vector of those dates.
> Thanks, Eric M
>
> library(EGRET)
>
> StartDate < "19901001"
> EndDate < "20170930"
> siteNumber < "10310000"
> QParameterCd < "00060"
>
> Daily < readNWISDaily(siteNumber, QParameterCd, StartDate, EndDate)
>
> # Define 'center of mass' function
> com < function(x) {
> match(TRUE, cumsum(x/sum(x)) > 0.5)  1
> }
>
>
> wyrs < unique(Daily$waterYear)
> for(i in (1:length(wyrs))){
> OneYr < Daily[Daily$waterYear==wyrs[i], ]
> mid < com(OneYr$Q)
> if(i==1){
> midPts < as.Date(OneYr$Date[mid])
> } else {
> midPts < c(midPts, as.Date(OneYr$Date[mid]))
> }
> }
>
>
>
> Eric Morway
> Research Hydrologist
> Nevada Water Science Center
> U.S. Geological Survey
> 2730 N. Deer Run Rd.
> < https://maps.google.com/?q=2730+N.+Deer+Run+Rd.Carson+City,+NV+89701+(775&entry=gmail&source=g>
> Carson City, NV 89701
> < https://maps.google.com/?q=2730+N.+Deer+Run+Rd.Carson+City,+NV+89701+(775&entry=gmail&source=g>
> (775
> < https://maps.google.com/?q=2730+N.+Deer+Run+Rd.Carson+City,+NV+89701+(775&entry=gmail&source=g>)
> 8877668
> *orcid*: 0000000285536140 < http://orcid.org/0000000285536140>
>
>
>
> On Sat, Dec 16, 2017 at 5:32 AM, Eric Berger < [hidden email]>
> wrote:
>
>> Hi Eric,
>> How about
>>
>> match( TRUE, cumsum(hyd/sum(hyd)) > .5 )  1
>>
>> HTH,
>> Eric
>>
>>
>> On Sat, Dec 16, 2017 at 3:18 PM, Morway, Eric < [hidden email]> wrote:
>>
>>> The small bit of script below is an example of what I'm attempting to do
>>> 
>>> find the day on which the 'center of mass' occurs. In case that is the
>>> wrong term, I'd like to know the day that essentially cuts the area under
>>> the curve in to two equal parts:
>>>
>>> set.seed(4004)
>>> Date < seq(as.Date('20000901'), as.Date('20000930'), by='day')
>>> hyd < ((100*(sin(seq(0.5,4.5,length.out=30))+10) +
>>> seq(45,1,length.out=30)) + rnorm(30)*8)  800
>>>
>>> # View the example curve
>>> plot(Date, hyd, las=1)
>>>
>>> # By trialanderror, the day on which the center of mass occurs is the
>>> 11th day:
>>> # Add up the area under the curve for the first 11 days and compare
>>> # with the last 19 days:
>>>
>>> sum(hyd[1:11])
>>> # 3546.364
>>> sum(hyd[12:30])
>>> # 3947.553
>>>
>>> # Add up the area under the curve for the first 12 days and compare
>>> # with the last 18 days:
>>>
>>> sum(hyd[1:12])
>>> # 3875.753
>>> sum(hyd[13:30])
>>> # 3618.164
>>>
>>> By day 12, the halfway point has already been passed, so the answer that
>>> would be returned would be:
>>>
>>> Date[11]
>>> # "20000911"
>>>
>>> For the larger problem, it'd be handy if the proposed function could
>>> process a multiyear time series (a runoff hydrograph) and return the day
>>> of the center of mass for each year in the time series.
>>>
>>> I appreciate any pointers...Eric
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> [hidden email] mailing list  To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/rhelp>>> PLEASE do read the posting guide http://www.Rproject.org/posti>>> ngguide.html
>>> and provide commented, minimal, selfcontained, reproducible code.
>>>
>>
>>
>
[[alternative HTML version deleted]]
______________________________________________
[hidden email] mailing list  To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.


... and for the record:
The apply() family of functions  here sapply()  are *not* vectorized.
This means that one should not expect them to be necessarily more efficient
than expicit for() loops. Their advantage for many is clarity of code.
Cheers,
Bert
Bert Gunter
"The trouble with having an open mind is that people keep coming along and
sticking things into it."
 Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
On Mon, Dec 18, 2017 at 7:42 AM, Eric Berger < [hidden email]> wrote:
> Hi Eric,
> the following works for me.
>
> HTH,
> Eric
>
> library(EGRET)
>
> StartDate < "19901001"
> EndDate < "20170930"
> siteNumber < "10310000"
> QParameterCd < "00060"
>
> Daily < readNWISDaily(siteNumber, QParameterCd, StartDate, EndDate)
>
> # Define 'center of mass' function
> com < function(x) {
> match(TRUE, cumsum(x/sum(x)) > 0.5)  1
> }
> wyrs < unique(Daily$waterYear)
> x < as.Date(sapply( wyrs, function(yr) { Df <
> Daily[Daily$waterYear==yr,]; Df$Date[com(Df$Q)] } ), "19700101")
>
>
>
> On Mon, Dec 18, 2017 at 4:47 PM, Morway, Eric < [hidden email]> wrote:
>
> > Eric B's response provided just the kind of quick & simple solution I was
> > hoping for (appears as the function com below). However, I once again
> > failed to take advantage of the power of R and have reverted back to
> using
> > a for loop for the next step of the processing. The example below (which
> > requires the library EGRET for pulling an example dataset) works, but
> > probably can be replaced with some version of the apply functionality?
> So
> > far, I've been unable to figure out how to enter the arguments to the
> apply
> > function. The idea is this: for each unique water year (variable 'wyrs'
> in
> > example below) in a 27 year continuous time series of daily values, find
> > the date of the 'center of mass', and build a vector of those dates.
> > Thanks, Eric M
> >
> > library(EGRET)
> >
> > StartDate < "19901001"
> > EndDate < "20170930"
> > siteNumber < "10310000"
> > QParameterCd < "00060"
> >
> > Daily < readNWISDaily(siteNumber, QParameterCd, StartDate, EndDate)
> >
> > # Define 'center of mass' function
> > com < function(x) {
> > match(TRUE, cumsum(x/sum(x)) > 0.5)  1
> > }
> >
> >
> > wyrs < unique(Daily$waterYear)
> > for(i in (1:length(wyrs))){
> > OneYr < Daily[Daily$waterYear==wyrs[i], ]
> > mid < com(OneYr$Q)
> > if(i==1){
> > midPts < as.Date(OneYr$Date[mid])
> > } else {
> > midPts < c(midPts, as.Date(OneYr$Date[mid]))
> > }
> > }
> >
> >
> >
> > Eric Morway
> > Research Hydrologist
> > Nevada Water Science Center
> > U.S. Geological Survey
> > 2730 N. Deer Run Rd.
> > < https://maps.google.com/?q=2730+N.+Deer+Run+Rd.Carson+> City,+NV+89701+(775&entry=gmail&source=g>
> > Carson City, NV 89701
> > < https://maps.google.com/?q=2730+N.+Deer+Run+Rd.Carson+> City,+NV+89701+(775&entry=gmail&source=g>
> > (775
> > < https://maps.google.com/?q=2730+N.+Deer+Run+Rd.Carson+> City,+NV+89701+(775&entry=gmail&source=g>)
> > 8877668
> > *orcid*: 0000000285536140 < http://orcid.org/0000000285536140>
> >
> >
> >
> > On Sat, Dec 16, 2017 at 5:32 AM, Eric Berger < [hidden email]>
> > wrote:
> >
> >> Hi Eric,
> >> How about
> >>
> >> match( TRUE, cumsum(hyd/sum(hyd)) > .5 )  1
> >>
> >> HTH,
> >> Eric
> >>
> >>
> >> On Sat, Dec 16, 2017 at 3:18 PM, Morway, Eric < [hidden email]> wrote:
> >>
> >>> The small bit of script below is an example of what I'm attempting to
> do
> >>> 
> >>> find the day on which the 'center of mass' occurs. In case that is the
> >>> wrong term, I'd like to know the day that essentially cuts the area
> under
> >>> the curve in to two equal parts:
> >>>
> >>> set.seed(4004)
> >>> Date < seq(as.Date('20000901'), as.Date('20000930'), by='day')
> >>> hyd < ((100*(sin(seq(0.5,4.5,length.out=30))+10) +
> >>> seq(45,1,length.out=30)) + rnorm(30)*8)  800
> >>>
> >>> # View the example curve
> >>> plot(Date, hyd, las=1)
> >>>
> >>> # By trialanderror, the day on which the center of mass occurs is the
> >>> 11th day:
> >>> # Add up the area under the curve for the first 11 days and compare
> >>> # with the last 19 days:
> >>>
> >>> sum(hyd[1:11])
> >>> # 3546.364
> >>> sum(hyd[12:30])
> >>> # 3947.553
> >>>
> >>> # Add up the area under the curve for the first 12 days and compare
> >>> # with the last 18 days:
> >>>
> >>> sum(hyd[1:12])
> >>> # 3875.753
> >>> sum(hyd[13:30])
> >>> # 3618.164
> >>>
> >>> By day 12, the halfway point has already been passed, so the answer
> that
> >>> would be returned would be:
> >>>
> >>> Date[11]
> >>> # "20000911"
> >>>
> >>> For the larger problem, it'd be handy if the proposed function could
> >>> process a multiyear time series (a runoff hydrograph) and return the
> day
> >>> of the center of mass for each year in the time series.
> >>>
> >>> I appreciate any pointers...Eric
> >>>
> >>> [[alternative HTML version deleted]]
> >>>
> >>> ______________________________________________
> >>> [hidden email] mailing list  To UNSUBSCRIBE and more, see
> >>> https://stat.ethz.ch/mailman/listinfo/rhelp> >>> PLEASE do read the posting guide http://www.Rproject.org/posti> >>> ngguide.html
> >>> and provide commented, minimal, selfcontained, reproducible code.
> >>>
> >>
> >>
> >
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list  To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/rhelp> PLEASE do read the posting guide http://www.Rproject.org/> postingguide.html
> and provide commented, minimal, selfcontained, reproducible code.
>
[[alternative HTML version deleted]]
______________________________________________
[hidden email] mailing list  To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.

