Maybe bug? Using non-integer frequencies in stats::ts

Maybe bug? Using non-integer frequencies in stats::ts

Johann R. Kleinbub
I am developing a package to analyse physiological time-series and I
thought that the most reliable and robust solution was to base it on the
native stats::ts class. In my domain it is common to express series
frequencies as samples per second. So ts(..., frequency = 10) would mean a
signal sampled 10 times every second, and ts(..., frequency = 1) a signal
sampled once every second. Following this logic, a few slower signals are
sampled every 5 seconds (or more), resulting in a frequency of e.g. 0.2.
Nowhere in the documentation is it stated that the frequency must be an
integer value, but using fractional values gives inconsistent results.
For instance, in this example, foo and bar are identical, just with
start-end values shifted by 1. Yet when extracting an arbitrary window, the
'bar' series gives an error.

x = 1:22
foo = ts(x, start = 1.5, end = 106.5, frequency = 0.2)
bar = ts(x, start = 2.5, end = 107.5, frequency = 0.2)

window(foo, start = 20, end = 30, extend=TRUE)

# Time Series:
# Start = 20
# End = 25
# Frequency = 0.2
# [1] 5 6

window(bar, start = 20, end = 30, extend=TRUE)

# Error in attr(y, "tsp") <- c(ystart, yend, xfreq) :
#   invalid time series parameters specified

The reason is in the rounding procedures for ystart and yend at the end of
the stats::window function. For the 'foo' series the ystart and yend values
are calculated as: c(20, 25), whereas for the 'bar' series, they
become c(20, 30) although the window should be of the very same size in
both cases. (A further discussion on the example is at:
https://stackoverflow.com/questions/57928054 )
Should I report a bug or am I misunderstanding something?
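
A minimal sketch of what the window would be expected to contain, using only time() rather than window(). The helper name samples_between is hypothetical and not part of stats; it simply lists the sampling instants of a series that fall inside a closed interval:

```r
x <- 1:22
foo <- ts(x, start = 1.5, end = 106.5, frequency = 0.2)
bar <- ts(x, start = 2.5, end = 107.5, frequency = 0.2)

# Hypothetical helper (not in stats): return the sampling times and values
# of `series` that lie in the closed interval [from, to].
samples_between <- function(series, from, to) {
  tt <- as.numeric(time(series))          # plain numeric sampling instants
  keep <- tt >= from & tt <= to
  data.frame(time = tt[keep], value = as.numeric(series)[keep])
}

samples_between(foo, 20, 30)  # times 21.5 and 26.5, values 5 and 6
samples_between(bar, 20, 30)  # times 22.5 and 27.5, values 5 and 6
```

Both series have exactly two samples in [20, 30], so a consistent window() would return two observations in both cases. Note that the Start = 20 and End = 25 printed for foo above do not coincide with these sampling instants either.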

--
Johann R. Kleinbub, PhD
University of Padova
FISPPA Dep. - Section of Applied Psychology
Cell: +39 3495986373


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: Maybe bug? Using non-integer frequencies in stats::ts

Johann R. Kleinbub
It's been three months without an answer; is it OK to bump the thread?
Could someone provide a pointer?

Thank you for your consideration,
Johann


On Mon, 16 Sep 2019 at 15:53, Johann R. Kleinbub <[hidden email]> wrote:
> [...]

Re: Maybe bug? Using non-integer frequencies in stats::ts

Duncan Murdoch-2
On 05/12/2019 11:00 a.m., Johann R. Kleinbub wrote:
> It's been three months without an answer, is it ok to thread bump?
> Would someone provide a pointer?

I agree it's a bug, and agree with your analysis.  You should report it
on bugs.r-project.org.  (If you don't have an account there, let us
know, and either someone will give you one, or someone will report it
for you.)

As a workaround, I don't see it happening with extend=FALSE, but of
course that might not suit your needs in general.

Duncan Murdoch




Re: Maybe bug? Using non-integer frequencies in stats::ts

Johann R. Kleinbub
Thank you for the quick follow-up, Duncan.
Unfortunately, extend=TRUE is used internally in various instances, such as
when replacing parts of the time series with window<-.ts.
Consider the following examples of time series with ugly values:
x = 1:22
foo = ts(x, start = 1.5, end = 106.5, frequency = 0.2) # a ts of 525 cycles
bar = ts(x, start = 2.5, end = 107.5, frequency = 0.2) # a ts of 525 cycles
#   starting 5 cycles later than foo
qux = ts(x, start = 2.5, end = 102.5, frequency = 0.2) # a ts of 500 cycles
#   starting 5 cycles later than foo

# extraction works fine
window(foo, start = 20, end = 30)  # works fine
window(bar, start = 20, end = 30)  # works fine
window(qux, start = 20, end = 30)  # works fine

# assignment fails in different ways for bar and qux
window(foo, start = 20, end = 30) <- NA  # works fine
window(bar, start = 20, end = 30) <- NA  # ERROR: "invalid time series parameters specified"
window(qux, start = 20, end = 30) <- NA  # ERROR: "times to be replaced do not match"

If extraction works fine, there's no reason why replacing the values should
fail.
I don't have an account on bugs.r-project.org yet. I'd be available to do
the report if I'm assigned one.
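
One possible way around the failing replacement, sketched here as an assumption rather than something proposed in the thread, is to skip window<- entirely and index the series by its own sampling times. This relies only on time() and ordinary element replacement on a ts object:

```r
x <- 1:22
bar <- ts(x, start = 2.5, end = 107.5, frequency = 0.2)

# Plain numeric sampling instants; this avoids window() and its
# extend = TRUE code path altogether.
tt <- as.numeric(time(bar))

# Replace the observations whose sampling time lies in [20, 30].
bar[tt >= 20 & tt <= 30] <- NA

tsp(bar)  # start, end and frequency are left unchanged
```

Element replacement keeps the length of the series, so the tsp attribute stays valid and the result is still a ts object.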
Best,
Johann

On Thu, 5 Dec 2019 at 17:46, Duncan Murdoch <[hidden email]> wrote:
> [...]

Re: Maybe bug? Using non-integer frequencies in stats::ts

Duncan Murdoch-2
To R-devel:

I've sent this to Johann privately already; just in case anyone else is
interested in this issue, here's what I wrote:

Just started looking into it, and discovered this paragraph in ?ts:

"The value of argument frequency is used when the series is sampled an
integral number of times in each unit time interval. For example, one
could use a value of 7 for frequency when the data are sampled daily,
and the natural time period is a week, or 12 when the data are sampled
monthly and the natural time period is a year. Values of 4 and 12 are
assumed in (e.g.) print methods to imply a quarterly and monthly series
respectively."

That says that frequency will be a positive integer, so frequency=0.2
was not intended to be covered, and I'd say it's not exactly a bug that
it doesn't work.  (It might be called a bug that there's no error message.)
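
As an illustration of the missing error message, here is a minimal sketch of a wrapper that rejects non-integer frequencies before calling ts(). The name ts_strict is hypothetical; nothing like it exists in stats, and the tolerance used is an arbitrary choice:

```r
# Hypothetical wrapper (not part of stats): fail early when `frequency`
# is not a positive whole number, then delegate to stats::ts().
# Only `frequency` is checked; `deltat` is not handled in this sketch.
ts_strict <- function(data, ..., frequency = 1) {
  whole <- is.numeric(frequency) && length(frequency) == 1L &&
    is.finite(frequency) && frequency >= 1 &&
    abs(frequency - round(frequency)) < sqrt(.Machine$double.eps)
  if (!whole) {
    stop("frequency must be a positive whole number, got ", format(frequency))
  }
  stats::ts(data, ..., frequency = frequency)
}

monthly <- ts_strict(1:24, start = 2019, frequency = 12)   # accepted
slow <- try(ts_strict(1:22, start = 1.5, frequency = 0.2),  # rejected
            silent = TRUE)
```

With such a check, frequency = 0.2 fails at construction time instead of producing a series that misbehaves later in window().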

On the other hand, it comes close to working, and it seems like allowing
frequency=0.2 would be a useful addition.  I'm going to keep looking,
and see how hard it would be to get this to work properly.  If it
doesn't break other things, I may submit this as an enhancement.

Duncan Murdoch


On 06/12/2019 10:00 a.m., Johann R. Kleinbub wrote:
> [...]

Re: Maybe bug? Using non-integer frequencies in stats::ts

Duncan Murdoch-2
I've now posted this as an enhancement request:

https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17669

On 06/12/2019 12:35 p.m., Duncan Murdoch wrote:
> [...]