

Hi,
I would like to explore some basic investment "behaviors" (not real
quant "strategies"), such as the cost-average effect.
Therefore, I would like to create artificial time series with
statistical features similar to those of real stock price time series.
1) How could I create them? What is a common distribution function to
draw returns from? (Without having reference data)
2) How can I create a time series with features similar to those of a given time series?
3) How can I create a time series with statistical features that are
similar to most of the data from a set of given time series?
4) Is there a useful way to get more out of a given data set?
Something like bootstrapping?
Thanks
a
_______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should go.


On 28 October 2012 at 13:21, Alex Grund wrote:
> Hi,
>
> I would like to explore some basic investment "behaviors" (not real
> quant "strategies"), such like the cost average effect.
>
> Therefore, I would like to create artificial time series with similar
> statistical features as real stock price time series.
>
> 1) How could I create them? What is a common distribution function to
> get returns from? (Without having reference data)

There are libraries full of papers and dissertations on this.
You first need to establish _which properties_ you actually want to model /
recreate, and at which time frame. E.g., for daily data you may use a normal
mixture, maybe add a jump, overlay some sort of GARCH or SV... but those are
"still wrong".
I'd (carefully) resample as per 4).
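
As a rough illustration of the "normal mixture, maybe add a jump" idea, here is a minimal base R sketch. Every parameter value below is an assumption picked for illustration, not a calibrated estimate, and the draws are IID, so the sketch produces fat tails but no volatility clustering.

```r
# Sketch: daily log returns from a two-component normal mixture plus rare jumps.
# All parameter values are illustrative assumptions, not calibrated estimates.
set.seed(42)
n <- 2500                          # roughly ten years of trading days

p_calm  <- 0.9                     # probability of the low-volatility component
sd_calm <- 0.008                   # daily sd in the calm component
sd_wild <- 0.025                   # daily sd in the turbulent component

# Pick a component for each day, then draw the return with that component's sd.
calm <- runif(n) < p_calm
ret  <- rnorm(n, mean = 0.0003, sd = ifelse(calm, sd_calm, sd_wild))

# Jump overlay: on any given day a jump occurs with small probability.
jump_prob <- 0.005
jump_sd   <- 0.05
jumps <- rbinom(n, size = 1, prob = jump_prob) * rnorm(n, mean = 0, sd = jump_sd)
ret   <- ret + jumps

# Sample excess kurtosis; values above 0 mean fatter tails than a single normal.
excess_kurtosis <- mean((ret - mean(ret))^4) / var(ret)^2 - 3

# Treat the draws as log returns and build a price path starting at 100.
price <- 100 * exp(c(0, cumsum(ret)))
```

Plotting `price` and a histogram of `ret` gives a quick feel for how far such a model is from real data.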

> 2) How can I create a time series with similar features as a given time series?

See 1). Which features?

> 3) How can I create a time series with statistical features that are
> similar to most of the data from a set of given time series?

See 1) and 2). Seriously :) The last paper presentation I saw was Diebold, who
showed how to regenerate trade duration data, as well as high-frequency vol,
from a "simple" four-parameter model. And simple is a relative term - he
recaptured the features of his (S&P 100 equity TAQ) data set, but it's not a
model you can code up in just a few lines.

> 4) Is there anything valuable which could make given data more
> exhaustible? Something like bootstrapping?

Block bootstrap for time series is pretty well established, and the tseries
package has had a tsbootstrap() function for over a decade. You can (fairly
easily) extend similar schemes.
Dirk

Dirk Eddelbuettel  [hidden email]  http://dirk.eddelbuettel.com
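
For readers who want to see what a block bootstrap does mechanically, here is a minimal moving-block sketch in base R. It is not the tseries implementation: `block_bootstrap()` is a hypothetical helper name, the block length of 20 is an arbitrary assumption, and the input series is placeholder noise standing in for observed returns.

```r
# Sketch of a moving-block bootstrap for a return series (base R only).
# block_bootstrap() is a hypothetical helper, not a function from any package.
block_bootstrap <- function(x, block_len) {
  n <- length(x)
  stopifnot(block_len >= 1, block_len <= n)
  n_blocks <- ceiling(n / block_len)
  # Random start index for each block; blocks may overlap in the source series.
  starts <- sample.int(n - block_len + 1, n_blocks, replace = TRUE)
  # Expand each start into a run of consecutive indices and glue the runs together.
  idx <- unlist(lapply(starts, function(s) s:(s + block_len - 1)))
  x[idx[seq_len(n)]]               # trim to the original length
}

set.seed(1)
# Placeholder data: in practice x would be observed (log) returns.
x <- rnorm(500, mean = 0, sd = 0.01)
x_star <- block_bootstrap(x, block_len = 20)
```

Resampling whole blocks rather than single observations keeps short-range dependence (such as volatility clustering) intact within each block; the block length controls how much of it survives.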


Hi Dirk,
thanks for your reply.
2012/10/28 Dirk Eddelbuettel < [hidden email]>:
> There are libraries full of papers and dissertations on this.
Okay, could you please mention a few valuable papers? So that I can search more?
> See 1). Which features?
Basically, I started from the naive question: "How to create a time
series that "looks" like a stock price process over time".
So the basic features I came up with were a) the distribution of
the (daily) returns, b) their autocorrelation features and c) binomial
features.
To explain what I mean by c):
Imagine you create normally distributed (N(0,1)) returns. Then the
generated time series of prices (price[i] = price[i-1]*(returns[i]+1))
will slightly tend to fall. This is because of the following: Imagine
you have three returns generated, [-.5; 0; .5]; then the series will
fall. It should be [-.5; 0; 1] for the series to hold its level;
however, P(X<-.5) > P(X>1) for X~N(0,1), so a series with mean-zero returns
will obviously tend to fall.
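
To make the arithmetic in c) concrete, here is a small base R sketch with a starting price of 100 (an arbitrary assumption). It compounds simple returns of -50%, 0% and +50%, then shows the +100% move needed to recover, and finally the same three numbers treated as log returns.

```r
# Simple returns of -50%, 0%, +50% do not bring the price back to its start.
simple_ret   <- c(-0.5, 0, 0.5)
price_simple <- 100 * cumprod(1 + simple_ret)
price_simple                       # 50 50 75

# A +100% move is what offsets a -50% move.
price_recover <- 100 * cumprod(1 + c(-0.5, 0, 1))
price_recover                      # 50 50 100

# Symmetric log returns, by contrast, cancel exactly.
log_ret   <- c(-0.5, 0, 0.5)
price_log <- 100 * exp(cumsum(log_ret))
price_log[3]                       # 100
```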
Additionally, one could think of volatility features (such as
suggested by GARCH).
>  3) How can I create a time series with statistical features that are
>  similar to most of the data from a set of given time series?
>
> See 1) and 2). Seriously :) The last paper presentation I saw was Diebold who
> showed how to regenerate trade duration data, as well as high frequency vol,
> from a "simple" four parameter model. And simple is a relative term - he
> recaptured the features of his (S&P 100 equity TAQ) data set, but it's not a
> model you can code up in just a few lines.
Okay, are there models to start with? They don't need to be perfect,
because I want to use them for learning...
>  4) Is there anything valuable which could make given data more
>  exhaustible? Something like bootstrapping?
>
> Block bootstrap for time series is pretty well established, and the tseries
> package even had a tsbootstrap() function for over a decade. You can (fairly
> easily) extend similar schemes.
Ok, thanks
a


The books "Analysis of Financial Time Series" by Ruey Tsay and
"Statistics of Financial Markets" by Franke, Härdle and Hafner are both
good references.
But ultimately, if the end goal is to test a trading strategy, why
simulate your own data? It seems like a lot of work, and the end result
would be a strategy that is profitable on fictitious data?


Hi Matthew,
2012/10/28 Matthew Gilbert < [hidden email]>:
> The books "Analysis of Financial Time Series" by Ruey Tsay and "Statistics
> of Financial Markets" by Franke, Hardle and Hafner are both good references.
Thank you for these hints!
> But ultimately if the end goal is to test a trading strategy why simulate
> your own data? It seems like a lot of work and the end result would be to
> generate a profitable strategy on fictitious data?
No, the goal should NOT be to have a trading strategy. The goal is to
find some rational behaviors.
For example: Given special characteristics of pricing data, is it
rational to invest 300000 $ directly or to invest 100000 $ on each
month's first trading day for three months? What will the result likely
be in 12 months?
Is it rational to take some profits?
...
That is not the same as a strategy like "buy if MA crosses price" or
something like that. It is rather a market-condition-independent
behavior. If one cannot "predict" the market, is it possible to reduce
risk or gain extra returns by doing something other than buy and hold,
using no information at all, only behavioral patterns?
That's why I called it "behavior" rather than "strategy".
Why not on live data?
I could run simulations on 500 stocks (e.g. from the S&P 500). But to
eliminate survivorship bias etc. and to run many more tests (1000s to
10000s), it sounds more suitable to run against artificial market data.
Maybe special characteristics are revealed which give an insight into
"black swans" which are not obvious from real data.
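
The lump-sum versus three-instalment question can be set up as a small Monte Carlo experiment. The base R sketch below is one possible setup under loudly simplifying assumptions: IID normal daily log returns with made-up mean and sd, 21 trading days per month, 252 per year, and uninvested cash earning nothing. The answer it gives depends entirely on those assumptions.

```r
# Sketch: lump sum (300000 at day 0) versus three monthly instalments of 100000.
# Assumptions: IID normal log returns, 21 trading days per month,
# idle cash earns zero. mu and sigma are illustrative, not estimated.
set.seed(123)
n_paths        <- 5000
n_days         <- 252
days_per_month <- 21
mu    <- 0.0003                    # assumed mean daily log return
sigma <- 0.012                     # assumed daily sd of log returns

simulate_once <- function() {
  # price[1] is the price on day 0, price[n_days + 1] the price after 12 months.
  price <- exp(c(0, cumsum(rnorm(n_days, mu, sigma))))
  final <- price[n_days + 1]
  lump  <- 300000 * final / price[1]
  # Instalments are bought on the first trading day of months 1, 2 and 3.
  buy_days <- 1 + days_per_month * 0:2
  dca <- sum(100000 * final / price[buy_days])
  c(lump = lump, dca = dca)
}

# One row per simulated path, columns "lump" and "dca".
res <- t(replicate(n_paths, simulate_once()))

summary_tab <- rbind(
  mean = colMeans(res),
  sd   = apply(res, 2, sd),
  q05  = apply(res, 2, quantile, probs = 0.05)
)
summary_tab
```

Swapping the `rnorm()` call for a GARCH simulation or for block-bootstrapped real returns leaves the rest of the experiment unchanged, which makes it easy to check how sensitive the comparison is to the return model.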
a


You might find an agent-based modelling approach useful - one
interesting implementation of which can be found here:
http://fimas.sourceforge.net/project_info.htm

Alexios


Looks nice, thank you very much for the link. I'll have a more
detailed look soon and will come back with my thoughts on this.
a


If you are assuming a normal (or other
symmetric) distribution for returns,
then those will be log returns rather
than simple returns*. So the price series
will be generated by:
initialPrice * exp(c(0, cumsum(returnVector)))
I would suggest garch** simulations as a starting
point. The most obvious feature of market returns
not being IID is volatility clustering. If what
you care about depends on autocorrelation, then
you could include nonzero ARMA in your model.
But I think you should keep Dirk's caution in mind.
The best models will depend on the question.
* http://www.portfolioprobe.com/2010/10/04/a-tale-of-two-returns/
** http://www.portfolioprobe.com/2012/07/06/a-practical-introduction-to-garch-modeling/

Pat
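
A minimal base R sketch of the garch starting point, combined with the log-return price formula from this message. It hand-rolls a GARCH(1,1) recursion with normal innovations instead of using a package; the values of omega, alpha and beta are assumptions chosen only so that alpha + beta < 1, not fitted estimates.

```r
# Sketch: simulate GARCH(1,1) log returns and turn them into a price path.
# omega, alpha, beta are illustrative assumptions (alpha + beta < 1).
set.seed(2012)
n     <- 1000
omega <- 1e-6
alpha <- 0.08
beta  <- 0.90

sigma2 <- numeric(n)               # conditional variances
returnVector <- numeric(n)         # simulated log returns

# Start the recursion at the unconditional variance omega / (1 - alpha - beta).
sigma2[1] <- omega / (1 - alpha - beta)
returnVector[1] <- sqrt(sigma2[1]) * rnorm(1)
for (i in 2:n) {
  sigma2[i] <- omega + alpha * returnVector[i - 1]^2 + beta * sigma2[i - 1]
  returnVector[i] <- sqrt(sigma2[i]) * rnorm(1)
}

# Log returns accumulate additively, so prices come from exp(cumsum(...)).
initialPrice <- 100
price <- initialPrice * exp(c(0, cumsum(returnVector)))

# Lag-1 autocorrelation of squared returns, a rough gauge of volatility clustering.
acf_sq <- acf(returnVector^2, lag.max = 5, plot = FALSE)$acf[2]
```

Comparing `acf_sq` with the same statistic on IID normal draws shows what the garch layer adds.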

Patrick Burns
[hidden email]
http://www.burns-stat.com
http://www.portfolioprobe.com/blog
twitter: @portfolioprobe


Hi Alex: The paper below explains how Mandelbrot did what you're
describing. There's no pseudocode so programming what he describes could
be an interesting challenge. If you use it and get anywhere with it, let
me know. Good luck.
jamesgoulding.com/Research_II/Mandelbrot/Mandelbrot (MMAR, Multifractal
Model of Asset Returns).pdf


And in the spirit of OSS, share your code (if you get that far) with the list!
Jeff

Jeffrey Ryan
[hidden email]
www.lemnica.com

