correlation based time series clustering?

correlation based time series clustering?

LosemindL
Hi all,

I am looking for a function for correlation-based time-series clustering in
R. I have googled for quite a while and couldn't find one.

Could you please help me?

Thanks a lot!

_______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should go.

Re: correlation based time series clustering?

julien cuisinier

Hi Michael,

This is a very general question with little detail from you, so I am not surprised that it has drawn little feedback.

I have been looking for something similar, with the same result, so I do not think it exists yet. I am a complete newbie in clustering, but looking around, there are plenty of R functions available; I just could not find anything as simple as clustering on correlation per se.

Thinking about it, I'm not sure how it would work, and anything I can think of would be quite sensitive to the starting point (e.g. calculate pairwise correlations within a market, then start with one stock and cluster with it all other stocks whose correlation with it is above a certain threshold?). Maybe some recursive function trying many different starting points? But then what to do with the resulting different cluster structures?

Could you share with the list what references (not in R) you found on the topic? It would be great if you could bring something to the list as well, and then we could see if we can build that in (very, very ambitious of me here =).

Thanks & regards,
Julien

Re: correlation based time series clustering?

Murali.Menon-3
Folks,

There's been quite a bit of work on clustering in finance:

http://www.mendeley.com/research/correlation-based-hierarchical-clustering-in-financial-time-series/

I think many of Mantegna's works are available for download.

Very simply, though, you can calculate a correlation matrix given your time series of returns, and then apply, say, a medoid-clustering scheme (e.g. the pam function in package 'cluster') and see what happens. According to Mantegna, if you do this on the components of the Dow Jones, the various industrial types very naturally form individual clusters.
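
A minimal sketch of that suggestion, under a few assumptions of mine: the returns are simulated (two hypothetical groups of three series, each driven by its own common factor), the dissimilarity is taken as one minus the correlation, and the number of clusters is fixed at two. Only `cor`, `as.dist` and `pam` from package 'cluster' are used.

```r
library(cluster)  # recommended package, shipped with standard R installations

set.seed(1)
n_obs <- 250

# Hypothetical data: series A1-A3 share factor f1, series B1-B3 share factor f2
f1 <- rnorm(n_obs)
f2 <- rnorm(n_obs)
returns <- cbind(
  A1 = f1 + 0.3 * rnorm(n_obs),
  A2 = f1 + 0.3 * rnorm(n_obs),
  A3 = f1 + 0.3 * rnorm(n_obs),
  B1 = f2 + 0.3 * rnorm(n_obs),
  B2 = f2 + 0.3 * rnorm(n_obs),
  B3 = f2 + 0.3 * rnorm(n_obs)
)

cor_mat <- cor(returns)      # one column per series
d <- as.dist(1 - cor_mat)    # assumed dissimilarity: high correlation = small distance

fit <- pam(d, k = 2)         # pam() treats an object of class "dist" as dissimilarities
print(fit$clustering)        # cluster label per series
print(fit$medoids)           # the series chosen as cluster representatives
```

With real data you would replace `returns` by your own matrix of returns and choose `k` yourself, for instance by comparing average silhouette widths (`fit$silinfo$avg.width`) across several values.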

Hope this helps.

Murali

Re: correlation based time series clustering?

julien cuisinier

That is very useful, many thanks Murali!

Rgds,
Julien


Re: correlation based time series clustering?

Patrick Burns-2
In reply to this post by Murali.Menon-3
You can just use:

1 - cor.matrix

as your distance matrix in the clustering
(there are lots of choices in R).
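
For instance, here is a sketch with hierarchical clustering from base R. The simulated returns, the average-linkage choice and the cut into two groups are illustrative assumptions, not part of the advice above.

```r
set.seed(42)
n_obs <- 300

# Hypothetical data: series S1-S4 load on one latent factor, S5-S8 on another
common <- matrix(rnorm(n_obs * 2), ncol = 2)
loadings <- rep(1:2, each = 4)
returns <- common[, loadings] + 0.5 * matrix(rnorm(n_obs * 8), ncol = 8)
colnames(returns) <- paste0("S", 1:8)

cor.matrix <- cor(returns)
dist.matrix <- as.dist(1 - cor.matrix)   # distance as suggested: 1 - correlation

tree <- hclust(dist.matrix, method = "average")
groups <- cutree(tree, k = 2)            # cut the dendrogram into two clusters
print(groups)
# plot(tree) draws the dendrogram
```

Note that `1 - cor.matrix` ranges from 0 to 2, so strongly negatively correlated series end up far apart. If you would rather treat them as similar, `1 - abs(cor.matrix)` is one alternative.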

You can see how stable your results are by
using bootstrap estimates of the correlation.
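
One way to sketch that stability check: resample the dates with replacement, recompute the correlation matrix and the clustering each time, and record how often each pair of series lands in the same cluster. The data, the helper `cluster_once`, the number of replications and the choice of two clusters are all hypothetical.

```r
set.seed(7)
n_obs <- 200

# Hypothetical data: two groups of three series with noisier common factors
common <- matrix(rnorm(n_obs * 2), ncol = 2)
returns <- common[, rep(1:2, each = 3)] + 0.7 * matrix(rnorm(n_obs * 6), ncol = 6)
colnames(returns) <- paste0("S", 1:6)

# Helper (illustrative): cluster the columns of x on 1 - correlation
cluster_once <- function(x, k) {
  cutree(hclust(as.dist(1 - cor(x)), method = "average"), k = k)
}

n_boot <- 200
n_series <- ncol(returns)
together <- matrix(0, n_series, n_series,
                   dimnames = list(colnames(returns), colnames(returns)))

for (b in seq_len(n_boot)) {
  idx <- sample(n_obs, replace = TRUE)   # resample rows (dates), keep the cross-section intact
  g <- cluster_once(returns[idx, ], k = 2)
  together <- together + outer(g, g, "==")   # 1 where a pair shares a cluster
}

co_freq <- together / n_boot   # share of replications in which each pair is clustered together
print(round(co_freq, 2))
```

Entries near 0 or 1 indicate pairs whose assignment is stable; entries in between flag series that switch clusters from one resample to the next. Resampling single dates ignores serial dependence, so a block bootstrap may be more appropriate for some data.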

Pat

--
Patrick Burns
[hidden email]
http://www.burns-stat.com
http://www.portfolioprobe.com/blog
twitter: @portfolioprobe

Re: correlation based time series clustering?

Vincent Zoonekynd
In reply to this post by julien cuisinier
Here are a few ideas to cluster time series,
with more references.

1. Build the minimum spanning tree on the correlation
matrix. The result is usually very noisy: you may
want to resample the data to see how the trees
change.
This usually gives acceptable results: for
instance, you can often recognise industry groups
from daily or weekly stock returns.
A few references:
  An introduction to econophysics, Correlations and complexity in
finance, R.N. Mantegna and H.E. Stanley (2000)
  http://arxiv.org/abs/cond-mat/0302546
  http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1617257
  http://arxiv.org/abs/0806.4714
  http://arxiv.org/abs/0708.0562
  http://arxiv.org/abs/cond-mat/0412411

2. Threshold the correlation matrix and consider the
result as the adjacency matrix of a graph: its
connected components can be interpreted as
clusters.

3. Convert the correlation matrix to a distance
matrix, and apply the standard clustering
algorithms: k-means, hierarchical clustering,
Kohonen networks, etc.  You may want to try these
with various estimators of the correlation matrix:
for instance, shrinkage estimators should help
reduce the noise in the data.

4. If you accept methods not based on correlation,
you can model your time series, e.g., with
econometric models (ARMA, GARCH, etc.), stochastic
differential equations (the "Markov operator
distance" at the end of "Option pricing and
estimation of financial models with R", by
S.M. Iacus), wavelet decomposition, iSAX
(http://www.cs.ucr.edu/~eamonn/iSAX/iSAX.html),
etc., and cluster the coefficients of those
models.
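
Ideas 1 and 2 can be sketched in base R without any graph package. Everything below is illustrative: the simulated returns, the helper names `mst_edges` and `components`, the distance `1 - correlation` and the threshold of 0.5 are assumptions rather than recommendations.

```r
set.seed(11)
n_obs <- 250

# Hypothetical data: A1-A3 share one factor, B1-B3 another
factors <- matrix(rnorm(n_obs * 2), ncol = 2)
returns <- factors[, rep(1:2, each = 3)] + 0.3 * matrix(rnorm(n_obs * 6), ncol = 6)
colnames(returns) <- c("A1", "A2", "A3", "B1", "B2", "B3")
cor_mat <- cor(returns)

# Idea 1: minimum spanning tree via Prim's algorithm on a dense distance matrix.
# Returns an (n - 1) x 2 matrix of vertex indices, one row per tree edge.
mst_edges <- function(d) {
  n <- nrow(d)
  in_tree <- c(TRUE, rep(FALSE, n - 1))
  edges <- matrix(NA_integer_, nrow = n - 1, ncol = 2)
  for (e in seq_len(n - 1)) {
    sub <- d[in_tree, !in_tree, drop = FALSE]             # tree-to-outside distances
    pos <- which(sub == min(sub), arr.ind = TRUE)[1, ]    # shortest such edge
    edges[e, ] <- c(which(in_tree)[pos[1]], which(!in_tree)[pos[2]])
    in_tree[edges[e, 2]] <- TRUE
  }
  edges
}

d <- 1 - cor_mat   # any decreasing transform of the correlation gives the same tree
edges <- mst_edges(d)
print(cbind(from = colnames(returns)[edges[, 1]], to = colnames(returns)[edges[, 2]]))

# Idea 2: threshold the correlations and label connected components (breadth-first search)
components <- function(adj) {
  n <- nrow(adj)
  label <- integer(n)              # 0 means "not visited yet"
  current <- 0L
  for (start in seq_len(n)) {
    if (label[start] != 0L) next
    current <- current + 1L
    label[start] <- current
    queue <- start
    while (length(queue) > 0) {
      v <- queue[1]
      queue <- queue[-1]
      nb <- which(adj[v, ] & label == 0L)   # unvisited neighbours of v
      label[nb] <- current
      queue <- c(queue, nb)
    }
  }
  label
}

adj <- cor_mat > 0.5               # assumed threshold
diag(adj) <- FALSE
comp <- components(adj)
print(setNames(comp, colnames(returns)))
```

The threshold matters a lot in practice: too low and everything merges into one component, too high and every series becomes its own cluster, so it is worth scanning a range of values.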

-- Vincent

Re: correlation based time series clustering?

asethy
You might also like to have a look at use of random matrix theory:

http://arxiv.org/pdf/cond-mat/0508122.pdf
https://www.math.nyu.edu/faculty/avellane/LalouxPCA.pdf
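
As a rough illustration of one common way random matrix theory is applied to correlation matrices: compare the eigenvalues of the sample correlation matrix with the upper edge of the Marchenko-Pastur distribution, (1 + sqrt(N/T))^2 for N series and T observations, and treat only the eigenvalues above it as carrying structure. The data below are simulated with a single hypothetical common factor; this is a sketch under those assumptions, not a reproduction of the papers above.

```r
set.seed(3)
n_obs <- 500      # T: number of observations
n_series <- 20    # N: number of series

# Hypothetical data: every series loads on one common "market" factor plus noise
market <- rnorm(n_obs)
returns <- 0.5 * market + matrix(rnorm(n_obs * n_series), ncol = n_series)

eig <- eigen(cor(returns), symmetric = TRUE, only.values = TRUE)$values

# Upper edge of the Marchenko-Pastur spectrum for pure-noise unit-variance series
lambda_max <- (1 + sqrt(n_series / n_obs))^2

n_signal <- sum(eig > lambda_max)   # eigenvalues not explained by noise alone
print(round(head(eig), 2))
print(lambda_max)
print(n_signal)
```

The eigenvectors belonging to the eigenvalues above the bound can then be used to build a de-noised correlation matrix before applying any of the clustering methods discussed in this thread.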

Regards
Anmol Sethy
