Cluster analysis

Pablo Cerdeira
Hi all,

I have no idea if this question is too easy to be answered, but I'm just
starting with R. So, here we go.

I have a large dataset with the steps of many judicial cases. A sample is
attached.

I'd like to do a cluster analysis to try to understand which path is most
commonly followed by these legal cases.

After that, I'd like to plot a cluster tree.

In the attached sample, the column:

- "id_processo" is the primary key of a legal case;
- "number" is the "step number" in the legal case;
- "andamento" is the description of the legal case step.

I have no idea how to do this in R. Can someone help me?

Thanks in advance

--
*Pablo de Camargo Cerdeira*
[hidden email]
[hidden email]
+55 (21) 3799-6065

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Cluster analysis

PIKAL Petr
Hi

You can look at the cluster package, or maybe mvpart or tree. You could also
use the CRAN search facilities to find other suitable packages.

BTW, there is no attached sample data; the list has a strict policy on
allowed attachments. See the Posting Guide.

Regards
Petr

[hidden email] wrote on 26.07.2010 18:43:07:

> Hi all,
>
> I have no idea if this question is too easy to be answered, but I'm just
> starting with R. So, here we go.
>
> I have a large dataset with the steps of many judicial cases. A sample is
> attached.
>
> I'd like to do a cluster analysis to try to understand which path is most
> commonly followed by these legal cases.
>
> After that, I'd like to plot a cluster tree.
>
> In the attached sample, the column:
>
> - "id_processo" is the primary key of a legal case;
> - "number" is the "step number" in the legal case;
> - "andamento" is the description of the legal case step.
>
> I have no idea how to do this in R. Can someone help me?
>
> Thanks in advance
>
> --
> *Pablo de Camargo Cerdeira*
> [hidden email]
> [hidden email]
> +55 (21) 3799-6065

Re: Cluster analysis

Jim Porzak
In reply to this post by Pablo Cerdeira
Pablo, we've had success using
http://mephisto.unige.ch/traminer/preview.shtml to look at marketing paths.
The question is how many distinct case step descriptions there are.

HTH, Jim

Re: Cluster analysis

Pablo Cerdeira
In reply to this post by Pablo Cerdeira
Hi Allan,

That helps a lot. I'll try to read more about it.

As you asked, here is a brief explanation of the relevant columns in the
sample data pasted at the end:

id_processo: identifies a legal case; it is the primary key of the case.
ordem_andamento: the step number within a legal case (id_processo).
id_andamento: the primary key of the step.

I'd like to identify the most common sequence of steps (id_andamento,
ordered by ordem_andamento) across many legal cases (id_processo). A
cluster analysis with a dendrogram plot is probably what I'm looking for.
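
A minimal base R sketch of the first part, finding the most common step
sequence, might look like the following. The data frame `steps` and its
values are invented for illustration; only the three column names come from
the description above. Each case's id_andamento values are put in
ordem_andamento order, collapsed into one string, and the strings are
tabulated.

```r
# Toy data (hypothetical values); only the column names follow the sample.
steps <- data.frame(
  id_processo     = c(1, 1, 1, 2, 2, 2, 3, 3),
  ordem_andamento = c(1, 2, 3, 1, 2, 3, 1, 2),
  id_andamento    = c(208, 69, 34, 208, 69, 34, 208, 34)
)

# Make sure the steps of each case are in step order.
steps <- steps[order(steps$id_processo, steps$ordem_andamento), ]

# One string per case, e.g. "208-69-34".
paths <- vapply(
  split(steps$id_andamento, steps$id_processo),
  paste, character(1), collapse = "-"
)

# Count identical paths, most frequent first.
path_counts <- sort(table(paths), decreasing = TRUE)
print(path_counts)

# The single most frequent path in the toy data.
most_common <- names(path_counts)[1]
```

On real data, long cases will rarely match exactly, so exact counts like
these are mostly useful as a first look before any distance-based
clustering.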

Here is a sample with two different legal cases (two different
id_processo values):

Best regards, and thank you in advance

id_processo,proc_num,ordem_andamento,id_andamento,andamento,data,dias,origem_tribunal,data_entrada,relator,duracao_dias
1480010,1,1,208,DISTRIBUIDO,"1988-10-06 00:00:00",5,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-06 00:00:00","MIN. CÉLIO BORJA",1251
1480010,1,2,69,CONCLUSAO,"1988-10-06 00:00:00",0,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-06 00:00:00","MIN. CÉLIO BORJA",1251
1480010,1,3,180,"DESPACHO ORDINATORIO","1988-10-11 00:00:00",8,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-06 00:00:00","MIN. CÉLIO BORJA",1251
1480010,1,4,465,"PEDIDO DE INFORMACOES","1988-10-19 00:00:00",1,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-06 00:00:00","MIN. CÉLIO BORJA",1251
1480010,1,5,465,"PEDIDO DE INFORMACOES","1988-10-20 00:00:00",15,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-06 00:00:00","MIN. CÉLIO BORJA",1251
1480010,1,6,241,"INFORMACOES RECEBIDAS, OFICIO NRO.:","1988-11-04 00:00:00",24,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-06 00:00:00","MIN. CÉLIO BORJA",1251
1480010,1,7,241,"INFORMACOES RECEBIDAS, OFICIO NRO.:","1988-11-28 00:00:00",0,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-06 00:00:00","MIN. CÉLIO BORJA",1251
1480010,1,8,69,CONCLUSAO,"1988-11-28 00:00:00",38,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-06 00:00:00","MIN. CÉLIO BORJA",1251
1480010,1,9,584,"VISTA AO PROCURADOR-GERAL DA REPUBLICA","1989-01-05 00:00:00",874,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-06 00:00:00","MIN. CÉLIO BORJA",1251
1480010,1,10,26,"AUTOS DEVOLVIDOS","1991-05-29 00:00:00",8,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-06 00:00:00","MIN. CÉLIO BORJA",1251
1480010,1,11,75,"CONCLUSOS AO RELATOR","1991-05-29 00:00:00",0,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-06 00:00:00","MIN. CÉLIO BORJA",1251
1480010,1,12,578,"VISTA AO ADVOGADO-GERAL DA UNIAO","1991-06-06 00:00:00",232,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-06 00:00:00","MIN. CÉLIO BORJA",1251
1480010,1,13,507,"RECEBIMENTO DOS AUTOS","1992-01-24 00:00:00",10,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-06 00:00:00","MIN. CÉLIO BORJA",1251
1480010,1,14,75,"CONCLUSOS AO RELATOR","1992-02-03 00:00:00",21,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-06 00:00:00","MIN. CÉLIO BORJA",1251
1480010,1,15,284,"JULG. POR DESPACHO - NEGADO SEGUIMENTO","1992-02-24 00:00:00",3,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-06 00:00:00","MIN. CÉLIO BORJA",1251
1480010,1,16,497,"PUBLICADO DESPACHO NO DJ","1992-02-27 00:00:00",12,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-06 00:00:00","MIN. CÉLIO BORJA",1251
1480010,1,17,163,"DECORRIDO O PRAZO","1992-03-10 00:00:00",0,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-06 00:00:00","MIN. CÉLIO BORJA",1251
1480010,1,18,34,"BAIXA AO ARQUIVO DO STF","1992-03-10 00:00:00",0,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-06 00:00:00","MIN. CÉLIO BORJA",1251
1480183,2,1,208,DISTRIBUIDO,"1988-10-12 00:00:00",8,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,2,69,CONCLUSAO,"1988-10-12 00:00:00",0,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,3,352,"JULGAMENTO NO PLENO","1988-10-20 00:00:00",22,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,4,476,"PETICAO AVULSA","1988-11-11 00:00:00",13,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,5,531,"REMESSA DOS AUTOS","1988-11-11 00:00:00",0,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,6,495,"PUBLICADO ACORDAO, DJ:","1988-11-24 00:00:00",11,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,7,163,"DECORRIDO O PRAZO","1988-12-05 00:00:00",8,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,8,241,"INFORMACOES RECEBIDAS, OFICIO NRO.:","1988-12-13 00:00:00",63,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,9,69,CONCLUSAO,"1988-12-13 00:00:00",0,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,10,584,"VISTA AO PROCURADOR-GERAL DA REPUBLICA","1989-02-14 00:00:00",83,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,11,69,CONCLUSAO,"1989-05-08 00:00:00",91,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,12,584,"VISTA AO PROCURADOR-GERAL DA REPUBLICA","1989-08-07 00:00:00",21,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,13,69,CONCLUSAO,"1989-08-28 00:00:00",2,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,14,484,"PROCESSO A JULGAMENTO - PAUTA, DJ:","1989-08-30 00:00:00",13,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,15,69,CONCLUSAO,"1989-09-12 00:00:00",2,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,16,151,"DECISAO INTERLOCUTORIA","1989-09-14 00:00:00",7,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,17,476,"PETICAO AVULSA","1989-09-21 00:00:00",13,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,18,69,CONCLUSAO,"1989-09-21 00:00:00",0,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,19,151,"DECISAO INTERLOCUTORIA","1989-10-04 00:00:00",635,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,20,3,"ADIADO O JULGAMENTO","1991-07-01 00:00:00",32,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,21,158,"DECISAO PUBLICADA, DJ:","1991-08-02 00:00:00",139,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,22,3,"ADIADO O JULGAMENTO","1991-12-19 00:00:00",49,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,23,336,"JULGAMENTO DO PLENO - NAO CONHECIDO","1992-02-06 00:00:00",6,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,24,158,"DECISAO PUBLICADA, DJ:","1992-02-12 00:00:00",2,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,25,158,"DECISAO PUBLICADA, DJ:","1992-02-14 00:00:00",2107,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,26,495,"PUBLICADO ACORDAO, DJ:","1997-11-21 00:00:00",12,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,27,565,"TRANSITADO EM JULGADO","1997-12-03 00:00:00",3338,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
1480183,2,28,34,"BAIXA AO ARQUIVO DO STF","2007-01-23 00:00:00",0,"FÓRUM DA COMARCA DE RANCHARIA","1988-10-12 00:00:00","MIN. PAULO BROSSARD",6677
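
Given data in this long format, one possible way to get a dendrogram is to
compute an edit distance between the step sequences of every pair of cases
and pass it to hclust. This is only a sketch: the four short sequences below
are made up (they are not the two cases from the sample), the helper
edit_distance is a hypothetical name, and plain Levenshtein distance with
average linkage is an assumption, not something settled in this thread.

```r
# Hypothetical toy sequences of id_andamento codes, one vector per case.
seqs <- list(
  A = c(208, 69, 180, 34),
  B = c(208, 69, 180, 163, 34),
  C = c(208, 352, 495, 565, 34),
  D = c(208, 352, 495, 34)
)

# Levenshtein distance between two vectors of step codes
# (insertions, deletions and substitutions all cost 1).
edit_distance <- function(a, b) {
  prev <- 0:length(b)
  for (i in seq_along(a)) {
    cur <- c(i, rep(0, length(b)))
    for (j in seq_along(b)) {
      cost <- if (a[i] == b[j]) 0 else 1
      cur[j + 1] <- min(prev[j + 1] + 1, cur[j] + 1, prev[j] + cost)
    }
    prev <- cur
  }
  prev[length(b) + 1]
}

# Pairwise distance matrix, labelled by case.
n <- length(seqs)
dmat <- matrix(0, n, n, dimnames = list(names(seqs), names(seqs)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    dmat[i, j] <- edit_distance(seqs[[i]], seqs[[j]])
  }
}

# Agglomerative clustering and the dendrogram ("cluster tree").
hc <- hclust(as.dist(dmat), method = "average")
plot(hc, main = "Cases clustered by step sequence")

# Cut the tree into two groups of similar cases.
groups <- cutree(hc, k = 2)
print(groups)
```

In the toy data, cases A and B differ by one inserted step, as do C and D,
so each pair ends up in its own branch of the tree.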

Best regards

On Tue, Jul 27, 2010 at 7:10 AM, Allan Engelhardt <[hidden email]> wrote:

>  Your attachments are scrubbed by the list server so we never see them.
>
> The maptree package may be what you are after, but post a copy of your data
> and the code that builds the tree, and we may be able to help you.  Also, if
> my.tree is your model tree, try methods(class=class(my.tree)).  And
> RSiteSearch is a useful function.
>
> Hope this helps a little.


Re: Cluster analysis

Pablo Cerdeira
In reply to this post by Jim Porzak
Hi Jim,

Wow! Very nice work at http://mephisto.unige.ch/traminer/preview.shtml. I'm
going to read more about it.

There are a lot of different steps, in sequence: 586 different possible
steps across 4269 legal cases, with at most 379 steps per case.
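
For scale, counts like these can be computed, and the size of a full
pairwise comparison estimated, with a few lines of base R. This is a sketch:
the data frame `steps` is a made-up toy, and only the figure 4269 is taken
from the numbers quoted above.

```r
# Toy long-format data (hypothetical values), one row per step of a case.
steps <- data.frame(
  id_processo  = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
  id_andamento = c(208, 69, 34, 208, 34, 208, 69, 69, 34)
)

n_cases      <- length(unique(steps$id_processo))   # number of legal cases
n_step_types <- length(unique(steps$id_andamento))  # distinct step ids
max_len      <- max(table(steps$id_processo))       # longest case, in steps

# With 4269 cases, a full distance matrix has choose(4269, 2) distinct pairs.
n_pairs <- choose(4269, 2)

# Stored as doubles (8 bytes each), the lower triangle needs about this
# many megabytes.
mb <- n_pairs * 8 / 1024^2

print(c(cases = n_cases, step_types = n_step_types, longest = max_len))
print(n_pairs)
print(round(mb, 1))
```

That is about 9.1 million pairs and roughly 70 MB for the distances alone,
which is manageable, although each pair may require comparing sequences of
several hundred steps.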

If you want, I can send this dataset to you.

Best regards and thank you very much,



On Tue, Jul 27, 2010 at 10:16 AM, Jim Porzak <[hidden email]> wrote:

> Pablo, we've had success using
> http://mephisto.unige.ch/traminer/preview.shtml to look at marketing
> paths. The question is how many distinct case step descriptions there are.
>
> HTH, Jim

