Pattern Analysis Libraries

reichmaj
R-Help

I have a need to find aggregated patterns within a data.frame of some 80
million records and wanted to know if there are any packages which could be
used to find patterns by row. For example:

Col 1   Col 2   Col3
A       1       aa
A       2       bb
A       1       aa

In this example, the pattern A - 1 - aa occurs twice and A - 2 - bb occurs once.
At present I simply concatenate the columns, then group by and count. That
works, but I wonder whether there are any packages that would perform such
(and maybe other) analytics.
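
For illustration only, the group-by-and-count step described above can also be
done on the columns directly, without building a concatenated key. This is a
minimal base R sketch, not the code actually used: the column names Col1, Col2,
Col3 and the object names df and counts are assumptions made up for the example.

```r
# Example rows from the post; the column names are assumed for this sketch
df <- data.frame(
  Col1 = c("A", "A", "A"),
  Col2 = c(1, 2, 1),
  Col3 = c("aa", "bb", "aa"),
  stringsAsFactors = FALSE
)

# Add a helper column of ones and sum it within each distinct row pattern
df$n <- 1L
counts <- aggregate(n ~ Col1 + Col2 + Col3, data = df, FUN = sum)

# One row per distinct (Col1, Col2, Col3) combination, with its count in n
print(counts)
```

Whether base aggregate() is fast enough for some 80 million records is not
something this sketch tests.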

Sincerely

Jeff Reichman
(314) 457-1966

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Pattern Analysis Libraries

Bert Gunter-2
Your specification seems too vague to me. What sort of "patterns" are of
interest?

See also ?table on your "concatenated" columns, e.g. something like:

table(do.call(paste0, yourdata.frame))

or even

do.call(table,yourdata.frame)

for a contingency table.
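
As a rough sketch of the two calls above applied to the example in the original
post (the data frame d, its column names, and the "|" separator are assumptions
for illustration): note that paste0() joins values with no separator, so
distinct rows such as ("A1", "1") and ("A", "11") would collapse to the same
key. Using paste() with an explicit sep that does not occur in the data avoids
that.

```r
# Example rows from the original post; column names are assumed
d <- data.frame(
  Col1 = c("A", "A", "A"),
  Col2 = c(1, 2, 1),
  Col3 = c("aa", "bb", "aa"),
  stringsAsFactors = FALSE
)

# Counts keyed by the concatenated row, with an explicit separator
key_counts <- table(do.call(paste, c(d, sep = "|")))
print(key_counts)

# Full contingency table over all three columns
ct <- do.call(table, d)

# Flatten to one row per combination and keep only combinations that occur
long <- as.data.frame(ct, responseName = "n")
long <- long[long$n > 0, ]
print(long)
```

The full contingency table has one cell for every possible combination of
levels, so with many distinct values per column it can become much larger than
the list of combinations that actually occur.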

There are books written on the "analytics" (both statistical and graphical)
of multidimensional contingency tables and categorical data that you may
wish to consult for more specific ideas.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Dec 16, 2019 at 11:13 AM Jeff Reichman <[hidden email]>
wrote:

> R-Help
>
> I have a need to find aggregated patterns within a data.frame of some 80
> million records and wanted to know if there are any packages which could be
> used to find patterns by row. For example
>
> Col 1   Col 2   Col3
> A       1       aa
> A       2       bb
> A       1       aa
>
> In this example pattern A - 1 - aa occurs twice, and A - 2 - bb occurs
> once.
> Presently I'm simply concatenating the columns and performing a group by,
> and count. Which works but wonder if there were any packages that would
> perform such (and maybe other) analytics.
>
> Sincerely
>
> Jeff Reichman
> (314) 457-1966