

Hello,
I have a matrix mat (see dput(mat))
> mat
     [,1] [,2]
[1,]    5    6
[2,]    6    5
[3,]    5    4
[4,]    5    5
....
I want the frequencies of the pairs in a new matrix, where the
combination 5 and 6 counts as the same as 6 and 5 (see the first two rows of mat).
In other words: what is the probability of each combination (each row),
ignoring the order within the combination? As a result I would like to have a
matrix that also includes the rows and cols 0, 1, 2 ... max(mat) that do not appear
in my matrix.
dput (mat)
structure(c(5, 6, 5, 5, 4, 3, 6, 7, 4, 7, 5, 5, 5, 5, 6, 5, 5,
4, 3, 6, 7, 4, 7, 5, 5, 5, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7,
6, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7, 6), .Dim = c(26L, 2L))
Thanks
Hermann
[[alternative HTML version deleted]]
______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Since order is not important to you, you can order your pairs (e.g. decreasing) before compiling the frequencies.
But I don't understand the second part about values "that do not appear in the matrix". Do you mean you want to assess all combinations? If that's the case I would think about a hash table or other indexed data structure, rather than iterating through a matrix.
B.
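
A minimal sketch of the "order the pairs first, then count" idea, assuming the example matrix from the original post. The names lo, hi, key and counts are made up for illustration, and the pairs are ordered increasing rather than decreasing; either direction works as long as it is applied to every row. The named table that comes back behaves much like a small hash table keyed by pair.

```r
# Example data: the matrix posted at the top of the thread
mat <- structure(c(5, 6, 5, 5, 4, 3, 6, 7, 4, 7, 5, 5, 5, 5, 6, 5, 5,
4, 3, 6, 7, 4, 7, 5, 5, 5, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7,
6, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7, 6), .Dim = c(26L, 2L))

# Put the smaller value of each row first, so (6, 5) and (5, 6) coincide
lo <- pmin(mat[, 1], mat[, 2])
hi <- pmax(mat[, 1], mat[, 2])

# One string key per unordered pair, then count how often each key occurs
key <- paste(lo, hi, sep = "-")
counts <- table(key)
counts

# Relative frequencies (counts divided by the number of rows)
prop.table(counts)
```

Only pairs that actually occur get an entry here; combinations that never appear are simply absent rather than reported as zero.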


OK, that was misleading, and it was not that important. My result matrix
should look like this:
    1   2   3   4   5   6   7 ...
1   p1  p2
2       p
3
4
p1 etc. are the frequencies of the combinations.
1 and 2, for instance, do not appear in my example, so their values would be
zero. Actually, this part is not too important. I would be happy enough to
solve the challenge with the frequencies of the pairs.
Thanks Hermann


Still not sure I understand. But here is what I think you might mean:
# Your data
mat <- structure(c(5, 6, 5, 5, 4, 3, 6, 7, 4, 7, 5, 5, 5, 5, 6, 5, 5,
4, 3, 6, 7, 4, 7, 5, 5, 5, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7,
6, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7, 6), .Dim = c(26L, 2L))
# Create a square matrix with enough space to have an element for each pair. Since
# order is not important, only the upper triangle is used. If the matrix is
# large and sparse, a different approach might be needed.
freq <- matrix(numeric(max(mat) * max(mat)), nrow = max(mat), ncol = max(mat))
# Loop over your input
for (i in seq_len(nrow(mat))) {
  # Sort the elements of a row by size.
  x <- sort(mat[i,])
  # Increment the corresponding element of the frequency matrix
  freq[x[1], x[2]] <- freq[x[1], x[2]] + 1
}
freq
Cheers,
B.
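
The original question also asked for rows and columns starting at 0, which the loop above cannot index directly because R indices start at 1. A possible variant, sketched here under the assumption that all values are non-negative integers, shifts every value by one and labels the dimensions with the actual values. The names vals and freq0 are hypothetical.

```r
# Example data: the matrix posted at the top of the thread
mat <- structure(c(5, 6, 5, 5, 4, 3, 6, 7, 4, 7, 5, 5, 5, 5, 6, 5, 5,
4, 3, 6, 7, 4, 7, 5, 5, 5, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7,
6, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7, 6), .Dim = c(26L, 2L))

# Values 0..max(mat) label the rows and columns
vals <- 0:max(mat)
freq0 <- matrix(0, nrow = length(vals), ncol = length(vals),
                dimnames = list(as.character(vals), as.character(vals)))

for (i in seq_len(nrow(mat))) {
  # Smaller value first, so each unordered pair lands in the upper triangle
  x <- sort(mat[i, ])
  # Value v is stored at index v + 1 because R indexing is 1-based
  freq0[x[1] + 1, x[2] + 1] <- freq0[x[1] + 1, x[2] + 1] + 1
}

freq0
# Counts can be looked up by value label, e.g. the pair {5, 6}
freq0["5", "6"]
# Dividing by the number of rows turns counts into relative frequencies
freq0 / nrow(mat)
```

Rows and columns for 0, 1 and 2 are present but contain only zeros, since those values do not occur in the example data.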


As with Boris, I'm not sure what you are looking for, but this may help:
> # Sort each row so the smaller value comes first, paste the pair, and tabulate
> # (the definition of dta is not in the post as archived; this line is a reconstruction)
> dta <- data.frame(table(apply(t(apply(mat, 1, sort)), 1, paste0, collapse=" - ")))
> # To get all possibilities, create a grid
> grd <- expand.grid(0:9, 0:9)
> # Keep the pairs whose first value is not larger than the second
> grd <- grd[grd$Var1 <= grd$Var2,]
> # Tabulate after pasting first and second column
> grd2 <- data.frame(table(apply(grd, 1, paste0, collapse=" - ")))
>
> # Combine the two tables and subtract 1 to get rid of the counts from grd2$Freq
> dta2 <- rbind(grd2, dta)
> freqs <- data.frame(xtabs(Freq~Var1, dta2) - 1)
> str(freqs)
'data.frame': 55 obs. of 2 variables:
 $ Var1: Factor w/ 55 levels "0 - 0","0 - 1",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ Freq: num 0 0 0 0 0 0 0 0 0 0 ...
> freqs[c(40:50), ]
    Var1 Freq
40 4 - 9    0
41 5 - 5    2
42 5 - 6   10
43 5 - 7    4
44 5 - 8    0
45 5 - 9    0
46 6 - 6    0
47 6 - 7    2
48 6 - 8    0
49 6 - 9    0
50 7 - 7    0

David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
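
The rbind-and-subtract-1 step works because every possible key appears exactly once in grd2. A possibly simpler route to the same zero-filled table, sketched here as an alternative and not as the method used in the post, is to give the observed keys a factor with all possible keys as levels; table() then reports a zero for every unused level. The names all_keys, obs_keys and pair are made up for illustration, and the 0:9 range is carried over from the post.

```r
# Example data: the matrix posted at the top of the thread
mat <- structure(c(5, 6, 5, 5, 4, 3, 6, 7, 4, 7, 5, 5, 5, 5, 6, 5, 5,
4, 3, 6, 7, 4, 7, 5, 5, 5, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7,
6, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7, 6), .Dim = c(26L, 2L))

# All unordered pairs of 0..9, smaller value first, in a fixed order
grd <- expand.grid(Var1 = 0:9, Var2 = 0:9)
grd <- grd[grd$Var1 <= grd$Var2, ]
grd <- grd[order(grd$Var1, grd$Var2), ]
all_keys <- paste(grd$Var1, grd$Var2, sep = " - ")

# Keys for the observed rows, again with the smaller value first
obs_keys <- paste(pmin(mat[, 1], mat[, 2]), pmax(mat[, 1], mat[, 2]),
                  sep = " - ")

# A factor with fixed levels makes table() count unused levels as 0
freqs <- as.data.frame(table(pair = factor(obs_keys, levels = all_keys)))
freqs[freqs$Freq > 0, ]
```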


You could also call table() on the columns of the input matrix, first
converting them to factors with levels 1:max. Then add together the upper
and lower triangles of the table if order is not important. E.g.,
f2 <- function (mat)
{
    maxMat <- max(mat)
    stopifnot(is.matrix(mat), all(mat %in% seq_len(maxMat)))
    # One factor per column, all sharing the levels 1:maxMat
    L <- split(factor(mat, levels = seq_len(maxMat)), col(mat))
    # Cross-tabulate column 1 against column 2
    Table <- do.call(table, unname(L))
    ignoreOrder <- function(M) {
        stopifnot(length(dim(M)) == 2)
        lower <- lower.tri(M, diag = FALSE)
        upper <- upper.tri(M, diag = FALSE)
        # Fold the upper triangle into the lower one, then mirror it back
        M[lower] <- M[lower] + t(M)[lower]
        M[upper] <- t(M)[upper]
        M
    }
    ignoreOrder(Table)
}
> mat <- structure(c(5, 6, 5, 5, 4, 3, 6, 7, 4, 7, 5, 5, 5, 5, 6, 5, 5,
4, 3, 6, 7, 4, 7, 5, 5, 5, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7,
6, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7, 6), .Dim = c(26L, 2L))
> f2(mat)

     1  2  3  4  5  6  7
  1  0  0  0  0  0  0  0
  2  0  0  0  0  0  0  0
  3  0  0  0  2  0  0  2
  4  0  0  2  0  4  0  0
  5  0  0  0  4  2 10  4
  6  0  0  0  0 10  0  2
  7  0  0  2  0  4  2  0
Bill Dunlap
TIBCO Software
wdunlap tibco.com
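
The folding done by ignoreOrder() can also be written as one matrix expression: adding the table to its transpose counts each off-diagonal pair from both orders, and subtracting the diagonal once undoes the double counting of pairs like (5, 5). This is a sketch of that identity and not a replacement for f2; the names lev, tab and sym are hypothetical, and the same 1:max assumption on the values applies.

```r
# Example data: the matrix posted at the top of the thread
mat <- structure(c(5, 6, 5, 5, 4, 3, 6, 7, 4, 7, 5, 5, 5, 5, 6, 5, 5,
4, 3, 6, 7, 4, 7, 5, 5, 5, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7,
6, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7, 6), .Dim = c(26L, 2L))

# Ordered cross-tabulation of column 1 against column 2, as a plain matrix
lev <- seq_len(max(mat))
tab <- unclass(table(factor(mat[, 1], levels = lev),
                     factor(mat[, 2], levels = lev)))

# tab + t(tab) doubles the diagonal, so subtract it once.
# nrow is given explicitly so a 1 x 1 table is not misread by diag().
sym <- tab + t(tab) - diag(diag(tab), nrow = nrow(tab))
sym

# For probabilities, count each unordered pair once: one triangle plus diagonal
sum(sym[upper.tri(sym, diag = TRUE)])
```

Because the symmetric matrix lists every off-diagonal pair twice, dividing it by sum(sym) would not give pair probabilities; dividing one triangle (with the diagonal) by nrow(mat) would.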


Thanks a lot. This was very helpful. I want to apologise for being
imprecise. My favourite solution was William's.
Thanks again.


More like this?
> mat <- structure(c(5, 6, 5, 5, 4, 3, 6, 7, 4, 7, 5, 5, 5, 5, 6, 5, 5,
+ 4, 3, 6, 7, 4, 7, 5, 5, 5, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7,
+ 6, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7, 6), .Dim = c(26L, 2L))
>
> # Convert columns in mat so first column is always smaller
> mat2 <- data.frame(t(apply(mat, 1, range)))
> mat2$X1 <- factor(mat2$X1, 1:9)
> mat2$X2 <- factor(mat2$X2, 1:9)
> tbl <- xtabs(~X1+X2, mat2)
> tbl.p <- tbl/sum(tbl)
> round(tbl.p, 2)
   X2
X1     1    2    3    4    5    6    7    8    9
  1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
  2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
  3 0.00 0.00 0.00 0.08 0.00 0.00 0.08 0.00 0.00
  4 0.00 0.00 0.00 0.00 0.15 0.00 0.00 0.00 0.00
  5 0.00 0.00 0.00 0.00 0.08 0.38 0.15 0.00 0.00
  6 0.00 0.00 0.00 0.00 0.00 0.00 0.08 0.00 0.00
  7 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
  8 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
  9 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
This puts everything on the diagonal and upper triangle. To get the lower triangle instead, just use
> tbl <- xtabs(~X2+X1, mat2)

David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
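
With levels 1:9 most cells of the probability table are zero. If only the pairs that occur are of interest, the table can be flattened into a long data frame and filtered. This is a small sketch building on the xtabs() call above; the name long is hypothetical, and prop.table() is used here as an equivalent of tbl/sum(tbl).

```r
# Example data: the matrix posted at the top of the thread
mat <- structure(c(5, 6, 5, 5, 4, 3, 6, 7, 4, 7, 5, 5, 5, 5, 6, 5, 5,
4, 3, 6, 7, 4, 7, 5, 5, 5, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7,
6, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7, 6), .Dim = c(26L, 2L))

# Smaller value in X1, larger in X2, both as factors with levels 1:9
mat2 <- data.frame(t(apply(mat, 1, range)))
mat2$X1 <- factor(mat2$X1, levels = 1:9)
mat2$X2 <- factor(mat2$X2, levels = 1:9)

# Proportions over the whole table
tbl.p <- prop.table(xtabs(~ X1 + X2, mat2))

# One row per (X1, X2) cell, then keep the pairs that occur, most frequent first
long <- as.data.frame(tbl.p)
long <- long[long$Freq > 0, ]
long[order(-long$Freq), ]
```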

