Problem creation tensor

GiuseppeRicci
Hi guys,

I need some help analyzing my data.
Let me start by describing it: I have 21 matrices; each matrix has users
on the rows and items (in my case, films) on the columns.
The element at index (i, j) is the rating given by user i to item j.
I have one matrix for each profession (occupation).
An example of this type of matrix is:

               item 1    item 2    item 3    item 4
  id user 1       1         ?         ?         5
  id user 2       ?         3         3         ?
  id user 3       2         ?         3         2
  id user 4       ?         ?         ?         4
  ...
So user 1 doesn't like item 1 but likes item 4 very much; for items 2
and 3 he hasn't given a rating, and so on.
I need to construct a tensor with n users, m items and 21 occupations.
Once I have constructed the tensor, I want to apply PARAFAC.
I read the data from a CSV file and build one matrix for each occupation.

Didier Leibovici (author of the PTAk package) suggested the following to me:

OK, that's a bit clearer: you have 21 matrices (one for each occupation)
of users rating their preferences for m items (from 1 to 5, but without
rating all of them, hence the missing values).
But I suppose the users are not the same across the 21 occupations
(one has only one occupation ... if you're talking about
working/living occupation),
so you can't create a tensor n users x m items x 21 occupations.
However, you can build the contingency table of preferences: m items x 21
occupations x 5 ratings.

One way to build your tensor m x 21 x 5 is:
M1 is the first occupation (users x m) ...
UserItem <-rbind(M1,M2, ...M21)

m=1682

for (j in 1:m){
    UserItem[,j] =factor(UserItem[,j],levels=1:5)
}
occ=factor(c(rep(1,dim(M1)[1]),rep(2,dim(M2)[1]),
...,rep(21,dim(M21)[1])),levels=1:21)

Z <- array(rep(0,m*21*5),c(m,21,5),
list(paste("item",1:m,sep=""),paste("Occ",1:21,sep=""),c("pr1","pr2","pr3","pr4","pr5")))
for ( i in 1:m){
  as.matrix(table(occ, UserItem[,2]))
  Z[i,,]=table(occ, UserItem[,i])
}

Z.CAND <- CANPARA(Z,dim=7)

I have implemented this code, but I get an error at:

  for ( i in 1:m){
        Z[i,,]=table(occ,UserItem[,i])
  }

and the error is:

Error in
Z[i,,]=table(occ,UserItem[,i])
the number of elements to be replaced is not a multiple of the length
of substitution

Can anyone help me understand this code and how to resolve the error?
Thanks.
Best regards.
Giuseppe

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Problem creation tensor

Kjetil Halvorsen
Amusing that someone named RICCI is asking about tensors ....
(sorry!)

Kjetil


Re: Problem creation tensor

Petr Savicky
In reply to this post by GiuseppeRicci
On Tue, Jul 17, 2012 at 12:31:38PM +0200, Peppe Ricci wrote:

> I have implemented this code, but I get an error at:
>
>   for ( i in 1:m){
>         Z[i,,]=table(occ,UserItem[,i])
>   }
>
> and the error is:
>
> Error in
> Z[i,,]=table(occ,UserItem[,i])
> the number of elements to be replaced is not a multiple of the length
> of substitution

Hi.

The problem in this code is that the command

  UserItem <- rbind(M1, M2, ..., M21)

produces a matrix, not a data.frame. Because of this, the commands

    UserItem[, j] <- factor(UserItem[, j], levels=1:5)

do not convert the columns to factors: a matrix column cannot hold a
factor, so the columns remain numeric. As a result, the table created as

  table(occ, UserItem[, i])

may not have the full size, since its columns correspond only to the
preferences that actually occur in UserItem[, i], not to all possible
preferences.

Changing

  UserItem <- rbind(M1, M2, ..., M21)

to

  UserItem <- data.frame(rbind(M1, M2, ..., M21))

can resolve the problem, since a data.frame column can hold a factor:
the conversion loop then yields factors whose list of levels is complete,
even if some level is not used.
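
The difference can be checked on a small example. The following is only a
minimal sketch, not code from the thread: M1 and M2 are made-up 2 x 2
stand-ins for the occupation matrices, and the names as_matrix and as_frame
are introduced purely for illustration.

```r
# Hypothetical toy data: two tiny "occupation" matrices, NA = no rating.
M1 <- matrix(c(1, NA, 3, 5), nrow = 2)
M2 <- matrix(c(NA, 3, 3, NA), nrow = 2)
occ <- factor(c(1, 1, 2, 2), levels = 1:2)

# Matrix version: the factor is stored back as plain numbers.
as_matrix <- rbind(M1, M2)
as_matrix[, 1] <- factor(as_matrix[, 1], levels = 1:5)
is.factor(as_matrix[, 1])          # FALSE
dim(table(occ, as_matrix[, 1]))    # 2 2: only ratings 1 and 3 occur

# data.frame version: the column really becomes a factor with five levels.
as_frame <- data.frame(rbind(M1, M2))
as_frame[, 1] <- factor(as_frame[, 1], levels = 1:5)
is.factor(as_frame[, 1])           # TRUE
dim(table(occ, as_frame[, 1]))     # 2 5: one column per possible rating
```

Only the second table always has five columns, which is what the
assignment into Z[i,,] needs.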

For clarity, consider writing the definition of the array in the
equivalent form

  Z <- array(0, dim=c(m, 21, 5),
             dimnames=list(paste("item", 1:m, sep=""),
                           paste("Occ", 1:21, sep=""),
                           c("pr1", "pr2", "pr3", "pr4", "pr5")))

which names the arguments passed to the function array().

Hope this helps.

Petr Savicky.

Re: Problem creation tensor

Petr Savicky
On Wed, Jul 18, 2012 at 11:38:59AM +0200, Petr Savicky wrote:

> On Tue, Jul 17, 2012 at 12:31:38PM +0200, Peppe Ricci wrote:
[...]

> > I have implemented this code, but I get an error at:
> >
> >   for ( i in 1:m){
> >         Z[i,,]=table(occ,UserItem[,i])
> >   }
> >
> > and the error is:
> >
> > Error in
> > Z[i,,]=table(occ,UserItem[,i])
> > the number of elements to be replaced is not a multiple of the length
> > of substitution
>
> Hi.
>
> The problem in this code is that the command
>
>   UserItem <- rbind(M1, M2, ..., M21)
>
> produces a matrix and not a data.frame. Due to this, the commands

Hi.

Let me add a few more comments. The function rbind() preserves the
types "matrix" and "data.frame": binding matrices gives a matrix, and
binding data frames gives a data.frame. So, if the Mi are indeed
matrices, then the above applies.
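
A quick way to check this, as a minimal sketch with made-up inputs (A and B
are hypothetical 2 x 2 matrices, not the Mi from the thread):

```r
# Hypothetical toy inputs standing in for two of the Mi.
A <- matrix(1:4, nrow = 2)
B <- matrix(5:8, nrow = 2)

# Binding matrices gives a matrix ...
from_matrices <- rbind(A, B)
is.matrix(from_matrices)       # TRUE
is.data.frame(from_matrices)   # FALSE

# ... while binding data frames gives a data.frame.
from_frames <- rbind(as.data.frame(A), as.data.frame(B))
is.data.frame(from_frames)     # TRUE
```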

>     UserItem[, j] <- factor(UserItem[, j], levels=1:5)
>
> do not convert the columns to factors, but the columns remain numeric.
> Due to this, the table created as
>
>   table(occ, UserItem[, i])
>
> may not have the full size, since the columns correspond only to
> preferences, which do occur in UserItem[, i], and not to all possible
> preferences.

This can be demonstrated by the following example.

  occ <- 1:3
  pref <- c(1, 3, 4)
  table(occ, pref)

     pref
  occ 1 3 4
    1 1 0 0
    2 0 1 0
    3 0 0 1

and

  table(occ, pref=factor(pref, levels=1:5))

     pref
  occ 1 2 3 4 5
    1 1 0 0 0 0
    2 0 0 1 0 0
    3 0 0 0 1 0
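
Putting the pieces together, here is a minimal end-to-end sketch on
simulated data. The sizes (4 items, 3 occupations), the helper
make_ratings() and the list mats are assumptions made for illustration;
the real data would come from the 21 occupation matrices.

```r
set.seed(1)
m <- 4                 # toy number of items (1682 in the thread)
n_occ <- 3             # toy number of occupations (21 in the thread)
sizes <- c(5, 3, 4)    # hypothetical number of users per occupation

# Simulate one users-by-items matrix per occupation; NA = missing rating.
make_ratings <- function(n) {
  matrix(sample(c(1:5, NA), n * m, replace = TRUE), nrow = n, ncol = m)
}
mats <- lapply(sizes, make_ratings)

# Stack the matrices, then convert to a data.frame so that the columns
# can really become factors with the full set of levels 1..5.
UserItem <- data.frame(do.call(rbind, mats))
for (j in 1:m) {
  UserItem[, j] <- factor(UserItem[, j], levels = 1:5)
}

# Occupation label for every stacked row.
occ <- factor(rep(seq_len(n_occ), times = sizes), levels = seq_len(n_occ))

Z <- array(0, dim = c(m, n_occ, 5),
           dimnames = list(paste("item", 1:m, sep = ""),
                           paste("Occ", 1:n_occ, sep = ""),
                           paste("pr", 1:5, sep = "")))
for (i in 1:m) {
  Z[i, , ] <- table(occ, UserItem[, i])   # always n_occ x 5
}

dim(Z)                            # 4 3 5
sum(Z) == sum(!is.na(UserItem))   # every observed rating is counted once
```

The final comparison is a cheap sanity check: the total of the counts in Z
should equal the number of non-missing ratings.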

Hope this helps.

Petr Savicky.

Re: Problem creation tensor

GiuseppeRicci
In reply to this post by Kjetil Halvorsen
what is the problem?
gr

-----Original Message-----
From: Kjetil Halvorsen
Sent: Wednesday, July 18, 2012 1:01 AM
To: Peppe Ricci
Cc: [hidden email]
Subject: Re: [R] Problem creation tensor

Amusing that someone named RICCI is asking about tensors ....
(sorry!)

Kjetil

Re: Problem creation tensor

GiuseppeRicci
Hi,

Thank you, Petr, for your help.
I have implemented your suggested code, but there is another problem.
It seems that the code:

for (i in 1:m){
 Z[i,,]=table(occ, data_matrix[,i])
}

doesn't load any values into Z.
Is there an error somewhere?
Thanks.
Giuseppe

Re: Problem creation tensor

Petr Savicky
On Mon, Jul 30, 2012 at 04:11:40AM -0700, GiuseppeRicci wrote:

> Hi,
>
> Thank you, Petr, for your help.
> I have implemented your suggested code, but there is another problem.
> It seems that the code:
>
> for (i in 1:m){
>  Z[i,,]=table(occ, data_matrix[,i])
> }
>
> doesn't load any values into Z.
> Is there an error somewhere?

Hi.

I do not see an error in this part of the code, but there
may be an error in the context in which this code is used.

Did you look at the value of table(occ, data_matrix[,i])
at the time the command is executed?

If the command

  Z[i,,]=table(occ, data_matrix[,i])

does not stop with an error, then it leaves Z[i,,] unchanged only if
Z[i,,] already contains the values of table(occ, data_matrix[,i])
before the command is executed. This may happen, for
example, if you run the command twice.
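
One way to follow this advice is to print what each table contains as the
loop runs. The sketch below uses made-up data: the contents of occ and
data_matrix are hypothetical stand-ins for the objects named in the
thread, and the reported quantities are just one possible choice of
diagnostics.

```r
# Hypothetical toy stand-ins; in the thread these come from the CSV data.
occ <- factor(c(1, 1, 2), levels = 1:2)
data_matrix <- data.frame(item1 = factor(c(1, 5, NA), levels = 1:5),
                          item2 = factor(c(NA, 3, 3), levels = 1:5))
m <- ncol(data_matrix)
Z <- array(0, dim = c(m, nlevels(occ), 5))

for (i in 1:m) {
  tab <- table(occ, data_matrix[, i])
  before <- Z[i, , ]
  Z[i, , ] <- tab
  # Report what was stored, how many ratings are missing, and whether
  # the slice of Z actually changed.
  cat("item", i,
      ": ratings counted =", sum(tab),
      "; missing =", sum(is.na(data_matrix[, i])),
      "; slice changed =", any(before != Z[i, , ]), "\n")
}

sum(Z)   # total number of ratings stored in the tensor
```

If every table sums to zero, the problem lies in data_matrix or occ (for
example, all values missing or outside the levels 1 to 5) rather than in
the assignment itself.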

Petr Savicky.
