

Thank you for all .
One more question.How can I calculate these efficiently?
set.seed(100)
dat<data.frame(x1=rnorm(20),x2=rnorm(20),u=rnorm(20),id=round(2*runif(20)))
# In this example,id's elements are 0,1,2.
y<list()
for (i in 0:2){
X<as.matrix(subset(dat,id==i,c("x1","x2")))
u<as.matrix(subset(dat,id==i,c("u")))
y[[i+1]]<t(X)%*%u%*%t(u)%*%X
}
y[[1]]+y[[2]]+y[[3]]
the above code is not elegant.And my second solution to this problem
is using by to get a list.
matlis<by(dat, dat$id, function(x){
a<as.matrix(x[,c("x1","x2")])
b<as.matrix(x[, "u"])
t(a) %*% b %*% t(b) %*% a
})
S < matrix(unlist(matlis), 4, length(matlis))
S1 < matrix(rowSums(S), 2, 2)
The code works ,but I want to ask if there is any other more better
ways to do it? It seems that this kind of computation is quite common.
2006/2/28, Gabor Grothendieck < [hidden email]>:
> Try:
>
> crossprod(x)
>
> or
>
> t(x) %*% x
>
> On 2/28/06, ronggui < [hidden email]> wrote:
> > This is the code:
> >
> > x<matrix(rnorm(20),5)
> > y<list()
> > for (i in seq(nrow(x))) y[[i]]<t(x[i,,drop=F])%*%x[i,,drop=F]
> > y[[1]]+y[[2]]+y[[3]]+y[[4]]+y[[5]]
> >
> > How can I do it without using for loops?
> > Thank you in advance!
> > 
> > ronggui
> > Deparment of Sociology
> > Fudan University
> >
> > ______________________________________________
> > [hidden email] mailing list
> > https://stat.ethz.ch/mailman/listinfo/rhelp> > PLEASE do read the posting guide! http://www.Rproject.org/postingguide.html> >
>

黄荣贵
Deparment of Sociology
Fudan University
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide! http://www.Rproject.org/postingguide.html


Thomas' solution is better but thought this might be of interest
anyways since it can be written closer to mathematical notation.
That is, the required expression can be written in the
following equivalent way for a suitable matrix A:
X' diag(u) A' A diag(u) X
where diag(u) is a diagonal matrix with u along the diagonal
as in the R diag function, spaces refer to matrix multiplication
and ' means transpose.
Thus we have:
A < outer(unique(dat$id), dat$id, "==")
crossprod(A %*% diag(dat$u) %*% as.matrix(dat[1:2]))
On 3/8/06, Thomas Lumley < [hidden email]> wrote:
> On Wed, 8 Mar 2006, ronggui wrote:
>
> > Thank you for all .
> >
> > One more question.How can I calculate these efficiently?
> >
> > set.seed(100)
> > dat<data.frame(x1=rnorm(20),x2=rnorm(20),u=rnorm(20),id=round(2*runif(20)))
> > # In this example,id's elements are 0,1,2.
> > y<list()
> > for (i in 0:2){
> > X<as.matrix(subset(dat,id==i,c("x1","x2")))
> > u<as.matrix(subset(dat,id==i,c("u")))
> > y[[i+1]]<t(X)%*%u%*%t(u)%*%X
> > }
> > y[[1]]+y[[2]]+y[[3]]
> >
>
> People have already told you about crossprod, so crossprod(crossprod(X,u))
> would seem an obvious improvement over the matrix multiplications.
>
> There is a better solution, though.
>
> Xu<dat[,c("x1","x2")]*dat[,"u"]
> crossprod( rowsum(Xu, dat$id))
>
> thomas
>
>
> > the above code is not elegant.And my second solution to this problem
> > is using by to get a list.
> >
> > matlis<by(dat, dat$id, function(x){
> > a<as.matrix(x[,c("x1","x2")])
> > b<as.matrix(x[, "u"])
> > t(a) %*% b %*% t(b) %*% a
> > })
> >
> > S < matrix(unlist(matlis), 4, length(matlis))
> > S1 < matrix(rowSums(S), 2, 2)
> >
> > The code works ,but I want to ask if there is any other more better
> > ways to do it? It seems that this kind of computation is quite common.
> >
> >
> >
> >
> >
> > 2006/2/28, Gabor Grothendieck < [hidden email]>:
> >> Try:
> >>
> >> crossprod(x)
> >>
> >> or
> >>
> >> t(x) %*% x
> >>
> >> On 2/28/06, ronggui < [hidden email]> wrote:
> >>> This is the code:
> >>>
> >>> x<matrix(rnorm(20),5)
> >>> y<list()
> >>> for (i in seq(nrow(x))) y[[i]]<t(x[i,,drop=F])%*%x[i,,drop=F]
> >>> y[[1]]+y[[2]]+y[[3]]+y[[4]]+y[[5]]
> >>>
> >>> How can I do it without using for loops?
> >>> Thank you in advance!
> >>> 
> >>> ronggui
> >>> Deparment of Sociology
> >>> Fudan University
> >>>
> >>> ______________________________________________
> >>> [hidden email] mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/rhelp> >>> PLEASE do read the posting guide! http://www.Rproject.org/postingguide.html> >>>
> >>
> >
> >
> > 
> > »ÆÈÙ¹ó
> > Deparment of Sociology
> > Fudan University
> >
> >
>
> Thomas Lumley Assoc. Professor, Biostatistics
> [hidden email] University of Washington, Seattle
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/rhelp> PLEASE do read the posting guide! http://www.Rproject.org/postingguide.html>
>
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide! http://www.Rproject.org/postingguide.html


On Wed, 8 Mar 2006, Gabor Grothendieck wrote:
> Thomas' solution is better but thought this might be of interest
> anyways since it can be written closer to mathematical notation.
> That is, the required expression can be written in the
> following equivalent way for a suitable matrix A:
>
> X' diag(u) A' A diag(u) X
Um. n x n matrix? O(n^2) storage? O(n^3) execution time? Yes, it's fine
when n=20, or even 200, but still...
thomas
> where diag(u) is a diagonal matrix with u along the diagonal
> as in the R diag function, spaces refer to matrix multiplication
> and ' means transpose.
>
> Thus we have:
>
> A < outer(unique(dat$id), dat$id, "==")
> crossprod(A %*% diag(dat$u) %*% as.matrix(dat[1:2]))
>
>
> On 3/8/06, Thomas Lumley < [hidden email]> wrote:
>> On Wed, 8 Mar 2006, ronggui wrote:
>>
>>> Thank you for all .
>>>
>>> One more question.How can I calculate these efficiently?
>>>
>>> set.seed(100)
>>> dat<data.frame(x1=rnorm(20),x2=rnorm(20),u=rnorm(20),id=round(2*runif(20)))
>>> # In this example,id's elements are 0,1,2.
>>> y<list()
>>> for (i in 0:2){
>>> X<as.matrix(subset(dat,id==i,c("x1","x2")))
>>> u<as.matrix(subset(dat,id==i,c("u")))
>>> y[[i+1]]<t(X)%*%u%*%t(u)%*%X
>>> }
>>> y[[1]]+y[[2]]+y[[3]]
>>>
>>
>> People have already told you about crossprod, so crossprod(crossprod(X,u))
>> would seem an obvious improvement over the matrix multiplications.
>>
>> There is a better solution, though.
>>
>> Xu<dat[,c("x1","x2")]*dat[,"u"]
>> crossprod( rowsum(Xu, dat$id))
>>
>> thomas
>>
>>
>>> the above code is not elegant.And my second solution to this problem
>>> is using by to get a list.
>>>
>>> matlis<by(dat, dat$id, function(x){
>>> a<as.matrix(x[,c("x1","x2")])
>>> b<as.matrix(x[, "u"])
>>> t(a) %*% b %*% t(b) %*% a
>>> })
>>>
>>> S < matrix(unlist(matlis), 4, length(matlis))
>>> S1 < matrix(rowSums(S), 2, 2)
>>>
>>> The code works ,but I want to ask if there is any other more better
>>> ways to do it? It seems that this kind of computation is quite common.
>>>
>>>
>>>
>>>
>>>
>>> 2006/2/28, Gabor Grothendieck < [hidden email]>:
>>>> Try:
>>>>
>>>> crossprod(x)
>>>>
>>>> or
>>>>
>>>> t(x) %*% x
>>>>
>>>> On 2/28/06, ronggui < [hidden email]> wrote:
>>>>> This is the code:
>>>>>
>>>>> x<matrix(rnorm(20),5)
>>>>> y<list()
>>>>> for (i in seq(nrow(x))) y[[i]]<t(x[i,,drop=F])%*%x[i,,drop=F]
>>>>> y[[1]]+y[[2]]+y[[3]]+y[[4]]+y[[5]]
>>>>>
>>>>> How can I do it without using for loops?
>>>>> Thank you in advance!
>>>>> 
>>>>> ronggui
>>>>> Deparment of Sociology
>>>>> Fudan University
>>>>>
>>>>> ______________________________________________
>>>>> [hidden email] mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/rhelp>>>>> PLEASE do read the posting guide! http://www.Rproject.org/postingguide.html>>>>>
>>>>
>>>
>>>
>>> 
>>> »ÆÈÙ¹ó
>>> Deparment of Sociology
>>> Fudan University
>>>
>>>
>>
>> Thomas Lumley Assoc. Professor, Biostatistics
>> [hidden email] University of Washington, Seattle
>>
>> ______________________________________________
>> [hidden email] mailing list
>> https://stat.ethz.ch/mailman/listinfo/rhelp>> PLEASE do read the posting guide! http://www.Rproject.org/postingguide.html>>
>>
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/rhelp> PLEASE do read the posting guide! http://www.Rproject.org/postingguide.html>
Thomas Lumley Assoc. Professor, Biostatistics
[hidden email] University of Washington, Seattle ______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide! http://www.Rproject.org/postingguide.html


Yes, its horribly inefficient but as I stated
the reason I mentioned it was because it could be translated
more easily from mathematical notation and I did mention I
preferred your solution.
Actually one problem with both of the solutions is determining
that they are correct. From that viewpoint, although not as
elegant or fun, a more straightforward
translation of the original question might actually be preferable
(Thomas also already mentioned the computation in f):
f < function(x) crossprod(crossprod(x[[3]], as.matrix(x[,1:2])))
mm <by(dat, dat$id, f)
mm[[1]] + mm[[2]] + mm[[3]]
or the last line could be replaced with these two if you don't know
that there are necessarily three components:
result < mm[[1]]
result[] < do.call("mapply", c(sum, mm))
One thing that this bought out for me is that R does not have
a parallel to pmax for sum. If one had psum of if "+" allowed
more than 2 arguments one could have done this instead of the
two lines involving result above:
do.call("psum", mm) # if psum analog to pmax were available
do.call("+", mm) # if + allowed > 2 arguments
On 3/8/06, Thomas Lumley < [hidden email]> wrote:
> On Wed, 8 Mar 2006, Gabor Grothendieck wrote:
>
> > Thomas' solution is better but thought this might be of interest
> > anyways since it can be written closer to mathematical notation.
> > That is, the required expression can be written in the
> > following equivalent way for a suitable matrix A:
> >
> > X' diag(u) A' A diag(u) X
>
> Um. n x n matrix? O(n^2) storage? O(n^3) execution time? Yes, it's fine
> when n=20, or even 200, but still...
>
> thomas
>
>
>
> > where diag(u) is a diagonal matrix with u along the diagonal
> > as in the R diag function, spaces refer to matrix multiplication
> > and ' means transpose.
> >
> > Thus we have:
> >
> > A < outer(unique(dat$id), dat$id, "==")
> > crossprod(A %*% diag(dat$u) %*% as.matrix(dat[1:2]))
> >
> >
> > On 3/8/06, Thomas Lumley < [hidden email]> wrote:
> >> On Wed, 8 Mar 2006, ronggui wrote:
> >>
> >>> Thank you for all .
> >>>
> >>> One more question.How can I calculate these efficiently?
> >>>
> >>> set.seed(100)
> >>> dat<data.frame(x1=rnorm(20),x2=rnorm(20),u=rnorm(20),id=round(2*runif(20)))
> >>> # In this example,id's elements are 0,1,2.
> >>> y<list()
> >>> for (i in 0:2){
> >>> X<as.matrix(subset(dat,id==i,c("x1","x2")))
> >>> u<as.matrix(subset(dat,id==i,c("u")))
> >>> y[[i+1]]<t(X)%*%u%*%t(u)%*%X
> >>> }
> >>> y[[1]]+y[[2]]+y[[3]]
> >>>
> >>
> >> People have already told you about crossprod, so crossprod(crossprod(X,u))
> >> would seem an obvious improvement over the matrix multiplications.
> >>
> >> There is a better solution, though.
> >>
> >> Xu<dat[,c("x1","x2")]*dat[,"u"]
> >> crossprod( rowsum(Xu, dat$id))
> >>
> >> thomas
> >>
> >>
> >>> the above code is not elegant.And my second solution to this problem
> >>> is using by to get a list.
> >>>
> >>> matlis<by(dat, dat$id, function(x){
> >>> a<as.matrix(x[,c("x1","x2")])
> >>> b<as.matrix(x[, "u"])
> >>> t(a) %*% b %*% t(b) %*% a
> >>> })
> >>>
> >>> S < matrix(unlist(matlis), 4, length(matlis))
> >>> S1 < matrix(rowSums(S), 2, 2)
> >>>
> >>> The code works ,but I want to ask if there is any other more better
> >>> ways to do it? It seems that this kind of computation is quite common.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> 2006/2/28, Gabor Grothendieck < [hidden email]>:
> >>>> Try:
> >>>>
> >>>> crossprod(x)
> >>>>
> >>>> or
> >>>>
> >>>> t(x) %*% x
> >>>>
> >>>> On 2/28/06, ronggui < [hidden email]> wrote:
> >>>>> This is the code:
> >>>>>
> >>>>> x<matrix(rnorm(20),5)
> >>>>> y<list()
> >>>>> for (i in seq(nrow(x))) y[[i]]<t(x[i,,drop=F])%*%x[i,,drop=F]
> >>>>> y[[1]]+y[[2]]+y[[3]]+y[[4]]+y[[5]]
> >>>>>
> >>>>> How can I do it without using for loops?
> >>>>> Thank you in advance!
> >>>>> 
> >>>>> ronggui
> >>>>> Deparment of Sociology
> >>>>> Fudan University
> >>>>>
> >>>>> ______________________________________________
> >>>>> [hidden email] mailing list
> >>>>> https://stat.ethz.ch/mailman/listinfo/rhelp> >>>>> PLEASE do read the posting guide! http://www.Rproject.org/postingguide.html> >>>>>
> >>>>
> >>>
> >>>
> >>> 
> >>> »ÆÈÙ¹ó
> >>> Deparment of Sociology
> >>> Fudan University
> >>>
> >>>
> >>
> >> Thomas Lumley Assoc. Professor, Biostatistics
> >> [hidden email] University of Washington, Seattle
> >>
> >> ______________________________________________
> >> [hidden email] mailing list
> >> https://stat.ethz.ch/mailman/listinfo/rhelp> >> PLEASE do read the posting guide! http://www.Rproject.org/postingguide.html> >>
> >>
> >
> > ______________________________________________
> > [hidden email] mailing list
> > https://stat.ethz.ch/mailman/listinfo/rhelp> > PLEASE do read the posting guide! http://www.Rproject.org/postingguide.html> >
>
> Thomas Lumley Assoc. Professor, Biostatistics
> [hidden email] University of Washington, Seattle
>
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide! http://www.Rproject.org/postingguide.html


On Wed, 8 Mar 2006, Gabor Grothendieck wrote:
> Actually one problem with both of the solutions is determining
> that they are correct. From that viewpoint, although not as
> elegant or fun, a more straightforward
> translation of the original question might actually be preferable
Yes. Although for the application where I encounter this (sandwich
variance estimators), the proof that the two versions are the same is of
independent interest, since it shows the relationship between the cluster
jackknife, deltabetas, and the sandwich estimator.
thomas
> (Thomas also already mentioned the computation in f):
>
> f < function(x) crossprod(crossprod(x[[3]], as.matrix(x[,1:2])))
> mm <by(dat, dat$id, f)
> mm[[1]] + mm[[2]] + mm[[3]]
>
> or the last line could be replaced with these two if you don't know
> that there are necessarily three components:
>
> result < mm[[1]]
> result[] < do.call("mapply", c(sum, mm))
>
> One thing that this bought out for me is that R does not have
> a parallel to pmax for sum. If one had psum of if "+" allowed
> more than 2 arguments one could have done this instead of the
> two lines involving result above:
>
> do.call("psum", mm) # if psum analog to pmax were available
> do.call("+", mm) # if + allowed > 2 arguments
>
>
> On 3/8/06, Thomas Lumley < [hidden email]> wrote:
>> On Wed, 8 Mar 2006, Gabor Grothendieck wrote:
>>
>>> Thomas' solution is better but thought this might be of interest
>>> anyways since it can be written closer to mathematical notation.
>>> That is, the required expression can be written in the
>>> following equivalent way for a suitable matrix A:
>>>
>>> X' diag(u) A' A diag(u) X
>>
>> Um. n x n matrix? O(n^2) storage? O(n^3) execution time? Yes, it's fine
>> when n=20, or even 200, but still...
>>
>> thomas
>>
>>
>>
>>> where diag(u) is a diagonal matrix with u along the diagonal
>>> as in the R diag function, spaces refer to matrix multiplication
>>> and ' means transpose.
>>>
>>> Thus we have:
>>>
>>> A < outer(unique(dat$id), dat$id, "==")
>>> crossprod(A %*% diag(dat$u) %*% as.matrix(dat[1:2]))
>>>
>>>
>>> On 3/8/06, Thomas Lumley < [hidden email]> wrote:
>>>> On Wed, 8 Mar 2006, ronggui wrote:
>>>>
>>>>> Thank you for all .
>>>>>
>>>>> One more question.How can I calculate these efficiently?
>>>>>
>>>>> set.seed(100)
>>>>> dat<data.frame(x1=rnorm(20),x2=rnorm(20),u=rnorm(20),id=round(2*runif(20)))
>>>>> # In this example,id's elements are 0,1,2.
>>>>> y<list()
>>>>> for (i in 0:2){
>>>>> X<as.matrix(subset(dat,id==i,c("x1","x2")))
>>>>> u<as.matrix(subset(dat,id==i,c("u")))
>>>>> y[[i+1]]<t(X)%*%u%*%t(u)%*%X
>>>>> }
>>>>> y[[1]]+y[[2]]+y[[3]]
>>>>>
>>>>
>>>> People have already told you about crossprod, so crossprod(crossprod(X,u))
>>>> would seem an obvious improvement over the matrix multiplications.
>>>>
>>>> There is a better solution, though.
>>>>
>>>> Xu<dat[,c("x1","x2")]*dat[,"u"]
>>>> crossprod( rowsum(Xu, dat$id))
>>>>
>>>> thomas
>>>>
>>>>
>>>>> the above code is not elegant.And my second solution to this problem
>>>>> is using by to get a list.
>>>>>
>>>>> matlis<by(dat, dat$id, function(x){
>>>>> a<as.matrix(x[,c("x1","x2")])
>>>>> b<as.matrix(x[, "u"])
>>>>> t(a) %*% b %*% t(b) %*% a
>>>>> })
>>>>>
>>>>> S < matrix(unlist(matlis), 4, length(matlis))
>>>>> S1 < matrix(rowSums(S), 2, 2)
>>>>>
>>>>> The code works ,but I want to ask if there is any other more better
>>>>> ways to do it? It seems that this kind of computation is quite common.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> 2006/2/28, Gabor Grothendieck < [hidden email]>:
>>>>>> Try:
>>>>>>
>>>>>> crossprod(x)
>>>>>>
>>>>>> or
>>>>>>
>>>>>> t(x) %*% x
>>>>>>
>>>>>> On 2/28/06, ronggui < [hidden email]> wrote:
>>>>>>> This is the code:
>>>>>>>
>>>>>>> x<matrix(rnorm(20),5)
>>>>>>> y<list()
>>>>>>> for (i in seq(nrow(x))) y[[i]]<t(x[i,,drop=F])%*%x[i,,drop=F]
>>>>>>> y[[1]]+y[[2]]+y[[3]]+y[[4]]+y[[5]]
>>>>>>>
>>>>>>> How can I do it without using for loops?
>>>>>>> Thank you in advance!
>>>>>>> 
>>>>>>> ronggui
>>>>>>> Deparment of Sociology
>>>>>>> Fudan University
>>>>>>>
>>>>>>> ______________________________________________
>>>>>>> [hidden email] mailing list
>>>>>>> https://stat.ethz.ch/mailman/listinfo/rhelp>>>>>>> PLEASE do read the posting guide! http://www.Rproject.org/postingguide.html>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> 
>>>>> »ÆÈÙ¹ó
>>>>> Deparment of Sociology
>>>>> Fudan University
>>>>>
>>>>>
>>>>
>>>> Thomas Lumley Assoc. Professor, Biostatistics
>>>> [hidden email] University of Washington, Seattle
>>>>
>>>> ______________________________________________
>>>> [hidden email] mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/rhelp>>>> PLEASE do read the posting guide! http://www.Rproject.org/postingguide.html>>>>
>>>>
>>>
>>> ______________________________________________
>>> [hidden email] mailing list
>>> https://stat.ethz.ch/mailman/listinfo/rhelp>>> PLEASE do read the posting guide! http://www.Rproject.org/postingguide.html>>>
>>
>> Thomas Lumley Assoc. Professor, Biostatistics
>> [hidden email] University of Washington, Seattle
>>
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/rhelp> PLEASE do read the posting guide! http://www.Rproject.org/postingguide.html>
Thomas Lumley Assoc. Professor, Biostatistics
[hidden email] University of Washington, Seattle ______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide! http://www.Rproject.org/postingguide.html

