backsolve, chol, Matrix, and SparseM

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

backsolve, chol, Matrix, and SparseM

Benjamin Tyner
Hi

I have some code which does (on a symmetric matrix 'x')

    backsolve(chol(x), diag(nrow(x)))

and I am wondering what is the recommended way to accomplish this when x
is also sparse (from package:Matrix). I know that package:Matrix
provides a chol method for such matrices, but not a backsolve method. On
the other hand, package:SparseM does provide a backsolve method, but
doesn't actually return a sparse matrix. Moreover, I am a little
hesitant to use SparseM, as the vignette seems to be from 2003.

I did notice that help(topic = "solve", package = "Matrix") says "In
‘solve(a,b)’ in the ‘Matrix’ package, ‘a’ may also be a
‘MatrixFactorization’ instead of directly a matrix." which makes me
think this is the right way:

    Matrix::solve(Cholesky(x), .sparseDiagonal(nrow(x)))

but unfortunately this didn't give the same result as:

    Matrix::solve(chol(x), .sparseDiagonal(nrow(x)))

so I'm asking here in case someone has any suggestions.

Regards,
Ben

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: backsolve, chol, Matrix, and SparseM

Martin Maechler
Dear Ben,

>>>>> Benjamin Tyner <[hidden email]>
>>>>>     on Thu, 24 Sep 2015 13:47:58 -0400 writes:

    > Hi I have some code which does (on a symmetric matrix 'x')

    >     backsolve(chol(x), diag(nrow(x)))

    > and I am wondering what is the recommended way to
    > accomplish this when x is also sparse (from
    > package:Matrix). I know that package:Matrix provides a
    > chol method for such matrices, but not a backsolve
    > method. On the other hand, package:SparseM does provide a
    > backsolve method, but doesn't actually return a sparse
    > matrix. Moreover, I am a little hesitant to use SparseM,
    > as the vignette seems to be from 2003.

Roger Koenker has agreed in the past, that new projects should
rather use Matrix.   SparseM has been the very first R package
providing sparse matrix support.


    > I did notice that help(topic = "solve", package =
    > "Matrix") says "In ‘solve(a,b)’ in the ‘Matrix’ package,
    > ‘a’ may also be a ‘MatrixFactorization’ instead of
    > directly a matrix." which makes me think this is the right
    > way:

    >     Matrix::solve(Cholesky(x), .sparseDiagonal(nrow(x)))

    > but unfortunately this didn't give the same result as:

    >     Matrix::solve(chol(x), .sparseDiagonal(nrow(x)))

    > so I'm asking here in case someone has any suggestions.

You don't give any examples.
So a few remarks and a reproducible example to get more concrete

A. As the Matrix package has classes for triangular matrices and
  Matrix :: chol() returns them, there   is no need for
  forwardsolve() or backwardsolve(), as just   solve() is always
  enough.

B. As Doug Bates has been teaching for many decennia, "it is
  almost always computationally *wrong* to compute a matrix
  inverse explicitly".
  Rather compute    A^{-1} B   or  A^{-1} x  {for vector x,
  matrix B (but different from Identity).

C. Inspite of B, there are cases (such as computing sandwich
  estimates of covariance matrices) where you do want the inverse.
  In that case,

   solve(A)    is semantically equivalent to
   solve(A, diag(.))

   and almost always the *first* form is implempented more
   efficiently than the second.

D. In Matrix,  use chol(.) ... unless you really read a bit
   about Cholesky(.) and its special purpose sparse cholesky decompositions.
   As mentioned above,  Matrix :: chol()  will return a
   "formally triangular" matrix, i.e., inheriting from
   "triangularMatrix"; in the sparse case, very typically of
   specific class "dtCMatrix".

Here's a small reproducible example,
please use it to ask further questions:

*.R:

library(Matrix)
M <- as(diag(4)+1,"dsCMatrix")
m <- as(M, "matrix") # -> traditional R matrix
stopifnot( all(M == m) )
M
L <- Cholesky(M,super=TRUE,perm=FALSE) # a MatrixFactorization ("dCHMsuper")
(L. <- as(L, "Matrix")) #-> lower-triagonal (sparseMatrix, specifically "dtCMatrix")
(cM <- chol(M))# *upper* triagonal ("dtCMatrix")
(cm <- chol(m))#  upper  triagonal traditional matrix -- the same "of course" :
all.equal(as.matrix(cM), cm) # TRUE

(r. <- backsolve(cm, diag(4)))# upper tri. (traditional) matrix
(R. <-     solve(cM) ) ## the "same"  (but nicer printing)
all.equal(as.matrix(R.), r., check.attributes=FALSE) # TRUE
all( abs(R. - r.) <  1e-12 * mean(abs(R.))) # TRUE

*.Rout:

> M <- as(diag(4)+1,"dsCMatrix")
> m <- as(M, "matrix") # -> traditional R matrix
> stopifnot( all(M == m) )
> M
4 x 4 sparse Matrix of class "dsCMatrix"
           
[1,] 2 1 1 1
[2,] 1 2 1 1
[3,] 1 1 2 1
[4,] 1 1 1 2
> L <- Cholesky(M,super=TRUE,perm=FALSE) # a MatrixFactorization ("dCHMsuper")
> (L. <- as(L, "Matrix")) #-> lower-triagonal (sparseMatrix, specifically "dtCMatrix")
4 x 4 sparse Matrix of class "dtCMatrix"
                                           
[1,] 1.4142136 .         .         .      
[2,] 0.7071068 1.2247449 .         .      
[3,] 0.7071068 0.4082483 1.1547005 .      
[4,] 0.7071068 0.4082483 0.2886751 1.118034
> (cM <- chol(M))# *upper* triagonal ("dtCMatrix")
4 x 4 sparse Matrix of class "dtCMatrix"
                                           
[1,] 1.414214 0.7071068 0.7071068 0.7071068
[2,] .        1.2247449 0.4082483 0.4082483
[3,] .        .         1.1547005 0.2886751
[4,] .        .         .         1.1180340
> (cm <- chol(m))#  upper  triagonal traditional matrix -- the same "of course" :
         [,1]      [,2]      [,3]      [,4]
[1,] 1.414214 0.7071068 0.7071068 0.7071068
[2,] 0.000000 1.2247449 0.4082483 0.4082483
[3,] 0.000000 0.0000000 1.1547005 0.2886751
[4,] 0.000000 0.0000000 0.0000000 1.1180340
> all.equal(as.matrix(cM), cm) # TRUE
[1] TRUE
> (r. <- backsolve(cm, diag(4)))# upper tri. (traditional) matrix
          [,1]       [,2]       [,3]       [,4]
[1,] 0.7071068 -0.4082483 -0.2886751 -0.2236068
[2,] 0.0000000  0.8164966 -0.2886751 -0.2236068
[3,] 0.0000000  0.0000000  0.8660254 -0.2236068
[4,] 0.0000000  0.0000000  0.0000000  0.8944272
> (R. <-     solve(cM) ) ## the "same"  (but nicer printing)
4 x 4 sparse Matrix of class "dtCMatrix"
                                               
[1,] 0.7071068 -0.4082483 -0.2886751 -0.2236068
[2,] .          0.8164966 -0.2886751 -0.2236068
[3,] .          .          0.8660254 -0.2236068
[4,] .          .          .          0.8944272
> all.equal(as.matrix(R.), r., check.attributes=FALSE) # TRUE
[1] TRUE
> all( abs(R. - r.) <  1e-12 * mean(abs(R.))) # TRUE
[1] TRUE
>

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: backsolve, chol, Matrix, and SparseM

Benjamin Tyner
Hi Martin,

Thanks for the remarks and examples, and for confirming that I was
indeed barking up the wrong tree with SparseM.

A. I assume that is a typo and you meant to say, no need for backsolve().
B. Absolutely; however, in this case I am taking advantage of
quadprog::solve.QP(..., factorized = TRUE) which requires the inverse of
the Cholesky factor; it turns out to be faster to compute this one time
upfront rather than have solve.QP(..., factorized = FALSE) do it over
and over again. Of course the holy grail would be a QP solver which
takes advantage of the various innovations from package:Matrix, but I
digress...
C. Agreed, assuming you are talking about Matrix::solve(X) on X of class
Matrix. On the other hand for a regular matrix x it is not difficult to
construct examples where backsolve(chol(x), diag(nrow(x))) is twice as
fast as base::solve(chol(x)), which led me down this path in the first
place.

By the way, is R-forge still the correct place to report bugs in
package:Matrix?

Regards
Ben


On 09/25/2015 04:25 AM, Martin Maechler wrote:

> Dear Ben,
>
>>>>>> Benjamin Tyner <[hidden email]>
>>>>>>     on Thu, 24 Sep 2015 13:47:58 -0400 writes:
>     > Hi I have some code which does (on a symmetric matrix 'x')
>
>     >     backsolve(chol(x), diag(nrow(x)))
>
>     > and I am wondering what is the recommended way to
>     > accomplish this when x is also sparse (from
>     > package:Matrix). I know that package:Matrix provides a
>     > chol method for such matrices, but not a backsolve
>     > method. On the other hand, package:SparseM does provide a
>     > backsolve method, but doesn't actually return a sparse
>     > matrix. Moreover, I am a little hesitant to use SparseM,
>     > as the vignette seems to be from 2003.
>
> Roger Koenker has agreed in the past, that new projects should
> rather use Matrix.   SparseM has been the very first R package
> providing sparse matrix support.
>
>
>     > I did notice that help(topic = "solve", package =
>     > "Matrix") says "In ‘solve(a,b)’ in the ‘Matrix’ package,
>     > ‘a’ may also be a ‘MatrixFactorization’ instead of
>     > directly a matrix." which makes me think this is the right
>     > way:
>
>     >     Matrix::solve(Cholesky(x), .sparseDiagonal(nrow(x)))
>
>     > but unfortunately this didn't give the same result as:
>
>     >     Matrix::solve(chol(x), .sparseDiagonal(nrow(x)))
>
>     > so I'm asking here in case someone has any suggestions.
>
> You don't give any examples.
> So a few remarks and a reproducible example to get more concrete
>
> A. As the Matrix package has classes for triangular matrices and
>   Matrix :: chol() returns them, there   is no need for
>   forwardsolve() or backwardsolve(), as just   solve() is always
>   enough.
>
> B. As Doug Bates has been teaching for many decennia, "it is
>   almost always computationally *wrong* to compute a matrix
>   inverse explicitly".
>   Rather compute    A^{-1} B   or  A^{-1} x  {for vector x,
>   matrix B (but different from Identity).
>
> C. Inspite of B, there are cases (such as computing sandwich
>   estimates of covariance matrices) where you do want the inverse.
>   In that case,
>
>    solve(A)    is semantically equivalent to
>    solve(A, diag(.))
>
>    and almost always the *first* form is implempented more
>    efficiently than the second.
>
> D. In Matrix,  use chol(.) ... unless you really read a bit
>    about Cholesky(.) and its special purpose sparse cholesky decompositions.
>    As mentioned above,  Matrix :: chol()  will return a
>    "formally triangular" matrix, i.e., inheriting from
>    "triangularMatrix"; in the sparse case, very typically of
>    specific class "dtCMatrix".
>
> Here's a small reproducible example,
> please use it to ask further questions:
>
> *.R:
>
> library(Matrix)
> M <- as(diag(4)+1,"dsCMatrix")
> m <- as(M, "matrix") # -> traditional R matrix
> stopifnot( all(M == m) )
> M
> L <- Cholesky(M,super=TRUE,perm=FALSE) # a MatrixFactorization ("dCHMsuper")
> (L. <- as(L, "Matrix")) #-> lower-triagonal (sparseMatrix, specifically "dtCMatrix")
> (cM <- chol(M))# *upper* triagonal ("dtCMatrix")
> (cm <- chol(m))#  upper  triagonal traditional matrix -- the same "of course" :
> all.equal(as.matrix(cM), cm) # TRUE
>
> (r. <- backsolve(cm, diag(4)))# upper tri. (traditional) matrix
> (R. <-     solve(cM) ) ## the "same"  (but nicer printing)
> all.equal(as.matrix(R.), r., check.attributes=FALSE) # TRUE
> all( abs(R. - r.) <  1e-12 * mean(abs(R.))) # TRUE
>
> *.Rout:
>
>> M <- as(diag(4)+1,"dsCMatrix")
>> m <- as(M, "matrix") # -> traditional R matrix
>> stopifnot( all(M == m) )
>> M
> 4 x 4 sparse Matrix of class "dsCMatrix"
>            
> [1,] 2 1 1 1
> [2,] 1 2 1 1
> [3,] 1 1 2 1
> [4,] 1 1 1 2
>> L <- Cholesky(M,super=TRUE,perm=FALSE) # a MatrixFactorization ("dCHMsuper")
>> (L. <- as(L, "Matrix")) #-> lower-triagonal (sparseMatrix, specifically "dtCMatrix")
> 4 x 4 sparse Matrix of class "dtCMatrix"
>                                            
> [1,] 1.4142136 .         .         .      
> [2,] 0.7071068 1.2247449 .         .      
> [3,] 0.7071068 0.4082483 1.1547005 .      
> [4,] 0.7071068 0.4082483 0.2886751 1.118034
>> (cM <- chol(M))# *upper* triagonal ("dtCMatrix")
> 4 x 4 sparse Matrix of class "dtCMatrix"
>                                            
> [1,] 1.414214 0.7071068 0.7071068 0.7071068
> [2,] .        1.2247449 0.4082483 0.4082483
> [3,] .        .         1.1547005 0.2886751
> [4,] .        .         .         1.1180340
>> (cm <- chol(m))#  upper  triagonal traditional matrix -- the same "of course" :
>          [,1]      [,2]      [,3]      [,4]
> [1,] 1.414214 0.7071068 0.7071068 0.7071068
> [2,] 0.000000 1.2247449 0.4082483 0.4082483
> [3,] 0.000000 0.0000000 1.1547005 0.2886751
> [4,] 0.000000 0.0000000 0.0000000 1.1180340
>> all.equal(as.matrix(cM), cm) # TRUE
> [1] TRUE
>> (r. <- backsolve(cm, diag(4)))# upper tri. (traditional) matrix
>           [,1]       [,2]       [,3]       [,4]
> [1,] 0.7071068 -0.4082483 -0.2886751 -0.2236068
> [2,] 0.0000000  0.8164966 -0.2886751 -0.2236068
> [3,] 0.0000000  0.0000000  0.8660254 -0.2236068
> [4,] 0.0000000  0.0000000  0.0000000  0.8944272
>> (R. <-     solve(cM) ) ## the "same"  (but nicer printing)
> 4 x 4 sparse Matrix of class "dtCMatrix"
>                                                
> [1,] 0.7071068 -0.4082483 -0.2886751 -0.2236068
> [2,] .          0.8164966 -0.2886751 -0.2236068
> [3,] .          .          0.8660254 -0.2236068
> [4,] .          .          .          0.8944272
>> all.equal(as.matrix(R.), r., check.attributes=FALSE) # TRUE
> [1] TRUE
>> all( abs(R. - r.) <  1e-12 * mean(abs(R.))) # TRUE
> [1] TRUE


______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.