Fast matrix multiplication

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

Fast matrix multiplication

Ravi Varadhan-2
Hi,

I would like to compute:  A %*% B %*% t(A)



A is a mxn matrix and B is an nxn symmetric, positive-definite matrix, where m is large relative to n (e.g., m=50,000 and n=100).



Here is a sample code.



M <- 10000

N <- 100

A <- matrix(rnorm(M*N), M, N)

B <- crossprod(matrix(rnorm(N*N), N, N)) # creating a symmetric positive-definite matrix



# method 1

system.time(D <- A %*% B %*% t(A))



# I can obtain speedup by using a Cholesky decomposition of B

# method 2

system.time({

C <- t(chol(B))

E <- tcrossprod(A%*%C)

})



all.equal(D, E)



I am wondering how to obtain more substantial speedup.  Any suggestions would be greatly appreciated.



Thanks,

Ravi



        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Fast matrix multiplication

Ista Zahn
Hi Ravi,

You can achieve substantial speed up by using a faster BLAS (e.g.,
OpenBLAS or MKL), especially on systems with multiple CPUs. On my (6
year old, but 8 core) system your example takes 3.9 seconds with using
the reference BLAS and only 0.9 seconds using OpenBLAS.

Best,
Ista
On Fri, Aug 10, 2018 at 11:46 AM Ravi Varadhan <[hidden email]> wrote:

>
> Hi,
>
> I would like to compute:  A %*% B %*% t(A)
>
>
>
> A is a mxn matrix and B is an nxn symmetric, positive-definite matrix, where m is large relative to n (e.g., m=50,000 and n=100).
>
>
>
> Here is a sample code.
>
>
>
> M <- 10000
>
> N <- 100
>
> A <- matrix(rnorm(M*N), M, N)
>
> B <- crossprod(matrix(rnorm(N*N), N, N)) # creating a symmetric positive-definite matrix
>
>
>
> # method 1
>
> system.time(D <- A %*% B %*% t(A))
>
>
>
> # I can obtain speedup by using a Cholesky decomposition of B
>
> # method 2
>
> system.time({
>
> C <- t(chol(B))
>
> E <- tcrossprod(A%*%C)
>
> })
>
>
>
> all.equal(D, E)
>
>
>
> I am wondering how to obtain more substantial speedup.  Any suggestions would be greatly appreciated.
>
>
>
> Thanks,
>
> Ravi
>
>
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Fast matrix multiplication

Doran, Harold
In reply to this post by Ravi Varadhan-2
Yeah, you might not be able to go much faster here unless A has some
specialized structure that you can take advantage of (e.g., sparsity)?

On 8/10/18, 11:22 AM, "Ravi Varadhan" <[hidden email]> wrote:

>Hi,
>
>I would like to compute:  A %*% B %*% t(A)
>
>
>
>A is a mxn matrix and B is an nxn symmetric, positive-definite matrix,
>where m is large relative to n (e.g., m=50,000 and n=100).
>
>
>
>Here is a sample code.
>
>
>
>M <- 10000
>
>N <- 100
>
>A <- matrix(rnorm(M*N), M, N)
>
>B <- crossprod(matrix(rnorm(N*N), N, N)) # creating a symmetric
>positive-definite matrix
>
>
>
># method 1
>
>system.time(D <- A %*% B %*% t(A))
>
>
>
># I can obtain speedup by using a Cholesky decomposition of B
>
># method 2
>
>system.time({
>
>C <- t(chol(B))
>
>E <- tcrossprod(A%*%C)
>
>})
>
>
>
>all.equal(D, E)
>
>
>
>I am wondering how to obtain more substantial speedup.  Any suggestions
>would be greatly appreciated.
>
>
>
>Thanks,
>
>Ravi
>
>
>
> [[alternative HTML version deleted]]
>
>______________________________________________
>[hidden email] mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
>

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Fast matrix multiplication

Peter Carbonetto
In reply to this post by Ravi Varadhan-2
Hi Ravi,

Like Ista, I have also had success in using R with OpenBLAS on our Linux
compute cluster.

This will involve installing R from source. If you'd like, I can send you
the configure settings I used, as well as the environment settings used to
control OpenBLAS multithreading. The CRAN site also has some good
documentation of this.

Note that I had issues using the latest version of OpenBLAS (0.3.x) with R
3.5.1. I'm not sure if you would have the same issues as me, but to be safe
I would suggest using OpenBLAS 0.2.x instead.

Peter

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Fast matrix multiplication

Ravi Varadhan-2
In reply to this post by Ista Zahn
Hi Ista,
Thank you for the response.  I use Windows.  Is there a pre-compiled version of openBLAS for windows that would make it easy for me to use it?
Thanks,
Ravi

-----Original Message-----
From: Ista Zahn <[hidden email]>
Sent: Friday, August 10, 2018 12:20 PM
To: Ravi Varadhan <[hidden email]>
Cc: [hidden email]
Subject: Re: [R] Fast matrix multiplication


Hi Ravi,

You can achieve substantial speed up by using a faster BLAS (e.g., OpenBLAS or MKL), especially on systems with multiple CPUs. On my (6 year old, but 8 core) system your example takes 3.9 seconds with using the reference BLAS and only 0.9 seconds using OpenBLAS.

Best,
Ista
On Fri, Aug 10, 2018 at 11:46 AM Ravi Varadhan <[hidden email]> wrote:

>
> Hi,
>
> I would like to compute:  A %*% B %*% t(A)
>
>
>
> A is a mxn matrix and B is an nxn symmetric, positive-definite matrix, where m is large relative to n (e.g., m=50,000 and n=100).
>
>
>
> Here is a sample code.
>
>
>
> M <- 10000
>
> N <- 100
>
> A <- matrix(rnorm(M*N), M, N)
>
> B <- crossprod(matrix(rnorm(N*N), N, N)) # creating a symmetric
> positive-definite matrix
>
>
>
> # method 1
>
> system.time(D <- A %*% B %*% t(A))
>
>
>
> # I can obtain speedup by using a Cholesky decomposition of B
>
> # method 2
>
> system.time({
>
> C <- t(chol(B))
>
> E <- tcrossprod(A%*%C)
>
> })
>
>
>
> all.equal(D, E)
>
>
>
> I am wondering how to obtain more substantial speedup.  Any suggestions would be greatly appreciated.
>
>
>
> Thanks,
>
> Ravi
>
>
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Fast matrix multiplication

Ista Zahn
On Mon, Aug 13, 2018 at 2:41 PM Ravi Varadhan <[hidden email]> wrote:
>
> Hi Ista,
> Thank you for the response.  I use Windows.  Is there a pre-compiled version of openBLAS for windows that would make it easy for me to use it?

Not sure. If you want an easy way I would use MRO. More info at
https://mran.microsoft.com/rro#intelmkl1

--Ista

> Thanks,
> Ravi
>
> -----Original Message-----
> From: Ista Zahn <[hidden email]>
> Sent: Friday, August 10, 2018 12:20 PM
> To: Ravi Varadhan <[hidden email]>
> Cc: [hidden email]
> Subject: Re: [R] Fast matrix multiplication
>
>
> Hi Ravi,
>
> You can achieve substantial speed up by using a faster BLAS (e.g., OpenBLAS or MKL), especially on systems with multiple CPUs. On my (6 year old, but 8 core) system your example takes 3.9 seconds with using the reference BLAS and only 0.9 seconds using OpenBLAS.
>
> Best,
> Ista
> On Fri, Aug 10, 2018 at 11:46 AM Ravi Varadhan <[hidden email]> wrote:
> >
> > Hi,
> >
> > I would like to compute:  A %*% B %*% t(A)
> >
> >
> >
> > A is a mxn matrix and B is an nxn symmetric, positive-definite matrix, where m is large relative to n (e.g., m=50,000 and n=100).
> >
> >
> >
> > Here is a sample code.
> >
> >
> >
> > M <- 10000
> >
> > N <- 100
> >
> > A <- matrix(rnorm(M*N), M, N)
> >
> > B <- crossprod(matrix(rnorm(N*N), N, N)) # creating a symmetric
> > positive-definite matrix
> >
> >
> >
> > # method 1
> >
> > system.time(D <- A %*% B %*% t(A))
> >
> >
> >
> > # I can obtain speedup by using a Cholesky decomposition of B
> >
> > # method 2
> >
> > system.time({
> >
> > C <- t(chol(B))
> >
> > E <- tcrossprod(A%*%C)
> >
> > })
> >
> >
> >
> > all.equal(D, E)
> >
> >
> >
> > I am wondering how to obtain more substantial speedup.  Any suggestions would be greatly appreciated.
> >
> >
> >
> > Thanks,
> >
> > Ravi
> >
> >
> >
> >         [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Fast matrix multiplication

plangfelder
On Mon, Aug 13, 2018 at 12:18 PM Ista Zahn <[hidden email]> wrote:
>
> On Mon, Aug 13, 2018 at 2:41 PM Ravi Varadhan <[hidden email]> wrote:
> >
> > Hi Ista,
> > Thank you for the response.  I use Windows.  Is there a pre-compiled version of openBLAS for windows that would make it easy for me to use it?
>
> Not sure. If you want an easy way I would use MRO. More info at
> https://mran.microsoft.com/rro#intelmkl1

OpenBLAS is provided as a binary for Windows, see http://www.openblas.net/ .

You may need to compile R from source though, unless you can use an
equivalent of the linux trick to replace libRblas.so with a symlink to
the compiled openBLAS library.

Peter

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Fast matrix multiplication

Ravi Varadhan-2
In reply to this post by Ista Zahn
Does Microsoft open R come with pre-compiled BLAS that is optimized for matrix computations?

Thanks,
Ravi

-----Original Message-----
From: Ista Zahn <[hidden email]>
Sent: Monday, August 13, 2018 3:18 PM
To: Ravi Varadhan <[hidden email]>
Cc: [hidden email]
Subject: Re: [R] Fast matrix multiplication


On Mon, Aug 13, 2018 at 2:41 PM Ravi Varadhan <[hidden email]> wrote:
>
> Hi Ista,
> Thank you for the response.  I use Windows.  Is there a pre-compiled version of openBLAS for windows that would make it easy for me to use it?

Not sure. If you want an easy way I would use MRO. More info at
https://mran.microsoft.com/rro#intelmkl1

--Ista

> Thanks,
> Ravi
>
> -----Original Message-----
> From: Ista Zahn <[hidden email]>
> Sent: Friday, August 10, 2018 12:20 PM
> To: Ravi Varadhan <[hidden email]>
> Cc: [hidden email]
> Subject: Re: [R] Fast matrix multiplication
>
>
> Hi Ravi,
>
> You can achieve substantial speed up by using a faster BLAS (e.g., OpenBLAS or MKL), especially on systems with multiple CPUs. On my (6 year old, but 8 core) system your example takes 3.9 seconds with using the reference BLAS and only 0.9 seconds using OpenBLAS.
>
> Best,
> Ista
> On Fri, Aug 10, 2018 at 11:46 AM Ravi Varadhan <[hidden email]> wrote:
> >
> > Hi,
> >
> > I would like to compute:  A %*% B %*% t(A)
> >
> >
> >
> > A is a mxn matrix and B is an nxn symmetric, positive-definite matrix, where m is large relative to n (e.g., m=50,000 and n=100).
> >
> >
> >
> > Here is a sample code.
> >
> >
> >
> > M <- 10000
> >
> > N <- 100
> >
> > A <- matrix(rnorm(M*N), M, N)
> >
> > B <- crossprod(matrix(rnorm(N*N), N, N)) # creating a symmetric
> > positive-definite matrix
> >
> >
> >
> > # method 1
> >
> > system.time(D <- A %*% B %*% t(A))
> >
> >
> >
> > # I can obtain speedup by using a Cholesky decomposition of B
> >
> > # method 2
> >
> > system.time({
> >
> > C <- t(chol(B))
> >
> > E <- tcrossprod(A%*%C)
> >
> > })
> >
> >
> >
> > all.equal(D, E)
> >
> >
> >
> > I am wondering how to obtain more substantial speedup.  Any suggestions would be greatly appreciated.
> >
> >
> >
> > Thanks,
> >
> > Ravi
> >
> >
> >
> >         [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Fast matrix multiplication

Ista Zahn
On Tue, Aug 14, 2018 at 9:41 AM Ravi Varadhan <[hidden email]> wrote:
>
> Does Microsoft open R come with pre-compiled BLAS that is optimized for matrix computations?

Yes, see https://mran.microsoft.com/rro#intelmkl1 for details.

--Ista

>
> Thanks,
> Ravi
>
> -----Original Message-----
> From: Ista Zahn <[hidden email]>
> Sent: Monday, August 13, 2018 3:18 PM
> To: Ravi Varadhan <[hidden email]>
> Cc: [hidden email]
> Subject: Re: [R] Fast matrix multiplication
>
>
> On Mon, Aug 13, 2018 at 2:41 PM Ravi Varadhan <[hidden email]> wrote:
> >
> > Hi Ista,
> > Thank you for the response.  I use Windows.  Is there a pre-compiled version of openBLAS for windows that would make it easy for me to use it?
>
> Not sure. If you want an easy way I would use MRO. More info at
> https://mran.microsoft.com/rro#intelmkl1
>
> --Ista
>
> > Thanks,
> > Ravi
> >
> > -----Original Message-----
> > From: Ista Zahn <[hidden email]>
> > Sent: Friday, August 10, 2018 12:20 PM
> > To: Ravi Varadhan <[hidden email]>
> > Cc: [hidden email]
> > Subject: Re: [R] Fast matrix multiplication
> >
> >
> > Hi Ravi,
> >
> > You can achieve substantial speed up by using a faster BLAS (e.g., OpenBLAS or MKL), especially on systems with multiple CPUs. On my (6 year old, but 8 core) system your example takes 3.9 seconds with using the reference BLAS and only 0.9 seconds using OpenBLAS.
> >
> > Best,
> > Ista
> > On Fri, Aug 10, 2018 at 11:46 AM Ravi Varadhan <[hidden email]> wrote:
> > >
> > > Hi,
> > >
> > > I would like to compute:  A %*% B %*% t(A)
> > >
> > >
> > >
> > > A is a mxn matrix and B is an nxn symmetric, positive-definite matrix, where m is large relative to n (e.g., m=50,000 and n=100).
> > >
> > >
> > >
> > > Here is a sample code.
> > >
> > >
> > >
> > > M <- 10000
> > >
> > > N <- 100
> > >
> > > A <- matrix(rnorm(M*N), M, N)
> > >
> > > B <- crossprod(matrix(rnorm(N*N), N, N)) # creating a symmetric
> > > positive-definite matrix
> > >
> > >
> > >
> > > # method 1
> > >
> > > system.time(D <- A %*% B %*% t(A))
> > >
> > >
> > >
> > > # I can obtain speedup by using a Cholesky decomposition of B
> > >
> > > # method 2
> > >
> > > system.time({
> > >
> > > C <- t(chol(B))
> > >
> > > E <- tcrossprod(A%*%C)
> > >
> > > })
> > >
> > >
> > >
> > > all.equal(D, E)
> > >
> > >
> > >
> > > I am wondering how to obtain more substantial speedup.  Any suggestions would be greatly appreciated.
> > >
> > >
> > >
> > > Thanks,
> > >
> > > Ravi
> > >
> > >
> > >
> > >         [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
>

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.