fastest way to compute the squared Euclidean distance between two vectors in R

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

fastest way to compute the squared Euclidean distance between two vectors in R

Jason Liao
I have a program which needs to compute squared Euclidean distance
between two vectors million of times, which the Rprof shows is the
bottleneck. I wondered if there is any faster way than my own simple
function

distance2 = function(x1, x2)
{
   temp = x1-x2
   sum(temp*temp)
}

I have searched the R-help archives and can not find anything except
when the arguments are matrices. Thanks for any lead.

Jason

Jason Liao, http://www.geocities.com/jg_liao   
Associate Professor of Biostatistics
Drexel University School of Public Health
1505 Race Street, Mail Stop 1033
Bellet Building, 6th Floor
Philadelphia, PA   19102
phone 215-762-3934


      ____________________________________________________________________________________
Looking for last minute shopping deals?

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: fastest way to compute the squared Euclidean distance betweentwo vectors in R

Dimitris Rizopoulos
hits=0.5 tests=BAYES_00,ETHZ_GEOCITIES,FORGED_RCVD_HELO
X-USF-Spam-Flag: NO

try this:

distance2 <- function (x1, x2) {
   temp <- x1 - x2
   sum(temp * temp)
}

x1 <- rnorm(1e06)
x2 <- rnorm(1e06)

system.time(for (i in 1:100) distance2(x1, x2))
system.time(for (i in 1:100) crossprod(x1 - x2))


I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm


----- Original Message -----
From: "Jason Liao" <[hidden email]>
To: <[hidden email]>
Sent: Thursday, January 31, 2008 3:28 AM
Subject: [R] fastest way to compute the squared Euclidean distance
betweentwo vectors in R


>I have a program which needs to compute squared Euclidean distance
> between two vectors million of times, which the Rprof shows is the
> bottleneck. I wondered if there is any faster way than my own simple
> function
>
> distance2 = function(x1, x2)
> {
>   temp = x1-x2
>   sum(temp*temp)
> }
>
> I have searched the R-help archives and can not find anything except
> when the arguments are matrices. Thanks for any lead.
>
> Jason
>
> Jason Liao, http://www.geocities.com/jg_liao
> Associate Professor of Biostatistics
> Drexel University School of Public Health
> 1505 Race Street, Mail Stop 1033
> Bellet Building, 6th Floor
> Philadelphia, PA   19102
> phone 215-762-3934
>
>
>
> ____________________________________________________________________________________
> Looking for last minute shopping deals?
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: fastest way to compute the squared Euclidean distance between two vectors in R

Hesen Peng
In reply to this post by Jason Liao
Hi,

I wonder whether the following may help a little:

Since $\sum (x_i - y_i)^2 = \sum x_i^2 + \sum y_i^2 - 2 \sum x_i y_i$,
we can modify the function as:

temp.distance<-function(x,y){
 sum(x^2) + sum(y^2) - 2* x %*% y
}

I think this may be helpful when you need to compute the bootstrap
distance for vectors from a matrix $X$. We simply need to
pre-calculate sum(x^2) for each vector $x$, and access the resulted
from designated vectors, say $xx$. Then we have:

xx<-RowSum(X^2)

another.distance<-function(i,j){ # i and j are the row index of vector x and y
 xx[i] + xx[j] - 2*x[i,] %*% x[j,]
}

On Jan 31, 2008 10:28 AM, Jason Liao <[hidden email]> wrote:

> I have a program which needs to compute squared Euclidean distance
> between two vectors million of times, which the Rprof shows is the
> bottleneck. I wondered if there is any faster way than my own simple
> function
>
> distance2 = function(x1, x2)
> {
>    temp = x1-x2
>    sum(temp*temp)
> }
>
> I have searched the R-help archives and can not find anything except
> when the arguments are matrices. Thanks for any lead.
>
> Jason
>
> Jason Liao, http://www.geocities.com/jg_liao
> Associate Professor of Biostatistics
> Drexel University School of Public Health
> 1505 Race Street, Mail Stop 1033
> Bellet Building, 6th Floor
> Philadelphia, PA   19102
> phone 215-762-3934
>
>
>       ____________________________________________________________________________________
> Looking for last minute shopping deals?
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



--
彭河森
Hesen Peng
Department of Statistics
Fudan University
Shanghai, P. R. C.
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: fastest way to compute the squared Euclidean distance between two vectors in R

François Pinard
In reply to this post by Jason Liao
Jason Liao <[hidden email]> writes:

> I have a program which needs to compute squared Euclidean distance
> between two vectors million of times, which the Rprof shows is the
> bottleneck. I wondered if there is any faster way than my own simple
> function
>
> distance2 = function(x1, x2)
> {
>    temp = x1-x2
>    sum(temp*temp)
> }

You might try:

  distance2 <- function(x1, x2) crossprod(x1-x2)

And if you do not have to pass the distance2 function itself to some
other function, you might also spare the indirection trhough the
distance2 function and call crossprod directly (replacing the comma by
a minus sign :-).

--
François Pinard   http://pinard.progiciels-bpi.ca

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.