Hi Hadley,

On 10/21/2013 10:51 AM, Hadley Wickham wrote:

> Hi all,

>

> Would anyone be interested in reviewing a patch to make the set

> operations (union, intersect, setdiff, setequal, is.element) generic?

S3 generics, S4 generics, or primitives?

Since they are binary operations, sounds like supporting multiple

dispatch would be a plus.

Note that all those things heavily rely on match() behind the scenes.

If match() itself was an S4 generic (or a primitive like c() and [)

then union(), intersect(), setdiff(), is.element() could be defined

with something like:

union <- function(x, y)
{
    xy <- c(x, y)
    sm <- match(xy, xy)         # index of the first occurrence of each element
    xy[sm == seq_along(sm)]     # keep first occurrences only
}

intersect <- function(x, y)
{
    sm <- match(x, x)
    x <- x[sm == seq_along(sm)]    # drop duplicates in x
    m <- match(x, y)
    x[!is.na(m)]                   # keep elements also found in y
}

setequal <- function(x, y)
{
    # equal as sets: every element of x is in y, and every element of y is in x
    !(anyNA(match(x, y)) || anyNA(match(y, x)))
}

and as long as your objects support [, c(), and match(), then the set

operations will work out-of-the-box on them. Note that you would also

get %in% for free.
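
For completeness, setdiff(), is.element() and %in% could follow the same pattern. This is only a sketch: the names my_setdiff, my_is_element and %my_in% are hypothetical, chosen here to avoid masking the base functions, and the bodies assume nothing beyond [ and match() working on the objects.

```r
# Hypothetical names, so that the base functions are not masked.
my_setdiff <- function(x, y)
{
    sm <- match(x, x)
    x <- x[sm == seq_along(sm)]    # drop duplicates in x
    x[is.na(match(x, y))]          # keep elements NOT found in y
}

# Membership test: TRUE where an element of 'el' is found in 'table'.
my_is_element <- function(el, table) !is.na(match(el, table))

# Same thing in operator form.
`%my_in%` <- function(x, table) !is.na(match(x, table))
```

For example, my_setdiff(c(1, 2, 2, 3), c(2, 4)) first reduces x to 1, 2, 3 and then drops 2, since that is the only element matched in y.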

There may still be rare situations where having the set operations
themselves be generic is useful, but I see a lot more value in making
match() itself generic (which doesn't exclude also making the set
operations generic).

For the record, match(), union(), intersect(), and setdiff() are S4

generics in the BiocGenerics package. But there is no doubt it would

be a better/cleaner situation if base::match() itself was an S4 generic

or primitive.
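
To illustrate the idea, here is a rough sketch of a user-level S4 generic for match() with a toy class. Everything in it is an assumption made for illustration: the class CIString (case-insensitive strings), its methods, and the helper union2 are hypothetical, and this is not how BiocGenerics does it. Note that base::union() would still call base::match() internally, which is the reason a generic base::match() would be the cleaner situation.

```r
library(methods)

# Hypothetical toy class: strings compared case-insensitively.
setClass("CIString", representation(s = "character"))

# Turn match() into an S4 generic in the current environment.
setGeneric("match")

setMethod("match", signature("CIString", "CIString"),
    function(x, table, nomatch = NA_integer_, incomparables = NULL)
        base::match(tolower(x@s), tolower(table@s),
                    nomatch = nomatch, incomparables = incomparables))

# The two other building blocks: subsetting and combining.
setMethod("[", "CIString",
    function(x, i, j, ..., drop = TRUE) new("CIString", s = x@s[i]))

setMethod("c", "CIString",
    function(x, ...)
        new("CIString",
            s = unlist(lapply(list(x, ...), function(obj) obj@s))))

# Same body as the union() sketch, under a hypothetical name; it sees
# the S4 generic match() because it is defined after setGeneric().
union2 <- function(x, y)
{
    xy <- c(x, y)
    sm <- match(xy, xy)
    xy[sm == seq_along(sm)]
}

a <- new("CIString", s = c("a", "B"))
b <- new("CIString", s = c("b", "C"))
u <- union2(a, b)    # "B" and "b" are treated as the same element
```

The set function itself knows nothing about CIString; it only relies on c(), [ and match() being defined for the class.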

My 2 cents,

Cheers,

H.

>

> Thanks,

>

> Hadley

>

--

Hervé Pagès

Program in Computational Biology

Division of Public Health Sciences

Fred Hutchinson Cancer Research Center

1100 Fairview Ave. N, M1-B514

P.O. Box 19024

Seattle, WA 98109-1024

E-mail:

[hidden email]
Phone: (206) 667-5791

Fax: (206) 667-1319
