|
|
The duplicated() function gives TRUE if an item in a vector (or row in a
matrix, etc.) is a duplicate of an earlier item. But what I would like
to know is which item does it duplicate?
For example,
v <- c("a", "b", "b", "a")
duplicated(v)
returns
[1] FALSE FALSE TRUE TRUE
What I want is a fast way to calculate
[1] NA NA 2 1
or (equally useful to me)
[1] 1 2 2 1
The result should have the property that if result[i] == j, then v[i] ==
v[j], at least for i != j.
Does this already exist somewhere, or is it easy to write?
Duncan Murdoch
______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
|
|
What about as.integer(factor(v, levels = unique(v)))?
I recall very clearly when I realized the power of this feature of
factor(), but I've not seen it discussed much.
Cheers, Mike.
--
Dr. Michael Sumner
Software and Database Engineer
Australian Antarctic Division
203 Channel Highway
Kingston Tasmania 7050 Australia
|
|
> match(v, unique(v))
[1] 1 2 2 1
Bert Gunter
"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
|
|
Hi,
I generally use match() for that:
> v <- c("a", "b", "b", "a")
> match(v, v)
[1] 1 2 2 1
H.
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: [hidden email]
Phone: (206) 667-5791
Fax: (206) 667-1319
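As a minimal sketch (not part of the reply above), both output forms requested in the original question can be derived from match(v, v). The helper name first_index() and its na_first argument are hypothetical, introduced only for illustration.

```r
# Hypothetical helper: index of the first occurrence of each element of v.
first_index <- function(v, na_first = FALSE) {
  idx <- match(v, v)             # match() returns the first matching position
  if (na_first) {
    idx[!duplicated(v)] <- NA    # elements that are not duplicates get NA
  }
  idx
}

v <- c("a", "b", "b", "a")
first_index(v)                   # 1 2 2 1
first_index(v, na_first = TRUE)  # NA NA 2 1

# The property asked for: result[i] == j implies v[i] == v[j]
all(v == v[first_index(v)])      # TRUE
```

The NA form needs duplicated() only to decide which positions to blank out; the indices themselves come from match().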
|
|
It is not clear to me what you want for the general case. Perhaps:
> v <- letters[c(2,2,1,2,1,1)]
> wh <- tapply(seq_along(v),factor(v), '[',1)
> w <- wh[match(v,v[wh])]
> w
b b a b a a
1 1 3 1 3 3
> ## and if you want NA's for the first occurrences of unique values
> ## of course:
> w[wh] <- NA
> w
b b a b a a
NA 1 NA 1 3 3
I'd like to see a cleverer solution that vectorizes and avoids the
tapply(), though.
Cheers,
Bert
|
|
"I'd like to see a cleverer solution that vectorizes..."
and Herve provided it.
Bert Gunter
"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
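To make the comparison concrete, here is a small sketch that checks the tapply() approach against match(v, v) on the same example vector. The variable names w_tapply and w_match are illustrative and not from the thread.

```r
v <- letters[c(2, 2, 1, 2, 1, 1)]   # "b" "b" "a" "b" "a" "a"

# tapply() version from the earlier message; as.integer() drops the names
wh <- tapply(seq_along(v), factor(v), `[`, 1)
w_tapply <- as.integer(wh[match(v, v[wh])])

# Vectorized version: position of the first occurrence of each element
w_match <- match(v, v)

w_match                        # 1 1 3 1 3 3
identical(w_tapply, w_match)   # TRUE
```

On this input the two agree, and match(v, v) avoids the per-group function calls made by tapply().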
|
|
Hi
A similar result (with different numerical values) can be achieved by making v a factor:
> v <- letters[c(2,2,1,2,1,1)]
> vf<-factor(v)
> as.numeric(vf)
[1] 2 2 1 2 1 1
Cheers
Petr
|
|
>>>>> PIKAL Petr
>>>>> on Tue, 13 Nov 2018 08:42:22 +0000 writes:
> Hi
> similar result (with different numerical values) could
> be achieved by making v a factor.
> > v <- letters[c(2,2,1,2,1,1)]
> > vf<-factor(v)
> > as.numeric(vf)
> [1] 2 2 1 2 1 1
>
> Cheers
> Petr
Yes, as was already remarked by Michael Sumner.
But really the power is in match(): it is called *twice* by factor().
Martin
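The relationship between the factor-based suggestions and match() can be sketched as follows. This is an illustration only; the variable names are made up for the example.

```r
v <- letters[c(2, 2, 1, 2, 1, 1)]   # "b" "b" "a" "b" "a" "a"

codes_sorted <- as.integer(factor(v))                      # levels sorted: "a", "b"
codes_seen   <- as.integer(factor(v, levels = unique(v)))  # levels in order of appearance
codes_match  <- match(v, unique(v))

codes_sorted                          # 2 2 1 2 1 1
codes_seen                            # 1 1 2 1 2 2
identical(codes_seen, codes_match)    # TRUE

# These are group codes, not positions in v. Positions of the first
# occurrences can be recovered from the codes:
first_pos <- which(!duplicated(v))[codes_match]
identical(first_pos, match(v, v))     # TRUE
```

The codes identify which elements are equal to each other, but only match(v, v) (or the last step above) satisfies the requirement that result[i] == j implies v[i] == v[j].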
|
|
On 13/11/2018 12:35 AM, Pages, Herve wrote:
> I generally use match() for that:
>
> > v <- c("a", "b", "b", "a")
>
> > match(v, v)
>
> [1] 1 2 2 1
Yes, this is perfect. Thanks to you (and the private answer I received
that suggested the same).
Duncan Murdoch
|
|
You also asked about doing this for the rows of a matrix. unique() gives
the unique rows, but match() operates on a per-element, not per-row,
basis. You can use split(), which operates on rows of a matrix, to help.
> m <- cbind( A=c(i=5,ii=5,iii=5,iv=4,v=4,vi=4), B=c(2,3,2,2,2,2) )
> unique(m)
A B
i 5 2
ii 5 3
iv 4 2
> match(m, unique(m)) # bad
[1] 1 1 1 3 3 3 4 5 4 4 4 4
> asRows <- function(x) split(x, seq_len(NROW(x))) # convert to list of rows
> match(asRows(m), unique(asRows(m)))
[1] 1 2 1 3 3 3
For data.frames, unique() works on rows but match() works on columns, and
converting to a list of rows does not quite work, because unique() looks at
the row names. A modification of asRows() works around that:
> d <- data.frame(m)
> unique(d)
A B
i 5 2
ii 5 3
iv 4 2
> match(d, unique(d))
[1] NA NA
> asRows <- function(x) lapply(split(x, seq_len(NROW(x))), as.list)
> match(asRows(d), unique(asRows(d)))
[1] 1 2 1 3 3 3
Is this the sort of issue that Hadley's vectors package is addressing?
Bill Dunlap
TIBCO Software
wdunlap tibco.com
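A different route to the same goal, sketched here as an assumption rather than anything proposed in the thread, is to collapse each row into a single character key and call match() on the keys. The helper name first_row() is hypothetical. The sketch assumes the separator "\r" never occurs in the data and that rows which paste() renders identically should be treated as equal, which may not hold for doubles that differ only beyond the printed precision.

```r
# Hypothetical helper: for each row of x, the index of the first identical row.
first_row <- function(x) {
  # Build one string per row by pasting the columns together
  key <- do.call(paste, c(as.data.frame(x), sep = "\r"))
  match(key, key)
}

m <- cbind(A = c(i = 5, ii = 5, iii = 5, iv = 4, v = 4, vi = 4),
           B = c(2, 3, 2, 2, 2, 2))
first_row(m)               # 1 2 1 4 4 4
first_row(data.frame(m))   # 1 2 1 4 4 4
```

Unlike match(asRows(m), unique(asRows(m))), which indexes into the unique rows, this returns positions in the original object, analogous to match(v, v) for vectors.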
|
|
Thanks! That's very nice.
>
> Is this the sort of issue that Hadley's vectors package is addressing?
I don't know; hopefully someone else will respond...
Duncan Murdoch
|
|