looking for 'tied rows' in dataframe

looking for 'tied rows' in dataframe

Evan Cooch
Suppose I have the following sort of structure:

test <- matrix(c(2,1,1,2,2,2),3,2,byrow=T)

What I need to be able to do is (i) find the maximum value for each row,
(ii) find the column containing the max, but (iii) if the maximum value
is a tie (in this case, all numbers of the row are the same value), then
I want which.max (presumably, a tweaked version of what which.max does)
to return a T for the row where all values are the same.

Parts (i) and (ii) seem easy enough:

apply(test,1,max)        # gives me the maximum values
apply(test,1,which.max)  # gives me the column

But standard which.max doesn't handle ties/duplicates in a way that
serves my need. It defaults to returning the first column containing the
maximum value.
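
To make the default behaviour concrete, here is a small sketch using the same test matrix. It uses only base R; which(x == max(x)) is one common way to list every position of the maximum rather than just the first, and the variable names first_max and all_max are made up for illustration.

```r
test <- matrix(c(2, 1, 1, 2, 2, 2), 3, 2, byrow = TRUE)

# which.max() reports only the first position of the maximum,
# so the tied third row looks like a row whose max is in column 1.
first_max <- apply(test, 1, which.max)
print(first_max)

# Listing every position of the maximum exposes the tie in row 3.
all_max <- lapply(seq_len(nrow(test)),
                  function(i) which(test[i, ] == max(test[i, ])))
print(all_max)
```

A row whose entry in all_max has length greater than one is a row with a tied maximum.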

What I'd like to end up with is, ultimately, something where
apply(test,1,which.max) yields 1,2,T  (rather than 1,2,1).

So, a function which does what which.max currently does if the elements
of the row differ, but which returns a T (or some such) if in fact the
row values are all the same.

I've tried a bunch of things, to no avail. The closest I got was to use a
function to test whether or not the values of a vector are all unique:
isUnique <- function(vector){
    return(!any(duplicated(vector)))
}

which returns TRUE if the values of the vector are all unique. So

apply(test,1,isUnique)

returns

[1]  TRUE  TRUE FALSE

but I'm stuck beyond this.  Suggestions/pointers to the obvious welcome.
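
One caveat that may be worth keeping in mind: isUnique flags any repeated value in a row, not specifically a tie at the maximum. With two columns the two coincide, but with three or more columns they can differ. A sketch of a narrower check follows; the name maxIsTied and the example rows are invented for illustration, and the code assumes there are no missing values.

```r
isUnique <- function(vector) {
  !any(duplicated(vector))
}

# Hypothetical helper: TRUE only when the maximum occurs more than once.
maxIsTied <- function(vector) {
  sum(vector == max(vector)) > 1
}

row_a <- c(3, 1, 1)  # duplicated values, but a single maximum
row_b <- c(3, 3, 1)  # the maximum itself is tied

print(isUnique(row_a))   # duplicates are present in row_a
print(maxIsTied(row_a))  # but its maximum is not tied
print(maxIsTied(row_b))  # here the maximum is tied
```

For the two-column test matrix in this post either check gives the same answer, so the distinction only matters for wider data.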

Thanks in advance.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: looking for 'tied rows' in dataframe

Evan Cooch
Got relatively close - below:

On 3/17/2019 7:39 PM, Evan Cooch wrote:

> but I'm stuck beyond this.

The following gets me pretty close,

test_new <- test
test_new[which(apply(test,1,isUnique)==FALSE),] <- 'T'

but is clunky.

Re: looking for 'tied rows' in dataframe

Evan Cooch
Solved --

hold <- apply(test,1,which.max)
hold[apply(test,1,isUnique)==FALSE] <- 'T'

Now, all I need to do is figure out how to keep <- 'T' from turning
everything in hold into a string.
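
The string conversion happens because an atomic vector in R holds a single type, so assigning 'T' into an integer vector coerces every element to character. Below is a sketch of two ways around it, assuming that a missing value or a separate logical column is an acceptable marker for tied rows; the names tied, as_text, with_na and result are made up for illustration.

```r
test <- matrix(c(2, 1, 1, 2, 2, 2), 3, 2, byrow = TRUE)
isUnique <- function(vector) !any(duplicated(vector))

hold <- apply(test, 1, which.max)
tied <- !apply(test, 1, isUnique)

# Assigning a string coerces the whole vector to character.
as_text <- hold
as_text[tied] <- "T"
print(class(as_text))

# Option 1: mark tied rows with NA, which keeps the vector integer.
with_na <- hold
with_na[tied] <- NA_integer_
print(with_na)

# Option 2: keep the column index and the tie flag side by side.
result <- data.frame(col = hold, tied = tied)
print(result)
```

Option 2 keeps both pieces of information, at the cost of returning a data frame rather than a plain vector.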


On 3/17/2019 8:00 PM, Evan Cooch wrote:

> Got relatively close - below:

Re: looking for 'tied rows' in dataframe

Ben Tupper-2
Hi,

Might you replace 'T' with a numeric value that signals the TRUE case without rumpling your matrix?  0 might be a good choice, as it is never a valid index in a 1-based indexing system.

hold=apply(test,1,which.max)
hold[apply(test,1,isUnique)==FALSE] <- 0
hold
[1] 1 2 0
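
The same idea can be wrapped in a single function. This is only a sketch: the name whichMaxOrZero is invented here, it assumes the rows contain no missing values, and it treats any tie at the maximum as a tie (which is the same as "all values equal" for two-column rows).

```r
# Hypothetical helper: column of the row maximum, or 0 if the max is tied.
whichMaxOrZero <- function(x) {
  if (sum(x == max(x)) > 1) 0L else which.max(x)
}

test <- matrix(c(2, 1, 1, 2, 2, 2), 3, 2, byrow = TRUE)
hold <- apply(test, 1, whichMaxOrZero)
print(hold)
```

Because 0 is never a valid column index, tied rows can later be picked out with hold == 0.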
 


> On Mar 17, 2019, at 8:17 PM, Evan Cooch <[hidden email]> wrote:
>
> Solved --

Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org

Ecological Forecasting: https://eco.bigelow.org/

Re: looking for 'tied rows' in dataframe

Evan Cooch
Good suggestion, and for my purposes, will solve the problem. Thanks!

On 3/18/2019 12:37 PM, Ben Tupper wrote:

> Hi,