ROC optimal threshold

Anadon Herrera, Jose Daniel
Hello,

I am using the ROC package to evaluate predictive models. I have
successfully plotted the ROC curve. However, is there any way to obtain
the operating point, i.e. the optimal threshold value (the point on the
curve nearest to the top-left corner of the axes)?

thank you very much,


jose daniel anadon
area de ecologia
universidad miguel hernandez

españa

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: ROC optimal threshold

Tim Howard
Jose -

I've struggled a bit with the same question, said another way: "how do you find the value in a ROC curve that minimizes false positives while maximizing true positives"?

Here's something I've come up with. I'd be curious to hear from the list whether anyone thinks this code might get stuck in local minima, or if it does find the global minimum each time. (I think it's ok).

From your ROC object you need to grab the sensitivity (= true positive rate), the specificity (= 1 - false positive rate), and the cutoff levels. Then find the cutoff that minimizes either abs(sensitivity - specificity) or sqrt((1-sens)^2 + (1-spec)^2), as follows:

absMin <- extract[which.min(abs(extract$sens - extract$spec)), ]
sqrtMin <- extract[which.min(sqrt((1 - extract$sens)^2 + (1 - extract$spec)^2)), ]

In this example, 'extract' is a dataframe containing three columns: extract$sens = sensitivity values, extract$spec = specificity values, extract$votes = cutoff values. The command subsets the dataframe to a single row containing the desired cutoff and the sens and spec values that are associated with it.

Most of the time these two criteria (abs or sqrt) pick the same cutoff, but sometimes they differ quite a bit.
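For readers who want to try this, here is a minimal sketch of how a data frame like 'extract' might be built and searched. The simulated 'truth' and 'score' vectors and the rule "predict present when score >= cutoff" are assumptions made for illustration; they are not taken from the ROC package.

```r
# Hypothetical data: 'truth' is 1 = present, 0 = absent; 'score' is the model output.
set.seed(1)
truth <- rep(c(0, 1), each = 50)
score <- c(rnorm(50, mean = 0), rnorm(50, mean = 1.5))

# Candidate cutoffs: every observed score; predict "present" when score >= cutoff.
cutoffs <- sort(unique(score))
extract <- data.frame(
  votes = cutoffs,
  sens  = sapply(cutoffs, function(k) mean(score[truth == 1] >= k)),
  spec  = sapply(cutoffs, function(k) mean(score[truth == 0] <  k))
)

# Criterion 1: sensitivity and specificity as close to equal as possible.
absMin <- extract[which.min(abs(extract$sens - extract$spec)), ]

# Criterion 2: Euclidean distance to the top-left corner (sens = 1, spec = 1).
dist_corner <- sqrt((1 - extract$sens)^2 + (1 - extract$spec)^2)
sqrtMin <- extract[which.min(dist_corner), ]
```

Because which.min() evaluates the criterion at every row and returns the first smallest value, this is an exhaustive search over the candidate cutoffs rather than an iterative optimisation, so there is no local minimum to get stuck in.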

I do not see this application of ROC curves very often. A question for those much more knowledgeable than I.... is there a problem with using ROC curves in this manner?

Tim Howard





Re: ROC optimal threshold

Michael Kubovy
Hi Tim and José,

>> Date: Fri, 31 Mar 2006 11:58:14 +0200
>> From: "Anadon Herrera, Jose Daniel" <[hidden email]>
>> Subject: [R] ROC optimal threshold
>>
>> I am using the ROC package to evaluate predictive models
>> I have successfully plot the ROC curve, however
>>
>> is there anyway to obtain the value of operating point=optimal threshold
>> value (i.e. the nearest point of the curve to the top-left corner of the
>> axes)?

On Mar 31, 2006, at 8:01 AM, Tim Howard wrote:

> I've struggled a bit with the same question, said another way: "how  
> do you find the value in a ROC curve that minimizes false positives  
> while maximizing true positives"?

@BOOK{MacmillanCreelman2005,
   title = {Detection theory: {A} user's guide},
   publisher = {Lawrence Erlbaum Associates},
   year = {2005},
   address = {Mahwah, NJ, USA},
   edition = {2nd},
   author = {Macmillan, Neil A and Creelman, C Douglas},
}
on p. 43 shows that the ideal value of the cutoff depends on the  
reward function R that specifies the payoff for each outcome:
\[
LR(x) = \beta = \frac{R(\text{true negative}) - R(\text{false positive})}{R(\text{true positive}) - R(\text{false negative})} \cdot \frac{p(\text{noise})}{p(\text{signal})}
\]

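I believe that your attempt to minimize false positives while
maximizing true positives amounts to maximizing the proportion of
correct answers. For that, give both kinds of correct answer the same
reward and both kinds of error the same penalty, so that the first
fraction is 1 and $\beta = p(noise)/p(signal)$; with equally likely
signal and noise this is $\beta = 1$ (i.e. $\log \beta = 0$).
Otherwise it might be best to explicitly state your costs and benefits
by specifying the reward function R.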
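
As a rough illustration of the formula above, the sketch below computes the criterion from a payoff table and a prior. The function name optimal_beta and the example payoffs are hypothetical, chosen only to show the arithmetic; costs are entered as negative rewards.

```r
# Likelihood-ratio criterion beta from the reward function R and the prior.
# R_tn, R_fp, R_tp, R_fn: rewards for each outcome (costs are negative rewards).
# p_signal: prior probability of a signal, so p(noise) = 1 - p_signal.
optimal_beta <- function(R_tn, R_fp, R_tp, R_fn, p_signal) {
  stopifnot(p_signal > 0, p_signal < 1, R_tp > R_fn)
  ((R_tn - R_fp) / (R_tp - R_fn)) * ((1 - p_signal) / p_signal)
}

# Symmetric payoffs and equally likely signal and noise.
beta_symmetric <- optimal_beta(R_tn = 1, R_fp = -1, R_tp = 1, R_fn = -1,
                               p_signal = 0.5)

# A miss costs five times a false alarm, and signals are rare (10%).
beta_rare <- optimal_beta(R_tn = 1, R_fp = -1, R_tp = 1, R_fn = -5,
                          p_signal = 0.1)
```

An observation x would then be classified as signal whenever its likelihood ratio LR(x) exceeds the computed value.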
_____________________________
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS:     P.O.Box 400400    Charlottesville, VA 22904-4400
Parcels:    Room 102        Gilmer Hall
         McCormick Road    Charlottesville, VA 22903
Office:    B011    +1-434-982-4729
Lab:        B019    +1-434-982-4751
Fax:        +1-434-982-4766
WWW:    http://www.people.virginia.edu/~mk9y/


Re: ROC optimal threshold

Adaikalavan Ramasamy
In reply to this post by Tim Howard
If you define a cost function for a given threshold k as

   cost(k) = FP(k) + lambda * FN(k)

then choose the k that minimises the cost. FP(k) and FN(k) are the
numbers of false positives and false negatives at threshold k.

Set lambda to a value greater than 1 if you want to penalise FN more
than FP. There are many situations where this is desirable, for example
when class sizes are highly unbalanced: if you want to predict rare
events, missing an event will usually be penalised much more heavily
than raising a false alarm on a non-event.
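
A minimal sketch of this cost function in R follows. The simulated 'truth' and 'score' vectors, the grid of thresholds 'ks', and the rule "predict an event when score >= k" are assumptions made for illustration.

```r
# Hypothetical data: rare events (about 10% positives), scores in (0, 1).
set.seed(42)
truth <- rbinom(500, size = 1, prob = 0.1)
score <- plogis(rnorm(500, mean = ifelse(truth == 1, 1, -1)))

# cost(k) = FP(k) + lambda * FN(k), predicting "event" when score >= k.
cost <- function(k, lambda) {
  fp <- sum(score >= k & truth == 0)  # non-events flagged as events
  fn <- sum(score <  k & truth == 1)  # events that were missed
  fp + lambda * fn
}

# Evaluate the cost on a grid of thresholds and keep the cheapest one.
ks <- seq(0, 1, by = 0.01)
best_k <- function(lambda) ks[which.min(sapply(ks, cost, lambda = lambda))]

k_equal  <- best_k(1)   # FP and FN weighted equally
k_miss10 <- best_k(10)  # missing an event costs ten times a false alarm
```

Raising lambda can only move the chosen threshold down (or leave it unchanged), since a lower threshold trades extra false positives for fewer missed events.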


I believe the ROC was designed to compare two methods over a range of
thresholds and not for choosing the threshold itself.

Regards, Adai




Re: ROC optimal threshold

Frank Harrell
In reply to this post by Michael Kubovy

Choosing cutoffs is fraught with difficulties, arbitrariness,
inefficiency, and the necessity to use a complex adjustment for multiple
comparisons in later analysis steps, unless the dataset used to generate
the cutoff is so large that it could be considered infinite.

--
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University


Re: ROC optimal threshold

Tim Howard
In reply to this post by Anadon Herrera, Jose Daniel
Dr. Harrell,
Thank you for your response. I had noted, and appreciate, your perspective on ROC in past listserv entries and am glad to have an opportunity to delve a little deeper.

I (and, I think, Jose Daniel Anadon, the original poster of this question) have a predictive model for the presence of, say, animal_X. This is a spatial model that can be represented on maps and is based on known locations where animal_X is present and (usually) known locations where animal_X is absent. The output of the analysis (using any number of analytic routines, including logit, randomForest, maximum entropy, Mahalanobis distance...) is a full map where every spot has a probability that that particular location has the appropriate habitat for animal_X.

This output can be visualized by just using a color scale (perhaps blue for low probability to red for high probability), BUT, there are times when we want to apply a cutoff to this probability output and create a product where we can say either "yes, animal_X habitat is predicted here" or "no, animal_X habitat is not predicted here."

Note this is the final analytic step. There are no later analysis steps and so (possibly) adjustments for multiple comparisons do not come into play.

Indeed, it seems that using a standard process to find a threshold reduces the arbitrariness of the probability color scale (at what probability do we set 'red'? at what probability do we set 'blue'?).

Are there alternative approaches that reduce the drawbacks you allude to?

How would you turn a surface of probabilities into a binary surface of yes-no?

Thank you for your time.
Sincerely,
Tim Howard

Ecologist
New York Natural Heritage Program

>>> Frank E Harrell Jr <[hidden email]> 03/31/06 11:20 AM >>>

Choosing cutoffs is fraught with difficulties, arbitrariness,
inefficiency, and the necessity to use a complex adjustment for multiple
comparisons in later analysis steps unless the dataset used to generate
the cutoff was so large as could be considered infinite.


Re: ROC optimal threshold

Frank Harrell
In reply to this post by Anadon Herrera, Jose Daniel

Tim,

I think that 'animal_X habitat is predicted here' would hide a lot of
useful information, especially "gray zones" or uncertain areas.   I
think that a continuous mapping of probabilities to a gray scale or to
the heat spectrum would work best.  Bill Cleveland also has another idea
of using 5 saturation levels on each of 2 hues to get 10 levels with
easier human discrimination.  You might also consider thermometer plots
which give some of the most accurate human perception of a continuous
variable.  For the first 2 ideas you may have to round probabilities to
give just 10 intervals (or use deciles).
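
A rough sketch of the binning step in R, assuming the probabilities are held in a matrix; the simulated 20 x 20 matrix 'prob' and the particular gray ramp are hypothetical stand-ins for a real probability surface.

```r
# Hypothetical habitat probabilities for a small 20 x 20 map.
set.seed(7)
prob <- matrix(runif(400), nrow = 20)

# Ten equal-width intervals on [0, 1], coded 1..10 ...
level_equal <- cut(prob, breaks = seq(0, 1, by = 0.1),
                   include.lowest = TRUE, labels = FALSE)

# ... or ten groups of roughly equal size (deciles of the observed values).
level_decile <- cut(prob, breaks = quantile(prob, probs = seq(0, 1, by = 0.1)),
                    include.lowest = TRUE, labels = FALSE)

# Ten gray levels, from light (low probability) to dark (high probability).
grays <- gray(seq(0.9, 0.1, length.out = 10))

# cut() returns a plain vector, so restore the map shape before plotting.
image(matrix(level_equal, nrow = 20), col = grays, zlim = c(1, 10), axes = FALSE)
```

Equal-width intervals keep the levels interpretable as probabilities, whereas deciles spread the map evenly across the ten shades regardless of how the probabilities are distributed.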

If you choose cutpoints from the data, there is uncertainty from the
cutpoint that may have to be taken into account.  See for example
http://biostat.mc.vanderbilt.edu/twiki/pub/Main/RmS/fehbib.html#roy06dic

Frank


--
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University
