# how exactly does 'identify' work?

 Hi all,

#########################################
test=data.frame(x=1:26,y=-23.5+0.45*(1:26)+rnorm(26))
rownames(test)=LETTERS[1:26]
attach(test)
#test
test.lm=lm(y~x)
plot(test.lm,2)
identify(test.lm$res,,row.names(test)) # not working
plot(x,y)
identify(x,y,row.names(test)) # works fine
identify(y,,row.names(test)) # works fine
identify(x,,row.names(test)) # not working
identify(y,,y) # works
identify(x,,y) # not working
#####################################

My guess is that identify takes its first argument 'x' as the values on the y-axis. However, I have tried many ways to get the LETTERS identified in the QQ plot (plot(test.lm,2)), and it never works. I have even tried extracting the standardized residuals with the 'stdres' function from library(MASS) and passing them as the first argument to identify; that failed too.

Is there any means to achieve this?

Thanks!

casper
## Re: how exactly does 'identify' work?

 Hi,

I think the problem is:

1 - When a linear model is fitted and we plot qqnorm(test.lm$res),
we don't 'know' what values are actually being used on the y-axis, and
how do we refer to the 'Index' on the x-axis?
     Therefore, I don't know how to refer to the x and y coordinates in the
identify function.

2 - I have tried using the stdres function in the MASS library to extract
the standardised residuals and plot them manually (using the plot function).
     This way, the problem is that we have to SORT the residuals in
increasing order first to reproduce the same qqnorm plot. In that case the
'identify' function works, but sorting CHANGES the order, i.e. it won't
return the original A:Z (row.names) labels.
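
On point 2, a minimal sketch that may help: qqnorm() invisibly returns the coordinates it plots, and as far as I can tell they come back in the original data order, so no sorting should be needed. The seed, the use of rstandard() in place of MASS::stdres (for an lm fit the two should give the same values), and plot.it = FALSE (so the snippet also runs without a graphics window) are my own assumptions, not part of the post.

```r
# Rebuild the example data from the first post (the seed is an assumption,
# added only so the numbers are reproducible).
set.seed(1)
test <- data.frame(x = 1:26, y = -23.5 + 0.45 * (1:26) + rnorm(26))
rownames(test) <- LETTERS[1:26]
test.lm <- lm(y ~ x, data = test)

# Standardised residuals; rstandard() is used here instead of MASS::stdres
# to avoid the extra dependency.
rs <- rstandard(test.lm)

# plot.it = FALSE returns the Q-Q coordinates without drawing anything.
qq <- qqnorm(rs, plot.it = FALSE)

# qq$y is the input in its original order; qq$x holds the matching
# theoretical quantiles, so element i of both still belongs to row i.
same_order <- isTRUE(all.equal(unname(qq$y), unname(rs)))
largest    <- names(rs)[which.max(qq$x)]   # label of the right-most point
```

Because the order is preserved, the row names can be passed as labels unchanged.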
## Re: how exactly does 'identify' work?

 On 18/11/2010 1:50 PM, casperyc wrote:
> Hi,
>
> I think the problem is
>
> 1 - When a linear model is fitted and we plot qqnorm(test.lm$res),
> we don't 'know' what values are actually being used on the y-axis, and
> how do we refer to the 'Index' on the x-axis?
>       Therefore, I don't know how to refer to the x and y coordinates in the
> identify function.

You could look at qqnorm.default to figure those things out, but it is
probably difficult to do.  You'd be better off using locator() to find
the coordinates of a mouse click, and plotting the label using text().

For a simple example,

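x <- rnorm(100, mean=10, sd=2)
qqnorm(x)
repeat {
   pt <- locator(1)                 # wait for one mouse click
   if (!length(pt$x)) break         # no point returned: locator was ended
   # label the click with the index of the observation closest in y
   text(pt, labels=which.min( abs(x - pt$y) ) )
}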

Duncan Murdoch
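
The snippet above matches a click to a point by its y value only. A possible variant, sketched here under my own assumptions, uses both coordinates returned by qqnorm(). The helper name nearest_point is hypothetical, and distances are computed after rescaling each axis by its range. The click is simulated so the code also runs without a graphics window; in an interactive session pt would come from locator(1).

```r
set.seed(1)                         # seed is an assumption, for reproducibility
x  <- rnorm(100, mean = 10, sd = 2)
qq <- qqnorm(x, plot.it = FALSE)    # coordinates only, nothing drawn

# Hypothetical helper: index of the plotted point closest to a click.
# Each axis is scaled by its range so neither dominates the distance.
nearest_point <- function(qq, pt) {
  dx <- (qq$x - pt$x) / diff(range(qq$x))
  dy <- (qq$y - pt$y) / diff(range(qq$y))
  which.min(dx^2 + dy^2)
}

# Simulated click a little off observation 7
pt <- list(x = qq$x[7] + 0.01, y = qq$y[7] + 0.01)
i  <- nearest_point(qq, pt)
```

Inside the repeat loop, text(pt, labels = nearest_point(qq, pt)) would then replace the which.min() call.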

## Re: how exactly does 'identify' work?

 Yes, I tried to modify the "2L" part of plot.lm:

###################################
if (show[2L]) {
        ylim <- range(rs, na.rm = TRUE)
        ylim[2L] <- ylim[2L] + diff(ylim) * 0.075
        qq <- qqnorm(rs, main = main, ylab = ylab23, ylim = ylim,
            ...)
        if (qqline)
            qqline(rs, lty = 3, col = "gray50")
        if (one.fig)
            title(sub = sub.caption, ...)
        mtext(getCaption(2), 3, 0.25, cex = cex.caption)
        if (id.n > 0)
            text.id(qq$x[show.rs], qq$y[show.rs], rs)
###################################

but I didn't get very far. I could just use text to add the labels; I just
don't understand why identify does not 'identify' the residuals of a linear
model in the qqnorm plot ...

Thanks.
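
As far as I understand it, identify() passes its x and y arguments through xy.coords(), the same helper plot() uses. When y is omitted, a plain numeric vector is treated as the y values and paired with the index 1:n on the x-axis. That would explain the pattern in the first post: identify(y,,...) only matches plot(x,y) because x happens to be 1:26, while residuals paired with 1:26 are nowhere near the points of a Q-Q plot, whose x-axis holds theoretical quantiles. A small sketch (the seed and variable names are assumptions added for illustration, and raw residuals are used for simplicity):

```r
set.seed(1)
x   <- 1:26
y   <- -23.5 + 0.45 * x + rnorm(26)
res <- resid(lm(y ~ x))

# What identify(res, , labels) searches: index on x, residuals on y.
searched <- xy.coords(res)

# What the Q-Q plot actually draws: theoretical quantiles on x.
drawn <- qqnorm(res, plot.it = FALSE)

range(searched$x)   # spans 1 to 26
range(drawn$x)      # roughly -2 to 2 for n = 26
```

Passing the list returned by qqnorm() to identify() makes the searched coordinates agree with the plotted ones.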
## Re: how exactly does 'identify' work?

 Did you read the help page for qqnorm?  The return value contains the x and y
coordinates used, so you can do something like:

> tmp <- qqnorm( resid(test.lm) )
> identify(tmp, , names(resid(test.lm)) )

Or the plot.lm function has an argument id.n that automatically labels
the n most extreme values:

> plot( test.lm, 2, id.n=10 )

Both of those worked in my tests. If they are not working for you, then send
a reproducible example (include data, see ?dput) and maybe we can help
further.
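
For what it is worth, the labelling done by id.n can also be reproduced by hand, which may help to see what is going on. This is only a sketch under my own assumptions: it takes the Q-Q panel of plot.lm to show standardised residuals (rstandard()), picks the n largest in absolute value, and labels them with text(). The seed, n_lab and the pos = 4 offset are my choices. The plotting part is wrapped in interactive() so the rest runs anywhere.

```r
set.seed(1)
test <- data.frame(x = 1:26, y = -23.5 + 0.45 * (1:26) + rnorm(26))
rownames(test) <- LETTERS[1:26]
test.lm <- lm(y ~ x, data = test)

rs    <- rstandard(test.lm)            # standardised residuals, named A..Z
n_lab <- 3                             # how many points to label (assumed)
top   <- order(abs(rs), decreasing = TRUE)[seq_len(n_lab)]

qq <- qqnorm(rs, plot.it = FALSE)      # coordinates in original row order

if (interactive()) {
  plot(qq, xlab = "Theoretical Quantiles", ylab = "Standardized residuals")
  qqline(rs, lty = 3)
  # Label the most extreme points with their row names
  text(qq$x[top], qq$y[top], labels = names(rs)[top], pos = 4)
  # Or pick points by clicking; the labels stay in original order
  identify(qq, labels = names(rs))
}
```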

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
