
How to delete rows using conditions on all columns

How to delete rows using conditions on all columns

Aher
n <- 10
P1 <- runif(n)
P2 <- runif(n)
P3 <- P1 + P2 + runif(n)/100
P4 <- P1 + P2 + P3 + runif(n)/100
mydata <- data.frame(P1, P2, P3, P4)
mydata[1,1] <- 8
mydata[3,1] <- -5
mydata[2,3] <- -6
mydata[7,3] <- 7

f <- function(z) quantile(z, c(0.01, 0.99))

temp1 <- lapply(mydata, f)
temp1
$P1
       1%       99%
-4.542391  7.354209

$P2
        1%        99%
0.03452814 0.61029804

$P3
       1%       99%
-5.423229  6.498828

$P4
       1%       99%
0.7825967 2.8454615

I want to remove rows based on the per-column limits stored in the list temp1. Any row in which at least one variable has a value below that variable's 1% quantile or above its 99% quantile should be removed.
How can this be achieved?

Thanks in advance for the help.
Regards,
-Aher

Re: How to delete rows using conditions on all columns

markheckmann
If you don't mind using a loop, you could do it like this:

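x <- mydata
# Row 1 of qs holds each column's 1% quantile, row 2 its 99% quantile
# (computed once, on the full data, before any rows are dropped)
qs <- sapply(x, function(col) quantile(col, c(0.01, 0.99)))
# Keep only the rows inside the limits, one column at a time
for (i in seq_along(x))
  x <- subset(x, x[ ,i] >= qs[1,i] & x[ ,i] <= qs[2,i])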
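
If you would rather avoid the loop, something along these lines should give the same rows. This is only a rough sketch: the names lims, inside and cleaned are made up for illustration, and set.seed() is added purely so the example is reproducible (it will not reproduce the quantiles shown in the question).

```r
set.seed(1)  # assumption: seed added only to make the sketch reproducible
n <- 10
P1 <- runif(n)
P2 <- runif(n)
P3 <- P1 + P2 + runif(n) / 100
P4 <- P1 + P2 + P3 + runif(n) / 100
mydata <- data.frame(P1, P2, P3, P4)
mydata[1, 1] <- 8
mydata[3, 1] <- -5
mydata[2, 3] <- -6
mydata[7, 3] <- 7

# 2 x 4 matrix: row 1 = 1% quantiles, row 2 = 99% quantiles
lims <- sapply(mydata, quantile, probs = c(0.01, 0.99))

# One logical column per variable: TRUE where the value lies within its limits
inside <- mapply(function(col, lo, hi) col >= lo & col <= hi,
                 mydata, lims[1, ], lims[2, ])

# Keep the rows that are within the limits for every variable
cleaned <- mydata[rowSums(inside) == ncol(mydata), , drop = FALSE]
```

As in the loop, the limits are computed once from the full data, so both approaches should drop the same rows as long as there are no missing values.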