quantiles and dataframe

Anders Bjørgesæter
Hi

I have a dataframe, RQ, like this:

A     B1    B2    B3
1     NA    112   12
2     NA    123   123
3     NA    324   13
4     3     21    535
5     4     12    33
6     7     1     335
7     4     NA    3535
8     4     NA    NA
9     NA    NA    NA
10    5     NA    NA
12    4     NA    NA
15    2     NA    NA
17    3     NA    1
63    1     NA    1
75    NA    NA    NA
100   NA    NA    NA
123   NA    NA    NA
155   NA    NA    NA
166   NA    NA    NA
177   NA    NA    NA

I want to extract the min, max, 5% and 95% quantiles of A over the range of rows covered by each of the Bs.

Using this:

s1<-A[min(which(!is.na(B1))):max(which(!is.na(B1)))]
q1<-quantile(s1,probs=c(0,5,95,100,NA)/100)

I managed to get this by manually changing B1 to each B in turn:

B1      B2      B3
4.0     1.00    1.00     (min)
63.0    6.00    63.00    (max)
4.5     4.5     1.65     (5%)
40.0    6.00    63.00    (95%)

I tried to use apply like this:

s1<-apply(RQ,2,function(x) {A[min(which(!is.na(RQ[,2:4]))):max(which(!is.na(RQ[,2:4])))] })

to get the range of each B, but that doesn't work.

Also, as you can see, s1 includes the values of A where the B is NA
inside the range, e.g. for B1 I get the 9 from row 9
(4,5,6,7,8,9,10,12,15,17,63) and not (4,5,6,7,8,10,12,15,17,63), which I
would prefer.
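
The difference between the two selections can be reproduced with a short sketch. It assumes the A and B1 columns from the table above are held in plain vectors; the names a, b1, idx, by_range and by_mask are illustrative and not from the original code.

```r
# Columns A and B1 from the table above, as plain vectors (assumed names).
a  <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 17, 63,
        75, 100, 123, 155, 166, 177)
b1 <- c(NA, NA, NA, 3, 4, 7, 4, 4, NA, 5, 4, 2, 3, 1,
        NA, NA, NA, NA, NA, NA)

# Index range from the first to the last non-missing B1: row 9 is kept
# even though B1 is NA there.
idx <- which(!is.na(b1))
by_range <- a[min(idx):max(idx)]

# Logical mask: keeps only the rows where B1 itself is non-missing.
by_mask <- a[!is.na(b1)]

print(by_range)
print(by_mask)
```

Indexing with the logical mask drops row 9, which is the selection preferred above.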

But the main question is: how can I extract the min, max, etc. for each B in
dataframe RQ without using a loop?

Any help is greatly appreciated!

Best Regards
Anders

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: quantiles and dataframe

Gabor Grothendieck
Try this:

sapply(RQ[-1], quantile, probs = c(0, .05, .95, 1), na.rm = TRUE)



Re: quantiles and dataframe

jholtman
In reply to this post by Anders Bjørgesæter
I think this does what you want:

> RQ
     A B1  B2   B3
1    1 NA 112   12
2    2 NA 123  123
3    3 NA 324   13
4    4  3  21  535
5    5  4  12   33
6    6  7   1  335
7    7  4  NA 3535
8    8  4  NA   NA
9    9 NA  NA   NA
10  10  5  NA   NA
11  12  4  NA   NA
12  15  2  NA   NA
13  17  3  NA    1
14  63  1  NA    1
15  75 NA  NA   NA
16 100 NA  NA   NA
17 123 NA  NA   NA
18 155 NA  NA   NA
19 166 NA  NA   NA
20 177 NA  NA   NA
> x <- lapply(RQ[-1], function(.col){
+     quantile(RQ[!is.na(.col), 1], probs=c(0, 0.05, 0.95, 1))
+ })
> do.call('cbind', x)
        B1   B2   B3
0%    4.00 1.00  1.0
5%    4.45 1.25  1.4
95%  42.30 5.75 44.6
100% 63.00 6.00 63.0
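
For anyone who wants to run this outside an interactive session, here is a self-contained sketch of the same approach. The data.frame() call rebuilds RQ from the printed table, and vapply is swapped in for lapply plus do.call('cbind', ...) so that the result comes back as a matrix directly; the names probs and res are illustrative.

```r
# Rebuild RQ from the table printed above.
RQ <- data.frame(
  A  = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 17, 63,
         75, 100, 123, 155, 166, 177),
  B1 = c(NA, NA, NA, 3, 4, 7, 4, 4, NA, 5, 4, 2, 3, 1,
         NA, NA, NA, NA, NA, NA),
  B2 = c(112, 123, 324, 21, 12, 1, rep(NA, 14)),
  B3 = c(12, 123, 13, 535, 33, 335, 3535, rep(NA, 5), 1, 1, rep(NA, 6))
)

probs <- c(0, 0.05, 0.95, 1)

# For each B column, take the quantiles of A over the rows where that
# column is not NA. vapply checks that each result has length(probs) values.
res <- vapply(RQ[-1],
              function(col) quantile(RQ$A[!is.na(col)], probs = probs),
              numeric(length(probs)))
print(res)
```

With quantile's default settings this should give the same numbers as the table above.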



--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Re: quantiles and dataframe

Gabor Grothendieck
In reply to this post by Gabor Grothendieck
Sorry, try this instead.  It creates a data frame of 3 columns in which
each column equals RQ[,1], except that it has NAs wherever the corresponding
column of RQ[-1] has NAs.  The quantile operation is then performed on that.

sapply(sign(RQ[-1]) * RQ[,1], quantile, probs = c(0, .05, .95, 1), na.rm = TRUE)
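
Note that sign() returns 0 for zero and -1 for negative values, so the one-liner above relies on every non-missing B value being positive, as is the case in this data. A sketch of a variant that masks on is.na() directly, and so makes no assumption about sign, might look like this. A small made-up data frame d stands in for RQ; the names d, masked and res are illustrative.

```r
# Small stand-in for RQ with made-up values. B1 contains a 0 and a negative
# entry, which sign() would turn into 0 and a negated A respectively.
d <- data.frame(A  = c(1, 2, 3, 4, 5),
                B1 = c(NA, 0, -2, 5, NA),
                B2 = c(7, NA, NA, 1, 3))

# One copy of A per B column, with NA wherever that B column is NA.
masked <- lapply(d[-1], function(col) ifelse(is.na(col), NA_real_, d$A))

res <- sapply(masked, quantile, probs = c(0, .05, .95, 1), na.rm = TRUE)
print(res)
```

The same lapply and sapply calls should work on RQ unchanged, with RQ$A in place of d$A.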


On 9/14/07, Gabor Grothendieck <[hidden email]> wrote:

> Try this:
>
> sapply(RQ[-1], quantile, probs = c(0, .05, .95, 1), na.rm = TRUE)
