how to calculate multiple meta p values

how to calculate multiple meta p values

anikaM
Hello,

I would like to use the metap package
to calculate multiple p values.

I have a data frame, tt, with three p values per row:
> head(tt)
          RS      G      E        B
1: rs2089177 0.9986 0.7153 0.604716
2: rs4360974 0.9738 0.7838 0.430228
3: rs6502526 0.9744 0.7839 0.429160
4: rs8069906 0.7184 0.4918 0.521452
5: rs9905280 0.7205 0.4861 0.465758
6: rs4313843 0.9804 0.8522 0.474313

and a data frame, df, with the corresponding weights for each of the p values
in tt:

> head(df)
       wg       we       wb        RS
1 40.6325 35.39774 580.6436 rs2089177
2 40.6325 35.39774 580.6436 rs4360974
3 40.6325 35.39774 580.6436 rs6502526
4 40.6325 35.39774 580.6436 rs8069906
5 40.6325 35.39774 580.6436 rs9905280
6 40.6325 35.39774 580.6436 rs4313843

The RS column is the same in df and tt.

How can I use the sumz() function to create a new data frame that
looks the same as tt but has an additional column, say named
"META", holding the calculated meta p value for each row?

This is an example of what the p value would be for the first row:

> sumz(c(0.9986,0.7153,0.604716), weights = c(40.6325,35.39774,580.6436), na.action = na.fail)
p =  0.6940048

Thanks
Ana
Ana

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: how to calculate multiple meta p values

anikaM
This is the function I was referring to:
https://www.rdocumentation.org/packages/metap/versions/1.1/topics/sumz


Re: how to calculate multiple meta p values

Michael Dewey-3
Dear Ana

There must be several ways of doing this but see below for an idea with
comments in-line.

On 26/10/2019 00:31, Ana Marija wrote:

> RS column is the same in df and tt
>

So you can create a new data-frame with merge()

newdata <- merge(tt, df)

which will use RS as the key to merge them on.

Then write a function of one argument, a seven-element vector, which
picks out the p-values and the weights and feeds them to sumz().
Something like

helper <- function(x) {
  p <- sumz(x[2:4], weights = x[5:7])$p
  p
}
Note you need to check that 2:4 and 5:7 are actually where the p-values and
weights are in the row of newdata.

Then use apply() to apply that to the rows of newdata.

I have not tested any of this but the general idea should be OK even if
the details are wrong.
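
The merge-then-apply idea above can be sketched as follows. This is only an illustration under stated assumptions: weighted_z_p() is a hypothetical base-R stand-in for sumz(p, weights)$p, written as the weighted sum-of-z (Stouffer) formula so the sketch runs without metap installed, and the data are the first two rows shown in the original post. Columns are picked by name rather than by position, and as.numeric() is applied because apply() passes each row as character when the data frame has a character column.

```r
# Hypothetical stand-in for metap::sumz(p, weights)$p (weighted sum-of-z method).
# Assumption: this formula matches what sumz() computes.
weighted_z_p <- function(p, w) {
  z <- sum(w * qnorm(p, lower.tail = FALSE)) / sqrt(sum(w^2))
  pnorm(z, lower.tail = FALSE)
}

# First two rows of tt and df, copied from the original post.
tt <- data.frame(RS = c("rs2089177", "rs4360974"),
                 G = c(0.9986, 0.9738),
                 E = c(0.7153, 0.7838),
                 B = c(0.604716, 0.430228))
df <- data.frame(wg = 40.6325, we = 35.39774, wb = 580.6436,
                 RS = c("rs2089177", "rs4360974"))

# merge() joins on the only shared column, RS.
newdata <- merge(tt, df)

# Pick p values and weights by column name, converting from character.
helper <- function(x) {
  weighted_z_p(as.numeric(x[c("G", "E", "B")]),
               as.numeric(x[c("wg", "we", "wb")]))
}

newdata$META <- apply(newdata, MARGIN = 1, helper)
print(newdata)
```

For the first row this should agree with the p = 0.6940048 that sumz() printed in the original post. With metap loaded, the body of helper could call sumz() directly instead of the stand-in.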

Michael


--
Michael
http://www.dewey.myzen.co.uk/home.html


Re: how to calculate multiple meta p values

anikaM
Hi Michael,

I tried what you proposed with my data frame q:

> head(q)
           ID         P         G       E       wb      wg       we
1:  rs1029830 0.0979931 0.0054060 0.39160 580.6436 40.6325 35.39774
2:  rs1029832 0.1501820 0.0028140 0.39320 580.6436 40.6325 35.39774
3: rs11078374 0.1701250 0.0009805 0.49730 580.6436 40.6325 35.39774
4:  rs1124961 0.1710150 0.7252000 0.05737 580.6436 40.6325 35.39774
5:  rs1135237 0.1493650 0.6851000 0.06354 580.6436 40.6325 35.39774
6: rs11867934 0.0757972 0.0006140 0.00327 580.6436 40.6325 35.39774

so the solution of the first row would be this:
> sumz(c(0.0979931,0.0054060,0.39160), weights = c(580.6436,40.6325,35.39774), na.action = na.fail)
sumz =  1.481833 p =  0.06919239

I tried applying the function you wrote:
helper <- function(x) {
  p <- sumz(x[2:4], weights = x[5:7])$p
  p
}

With:

q$META <- apply(q, MARGIN = 1, helper)

# I want to make a new column in q named META with results
but I got this error:
 Error in sumz(x[2:4], weights = x[5:7]) :
  Must have at least two valid p values

Please advise,
Ana


Re: how to calculate multiple meta p values

Michael Dewey-3
Dear Ana

Yes, when apply() coerces q to a matrix it does so as a character matrix
because of the values in the first column. So you need to wrap the
references to x in helper in as.numeric(), that is to say, like
as.numeric(x[2:4]), and similarly for the other one. Sorry about that, I
should have thought of it before.

When I next update metap I will try to get it to degrade more gracefully
when it finds an error.
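
The coercion described above can be seen in a small base-R sketch. The toy data frame q_toy is hypothetical (two rows and three columns borrowed from the table in the previous message); the point is only that one character column makes every cell of the matrix character, so numeric fields have to be converted back explicitly.

```r
# Toy data frame with one character column and two numeric columns.
q_toy <- data.frame(ID = c("rs1029830", "rs1029832"),
                    P = c(0.0979931, 0.1501820),
                    wb = 580.6436)

# apply() calls as.matrix() on a data frame before looping over rows.
m <- as.matrix(q_toy)
print(typeof(m))   # the whole matrix is character, not just the ID column

# A single row, as the function given to apply() would receive it.
row1 <- m[1, ]
print(row1)

# Convert the numeric fields back before doing any arithmetic on them.
p1 <- as.numeric(row1["P"])
print(is.numeric(p1))
```

Because the numbers are formatted as text and then parsed again, a row-wise apply() on a mixed data frame can also round values to the printed precision, which is one more reason to prefer working on the numeric columns directly where possible.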

Michael


Re: how to calculate multiple meta p values

anikaM
Hi Michael,

this still doesn't work. My data frame has a few fewer columns now, but
the principle is still the same:

> head(d)
    chr    pos         gene_id       LCL      Retina       wl      wr
1: chr1 775930 ENSG00000237094 0.3559520 9.72251e-05 31.62278 21.2838
2: chr1 815963 ENSG00000237094 0.2648080 3.85837e-06 31.62278 21.2838
3: chr1 816376 ENSG00000237094 0.3313120 3.85824e-06 31.62278 21.2838
4: chr1 817186 ENSG00000237094 0.0912854 3.75134e-06 31.62278 21.2838
5: chr1 817341 ENSG00000237094 0.1020520 3.75134e-06 31.62278 21.2838
6: chr1 817514 ENSG00000237094 0.0831412 3.82866e-06 31.62278 21.2838

so the solution for the first row should be:
> sumz(c(0.3559520,9.72251e-05), weights = c(31.62278,21.2838), na.action = na.fail)
sumz =  2.386896 p =  0.008495647

when I run what you proposed in the last email:

helper <- function(x) {
  p <- sumz(as.numeric(x[4:5]), weights = as.numeric(x[6:7]))$p
  p
}

d$META <- apply(d, MARGIN = 1, helper)

I am getting:

Error in sumz(as.numeric(x[4:5]), weights = as.numeric(x[6:7])) :
  Must have at least two valid p values

Please advise,
Ana
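
One possible cause of this error, which the thread itself does not confirm, is that some rows contain a p value that is exactly 0 or 1 (or is missing), leaving fewer than two usable values in that row. The sketch below assumes that sumz() only counts p values strictly between 0 and 1 as valid; d_toy is a made-up stand-in for d, with a third row invented to trigger the condition.

```r
# Toy stand-in for d; the third row has a p value of exactly 1 (invented).
d_toy <- data.frame(LCL = c(0.3559520, 0.2648080, 1),
                    Retina = c(9.72251e-05, 3.85837e-06, 0.02))

p_cols <- c("LCL", "Retina")

# TRUE where a p value is non-missing and strictly inside (0, 1).
# Assumption: this mirrors what sumz() treats as a "valid" p value.
usable <- sapply(d_toy[p_cols], function(p) !is.na(p) & p > 0 & p < 1)

# Rows with fewer than two usable p values would make a row-wise call fail.
n_usable <- rowSums(usable)
bad_rows <- which(n_usable < 2)
print(bad_rows)
print(d_toy[bad_rows, ])
```

Running the same check on the full table would show whether any rows need to be dropped or handled separately before the row-wise call.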


Re: how to calculate multiple meta p values

anikaM
I also tried to do it this way:

d$META <- sapply(seq_len(nrow(d)), function(rn) {
  unlist(sumz(as.matrix(d[, .(LCL, Retina)])[rn, ],
              weights = as.vector(d[, .(wl, wr)])[rn, ],
              na.action = na.fail)["p"])
})

but again I am getting an error:
Error in sumz(as.matrix(d[, .(LCL, Retina)])[rn, ], weights = as.vector(d[,  :
  Must have at least two valid p values

For reference, these are the details of my data frame:
> head(d)
    chr    pos         gene_id       LCL      Retina       wl      wr
1: chr1 775930 ENSG00000237094 0.3559520 9.72251e-05 31.62278 21.2838
2: chr1 815963 ENSG00000237094 0.2648080 3.85837e-06 31.62278 21.2838
3: chr1 816376 ENSG00000237094 0.3313120 3.85824e-06 31.62278 21.2838
4: chr1 817186 ENSG00000237094 0.0912854 3.75134e-06 31.62278 21.2838
5: chr1 817341 ENSG00000237094 0.1020520 3.75134e-06 31.62278 21.2838
6: chr1 817514 ENSG00000237094 0.0831412 3.82866e-06 31.62278 21.2838
> sapply(d,class)
        chr         pos     gene_id         LCL      Retina          wl
"character" "character" "character"   "numeric"   "numeric"   "numeric"
         wr
  "numeric"
> sum(is.na(d$LCL))
[1] 0
> sum(is.na(d$Retina))
[1] 0
> sum(is.na(d$wl))
[1] 0
> sum(is.na(d$wr))
[1] 0
> dim(d)
[1] 1668837       7
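
With about 1.7 million rows, a row-by-row call is slow as well as fragile. A vectorised alternative is sketched below, assuming that sumz() with weights computes the weighted sum-of-z (Stouffer) statistic; the formula is written out in base R so it runs on whole columns at once. d_head is a hypothetical copy of the first three rows shown above, not the full table.

```r
# First three rows of d (numeric columns only), copied from the output above.
d_head <- data.frame(LCL = c(0.3559520, 0.2648080, 0.3313120),
                     Retina = c(9.72251e-05, 3.85837e-06, 3.85824e-06),
                     wl = 31.62278,
                     wr = 21.2838)

# Weighted sum of z scores, one value per row, computed column-wise.
# Assumption: this is the same statistic that sumz() reports as "sumz".
z <- (d_head$wl * qnorm(d_head$LCL, lower.tail = FALSE) +
      d_head$wr * qnorm(d_head$Retina, lower.tail = FALSE)) /
  sqrt(d_head$wl^2 + d_head$wr^2)

# Upper-tail probability of the combined z gives the meta p value.
d_head$META <- pnorm(z, lower.tail = FALSE)
print(d_head)
```

For the first row this should reproduce the sumz = 2.386896 and p = 0.008495647 reported earlier in the thread. Note that a p value of exactly 0 or 1 gives an infinite z score here, so such rows still need separate handling.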


Re: how to calculate multiple meta p values

anikaM
Can you please get back to me about this? I need these meta p values
for a manuscript I have to submit next week.

On Wed, Oct 30, 2019 at 5:35 PM Ana Marija <[hidden email]> wrote:

>
> I also tried to do it this way:
>
> d$META <- sapply(seq_len(nrow(d)), function(rn) {
>   unlist(sumz(as.matrix(d[,.(LCL,Retina)])[rn,], weights =
> as.vector(d[,.(wl,wr)])[rn,],
>               na.action=na.fail)["p"])
> })
>
> but again I am getting error:
> Error in sumz(as.matrix(d[, .(LCL, Retina)])[rn, ], weights = as.vector(d[,  :
>   Must have at least two valid p values
>
> for this reference these are details about my data frame:
> > head(d)
>     chr    pos         gene_id                     LCL          Retina
>            wl           wr
> 1: chr1 775930 ENSG00000237094 0.3559520 9.72251e-05 31.62278 21.2838
> 2: chr1 815963 ENSG00000237094 0.2648080 3.85837e-06 31.62278 21.2838
> 3: chr1 816376 ENSG00000237094 0.3313120 3.85824e-06 31.62278 21.2838
> 4: chr1 817186 ENSG00000237094 0.0912854 3.75134e-06 31.62278 21.2838
> 5: chr1 817341 ENSG00000237094 0.1020520 3.75134e-06 31.62278 21.2838
> 6: chr1 817514 ENSG00000237094 0.0831412 3.82866e-06 31.62278 21.2838
> > sapply(d,class)
>         chr         pos     gene_id         LCL      Retina          wl
> "character" "character" "character"   "numeric"   "numeric"   "numeric"
>          wr
>   "numeric"
> > sum(is.na(d$LCL))
> [1] 0
> > sum(is.na(d$Retina))
> [1] 0
> > sum(is.na(d$wl))
> [1] 0
> > sum(is.na(d$wr))
> [1] 0
> > dim(d)
> [1] 1668837       7
>
> On Wed, Oct 30, 2019 at 4:52 PM Ana Marija <[hidden email]> wrote:
> >
> > Hi Michael,
> >
> > this still doesn't work, by data frame has a few less columns now, but
> > the principle is still the same:
> >
> > > head(d)
> >     chr    pos         gene_id                     LCL
> > Retina       wl           wr
> > 1: chr1 775930 ENSG00000237094 0.3559520 9.72251e-05 31.62278 21.2838
> > 2: chr1 815963 ENSG00000237094 0.2648080 3.85837e-06 31.62278 21.2838
> > 3: chr1 816376 ENSG00000237094 0.3313120 3.85824e-06 31.62278 21.2838
> > 4: chr1 817186 ENSG00000237094 0.0912854 3.75134e-06 31.62278 21.2838
> > 5: chr1 817341 ENSG00000237094 0.1020520 3.75134e-06 31.62278 21.2838
> > 6: chr1 817514 ENSG00000237094 0.0831412 3.82866e-06 31.62278 21.2838
> >
> > so solution for the first row should be:
> > > sumz(c(0.3559520,9.72251e-05), weights = c(31.62278,21.2838), na.action = na.fail)
> > sumz =  2.386896 p =  0.008495647
> >
> > when I run what you proposed in the last email:
> >
> > helper <- function(x) {
> >   p <- sumz(as.numeric(x[4:5]), weights = as.numeric(x[6:7]))$p
> >   p
> > }
> >
> > d$META <- apply(d, MARGIN = 1, helper)
> >
> > I am getting:
> >
> > Error in sumz(as.numeric(x[4:5]), weights = as.numeric(x[6:7])) :
> >   Must have at least two valid p values
> >
> > Please advise,
> > Ana
> >
> > On Wed, Oct 30, 2019 at 5:02 AM Michael Dewey <[hidden email]> wrote:
> > >
> > > Dear Ana
> > >
> > > > Yes, when apply() coerces q to a matrix it does so as a character matrix
> > > > because of the values in the first column. So you need to wrap the
> > > > references to x in helper in as.numeric(), that is to say
> > > > as.numeric(x[2:4]), and similarly for the other one. Sorry about that, I
> > > > should have thought of it before.
> > >
> > > When I next update metap I will try to get it to degrade more gracefully
> > > when it finds an error.
> > >
> > > Michael
> > >
> > > On 28/10/2019 19:06, Ana Marija wrote:
> > > > Hi Michael,
> > > >
> > > > I tried what you proposed with my data frame q:
> > > >
> > > >> head(q)
> > > > >            ID         P         G       E       wb      wg       we
> > > > 1:  rs1029830 0.0979931 0.0054060 0.39160 580.6436 40.6325 35.39774
> > > > 2:  rs1029832 0.1501820 0.0028140 0.39320 580.6436 40.6325 35.39774
> > > > 3: rs11078374 0.1701250 0.0009805 0.49730 580.6436 40.6325 35.39774
> > > > 4:  rs1124961 0.1710150 0.7252000 0.05737 580.6436 40.6325 35.39774
> > > > 5:  rs1135237 0.1493650 0.6851000 0.06354 580.6436 40.6325 35.39774
> > > > 6: rs11867934 0.0757972 0.0006140 0.00327 580.6436 40.6325 35.39774
> > > >
> > > > so the solution of the first row would be this:
> > > >> sumz(c(0.0979931,0.0054060,0.39160), weights = c(580.6436,40.6325,35.39774), na.action = na.fail)
> > > > sumz =  1.481833 p =  0.06919239
> > > >
> > > > I tried applying the function you wrote:
> > > > helper <- function(x) {
> > > >    p <- sumz(x[2:4], weights = x[5:7])$p
> > > >    p
> > > > }
> > > >
> > > > With:
> > > >
> > > > q$META <- apply(q, MARGIN = 1, helper)
> > > >
> > > > # I want to make a new column in q named META with results
> > > > but I got this error:
> > > >   Error in sumz(x[2:4], weights = x[5:7]) :
> > > >    Must have at least two valid p values
> > > >
> > > > Please advise,
> > > > Ana
> > > >
> > > > On Sun, Oct 27, 2019 at 9:49 AM Michael Dewey <[hidden email]> wrote:
> > > >>
> > > >> Dear Ana
> > > >>
> > > >> There must be several ways of doing this but see below for an idea with
> > > >> comments in-line.
> > > >>
> > > >> On 26/10/2019 00:31, Ana Marija wrote:
> > > >>> Hello,
> > > >>>
> > > > >>> I would like to use the metap package
> > > > >>> to calculate multiple p values
> > > >>>
> > > >>> I have my data frame with 3 p values
> > > >>>> head(tt)
> > > >>>             RS            G           E          B
> > > >>> 1: rs2089177   0.9986   0.7153   0.604716
> > > >>> 2: rs4360974   0.9738   0.7838   0.430228
> > > >>> 3: rs6502526   0.9744   0.7839   0.429160
> > > >>> 4: rs8069906   0.7184   0.4918   0.521452
> > > >>> 5: rs9905280   0.7205   0.4861   0.465758
> > > >>> 6: rs4313843   0.9804   0.8522   0.474313
> > > >>>
> > > >>> and data frame with corresponding weights for each of the p values
> > > >>> from the tt data frame
> > > >>>
> > > >>>> head(df)
> > > >>>          wg       we             wb                RS
> > > >>> 1 40.6325 35.39774 580.6436 rs2089177
> > > >>> 2 40.6325 35.39774 580.6436 rs4360974
> > > >>> 3 40.6325 35.39774 580.6436 rs6502526
> > > >>> 4 40.6325 35.39774 580.6436 rs8069906
> > > >>> 5 40.6325 35.39774 580.6436 rs9905280
> > > >>> 6 40.6325 35.39774 580.6436 rs4313843
> > > >>>
> > > >>> RS column is the same in df and tt
> > > >>>
> > > >>
> > > >> So you can create a new data-frame with merge()
> > > >>
> > > >> newdata <- merge(tt, df)
> > > >>
> > > >> which will use RS as the key to merge them on.
> > > >>
> > > > >> Then write a function of one argument, a seven-element vector, which
> > > >> picks out the p-values and the weights and feeds them to sumz().
> > > >> Something like
> > > >>
> > > >> helper <- function(x) {
> > > >>    p <- sumz(x[2:4], weights = x[5:7])$p
> > > >>    p
> > > >> }
> > > > >> Note you need to check that 2:4 and 5:7 are actually where they are in
> > > > >> the rows of newdata.
> > > > >>
> > > > >> Then use apply() to apply that to the rows of newdata.
> > > >>
> > > >> I have not tested any of this but the general idea should be OK even if
> > > >> the details are wrong.
> > > >>
> > > >> Michael
> > > >>
> > > >>
> > > > >>> How can I use the sumz() function to create a new data frame which
> > > > >>> looks the same as tt, only with an additional column, say named
> > > > >>> "META", which holds the calculated meta p value for each row?
> > > >>>
> > > > >>> This is an example of what the p value for the first row would be:
> > > >>>
> > > >>>> sumz(c(0.9986,0.7153,0.604716), weights = c(40.6325,35.39774,580.6436), na.action = na.fail)
> > > >>> p =  0.6940048
> > > >>>
> > > >>> Thanks
> > > >>> Ana
> > > >>>
> > > >>>
> > > >>
> > > >> --
> > > >> Michael
> > > >> http://www.dewey.myzen.co.uk/home.html
> > > >
> > >
> > > --
> > > Michael
> > > http://www.dewey.myzen.co.uk/home.html
