issue with numeric

issue with numeric

anikaM
Hello,

I was running this code, located at:
https://github.com/swvanderlaan/QTLToolKit/blob/master/SCRIPTS/runFDR_cis.R

Rscript runFDR_cis.R Retina_new_perms_full.txt 0.05 permutations_all

and I got this error:

Processing QTLtools output.
  * Input  = [ Retina_new_perms_full.txt ]
  * FDR    =  0.05
  * Output = [ permutations_all ]

Read Input data. Note: we expect a file with 19 columns i.e. it is
best to use the results from a permutation test.
  * Gene level correction detected
  * Number of molecular phenotypes = 17375
  * Number of NA lines = 15
Error in cor(D[, 18 + exon_offset], D[, 19 + exon_offset]) :
  'x' must be numeric
Calls: cat -> cor
Execution halted

> a=read.table("Retina_new_perms_full.txt", header=T)
> head(a)
               V1   V2     V3     V4 V5   V6      V7
1 ENSG00000227232 chr1  29571  29570  -  983 -828479
2 ENSG00000237613 chr1  36082  36081  - 1006  -38709
3 ENSG00000239945 chr1  91106  91105  - 1169 -782443
4 ENSG00000238009 chr1 133724 133723  - 1340   69986
5 ENSG00000241860 chr1 173863 173862  - 1441 -831895
6 ENSG00000279457 chr1 200323 200322  - 1620 -980529
                                   V8   V9     V10     V11   Effect_allele
1              rs200956863:793429:T:C chr1  858049  858049               T
2                rs13328700:74790:C:G chr1   74790   74790               G
3               rs11240780:808928:C:T chr1  873548  873548               C
4            rs201888535:63735:CCTA:C chr1   63735   63738               C
5 rs61703480:941137:AGCCCCCGCAGCAGT:A chr1 1005757 1005771 AGCCCCCGCAGCAGT
6              rs13374146:1116231:T:C chr1 1180851 1180851               C
  Baseline_allele V12     V13     V14     V15         V16       V17      V18
1               C 404 348.489 1.04262 139.753 0.000741814  0.269459 0.196180
2               C 404 346.832 1.03086 138.165 0.002822220 -0.687290 0.530547
3               T 404 347.109 1.02189 152.726 0.000626379 -0.284821 0.203080
4            CCTA 404 338.804 1.04423 154.301 0.000797573 -0.398402 0.264974
5               A 404 341.822 1.04355 171.178 0.002893770  0.340855 0.638936
6               T 404 338.232 1.05240 180.879 0.001846080 -0.458547 0.528947
       V19
1 0.198142
2 0.529105
3 0.199394
4 0.261441
5 0.633917
6 0.524186


Please advise,
Ana

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: issue with numeric

Brian Kreeger
*snip*
Error in cor(D[, 18 + exon_offset], D[, 19 + exon_offset]) :
  'x' must be numeric
*snip*

You are applying the correlation function to non-numeric variables.

Brian


Re: issue with numeric

Rui Barradas
In reply to this post by anikaM
Hello,

It's hard to tell without data but:


1) The data is read in on line 19 of the script. Check whether it has 19
columns and whether columns 18 and 19 are numeric.
If they are of class factor, run

D[18:19] <- lapply(D[18:19], function(x) as.numeric(as.character(x)))


2) Line 20 of the script is
exon_offset = ifelse(ncol(D) == 19, 0, 2)

So if the data has 19 columns, as expected, exon_offset = 0.

3) The error comes from computing the correlation between D[, 18 +
exon_offset] and D[, 19 + exon_offset] in code line 27:


cat("  * Correlation between Beta approx. and Empirical p-values =",
round(cor(D[, 18+exon_offset], D[, 19+exon_offset]), 4), "\n")

This includes cor(D[, 18+exon_offset], D[, 19+exon_offset])


So check the values of ncol(D) and of exon_offset. The column index
19 + exon_offset cannot be bigger than ncol(D).
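
The three checks above can be run together on a small stand-in for D. This is only a sketch: the toy data frame and its values are made up for illustration (the real D comes from read.table() in the script), and it assumes the last two columns were read as factors.

```r
# Toy stand-in for D: 17 filler columns plus two p-value columns
# that were (wrongly) read as factors. All values are made up.
D <- data.frame(matrix(0, nrow = 4, ncol = 17))
D$V18 <- factor(c("0.19", "0.53", "0.20", "0.26"))
D$V19 <- factor(c("0.20", "0.52", "0.21", "0.27"))

# Check 2: the offset the script derives from the column count
exon_offset <- ifelse(ncol(D) == 19, 0, 2)

# Check 3: the columns passed to cor() must exist ...
cols <- c(18, 19) + exon_offset
stopifnot(max(cols) <= ncol(D))

# Check 1: ... and must be numeric
print(sapply(D[cols], class))

# Convert via as.character(); as.numeric() on a factor alone
# would return the integer level codes, not the values
D[cols] <- lapply(D[cols], function(x) as.numeric(as.character(x)))
r <- cor(D[, cols[1]], D[, cols[2]])
print(r)
```

If the class check already reports "numeric" or "integer" for both columns, the conversion step is unnecessary and the cause lies elsewhere.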


If none of the above helps, let us know.


Hope this helps,

Rui Barradas





Re: issue with numeric

Ivan Krylov
In reply to this post by anikaM
On Wed, 18 Dec 2019 12:25:24 -0600
Ana Marija <[hidden email]> wrote:

> Error in cor(D[, 18 + exon_offset], D[, 19 + exon_offset]) :
>   'x' must be numeric

Try str(a) to find out the types of the columns. A stray typo could
make a representation of a number impossible to parse and make the
whole column textual. Use
which(is.na(as.numeric(as.character(a[,column_number])))) to find out
the row number where it happened (using extra as.character() here in
case the column is a factor).
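
A minimal sketch of this check on a toy column, not the poster's data. The data frame `a`, the column name `p` and the typo "0.5O3" are all made up for illustration:

```r
# Toy column with one unparseable entry ("0.5O3" ends in a letter O)
a <- data.frame(p = c("0.196", "0.5O3", "0.203", "0.265"),
                stringsAsFactors = TRUE)
str(a)  # p shows up as a Factor, not num

# as.numeric() turns unparseable strings into NA (with a warning),
# so the NA positions point at the offending rows
parsed <- suppressWarnings(as.numeric(as.character(a[, "p"])))
bad_rows <- which(is.na(parsed))
print(bad_rows)
print(a[bad_rows, , drop = FALSE])
```

Note that rows whose value is genuinely missing (NA in the file) are reported as well, so the result has to be compared against the raw entries.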

--
Best regards,
Ivan


Re: issue with numeric

anikaM
Hi Ivan,

here it is:

> str(a)
'data.frame':    17389 obs. of  21 variables:
 $ V1             : Factor w/ 17389 levels "ENSG00000000419",..: 14093
14622 14705 14651 14784 17138 14773 14163 14569 15156 ...
 $ V2             : Factor w/ 22 levels "chr1","chr10",..: 1 1 1 1 1 1
1 1 1 1 ...
 $ V3             : int  29571 36082 91106 133724 173863 200323 259025
297503 348367 493242 ...
 $ V4             : int  29570 36081 91105 133723 173862 200322 259024
297502 348366 493241 ...
 $ V5             : Factor w/ 2 levels "-","+": 1 1 1 1 1 1 1 1 1 1 ...
 $ V6             : int  983 1006 1169 1340 1441 1620 1897 2032 2175 2697 ...
 $ V7             : int  -828479 -38709 -782443 69986 -831895 -980529
-647609 -946918 -631093 -886444 ...
 $ V8             : Factor w/ 17104 levels "1:10095977:G:GT",..: 7339
4761 2344 7480 12580 4781 14856 3061 9397 6938 ...
 $ V9             : Factor w/ 22 levels "chr1","chr10",..: 1 1 1 1 1 1
1 1 1 1 ...
 $ V10            : int  858049 74790 873548 63735 1005757 1180851
906633 1244420 979459 1379685 ...
 $ V11            : int  858049 74790 873548 63738 1005771 1180851
906633 1244420 979459 1379685 ...
 $ Effect_allele  : Factor w/ 358 levels "A","AAAAACAAAAC",..: 267 190
92 92 54 92 190 1 267 267 ...
 $ Baseline_allele: Factor w/ 435 levels "A","AAAAAAAAAATAAAAAT",..:
112 112 325 175 1 325 325 237 112 237 ...
 $ V12            : int  404 404 404 404 404 404 404 404 404 404 ...
 $ V13            : num  348 347 347 339 342 ...
 $ V14            : num  1.04 1.03 1.02 1.04 1.04 ...
 $ V15            : num  140 138 153 154 171 ...
 $ V16            : num  0.000742 0.002822 0.000626 0.000798 0.002894 ...
 $ V17            : num  0.269 -0.687 -0.285 -0.398 0.341 ...
 $ V18            : num  0.196 0.531 0.203 0.265 0.639 ...
 $ V19            : num  0.198 0.529 0.199 0.261 0.634 ...

and this:

> which(is.na(as.numeric(as.character(a[,18]))))
 [1] 10757 11062 11063 11064 11065 11066 11067 11068 11069 11070 11071 11072
[13] 11073 11074 11075
> which(is.na(as.numeric(as.character(a[,19]))))
 [1] 10757 11062 11063 11064 11065 11066 11067 11068 11069 11070 11071 11072
[13] 11073 11074 11075

Columns 18 and 19 seem to be numeric, so what could the issue be?


Re: issue with numeric

anikaM
Hello,

The error was in this line of the script:
D = read.table(opt_input, head = FALSE, stringsAsFactors = FALSE)

It should have header=TRUE there, since my file has a header row.
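
A small reproduction of the effect, assuming a made-up three-column file rather than the real QTLtools output (the column names and values below are invented for illustration):

```r
# Write a tiny made-up file with a header row
tmp <- tempfile(fileext = ".txt")
writeLines(c("id p_beta p_emp",
             "g1 0.196 0.198",
             "g2 0.531 0.529",
             "g3 0.203 0.199"), tmp)

# header = FALSE: the header line becomes data row 1, so every
# column holds at least one non-number and is read as character
D_bad <- read.table(tmp, header = FALSE, stringsAsFactors = FALSE)
print(sapply(D_bad, class))
err <- tryCatch(cor(D_bad[, 2], D_bad[, 3]),
                error = function(e) conditionMessage(e))
print(err)

# header = TRUE: the first line supplies the column names instead
D_ok <- read.table(tmp, header = TRUE, stringsAsFactors = FALSE)
print(sapply(D_ok, class))
r <- cor(D_ok[, 2], D_ok[, 3])
print(r)
unlink(tmp)
```

With the header line read as data, cor() receives character vectors and stops; with it read as column names, the same columns are numeric.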

Sorry for bothering you with this,

Ana
