Warning message: NAs introduced by coercion

Warning message: NAs introduced by coercion

MeriamNF
Dear all,

I have a .csv file called df4 (15752 obs. of 264 variables).
When I run the code below, a warning message keeps coming up and I
cannot continue with further analyses. Afterwards, I want to determine
the maximum and minimum similarity values, draw a heat map, run a
cluster analysis, etc.

> require(SNPRelate)
> library(gdsfmt)
> myd <- read.csv(file = "df4.csv", header = TRUE)
> names(myd)[-1]
> myd[,1]
> myd[1:10, 1:10]
 # the data must be 0,1,2 with 3 as missing so you have r
> sample.id <- names(myd)[-1]
> snp.id <- myd[,1]
> snp.position <- 1:length(snp.id) # not needed for ibs
> snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs
> snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs
# genotype data must have - in 3
> genod <- myd[,-1]
> genod[is.na(genod)] <- 3
> genod[genod=="0"] <- 0
> genod[genod=="1"] <- 2
> genod[1:10,1:10]
> genod <- as.matrix(genod)
> class(genod) <- "numeric"


Warning message:
In class(genod) <- "numeric" : NAs introduced by coercion

I can provide more details if that would help.
Please let me know.

I would appreciate your help.
Thanks,
Meriam


______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: Warning message: NAs introduced by coercion

PIKAL Petr
Hi

See my comments inline.

> > genod <- as.matrix(genod)

A matrix can hold only one type of data, so you probably converted everything to character with this construction.

> > class(genod) <- "numeric"

This tries to convert every value to a number; any value that cannot be converted is set to NA.

Something like this:

> head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
> ir <-head(iris)
> irm <- as.matrix(ir)
> head(irm)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 "5.1"        "3.5"       "1.4"        "0.2"       "setosa"
2 "4.9"        "3.0"       "1.4"        "0.2"       "setosa"
3 "4.7"        "3.2"       "1.3"        "0.2"       "setosa"
4 "4.6"        "3.1"       "1.5"        "0.2"       "setosa"
5 "5.0"        "3.6"       "1.4"        "0.2"       "setosa"
6 "5.4"        "3.9"       "1.7"        "0.4"       "setosa"
> class(irm) <- "numeric"
Warning message:
In class(irm) <- "numeric" : NAs introduced by coercion
> head(irm)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2      NA
2          4.9         3.0          1.4         0.2      NA
3          4.7         3.2          1.3         0.2      NA
4          4.6         3.1          1.5         0.2      NA
5          5.0         3.6          1.4         0.2      NA
6          5.4         3.9          1.7         0.4      NA
>
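
To see in advance which columns would trigger the warning, one option is to test each column separately before converting. This is only a minimal sketch on the same iris subset; the helper name would_be_na is made up for illustration.

```r
# Hypothetical helper: TRUE if converting the column to numeric
# would turn at least one non-missing value into NA.
would_be_na <- function(x) {
  converted <- suppressWarnings(as.numeric(as.character(x)))
  any(is.na(converted) & !is.na(x))
}

ir <- head(iris)

# Names of the columns that cannot be converted cleanly
bad_cols <- names(ir)[sapply(ir, would_be_na)]
bad_cols

# Dropping those columns first gives a numeric matrix without NAs
irm_num <- as.matrix(ir[, !(names(ir) %in% bad_cols)])
```

Here only the Species column is flagged, so the remaining four columns can be turned into a numeric matrix directly.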

Cheers
Petr


Re: Warning message: NAs introduced by coercion

MeriamNF
I see...
Here's a portion of what my data looks like (csv file attached).
I ran the code again and here are the results:

> df4 <- read.csv(file = "mydata.csv", header = TRUE)
> require(SNPRelate)
> library(gdsfmt)
> myd <- df4
> myd <- df4
> names(myd)[-1]
[1] "marker" "X88"    "X9"     "X17"    "X25"
> myd[,1]
[1]  3  4  5  6  8 10
> # the data must be 0,1,2 with 3 as missing so you have r
> sample.id <- names(myd)[-1]
> snp.id <- myd[,1]
> snp.position <- 1:length(snp.id) # not needed for ibs
> snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs
> snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs
> # genotype data must have - in 3
> genod <- myd[,-1]
> genod[is.na(genod)] <- 3
> genod[genod=="0"] <- 0
> genod[genod=="1"] <- 2
> genod2 <- as.matrix(genod)
> head(genod2)
     marker                        X88 X9  X17 X25
[1,] "100023173|F|0-47:G>A-47:G>A" "0" "3" "3" "3"
[2,] "1043336|F|0-7:A>G-7:A>G"     "2" "0" "3" "0"
[3,] "1212218|F|0-49:A>G-49:A>G"   "0" "0" "0" "0"
[4,] "1019554|F|0-14:T>C-14:T>C"   "0" "0" "3" "0"
[5,] "100024550|F|0-16:G>A-16:G>A" "3" "3" "3" "3"
[6,] "1106702|F|0-8:C>A-8:C>A"     "0" "0" "0" "0"
> class(genod2) <- "numeric"
Warning message:
In class(genod2) <- "numeric" : NAs introduced by coercion
> head(genod2)
     marker X88 X9 X17 X25
[1,]     NA   0  3   3   3
[2,]     NA   2  0   3   0
[3,]     NA   0  0   0   0
[4,]     NA   0  0   3   0
[5,]     NA   3  3   3   3
[6,]     NA   0  0   0   0
> class(genod2) <- "numeric"
> class(genod2)
[1] "matrix"
> # read data
> filn <-"simTunesian.gds"
> snpgdsCreateGeno(filn, genmat = genod,
+                  sample.id = sample.id, snp.id = snp.id,
+                  snp.chromosome = snp.chromosome,
+                  snp.position = snp.position,
+                  snp.allele = snp.allele, snpfirstdim=TRUE)
Error in snpgdsCreateGeno(filn, genmat = genod, sample.id = sample.id,  :
  is.matrix(genmat) is not TRUE

Thanks,
Meriam

--
Meriam Nefzaoui
MSc. in Plant Breeding and Genetics
Universidade Federal Rural de Pernambuco (UFRPE) - Recife, Brazil
Re: Warning message: NAs introduced by coercion

Michael Dewey-3
Dear Meriam

Your csv file did not come through, because attachments are stripped
unless they are of certain types, and your post is very hard to read
because you are posting in HTML. Try renaming the file to ????.txt and
set your mailer to send plain text; then people may be able to help you
better.

Michael


--
Michael
http://www.dewey.myzen.co.uk/home.html

Re: Warning message: NAs introduced by coercion

MeriamNF
Here's a portion of what my data looks like (text file format attached).
When running in R, it gives me this:

> df4 <- read.csv(file = "mydata.csv", header = TRUE)
> require(SNPRelate)
> library(gdsfmt)
> myd <- df4
> myd <- df4
> names(myd)[-1]
[1] "marker" "X88"    "X9"     "X17"    "X25"
> myd[,1]
[1]  3  4  5  6  8 10
# the data must be 0,1,2 with 3 as missing so you have r
> sample.id <- names(myd)[-1]
> snp.id <- myd[,1]
> snp.position <- 1:length(snp.id) # not needed for ibs
> snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs
> snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs
# genotype data must have - in 3
> genod <- myd[,-1]
> genod[is.na(genod)] <- 3
> genod[genod=="0"] <- 0
> genod[genod=="1"] <- 2
> genod2 <- as.matrix(genod)
> head(genod2)
     marker                        X88 X9  X17 X25
[1,] "100023173|F|0-47:G>A-47:G>A" "0" "3" "3" "3"
[2,] "1043336|F|0-7:A>G-7:A>G"     "2" "0" "3" "0"
[3,] "1212218|F|0-49:A>G-49:A>G"   "0" "0" "0" "0"
[4,] "1019554|F|0-14:T>C-14:T>C"   "0" "0" "3" "0"
[5,] "100024550|F|0-16:G>A-16:G>A" "3" "3" "3" "3"
[6,] "1106702|F|0-8:C>A-8:C>A"     "0" "0" "0" "0"
> class(genod2) <- "numeric"
Warning message:
In class(genod2) <- "numeric" : NAs introduced by coercion
> head(genod2)
     marker X88 X9 X17 X25
[1,]     NA   0  3   3   3
[2,]     NA   2  0   3   0
[3,]     NA   0  0   0   0
[4,]     NA   0  0   3   0
[5,]     NA   3  3   3   3
[6,]     NA   0  0   0   0
> class(genod2) <- "numeric"
> class(genod2)
[1] "matrix"
# read data
> filn <-"simTunesian.gds"
> snpgdsCreateGeno(filn, genmat = genod,
+                  sample.id = sample.id, snp.id = snp.id,
+                  snp.chromosome = snp.chromosome,
+                  snp.position = snp.position,
+                  snp.allele = snp.allele, snpfirstdim=TRUE)
Error in snpgdsCreateGeno(filn, genmat = genod, sample.id = sample.id,  :
  is.matrix(genmat) is not TRUE

I can't find a solution to my problem. My guess is that it comes from
converting the factor column 'marker' to numeric.

Best,
Meriam


Attachment: mydata.txt (394 bytes)
Re: Warning message: NAs introduced by coercion

David Carlson
Your attached file is not a .csv file since the fields are not separated by commas (just rename mydata.csv to mydata.txt).
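
If the fields are separated by whitespace or tabs rather than commas (an assumption, since the attachment itself is not shown in the thread), read.table() can read such a file. A small sketch with made-up inline data standing in for the real file; the column name "id" is invented for illustration:

```r
# Made-up stand-in for the attachment; the real separator is an assumption
txt <- "id marker X88 X9 X17 X25
3 100023173|F|0-47:G>A-47:G>A 0 NA NA NA
4 1043336|F|0-7:A>G-7:A>G 1 0 NA 0"

# sep = "" (the default) splits on any run of whitespace;
# use sep = "\t" instead if the file is strictly tab-delimited
myd <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)
str(myd)
```

With a real file, the text argument would be replaced by the file name, for example file = "mydata.txt".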

The command "genod2 <- as.matrix(genod)" created a character matrix from the data frame genod.  When you try to force genod2 to numeric, the marker column becomes NAs which is probably not what you want.
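
One way to avoid the NAs is to leave the marker column out before building the matrix. The following is a sketch under assumptions: the toy data frame only imitates the posted output, and the first two columns are assumed to be a row index (named "id" here, a made-up name) and the marker name.

```r
# Toy data imitating the first rows of the posted output
myd <- data.frame(
  id     = c(3, 4, 5),
  marker = c("100023173|F|0-47:G>A-47:G>A",
             "1043336|F|0-7:A>G-7:A>G",
             "1212218|F|0-49:A>G-49:A>G"),
  X88 = c(0, 1, 0),
  X9  = c(NA, 0, 0),
  X17 = c(NA, NA, 0),
  X25 = c(NA, 0, 0)
)

# Sample columns are everything except the index and the marker name
sample.id <- setdiff(names(myd), c("id", "marker"))

# Build the matrix from the genotype columns only, so it is numeric from the start
genod2 <- as.matrix(myd[, sample.id])
genod2[is.na(genod2)] <- 3    # 3 codes a missing genotype
genod2[genod2 == 1]   <- 2    # same recoding as in the original post

is.numeric(genod2)   # TRUE: no character column was mixed in
anyNA(genod2)        # FALSE: nothing was lost to coercion
```

Because no character column enters the matrix, no class(genod2) <- "numeric" step is needed afterwards.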

The error occurs because you passed genod (a data frame) to the snpgdsCreateGeno() function, not genod2 (the matrix you created from genod).
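
The difference between the two objects can be checked directly with is.matrix(). A short sketch with a made-up two-column data frame; the commented call at the end repeats the call from the post with only genmat changed, and it has not been run here:

```r
genod  <- data.frame(X88 = c(0, 2, 0), X9 = c(3, 0, 0))  # made-up values
genod2 <- as.matrix(genod)

is.matrix(genod)    # FALSE: a data frame, which snpgdsCreateGeno() rejects
is.matrix(genod2)   # TRUE

# The call from the post, with genmat pointing at the matrix:
# snpgdsCreateGeno(filn, genmat = genod2,
#                  sample.id = sample.id, snp.id = snp.id,
#                  snp.chromosome = snp.chromosome,
#                  snp.position = snp.position,
#                  snp.allele = snp.allele, snpfirstdim = TRUE)
```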

------------------------------------
David L. Carlson
Department of Anthropology
Texas A&M University

-----Original Message-----
From: R-help [mailto:[hidden email]] On Behalf Of N Meriam
Sent: Tuesday, January 8, 2019 1:38 PM
To: Michael Dewey <[hidden email]>
Cc: [hidden email]
Subject: Re: [R] Warning message: NAs introduced by coercion

Here's a portion of what my data looks like (text file format attached).
When running in R, it gives me this:

> df4 <- read.csv(file = "mydata.csv", header = TRUE)
> require(SNPRelate)
> library(gdsfmt)
> myd <- df4
> myd <- df4
> names(myd)[-1]
[1] "marker" "X88"    "X9"     "X17"    "X25"
> myd[,1]
[1]  3  4  5  6  8 10
# the data must be 0,1,2 with 3 as missing so you have r
> sample.id <- names(myd)[-1]
> snp.id <- myd[,1]
> snp.position <- 1:length(snp.id) # not needed for ibs
> snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs
> snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs
# genotype data must have - in 3
> genod <- myd[,-1]
> genod[is.na(genod)] <- 3
> genod[genod=="0"] <- 0
> genod[genod=="1"] <- 2
> genod2 <- as.matrix(genod)
> head(genod2)
     marker                        X88 X9  X17 X25
[1,] "100023173|F|0-47:G>A-47:G>A" "0" "3" "3" "3"
[2,] "1043336|F|0-7:A>G-7:A>G"     "2" "0" "3" "0"
[3,] "1212218|F|0-49:A>G-49:A>G"   "0" "0" "0" "0"
[4,] "1019554|F|0-14:T>C-14:T>C"   "0" "0" "3" "0"
[5,] "100024550|F|0-16:G>A-16:G>A" "3" "3" "3" "3"
[6,] "1106702|F|0-8:C>A-8:C>A"     "0" "0" "0" "0"
> class(genod2) <- "numeric"
Warning message:
In class(genod2) <- "numeric" : NAs introduced by coercion
> head(genod2)
     marker X88 X9 X17 X25
[1,]     NA   0  3   3   3
[2,]     NA   2  0   3   0
[3,]     NA   0  0   0   0
[4,]     NA   0  0   3   0
[5,]     NA   3  3   3   3
[6,]     NA   0  0   0   0
> class(genod2) <- "numeric"
> class(genod2)
[1] "matrix"
# read data
> filn <-"simTunesian.gds"
> snpgdsCreateGeno(filn, genmat = genod,
+                  sample.id = sample.id, snp.id = snp.id,
+                  snp.chromosome = snp.chromosome,
+                  snp.position = snp.position,
+                  snp.allele = snp.allele, snpfirstdim=TRUE)
Error in snpgdsCreateGeno(filn, genmat = genod, sample.id = sample.id,  :
  is.matrix(genmat) is not TRUE

Can't find a solution to my problem...my guess is that the problem
comes from converting the column 'marker' factor to numerical.

Best,
Meriam

On Tue, Jan 8, 2019 at 11:28 AM Michael Dewey <[hidden email]> wrote:

>
> Dear Meriam
>
> Your csv file did not come through, as attachments are stripped unless of
> certain types, and your post is very hard to read since you are posting in
> HTML. Try renaming the file to ????.txt and set your mailer to send
> plain text; then people may be able to help you better.
>
> Michael
>
>
> --
> Michael
> http://www.dewey.myzen.co.uk/home.html



--
Meriam Nefzaoui
MSc. in Plant Breeding and Genetics
Universidade Federal Rural de Pernambuco (UFRPE) - Recife, Brazil

Re: Warning message: NAs introduced by coercion

MeriamNF
Yes, sorry. I attached the file once again.
Well, still getting the same warning.

> class(genod) <- "numeric"
Warning message:
In class(genod) <- "numeric" : NAs introduced by coercion
> class(genod)
[1] "matrix"

Then, I run the following code and it gives this:

> filn <-"simTunesian.gds"
> snpgdsCreateGeno(filn, genmat = genod,
+                  sample.id = sample.id, snp.id = snp.id,
+                  snp.chromosome = snp.chromosome,
+                  snp.position = snp.position,
+                  snp.allele = snp.allele, snpfirstdim=TRUE)
> # calculate similarity matrix
> # Open the GDS file
> (genofile <- snpgdsOpen(filn))
File: C:\Users\DELL\Documents\TEST\simTunesian.gds (1.4M)
+    [  ] *
|--+ sample.id   { Str8 363 ZIP_ra(42.5%), 755B }
|--+ snp.id   { Int32 15752 ZIP_ra(35.1%), 21.6K }
|--+ snp.position   { Int32 15752 ZIP_ra(34.7%), 21.3K }
|--+ snp.chromosome   { Float64 15752 ZIP_ra(0.18%), 230B }
|--+ snp.allele   { Str8 15752 ZIP_ra(0.16%), 108B }
\--+ genotype   { Bit2 15752x363, 1.4M } *
> ibs <- snpgdsIBS(genofile, remove.monosnp = FALSE, num.thread=1)
Identity-By-State (IBS) analysis on genotypes:
Excluding 0 SNP on non-autosomes
Working space: 363 samples, 15,752 SNPs
    using 1 (CPU) core
IBS:    the sum of all selected genotypes (0,1,2) = 3658952
Tue Jan 08 15:38:00 2019    (internal increment: 42880)
[==================================================] 100%, completed in 0s
Tue Jan 08 15:38:00 2019    Done.
> # maximum similarity value
> max(ibs$ibs)
[1] NaN
> # minimum similarity value
> min(ibs$ibs)
[1] NaN

As you can see, I can't continue my analysis (heat map plot,
clustering with hclust) because values are NaN.
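
One plausible explanation, not confirmed in this thread, is that the coerced matrix still carries the "marker" column as a sample with no genotypes at all, so every similarity involving it is undefined. The sketch below uses a made-up matrix and base R only; it shows how such all-missing columns could be found and dropped, and how max() and min() behave once NaN values are skipped.

```r
# Toy numeric genotype matrix (SNPs in rows, samples in columns); values are made up.
# The "marker" column mimics what class(x) <- "numeric" leaves behind: all NA.
genmat <- cbind(marker = c(NA, NA, NA),
                X88    = c(0, 2, 3),
                X9     = c(3, 0, 0))

# Samples (columns) with no usable genotype, counting both NA and the code 3 as missing.
all_missing <- colSums(!is.na(genmat) & genmat != 3) == 0
names(which(all_missing))   # "marker"

# Drop those columns before building the GDS file.
genmat_ok <- genmat[, !all_missing, drop = FALSE]

# A made-up similarity matrix containing NaN: skip the NaN values explicitly.
sim <- matrix(c(1, 0.8, NaN, 0.8, 1, NaN, NaN, NaN, NaN), nrow = 3)
max(sim, na.rm = TRUE)   # 1
min(sim, na.rm = TRUE)   # 0.8
```

Using na.rm = TRUE only hides the symptom; removing the non-genotype column before creating the file addresses the likely cause.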




--
Meriam Nefzaoui
MSc. in Plant Breeding and Genetics
Universidade Federal Rural de Pernambuco (UFRPE) - Recife, Brazil


Re: Warning message: NAs introduced by coercion

PIKAL Petr
In reply to this post by MeriamNF
Hm,

you should use dput() for sharing data, but my suggestion was correct.

You converted genod to genod2 with as.matrix(), which changed it to a "character" matrix, because a matrix can hold only one type of data. When you then try to change it to numeric, everything that looks like a number is converted, and whatever cannot be converted is simply set to NA (with a polite warning).
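
To see which entries are responsible for the warning, the coercion can be tried quietly and the failures located. This is only a sketch on a made-up character matrix; the object names are hypothetical.

```r
# Made-up character matrix shaped like the one as.matrix() produced in the thread.
m <- cbind(marker = c("1043336|F|0-7:A>G-7:A>G", "1212218|F|0-49:A>G-49:A>G"),
           X88    = c("2", "0"),
           X9     = c("0", "0"))

# Try the conversion without the warning; as.numeric() returns a plain vector
# in the same element order as the matrix.
num <- suppressWarnings(as.numeric(m))

# An entry is "bad" if it was not NA before but is NA after conversion.
bad <- matrix(is.na(num) & !is.na(m), nrow = nrow(m), dimnames = dimnames(m))

# Number of unconvertible entries per column.
colSums(bad)
```

Columns with a non-zero count hold text that can never become a number and should be removed before the conversion.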

You should read the documentation for
snpgdsCreateGeno,
as it requires a matrix as input, and maybe also pay attention to basic documents such as R-intro, which would teach you the difference between a matrix and a data frame (chapter 3).
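
Both points can be illustrated in a few lines of base R. The data frame below is made up, and this is only a sketch: dput() prints a text form of an object that can be pasted into a plain-text email, and the last lines show how a data frame keeps one type per column while a matrix has a single type for everything.

```r
# Small made-up example in place of the real data.
small <- data.frame(marker = c("m1", "m2"),
                    X88    = c(0, 2),
                    X9     = c(3, 0),
                    stringsAsFactors = FALSE)

# Print a text representation that can be pasted into an email.
dput(small)

# A reader can rebuild the object from that text.
txt  <- paste(capture.output(dput(small)), collapse = "\n")
copy <- eval(parse(text = txt))
same <- isTRUE(all.equal(copy, small))

# A data frame keeps one class per column; a matrix has a single type.
sapply(small, class)
typeof(as.matrix(small))       # "character"
typeof(as.matrix(small[-1]))   # "double"
```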

Cheers
Petr


From: N Meriam <[hidden email]>
Sent: Tuesday, January 8, 2019 4:36 PM
To: PIKAL Petr <[hidden email]>
Cc: [hidden email]
Subject: Re: [R] Warning message: NAs introduced by coercion

I see...
Here's a portion of what my data looks like (csv file attached).
I run again and here are the results:


> df4 <- read.csv(file = "mydata.csv", header = TRUE)
> require(SNPRelate)
> library(gdsfmt)
> myd <- df4
> myd <- df4
> names(myd)[-1]
[1] "marker" "X88"    "X9"     "X17"    "X25"
> myd[,1]
[1]  3  4  5  6  8 10
> # the data must be 0,1,2 with 3 as missing so you have r
> sample.id <- names(myd)[-1]
> snp.id <- myd[,1]
> snp.position <- 1:length(snp.id) # not needed for ibs
> snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs
> snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs
> # genotype data must have - in 3
> genod <- myd[,-1]
> genod[is.na(genod)] <- 3
> genod[genod=="0"] <- 0
> genod[genod=="1"] <- 2
> genod2 <- as.matrix(genod)
> head(genod2)
     marker                        X88 X9  X17 X25
[1,] "100023173|F|0-47:G>A-47:G>A" "0" "3" "3" "3"
[2,] "1043336|F|0-7:A>G-7:A>G"     "2" "0" "3" "0"
[3,] "1212218|F|0-49:A>G-49:A>G"   "0" "0" "0" "0"
[4,] "1019554|F|0-14:T>C-14:T>C"   "0" "0" "3" "0"
[5,] "100024550|F|0-16:G>A-16:G>A" "3" "3" "3" "3"
[6,] "1106702|F|0-8:C>A-8:C>A"     "0" "0" "0" "0"
> class(genod2) <- "numeric"
Warning message:
In class(genod2) <- "numeric" : NAs introduced by coercion
> head(genod2)
     marker X88 X9 X17 X25
[1,]     NA   0  3   3   3
[2,]     NA   2  0   3   0
[3,]     NA   0  0   0   0
[4,]     NA   0  0   3   0
[5,]     NA   3  3   3   3
[6,]     NA   0  0   0   0
> class(genod2) <- "numeric"
> class(genod2)
[1] "matrix"
> # read data
> filn <-"simTunesian.gds"
> snpgdsCreateGeno(filn, genmat = genod,
+                  sample.id = sample.id, snp.id = snp.id,
+                  snp.chromosome = snp.chromosome,
+                  snp.position = snp.position,
+                  snp.allele = snp.allele, snpfirstdim=TRUE)
Error in snpgdsCreateGeno(filn, genmat = genod, sample.id = sample.id,  :
  is.matrix(genmat) is not TRUE
Thanks,
Meriam



--
Meriam Nefzaoui
MSc. in Plant Breeding and Genetics
Universidade Federal Rural de Pernambuco (UFRPE) - Recife, Brazil



Re: Warning message: NAs introduced by coercion

PIKAL Petr
In reply to this post by MeriamNF
And as you use a Bioconductor-related package, you could probably get better answers from the specialised Bioconductor help:

https://www.bioconductor.org/help/

Cheers
Petr

From: N Meriam <[hidden email]>
Sent: Tuesday, January 8, 2019 4:36 PM
To: PIKAL Petr <[hidden email]>
Cc: [hidden email]
Subject: Re: [R] Warning message: NAs introduced by coercion

I see...
Here's a portion of what my data looks like (csv file attached).
I run again and here are the results:


> df4 <- read.csv(file = "mydata.csv", header = TRUE)
> require(SNPRelate)
> library(gdsfmt)
> myd <- df4
> myd <- df4
> names(myd)[-1]
[1] "marker" "X88"    "X9"     "X17"    "X25"
> myd[,1]
[1]  3  4  5  6  8 10

> # the data must be 0,1,2 with 3 as missing so you have r
> sample.id <- names(myd)[-1]
> snp.id <- myd[,1]
> snp.position <- 1:length(snp.id) # not needed for ibs
> snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs
> snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs
> # genotype data must have - in 3
> genod <- myd[,-1]
> genod[is.na(genod)] <- 3
> genod[genod=="0"] <- 0
> genod[genod=="1"] <- 2
> genod2 <- as.matrix(genod)
> head(genod2)
     marker                        X88 X9  X17 X25
[1,] "100023173|F|0-47:G>A-47:G>A" "0" "3" "3" "3"
[2,] "1043336|F|0-7:A>G-7:A>G"     "2" "0" "3" "0"
[3,] "1212218|F|0-49:A>G-49:A>G"   "0" "0" "0" "0"
[4,] "1019554|F|0-14:T>C-14:T>C"   "0" "0" "3" "0"
[5,] "100024550|F|0-16:G>A-16:G>A" "3" "3" "3" "3"
[6,] "1106702|F|0-8:C>A-8:C>A"     "0" "0" "0" "0"

> class(genod2) <- "numeric"
Warning message:
In class(genod2) <- "numeric" : NAs introduced by coercion
> head(genod2)
     marker X88 X9 X17 X25
[1,]     NA   0  3   3   3
[2,]     NA   2  0   3   0
[3,]     NA   0  0   0   0
[4,]     NA   0  0   3   0
[5,]     NA   3  3   3   3
[6,]     NA   0  0   0   0
> class(genod2) <- "numeric"
> class(genod2)
[1] "matrix"
> # read data
> filn <-"simTunesian.gds"
> snpgdsCreateGeno(filn, genmat = genod,
+                  sample.id = sample.id, snp.id = snp.id,
+                  snp.chromosome = snp.chromosome,
+                  snp.position = snp.position,
+                  snp.allele = snp.allele, snpfirstdim=TRUE)
Error in snpgdsCreateGeno(filn, genmat = genod, sample.id = sample.id,  :
  is.matrix(genmat) is not TRUE

Thanks,
Meriam

On Tue, Jan 8, 2019 at 9:02 AM PIKAL Petr <[hidden email]> wrote:
Hi

see in line

> -----Original Message-----
> From: R-help <[hidden email]> On Behalf Of N Meriam
> Sent: Tuesday, January 8, 2019 3:08 PM
> To: [hidden email]
> Subject: [R] Warning message: NAs introduced by coercion
>
> Dear all,
>
> I have a .csv file called df4. (15752 obs. of 264 variables).
> I apply this code but couldn't continue further other analyses, a warning
> message keeps coming up. Then, I want to determine max and min
> similarity values,
> heat map plot, cluster...etc
>
> > require(SNPRelate)
> > library(gdsfmt)
> > myd <- read.csv(file = "df4.csv", header = TRUE)
> > names(myd)[-1]
> myd[,1]
> > myd[1:10, 1:10]
>  # the data must be 0,1,2 with 3 as missing so you have r
> > sample.id <- names(myd)[-1]
> > snp.id <- myd[,1]
> > snp.position <- 1:length(snp.id) # not needed for ibs
> > snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs
> > snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs
> # genotype data must have - in 3
> > genod <- myd[,-1]
> > genod[is.na(genod)] <- 3
> > genod[genod=="0"] <- 0
> > genod[genod=="1"] <- 2
> > genod[1:10,1:10]
> > genod <- as.matrix(genod)

A matrix can hold only one type of data, so you probably converted everything to character with this construction.

> > class(genod) <- "numeric"

This tries to convert every value to a number; any value that cannot be converted is set to NA.

Something like this:

> head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
> ir <-head(iris)
> irm <- as.matrix(ir)
> head(irm)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 "5.1"        "3.5"       "1.4"        "0.2"       "setosa"
2 "4.9"        "3.0"       "1.4"        "0.2"       "setosa"
3 "4.7"        "3.2"       "1.3"        "0.2"       "setosa"
4 "4.6"        "3.1"       "1.5"        "0.2"       "setosa"
5 "5.0"        "3.6"       "1.4"        "0.2"       "setosa"
6 "5.4"        "3.9"       "1.7"        "0.4"       "setosa"
> class(irm) <- "numeric"
Warning message:
In class(irm) <- "numeric" : NAs introduced by coercion
> head(irm)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2      NA
2          4.9         3.0          1.4         0.2      NA
3          4.7         3.2          1.3         0.2      NA
4          4.6         3.1          1.5         0.2      NA
5          5.0         3.6          1.4         0.2      NA
6          5.4         3.9          1.7         0.4      NA
>
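
To see exactly which entries trigger the warning before forcing the conversion, something along these lines may help. It is only a sketch on a made-up character matrix; the object names `m`, `num` and `bad` are assumptions and do not come from the original post.

```r
# Toy character matrix: one text column next to two genotype columns
m <- cbind(marker = c("snpA", "snpB", "snpC"),
           X88    = c("0", "2", "3"),
           X9     = c("3", "0", "0"))

# Convert quietly (as.numeric drops the dimensions, so rebuild the matrix)
num <- suppressWarnings(matrix(as.numeric(m), nrow = nrow(m),
                               dimnames = dimnames(m)))

# Flag entries that were not NA before the conversion but are NA after it
bad <- is.na(num) & !is.na(m)

# Number of unconvertible entries per column
colSums(bad)
```

Any column with a non-zero count holds text that cannot be read as a number and is a candidate for removal before the conversion.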

Cheers
Petr


>
>
> *Warning message:In class(genod) <- "numeric" : NAs introduced by coercion*
>
> Maybe I could illustrate more with details so I can be more specific?
> Please, let me know.
>
> I would appreciate your help.
> Thanks,
> Meriam
>


--
Meriam Nefzaoui
MSc. in Plant Breeding and Genetics
Universidade Federal Rural de Pernambuco (UFRPE) - Recife, Brazil



Re: Warning message: NAs introduced by coercion

David Carlson
In reply to this post by MeriamNF
Now you have pushed a numeric matrix to the function with a column of missing values. No wonder you do not get any results.
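
One possible way to avoid the NA column is sketched below on a toy data frame that mimics the rows printed earlier in the thread. It assumes the first column is only a row index (named `X` here, which is a guess) and the second one holds the marker names, as the printed output suggests. `geno_cols` and `geno` are hypothetical names, and the 0/1 recoding step from the original code is left out.

```r
# Toy data mimicking the printed rows (values are illustrative only)
myd <- data.frame(X      = c(3, 4, 5),
                  marker = c("100023173|F|0-47:G>A-47:G>A",
                             "1043336|F|0-7:A>G-7:A>G",
                             "1212218|F|0-49:A>G-49:A>G"),
                  X88    = c(0, 2, 0),
                  X9     = c(NA, 0, 0),
                  stringsAsFactors = FALSE)

# Keep only the genotype columns; the index and marker columns are excluded
geno_cols <- setdiff(names(myd), c("X", "marker"))
sample.id <- geno_cols

# All remaining columns are numeric, so as.matrix() gives a numeric matrix
geno <- as.matrix(myd[, geno_cols])
geno[is.na(geno)] <- 3   # 3 is the missing code used in the thread

# Checks worth running before the matrix is handed on
stopifnot(is.matrix(geno), is.numeric(geno), !anyNA(geno),
          ncol(geno) == length(sample.id))
```

The resulting `geno` matrix, rather than the data frame, is what would then be passed as `genmat`.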

David C

-----Original Message-----
From: N Meriam [mailto:[hidden email]]
Sent: Tuesday, January 8, 2019 3:44 PM
To: David L Carlson <[hidden email]>
Cc: Michael Dewey <[hidden email]>; [hidden email]
Subject: Re: [R] Warning message: NAs introduced by coercion

Yes, sorry. I attached the file once again.
Well, still getting the same warning.

> class(genod) <- "numeric"
Warning message:
In class(genod) <- "numeric" : NAs introduced by coercion
> class(genod)
[1] "matrix"

Then, I run the following code and it gives this:

> filn <-"simTunesian.gds"
> snpgdsCreateGeno(filn, genmat = genod,
+                  sample.id = sample.id, snp.id = snp.id,
+                  snp.chromosome = snp.chromosome,
+                  snp.position = snp.position,
+                  snp.allele = snp.allele, snpfirstdim=TRUE)
> # calculate similarity matrix
> # Open the GDS file
> (genofile <- snpgdsOpen(filn))
File: C:\Users\DELL\Documents\TEST\simTunesian.gds (1.4M)
+    [  ] *
|--+ sample.id   { Str8 363 ZIP_ra(42.5%), 755B }
|--+ snp.id   { Int32 15752 ZIP_ra(35.1%), 21.6K }
|--+ snp.position   { Int32 15752 ZIP_ra(34.7%), 21.3K }
|--+ snp.chromosome   { Float64 15752 ZIP_ra(0.18%), 230B }
|--+ snp.allele   { Str8 15752 ZIP_ra(0.16%), 108B }
\--+ genotype   { Bit2 15752x363, 1.4M } *
> ibs <- snpgdsIBS(genofile, remove.monosnp = FALSE, num.thread=1)
Identity-By-State (IBS) analysis on genotypes:
Excluding 0 SNP on non-autosomes
Working space: 363 samples, 15,752 SNPs
    using 1 (CPU) core
IBS:    the sum of all selected genotypes (0,1,2) = 3658952
Tue Jan 08 15:38:00 2019    (internal increment: 42880)
[==================================================] 100%, completed in 0s
Tue Jan 08 15:38:00 2019    Done.
> # maximum similarity value
> max(ibs$ibs)
[1] NaN
> # minimum similarity value
> min(ibs$ibs)
[1] NaN

As you can see, I can't continue my analysis (heat map plot,
clustering with hclust) because values are NaN.
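
Once the similarity matrix contains only finite values, the clustering step mentioned above could look roughly like this. A toy 4 x 4 matrix `sim` stands in for `ibs$ibs`; turning similarity into a distance as `1 - sim`, the choice of average linkage and all object names are assumptions made for illustration.

```r
# Toy symmetric similarity matrix standing in for ibs$ibs
sim <- matrix(c(1.0, 0.9, 0.2, 0.1,
                0.9, 1.0, 0.3, 0.2,
                0.2, 0.3, 1.0, 0.8,
                0.1, 0.2, 0.8, 1.0),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("S", 1:4), paste0("S", 1:4)))

# Stop early if any value is NA or NaN, as in the case above
stopifnot(!anyNA(sim))

# Turn similarity into a distance and cluster hierarchically
hc     <- hclust(as.dist(1 - sim), method = "average")
groups <- cutree(hc, k = 2)
groups
```

The same matrix can also be given to `heatmap()` for the plot.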


On Tue, Jan 8, 2019 at 2:01 PM David L Carlson <[hidden email]> wrote:

>
> Your attached file is not a .csv file, since the fields are not separated by commas (just rename mydata.csv to mydata.txt).
>
> The command "genod2 <- as.matrix(genod)" created a character matrix from the data frame genod.  When you try to force genod2 to numeric, the marker column becomes NAs which is probably not what you want.
>
> The error message appears because you passed genod (a data frame) to the snpgdsCreateGeno() function, not genod2 (the matrix you created from genod).
>
> ------------------------------------
> David L. Carlson
> Department of Anthropology
> Texas A&M University
>
> -----Original Message-----
> From: R-help [mailto:[hidden email]] On Behalf Of N Meriam
> Sent: Tuesday, January 8, 2019 1:38 PM
> To: Michael Dewey <[hidden email]>
> Cc: [hidden email]
> Subject: Re: [R] Warning message: NAs introduced by coercion
>
> Here's a portion of what my data looks like (text file format attached).
> When running in R, it gives me this:
>
> > df4 <- read.csv(file = "mydata.csv", header = TRUE)
> > require(SNPRelate)
> > library(gdsfmt)
> > myd <- df4
> > myd <- df4
> > names(myd)[-1]
> [1] "marker" "X88"    "X9"     "X17"    "X25"
> > myd[,1]
> [1]  3  4  5  6  8 10
> # the data must be 0,1,2 with 3 as missing so you have r
> > sample.id <- names(myd)[-1]
> > snp.id <- myd[,1]
> > snp.position <- 1:length(snp.id) # not needed for ibs
> > snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs
> > snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs
> # genotype data must have - in 3
> > genod <- myd[,-1]
> > genod[is.na(genod)] <- 3
> > genod[genod=="0"] <- 0
> > genod[genod=="1"] <- 2
> > genod2 <- as.matrix(genod)
> > head(genod2)
>      marker                        X88 X9  X17 X25
> [1,] "100023173|F|0-47:G>A-47:G>A" "0" "3" "3" "3"
> [2,] "1043336|F|0-7:A>G-7:A>G"     "2" "0" "3" "0"
> [3,] "1212218|F|0-49:A>G-49:A>G"   "0" "0" "0" "0"
> [4,] "1019554|F|0-14:T>C-14:T>C"   "0" "0" "3" "0"
> [5,] "100024550|F|0-16:G>A-16:G>A" "3" "3" "3" "3"
> [6,] "1106702|F|0-8:C>A-8:C>A"     "0" "0" "0" "0"
> > class(genod2) <- "numeric"
> Warning message: In class(genod2) <- "numeric" : NAs introduced by coercion
> > head(genod2)
>      marker X88 X9 X17 X25
> [1,]     NA   0  3   3   3
> [2,]     NA   2  0   3   0
> [3,]     NA   0  0   0   0
> [4,]     NA   0  0   3   0
> [5,]     NA   3  3   3   3
> [6,]     NA   0  0   0   0
> > class(genod2) <- "numeric"
> > class(genod2)
> [1] "matrix"
> # read data
> > filn <-"simTunesian.gds"
> > snpgdsCreateGeno(filn, genmat = genod,
> +                  sample.id = sample.id, snp.id = snp.id,
> +                  snp.chromosome = snp.chromosome,
> +                  snp.position = snp.position,
> +                  snp.allele = snp.allele, snpfirstdim=TRUE)
> Error in snpgdsCreateGeno(filn, genmat = genod, sample.id = sample.id,
>  :   is.matrix(genmat) is not TRUE
>
> Can't find a solution to my problem...my guess is that the problem
> comes from converting the column 'marker' factor to numerical.
>
> Best,
> Meriam
>
> On Tue, Jan 8, 2019 at 11:28 AM Michael Dewey <[hidden email]> wrote:
> >
> > Dear Meriam
> >
> > Your csv file did not come through, as attachments are stripped unless they are of
> > certain types, and your post is very hard to read since you are posting in
> > HTML. Try renaming the file to ????.txt and set your mailer to send
> > plain text; then people may be able to help you better.
> >
> > Michael
> >
> >
> > --
> > Michael
> > http://www.dewey.myzen.co.uk/home.html
>
>
>
> --
> Meriam Nefzaoui
> MSc. in Plant Breeding and Genetics
> Universidade Federal Rural de Pernambuco (UFRPE) - Recife, Brazil



--
Meriam Nefzaoui
MSc. in Plant Breeding and Genetics
Universidade Federal Rural de Pernambuco (UFRPE) - Recife, Brazil