R as.numeric()


Lutz Fischer
Hi,

I have a bit of a problem with as.numeric or as.double.

I read in an Excel file (with either xlsx::read.xlsx2 or gdata::read.xls),
select a subset, and then try to make it numeric:


# read in the Excel file (read.xlsx2 comes from the xlsx package)
library(xlsx)
alldata <- read.xlsx2("input.xls", 1)
# select the subset
s <- subset(alldata, select = c("cI", "cII", "cIII", "cIV", "cV"))
# unluckily we have "n/a" for missing values in the file - so we turn it
# into "proper" missing values
s[s == "n/a"] <- NA

n <- data.matrix(s)




The problem I have is that it does not convert the data the way I would
expect.

just as an example:
 > s[1,2]
[1] 30.94346629
3136 Levels: 0.026307482 0.028239812 0.02849896 0.029054564 0.029540352
0.030248034 0.030841352 0.032966308 ... n/a

turned into:
 > n[1,2]
[1] 3020

I would like to get 30.94346629 there as well. I assume this has to
do with the "Levels" attribute, but I am not sure what to make of it in
the first place.

I also tried to convert each value on its own:

#make some space that holds the actual numeric data
n <- array(dim=c(length(s[,1]),length(s)))
# now turn everything into doubles
for (c in 1:length(s)) {
   for (r in 1:length(s[,1])) {
     n[r,c]<-as.double(s[r,c])
   }
}

but that gave the same result - just a lot slower.



Thanks
Lutz


--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: R as.numeric()

David Scott-6
On 25/05/2011 9:20 a.m., Lutz Fischer wrote:

> Hi,
>
> I have a bit of a problem with as.numeric or as.double.
>
Your problem is the conversion to factors when the data is read. Use

options(stringsAsFactors = FALSE)

before you read the data, then the mixed columns of numeric and missing
will be read as character data and the conversion to numeric will go as
you expect. (But I haven't tested this.)
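
To illustrate the suggestion above, here is a minimal sketch that uses read.csv on an inline string as a stand-in for the Excel import. The data and the names csv_text, as_factor, as_char, bad and good are made up for illustration, and the sketch assumes the Excel readers treat string columns the way read.csv does, which it does not verify. Since R 4.0.0 the default for stringsAsFactors is FALSE, so the factor case has to be requested explicitly here.

```r
# Hypothetical stand-in for the spreadsheet: column cII contains "n/a"
csv_text <- "cI,cII
30.94346629,0.5
0.026307482,n/a
2.5,10.25"

# Strings read as factors (the default before R 4.0.0)
as_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)
is.factor(as_factor$cII)          # TRUE, because "n/a" is not a number
bad <- as.numeric(as_factor$cII)  # integer codes of the levels, not the values

# Strings read as character, then "n/a" replaced by a real missing value
as_char <- read.csv(text = csv_text, stringsAsFactors = FALSE)
as_char$cII[as_char$cII == "n/a"] <- NA
good <- as.numeric(as_char$cII)   # 0.5 NA 10.25
```

The sketch converts the column with as.numeric rather than data.matrix, because the documentation of data.matrix in current R versions describes character columns as being converted via factors, so the two routes may not be equivalent.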

David Scott
--
_________________________________________________________________
David Scott Department of Statistics
                The University of Auckland, PB 92019
                Auckland 1142,    NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email: [hidden email],  Fax: +64 9 373 7018

Re: R as.numeric()

Timothy Bates
In reply to this post by Lutz Fischer

On 24 May 2011, at 10:20 PM, Lutz Fischer wrote:
> n<-data.matrix(s);
> > s[1,2]
> [1] 30.94346629 # 3136 Levels: 0.026307482
> turned into:
> > n[1,2]
> [1] 3020

Dear Lutz, “3020” is the integer code of the factor level associated with 30.94346629. The column became a factor because it was imported with strings (the "n/a" entries) in it.

Try passing na.strings = "n/a" to the read.xls function to avoid the problem.

If that does not help, then as.numeric(as.character(datafulloffactors)) will work.

Applied to a factor, as.numeric returns the internal integer codes; as.character returns the “labels”, which as.numeric can then convert.
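
A minimal sketch of both suggestions, using made-up values. The names f, vals, vals2 and direct are hypothetical, and read.csv on an inline string stands in for read.xls so that the example runs with base R only. The second conversion, via levels(f), is the variant given in the R FAQ entry on converting factors to numeric.

```r
# A factor like the one produced by the import, including the "n/a" entry
f <- factor(c("30.94346629", "0.026307482", "n/a", "0.5"))

as.numeric(f)   # integer codes (positions in levels(f)), not the values
levels(f)       # the labels, stored as character strings

# Convert the labels; "n/a" cannot be parsed and becomes NA (with a warning)
vals <- suppressWarnings(as.numeric(as.character(f)))

# Equivalent route: convert each level once, then index by the codes
vals2 <- suppressWarnings(as.numeric(levels(f))[as.integer(f)])

# Declaring "n/a" as a missing-value string at import time avoids the
# factor altogether: the column is read as numeric
direct <- read.csv(text = "cII\n0.5\nn/a\n10.25", na.strings = "n/a",
                   stringsAsFactors = TRUE)
is.numeric(direct$cII)  # TRUE
```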

t

Re: R as.numeric()

Ista Zahn-2
In reply to this post by David Scott-6
This is a FAQ:

http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-do-I-convert-factors-to-numeric_003f

Please try there before posting a question to the list.

Best,
Ista
On Tue, May 24, 2011 at 5:33 PM, David Scott <[hidden email]> wrote:

> Your problem is the conversion to factors when the data is read. Use
>
> options(stringsAsFactors = FALSE)
>
> before you read the data, then the mixed columns of numeric and missing will
> be read as character data and the conversion to numeric will go as you
> expect. (But I haven't tested this.)


--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

Re: R as.numeric()

Lutz Fischer


Thanks a lot for both replies.

If I set up the option as proposed, everything works as I wanted it to.

I guess as.character would work as well. Only then I guess I would need
to loop through the data frame.

Lutz


On 24/05/11 22:42, Ista Zahn wrote:

> This is a FAQ:
>
> http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-do-I-convert-factors-to-numeric_003f
>
> Please try there before posting a question to the list.
>
> Best,
> Ista

Re: R as.numeric()

David Winsemius

On May 25, 2011, at 7:25 AM, Lutz Fischer wrote:

> Thanks a lot for both replies.
>
> If I set up the option as proposed, everything works as I wanted it to.
>
> I guess as.character would work as well. Only then I guess I would need
> to loop through the data frame.

as.character is vectorized. You should not need loops.
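
Because as.character and as.numeric both operate on whole vectors, each column can be converted in a single call. A minimal sketch with a made-up data frame (the values and the helper name to_number are hypothetical; lapply only visits the columns, not the individual cells):

```r
# Made-up data frame whose columns are factors, as after the import
s <- data.frame(
  cI  = factor(c("30.94346629", "0.5", "n/a")),
  cII = factor(c("n/a", "2.25", "0.026307482"))
)

# Convert one whole column: labels first, then "n/a" to NA, then numbers
to_number <- function(col) {
  x <- as.character(col)
  x[x == "n/a"] <- NA
  as.numeric(x)
}

# s[] keeps the data frame structure while replacing every column
s[] <- lapply(s, to_number)

# All columns are numeric now, so this yields a numeric matrix
n <- as.matrix(s)
```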

--
David.


David Winsemius, MD
West Hartford, CT
