
How to create a numeric data.frame

How to create a numeric data.frame

Aparna Sampath
Hi All

I am new to R and I am not sure how this should be done. I have a matrix of
985x100 values and the class is data.frame.

A sample of my dataset looks like this (since it's a huge dataset and it would
make the screen look more complex, I am pasting only the first few rows and
columns):

 V2           V3           V4           V5           V6
2   0.009953966  -0.01586103 -0.016227028  0.016774711 -0.021342598
3  -0.230181145  0.203303786 -0.685321843  0.147050709 -0.122269004
4  -0.552905273 -0.034039644 -0.511356309 -0.330524909 -0.239088566
5  -0.089739322 -0.082768643 -0.411209134 -0.301011664  1.560185991
6  -1.986059137 -0.252217616 -0.369044526 -0.585619405  0.545903757
7  -1.635875161  2.741310455 -0.058411313 -1.458825827  0.078480977
8   0.525846706 -1.134643662 -0.067014844 -1.431990219 -0.557057121
9  -0.913511821  0.688374777  0.376412044 -0.861746434  2.065507172
10 -1.538179621  0.814330376  1.639939042  -1.41478931  1.802738289
11  0.817957993 -0.426560507  2.773380242 -0.123291817  1.316883748


When I try to use this command to convert it to numeric,

as.numeric(leu_cluster1)

I get the error: Error: (list) object cannot be coerced
to type 'double'. I tried several functions and looked into other forums too,
but could not find a solution. I am trying to change it to a numeric data.frame
and not to a matrix.


thanks in advance. :)

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: How to create a numeric data.frame

Sarah Goslee
What are you trying to do? It looks numeric, although a visual
assessment isn't reliable.

The output of str() would be helpful.

But I'm not sure what your objective is. What do you think your data
frame is now, and what do you think it should be?

Sarah



--
Sarah Goslee
http://www.functionaldiversity.org


Re: How to create a numeric data.frame

Joshua Wiley-2
In reply to this post by Aparna Sampath
Hi,

If your matrix is already numeric, then:

as.data.frame(your_matrix_name)

will do the trick.  However, if you have a matrix that is not numeric
(say it is character), then you could use:

as.data.frame(as.numeric(your_matrix_name))
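
One caveat worth checking before relying on the call above: as.numeric() returns a plain vector and drops the matrix dimensions, so the result would be a one-column data frame. The following is a minimal sketch of one way to keep the shape, assuming a small made-up character matrix `m` (a hypothetical stand-in, not data from this thread):

```r
# Hypothetical 2 x 2 character matrix standing in for your_matrix_name
m <- matrix(c("1.5", "2", "-3", "4.25"), nrow = 2,
            dimnames = list(NULL, c("V1", "V2")))

# as.numeric() alone drops the dim attribute, giving one long column
flat <- as.data.frame(as.numeric(m))
dim(flat)

# Changing the storage mode converts the values but keeps rows and columns
storage.mode(m) <- "double"
df <- as.data.frame(m)
str(df)
```

Whether this is preferable to converting column by column depends on whether every column is really meant to be numeric.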

Matrices can only hold one type of data (for example, all numeric, or
all character, or all logical), so if *any* of your data is character
(say one column contains people's names), then the entire matrix will
be character, and calling as.numeric() on it is probably not what you
want (the non-numeric entries will become NA).  In which case, you
might convert the matrix to a data frame first:

as.data.frame(your_matrix_name)

because data frames can contain different classes of data in their
different columns.  Once it is a data frame, you could convert the
columns that should be numeric to numeric (say, columns 2 through 6
only) by:

your_data_name[, 2:6] <- lapply(your_data_name[, 2:6], as.numeric)

For relevant documentation, see

?as.numeric
?as.data.frame
## under the "Details" section, it shows the hierarchy of data types
## that is how I could know that if there is character data, the numeric
## data will be converted up to the character class
?matrix


Hope this helps,

Josh




--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


Re: How to create a numeric data.frame

Barry Rowlingson
In reply to this post by Aparna Sampath
On Mon, Jun 13, 2011 at 11:06 AM, Aparna <[hidden email]> wrote:
> Hi All
>
> I am new to R and  I am not sure of how this should be done. I have a matrix of
> 985x100 values and the class is data.frame.

 You don't have a 'matrix' in the R sense of the word. You seem to
have a table of numbers which are stored in an object of class
'data.frame'.

>  V2           V3           V4           V5           V6
> 2   0.009953966  -0.01586103 -0.016227028  0.016774711 -0.021342598
> 3  -0.230181145  0.203303786 -0.685321843  0.147050709 -0.122269004
> 4  -0.552905273 -0.034039644 -0.511356309 -0.330524909 -0.239088566
> 5  -0.089739322 -0.082768643 -0.411209134 -0.301011664  1.560185991
> 6  -1.986059137 -0.252217616 -0.369044526 -0.585619405  0.545903757
> 7  -1.635875161  2.741310455 -0.058411313 -1.458825827  0.078480977
> 8   0.525846706 -1.134643662 -0.067014844 -1.431990219 -0.557057121
> 9  -0.913511821  0.688374777  0.376412044 -0.861746434  2.065507172
> 10 -1.538179621  0.814330376  1.639939042  -1.41478931  1.802738289
> 11  0.817957993 -0.426560507  2.773380242 -0.123291817  1.316883748
>
>
> When I try to use this command to convert it to numeric,

 A data frame doesn't have an overall sense of itself being numeric or
character. Only the columns have that, and they can be independent.

> as.numeric(leu_cluster1): I get an error Error: (list) object cannot be coerced
> to type 'double'.

 Because a data frame is implemented as a list where each element is
the same length. Each element is a vector of numbers or characters.
You are doing the equivalent of:

 as.numeric(list(foo=c(1,2,3)))

 now you may think it reasonable to do an 'as.numeric' on that, but what about:

 as.numeric(list(foo=list(bar=c(1,2,3),baz=c(34,5)),bar=c("Hello","World")))

  how would you 'as.numeric' that?

> I tried several functions and looked into other forums too,
> but could not find a solution. i am trying to change it to numeric data.frame
> and not to a matrix.

 There is no numeric data frame. There is only a numeric matrix, or a
data frame with all numeric columns. Do str(mydataframe) or
summary(mydataframe) to see what class your columns all are.
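
The points above can be checked directly. This is a small sketch on a made-up two-column data frame (the object `df` and its values are hypothetical, not the poster's data); it shows that a data frame is a list, that each column carries its own class, and that as.numeric() on the whole object raises an error of the kind reported:

```r
# Small made-up data frame (hypothetical, not the poster's data)
df <- data.frame(V2 = c(0.01, -0.23), V3 = c("a", "b"),
                 stringsAsFactors = FALSE)

is.list(df)          # a data frame is a list of equal-length columns
sapply(df, class)    # class of each column, one entry per column

# as.numeric() on the whole data frame fails; capture the message
res <- tryCatch(as.numeric(df), error = function(e) conditionMessage(e))
res
```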

Barry


Re: How to create a numeric data.frame

Joshua Wiley-2
On Mon, Jun 13, 2011 at 7:45 AM, Barry Rowlingson
<[hidden email]> wrote:
[snip]
>  now you may think it reasonable to do an 'as.numeric' on that, but what about:
>
>  as.numeric(list(foo=list(bar=c(1,2,3),baz=c(34,5)),bar=c("Hello","World")))
>
>  how would you 'as.numeric' that?

Well, "Hello" is one of the first words spoken when meeting someone
and in programming, and I think "World" represents not just the earth,
but everything that exists.  Everything seems best represented by a
continuous circle, so logically I would 'as.numeric' that

$foo
$foo$bar
[1] 1 2 3

$foo$baz
[1] 34  5

$bar
[1] 1 0

However, R does not agree with my logic ;)

Josh


Re: How to create a numeric data.frame

Patrizio Frederic
In reply to this post by Barry Rowlingson
On Mon, Jun 13, 2011 at 4:45 PM, Barry Rowlingson
<[hidden email]> wrote:

> On Mon, Jun 13, 2011 at 11:06 AM, Aparna <[hidden email]> wrote:
>> Hi All
>>
>> I am new to R and  I am not sure of how this should be done. I have a matrix of
>> 985x100 values and the class is data.frame.
>
>  You don't have a 'matrix' in the R sense of the word. You seem to
> have a table of numbers which are stored in an object of class
> 'data.frame'.
>

but you could have one:

a <- data.frame(matrix(rnorm(100),10)) # get some data
class(a) # check for its class
as.numeric(a) # whoops, won't work

class(as.matrix(a)) # change class, and
as.numeric(as.matrix(a)) # bingo, it works

PF


--
+-----------------------------------------------------------------------
| Patrizio Frederic,
| http://www.economia.unimore.it/frederic_patrizio/
+-----------------------------------------------------------------------


Re: How to create a numeric data.frame

PIKAL Petr
Hi
[hidden email] wrote on 13.06.2011 17:19:39:

> Patrizio Frederic <[hidden email]>
>
> On Mon, Jun 13, 2011 at 4:45 PM, Barry Rowlingson
> <[hidden email]> wrote:
> > On Mon, Jun 13, 2011 at 11:06 AM, Aparna <[hidden email]> wrote:
> >> Hi All
> >>
> >> I am new to R and  I am not sure of how this should be done. I have a matrix of
> >> 985x100 values and the class is data.frame.
> >
> >  You don't have a 'matrix' in the R sense of the word. You seem to
> > have a table of numbers which are stored in an object of class
> > 'data.frame'.
> >
>
> but you could have one:
>
> a <- data.frame(matrix(rnorm(100),10)) # get some data
> class(a) # check for its class
> as.numeric(a) # whoops, won't work
>
> class(as.matrix(a)) # change class, and
> as.numeric(as.matrix(a)) # bingo, it works

Which results in a vector of numbers:

str(as.numeric(as.matrix(a)))
 num [1:100] 0.82 -1.339 1.397 0.673 -0.461 ...

A data frame is a convenient list structure which can contain vectors of
various kinds (numeric, character, factor, logical, ...)
and looks quite similar to an Excel table.

A matrix is a vector with two dimensions, but as it is a vector it cannot
consist of objects of different classes. Therefore you can have a
numeric or a character matrix, but not numeric and character columns in the
same matrix.

And a vector is a vector (numeric, character, logical, ...), but again you
cannot mix items of different classes in one vector.
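
The three structures described above can be compared side by side. This is an illustrative sketch with made-up objects (`v`, `m` and `d` are hypothetical names), not code from the thread:

```r
v <- c(1, "a", TRUE)      # mixing types in a vector coerces everything
class(v)                  # "character"

m <- matrix(1:6, nrow = 2)
dim(m)                    # a matrix is a vector plus a dim attribute

d <- data.frame(num = c(1.5, 2), chr = c("x", "y"),
                stringsAsFactors = FALSE)
sapply(d, class)          # each column keeps its own class

as.matrix(d)              # mixed columns collapse to a character matrix
```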

Regards
Petr


Re: How to create a numeric data.frame

Aparna Sampath
In reply to this post by Sarah Goslee
Hi Sarah

Thanks for your advice. My dataset contains all the normalized values. I have to
give this dataset as input to the ClusterCons package in R. In order to run the
package, the data has to be converted to a numeric data.frame.

When I check my data using class(mydataset), it is a data.frame.
But when I try to convert it to numeric using as.numeric(mydataset), it gives me
an error saying

Error: (list) object cannot be coerced to type 'double'.
Could you tell me why this error occurs?

Thanks
Aparna


Re: How to create a numeric data.frame

Aparna Sampath
In reply to this post by Joshua Wiley-2
Hi Joshua

Looking at the data, all the values seem to be numeric. As I mentioned,
the dataset is already a data.frame.

As suggested, I used str(mydata) and got the following result:


str(leu_cluster1)
'data.frame':   984 obs. of  100 variables:
 $ V2  : Factor w/ 986 levels "-0.00257361",..: 543 116 252 54 520 ...
 $ V3  : Factor w/ 986 levels "-0.000790437",..: 7 666 14 32 105 ...
 $ V4  : Factor w/ 986 levels "-0.0023231","-0.004207663",..: 6 353 267 208 187...
 $ V5  : Factor w/ 986 levels "-0.006466083",..: 585 627 146 131 263 ...
 $ V6  : Factor w/ 986 levels "-0.002119173",..: 11 56 111 898 780...
 
 
From this I understood that the columns are not numeric.


There is a function called data_check in the ClusterCons package
I am using for this project. It checks whether the input data is
numeric or not. Using this I could tell that my data is not numeric,
and that is why I was trying to convert it to numeric data.

This forum is of great help since I am able to learn more, and
thanks for making this forum so helpful to people like us who are new to R.

Aparna


Re: How to create a numeric data.frame

Patrizio Frederic
In reply to this post by PIKAL Petr
>
> Which results in vector of numbers
>
> str(as.numeric(as.matrix(a)))
>  num [1:100] 0.82 -1.339 1.397 0.673 -0.461 ...
>
> data frame is convenient list structure which can contain vectors of
> various nature (numeric, character, factor, logical, ...)
> and looks quite similar to Excel table.
>
> matrix is a vector with (2) dimensions but as it is a vector it can not
> consist from objects of different nature (class). Therefore you can have
> numeric or character matrix but not numeric and character columns in your
> matrix.
>
> and vector is vector (numeric, character, logical,  ...) but again you can
> not mix items of different class in one vector.
>

Of course it does. I forgot to say that the way I proposed works only if
the data frame contains numeric objects only.

R is a great tool because you can get to the very same results in many
different ways.
Depending on the problem you're dealing with, you have to choose the
most efficient one.
Often, in my research work, the most efficient is the one that uses as
few lines of code as possible:

Suppose a is a data.frame which contains numeric objects only

a <- data.frame(matrix(rnorm(100),10)) # some data

## 1 not very nice
b <- 0
for (j in 1:length(a)) b <- c(b, as.numeric(a[[j]])) # a[[j]] extracts column j as a vector
b <- b[-1]

## 2 long time ago I was a fortran guy
b <- numeric(prod(dim(a))) # one slot per cell
for (j in 1:dim(a)[2]){
  for (i in 1:dim(a)[1]){
     b[dim(a)[1]*(j-1)+i] <- as.numeric(a[i,j]) # column-major position
  }
}

## 3 better: sapply function
as.numeric(sapply(a,function(x)as.numeric(x)))

## 4 shorter
as.numeric(as.matrix(a))

## which type of data a has
a <- data.frame(a,fact=sample(c('F1','F2'),dim(a)[1],replace=TRUE))
class_a <- sapply(a,function(x)class(x))
class_a
a_numeric <- a[,class_a=='numeric'] # keep only the numeric columns
as.numeric(as.matrix(a_numeric))
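
As one more variant on the approaches above, unlist() also flattens an all-numeric data frame column by column. This sketch (the names `v1` and `v2` and the use of set.seed() are additions for illustration) simply compares it with approach 4:

```r
set.seed(1)                               # only to make the sketch repeatable
a <- data.frame(matrix(rnorm(100), 10))   # all-numeric data frame, as above

v1 <- as.numeric(as.matrix(a))            # approach 4 from the post
v2 <- unlist(a, use.names = FALSE)        # columns concatenated in order

identical(v1, v2)
```

Like approach 4, this is only meaningful when every column is numeric.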

Regards,

PF


Re: How to create a numeric data.frame

Patrizio Frederic
In reply to this post by Aparna Sampath
On Mon, Jun 13, 2011 at 6:47 PM, Aparna <[hidden email]> wrote:

> Hi Joshua
>
> While looking at the data, all the values seem to be in numeric. As i mentioned,
> the dataset is already in data.frame.
>
> As suggested, I used str(mydata) and got the following result:
>
>
> str(leu_cluster1)
> 'data.frame':   984 obs. of  100 variables:
>  $ V2  : Factor w/ 986 levels "-0.00257361",..: 543 116 252 54 520 ...

Your data columns are indeed not numeric but factors.
You may try this:

a <- as.character(rnorm(100)) # some numeric data stored as text
adf <- data.frame(matrix(a,10), stringsAsFactors=TRUE) # which are misinterpreted as factors (the default before R 4.0.0)
adf
adf[,1]
class(adf[,1]) # check for the class of the first column
sapply(adf,function(x)class(x)) # check classes for all columns

b <- sapply(adf,function(x)as.numeric(as.character(x)))
# as.character: use levels literally, as.numeric: transforms in numbers
b # look at b

class(b) # which is now a numeric matrix
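
Since the original question asked for a data frame rather than a matrix, the same conversion can be written so that the container stays a data frame. This is a sketch on made-up factor columns (the values are hypothetical); the `adf[] <-` form replaces the columns in place:

```r
# Made-up factor columns standing in for the real data
adf <- data.frame(V2 = factor(c("0.5", "-1.25", "2")),
                  V3 = factor(c("10", "0.1", "-3")))

# adf[] keeps the data.frame container and replaces each column in place
adf[] <- lapply(adf, function(x) as.numeric(as.character(x)))

str(adf)
class(adf)
```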

best regards

PF


Re: How to create a numeric data.frame

Joshua Wiley-2
On Mon, Jun 13, 2011 at 2:11 PM, Patrizio Frederic
<[hidden email]> wrote:

> On Mon, Jun 13, 2011 at 6:47 PM, Aparna <[hidden email]> wrote:
>> Hi Joshua
>>
>> While looking at the data, all the values seem to be in numeric. As i mentioned,
>> the dataset is already in data.frame.
>>
>> As suggested, I used str(mydata) and got the following result:
>>
>>
>> str(leu_cluster1)
>> 'data.frame':   984 obs. of  100 variables:
>>  $ V2  : Factor w/ 986 levels "-0.00257361",..: 543 116 252 54 520 ...
>
> your data columns are not numeric but factors indeed.
> you may try this one
>
> a <- as.character(rnorm(100))           # some numeric data
> adf <- data.frame(matrix(a,10))         # which are misinterpreted as factors
> adf
> adf[,1]
> class(adf[,1]) # check for the class of the first column
> sapply(adf,function(x)class(x)) # check classes for all columns
>
> b <- sapply(adf,function(x)as.numeric(as.character(x))) #

But coercing to a character class first is not the recommended method.
 Also, I am leery about using sapply() with data frames, because it
converts them to matrices, which can cause havoc, if you have
different classes of data.  You mentioned that as a first step, you
had removed the names column from the data frame before trying to
convert it to numeric.  I would simply leave the names in, and then
(supposing they are in column 101)

leu_cluster1[, 1:100] <- lapply(leu_cluster1[, 1:100],
  function(x) as.numeric(levels(x))[x])

apply the conversion to numeric on only the necessary columns.  This
simplifies life because you are not making interim data sets.  Using
lapply() allows you to work with (potentially) different classes of
data (although I realize in this particular case you are only dealing
with one class).  So long as you are assigning the results back into a
data frame (as above), the assignment will automatically
convert the list back to data frame columns.  If you are concerned about
this, just wrap the call in as.data.frame()

leu_cluster1[, 1:100] <- as.data.frame(lapply(
  leu_cluster1[, 1:100], function(x) as.numeric(levels(x))[x]))
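
The reason plain as.numeric() is not enough for factor columns can be seen on a tiny made-up factor (`f` is a hypothetical example, not the poster's data). This sketch contrasts the internal codes with the two conversions discussed in this thread:

```r
f <- factor(c("0.5", "-1.25", "2", "0.5"))

as.numeric(f)                 # internal integer codes, not the values
as.numeric(levels(f))[f]      # convert the levels once, then index by code
as.numeric(as.character(f))   # same values, but converts every element
```

Both of the last two lines recover the original numbers; the levels-based form does less work when a column has many repeated values.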

Cheers,

Josh


Re: How to create a numeric data.frame

Richard M. Heiberger
Aparna,

Something is very strange here.  I think it is time to go back to the
original data source and find out why the file you thought contained
numbers was read as factors.  Is there an unanticipated comma in a field?
Did you forget to say header=TRUE?
Rich
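
A way to chase down this kind of problem is to look for entries that do not parse as numbers. The sketch below uses made-up file contents in which a stray "n/a" token (an assumption for illustration, not something reported in this thread) turns a column into a factor:

```r
# Made-up file contents; the "n/a" token is an assumption for illustration
txt <- "Name s1 s2
g1 0.5 1.2
g2 n/a 0.7
g3 -0.3 2.1"

d <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE)
sapply(d, class)                       # s1 is read as a factor

# Find the entries that do not parse as numbers
bad <- is.na(suppressWarnings(as.numeric(as.character(d$s1))))
d$s1[bad]

# Declaring the token as missing lets the column be read as numeric
d2 <- read.table(text = txt, header = TRUE, na.strings = "n/a")
sapply(d2, class)
```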

Re: How to create a numeric data.frame

Aparna Sampath
This post has NOT been accepted by the mailing list yet.
Hi Rich

Yes, I did check it again, and I set header to T. I am not sure why it was showing factors. But I just cleared all the variables and read the matrix again, and this is the output I got:


> str(temp)
'data.frame':   985 obs. of  101 variables:
 $ Name               : Factor w/ 985 levels "1009_at","1044_s_at",..: 121 122 123 124 125 126 127 128 129 130 ...
 $ TEL.AML1.C41       : num  0.00995 -0.23018 -0.55291 -0.08974 -1.98606 ...
 $ Hyperdip.50.C23    : num  -0.0159 0.2033 -0.034 -0.0828 -0.2522 ...
 $ BCR.AB.LC1         : num  -0.0162 -0.6853 -0.5114 -0.4112 -0.369 ...
 $ Hyperdip.50.C13    : num  0.0168 0.1471 -0.3305 -0.301 -0.5856 ...
 $ T.ALL.C5           : num  -0.0213 -0.1223 -0.2391 1.5602 0.5459 ...
 $ TEL.AML1.9         : num  -0.0224 -0.8246 -0.7583 -1.6787 -0.231 ...
 $ TEL.AML1.8         : num  -0.0266 -0.3889 -0.5522 -0.1678 -0.937 ...

It now shows that the values are numeric, but the first column is a factor because it contains the gene names. It is a data.frame.

But when I check the data using data_check(temp) (a function in the package I am using that checks whether the input data is in numeric format), it says (x) is not numeric. It is still asking me to convert it to a numeric data.frame.
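
If the package check requires every column to be numeric (an assumption here; the internals of data_check are not shown in this thread), the remaining name column would be enough to make it fail. This sketch, on a made-up stand-in for temp, lists which columns are not numeric:

```r
# Made-up stand-in for temp: one name column plus numeric columns
temp <- data.frame(Name = c("1009_at", "1044_s_at"),
                   TEL.AML1.C41 = c(0.00995, -0.23018),
                   T.ALL.C5 = c(-0.0213, -0.1223),
                   stringsAsFactors = TRUE)

is_num <- vapply(temp, is.numeric, logical(1))
names(temp)[!is_num]          # columns that would fail an all-numeric check
all(is_num)
```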

Regards
Aparna

Re: How to create a numeric data.frame

Aparna Sampath
This post has NOT been accepted by the mailing list yet.
In reply to this post by Joshua Wiley-2
Hi Joshua

I tried not excluding the gene names from the Excel file, keeping it as it is and reading it with read.table and header=T. This is the output I got:

> str(temp)
'data.frame':   985 obs. of  101 variables:
 $ Name               : Factor w/ 985 levels "1009_at","1044_s_at",..: 121 122 123 124 125 126 127 128 129 130 ...
 $ TEL.AML1.C41       : num  0.00995 -0.23018 -0.55291 -0.08974 -1.98606 ...
 $ Hyperdip.50.C23    : num  -0.0159 0.2033 -0.034 -0.0828 -0.2522 ...
 $ BCR.AB.LC1         : num  -0.0162 -0.6853 -0.5114 -0.4112 -0.369 ...
 $ Hyperdip.50.C13    : num  0.0168 0.1471 -0.3305 -0.301 -0.5856 ...
 $ T.ALL.C5           : num  -0.0213 -0.1223 -0.2391 1.5602 0.5459 ...
 $ TEL.AML1.9         : num  -0.0224 -0.8246 -0.7583 -1.6787 -0.231 ...
 $ TEL.AML1.8         : num  -0.0266 -0.3889 -0.5522 -0.1678 -0.937 ...
 $ Hyperdip.50.C7     : num  -0.0277 -0.2553 -0.1602 0.2029 -0.2295 ...
 $ TEL.AML1.C37       : num  -0.0376 0.1465 -1.0516 -0.675 0.6024 ...
 $ MLL.6              : num  -0.039 -0.201 4.043 0.83 1.711 ...
 

This is similar to the example you had shown me: the values are numeric and the factor holds the gene names.

But when I use data_check(temp) to see if it is a numeric data.frame, it says (x) is not numeric. I wonder whether this is because of the names in the file and it expects only numbers; but when I try to read only the numbers, they are read as factors.

Any idea why? :)


Re: How to create a numeric data.frame

Aparna Sampath
This post has NOT been accepted by the mailing list yet.
In reply to this post by Joshua Wiley-2
Josh,

I tried lapply(); it does convert it to a numeric data frame, and when I checked whether the format was accepted by the package, it returned true.

I kept the gene names and sample names without deleting them (sample names are in column 1), and chose header=T when I read the data using read.table. I had to convert column 2 to the end to numeric, so this is what I did:
temp[,2:101]=lapply(temp[,2:101],as.numeric)

This creates temp as a data frame and it is numeric, but the sample names are ignored. The result looks like
this:



     TEL.AML1.C41 Hyperdip.50.C23  BCR.AB.LC1 Hyperdip.50.C13    T.ALL.C5
1   -0.01586103     -0.01622703  0.01677471     -0.02134260 -0.02237441
2    0.20330379     -0.68532184  0.14705071     -0.12226900 -0.82459784
3   -0.03403964     -0.51135631 -0.33052491     -0.23908857 -0.75833475
4   -0.08276864     -0.41120913 -0.30101166      1.56018599 -1.67872101
5   -0.25221762     -0.36904453 -0.58561940      0.54590376 -0.23098727
6    2.74131045     -0.05841131 -1.45882583      0.07848098 -0.02832526
7   -1.13464366     -0.06701484 -1.43199022     -0.55705712  2.04748271


The sample names, which are supposed to be in the first column, are replaced by numbers.

Kindly advise.
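
One possible way to keep the identifiers while leaving only numeric columns is to move them into the row names. This is a sketch on a made-up stand-in for temp, assuming the identifiers are unique; it is not a confirmed requirement of the package being used:

```r
# Made-up stand-in for temp, with identifiers in the first column
temp <- data.frame(Name = c("1009_at", "1044_s_at"),
                   TEL.AML1.C41 = c(0.00995, -0.23018),
                   T.ALL.C5 = c(-0.0213, -0.1223),
                   stringsAsFactors = FALSE)

rownames(temp) <- temp$Name   # keep the identifiers as row labels
temp$Name <- NULL             # drop the non-numeric column

str(temp)
rownames(temp)
```

The row.names argument of read.table() can achieve the same thing at read time, for example row.names = 1 to use the first column.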

Aparna