Calculating the mean in one column with empty cells


Calculating the mean in one column with empty cells

fxen3k
Hi all,

I recently tried to calculate the mean and the median just for one column. In this column I have numbers with some empty cells due to missing data.
So how can I calculate the mean just for the filled cells?
I tried:
mean(dataSet2$ac_60d_4d_after_ann[!is.na(master$ac_60d_4d_after_ann)], na.rm=TRUE)
But the output was different to the calculation I died in Microsoft Excel.

Thanks in advance,
Felix

Re: Calculating the mean in one column with empty cells

PIKAL Petr
Hi

see inline

> -----Original Message-----
> From: [hidden email] [mailto:r-help-bounces@r-
> project.org] On Behalf Of fxen3k
> Sent: Friday, October 05, 2012 11:28 AM
> To: [hidden email]
> Subject: [R] Calculating the mean in one column with empty cells
>
> Hi all,
>
> I recently tried to calculate the mean and the median just for one
> column.
> In this column I have numbers with some empty cells due to missing
> data.
> So how can I calculate the mean just for the filled cells?
> I tried:
> mean(dataSet2$ac_60d_4d_after_ann[!is.na(master$ac_60d_4d_after_ann)],
> na.rm=TRUE)
>

mean(dataSet2$ac_60d_4d_after_ann, na.rm=TRUE)

should suffice.

> But the output was different to the calculation I died in Microsoft
> Excel.

Hm. I am also sometimes dying from Excel performance.

There are two possibilities:
Excel is wrong, or
you did not transfer the values from Excel to R correctly and they got mangled somehow.

Which one is true is difficult to decide from the information you have given.

Regards
Petr



______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Calculating the mean in one column with empty cells

Berend Hasselman
In reply to this post by fxen3k

On 05-10-2012, at 11:27, fxen3k wrote:

> Hi all,
>
> I recently tried to calculate the mean and the median just for one column.
> In this column I have numbers with some empty cells due to missing data.
> So how can I calculate the mean just for the filled cells?
> I tried:
> mean(dataSet2$ac_60d_4d_after_ann[!is.na(master$ac_60d_4d_after_ann)],
> na.rm=TRUE)
> But the output was different to the calculation I died in Microsoft Excel.
>

No data ==> no can answer question.

What did you expect?
What did Excel give you?

But note that you are calculating the mean of dataSet2$ac_60d_4d_after_ann while indexing it with the non-NA positions of a different object, master$ac_60d_4d_after_ann.

mean(dataSet2$ac_60d_4d_after_ann, na.rm=TRUE)

should do what you want.

Berend
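
The mismatch described above can be illustrated with a minimal sketch. The vectors `x` and `other` are hypothetical stand-ins for the dataSet2 and master columns, which were never posted, so the numbers are made up:

```r
# Hypothetical data: two vectors whose NAs sit in different positions
x     <- c(10, 2, NA, 4, 5)    # stands in for the dataSet2 column
other <- c(NA, 2, 3, 4, NA)    # stands in for the master column

# Indexing x with the non-NA positions of *other* keeps x[2:4] = 2, NA, 4;
# na.rm = TRUE then drops the NA, so only 2 and 4 are averaged.
wrong <- mean(x[!is.na(other)], na.rm = TRUE)

# Letting mean() drop the NAs of x itself uses all four available values.
right <- mean(x, na.rm = TRUE)

# median() accepts the same na.rm argument.
med <- median(x, na.rm = TRUE)

print(c(wrong = wrong, right = right, median = med))
```

With these made-up values the two means differ, which is one way an R result could disagree with a spreadsheet even before any import problem is considered.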


Re: Calculating the mean in one column with empty cells

Ken Takagi
In reply to this post by fxen3k
Hi Felix,

Assuming that you have the data frame formatted properly, mean(yourdata$column,
na.rm = TRUE) should work.  When converting an Excel table to R, I usually fill in
blank cells with NA before importing.  If you try to import a data frame with
empty cells, you usually get an error using read.table().  But since you seem to
have already got your data into R, that may not be the problem.

HTH,
Ken
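
As a rough sketch of the import step, assuming a semicolon-separated file with decimal commas (the format read.csv2() expects): empty cells in a numeric column can be read as NA directly. The column names and values below are invented for illustration.

```r
# Hypothetical CSV content in the German-style format: ';' separator, ',' decimal.
# The second data row has an empty cell in the 'value' column.
csv_text <- "id;value
1;0,5
2;
3;1,5
"

# na.strings lists the strings to treat as missing; "" covers empty cells.
dat <- read.csv2(text = csv_text, na.strings = c("", "NA"))

str(dat)                              # check that 'value' was read as numeric
avg <- mean(dat$value, na.rm = TRUE)  # mean over the filled cells only
print(avg)
```

Checking the result with str() right after the import shows whether the column arrived as numeric at all, which is worth confirming before computing anything from it.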


Re: Calculating the mean in one column with empty cells

fxen3k
In reply to this post by fxen3k
I'm sorry!

Now I tried it again with just 9 numbers (just random numbers), and Excel gives a different result than R.

Here are the numbers I used:

0,2006160108532920
0,1321167173880490
0,0563941428921262
0,0264198664609803
0,0200581303857603
-0,2971754213679500
-0,2353086361784190
0,0667195538296534
0,1755852636926560

And this is the command in R:

> nums <- as.numeric(as.character(dataSet2$ac_bhar_60d_4d_after_ann[2:10]))
> m <- mean(nums, na.rm = T)
> m

The output of R is:
> print(m, digits= 12)
[1] 0.0166666666667

The output in Excel is:
0,0161584031062386

The numbers are imported correctly. Or does R round the imported numbers to some number of decimal places? (I don't think so ;-) )

Best Regards,
Felix

Re: Calculating the mean in one column with empty cells

fxen3k
I imported the whole dataset with read.csv2() and it works fine. (2 for German is correct ;) )

I already checked the numbers and I also tried to calculate the mean of a range of numbers where there is no NA given. (as mentioned in my last post above).

Re: Calculating the mean in one column with empty cells

arun kirshna
In reply to this post by fxen3k
Hi,
Not sure what your dataset looks like:

set.seed(1)
dat1 <- data.frame(
  col1 = c(sample(1:50, 5, replace = TRUE), NA,
           sample(1:25, 3, replace = TRUE), NA,
           sample(1:25, 2, replace = TRUE)),
  col2 = c(NA, NA, rnorm(8, 15), NA, NA)
)
mean(dat1$col1[!is.na(dat1$col1)])
#[1] 20.1
mean(dat1$col1, na.rm = TRUE)
#[1] 20.1

But one problem is obvious: your command uses both "dataSet2" and "master".  It looks like you have two datasets.
A.K.







Re: Calculating the mean in one column with empty cells

Sarah Goslee
In reply to this post by fxen3k
If the numbers were imported correctly you wouldn't need to do
as.numeric(as.character(yourdata)).

Please use dput() to provide your data, as in:
dput(dataSet2$ac_bhar_60d_4d_after_ann[2:10])

Otherwise it's impossible for us to diagnose your problem or reproduce
your error.

testdata <- c(0.2006160108532920,
0.1321167173880490,
0.0563941428921262,
0.0264198664609803,
0.0200581303857603,
-0.2971754213679500,
-0.2353086361784190,
0.0667195538296534,
0.1755852636926560)


> mean(testdata)
[1] 0.0161584


Sarah


--
Sarah Goslee
http://www.functionaldiversity.org
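
One possible reason for needing as.numeric(as.character(...)) is that the column was read as a factor rather than as numbers. This is only an assumption, since the dput() output was never posted. A minimal sketch with made-up values shows why that matters:

```r
# Hypothetical column that was read as text and turned into a factor
f <- factor(c("2", "10", "30"))

# as.numeric() on a factor returns the internal level codes, not the values.
codes <- as.numeric(f)

# Converting via character recovers the numbers that were actually in the file.
values <- as.numeric(as.character(f))

print(codes)
print(values)
print(c(mean_of_codes = mean(codes), mean_of_values = mean(values)))

# dput() shows the object exactly as R stores it, including its class.
dput(f)
```

If dput() on the real column shows a factor() or quoted strings, the import step did not treat the column as numeric, and the file's separator or decimal mark is the first thing to check.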


Re: Calculating the mean in one column with empty cells

William Dunlap
In reply to this post by fxen3k
You need to show us the verbatim output of the following R command
  dput(dataSet2$ac_bhar_60d_4d_after_ann[2:10])
to make any further progress.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: [hidden email] [mailto:[hidden email]] On Behalf
> Of fxen3k
> Sent: Friday, October 05, 2012 6:15 AM
> To: [hidden email]
> Subject: Re: [R] Calculating the mean in one column with empty cells

Re: Calculating the mean in one column with empty cells

PIKAL Petr
In reply to this post by fxen3k
Hi

> -----Original Message-----
> From: [hidden email] [mailto:r-help-bounces@r-
> project.org] On Behalf Of fxen3k
> Sent: Friday, October 05, 2012 3:18 PM
> To: [hidden email]
> Subject: Re: [R] Calculating the mean in one column with empty cells
>
> I imported the whole dataset with read.csv2() and it works fine. (2 for
> German is correct ;) )
>
> I already checked the numbers and I also tried to calculate the mean of
> a range of numbers where there is no NA given. (as mentioned in my last
> post above).

But as Sarah pointed out, the result in R from your values (when read correctly) is the same as in Excel. Therefore the problem seems to be in ***your*** data.

The output of
str(dataSet2$ac_bhar_60d_4d_after_ann[2:10])

can be helpful in diagnosing what may be going on.

Regards
Petr


Re: Calculating the mean in one column with empty cells

fxen3k
Hi,

the first command was bringing the numbers into R directly:
> testdata <- c(0.2006160108532920, 0.1321167173880490, 0.0563941428921262, 0.0264198664609803, 0.0200581303857603, -0.2971754213679500, -0.2353086361784190, 0.0667195538296534, 0.1755852636926560)
> mean(testdata)
[1] 0.0161584


Here I tried to calculate the mean with the same numbers as given above, but taken from my dataset.

> str(dataSet2$ac_bhar_60d_4d_after_ann[1:9])
 num [1:9] 0.2 0.13 0.06 0.03 0.02 -0.3 -0.24 0.07 0.18
> mean(dataSet2$ac_bhar_60d_4d_after_ann[1:9])
[1] 0.01666667


It seems that in the second case R calculates the mean with rounded numbers (0.2 and not 0.20061601085...).
Could it be that R imports only the rounded numbers?
How can I build a CSV file with numbers showing all decimal places? I think my current CSV file only has numbers with 2 decimal places.


Kind Regards,
Felix

Re: Calculating the mean in one column with empty cells

Michael Weylandt
On Sat, Oct 6, 2012 at 9:11 AM, fxen3k <[hidden email]> wrote:

> Hi,
>
> the first command was bringing the numbers into R directly:
> *> testdata <- c(0.2006160108532920, 0.1321167173880490, 0.0563941428921262,
> 0.0264198664609803, 0.0200581303857603, -0.2971754213679500,
> -0.2353086361784190, 0.0667195538296534, 0.1755852636926560)
>> mean(testdata)
> [1] 0.0161584*
>
> Here I tried to calculate the mean with the same numbers as given above, but
> taken from my dataset.
> *
>> str(dataSet2$ac_bhar_60d_4d_after_ann[1:9])
>  num [1:9] 0.2 0.13 0.06 0.03 0.02 -0.3 -0.24 0.07 0.18
>> mean(dataSet2$ac_bhar_60d_4d_after_ann[1:9])
> [1] 0.01666667
> *
>
> It seems that in the second case he calculates the mean with rounded numbers
> (0.2 and not 0.20061601085...)
> Could it be that R imports only the rounded numbers?
> How can I build a CSV-file with numbers showing all decimal places? Because
> I think my current CSV-file only has numbers with 2 decimal places.

That's something you need to figure out with whatever software is
writing the csv.

Cheers,
Michael


Re: Calculating the mean in one column with empty cells

John Kane
In reply to this post by fxen3k
Where is the CSV data coming from?  If it is an export from a spreadsheet, Excel (and others?) has a nasty habit of exporting "as displayed" rather than the actual number as its default.

John Kane
Kingston ON Canada
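
The "as displayed" explanation can be checked against the numbers already posted in this thread. The sketch below assumes the spreadsheet showed two decimal places at export time, which matches the str() output above but was not confirmed at this point in the discussion:

```r
# The nine full-precision values posted earlier in the thread
testdata <- c(0.2006160108532920, 0.1321167173880490, 0.0563941428921262,
              0.0264198664609803, 0.0200581303857603, -0.2971754213679500,
              -0.2353086361784190, 0.0667195538296534, 0.1755852636926560)

# Mean of the full-precision values (the Excel result)
full_mean <- mean(testdata)

# Mean after rounding each value to 2 decimal places,
# mimicking a CSV export of the displayed values
rounded_mean <- mean(round(testdata, 2))

print(full_mean, digits = 12)
print(rounded_mean, digits = 12)
```

The rounded mean reproduces the 0.0166666666667 that R reported for the imported column, while the full-precision mean matches the Excel figure.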



Re: Calculating the mean in one column with empty cells

fxen3k
I created a Microsoft Excel spreadsheet. As you said, I only have "as displayed" numbers. I just solved the problem by showing 25 decimal places in Excel and then exporting the data to a CSV file.

Is there a better way to solve this?

Regards,
Felix

Re: Calculating the mean in one column with empty cells

David Winsemius
In reply to this post by fxen3k

On Oct 6, 2012, at 1:11 AM, fxen3k wrote:

> Hi,
>
> the first command was bringing the numbers into R directly:
> *> testdata <- c(0.2006160108532920, 0.1321167173880490, 0.0563941428921262,
> 0.0264198664609803, 0.0200581303857603, -0.2971754213679500,
> -0.2353086361784190, 0.0667195538296534, 0.1755852636926560)
>> mean(testdata)
> [1] 0.0161584*
>
> Here I tried to calculate the mean with the same numbers as given above, but
> taken from my dataset.
> *
>> str(dataSet2$ac_bhar_60d_4d_after_ann[1:9])
> num [1:9] 0.2 0.13 0.06 0.03 0.02 -0.3 -0.24 0.07 0.18
>> mean(dataSet2$ac_bhar_60d_4d_after_ann[1:9])
> [1] 0.01666667
> *

Something has happened to the values during data processing. Reading the full-precision numbers with read.csv2() gives the expected mean:

> dat <- read.csv2(text="0,2006160108532920
+ 0,1321167173880490
+ 0,0563941428921262
+ 0,0264198664609803
+ 0,0200581303857603
+ -0,2971754213679500
+ -0,2353086361784190
+ 0,0667195538296534
+ 0,1755852636926560
+ ", header=FALSE)
> mean(dat[[1]])
[1] 0.0161584

>

> It seems that in the second case he calculates the mean with rounded numbers
> (0.2 and not 0.20061601085...)
> Could it be that R imports only the rounded numbers?
> How can I build a CSV-file with numbers showing all decimal places? Because
> I think my current CSV-file only has numbers with 2 decimal places.
>

That is more likely the fault of Excel than it is something R is responsible for.

--

David Winsemius, MD
Alameda, CA, USA


Re: Calculating the mean in one column with empty cells

William Dunlap
For nine numbers, R-helpers should recommend that people
show their data with dput(obj) instead of str(obj).
dput() shows everything in the object to full precision.  str() shows
a summary of the object and rounds numbers to 2 digits -- it
is good for an overview of the data, but when the question is "why
did I get a mean of .066666 instead of .06547494 from my 9 numbers"
str() is not useful.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
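
A short sketch of the difference, using the nine values posted in this thread. How many digits str() prints is governed by its digits.d setting (3 significant digits by default, as far as I know), so treat its output as a summary only:

```r
testdata <- c(0.2006160108532920, 0.1321167173880490, 0.0563941428921262,
              0.0264198664609803, 0.0200581303857603, -0.2971754213679500,
              -0.2353086361784190, 0.0667195538296534, 0.1755852636926560)

str(testdata)    # compact overview; values are abbreviated for display
dput(testdata)   # R code that recreates the object; paste this into an email

# Round trip: turn the dput() text back into an object and compare.
dput_text <- paste(capture.output(dput(testdata)), collapse = "")
roundtrip <- eval(parse(text = dput_text))
print(isTRUE(all.equal(roundtrip, testdata)))
```

Because the dput() text is valid R code, a helper can paste it into a session and work with the same object the poster has, which is not possible with str() output.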


> -----Original Message-----
> From: [hidden email] [mailto:[hidden email]] On Behalf
> Of David Winsemius
> Sent: Saturday, October 06, 2012 9:08 AM
> To: fxen3k
> Cc: [hidden email]
> Subject: Re: [R] Calculating the mean in one column with empty cells
>

Re: Calculating the mean in one column with empty cells

Michael Weylandt
In reply to this post by fxen3k


On Oct 6, 2012, at 4:54 PM, fxen3k <[hidden email]> wrote:

> I created a Microsoft Excel spreadsheet. As you said, I only have "as
> displayed" numbers. I just solved the problem by showing 25 decimal places
> in Excel and then exported the data into a CSV-file.
>
> Is there a better way to solve this?
>

Don't use Excel (or at least find a way to get reasonable defaults). I'm not being sarcastic: instances like this show that Excel really isn't a suitable tool for real data analysis (cf. Pat Burns' 'Spreadsheet Addiction' paper).

Michael
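
For what it is worth, the CSV format itself is not the limiting factor. A minimal sketch, assuming the full-precision values are available in R: writing them with write.csv2() and reading them back with read.csv2() keeps the precision. The temporary file and the column name `x` are placeholders.

```r
testdata <- c(0.2006160108532920, 0.1321167173880490, 0.0563941428921262,
              0.0264198664609803, 0.0200581303857603, -0.2971754213679500,
              -0.2353086361784190, 0.0667195538296534, 0.1755852636926560)

# Write a German-style CSV (';' separator, ',' decimal) to a temporary file
tmp <- tempfile(fileext = ".csv")
write.csv2(data.frame(x = testdata), file = tmp, row.names = FALSE)

# Read it back and compare with the original values
back <- read.csv2(tmp)
max_diff <- max(abs(back$x - testdata))
unlink(tmp)

print(max_diff)
print(mean(back$x), digits = 12)
```

So the remedy on the spreadsheet side is to make sure the exported file contains the stored values rather than the displayed ones; how to do that depends on the spreadsheet program.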


Re: Calculating the mean in one column with empty cells

PIKAL Petr
In reply to this post by William Dunlap
Hi

> -----Original Message-----
> From: [hidden email] [mailto:r-help-bounces@r-
> project.org] On Behalf Of William Dunlap
> Sent: Saturday, October 06, 2012 6:17 PM
> To: David Winsemius; fxen3k
> Cc: [hidden email]
> Subject: Re: [R] Calculating the mean in one column with empty cells
>
> For nine numbers, R-helpers should recommend that people show their
> data with dput(obj) instead of str(obj).
> dput() shows everything in the object to full precision.  str() shows a
> summary of the object and rounds numbers to 2 digits -- it is good for

actually 4 digits

 str(testdata)
 num [1:9] 0.2006 0.1321 0.0564 0.0264 0.0201 ...

but you are right that dput() does not hide anything.

Anyway, exporting to CSV from Excel-like programs can be rather problematic because of the rounding habits of those programs.

The best way to solve this problem is probably to ask Microsoft support for assistance.

Regards
Petr

> an overview of the data, but when the question is "why did I get a mean
> of .066666 instead of .06547494 from my 9 numbers"
> str() is not useful.