mean calculation

mean calculation

juvin
I tried to calculate a mean from a CSV table by forming a data frame, but R reports "dim(X) must have a positive length". The table has 1206 columns and 31 rows. I want to calculate the mean, median, and maximum of each column. The table has some NA values, which I don't want to include. The table looks as follows:
1 2 3 4 5 6 7 8 9 10 11
NA 0 0 0 0 12 0 0 0 0 0
NA 0 0 0 0 0 0 0 0 0 0
NA 0 0 0 0 14 0 0 0 0 5
NA 0 0 0 0 0 0 0 0 0 0
NA 0 0 27 0 0 0 0 20 0 165
NA 0 88 38 0 0 0 0 0 0 26
NA 12 12 0 0 0 0 0 0 0 2
NA 2 0 0 0 0 0 0 0 0 0
NA 2 0 0 0 0 0 0 0 0 0
NA 0 24 1 0 0 0 0 3 0 62
NA 26 0 0 0 0 0 0 0 0 33

I used the following code to calculate the mean:
rainfall=read.table('bmark.csv',header=T,sep=',')
precip=data.frame(rainfall[1:1206])
monthlyMean=apply(precip, MARGIN=2,FUN=mean,na.rm=TRUE)

Any help would be appreciated.

Juvin


Re: mean calculation

arun kirshna
Hi Juvin,

 The error "dim(X) must have a positive length" usually appears when a vector, rather than a matrix or data frame, is passed to "apply", e.g.

    apply(1:5,2,mean)
    #Error in apply(1:5, 2, mean) : dim(X) must have a positive length
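
One way to check whether this is what happened is to look at `dim()` of the object before calling `apply`. The sketch below uses a small made-up data frame (`df` and its values are illustrative, not the poster's data) to show how selecting a single column with `[, j]` drops to a plain vector, and how `drop = FALSE` keeps the dimensions.

```r
# Illustrative data frame; names and values are made up for this example.
df <- data.frame(a = c(NA, 1, 3), b = c(0, 2, 4))

v <- df[, 1]                     # single column -> plain numeric vector
dim(v)                           # NULL, so apply(v, 2, mean) would fail

sub <- df[, 1, drop = FALSE]     # still a one-column data frame
dim(sub)                         # has two dimensions (rows, columns)

# apply() works once the object has dimensions
res <- apply(sub, 2, mean, na.rm = TRUE)
res
```

If `dim()` returns `NULL` for the object passed to `apply`, the subsetting step that produced it is the place to look.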



   Also, if your dataset originally has 1206 columns, it is not clear why you needed the code below ("rainfall" is already a "data.frame"):


      precip=data.frame(rainfall[1:1206])
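
Assuming every column of the file is numeric (or entirely NA), the intermediate `precip` step can probably be dropped and the means taken directly from the data frame. The sketch below writes a tiny stand-in CSV to a temporary file so that it is self-contained; the file contents are invented, and with the real data the path would be 'bmark.csv'.

```r
# Write a tiny stand-in CSV to a temporary file (contents are invented).
tmp <- tempfile(fileext = ".csv")
writeLines(c("1,2,3", "NA,0,12", "NA,4,0", "NA,2,6"), tmp)

# check.names = FALSE keeps the numeric headers as "1", "2", "3"
rainfall <- read.table(tmp, header = TRUE, sep = ",", check.names = FALSE)
str(rainfall)                    # confirm the columns were read as expected

# Column means straight from the data frame, ignoring NA values
monthlyMean <- colMeans(rainfall, na.rm = TRUE)
monthlyMean
```

Running `str()` right after the import is a cheap way to spot columns that were read as character or factor, which would make `colMeans` fail.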



Based on the data provided,

    rainfall <-  read.table(text="1    2    3    4    5    6    7    8    9    10    11
NA    0    0    0    0    12    0    0    0    0    0
NA    0    0    0    0    0    0    0    0    0    0
NA    0    0    0    0    14    0    0    0    0    5
NA    0    0    0    0    0    0    0    0    0    0
NA    0    0    27    0    0    0    0    20    0    165
NA    0    88    38    0    0    0    0    0    0    26
NA    12    12    0    0    0    0    0    0    0    2
NA    2    0    0    0    0    0    0    0    0    0
NA    2    0    0    0    0    0    0    0    0    0
NA    0    24    1    0    0    0    0    3    0    62
NA    26    0    0    0    0    0    0    0    0    33",sep="", header=TRUE, check.names=FALSE)



    apply(rainfall, 2, function(x) c(mean=mean(x, na.rm=TRUE),
                   median=median(x, na.rm=TRUE), max=max(x, na.rm=TRUE)))

#          1         2        3  4 5         6 7 8         9 10        11
#mean    NaN  3.818182 11.27273  6 0  2.363636 0 0  2.090909  0  26.63636
#median   NA  0.000000  0.00000  0 0  0.000000 0 0  0.000000  0   2.00000
#max    -Inf 26.000000 88.00000 38 0 14.000000 0 0 20.000000  0 165.00000
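
The `NaN`, `NA` and `-Inf` in the first column arise because that column contains only NA values, so nothing is left once they are removed (`max` also emits a warning in that case). If such columns are not wanted in the summary, one option is to drop them first. The following is a minimal sketch on invented stand-in data; `has_data` and `kept` are names introduced here for illustration.

```r
# Small stand-in for the rainfall table (values invented); column "1" is all NA.
rainfall <- data.frame(`1` = c(NA, NA, NA), `2` = c(0, 12, 2), `3` = c(88, 0, 24),
                       check.names = FALSE)

# Keep only columns that have at least one non-missing value
has_data <- colSums(!is.na(rainfall)) > 0
kept <- rainfall[, has_data, drop = FALSE]

# Same three statistics per column, now without the all-NA column
stats <- sapply(kept, function(x) c(mean = mean(x, na.rm = TRUE),
                                    median = median(x, na.rm = TRUE),
                                    max = max(x, na.rm = TRUE)))
stats
```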



Or using `colMaxs`, `colMedians` from `matrixStats`

    library(matrixStats)
    m <- as.matrix(rainfall)  # colMedians and colMaxs expect a matrix, not a data.frame
    rbind(mean=colMeans(m, na.rm=TRUE), median=colMedians(m, na.rm=TRUE),
          max=colMaxs(m, na.rm=TRUE))

Another option would be to use `summarise_each` from `dplyr`

    library(dplyr)
    rainfall %>%
             summarise_each(funs(mean=mean(., na.rm=TRUE), median=median(., na.rm=TRUE),
                               max=max(., na.rm=TRUE)))
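
In more recent dplyr releases `summarise_each()` and `funs()` are deprecated in favour of `across()`. A rough equivalent, assuming dplyr 1.0.0 or later and shown here on invented stand-in data, might look like this:

```r
library(dplyr)

# Small stand-in data (values invented for illustration)
rainfall <- data.frame(`2` = c(0, 12, 2, NA), `3` = c(88, 0, 24, 4),
                       check.names = FALSE)

# Apply the three summaries to every column, skipping NA values
out <- rainfall %>%
  summarise(across(everything(),
                   list(mean   = ~ mean(.x, na.rm = TRUE),
                        median = ~ median(.x, na.rm = TRUE),
                        max    = ~ max(.x, na.rm = TRUE))))
out   # one row; columns are named like "2_mean", "2_median", "2_max"
```

The result is a single wide row rather than the 3 x n matrix returned by `apply`, so it may need reshaping if a table with one row per statistic is wanted.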

A.K.

