Count non-zero values in excluding NA Values

Count non-zero values in excluding NA Values

Temel İspanyolca
Dear R Staff

My data.csv file is attached.

I am trying to count the non-zero values in each column of my dataset, but I
need to exclude NA values from the count.

My code (below) is very long.
How can I write it more concisely and efficiently?

## [NA_Count] - Find NA values

data.na = sapply(data[,3:ncol(data)], function(c) sum(is.na(c)))


## [Zero] - Find zero values

# na.rm = TRUE: without it, any NA in a column makes the whole sum NA
data.z=apply(data[,3:ncol(data)], 2, function(c) sum(c==0, na.rm = TRUE))


## [Non-Zero] - Find non-zero values

data.nz=nrow(data[,3:ncol(data)])- (data.na+data.z)


Sincerely
Engin YILMAZ

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Count non-zero values in excluding NA Values

Rui Barradas
Hello,

Your attachment didn't come through; R-Help strips off most types of
files, including CSV.
Anyway, the following should do what I understand your question to be
asking. Tested with a fake dataset.


set.seed(3026)    # make the results reproducible
data <- matrix(1:100, ncol = 10)
data[sample(100, 15)] <- 0
data[sample(100, 10)] <- NA
data <- as.data.frame(data)

zero <- sapply(data, function(x) sum(x == 0, na.rm = TRUE))
na <- sapply(data, function(x) sum(is.na(x)))
totals <- nrow(data) - zero - na  # totals non zero per column
grand_total <- sum(totals)        # total non zero

totals
# V1  V2  V3  V4  V5  V6  V7  V8  V9 V10
#  6   8   8   8   8   7   7   8   6  10

grand_total
#[1] 76

# another way
prod(dim(data)) - sum(zero + na)
#[1] 76


Hope this helps,

Rui Barradas
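
One way to sanity-check the subtraction approach above is to count the non-zero, non-NA entries directly and compare the two results. The sketch below reuses the fake dataset from this post; the name `direct` is introduced here only for illustration. The exact counts may differ from the ones posted, because the default algorithm behind sample() changed in R 3.6.0, so the check compares the two methods with each other rather than with the number 76.

```r
set.seed(3026)
data <- matrix(1:100, ncol = 10)
data[sample(100, 15)] <- 0
data[sample(100, 10)] <- NA
data <- as.data.frame(data)

# Subtraction approach from the post above
zero   <- sapply(data, function(x) sum(x == 0, na.rm = TRUE))
na     <- sapply(data, function(x) sum(is.na(x)))
totals <- nrow(data) - zero - na

# Direct count: an entry qualifies if it is not NA and not zero
# (FALSE & NA evaluates to FALSE, so NA entries are never counted)
direct <- sapply(data, function(x) sum(!is.na(x) & x != 0))

# Both approaches should agree column by column
stopifnot(all(totals == direct))
```

If the two vectors ever disagree, the usual culprit is a missing na.rm = TRUE in the zero count.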


Re: Count non-zero values in excluding NA Values

Eric Berger
If you do not need the intermediate results, then after defining data
a single line is enough:

grand_total <- nrow(data)*ncol(data) - sum(sapply(data, function(x) sum(is.na(x) | x == 0)))
# 76
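
The one-liner needs no na.rm argument because of how R combines NA with the OR operator: `TRUE | NA` is TRUE, so every NA position is already marked TRUE by is.na(x) before the comparison with zero matters. A minimal sketch on a made-up vector (the names `x`, `excluded` and `n_nonzero` are illustrative only):

```r
x <- c(0, 4, NA, 7, 0, NA)   # made-up example vector

x == 0
# [1]  TRUE FALSE    NA FALSE  TRUE    NA

# TRUE | NA is TRUE, so no NA survives the OR
excluded <- is.na(x) | x == 0
excluded
# [1]  TRUE FALSE  TRUE FALSE  TRUE  TRUE

# Everything that is not excluded is a non-zero, non-NA value
n_nonzero <- length(x) - sum(excluded)
n_nonzero
# [1] 2
```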





Re: Count non-zero values in excluding NA Values

Ek Esawi
In reply to this post by Temel İspanyolca
Since I could not see your data, the easiest thing that comes to mind for
counting values excluding NAs is something like this:
sum(!is.na(x))

Best of luck--EK
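
Note that sum(!is.na(x)) counts every non-NA value, zeros included. A possible adaptation that also drops the zeros is sketched below on a made-up vector; the variable names are illustrative only.

```r
x <- c(0, 4, NA, NA, 1, 0)   # made-up example vector

# Counts all non-NA values, so the two zeros are still included
n_not_na <- sum(!is.na(x))
n_not_na
# [1] 4

# Adding the x != 0 condition removes the zeros as well;
# FALSE & NA is FALSE, so the NA positions drop out without na.rm
n_nonzero <- sum(!is.na(x) & x != 0)
n_nonzero
# [1] 2
```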


Re: Count non-zero values in excluding NA Values

Temel İspanyolca
Dear R Staff

This is my file (www.fiscalforecasting.com/data.csv)

If you cannot download this file, my dataset looks like the following:

Year  Month  A   B   C   D   E
2005  July   0   4   NA  NA  1
2005  July   0   NA  NA  0   9
2005  July   NA  4   0   1   0
2005  July   4   0   2   9   NA

I am trying to count, for every column, the values that are neither zero nor NA.

Sincerely
Engin YILMAZ







Re: Count non-zero values in excluding NA Values

Ek Esawi
In reply to this post by Temel İspanyolca
What Eric and Rui suggested works well, but here is a shorter and maybe
simpler answer, provided your data is similar to what Eric posted. It should
work for your data too.

aa <- is.na(data)|data==0
nrow(data)-colSums(aa)

EK
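
As a rough check, the same two lines can be applied to the small table the asker posted earlier in the thread, restricted to the value columns (the third column onward, as in the original question). The data frame below is transcribed from that post; the names `vals` and `nz` are introduced here only for illustration.

```r
# The four rows posted by the asker
data <- data.frame(
  Year  = rep(2005, 4),
  Month = rep("July", 4),
  A     = c(0, 0, NA, 4),
  B     = c(4, NA, 4, 0),
  C     = c(NA, NA, 0, 2),
  D     = c(NA, 0, 1, 9),
  E     = c(1, 9, 0, NA)
)

# Keep only the value columns, skipping Year and Month
vals <- data[, 3:ncol(data)]

# TRUE wherever an entry is NA or zero
aa <- is.na(vals) | vals == 0

# Non-zero, non-NA entries per column
nz <- nrow(vals) - colSums(aa)
nz
# A B C D E
# 1 2 1 2 2
```

Without the column restriction, Year and Month would each be reported as 4, since none of their entries is zero or NA.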


Re: Count non-zero values in excluding NA Values

Temel İspanyolca
Thanks, Esawi, Barradas and Berger.

Sincerely
Engin YILMAZ


Re: Count non-zero values in excluding NA Values

Daniel Nordlund-3
In reply to this post by Temel İspanyolca

This looks like a good place for apply():

    apply(data,2,function(x) sum(x != 0, na.rm=TRUE))


Hope this is helpful,

Dan

--
Daniel Nordlund
Port Townsend, WA  USA
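
One caveat when using apply() on a data frame: it first converts the data frame to a matrix, and a single character column such as Month turns the whole matrix into character. Selecting the value columns first avoids this. The sketch below uses a hypothetical miniature of the asker's layout (two label columns followed by value columns); the data and the name `counts` are made up for illustration.

```r
# Hypothetical miniature: two label columns, then value columns
data <- data.frame(
  Year  = c(2005, 2005, 2006, 2006),
  Month = c("July", "August", "July", "August"),
  A     = c(0, 3, NA, 4),
  B     = c(NA, NA, 0, 5)
)

# The Month column forces the whole matrix to character
is.character(as.matrix(data))
# [1] TRUE

# Restrict to the value columns before calling apply()
counts <- apply(data[, 3:ncol(data)], 2, function(x) sum(x != 0, na.rm = TRUE))
counts
# A B
# 2 1
```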


Re: Count non-zero values in excluding NA Values

David Carlson
In reply to this post by Temel İspanyolca
You need to send plain text emails so that this does not happen to your data.

Assuming you want to ignore the first 2 columns:

> colSums(dta[, 3:7]>0, na.rm=TRUE)
      Raki     Whisky       Wine       Beer Cigarettes
       153        153        153        153         32

----------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77843-4352
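
The colSums() call above tests for values greater than zero, which matches "non-zero" only when the data contain no negative numbers. A small sketch of the difference, using a made-up data frame `x` (not the asker's file):

```r
# Made-up data with one negative value in column a
x <- data.frame(a = c(-2, 0, 3, NA), b = c(0, 0, 5, 1))

# Strictly positive values per column; the -2 is not counted
pos <- colSums(x > 0, na.rm = TRUE)
pos
# a b
# 1 2

# Non-zero values per column; the -2 is counted
nonzero <- colSums(x != 0, na.rm = TRUE)
nonzero
# a b
# 2 2
```

For counts or amounts that cannot be negative, the two forms give the same result.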

