speed up R_IsNA, R_IsNaN for vector input

speed up R_IsNA, R_IsNaN for vector input

Jan Gorecki
Dear R developers,

I noticed that R_IsNA and R_IsNaN could be made faster when applied to a
vector: a small part of their logic could be taken out of the loop, run
once, and then reused in each iteration.
I set up a tiny plain-C experiment, taking R_IsNA and R_IsNaN from R's
arithmetic.c and building vector versions, R_vIsNA and R_vIsNaN, from them.
For a double input of size 1e9 (containing some NA and NaN values) I
observed the following timings:

R_IsNA    6.729s
R_vIsNA   4.386s

R_IsNaN   6.874s
R_vIsNaN  4.479s

ISNAN     4.392s

It looks like R_vIsN(A|aN) are close to ISNAN (which just wraps isnan
from math.h).
Should I follow up with a patch?
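
To make the comparison concrete, here is a minimal sketch of the kind of per-element check being timed. It is not the code from the gist: `low_word`, `make_na`, `is_na`, `is_nan_not_na` and `count_na` are hypothetical names, and the sketch assumes that `double` and `uint64_t` share the same byte order. The one fact borrowed from R is that the NA value for doubles is a NaN whose low 32-bit word is 1954.

```c
#include <math.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Low 32 bits of a double's bit pattern, read through memcpy so that
   no aliasing rules are violated. */
static uint32_t low_word(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return (uint32_t)(bits & 0xFFFFFFFFu);
}

/* Build a NaN that carries the NA payload (low word 1954). */
static double make_na(void)
{
    uint64_t bits = 0x7FF0000000000000ULL | 1954u;
    double x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* NA: a NaN with the NA payload. */
static int is_na(double x)
{
    return isnan(x) && low_word(x) == 1954u;
}

/* NaN in R's sense: a NaN that is not NA. */
static int is_nan_not_na(double x)
{
    return isnan(x) && low_word(x) != 1954u;
}

/* Vector loop: the check is defined in the same file, so the compiler
   is free to inline it into the loop body. */
static size_t count_na(const double *x, size_t n)
{
    size_t k = 0;
    for (size_t i = 0; i < n; i++)
        k += (size_t)is_na(x[i]);
    return k;
}
```

Calling `count_na` on an array that holds one value from `make_na()` and one plain `NAN` should count only the former, since an ordinary NaN does not carry the 1954 payload.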

The experiment is a single nan.c file of 127 lines (it includes the R C
functions), too large to paste into the email. Here is the link:
https://gist.github.com/jangorecki/c140fed3a3672620c1e2af90a768d785

Run it as:

gcc nan.c -lm
./a.out R_vIsNA 8
./a.out R_IsNA 8
./a.out R_vIsNaN 8
./a.out R_IsNaN 8
./a.out ISNAN 8

Best regards,
Jan Gorecki

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: speed up R_IsNA, R_IsNaN for vector input

Tomas Kalibera
On 9/29/19 1:09 PM, Jan Gorecki wrote:
> Dear R developers,
>
> I spotted that R_IsNA and R_IsNaN could be improved when applied on a
> vector where we could take out small part of their logic, run it once,
> and then reuse inside the loop.

Dear Jan,

Looking at your examples, I just see that you have hand-inlined
R_IsNA/R_IsNaN, or is there anything more? In principle we could put
R_IsNA and R_IsNaN into Rinlinedfuns.h to allow inlining across
compilation units, but we can't put all functions there, so there would
have to be a clear case of a performance problem in some specific
function in a different module.
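
The header approach mentioned above can be sketched as follows. The file names `na_inline.h` and `user.c` and the functions `my_is_na` and `first_na` are hypothetical, and this is not the content of R's Rinlinedfuns.h. It only shows the general C pattern: a `static inline` definition in a header is visible in every file that includes it, so the compiler can inline it into loops there, whereas a function defined only in another .c file generally requires a real call unless link-time optimization is enabled.

```c
/* --- na_inline.h (hypothetical header) --- */
#ifndef NA_INLINE_H
#define NA_INLINE_H
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Definition lives in the header, so every includer sees the body. */
static inline int my_is_na(double x)
{
    if (!isnan(x))
        return 0;
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return (uint32_t)bits == 1954u;   /* low word holds the NA payload */
}
#endif

/* --- user.c (hypothetical caller in another compilation unit) --- */
/* #include "na_inline.h"   (shown in one file here for brevity) */
#include <stddef.h>

/* Index of the first NA in x, or n if there is none. */
size_t first_na(const double *x, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (my_is_na(x[i]))
            return i;
    return n;
}
```

The trade-off is the one described above: every function moved into such a header is compiled into each file that uses it, so this is normally reserved for small functions with a demonstrated cost.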

In case you are curious, there are optimized checks for non-finite values
in vectors in array.c, which are used for matrix multiplication before
calling BLAS. These have to be fast, and the optimization is biased
towards the case where such values are rare and where it is OK to
sometimes report "there may be non-finite values" even when in fact
there are none.
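
A conservative check of this kind can be sketched as below. This is an illustration in the spirit of the description, not a copy of the code in array.c, and `may_have_nonfinite` is a hypothetical name. It tests the sum of each pair of elements instead of each element: any NaN or infinity makes the sum non-finite, but two large finite values can also overflow to infinity, which is the accepted false positive.

```c
#include <math.h>
#include <stddef.h>

/* Returns 1 if x may contain NaN or +/-Inf, 0 if it certainly does not. */
int may_have_nonfinite(const double *x, size_t n)
{
    size_t i = n & 1;

    /* Odd length: check the first element on its own. */
    if (i && !isfinite(x[0]))
        return 1;

    /* Remaining elements in pairs, one isfinite test per pair. */
    for (; i < n; i += 2) {
        double s = x[i] + x[i + 1];
        if (!isfinite(s))
            return 1;   /* possibly a false positive caused by overflow */
    }
    return 0;
}
```

For example, an array holding two values of 1e308 is reported as possibly non-finite although both elements are finite, because their sum overflows. A caller that only uses the answer to choose between a fast path and a careful path can tolerate that.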

Best
Tomas

Re: speed up R_IsNA, R_IsNaN for vector input

Jan Gorecki
Dear Tomas,

I thought the speedup came from taking

    ieee_double y;

out of the loop and reusing it across iterations.
I have now checked, and that was not the reason for the speedup.
So, as you wrote, it was only due to inlining.
I am surprised the difference is so significant.
Thank you,
Jan
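
The two loop shapes compared here can be sketched as follows. The union is a simplified stand-in for the `ieee_double` type in arithmetic.c (an assumption: it overlays a `uint64_t` instead of two 32-bit words), and both function names are hypothetical. In either shape the union is written on every iteration, so an optimizing compiler will typically treat the two the same way, which is consistent with the finding that hoisting the declaration did not explain the speedup.

```c
#include <math.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for R's ieee_double (hypothetical layout). */
typedef union { double value; uint64_t bits; } ieee_double_sketch;

/* Union declared inside the loop body. */
size_t count_na_inside(const double *x, size_t n)
{
    size_t k = 0;
    for (size_t i = 0; i < n; i++) {
        if (isnan(x[i])) {
            ieee_double_sketch y;
            y.value = x[i];
            k += ((uint32_t)y.bits == 1954u);
        }
    }
    return k;
}

/* Union declared once, before the loop, and reused. */
size_t count_na_hoisted(const double *x, size_t n)
{
    size_t k = 0;
    ieee_double_sketch y;
    for (size_t i = 0; i < n; i++) {
        if (isnan(x[i])) {
            y.value = x[i];
            k += ((uint32_t)y.bits == 1954u);
        }
    }
    return k;
}
```

Both functions return the same count for any input; only the placement of the declaration differs.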

On Mon, Sep 30, 2019 at 10:31 AM Tomas Kalibera
<[hidden email]> wrote:

>
> On 9/29/19 1:09 PM, Jan Gorecki wrote:
> > Dear R developers,
> >
> > I spotted that R_IsNA and R_IsNaN could be improved when applied on a
> > vector where we could take out small part of their logic, run it once,
> > and then reuse inside the loop.
>
> Dear Jan,
>
> Looking at your examples, I just see you have hand-inlined
> R_IsNA/R_IsNaN, or is there anything more? In principle we could put
> R_IsNA, R_IsNaN into Rinlinedfuns to allow inlining across compilation
> modules, but we can't put all functions there - so there would have to
> be a clear case for a performance problem in some specific function in a
> different module.
>
> If you were curious there are optimized checks for non-finite values in
> vectors in array.c, which are used for matrix multiplication before
> calling to BLAS. These have to be fast and the optimization is biased
> towards the case that such values are rare and that it is ok to
> sometimes say "there may be non-finite values" even when in fact they
> are not.
>
> Best
> Tomas
