[ALTREP] What is the meaning of the return value of Is_sorted and No_NA function?

Wang Jiefei
Hi,

I would like to understand the meaning of the return values of these two
functions. Here are the default definitions I found in the R source code:

static int altreal_Is_sorted_default(SEXP x) { return UNKNOWN_SORTEDNESS; }

static int altreal_No_NA_default(SEXP x) { return 0; }

I guess that UNKNOWN_SORTEDNESS in Is_sorted and 0 in No_NA simply mean that
the sorted/NA status of the vector is unknown, so R will loop over the vector
to find the answer. However, what should these functions return to indicate
whether the vector is sorted or contains NAs? My initial guess was 0/1, but
since No_NA uses 0 as its default value, that would be ambiguous. Are there
any macros that define yes/no return values for these functions? I would
appreciate any thoughts here.

Best,

Jiefei

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [ALTREP] What is the meaning of the return value of Is_sorted and No_NA function?

Gabriel Becker-2
Hi Jiefei,

The meanings of the return values for sortedness can be found in
Rinternals.h, and are as follows:

/* ALTREP sorting support */
enum {SORTED_DECR_NA_1ST = -2,
      SORTED_DECR = -1,
      UNKNOWN_SORTEDNESS = INT_MIN, /*INT_MIN is NA_INTEGER! */
      SORTED_INCR = 1,
      SORTED_INCR_NA_1ST = 2,
      KNOWN_UNSORTED = 0};

The default return value of Is_sorted is UNKNOWN_SORTEDNESS, i.e. NA_INTEGER
(INT_MIN), indicating that there is no sortedness information. Note that this
is distinct from KNOWN_UNSORTED (0).
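
As a rough illustration of how a caller might interpret these codes, here is
a minimal standalone sketch. It is not code from R: the enum is copied
locally so the sketch compiles without the R headers, and the helper names
has_sortedness_info and is_known_sorted are hypothetical.

```c
#include <limits.h>

/* Local copy of the sortedness codes quoted above, so that this sketch
   compiles without the R headers. */
enum {
    SORTED_DECR_NA_1ST = -2,
    SORTED_DECR = -1,
    UNKNOWN_SORTEDNESS = INT_MIN, /* same value as NA_INTEGER */
    SORTED_INCR = 1,
    SORTED_INCR_NA_1ST = 2,
    KNOWN_UNSORTED = 0
};

/* Hypothetical helper: does the code carry any information at all?
   Only UNKNOWN_SORTEDNESS means "no information". */
static int has_sortedness_info(int code)
{
    return code != UNKNOWN_SORTEDNESS;
}

/* Hypothetical helper: is the vector known to be sorted in either
   direction? KNOWN_UNSORTED and UNKNOWN_SORTEDNESS both yield 0 here,
   but for different reasons. */
static int is_known_sorted(int code)
{
    switch (code) {
    case SORTED_DECR_NA_1ST:
    case SORTED_DECR:
    case SORTED_INCR:
    case SORTED_INCR_NA_1ST:
        return 1;
    default:
        return 0;
    }
}
```

The point of the two helpers is that 0 (KNOWN_UNSORTED) is a definite answer,
whereas INT_MIN (UNKNOWN_SORTEDNESS) means the object cannot say.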

Currently, the *_No_NA methods effectively return a boolean (even though the
actual return type is int). This can be seen in the method we provide for
compact sequences in altclasses.c:


static int compact_intseq_No_NA(SEXP x)
{
#ifdef COMPACT_INTSEQ_MUTABLE
    /* If the vector has been expanded it may have been modified. */
    if (COMPACT_SEQ_EXPANDED(x) != R_NilValue)
        return FALSE;
#endif
    return TRUE;
}

(FALSE is 0 and TRUE is 1.)

Think of the return value of a No_NA method as the object's answer to the
following question:

"Are you sure there are zero NAs in your data?"

When it is sure of that, it says "yes" (returning 1, i.e. TRUE). When it
either is sure there are NAs *OR* doesn't have any information about
whether there are NAs, it says "no" (returning 0, i.e. FALSE).
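
To make that contract concrete, here is a minimal standalone sketch. It is
not taken from R: the struct my_vec, the enum na_knowledge and the function
my_vec_No_NA are all hypothetical stand-ins for whatever state a real ALTREP
class keeps behind its SEXP.

```c
#include <stddef.h>

/* Hypothetical tri-state record of what an object knows about its NAs. */
enum na_knowledge { NA_STATUS_UNKNOWN, NA_STATUS_NONE, NA_STATUS_PRESENT };

/* Hypothetical stand-in for the data behind an ALTREP object. */
struct my_vec {
    const double *data;
    size_t n;
    enum na_knowledge na_status;
};

/* Mirrors the No_NA contract described above: answer 1 only when the
   object is sure it holds no NAs. Both "NAs are present" and "no
   information" collapse to 0. */
static int my_vec_No_NA(const struct my_vec *v)
{
    return v->na_status == NA_STATUS_NONE ? 1 : 0;
}
```

A return value of 0 therefore never claims that NAs exist; it only declines
to promise that they don't.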

Also, please note that there may be another API point in the future which
asks the object *how many NAs it has*. If that materializes, No_NA would
just consume the answer to that to get the binarized version, but again,
there is nothing like that in there now.

Hope that helps.

Best,
~G

On Wed, Sep 11, 2019 at 12:04 AM Wang Jiefei <[hidden email]> wrote:

> [...]


Re: [ALTREP] What is the meaning of the return value of Is_sorted and No_NA function?

Wang Jiefei
Hi Gabriel,

Thanks for your answer and for the future update plan. Somehow this email was
delayed for a week, so there might be a weird reply from me saying that I
have found the answer in the R source code; I sent that one last week.
Hopefully, this reply will not take another week to post :)

As a side note, I like the idea of defining named constants for sortedness,
and I can see why we can only have a binary answer for No_NA (since the
return value is effectively a boolean). To make code more readable, and
possibly forward-compatible with future R releases, would it be possible to
define similar macros for the No_NA return values in Rinternals.h? Then, if
the No_NA function ever changes, there would be no need to modify existing
code.

Best,
Jiefei

On Wed, Sep 11, 2019 at 1:58 PM Gabriel Becker <[hidden email]>
wrote:

> [...]


NiceBayes filtered "[ALTREP] ... return value ..."

Martin Maechler
>>>>> Wang Jiefei
>>>>>     on Wed, 11 Sep 2019 14:49:13 -0400 writes:

    > Hi Gabriel,
    > Thanks for your answer and future update plan. Somehow this email has been
    > delayed for a week, so there might be a weird reply from me saying that I
    > have found the answer from the R source code, it was sent from me last
    > week. Hopefully, this reply will not cost another week to post:)

All our e-mail is, fortunately, heavily spam filtered through
quite a few filters whose scores add up to a final spam score; when
that is too high, the message is "diverted" to the spam
collection.
In your case, the "NiceBayes" spam filter somehow decided to give
the message quite a high score, and that got a relatively large
weight
(maybe you should stop using all capitals such as ALTREP in your
 subject!?)

We, the volunteer mailing list moderators, get (daily or weekly, in
this case daily) e-mails from the spam software giving us a full
list of the filtered messages. However, we usually lack the
time to go through that list carefully, notably for R-help or
R-devel, where the list is quite long.
So I only detected your "ham" message among the many dozens of
spam ones a day ago, and released it.

Martin Maechler
ETH Zurich
