Parsing code with newlines

Peter Jaeger
Dear List,

When I try to parse code containing newline characters with R_ParseVector, I
get a compilation error. How can I compile code that includes comments and
newlines?

I am using the following:

void* my_compile(char *code)
{
    SEXP cmdSexp, cmdExpr = R_NilValue;
    ParseStatus status;

    PROTECT (cmdSexp = allocVector (STRSXP, 1));
    SET_STRING_ELT (cmdSexp, 0, mkChar (code));
    PROTECT (cmdExpr = R_ParseVector (cmdSexp,-1,&status,
        R_NilValue));
    UNPROTECT_PTR (cmdSexp);

    if (status != PARSE_OK) {
        return (void*)0;
    } else {
        return (void*)cmdExpr;
    }
}

Regards,
Peter


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: Parsing code with newlines

Duncan Murdoch
On 04/08/2008 8:50 AM, Peter Jaeger wrote:

> Dear List,
>
> When I try to parse code containing newline characters with R_ParseVector, I
> get a compilation error. How can I compile code that includes comments and
> newlines?
>
> I am using the following:
>
> void* my_compile(char *code)
> {
>     SEXP cmdSexp, cmdExpr = R_NilValue;
>     ParseStatus status;
>
>     PROTECT (cmdSexp = allocVector (STRSXP, 1));
>     SET_STRING_ELT (cmdSexp, 0, mkChar (code));
>     PROTECT (cmdExpr = R_ParseVector (cmdSexp,-1,&status,
>         R_NilValue));
>     UNPROTECT_PTR (cmdSexp);
>
>     if (status != PARSE_OK) {
>         return (void*)0;
>     } else {
>         return (void*)cmdExpr;
>     }
> }

You need to put together a reproducible example if you want help.
parse() uses R_ParseVector, and it handles newlines fine.

Duncan Murdoch


Re: Parsing code with newlines

Prof Brian Ripley
In reply to this post by Peter Jaeger
This adds nothing to your previous post: please don't be annoying and post
almost identical messages.

I strongly suspect you mean 'parse error' and 'how can I parse R code',
but we don't know what the example was and what the error message was.
Nor do we know what you are doing with this fragment of C code (and
returning an unprotected SEXP via a (void *) cast is a recipe for tears).

parse(text=) uses R_ParseVector, and that works for 'code that includes
comments and newlines', so please check out the internal code used (in
src/main/gram.c).

If you study the posting guide you should be able to formulate a posting
that people can actually help you with.


--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: Parsing code with newlines

Jeffrey Horner
In reply to this post by Duncan Murdoch
Duncan Murdoch wrote on 08/04/2008 08:11 AM:

> On 04/08/2008 8:50 AM, Peter Jaeger wrote:
>> Dear List,
>>
>> When I try to parse code containing newline characters with
>> R_ParseVector, I
>> get a compilation error. How can I compile code that includes comments
>> and
>> newlines?
>>
>> I am using the following:
>>
>> void* my_compile(char *code)
>> {
>>     SEXP cmdSexp, cmdExpr = R_NilValue;
>>     ParseStatus status;
>>
>>     PROTECT (cmdSexp = allocVector (STRSXP, 1));
>>     SET_STRING_ELT (cmdSexp, 0, mkChar (code));
>>     PROTECT (cmdExpr = R_ParseVector (cmdSexp,-1,&status,
>>         R_NilValue));
>>     UNPROTECT_PTR (cmdSexp);
>>
>>     if (status != PARSE_OK) {
>>         return (void*)0;
>>     } else {
>>         return (void*)cmdExpr;
>>     }
>> }
>
> You need to put together a reproducible example if you want help.
> parse() uses R_ParseVector, and it handles newlines fine.

As a follow up, it'd be good to know the exact value of your status
variable. You've only tested for PARSE_OK, but there's also
PARSE_INCOMPLETE, PARSE_NULL, PARSE_ERROR, and PARSE_EOF.
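
To see which of these values comes back, a small lookup helper can be handy. The following is only a sketch: parse_status_name() is a hypothetical helper, not part of the R API, and HAVE_R_HEADERS is a made-up macro for switching between the real header and a stand-in enum (which mirrors the declaration in R_ext/Parse.h) so that the snippet compiles on its own.

```c
#include <stddef.h>

#ifdef HAVE_R_HEADERS   /* hypothetical macro: define it when building against R */
#include <Rinternals.h>
#include <R_ext/Parse.h>
#else
/* Stand-in mirroring the enum in R_ext/Parse.h, so the sketch compiles alone. */
typedef enum {
    PARSE_NULL, PARSE_OK, PARSE_INCOMPLETE, PARSE_ERROR, PARSE_EOF
} ParseStatus;
#endif

/* Map a ParseStatus value to a printable name (hypothetical helper). */
const char *parse_status_name(ParseStatus status)
{
    switch (status) {
    case PARSE_NULL:       return "PARSE_NULL";
    case PARSE_OK:         return "PARSE_OK";
    case PARSE_INCOMPLETE: return "PARSE_INCOMPLETE";
    case PARSE_ERROR:      return "PARSE_ERROR";
    case PARSE_EOF:        return "PARSE_EOF";
    }
    return "unknown";
}
```

Printing parse_status_name(status) right after the call to R_ParseVector() would show whether the parser saw a syntax error or merely incomplete input.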

Here's a function that I use in rapache that not only parses but
executes the code as well. While it doesn't really help you with your
parsing problem, I suspect that you'll want to do something with the
returned expressions after you've parsed the code, and the point is that
R_ParseVector() can return more than one expression. Thus you'll need to
loop through each expression and eval() it separately. The function
returns 1 when the code was parsed and executed, and 0 on failure.

(It's been a while since I've had to touch this, and although I do keep
up with R development, my skills at remembering which macros and
functions to use are lacking. Does anyone spot something I shouldn't be
doing, like mkChar() or some such?)

static int ExecRCode(const char *code, SEXP env, int *error){
    ParseStatus status;
    SEXP cmd, expr;
    int i, errorOccurred = 1, retval = 1;

    /* Wrap the C string in a length-1 character vector for the parser. */
    PROTECT(cmd = allocVector(STRSXP, 1));
    SET_STRING_ELT(cmd, 0, mkChar(code));

    /* fprintf(stderr,"ExecRCode(%s)\n",code); */
    /* n = -1: parse every expression in the input, not just the first. */
    PROTECT(expr = R_ParseVector(cmd, -1, &status, R_NilValue));

    switch (status){
    case PARSE_OK:
        /* Evaluate the expressions one by one, stopping at the first error. */
        for (i = 0; i < length(expr); i++){
            R_tryEval(VECTOR_ELT(expr, i), env, &errorOccurred);
            if (error) *error = errorOccurred;
            if (errorOccurred){
                retval = 0;
                break;
            }
        }
        break;
    case PARSE_INCOMPLETE:
    case PARSE_NULL:
    case PARSE_ERROR:
    case PARSE_EOF:
    default:
        /* Any status other than PARSE_OK counts as failure. */
        retval = 0;
        break;
    }
    UNPROTECT(2);

    return retval;
}





--
http://biostat.mc.vanderbilt.edu/JeffreyHorner


Re: Parsing code with newlines

Mikhail Titov-2
In reply to this post by Prof Brian Ripley
Hello!

This is my first post here. I came across the very same problem.
It can be reproduced within modified tests/Embedding/RParseEval.c

Actually this example has another issue: it does not wrap
everything in R_ToplevelExec. This is a major show stopper for
newcomers, as that function is barely mentioned anywhere, and a longjmp into
the terminated setuploop function followed by R_suicide looks like a mystery.

Error: bad value
Fatal error: unable to initialize the JIT


That aside, here is the code with newlines that fails to parse. I hope
it will paste alright here.


#include "embeddedRCall.h"
#include <R_ext/Parse.h>

int
main(int argc, char *argv[])
{
    SEXP e, tmp;
    int hadError;
    ParseStatus status;

    init_R(argc, argv);

    PROTECT(tmp = mkString("\n\r ls()"));
    PROTECT(e = R_ParseVector(tmp, 1, &status, R_NilValue));
    if (status != PARSE_OK)
    {
        printf("boo boo\n");
    }
    else
    {
        PrintValue(e);
        R_tryEval(VECTOR_ELT(e,0), R_GlobalEnv, &hadError);
    }
    UNPROTECT(2);

    end_R();
    return(0);
}


--
Mikhail


Re: Parsing code with newlines

Tomas Kalibera
On 4/5/19 8:14 AM, Mikhail Titov wrote:
> Hello!
>
> This is my first post here. I came across the very same problem.
> It can be reproduced within modified tests/Embedding/RParseEval.c

Please check https://www.r-project.org/posting-guide.html and update
your post if you still need to get help here - from your current post I
am not sure what you did, what was the error you got and from which
tool, why you think the error was a result of something not working
correctly/as documented, etc. The original post with the same subject
you are probably referring to had the same problem.

Please also note that "tests" (tests/Embedding/RParseEval.c) are not
examples - if they do not catch R errors in some cases that is perfectly
ok, they also may use internal API that is indeed not documented e.g. in
Writing R Extensions. Note Writing R Extensions has a section on
embedding R and on cleanup handlers.

Best
Tomas



Re: Parsing code with newlines

Mikhail Titov-2
On Wed, Apr 10, 2019 at  5:06 AM, Tomas Kalibera <[hidden email]> wrote:
>> This is my first post here. I came across the very same problem.
>> It can be reproduced within modified tests/Embedding/RParseEval.c
>
> Please check https://www.r-project.org/posting-guide.html and update
> your post if you still need to get help here - from your current post
> I am not sure what you did, what was the error you got and from which
> tool, why you think the error was a result of something not working
> correctly/as documented, etc. The original post with the same subject
> you are probably referring to had the same problem.

The original post is linked via e-mail headers; however, it goes back a
decade. It shows up linked as a thread all right in Gnus, hence I thought
it would be fine to jump straight to the matter.

Here is the link to original discussion
https://stat.ethz.ch/pipermail/r-devel/2008-August/050332.html

At this point, I would like to report two bugs in the "Writing R Extensions"
documentation. From that document it is not clear why line feeds (0x0A)
have to be removed from the input string to be parsed. Also, nowhere does
that document mention R_ToplevelExec, which is needed if parsing is done in
the outer context, that is, not when our C function is called from R but
when we are trying to parse R code in C directly, outside of the main loop.
These are big show stoppers for newcomers.

The barely modified test code I had in my previous post, does not parse
what would seem a legit sample string "\r\n ls()". However, it does
parse alright "\n ls()". Nowhere in the docs the intolerance to line
feeds is mentioned. It is reproducible from R console as well.

,----[ R console session ]
| > parse(text="\r\n ls()")
| Error in parse(text = "\r\n ls()") : <text>:1:1: unexpected input
| 1:
|     ^
| >
`----

Another problem with the aforementioned documentation concerns parsing
erroneous expressions like "deadbeef<-function(,bad){}" in top level
context. Instead of returning an error from parsing, R crashes
(with R_suicide) unless the call is wrapped in R_ToplevelExec.

> Please also note that "tests" (tests/Embedding/RParseEval.c) are not
> examples - if they do not catch R errors in some cases that is
> perfectly ok, they also may use internal API that is indeed not
> documented e.g. in Writing R Extensions.

Where would I find a good example of top level context parsing then? I have
no problem with skipping error checks or with using undocumented
functions. However, I would prefer to avoid major unexpected
crashes. That example does NOT use any undocumented API and is therefore
misleading. I believe it SHOULD include R_ToplevelExec, and that function
SHOULD be in the docs.

> Note Writing R Extensions has a section on embedding R and on cleanup
> handlers.

I have no problems with the rest of the document on embedding and clean
up in general.


--
Mikhail


Re: Parsing code with newlines

Peter Dalgaard-2
"\r" is CR not LF. On systems that use CRLF as newline, the combination should be "\n" at the C (or R) level.

However, I suppose there is no particular reason not to treat CR as whitespace, as does happen with FF and HT.
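
Until then, an embedding application that receives text with CR or CRLF line endings could normalize it before handing it to the parser. A minimal sketch in plain C follows. normalize_newlines() is a hypothetical helper, not part of the R API, and treating a lone CR as a line break is an assumption about what the caller wants.

```c
#include <assert.h>
#include <stddef.h>

/* Rewrite s in place so that "\r\n" and a lone "\r" both become "\n".
 * Returns the new length. Hypothetical helper, not part of the R API. */
size_t normalize_newlines(char *s)
{
    size_t r = 0, w = 0;

    assert(s != NULL);
    while (s[r] != '\0') {
        if (s[r] == '\r') {
            s[w++] = '\n';
            /* Swallow the LF of a CRLF pair so it yields a single newline. */
            if (s[r + 1] == '\n')
                r++;
        } else {
            s[w++] = s[r];
        }
        r++;
    }
    s[w] = '\0';
    return w;
}
```

The buffer must be writable (for example a char array or a strdup() copy, not a string literal). After the call, the string can be passed to mkString() and R_ParseVector() as in the code earlier in the thread.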

-pd

> On 11 Apr 2019, at 01:59 , Mikhail Titov <[hidden email]> wrote:
>
> The barely modified test code I had in my previous post, does not parse
> what would seem a legit sample string "\r\n ls()". However, it does
> parse alright "\n ls()". Nowhere in the docs the intolerance to line
> feeds is mentioned. It is reproducible from R console as well.
>
> ,----[ R console session ]
> | > parse(text="\r\n ls()")
> | Error in parse(text = "\r\n ls()") : <text>:1:1: unexpected input
> | 1:
> |     ^
> | >
> `----
>

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: [hidden email]  Priv: [hidden email]
