"found 4 marked UTF-8 strings" during check of package... but where !

R help mailing list-2
Dear members,

I want to submit to CRAN a new version of a package that I maintain. When I
check it locally with --as-cran, no notes or errors are reported, but the
check results linked after submission report several notes and one warning:

For example:

using R Under development (unstable) (2017-03-05 r72309)
using platform: x86_64-apple-darwin16.4.0 (64-bit)
using session charset: UTF-8
...
checking extension type ... Package
this is package ‘embryogrowth’ version ‘6.4’
package encoding: UTF-8
...
checking data for non-ASCII characters ... NOTE
   Note: found 4 marked UTF-8 strings

I get the same note with
using R version 3.3.0 (2016-05-03)
using platform: x86_64-apple-darwin13.4.0 (64-bit)

but not with some other flavors, such as r-devel-linux-x86_64-debian-gcc.

Based on the message, "Note: found 4 marked UTF-8 strings", it seems
that four strings marked as UTF-8 are present in the package and that
this is a problem.

Is there a way to find out which file they are in?

Thanks
Marc

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: "found 4 marked UTF-8 strings" during check of package... but where !

Michael Friendly
Try:

tools::showNonASCIIfile(file)
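
As a minimal sketch of how that call might be applied across a package: showNonASCIIfile() reads a file as lines of text and reports the lines that contain non-ASCII characters, so it suits R sources, Rd files, and plain-text data rather than binary .rda files. The helper name show_non_ascii_in_dir, its default pattern, and the path in the usage comment are assumptions made for illustration; they are not part of the tools package.

```r
# Hypothetical helper: run tools::showNonASCIIfile() on every text file
# under 'dir' whose name matches 'pattern'.
show_non_ascii_in_dir <- function(dir, pattern = "\\.(R|Rd|txt|csv)$") {
  files <- list.files(dir, pattern = pattern, full.names = TRUE,
                      recursive = TRUE, ignore.case = TRUE)
  for (f in files) {
    cat("==", f, "==\n")
    # Reports the offending lines together with their line numbers.
    tools::showNonASCIIfile(f)
  }
  # Return the scanned file names invisibly so callers can inspect them.
  invisible(files)
}

# Example call; "path/to/package" is a placeholder.
# show_non_ascii_in_dir("path/to/package")
```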

On 3/10/2017 5:52 AM, Marc Girondot via R-help wrote:
> Based on the message, "Note: found 4 marked UTF-8 strings", it seems
> that "4 marked UTF-8 strings" are present in the package and it is a
> problem...
>
> Is there any solution to know in which file?


Re: "found 4 marked UTF-8 strings" during check of package... but where !

Duncan Murdoch-2
On 10/03/2017 2:52 AM, Marc Girondot via R-help wrote:

> Based on the message, "Note: found 4 marked UTF-8 strings", it seems
> that "4 marked UTF-8 strings" are present in the package and it is a
> problem...
>
> Is there any solution to know in which file?

It's one containing an object coming from your data directory.

R won't give more detail than that, but if you still can't guess, you
could get some idea by debugging the check code:

debug(tools:::.check_package_datasets)
tools:::.check_package_datasets(pkg)

where pkg contains the path to the package source code.  That function
does the checking one variable at a time.
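
If stepping through the debugger is inconvenient, a rough alternative is to load each data file and look for strings whose encoding mark is "UTF-8". The sketch below is only loosely modeled on the idea of checking one object at a time and is not the check code itself. It assumes the data are stored as .rda or .RData files in the data directory, and the function names find_marked_utf8 and scan_package_data are hypothetical.

```r
# Hypothetical helper: collect the strings marked as UTF-8 anywhere in an
# object, including its attributes (names, factor levels, dimnames, ...).
find_marked_utf8 <- function(x) {
  found <- character(0)
  walk <- function(obj) {
    if (is.character(obj)) {
      found <<- c(found, obj[Encoding(obj) == "UTF-8"])
    }
    # Attributes can hold character data too (e.g. levels of a factor).
    for (a in attributes(obj)) walk(a)
    # Descend into list elements (data frame columns are list elements).
    if (is.list(obj)) for (el in obj) walk(el)
  }
  walk(x)
  unique(found)
}

# Hypothetical helper: scan every .rda/.RData file in <pkg>/data and
# return the marked strings, keyed by "file: object".
scan_package_data <- function(pkg) {
  files <- list.files(file.path(pkg, "data"), pattern = "\\.(rda|RData)$",
                      full.names = TRUE, ignore.case = TRUE)
  hits <- list()
  for (f in files) {
    env <- new.env()
    load(f, envir = env)
    for (nm in ls(env, all.names = TRUE)) {
      s <- find_marked_utf8(get(nm, envir = env))
      if (length(s)) hits[[paste(basename(f), nm, sep = ": ")]] <- s
    }
  }
  hits
}

# Example call; "path/to/package" is a placeholder.
# scan_package_data("path/to/package")
```

Data shipped in other formats (for example .R or .csv files in the data directory) would need to be read by other means before calling find_marked_utf8().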

Duncan Murdoch


Re: "found 4 marked UTF-8 strings" during check of package... but where !

R help mailing list-2
Thanks Duncan and Michael,

Indeed, I have data files with UTF-8 characters inside. In the
DESCRIPTION file, I have the line Encoding: UTF-8,
but that does not seem to be sufficient.
In the R documentation page for each of these datasets, I also have:
#' @docType data
#' @encoding UTF-8

But I still get the notes during the check when I submit the package
to CRAN (not in the local --as-cran check).

How can I declare that these data contain UTF-8 characters?

Thanks
Marc




Re: "found 4 marked UTF-8 strings" during check of package... but where !

Duncan Murdoch-2
On 10/03/2017 10:44 AM, Marc Girondot via R-help wrote:
> Thanks Duncan and Michael,
>
> Indeed I have data file with utf-8 characters inside. In the
> DESCRIPTION, I have the line Encoding: UTF-8
> but it seems to not be sufficient.

That line describes how text files in your package are to be interpreted.

> In each R page for these data, I have also :
> #' @docType data
> #' @encoding UTF-8

R ignores those lines, but presumably you're running Roxygen2, which
will use them when it produces the .Rd files for the help topics.  They
have nothing to do with the data itself.
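
To illustrate the distinction: the "marked UTF-8" wording in the NOTE appears to refer to the encoding mark that R stores on each individual string, which Encoding() reports. A minimal sketch with made-up values:

```r
# Made-up example values; "\u00e9" is the escape for e-acute.
x <- c("embryo", "caf\u00e9")

# ASCII-only strings carry no mark; the non-ASCII one is marked UTF-8.
Encoding(x)
#> [1] "unknown" "UTF-8"

# Count the marked strings in a character vector.
n_marked <- sum(Encoding(x) == "UTF-8")
```

These marks are saved with the strings in the data files, so neither the Encoding field in DESCRIPTION nor the Roxygen2 tags change them.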

>
> But I still have the notes during check when I try to submit the package
> in CRAN (not in local --as-cran check).
>
> How I could "say" that these data have utf-8 characters inside?

If those are intentional, just say so when you submit to CRAN in the
submission comments, but do read the comments about portability in
section 1.6.3 of the Writing R Extensions manual.

Duncan Murdoch

P.S. This question doesn't belong in R-help, it belongs in
R-package-devel.  If you have any followup questions, please post them
there.


