Possible bug in formatC

Possible bug in formatC

Randy Cragun
I do not know if this is a bug or a case of improper documentation. The
documentation for formatC() implies that the difference between the options
format="f" and format="g" is that with "g", scientific format is sometimes
used. There is another difference between them that is not mentioned in the
documentation. drop0trailing=FALSE is ignored when format is set to "g"
unless flag contains "#" (this is the documented behavior for format="fg").
For instance, the first line below returns " 2.5", whereas the second returns
the expected "2.50".

formatC(2.50, format="g", digits=3, drop0trailing=FALSE)
formatC(2.50, format="g", digits=3, drop0trailing=FALSE, flag="#")


----------------------
sessionInfo():

R version 3.5.3 (2019-03-11)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base    

loaded via a namespace (and not attached):
[1] compiler_3.5.3 tools_3.5.3

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: Possible bug in formatC

Martin Maechler
>>>>> Randy Cragun
>>>>>     on Thu, 30 May 2019 00:26:38 -0400 writes:

    > I do not know if this is a bug or a case of improper
    > documentation. The documentation for formatC() implies
    > that the difference between the options format="f" and
    > format="g" is that with "g", scientific format is
    > sometimes used. There is another difference between them
    > that is not mentioned in the
    > documentation. drop0trailing=FALSE is ignored when format
    > is set to "g" unless flag contains "#" (this is the
    > documented behavior for format="fg").  For instance, the
    > first line below returns " 2.5", whereas the second returns
    > the expected "2.50".

    > formatC(2.50, format="g", digits=3, drop0trailing=F)
    > formatC(2.50, format="g", digits=3, drop0trailing=F, flag="#")

Well, you have a point that this behavior is not documented in
detail (and I assume the text reference "Kernighan and Ritchie"
is less available to the typical R user than it was in 1995...)

However, formatC() has been unchanged like that for close to 20
years, so we will most probably not change the function's behavior.

Notice that   drop0trailing=FALSE  is really the default
(and format="g" is also the default for non-character / non-integer numbers).

By design, formatC(*) for numbers defaults to format="g", which
drops trailing zeros most of the time
[whereas format = "f" does not, unless drop0trailing=TRUE is set.]
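
The contrast described above can be tried directly. This is only a small sketch using the value and digits from the original report; the variable names are illustrative, and the comments state what the thread leads one to expect rather than guaranteed output on every platform.

```r
x <- 2.5

# format = "g": trailing zeros are dropped unless flag contains "#"
g_plain <- formatC(x, format = "g", digits = 3)
g_hash  <- formatC(x, format = "g", digits = 3, flag = "#")

# format = "f": trailing zeros are kept unless drop0trailing = TRUE
f_plain <- formatC(x, format = "f", digits = 2)
f_drop  <- formatC(x, format = "f", digits = 2, drop0trailing = TRUE)

print(c(g = g_plain, g_hash = g_hash, f = f_plain, f_drop = f_drop))
```

Note that drop0trailing=TRUE only ever removes zeros; it is the "#" flag, not drop0trailing=FALSE, that keeps them with format="g".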

Lastly, note that 2.50 and 2.5 are exactly identical as R
numbers; so, your two examples above are identical to the much shorter

   formatC(2.5, digits=3)
   formatC(2.5, digits=3, flag="#")
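
As a quick check of the two points above (2.50 and 2.5 being the same number, and format="g" with drop0trailing=FALSE being the defaults for doubles), something like the following could be run. The variable names are made up for illustration.

```r
# The literal 2.50 is parsed to exactly the same double as 2.5
same_number <- identical(2.50, 2.5)

# Spelling out the defaults should give the same string as the short call
explicit <- formatC(2.50, format = "g", digits = 3, drop0trailing = FALSE)
short    <- formatC(2.5, digits = 3)
same_string <- identical(explicit, short)

print(c(same_number = same_number, same_string = same_string))
```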

If you want "extraneous" trailing zeros, the "f" format is your
friend most of the time anyway:

> t(sapply(1:8, function(D) formatC(c(2.5,pi), format="f", digits= D)))
     [,1]         [,2]        
[1,] "2.5"        "3.1"      
[2,] "2.50"       "3.14"      
[3,] "2.500"      "3.142"    
[4,] "2.5000"     "3.1416"    
[5,] "2.50000"    "3.14159"  
[6,] "2.500000"   "3.141593"  
[7,] "2.5000000"  "3.1415927"
[8,] "2.50000000" "3.14159265"
>

I will add more information to the formatC() help page,
not only mentioning but also explaining most of the
'flag's that are typically available(*).

Thank you for raising the issue.

Martin Maechler
ETH Zurich and R Core Team

--
*) as formatC() interfaces to the OS C library, some of the
   availability and meaning of 'flags' is platform dependent.
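
Because the formatting is ultimately done by C-style conversions, the effect of "#" can also be seen with R's own sprintf(). The sketch below assumes that format="g" with digits=3 corresponds to the C conversion "%.3g"; that mapping is an assumption made for illustration, not something stated in the thread.

```r
x <- 2.5

# C's %g conversion removes trailing zeros ...
c_plain <- sprintf("%.3g", x)
# ... unless the "#" (alternate form) flag is given
c_hash  <- sprintf("%#.3g", x)

print(c(plain = c_plain, hash = c_hash))

# Compare with formatC(); printed rather than asserted, since the
# correspondence with "%.3g" is assumed here
print(c(formatC(x, format = "g", digits = 3),
        formatC(x, format = "g", digits = 3, flag = "#")))
```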



