Dear R Developers,

I would like to revisit the issue of 64-bit integers in R:

http://r.789695.n4.nabble.com/Re-R-support-for-64-bit-integers-td2320024.html

*** CURRENT SITUATION ***

At the moment, as regards integers, all the following are the same type:

* length of an R vector

* R integer type

* C int type (Fixed at 32 bits: In practice)

* Fortran INTEGER type (Fixed at 32 bits: By Standard)
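The shared 32-bit ceiling can be checked directly from C. This is only a minimal sketch: the function names are hypothetical, and it assumes a platform where int is 32 bits wide (the C standard itself only guarantees at least 16).

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: width of the platform's C int, in bits. */
int int_width_bits(void)
{
    return (int)(sizeof(int) * CHAR_BIT);
}

/* Largest value of a signed 32-bit integer: 2^31 - 1 = 2147483647.
   R reports the same number as .Machine$integer.max. */
int32_t max_int32(void)
{
    return INT32_MAX;
}
```

On mainstream platforms int_width_bits() gives 32, which is the "fixed in practice" case listed above.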

*** OBJECTIVE ***

Introducing 64-bit integers natively into "base R", notably allowing

them to be used for indices as well.

Ideally, both of the following would become 64-bit:

* the length of an R vector

* the R integer type

This would free us from the increasingly relevant limit on the

maximum length of an atomic object (2^31).
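To give a feeling for why this limit is increasingly relevant, the arithmetic can be sketched in C. The helper names are hypothetical, and the byte count assumes 8-byte doubles and ignores any per-object header.

```c
#include <stdint.h>

/* Bytes needed to hold n doubles (assumed to be 8 bytes each). */
uint64_t bytes_for_doubles(uint64_t n)
{
    return n * (uint64_t)sizeof(double);
}

/* 1 if a length of n elements cannot be stored in a signed 32-bit integer. */
int exceeds_int32_length(uint64_t n)
{
    return n > (uint64_t)INT32_MAX;
}
```

A double vector with 2^31 elements needs 2^34 bytes, i.e. 16 GiB, which is well within reach of current 64-bit machines.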

*** DIFFICULTIES ***

a) If both the R length type and the R integer type become the same

64-bit type and replace the current integer type, then every compiled

package would have to change to declare the arguments as int64 (or

long, on most 64-bit systems) and INTEGER*8.

b) If the R length type changes to something /different/ from the

integer type then any compiled code has to be checked to see if C int

arguments are lengths or integers, which is more work and more

error-prone.

c) On the other hand, changing the integer type to 64-bit will

presumably make integer code run noticeably more slowly on 32-bit

systems.
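Difficulty (b) can be illustrated with a small .C-style routine. This is a sketch only: both function names are hypothetical and no R headers are used. In today's convention the length and an ordinary integer argument look identical in the signature, so only reading the code tells them apart.

```c
#include <stdint.h>

/* Current convention: the vector length (n) and a plain integer
   argument (shift) both arrive as pointers to int. */
void add_shift32(double *x, const int *n, const int *shift)
{
    for (int i = 0; i < *n; i++)
        x[i] += *shift;
}

/* Case (b): only the length type is widened. The author of the package
   has to work out that n is a length while shift is an ordinary integer,
   and must also widen the loop counter. */
void add_shift_len64(double *x, const int64_t *n, const int *shift)
{
    for (int64_t i = 0; i < *n; i++)
        x[i] += *shift;
}
```

In case (a) both arguments would simply change type together, which is more mechanical but touches every compiled package.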

In any case, the changes could be postponed by having an option to

.C/.Call that forces lengths and integers to be passed as 32-bit. This

would mean that the code couldn't use large integers or large

vectors, but it would keep working indefinitely.
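Such a compatibility option would need a checked narrowing step somewhere. A minimal sketch, with a hypothetical function name, of what that check could look like in C:

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical shim: narrow a 64-bit value to the int that legacy
   compiled code expects. Returns 0 on success and -1 if the value does
   not fit, so the caller can raise an error instead of truncating. */
int narrow_to_int(int64_t n, int *out)
{
    if (n < INT_MIN || n > INT_MAX)
        return -1;
    *out = (int)n;
    return 0;
}
```

Legacy code called this way would see an error for large vectors rather than silently wrong results.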

*** 2010 SOLUTION ***

There were 2 possibilities at the time:

a) Using 64-bit integers.

b) Using "double precision integers": the solution finally chosen in 2010.

Reason: so that R packages using compiled code would not all have to be

patched extensively.
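The limitation of "double precision integers" is that an IEEE 754 double has a 53-bit significand, so whole numbers are only represented exactly up to 2^53. A minimal sketch with a hypothetical helper name:

```c
#include <stdint.h>

/* Returns 1 if v survives a round trip through double unchanged. */
int exact_as_double(int64_t v)
{
    double d = (double)v;
    /* Values that round up to 2^63 cannot be converted back safely. */
    if (d >= 9223372036854775808.0)
        return 0;
    return (int64_t)d == v;
}
```

With this sketch, 2^53 = 9007199254740992 is exact, while 2^53 + 1 is rounded to a neighbouring value and therefore fails the check.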

*** BIT64 PACKAGE ***

Nowadays, we have the 'bit64' package, which provides serializable S3

atomic 64-bit (signed) integers (+-2^63).

But these are not a replacement for 32-bit integers, as integer64 values:

* Are not supported for subscripting.

* Have different semantics when combined with double, e.g. integer64 +

double => integer64.
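As far as I understand, bit64 keeps each integer64 value in the 8 bytes of an R double and marks the vector with an S3 class, which is why it works without changes to base R but also why base R does not treat it as an integer. The reinterpretation can be sketched in C; the function names are hypothetical and this is not code from the package:

```c
#include <stdint.h>
#include <string.h>

/* Store the bit pattern of a 64-bit integer in a double. */
double int64_bits_as_double(int64_t v)
{
    double d;
    memcpy(&d, &v, sizeof d);
    return d;
}

/* Recover the 64-bit integer from the double's bit pattern. */
int64_t double_bits_as_int64(double d)
{
    int64_t v;
    memcpy(&v, &d, sizeof v);
    return v;
}
```

Read as an ordinary double, the stored value is meaningless (the integer 1 looks like a tiny denormal number), so every operation has to go through methods that know about the class.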

https://cran.r-project.org/web/packages/bit64/index.html

*** PROPOSAL ***

Instead of seeing 64-bit integers as a substitute for 32-bit integers,

they could be included in base R as a new, additional data type,

which co-exists with:

a) The existing 32-bit integers.

b) "Double precision integers".

This new data type could:

* Be based on (ported from) the "bit64" package:

https://github.com/cran/bit64

* Allow the int64 data type to be used for subscripting.

* Have coercion functions and rules such as:

as.integer64()

is.integer64()

integer + integer64 => integer

double + integer64 => double

* Be written with a double "L" suffix, e.g.: 34783274893274892334279LL (this

would be integer64, not double).
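How such a literal might be recognised can be sketched in C with strtoll. The function name and the return codes are hypothetical, and the sketch assumes long long is 64 bits wide, as on mainstream platforms. What should happen to an out-of-range literal (error, NA, or fallback to double) is a design decision left open here.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parser for the proposed "LL" literal suffix.
   Returns  0 and stores the value in *out on success,
           -1 if the text is not a number followed by exactly "LL",
           -2 if the number is outside the signed 64-bit range. */
int parse_int64_literal(const char *text, long long *out)
{
    char *end;
    errno = 0;
    long long v = strtoll(text, &end, 10);
    if (end == text || strcmp(end, "LL") != 0)
        return -1;
    if (errno == ERANGE)
        return -2;
    *out = v;
    return 0;
}
```

With this sketch "42LL" parses to 42, "42L" is rejected as a different kind of literal, and a literal with more than 19 digits is reported as out of range.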

By doing so, existing packages would not need to be recompiled and

could keep working as they already do, so we would not introduce any

backward-incompatible change.

*** FINAL KEY IDEA ***

Take the already developed "bit64" package

(https://github.com/cran/bit64)

as the base for building a new Integer64 type system which co-exists

natively in R with Integer32 (just as the "parallel" package was included

in base R in the past, for example), and build improvements on top of it.
