longint


Benjamin Tyner
Hi

In my R package, imagine I have a C function defined:

    void myfunc(int *x) {
       // some code
    }

but when I call it, I pass it a pointer to a longint instead of a
pointer to an int. Could this practice potentially result in a segfault?

Regards
Ben

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: longint

Duncan Murdoch-2
On 15/08/2018 7:08 AM, Benjamin Tyner wrote:

> Hi
>
> In my R package, imagine I have a C function defined:
>
>      void myfunc(int *x) {
>         // some code
>      }
>
> but when I call it, I pass it a pointer to a longint instead of a
> pointer to an int. Could this practice potentially result in a segfault?

I don't think the passing would cause a segfault, but "some code" might
be expecting a positive number, and due to the type error you could pass
in a positive longint and have it interpreted as a negative int.
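
To make the sign flip concrete, here is a minimal sketch in C. It is not taken
from any package: the name low_half_as_int32 is made up for illustration, and
it models only the case where the 32-bit read sees the low half of the 64-bit
value (as on a little-endian machine).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret the low 32 bits of a 64-bit value as a
   signed 32-bit int, which is what a 32-bit read of the low half would see. */
int32_t low_half_as_int32(int64_t value)
{
    uint32_t low = (uint32_t)((uint64_t)value & 0xFFFFFFFFu);
    int32_t out;
    memcpy(&out, &low, sizeof out);   /* plain bit copy, no value conversion */
    return out;
}
```

Under that assumption, 2147483648 (positive as a 64-bit value) comes back as
-2147483648, while small values such as 5 come back unchanged.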

Duncan Murdoch


Re: longint

Brian Ripley


> On 15 Aug 2018, at 12:48, Duncan Murdoch <[hidden email]> wrote:
>
>> On 15/08/2018 7:08 AM, Benjamin Tyner wrote:
>> Hi
>> In my R package, imagine I have a C function defined:
>>     void myfunc(int *x) {
>>        // some code
>>     }
>> but when I call it, I pass it a pointer to a longint instead of a
>> pointer to an int. Could this practice potentially result in a segfault?
>
> I don't think the passing would cause a segfault, but "some code" might be expecting a positive number, and due to the type error you could pass in a positive longint and have it interpreted as a negative int.

Are you thinking only of a little-endian system?  A 32-bit lookup of a pointer to a 64-bit area could read the wrong half and get a completely different value.
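
A small sketch of the "wrong half" effect, assuming nothing beyond standard C.
The helper names first_four_bytes and host_is_little_endian are invented for
this example; memcpy stands in for the int-sized read so that the example
itself stays well defined.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the first four bytes of a 64-bit object into a
   32-bit int, i.e. the bytes an int-sized read at that address would fetch. */
int32_t first_four_bytes(const int64_t *p)
{
    int32_t out;
    memcpy(&out, p, sizeof out);
    return out;
}

/* Returns 1 on a little-endian host, 0 otherwise. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

For a 64-bit value of 1, first_four_bytes() gives 1 on a little-endian host
but 0 on a big-endian one, because there the first four bytes are the high
half.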


Re: longint

Hervé Pagès-2
No segfault but a BIG warning from the compiler. That's because
dereferencing the pointer inside your myfunc() function will
produce an int that is not predictable i.e. it is system-dependent.
Its value will depend on sizeof(long int) (which is not
guaranteed to be 8) and on the endianness of the system.

Also if the pointer you pass in the call to the function is
an array of long ints, then pointer arithmetic inside your myfunc()
won't necessarily take you to the array element that you'd expect.
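
The stride problem can be sketched as follows. The function names are
hypothetical and the code only computes byte offsets; it does not read through
a mismatched pointer.

```c
#include <stddef.h>

/* Byte offset of element i when the array is indexed through an int pointer. */
size_t offset_as_int(size_t i)
{
    return i * sizeof(int);
}

/* Byte offset of element i when the array really holds long ints. */
size_t offset_as_long(size_t i)
{
    return i * sizeof(long);
}

/* 1 if int indexing lands on the start of long element i, else 0. */
int strides_agree(size_t i)
{
    return offset_as_int(i) == offset_as_long(i);
}
```

On a typical 64-bit Linux build, where long is 8 bytes and int is 4, x[1] read
through an int pointer lands in the middle of the first long rather than on
the second one.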

Note that there are very specific situations where you can actually
do this kind of thing, e.g. in the context of writing a callback
function to pass to qsort(). See 'man 3 qsort' if you are on a Unix
system. In that case pointers to void and explicit casts should
be used. If done properly, this is portable code and the compiler won't
issue warnings.
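
A minimal sketch of the qsort() pattern, assuming an array of long ints. The
names compare_long and sort_longs are made up for this example; only qsort()
itself is the standard library's.

```c
#include <stdlib.h>

/* Comparison callback in the form qsort() expects: it receives pointers to
   void and casts them back to the real element type. */
static int compare_long(const void *a, const void *b)
{
    const long x = *(const long *)a;
    const long y = *(const long *)b;
    return (x > y) - (x < y);   /* avoids the overflow that x - y could cause */
}

/* Sort n long ints in ascending order. */
void sort_longs(long *values, size_t n)
{
    qsort(values, n, sizeof values[0], compare_long);
}
```

The casts are safe here because the callback is only ever handed addresses of
elements of the array that was passed in, so the pointed-to type really is long.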

H.



--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: [hidden email]
Phone:  (206) 667-5791
Fax:    (206) 667-1319


Re: longint

Benjamin Tyner
Thanks for the replies and for confirming my suspicion.

Interestingly, src/include/S.h uses a trick:

    #define longint int

and so does the nlme package (within src/init.c).


Re: longint

R devel mailing list
Note that include/S.h contains
  /*
     This is a legacy header and no longer documented.
     Code using it should be converted to use R.h
  */
  ...
  /* is this a good idea? - conflicts with many versions of f2c.h */
  # define longint int

S.h was meant to be used while converting C code written for S or S+ to R.
S/S+ "integers" are represented as C "long ints", whose size depends on
the architecture, while R "integers" are represented as 32-bit C "ints".
"longint" was invented to hide this difference.


Bill Dunlap
TIBCO Software
wdunlap tibco.com
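
As a rough sketch of what the longint alias buys you: legacy code declared
with longint compiles unchanged, and once longint means int its pointers match
R's 32-bit integer storage. The routine name cumulative_sum is hypothetical
and does not come from S.h or nlme.

```c
/* Same idea as the legacy header: make the old type name mean int. */
#define longint int

/* Hypothetical routine in the style of code ported from S: replaces x with
   its running total. Both arguments are pointers, as legacy S-style
   interfaces pass everything by reference. */
void cumulative_sum(longint *x, const longint *n)
{
    for (longint i = 1; i < *n; i++)
        x[i] += x[i - 1];
}
```

Without the alias (or with longint defined as long int), the same routine would
step through R's integer vector with the wrong element size wherever long is
wider than int.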


Re: longint

Dirk Eddelbuettel
In reply to this post by Benjamin Tyner

On 15 August 2018 at 20:32, Benjamin Tyner wrote:
| Thanks for the replies and for confirming my suspicion.
|
| Interestingly, src/include/S.h uses a trick:
|
|     #define longint int
|
| and so does the nlme package (within src/init.c).

As Bill Dunlap already told you, this is a) ancient and b) was concerned with
the transition period from 16-bit to 32-bit int, i.e. a long time ago. Old C
programmers remember.

You should preferably not even use 'long int' on the other side but rely on
the fact that all compilers nowadays allow you to specify exactly what size is
used via int64_t (long), int32_t (int), ... and the unsigned cousins (which R
does not have).  So please receive the value as an int64_t and then cast it to
an int32_t -- which corresponds to R's notion of an integer on every platform.

And please note that this conversion is lossy.  If you must keep 64 bits then
the bit64 package by Jens Oehlschlaegel is good and is, e.g., fully supported
inside data.table. We use it for 64-bit integers as nanosecond timestamps in our
nanotime package (which has some converters).
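
Since the narrowing is lossy, one option is to check the range before casting.
This is only a sketch; narrow_to_int32 is a made-up name, and what to do with
out-of-range values is left to the caller.

```c
#include <stdint.h>

/* Hypothetical helper: narrow a 64-bit value to 32 bits. Returns 1 and stores
   the result when the value fits; returns 0 and leaves *out untouched when it
   does not. */
int narrow_to_int32(int64_t value, int32_t *out)
{
    if (value < INT32_MIN || value > INT32_MAX)
        return 0;
    *out = (int32_t)value;
    return 1;
}
```

A caller can then decide whether an out-of-range value should be an error, a
missing value, or a reason to keep the data in 64 bits.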

Dirk

--
http://dirk.eddelbuettel.com | @eddelbuettel | [hidden email]


Re: longint

Hervé Pagès-2
On 08/16/2018 05:12 AM, Dirk Eddelbuettel wrote:

>
> On 15 August 2018 at 20:32, Benjamin Tyner wrote:
> | Thanks for the replies and for confirming my suspicion.
> |
> | Interestingly, src/include/S.h uses a trick:
> |
> |     #define longint int
> |
> | and so does the nlme package (within src/init.c).
>
> As Bill Dunlap already told you, this is a) ancient and b) was concerned with
> the int as 16 bit to 32 bit transition period. Ie a long time ago. Old C
> programmers remember.
>
> You should preferably not even use 'long int' on the other side but rely on
> the fact that all compiler nowadays allow you to specify exactly what size is
> used via int64_t (long), int32_t (int), ... and the unsigned cousins (which R
> does not have).  So please receive the value as a int64_t and then cast it to
> an int32_t -- which corresponds to R's notion of an integer on every platform.

Only on Intel platforms int is 32 bits. Strictly speaking int is only
required to be >= 16 bits. Who knows what the size of an int is on
the Sunway TaihuLight for example ;-)

H.


Re: longint

Brian Ripley
On 16/08/2018 18:33, Hervé Pagès wrote:

> On 08/16/2018 05:12 AM, Dirk Eddelbuettel wrote:
>>
>> On 15 August 2018 at 20:32, Benjamin Tyner wrote:
>> | Thanks for the replies and for confirming my suspicion.
>> |
>> | Interestingly, src/include/S.h uses a trick:
>> |
>> |     #define longint int
>> |
>> | and so does the nlme package (within src/init.c).
>>
>> As Bill Dunlap already told you, this is a) ancient and b) was
>> concerned with
>> the int as 16 bit to 32 bit transition period. Ie a long time ago. Old C
>> programmers remember.
>>
>> You should preferably not even use 'long int' on the other side but
>> rely on
>> the fact that all compiler nowadays allow you to specify exactly what
>> size is
>> used via int64_t (long), int32_t (int), ... and the unsigned cousins

Well, not all compilers.  Those types were introduced in C99, but are
optional in that standard and in C11 and C++11.  I have not checked
C++14 or C++17, but expect they are also optional there.  int_fast64_t is not
optional in C99, so R uses that if int64_t is not supported.

[It is easy to overlook that they are optional in C99 and at one time R
assumed them.]
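
A sketch of that kind of fallback, not R's actual source: the standard says
<stdint.h> defines INT64_MAX only when int64_t is provided, so the macro can
be tested in the preprocessor. The names wide_int, WIDE_INT_IS_EXACT and
wide_int_bits are invented for this example.

```c
#include <limits.h>
#include <stdint.h>

#ifdef INT64_MAX
typedef int64_t wide_int;        /* exactly 64 bits, optional in C99 */
#define WIDE_INT_IS_EXACT 1
#else
typedef int_fast64_t wide_int;   /* at least 64 bits, required by C99 */
#define WIDE_INT_IS_EXACT 0
#endif

/* Storage width of the chosen type, in bits. */
int wide_int_bits(void)
{
    return (int)(sizeof(wide_int) * CHAR_BIT);
}
```

Code written against wide_int can rely on at least 64 bits either way, but on
exactly 64 bits only when WIDE_INT_IS_EXACT is 1.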

>> (which R
>> does not have).  So please receive the value as a int64_t and then
>> cast it to
>> an int32_t -- which corresponds to R's notion of an integer on every
>> platform.
>
> Only on Intel platforms int is 32 bits. Strictly speaking int is only
> required to be >= 16 bits. Who knows what the size of an int is on
> the Sunway TaihuLight for example ;-)

R's configure checks that int is 32 bit and will not compile without it
(src/main/arithmetic.c) ... so int and int32_t are the same on all
platforms where the latter is defined.
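
Package code that depends on the same assumption can state it at compile time.
The following is a sketch using C11's _Static_assert, not the check that R
itself performs; int_matches_int32 is a made-up name.

```c
#include <limits.h>
#include <stdint.h>

/* Refuse to compile if int is not stored in 32 bits. */
_Static_assert(sizeof(int) * CHAR_BIT == 32, "this code assumes a 32-bit int");

/* Returns 1 when int and int32_t cover the same range, 0 otherwise. */
int int_matches_int32(void)
{
#ifdef INT32_MAX
    return INT_MAX == INT32_MAX && INT_MIN == INT32_MIN;
#else
    return 0;   /* int32_t is not provided by this implementation */
#endif
}
```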

--
Brian D. Ripley,                  [hidden email]
Emeritus Professor of Applied Statistics, University of Oxford


Re: longint

Hervé Pagès-2
On 08/16/2018 11:30 AM, Prof Brian Ripley wrote:
> On 16/08/2018 18:33, Hervé Pagès wrote:
...
>>
>> Only on Intel platforms int is 32 bits. Strictly speaking int is only
>> required to be >= 16 bits. Who knows what the size of an int is on
>> the Sunway TaihuLight for example ;-)
>
> R's configure checks that int is 32 bit and will not compile without it
> (src/main/arithmetic.c) ... so int and int32_t are the same on all
> platforms where the latter is defined.

Good to know. Thanks for the clarification!
