using 2D array of SEXP for creating dataframe

using 2D array of SEXP for creating dataframe

Sandip Nandi
Hi,

For our production package I need to create a data frame in C, so I wrote
the following code:

SEXP dfm ,head,df , dfint , dfStr,lsnm;

*SEXP  valueVector[2];*

char *ab[3] = {"aa","vv","gy"};
int sn[3] ={99,89,12};
char *listnames[2] = {"int","string"};
int i,j;

//=============================

PROTECT(df = allocVector(VECSXP,2));

*PROTECT(valueVector[0] = allocVector(REALSXP,3));*
*PROTECT(valueVector[1] = allocVector(VECSXP,3));*


PROTECT(lsnm = allocVector(STRSXP,2));

SET_STRING_ELT(lsnm,0,mkChar("int"));
SET_STRING_ELT(lsnm,1,mkChar("string"));
SEXP rawvec,headr;
unsigned char str[24]="abcdef";

for ( i = 0 ; i < 3; i++ ) {

*SET_STRING_ELT(valueVector[1],i,mkChar(ab[i]));*
*REAL(valueVector[0])[i] = sn[i];*
}


It works: the data frame is created and can be used properly.
I am just curious whether I am doing anything wrong, or whether there is another way
to create a data frame. I am concerned about the SEXP 2D array.

Thanks,
Sandip

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: using 2D array of SEXP for creating dataframe

Hervé Pagès
Hi,

On 06/26/2014 02:32 PM, Sandip Nandi wrote:

> Hi ,
>
> For our production package i need to create a dataframein C . So I wrote
> the following code
>
> SEXP dfm ,head,df , dfint , dfStr,lsnm;
>
> *SEXP  valueVector[2];*
>
> char *ab[3] = {"aa","vv","gy"};
> int sn[3] ={99,89,12};
> char *listnames[2] = {"int","string"};
> int i,j;
>
> //=============================
>
> PROTECT(df = allocVector(VECSXP,2));
>
> *PROTECT(valueVector[0] = allocVector(REALSXP,3));*
> *PROTECT(valueVector[1] = allocVector(VECSXP,3));*
>
>
> PROTECT(lsnm = allocVector(STRSXP,2));
>
> SET_STRING_ELT(lsnm,0,mkChar("int"));
> SET_STRING_ELT(lsnm,1,mkChar("string"));
> SEXP rawvec,headr;
> unsigned char str[24]="abcdef";
>
> for ( i = 0 ; i < 3; i++ ) {
>
> *SET_STRING_ELT(valueVector[1],i,mkChar(ab[i]));*
> *REAL(valueVector[0])[i] = sn[i];*
> }
>
>
> It works , data frame is being created and executed properly .

Really? You mean, you can compile this code right? Otherwise it's
incomplete: you allocate but do nothing with 'df'. Same with 'lsnm'.
And you don't UNPROTECT. With no further treatment, 'df' will be an
unnamed list containing junk data, but not the data.frame you expect.
So there are a few gaps that would need to be filled before this code
actually works as intended.

Maybe try and come back again with specific questions?

Cheers,
H.

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: [hidden email]
Phone:  (206) 667-5791
Fax:    (206) 667-1319

Re: using 2D array of SEXP for creating dataframe

Sandip Nandi
Hi ,

I posted incomplete code earlier. The complete code works. My question is:
is what I am doing logical and safe? Is any memory leak going to happen? Is there
any other way to create a data frame?



SEXP formDF() {

SEXP dfm ,head,df , dfint , dfStr,lsnm;
SEXP  valueVector[2];
char *ab[3] = {"aa","vv","gy"};
int sn[3] ={99,89,12};
char *listnames[2] = {"int","string"};
int i,j;

PROTECT(df = allocVector(VECSXP,2));

PROTECT(valueVector[0] = allocVector(REALSXP,3));
PROTECT(valueVector[1] = allocVector(VECSXP,3));
PROTECT(lsnm = allocVector(STRSXP,2));

SET_STRING_ELT(lsnm,0,mkChar("int"));
SET_STRING_ELT(lsnm,1,mkChar("string"));
SEXP rawvec,headr;

for ( i = 0 ; i < 3; i++ ) {
SET_STRING_ELT(valueVector[1],0,mkChar(listNames[i]));
REAL(valueVector[0])[i] = sn[i];
}

SET_VECTOR_ELT(df,1,valueVector[0]);
SET_VECTOR_ELT(df,0,valueVector[1]);
setAttrib(df,R_RowNamesSymbol,lsnm);

PROTECT(dfm=lang3(install("data.frame"),df,ScalarLogical(FALSE)));
SET_TAG(CDDR(dfm), install("stringsAsFactors")) ;
SEXP res = PROTECT(eval(dfm,R_GlobalEnv));

UNPROTECT(7);
return res;

}




Re: using 2D array of SEXP for creating dataframe

Hervé Pagès
Hi Sandip,

On 06/26/2014 04:21 PM, Sandip Nandi wrote:
> Hi ,
>
> I have put incomplete code here . The complete code works , My doubt is
> , what I am doing logical/safe ? Any memory leak going to happen ? is
> there any way to create dataframe ?

I still don't believe it "works". It doesn't even compile. More below...

>
>
>
> SEXP formDF() {
>
> SEXP dfm ,head,df , dfint , dfStr,lsnm;
> SEXP  valueVector[2];
> char *ab[3] = {"aa","vv","gy"};
> int sn[3] ={99,89,12};
> char *listnames[2] = {"int","string"};
> int i,j;
>
> PROTECT(df = allocVector(VECSXP,2));
>
> PROTECT(valueVector[0] = allocVector(REALSXP,3));
> PROTECT(valueVector[1] = allocVector(VECSXP,3));
> PROTECT(lsnm = allocVector(STRSXP,2));
>
> SET_STRING_ELT(lsnm,0,mkChar("int"));
> SET_STRING_ELT(lsnm,1,mkChar("string"));
> SEXP rawvec,headr;
>
> for ( i = 0 ; i < 3; i++ ) {
> SET_STRING_ELT(valueVector[1],0,mkChar(listNames[i]));

'listNames' is undeclared (C is case-sensitive).

Let's assume you managed to compile this with an (imaginary)
case-insensitive C compiler, 'listnames' is an array of length
2 and this for loop tries to read the 3 first elements
from it. So you're just lucky that you didn't get a segfault.
In any case, I don't see how this code could produce
the data.frame you're trying to make.

If you want to discuss how to improve code that *works* (i.e.
compiles and produces the expected result), that's fine, but you
should be able to show that code. Otherwise it sounds like you're
asking people to fix your code. Or to write it for you. Maybe
that's fine too but people will be more sympathetic and willing
to help if you're honest about it.

Cheers,
H.
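
The out-of-bounds read described above can be avoided by deriving the loop bound from the array itself instead of hard-coding it. A minimal standalone sketch in plain C (no R API involved); ARRAY_LEN and total_chars are hypothetical names introduced for illustration and are not part of the posted code:

```c
#include <stddef.h>
#include <string.h>

/* Number of elements in a true array (does not work on a pointer). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

static const char *listnames[] = {"int", "string"};   /* 2 elements */
static const char *ab[]        = {"aa", "vv", "gy"};  /* 3 elements */

/* Hypothetical helper: total number of characters across n strings.
 * The caller passes the element count, so the loop never runs past
 * the end of the array. */
static size_t total_chars(const char *const *strs, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += strlen(strs[i]);
    return total;
}
```

Calling total_chars(listnames, ARRAY_LEN(listnames)) visits exactly two elements, whereas a loop hard-coded to 3 would read one element past the end of 'listnames'.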

> REAL(valueVector[0])[i] = sn[i];
> }
>
> SET_VECTOR_ELT(df,1,valueVector[0]);
> SET_VECTOR_ELT(df,0,valueVector[1]);
> setAttrib(df,R_RowNamesSymbol,lsnm);
>
> PROTECT(dfm=lang3(install("data.frame"),df,ScalarLogical(FALSE)));
> SET_TAG(CDDR(dfm), install("stringsAsFactors")) ;
> SEXP res = PROTECT(eval(dfm,R_GlobalEnv));
>
> UNPROTECT(7);
> return res;
>
> }

Re: using 2D array of SEXP for creating dataframe

Sandip Nandi
Hi ,

I asked whether the data structure I am using to create a
data frame is fine, or whether there is any other way I can use. My aim is to read a
database, write it to a data frame and operate on it. The data frame
creation and output all work. The code I posted is wrong; I was
piecing it together from snippets, sorry for that. I feel my approach of
creating a 2D array may not be the best, so it would be great if someone could
point out any drawback of my method. My production code can read 100k
rows and write them in 15 seconds. But in one case, when I try to assign NA_REAL
to a real vector, it causes a floating point exception. So I suspect something
is wrong. Other people may be doing this in a faster, more efficient way.
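
The thread does not establish what triggers the floating point exception. One possible cause, offered here only as an assumption, is that the NaN-valued NA is later converted to an integer type, which is undefined behaviour in C and may raise a floating point trap on some platforms. A minimal sketch of a guarded conversion in plain C; double_to_int_checked and MY_NA_INT are hypothetical names, and INT_MIN is chosen as the sentinel because R's NA_INTEGER uses the same value:

```c
#include <limits.h>
#include <math.h>

/* Hypothetical sentinel for "missing"; R's NA_INTEGER is also INT_MIN. */
#define MY_NA_INT INT_MIN

/* Convert a double to int by truncation. NaN (which includes R's NA_REAL),
 * infinities and values that do not fit in an int are mapped to the
 * sentinel instead of being passed to the cast. */
static int double_to_int_checked(double x)
{
    if (isnan(x) || x <= (double)INT_MIN || x >= (double)INT_MAX + 1.0)
        return MY_NA_INT;
    return (int)x;
}
```

If the values never leave a REALSXP column, no such conversion is involved and the cause would have to be looked for elsewhere.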

This is sample code:

/*
 * dfm is a data frame, which I treat as a list of lists. So I created a SEXP
 * array valueVector[2] where each element can hold a different data type. The
 * values are then assigned and the data frame is generated at the end.
 */

SEXP formDF() {

SEXP dfm ,head,df , dfint , dfStr,lsnm;
SEXP  valueVector[2];
char *ab[3] = {"aa","vv","gy"};
int sn[3] ={99,89,12};
char *listnames[2] = {"int","string"};
int i,j;


PROTECT(valueVector[0] = allocVector(REALSXP,3));
PROTECT(valueVector[1] = allocVector(STRSXP,3));
PROTECT(lsnm = allocVector(STRSXP,2));

SET_STRING_ELT(lsnm,0,mkChar("int"));
SET_STRING_ELT(lsnm,1,mkChar("string"));

for ( i = 0 ; i < 3; i++ ) {
SET_STRING_ELT(valueVector[1],i,mkChar(ab[i]));
REAL(valueVector[0])[i] = sn[i];
}


SET_VECTOR_ELT(df,1,valueVector[0]);
SET_VECTOR_ELT(df,0,valueVector[1]);
setAttrib(df,R_RowNamesSymbol,lsnm);

PROTECT(dfm=lang3(install("data.frame"),df,ScalarLogical(FALSE)));
SET_TAG(CDDR(dfm), install("stringsAsFactors")) ;
SEXP res = PROTECT(eval(dfm,R_GlobalEnv));

UNPROTECT(7);
return res;

}



Re: using 2D array of SEXP for creating dataframe

Hervé Pagès
On 06/26/2014 05:18 PM, Sandip Nandi wrote:

> Hi ,
>
> I have asked a question , whether the data structure I am using to
> create a dataframe is fine or there is anyother way i can use. My aim is
> to read  a database and write it to dataframe and do operation on it .
> The dataframe creation ,output everything works .  The code I put is
> wrong , trying to adding pieces and do it ,sorry for that.    I feel my
> way of doing , creating a 2D array may not be the best, so if someone
> can point out any drawback of my method will be great . My code in
> production can read 100k rows and write in 15 seconds . But one case ,
> when I try to assign NA_REAL to a real vector it causes floating point
> exception. So I doubt something is not wrong . People may be doing
> faster,efficient way.
>

Please understand that the code you send is useful for the discussion
only if we can understand it. And for this it needs to make sense.
The code below still makes little sense. Did you try it? For example
you're calling SET_VECTOR_ELT() and setAttrib() on an SEXP ('df') that
you didn't even allocate. Sounds maybe like a detail but because of
that the code will segfault and, more importantly, it's not clear what
kind of SEXP you want 'df' to be.

Also the following line makes no sense:

   setAttrib(df,R_RowNamesSymbol,lsnm);

given that 'lsnm' is c("int", "string") so it looks more like the col
names than the row names (and also because you're apparently trying to
make a 3x2 data.frame, not a 2x2).

Anyway, once you realize that a data.frame is just a list with 3
attributes:

   > attributes(data.frame(int=c(99,89,12), string=c("aa", "vv", "gy")))
   $names
   [1] "int"    "string"

   $row.names
   [1] 1 2 3

   $class
   [1] "data.frame"

everything becomes simple at the C level i.e. just make that list
and stick these 3 attributes on it. You don't need to call R code
from C (which BTW will protect you from random changes in the behavior
of the data.frame() constructor). You don't need the intermediate
'valueVector' data structure (what you seem to be referring to as the
"2D array of SEXP", don't know why, doesn't look like a 2D array to me,
but you never explained).

Cheers,
H.


> This is a sample code
> */**
> *
> *
> *dfm is a dataframe which i assume as list of list . So I created a SEXP
> array valueVector[2]  where each one can hold different datatype .  Now
> values are assigned and dataframe is generated at end*
> *
> *
> **/*
>
> SEXP formDF() {
>
> SEXP dfm ,head,df , dfint , dfStr,lsnm;
> SEXP  valueVector[2];
> char *ab[3] = {"aa","vv","gy"};
> int sn[3] ={99,89,12};
> char *listnames[2] = {"int","string"};
> int i,j;
>
>
> PROTECT(valueVector[0] = allocVector(REALSXP,3));
> PROTECT(valueVector[1] = allocVector(STRSXP,3));
> PROTECT(lsnm = allocVector(STRSXP,2));
>
> SET_STRING_ELT(lsnm,0,mkChar("int"));
> SET_STRING_ELT(lsnm,1,mkChar("string"));
>
> for ( i = 0 ; i < 3; i++ ) {
> SET_STRING_ELT(valueVector[1],i,mkChar(ab[i]));
> REAL(valueVector[0])[i] = sn[i];
> }
>
>
> SET_VECTOR_ELT(df,1,valueVector[0]);
> SET_VECTOR_ELT(df,0,valueVector[1]);
> setAttrib(df,R_RowNamesSymbol,lsnm);
>
> PROTECT(dfm=lang3(install("data.frame"),df,ScalarLogical(FALSE)));
> SET_TAG(CDDR(dfm), install("stringsAsFactors")) ;
> SEXP res = PROTECT(eval(dfm,R_GlobalEnv));
>
> UNPROTECT(7);
> return res;
>
> }

Re: using 2D array of SEXP for creating dataframe

Sandip Nandi
Thanks a lot. I appreciate your time. I am sorry for the missing snippets of
code; I was copying back and forth. My bad, sorry for that.

The data frame df is meant to be a VECSXP:
PROTECT(df = allocVector(VECSXP,2));

What am I trying to do?

Let's say you read a huge table in which each column has a different data type:
some integer, some real, some string, some raw. You have to return a
data frame after reading it. So what I did is create an array of SEXP
(SEXP dataframevalues[noOfCols]). Each column is then read from the table and
its data type is obtained. I allocate each array element with the number of
values to be inserted.

There is one outer vector of type VECSXP, which will hold the columns in
dataframevalues[noOfCols], so the VECSXP has length noOfCols.
After that, an R internal call is performed to convert it to a data frame.


And yes, I want to set the column names, not the row names.
Thanks,



On Thu, Jun 26, 2014 at 6:35 PM, Hervé Pagès <[hidden email]> wrote:

> On 06/26/2014 05:18 PM, Sandip Nandi wrote:
>
>> Hi ,
>>
>> I have asked a question , whether the data structure I am using to
>> create a dataframe is fine or there is anyother way i can use. My aim is
>> to read  a database and write it to dataframe and do operation on it .
>> The dataframe creation ,output everything works .  The code I put is
>> wrong , trying to adding pieces and do it ,sorry for that.    I feel my
>> way of doing , creating a 2D array may not be the best, so if someone
>> can point out any drawback of my method will be great . My code in
>> production can read 100k rows and write in 15 seconds . But one case ,
>> when I try to assign NA_REAL to a real vector it causes floating point
>> exception. So I doubt something is not wrong . People may be doing
>> faster,efficient way.
>>
>>
> Please understand that the code you send is useful for the discussion
> only if we can understand it. And for this it needs to make sense.
> The code below still makes little sense. Did you try it? For example
> you're calling SET_VECTOR_ELT() and setAttrib() on an SEXP ('df') that
> you didn't even allocate. Sounds maybe like a detail but because of
> that the code will segfault and, more importantly, it's not clear what
> kind of SEXP you want 'df' to be.
>
> Also the following line makes no sense:
>
>   setAttrib(df,R_RowNamesSymbol,lsnm);
>
> given that 'lsnm' is c("int", "string") so it looks more like the col
> names than the row names (and also because you're apparently trying to
> make a 3x2 data.frame, not a 2x2).
>
> Anyway, once you realize that a data.frame is just a list with 3
> attributes:
>
>   > attributes(data.frame(int=c(99,89,12), string=c("aa", "vv", "gy")))
>   $names
>   [1] "int"    "string"
>
>   $row.names
>   [1] 1 2 3
>
>   $class
>   [1] "data.frame"
>
> everything becomes simple at the C level i.e. just make that list
> and stick these 3 attributes on it. You don't need to call R code
> from C (which BTW will protect you from random changes in the behavior
> of the data.frame() constructor). You don't need the intermediate
> 'valueVector' data structure (what you seem to be referring to as the
> "2D array of SEXP", don't know why, doesn't look like a 2D array to me,
> but you never explained).
>
> Cheers,
> H.
>
>
>> This is a sample code
>> /*
>>  * dfm is a dataframe which I assume as list of list. So I created a SEXP
>>  * array valueVector[2] where each one can hold different datatype. Now
>>  * values are assigned and dataframe is generated at end.
>>  */
>>
>>
>> SEXP formDF() {
>>
>> SEXP dfm ,head,df , dfint , dfStr,lsnm;
>> SEXP  valueVector[2];
>> char *ab[3] = {"aa","vv","gy"};
>> int sn[3] ={99,89,12};
>> char *listnames[2] = {"int","string"};
>> int i,j;
>>
>>
>> PROTECT(valueVector[0] = allocVector(REALSXP,3));
>> PROTECT(valueVector[1] = allocVector(STRSXP,3));
>> PROTECT(lsnm = allocVector(STRSXP,2));
>>
>> SET_STRING_ELT(lsnm,0,mkChar("int"));
>> SET_STRING_ELT(lsnm,1,mkChar("string"));
>>
>> for ( i = 0 ; i < 3; i++ ) {
>> SET_STRING_ELT(valueVector[1],i,mkChar(ab[i]));
>> REAL(valueVector[0])[i] = sn[i];
>> }
>>
>>
>> SET_VECTOR_ELT(df,1,valueVector[0]);
>> SET_VECTOR_ELT(df,0,valueVector[1]);
>> setAttrib(df,R_RowNamesSymbol,lsnm);
>>
>> PROTECT(dfm=lang3(install("data.frame"),df,ScalarLogical(FALSE)));
>> SET_TAG(CDDR(dfm), install("stringsAsFactors")) ;
>> SEXP res = PROTECT(eval(dfm,R_GlobalEnv));
>>
>> UNPROTECT(7);
>> return res;
>>
>> }
>>
>>
>> On Thu, Jun 26, 2014 at 4:52 PM, Hervé Pagès <[hidden email]> wrote:
>>
>>     Hi Sandip,
>>
>>
>>     On 06/26/2014 04:21 PM, Sandip Nandi wrote:
>>
>>         Hi ,
>>
>>         I have put incomplete code here. The complete code works. My
>>         doubt is: is what I am doing logical/safe? Is any memory leak
>>         going to happen? Is there any other way to create a dataframe?
>>
>>
>>     I still don't believe it "works". It doesn't even compile. More
>> below...
>>
>>
>>
>>
>>
>>         SEXP formDF() {
>>
>>         SEXP dfm ,head,df , dfint , dfStr,lsnm;
>>         SEXP  valueVector[2];
>>         char *ab[3] = {"aa","vv","gy"};
>>         int sn[3] ={99,89,12};
>>         char *listnames[2] = {"int","string"};
>>         int i,j;
>>
>>         PROTECT(df = allocVector(VECSXP,2));
>>
>>         PROTECT(valueVector[0] = allocVector(REALSXP,3));
>>         PROTECT(valueVector[1] = allocVector(VECSXP,3));
>>         PROTECT(lsnm = allocVector(STRSXP,2));
>>
>>         SET_STRING_ELT(lsnm,0,mkChar("int"));
>>         SET_STRING_ELT(lsnm,1,mkChar("string"));
>>         SEXP rawvec,headr;
>>
>>         for ( i = 0 ; i < 3; i++ ) {
>>         SET_STRING_ELT(valueVector[1],0,mkChar(listNames[i]));
>>
>>
>>
>>     'listNames' is undeclared (C is case-sensitive).
>>
>>     Let's assume you managed to compile this with an (imaginary)
>>     case-insensitive C compiler, 'listnames' is an array of length
>>     2 and this for loop tries to read the 3 first elements
>>     from it. So you're just lucky that you didn't get a segfault.
>>     In any case, I don't see how this code could produce
>>     the data.frame you're trying to make.
>>
>>     If you want to discuss how to improve code that *works* (i.e.
>>     compiles and produces the expected result), that's fine, but you
>>     should be able to show that code. Otherwise it sounds like you're
>>     asking people to fix your code. Or to write it for you. Maybe
>>     that's fine too but people will be more sympathetic and willing
>>     to help if you're honest about it.
>>
>>     Cheers,
>>     H.
>>
>>         REAL(valueVector[0])[i] = sn[i];
>>         }
>>
>>         SET_VECTOR_ELT(df,1,valueVector[0]);
>>         SET_VECTOR_ELT(df,0,valueVector[1]);
>>         setAttrib(df,R_RowNamesSymbol,lsnm);
>>
>>         PROTECT(dfm=lang3(install("data.frame"),df,ScalarLogical(FALSE)));
>>         SET_TAG(CDDR(dfm), install("stringsAsFactors")) ;
>>         SEXP res = PROTECT(eval(dfm,R_GlobalEnv));
>>
>>
>>         UNPROTECT(7);
>>         return res;
>>
>>         }
>>
>>
>>
>>         On Thu, Jun 26, 2014 at 3:49 PM, Hervé Pagès <[hidden email]> wrote:
>>
>>              Hi,
>>
>>
>>              On 06/26/2014 02:32 PM, Sandip Nandi wrote:
>>
>>                  Hi ,
>>
>>                  For our production package I need to create a
>>                  dataframe in C. So I wrote the following code
>>
>>                  SEXP dfm ,head,df , dfint , dfStr,lsnm;
>>
>>                  *SEXP  valueVector[2];*
>>
>>
>>                  char *ab[3] = {"aa","vv","gy"};
>>                  int sn[3] ={99,89,12};
>>                  char *listnames[2] = {"int","string"};
>>                  int i,j;
>>
>>                  //=============================
>>
>>
>>
>>                  PROTECT(df = allocVector(VECSXP,2));
>>
>>                  *PROTECT(valueVector[0] = allocVector(REALSXP,3));*
>>                  *PROTECT(valueVector[1] = allocVector(VECSXP,3));*
>>
>>
>>
>>                  PROTECT(lsnm = allocVector(STRSXP,2));
>>
>>                  SET_STRING_ELT(lsnm,0,mkChar("int"));
>>                  SET_STRING_ELT(lsnm,1,mkChar("string"));
>>
>>
>>                  SEXP rawvec,headr;
>>                  unsigned char str[24]="abcdef";
>>
>>                  for ( i = 0 ; i < 3; i++ ) {
>>
>>                  *SET_STRING_ELT(valueVector[1],i,mkChar(ab[i]));*
>>
>>
>>                  *REAL(valueVector[0])[i] = sn[i];*
>>
>>                  }
>>
>>
>>                  It works, data frame is being created and executed
>>                  properly.
>>
>>
>>              Really? You mean, you can compile this code right? Otherwise
>>              it's incomplete: you allocate but do nothing with 'df'. Same
>>              with 'lsnm'. And you don't UNPROTECT. With no further
>>              treatment, 'df' will be an unnamed list containing junk data,
>>              but not the data.frame you expect. So there are a few gaps
>>              that would need to be filled before this code actually works
>>              as intended.
>>
>>              Maybe try and come back again with specific questions?
>>
>>              Cheers,
>>              H.
>>
>>                  Just curious, if I am doing anything wrong or is there
>>                  another way around for creation of data-frame. I am
>>                  concerned about the SEXP 2D array.
>>
>>                  Thanks,
>>                  Sandip
>>
>>
>>
>>              --
>>              Hervé Pagès
>>
>>              Program in Computational Biology
>>              Division of Public Health Sciences
>>              Fred Hutchinson Cancer Research Center
>>              1100 Fairview Ave. N, M1-B514
>>              P.O. Box 19024
>>              Seattle, WA 98109-1024
>>
>>              E-mail: [hidden email]
>>              Phone: (206) 667-5791
>>              Fax: (206) 667-1319
>>
>>
>>
>>
>>
>>
>