Reading xpt files into R


Reading xpt files into R

R help mailing list-2
Hello R folk

I have an xpt file which I have been trying to open in R, using RStudio.

On the net I found guidance which says that I need packages Hmisc and SASxport which I have successfully loaded.

I had also found some code which says that this would allow me to read the xpt file into R:

library(SASxport)
data(Alfalfa)
lookup.xport("test.xpt")
Alfalfa<-read.xport("test.xpt")

I have set the directory correctly as far as I am aware, but when I tried to run this code I got the following error messages:

> lookup.xport("test.xpt")
Error in lookup.xport.inner(file) : file not in SAS transfer format
> Alfalfa<-read.xport("test.xpt")
Error in read.xport("test.xpt") :
The specified file does not start with a SAS xport file header!

I don't know what it means for the file to be "not in SAS transfer format", nor what it means for it not to start with a SAS xport file header.
If anyone can explain how I can read this xpt file into R I'd be very grateful.

Thanks Nick Wray


______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Reading xpt files into R

David Winsemius

> On Apr 13, 2018, at 10:01 AM, WRAY NICHOLAS via R-help <[hidden email]> wrote:
>
> Hello R folk
>
> I have an xpt file which I have been trying to open into R in R studio
>
> On the net I found guidance which says that I need packages Hmisc and SASxport which I have successfully loaded.
>
> I had also found some code which says that this would allow me to read the xpt file into R:
>
> library(SASxport)
> data(Alfalfa)
> lookup.xport("test.xpt")
> Alfalfa<-read.xport("test.xpt")
>
> I have set the directory correctly as far as I am aware, but when I tried to run this code I got the following error messages:
>
>> lookup.xport("test.xpt")
> Error in lookup.xport.inner(file) : file not in SAS transfer format
>> Alfalfa<-read.xport("test.xpt")
> Error in read.xport("test.xpt") :
> The specified file does not start with a SAS xport file header!
>
> I neither know what the file being not in SAS transfer format means, nor what not starting with an SAS xport file header means either...

The "export" or "transfer" format from SAS is supposed to make reading data less difficult and more standardized. This is what a header looks like in the version used by the NHANES releases (it is all one line):

HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!000000000000000000000000000000  SAS     SAS     SASLIB  9.2     XP_PRO                        16SEP09:09:39:2516SEP09:09:39:25                                                                HEADER RECORD*******MEMBER  HEADER RECORD!!!!!!!000000000000000001600000000140  HEADER RECORD*******DSCRPTR HEADER RECORD!!!!!!!000000000000000000000000000000  SAS     DEMO    SASDATA 9.2     XP_PRO                        16SEP09:09:39:2516SEP09:09:39:25                                                                HEADER RECORD*******NAMESTR HEADER RECORD!!!!!!!000000014400000000000000000000  SEQN    Respondent sequence number                              

You can look at the file with a text editor.
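
If a text editor is not convenient, the same look at the start of the file can be taken from within R. What follows is only a minimal sketch: `peek_header` is a hypothetical helper name, and the demo file is made up for illustration rather than taken from the original poster's data.

```r
# Show the first `n` bytes of a file as text. Bytes outside the printable
# ASCII range are replaced with "." so binary content cannot break the display.
peek_header <- function(path, n = 80L) {
  bytes <- readBin(path, what = "raw", n = n)
  codes <- as.integer(bytes)
  printable <- codes >= 32L & codes <= 126L
  bytes[!printable] <- charToRaw(".")
  rawToChar(bytes)
}

# Demonstration on a throwaway file (hypothetical content, not real data)
demo <- tempfile(fileext = ".xpt")
writeLines("just some text, not a SAS transport file", demo)
first_bytes <- peek_header(demo)
print(first_bytes)
```

A genuine transport file would show the "HEADER RECORD*******" text quoted above at this point.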

There is a read.xport function in the foreign package, and I think most people would have chosen that one as a first attempt. It's part of the standard R distribution. Its help page refers you to https://support.sas.com/techsup/technote/ts140.pdf for details on the format.
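
A sketch of calling the foreign version explicitly, so that no other package's function of the same name can be picked up by accident. The wrapper name `read_xport_safely` is hypothetical, and the demo file is deliberately not a transport file, so the call is expected to be rejected:

```r
# Try to read `path` with foreign's reader. If the file is rejected,
# report the reason and return NULL instead of stopping.
read_xport_safely <- function(path) {
  tryCatch(
    foreign::read.xport(path),
    error = function(e) {
      message("Could not read ", path, ": ", conditionMessage(e))
      NULL
    }
  )
}

# Demonstration with a file that only has the .xpt extension
not_xport <- tempfile(fileext = ".xpt")
writeLines("plain text pretending to be an xpt file", not_xport)
result <- read_xport_safely(not_xport)
print(is.null(result))
```

With a real transport file the same call would return a data frame (or a list of data frames if the file holds several datasets).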

--
David.

> If anyone can explain how I can read this xpt file into R I'd be v grateful
>
> Thanks Nick Wray

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'   -Gehm's Corollary to Clarke's Third Law


Re: Reading xpt files into R

Peter Dalgaard-2
That's what he tried, but the bottom line is that just because something is called foo.xpt there is no guarantee that it actually is a SAS XPORT file. Firefox plugins use the same extension but it could really be anything - naming conventions are just that: conventions.

So dig deeper and find out what the file really is (or was supposed to be).
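
One way to start digging is to compare the leading bytes of the file against a few well-known signatures. This is a rough sketch only: `guess_file_kind` is a hypothetical name, the zip and gzip signatures are my own additions for illustration, and the list is nowhere near exhaustive.

```r
# Guess what kind of file `path` is from its first bytes.
guess_file_kind <- function(path) {
  bytes <- readBin(path, what = "raw", n = 80L)
  starts_with <- function(sig) {
    length(bytes) >= length(sig) && identical(bytes[seq_along(sig)], sig)
  }
  if (starts_with(charToRaw("HEADER RECORD*******"))) {
    "sas-transport"                      # header text used by SAS transport files
  } else if (starts_with(as.raw(c(0x50, 0x4b, 0x03, 0x04)))) {
    "zip"                                # "PK\003\004"
  } else if (starts_with(as.raw(c(0x1f, 0x8b)))) {
    "gzip"
  } else {
    "unknown"
  }
}

# Demonstration: a gzip-compressed file and a plain text file, both named .xpt
gz <- tempfile(fileext = ".xpt")
con <- gzfile(gz, "wb")
writeLines("hello", con)
close(con)
kind_gz <- guess_file_kind(gz)

txt <- tempfile(fileext = ".xpt")
writeLines("plain text", txt)
kind_txt <- guess_file_kind(txt)

print(c(kind_gz, kind_txt))
```

An "unknown" result just means the file needs a closer look in a text or hex editor, or a question to whoever supplied it.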

-pd


--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: [hidden email]  Priv: [hidden email]


Fwd: Re: Reading xpt files into R

R help mailing list-2

-------- Original Message ----------
From: WRAY NICHOLAS <[hidden email]>
To: peter dalgaard <[hidden email]>
Date: 14 April 2018 at 20:18
Subject: Re: [R] Reading xpt files into R


Well yesterday I'd downloaded the "foreign" package and tried to open the xpt file using that:

library(foreign)
read.xport("test.xpt")

I got the following error and warning messages:

> read.xport("test.xpt")
Error in read.xport("test.xpt") :
The specified file does not start with a SAS xport file header!
In addition: Warning message:
In readBin(file, what = character(0), n = 1, size = nchar(xport.file.header,  :
null terminator not found: breaking string at 10000 bytes

I can open the xpt file using WordPad and there is a header, but it seems to be just text.  I really don't know what constitutes a "SAS xport file header".

Nick




Re: Fwd: Re: Reading xpt files into R

R help mailing list-2
Does read.xport read both version 5 and version 8 xpt files?  This link to
the Library of Congress can get you started on how to interpret the
header.  (It states that Version 8 was introduced in 2012 but was not in
wide use as of early 2017.)

https://www.loc.gov/preservation/digital/formats/fdd/fdd000464.shtml

Bill Dunlap
TIBCO Software
wdunlap tibco.com


Re: Fwd: Re: Reading xpt files into R

David Winsemius

> On Apr 14, 2018, at 12:18 PM, WRAY NICHOLAS via R-help <[hidden email]> wrote:
>
>
> -------- Original Message ----------
> From: WRAY NICHOLAS <[hidden email]>
> To: peter dalgaard <[hidden email]>
> Date: 14 April 2018 at 20:18
> Subject: Re: [R] Reading xpt files into R
>
>
> Well yesterday I'd downloaded the "foreign" package and tried to open the xpt file using that:
>
> library(foreign)
> read.xport("test.xpt")
>
> I got the following error and warning messages:
>
>> read.xport("test.xpt")
> Error in read.xport("test.xpt") :
> The specified file does not start with a SAS xport file header!
> In addition: Warning message:
> In readBin(file, what = character(0), n = 1, size = nchar(xport.file.header,  :
> null terminator not found: breaking string at 10000 bytes
>
> I can open the xpt using wordpad and there is a header but it seems to be just text.  I really don't know what constitutes an "SAS xport file header"

I'm not sure why Peter deleted my copy of a sample SAS xport header that I took from an NHANES data distribution. He seemed to think I was confused about the function you had been using. The reason I mentioned that `read.xport` was from the 'foreign' package is that one generally loads that package to make the function available, while it appears you were using a different package, SASxport. I didn't know whether that package had a function with the same name as the one in foreign, and if it did, whether it might depend on the read.xport function in foreign. You should not need to download the 'foreign' package, since it ships with every distribution of R. These are the arguments accepted by the SASxport version of the function:

SASxport::read.xport
function (file, force.integer = TRUE, formats = NULL, name.chars = NULL,
    names.tolower = FALSE, keep = NULL, drop = NULL, as.is = 0.95,
    verbose = FALSE, as.list = FALSE, include.formats = FALSE)
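
Since both foreign and SASxport define a function called read.xport, it can help to ask R which one an unqualified call would actually use in the current session. A small sketch, assuming only that foreign is installed (the variable names are mine):

```r
library(foreign)

# Which attached packages provide something called read.xport?
# With SASxport attached as well, both packages would be listed here.
providers <- find("read.xport")
print(providers)

# Which namespace does the unqualified name resolve to right now?
in_use <- environmentName(environment(read.xport))
print(in_use)

# Writing the package prefix removes the ambiguity, for example:
#   foreign::read.xport("test.xpt")
```

The package listed first by find() is the one that masks the others.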


When I look at the code of SASxport::read.xport, it is in fact _not_ the same function. But it does contain an R statement defining what it thinks qualifies as a SAS xport file:

xport.file.header <- "HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!000000000000000000000000000000  "

It checks to see whether the file starts with that string.

This is what appeared in my first message:

>
> The "export" or "transfer" format from SAS is supposed to make reading data less difficult and more standardized. This is what a header looks like in the version used by the NHANES releases (it is all one line):
>
> HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!000000000000000000000000000000  SAS     SAS     SASLIB  9.2     XP_PRO                        16SEP09:09:39:2516SEP09:09:39:25                                                                HEADER RECORD*******MEMBER  HEADER RECORD!!!!!!!000000000000000001600000000140  HEADER RECORD*******DSCRPTR HEADER RECORD!!!!!!!000000000000000000000000000000  SAS     DEMO    SASDATA 9.2     XP_PRO                        16SEP09:09:39:2516SEP09:09:39:25                                                                HEADER RECORD*******NAMESTR HEADER RECORD!!!!!!!000000014400000000000000000000  SEQN    Respondent sequence number    

So the header is text, but it is text with a particular structure. If your file doesn't have that structure, then it's not a SAS xport file. The .xpt extension is also used for Mozilla Firefox plugins.


>
> Nick
>
>
>
> On 14 April 2018 at 10:32 peter dalgaard <[hidden email]> wrote:
>
> That's what he tried,

Actually not, Peter. Wray was using a function of the same name, but not from pkg-foreign. Perhaps he was following the tutorial at:

http://www.phusewiki.org/wiki/index.php?title=Open_XPT_File_with_R


> but the bottom line is that just because something is called foo.xpt there is no guarantee that it actually is a SAS XPORT file. Firefox plugins use the same extension but it could really be anything - naming conventions are just that: conventions.
>
> So dig deeper and find out what the file really is (or was supposed to be).

Peter and I agree on that advice.

>
> -pd
>
--

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'   -Gehm's Corollary to Clarke's Third Law


Re: Fwd: Re: Reading xpt files into R

R help mailing list-2
> When I look at the SASxport::read.xport function code, it is in fact,
> _not_ the same function. But it does have the R statement about what it
> thinks qualifies as a SAS xport file:
>
> xport.file.header <- "HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!000000000000000000000000000000  "
>
> It checks to see whether the file starts with that string.

Version 8 SAS xport files have the header
  HEADER RECORD*******LIBV8 HEADER RECORD!!!!!!!000000000000000000000000000000
It is easy to check for that in your text editor or in R.
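
A sketch of that check in R, telling the two header styles apart by the start of the first record. The function name `xport_version` is hypothetical, the demo files contain only a header line and no data, and only the leading part of each header is compared:

```r
# Classify a file by the start of its first 80-byte record.
xport_version <- function(path) {
  bytes <- readBin(path, what = "raw", n = 80L)
  has_prefix <- function(text) {
    sig <- charToRaw(text)
    length(bytes) >= length(sig) && identical(bytes[seq_along(sig)], sig)
  }
  if (has_prefix("HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!")) {
    "version 5"
  } else if (has_prefix("HEADER RECORD*******LIBV8")) {
    "version 8"
  } else {
    "not a SAS transport file"
  }
}

# Demonstration files holding just the header text quoted in this thread
v5 <- tempfile(fileext = ".xpt")
writeBin(charToRaw(paste0("HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!",
                          strrep("0", 30), "  ")), v5)
v8 <- tempfile(fileext = ".xpt")
writeBin(charToRaw(paste0("HEADER RECORD*******LIBV8 HEADER RECORD!!!!!!!",
                          strrep("0", 30))), v8)

versions <- c(xport_version(v5), xport_version(v8))
print(versions)
```

If a real file comes back as "version 8", that would explain a reader that only accepts the version 5 header string rejecting it.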

Bill Dunlap
TIBCO Software
wdunlap tibco.com
