Problem with download.file ?

Problem with download.file ?

Giles Crane

# download.file() seems to put the xlsx file onto the hard drive.

>download.file("http://www.udel.edu/johnmack/data_library/zipcode_centroids.xlsx", "zipcode_centroids.xlsx")
trying URL 'http://www.udel.edu/johnmack/data_library/zipcode_centroids.xlsx'
Content type 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet' length 2785832 bytes (2.7 Mb)
opened URL
downloaded 2.7 Mb


# Trouble reading file with xlsx.

library(xlsx)
Loading required package: rJava
Loading required package: xlsxjars
Warning messages:
1: package ‘xlsx’ was built under R version 3.1.3
2: package ‘rJava’ was built under R version 3.1.3

>df <- read.xlsx2("zipcode_centroids.xlsx", sheetIndex=1)
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl,  :
   java.util.zip.ZipException: invalid entry size (expected 1168 but got 1173 bytes)


# I downloaded the file manually (same name) from the web page and tried again.
# Then I read the file into R with xlsx successfully.


>df <- read.xlsx2("/zipdist/zipcode_centroids.xlsx", sheetIndex=1)
>str(df)
'data.frame': 42961 obs. of  8 variables:
  $ ZIPCODE  : Factor w/ 42961 levels "01001","01002",..: 1 2 3 4 5 6 7 8 9 10 ...
  $ TOWN.    : Factor w/ 18955 levels "Aaronsburg","Abbeville",..: 85 333 333 333 898 1089 1459 1620 1899 2929 ...
  $ STATE    : Factor w/ 52 levels "AK","AL","AR",..: 21 21 21 21 21 21 21 21 21 21 ...
  $ LATITUDE : Factor w/ 37352 levels "-7.209975","19.101978",..: 28020 28948 28916 28971 29047 28624 28326 28418 28197 28603 ...
  $ LONGITUDE: Factor w/ 37241 levels "-100.00991","-100.02632",..: 8799 8706 8811 8715 8470 8639 9019 8608 8531 9065 ...
  $ STFIPS   : Factor w/ 51 levels "01","02","04",..: 22 22 22 22 22 22 22 22 22 22 ...
  $ CD       : Factor w/ 55 levels "00","01","02",..: 3 2 2 2 2 2 2 3 3 2 ...
  $ CONG_DIST: Factor w/ 436 levels "01_01","01_02",..: 191 190 190 190 190 190 190 191 191 190 ...

# Is there a problem with download.file() when the file is an Excel file, or with this particular Excel file?
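
One way to narrow the question down is to compare the file on disk with what the server reported. The sketch below is only an illustration: looks_like_zip() and check_download() are hypothetical helper names, not part of base R or the xlsx package. It assumes that an .xlsx file is a zip archive (which the ZipException above suggests) and uses the length of 2785832 bytes from the download.file() transcript.

```r
# Hypothetical helpers (names are not from the thread or from any package).

# TRUE if the file starts with the zip signature "PK\003\004".
looks_like_zip <- function(path) {
  sig <- readBin(path, what = "raw", n = 4L)
  identical(sig, as.raw(c(0x50, 0x4b, 0x03, 0x04)))
}

# Compare the size on disk with the length the server reported.
check_download <- function(path, expected_bytes) {
  actual <- file.info(path)$size
  list(
    zip_signature = looks_like_zip(path),
    size_matches  = isTRUE(actual == expected_bytes),
    actual_bytes  = actual
  )
}

# Usage with the length reported in the transcript above:
# check_download("zipcode_centroids.xlsx", expected_bytes = 2785832)
```

The signature check only catches gross problems, such as an HTML error page saved under the .xlsx name. A file whose bytes were altered during writing would still start with "PK", so the size comparison is the more telling of the two checks.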

--
Giles L Crane, MPH, ASA, NJPHA
Statistical Consultant and R Instructor
621 Lake Drive
Princeton, NJ  08540
Phone: 609 924-0971
Email: [hidden email]



______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: Problem with download.file ?

William Dunlap
Add the argument mode="wb" to your call to download.file().  On Windows the
default mode, "w", writes the file as text and translates line endings, which
corrupts binary files such as .xlsx; "wb" writes the bytes unchanged.
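
A minimal sketch of the suggestion, under stated assumptions. to_crlf() is a hypothetical function that only simulates a text-mode write by turning each LF byte into CR LF; it is there to show why a file can grow by a few bytes (the "expected 1168 but got 1173" in the error would be consistent with five such bytes in one zip entry, though that is an inference, not something the thread confirms). fetch_binary() is likewise a hypothetical wrapper around the real download.file() call with mode = "wb".

```r
# Hypothetical illustration: mimic a text-mode write by replacing every
# LF byte (0x0a) with CR LF (0x0d 0x0a).
to_crlf <- function(bytes) {
  pieces <- lapply(bytes, function(b) {
    if (b == as.raw(0x0a)) as.raw(c(0x0d, 0x0a)) else b
  })
  as.raw(unlist(pieces))
}

# Two LF bytes in the payload, so the translated copy is two bytes longer.
payload <- as.raw(c(0x50, 0x4b, 0x0a, 0x01, 0x0a))
extra   <- length(to_crlf(payload)) - length(payload)  # 2

# Hypothetical wrapper: download without altering the bytes.
fetch_binary <- function(url, destfile) {
  status <- download.file(url, destfile, mode = "wb")
  # download.file() returns 0 (invisibly) on success.
  if (status != 0) stop("download.file() returned status ", status)
  invisible(destfile)
}

# Usage with the URL from the original post:
# fetch_binary(
#   "http://www.udel.edu/johnmack/data_library/zipcode_centroids.xlsx",
#   "zipcode_centroids.xlsx"
# )
# df <- xlsx::read.xlsx2("zipcode_centroids.xlsx", sheetIndex = 1)
```

Passing mode = "wb" is harmless on platforms that do not distinguish text from binary files, so the same call can be used everywhere.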

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, Mar 27, 2015 at 7:25 AM, Giles Crane <[hidden email]> wrote:

> # download.file() Seems to put the xlsx file onto hard drive.
>
> >download.file("http://www.udel.edu/johnmack/data_library/zipcode_centroids.xlsx",
> "zipcode_centroids.xlsx")
> [...]
> # Is there a problem with download.file() when file is an Excel file or
> this particular Excel file?
