Plot by FIPS Code using Shapefiles

Plot by FIPS Code using Shapefiles

Shouro Dasgupta
I am trying to plot data by FIPS code using county shapefiles.

library(data.table)
library(rgdal)
library(colourschemes)
library(RColorBrewer)
library(maptools)
library(maps)
library(ggmap)


I have data by FIPS code which looks like this:

dput(head(max_change))
structure(list(FIPS = c("01001", "01003", "01005", "01007", "01009",
"01011"), pred_hist = c(5.68493780563595e-06, 5.87686839563543e-06,
5.68493780563595e-06, 5.84476370329784e-06, 5.89156133294344e-06,
5.68493780563595e-06), pred_sim = c(5.60128903156804e-06, 5.82369276823497e-06,
5.60128903156804e-06, 5.75205304048323e-06, 5.80322399836766e-06,
5.60128903156804e-06), change = c(-1.47141054005866, -0.904829303986895,
-1.47141054005866, -1.58621746782168, -1.49938750670105, -1.47141054005866
)), .Names = c("FIPS", "pred_hist", "pred_sim", "change"), class = c("data.table",
"data.frame"), row.names = c(NA, -6L), .internal.selfref = <pointer: 0x0000000000110788>)


 I add leading zeroes by:

max_change <- as.data.table(max_change)
max_change$FIPS <- sprintf("%05d",as.numeric(max_change$FIPS))
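
As a small illustration of the padding step, the sketch below uses made-up FIPS values (not the data from this post) to show how `sprintf("%05d", ...)` restores a leading zero that is lost when codes are read as numbers. `formatC()` appears only as an assumed alternative that should give the same strings.

```r
# Hypothetical FIPS codes as they might look after being read as numbers
fips_num <- c(1001, 1003, 48029)

# "%05d" pads to five digits with leading zeros; as.integer() keeps the
# values whole so the integer format applies cleanly
fips_chr <- sprintf("%05d", as.integer(fips_num))
print(fips_chr)

# Alternative formulation of the same padding
fips_alt <- formatC(as.integer(fips_num), width = 5, flag = "0")
print(fips_alt)
```

Padding matters here because the join key in the shapefile is a five-character string, so "1001" and "01001" would not match.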

I downloaded shapefiles from here:
ftp://ftp2.census.gov/geo/tiger/TIGER2014/COUNTY/.

I obtain the FIPS codes from the shapefiles and order them using:

shapes_fips <- shapes$GEOID
shapes_fips <- as.data.table(shapes_fips)
setnames(shapes_fips, "shapes_fips", "FIPS")
shapes_fips <- shapes_fips[with(shapes_fips, order(FIPS)), ]
shapes_fips$FIPS <- as.character(shapes_fips$FIPS)


Then I merge the FIPS codes with my original dataset using:

merged_data <- merge(shapes_fips,max_change,by="FIPS",all.X=T, all.y=T)
merged_data <- as.data.table(merged_data)


Which looks like this:

structure(list(FIPS = c("01001", "01003", "01005", "01007", "01009",
"01011"), pred_hist = c(5.68493780563595e-06, 5.87686839563543e-06,
5.68493780563595e-06, 5.84476370329784e-06, 5.89156133294344e-06,
5.68493780563595e-06), pred_sim = c(5.60128903156804e-06, 5.82369276823497e-06,
5.60128903156804e-06, 5.75205304048323e-06, 5.80322399836766e-06,
5.60128903156804e-06), change = c(-1.47141054005866, -0.904829303986895,
-1.47141054005866, -1.58621746782168, -1.49938750670105, -1.47141054005866
)), .Names = c("FIPS", "pred_hist", "pred_sim", "change"), sorted = "FIPS",
class = c("data.table", "data.frame"), row.names = c(NA, -6L),
.internal.selfref = <pointer: 0x0000000000110788>)


But when I try to assign the merged data back to the SpatialPolygonsDataFrame called
shapes, I get the following error:

shapes$change <- merged_data$change

Error in `[[<-.data.frame`(`*tmp*`, name, value = c(-1.47141054005866,  :
  replacement has 3109 rows, data has 3233


Apologies for the messy example. What am I doing wrong? Any help will be
greatly appreciated. Thank you!

Sincerely,

Shouro

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Plot by FIPS Code using Shapefiles

ajdamico
hi, after running each individual line of code above, check that the object
still has the expected number of records and unique county fips codes.  it
looks like length( shapes$GEOID ) == 3233 but nrow( merged_data ) == 3109.
the way for you to debug this is for you to go through line by line after
creating each new object  :)
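
The check described above can be sketched as follows. The two ID vectors are made-up stand-ins for `shapes$GEOID` and `max_change$FIPS`, so the counts are illustrative only.

```r
# Hypothetical stand-ins: keys from the shapefile and keys from the data
shape_ids <- c("01001", "01003", "01005", "02013", "72001")
data_ids  <- c("01001", "01003", "01005")

# Record counts and number of distinct keys on each side
length(shape_ids); length(unique(shape_ids))
length(data_ids);  length(unique(data_ids))

# Keys present in the shapes but absent from the data, and vice versa
only_in_shapes <- setdiff(shape_ids, data_ids)
only_in_data   <- setdiff(data_ids, shape_ids)
print(only_in_shapes)
print(only_in_data)
```

Running the same four counts after every step that creates a new object shows where rows are gained or lost.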

i'm also not sure it's safe to work with gis objects as you're doing, there
are some well-documented examples of working with tiger files here
https://github.com/davidbrae/swmap



On Tue, May 5, 2015 at 11:00 AM, Shouro Dasgupta <[hidden email]> wrote:

> [...]

Re: Plot by FIPS Code using Shapefiles

Shouro Dasgupta
Hello,

Thank you for your reply. My original data has 3109 FIPS codes. Is there a
way to merge only this data into the shapefiles? I hope I am clear.

Thank you also for the link, I am trying to do something like this:
https://gist.github.com/reubano/1281134.

Thanks again!

Sincerely,

Shouro

On Tue, May 5, 2015 at 5:21 PM, Anthony Damico <[hidden email]> wrote:

> [...]


--

Shouro Dasgupta
PhD Candidate | Department of Economics
Ca' Foscari University of Venezia

------------------------------

Junior Researcher | Fondazione Eni Enrico Mattei (FEEM)
Isola di San Giorgio Maggiore, 8 | 30124 Venice, Italy
Tel: +39 041 2700 436


Re: Plot by FIPS Code using Shapefiles

ajdamico
so check the unique number of fips codes in the objects before and after

> merged_data <- merge(shapes_fips,max_change,by="FIPS",all.X=T, all.y=T)

also note that all.X should be all.x and you might want to use FALSE for
one or both of those
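
To make the effect of `all.x` and `all.y` concrete, here is a minimal sketch with base `merge()` on made-up data frames standing in for `shapes_fips` and `max_change`. The values are hypothetical, and the sketch assumes the keys in each table are unique.

```r
# Hypothetical stand-ins for shapes_fips and max_change
shapes_fips <- data.frame(FIPS = c("01001", "01003", "02013"),
                          stringsAsFactors = FALSE)
max_change  <- data.frame(FIPS = c("01001", "01003"),
                          change = c(-1.47, -0.90),
                          stringsAsFactors = FALSE)

# all.x = TRUE keeps every row of the first table; unmatched rows get NA
left <- merge(shapes_fips, max_change, by = "FIPS",
              all.x = TRUE, all.y = FALSE)

# all.x = FALSE, all.y = TRUE keeps only the rows of the second table
right <- merge(shapes_fips, max_change, by = "FIPS",
               all.x = FALSE, all.y = TRUE)

nrow(left)   # same as nrow(shapes_fips)
nrow(right)  # same as nrow(max_change)
```

Only the first form returns one row per shapefile key, which is the row count needed before anything can be attached to the polygons.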



On Tue, May 5, 2015 at 11:40 AM, Shouro Dasgupta <[hidden email]> wrote:

> [...]

Re: Plot by FIPS Code using Shapefiles

Corey Sparks
In reply to this post by Shouro Dasgupta
Joining data the way you're doing it is dangerous. Roger Bivand and others describe a standard way to do this here:
http://r-sig-geo.2731867.n2.nabble.com/Merging-shapefiles-and-csv-td7586839.html

And I do an example using US Census data here, using merge():
http://spatialdemography.org/wp-content/uploads/2013/04/9.-Sparks.pdf

look at page 134 of that pdf.

Hope this helps
Corey Sparks, PhD
Associate Professor
Department of Demography
University of Texas at San Antonio
501 West César E. Chávez  Blvd
Monterey Building 2.270C
San Antonio, TX 78207
210-458-3166
corey.sparks 'at' utsa.edu
coreysparks.weebly.com

Re: Plot by FIPS Code using Shapefiles

Roger Bivand
Corey Sparks <corey.sparks <at> utsa.edu> writes:

> Joining data the way you're doing it is dangerous, Roger Bivand and others
> describes a standard way to do this process here:
> http://r-sig-geo.2731867.n2.nabble.com/Merging-shapefiles-and-csv-td7586839.html


Quite right - the chunks Corey is referring to are:

Please do refer to the vignette in the maptools package, and to previous
threads which have advised that merge() should not be used, and that the
row.names of the data frames be used as ID keys. Typically using match() on
the row.names of the two objects will show which are not correctly aligned.

and

Beware that the data from the objects may be jumbled - never use merge,
always use match() on the row.names vectors of the objects to ensure that
the key-IDs agree. Jumbled data happens, it is important not to think
"shapefile" but to think DBMS with the ID key your way of staying sane.

The maptools vignette is at:

http://cran.r-project.org/web/packages/maptools/vignettes/combine_maptools.pdf

or:

library(maptools)
vignette("combine_maptools")
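
The match()-based alignment advised above can be sketched in base R as follows. The sketch keys on a made-up ID column rather than on row.names, and the values are hypothetical, so treat it as an illustration of the idea rather than the vignette's own code.

```r
# Hypothetical stand-ins: polygon IDs in file order, and a data table
# whose rows are in a different order and cover fewer IDs
poly_ids <- c("01005", "02013", "01001", "01003")
dat <- data.frame(FIPS = c("01001", "01003", "01005"),
                  change = c(-1.47, -0.90, -1.59),
                  stringsAsFactors = FALSE)

# For each polygon ID, the row of dat holding it (NA when absent)
m <- match(poly_ids, dat$FIPS)

# Indexing by m returns values in polygon order, with NA for unmatched IDs
aligned <- dat$change[m]
print(data.frame(poly_ids, aligned))
```

Because `m` has one entry per polygon, `aligned` always has the same length and order as the polygons, whatever the order of the data table.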

Here I also suspect that you'll find that there are non-unique FIPS in the
county polygons file, so may need to go through
maptools::unionSpatialPolygons() first.
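
A quick way to test that suspicion, sketched on a made-up ID vector standing in for `shapes$GEOID`:

```r
# Hypothetical ID vector with one repeated code
ids <- c("01001", "01003", "01003", "01005")

# anyDuplicated() returns 0 when all values are unique, otherwise the
# index of the first repeat
first_dup <- anyDuplicated(ids)
print(first_dup)

# The codes that occur more than once
dup_ids <- unique(ids[duplicated(ids)])
print(dup_ids)
```

If `dup_ids` comes back empty for the real file, the union step is presumably unnecessary.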

Roger

Roger Bivand
Department of Economics
NHH Norwegian School of Economics
Helleveien 30
N-5045 Bergen, Norway

Re: Plot by FIPS Code using Shapefiles

Shouro Dasgupta
In reply to this post by ajdamico
Dear Anthony,

Thanks again for your reply. The following worked for merge:

merged_data <- merge(shapes_fips,max_change,by="FIPS",all.x=T, all.y=F)


However, I think I am doing something wrong: I have 3109 FIPS codes in
my original data, but when I merge them with the shapefile's
SpatialPolygonsDataFrame, the merge does not work properly and produces many NAs.

Is it a good idea to convert the shapefiles into data.frame/data.table for
merging and then transform it back to shapefiles? This is what I have been
doing:

shapes <- readShapePoly("F:/GCM//tl_2014_us_county/tl_2014_us_county.shp")
shapes <- as.data.frame(shapes)
setnames(shapes, "GEOID", "FIPS")

shapes_fips <- shapes$GEOID
shapes_fips <- as.data.table(shapes_fips)
setnames(shapes_fips, "shapes_fips", "FIPS")
shapes_fips <- shapes_fips[with(shapes_fips, order(FIPS)), ]
shapes_fips$FIPS <- as.character(shapes_fips$FIPS)

merged_data <- merge(shapes_fips,max_change,by="FIPS",all.x=F, all.y=T)
merged_data <- as.data.table(merged_data)
merged_data <- merged_data[with(merged_data, order(FIPS)), ]

shapes$change <- merged_data$change
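
One hazard in assigning a merged column back by position is that base `merge()` sorts its result on the key by default, while the polygons stay in file order. The sketch below uses made-up data to show how positional assignment can then attach values to the wrong IDs, and how a key-based lookup avoids it. It is an illustration under those assumptions, not a diagnosis of the exact code above.

```r
# Hypothetical polygon IDs in file order (not sorted)
shapes_df <- data.frame(FIPS = c("01005", "01001", "01003"),
                        stringsAsFactors = FALSE)
max_change <- data.frame(FIPS = c("01001", "01003", "01005"),
                         change = c(-1.47, -0.90, -1.59),
                         stringsAsFactors = FALSE)

# merge() sorts its result on the key by default
merged <- merge(shapes_df, max_change, by = "FIPS")

# Positional assignment pairs row i of shapes_df with row i of merged,
# so values land on the wrong IDs whenever the two orders differ
shapes_df$by_position <- merged$change

# Key-based lookup puts each value on its own ID
shapes_df$by_key <- max_change$change[match(shapes_df$FIPS, max_change$FIPS)]
print(shapes_df)
```

In the printed result the two columns disagree on every row, even though no error or warning is raised.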


Thanks again!

Sincerely,

Shouro

On Tue, May 5, 2015 at 6:00 PM, Anthony Damico <[hidden email]> wrote:

> [...]

Re: Plot by FIPS Code using Shapefiles

Shouro Dasgupta
In reply to this post by Roger Bivand
Excellent suggestions, Professors! I really appreciate it. This is what I
am using:

library(data.table)
library(rgdal)
library(colourschemes)
library(RColorBrewer)
library(maptools)
library(maps)
library(ggmap)
library(classInt)


max_change is my csv file while shapes is my spatial data:

# Read the CSV and pad the FIPS codes to five digits
max_change <- read.csv("F:/GCM/max_change.csv")
max_change <- as.data.table(max_change)
max_change$FIPS <- sprintf("%05d",as.numeric(max_change$FIPS))

max_change <- max_change[with(max_change, order(FIPS)), ]

# Read the county polygons
shapes=readOGR(dsn="F:/GCM//tl_2014_us_county", "tl_2014_us_county")
shapes$FIPS = paste(shapes$STATEFP,shapes$COUNTYFP,sep="")

# Look up each polygon's GEOID in the data and attach the matching value
m = match(shapes$GEOID,max_change$FIPS)

shapes$change = max_change$change[m]
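
With 3233 polygons and 3109 data rows, at least 124 polygons will end up with NA after this lookup, assuming the data keys are unique. The sketch below, on made-up stand-ins for `shapes$GEOID` and `max_change`, shows one way to count the unmatched polygons and see which state prefixes they belong to.

```r
# Hypothetical stand-ins for shapes$GEOID and max_change
geoid <- c("01001", "01003", "02013", "72001")
max_change <- data.frame(FIPS = c("01001", "01003"),
                         change = c(-1.47, -0.90),
                         stringsAsFactors = FALSE)

m <- match(geoid, max_change$FIPS)

# Number of polygons with no matching data row, and which ones they are
n_unmatched <- sum(is.na(m))
unmatched   <- geoid[is.na(m)]
print(n_unmatched)

# State prefix (first two digits) of the unmatched codes
print(table(substr(unmatched, 1, 2)))
```

If the unmatched prefixes are all outside the area being mapped, the NAs are expected and not a sign of a failed join.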


I am plotting the results now. I am getting different plots regarding
county boundaries (most likely because I am doing something wrong). When I
use the *plot* command:

colours = brewer.pal(6,"PuRd")
sd = data.frame(col=colours,values=c(-2.5,-2,-1.5,-1,-.5,0))
sc = nearestScheme(sd)

plot(c(-129,-61),c(21,53),type="n",axes=FALSE,xlab="",ylab="")
title(main = "Mid-Century Projections (GISS-ER-2) using Max GAM")
plot(shapes,col=sc(shapes$change),add=TRUE,border="white",lwd=0.2,
     colorkey=T)


I get a "beautiful" plot with the county boundaries clearly separated, but
with *spplot*:

pal = brewer.pal(6,"Reds")
brks.eq = classIntervals(shapes$change, style = "jenks")
spplot(shapes, "change",xlim = c(-129,-61), ylim = c(21,53),
       at=brks.eq$brks,col.regions=pal, col="transparent",
       main = list(label="Mid-Century Projections (GISS-ER-2) using Max GAM"))


 The county boundaries/FIPS are not defined. What am I doing wrong? Thanks
again!

Sincerely,

Shouro

On Wed, May 6, 2015 at 12:23 PM, Roger Bivand <[hidden email]> wrote:

> [...]