merging shapes and plotting the physiographic boundary of the Colorado Plateau

Today I found myself needing to plot the physiographic boundary of the Colorado Plateau in Python. It’s been a while since I’ve touched on shapefiles (or anything on the blog) so I figured I’d write a quick blurb on reading and plotting this particular shapefile.

Data: shapefile of data from  Fenneman and Johnson 1946 [1] available at https://water.usgs.gov/GIS/dsdl/physio_shp.zip

Code to load & plot & write processed data: colorado_plateau.py

Python requirements: pyshp, shapely, matplotlib

What you’ll learn: reading shapefiles, merging polygon shapes in Python with shapely

The Data

The first challenge was finding the actual lat/lon coordinates defining the edge of the Colorado Plateau… it’s amazing how many papers in geology/geophysics plot the boundary but don’t actually reference where the heck they got their coordinates from. After much digging I FINALLY found a paper that actually cited their source: Hopper and Fischer 2018 [2] reference a 1946 publication by Fenneman and Johnson [1] titled “Physiographic divisions of the conterminous U. S.” and after a quick search I found the digitized data from that publication online at water.usgs.gov.

Here’s the summary page containing metadata: https://water.usgs.gov/GIS/metadata/usgswrd/XML/physio.xml

and a direct link to the zipped shapefile:  https://water.usgs.gov/GIS/dsdl/physio_shp.zip.

The dataset contains a large number of physiographic regions and the Colorado Plateau is subdivided into multiple regions, so the code below pulls out the regions within the Colorado Plateau and joins them into a single shape defining the full boundary. To run the code below, unpack physio_shp.zip wherever you downloaded it to and rename the folder to physio (to match expectations for the pyshp shapefile reader).

The Code

The full code is here.

The XML data for the shapefile defines a province code for different provinces, for which the Colorado Plateau sub-regions have a value of 21. So the code (1) reads the shapefiles, (2) finds the shapes with a province code of 21 and (3) combines them.

Step 1:  imports, reading arguments, reading the shapefile.

shapefile is the library for pyshp, otherwise pretty self explanatory:

import shapefile, os,sys
import matplotlib.pyplot as plt
from shapely.geometry import Polygon
from shapely.ops import cascaded_union

# read the arguments
fname=sys.argv[1] # path to physio.shp
outfolder=None
if len(sys.argv)>2:
    outfolder=sys.argv[2] # folder to store output

# read the shapefile
sf = shapefile.Reader(fname)

Step 2: Find the Colorado Plateau shapes.

The shapes are described in the records list of the shapefile object:

sf.records()

records() is a list of attributes for each shape and a single record looks like

[3.886, 9.904, 220, 15, 212, '21b', 'INTERMONTANE PLATEAUS', 'COLORADO PLATEAUS', 'UINTA BASIN', 21]

The final value is the province code — so we just need to save off the indeces for which that value is 21. It turns out the 3rd value in the record list is actually a cross-reference to a shape ID, but for some reason the indexing is offset by 2 when reading this shapefile with python. So the shape data for this shape would be accessed with:

sf.shapes()[218]

rather than 220. Not sure why it’s off by 2 (would expect it to be off by 1 due to python indexing), but in any case, my code simply records the list index as python sees it:

# find the record indeces for colorado plateau (province ID = 21)
i_rec=0
recs_to_plot=[]
for rec in sf.records():
    if rec[-1]==21:
        print(rec)
        print(i_rec)
        recs_to_plot.append(i_rec)
    i_rec=i_rec+1

# plot the individual records
plt.subplot(1,2,1)
for rec in recs_to_plot:
    pts=sf.shapes()[rec].points
    lons=[pt[0] for pt in pts]
    lats=[pt[1] for pt in pts]
    plt.plot(lons,lats,'.k')

As seen above — the coordinates for the shape boundaries for a record are in

sf.shapes()[rec].points

which is a list of longitude and latitude points (which the code unpacks for plotting). This section of code will generate the following outline of the Colorado Plateau regions:

Step 3: merging shapes

This is the fun bit! What we want is just the outer boundary of the union of all the shapes. The python library shapely lets us do this very easily by creating a list of shapely Polygon objects then combining them with the cascaded_union method:

# create a single shape for Colorado Plateau from union of sub-shapes
polies=[]
for rec in recs_to_plot:
    polies.append(Polygon(sf.shapes()[rec].points))
CP_bound=cascaded_union(polies)

# plot the exterior shape
lon,lat = CP_bound.exterior.xy
plt.subplot(1,2,2)
plt.plot(lon,lat,'.k')

and the resulting plot of just the exterior boundary:

Step 4: output the processed data 

The code also exports the lat/lon points defining that exterior boundary with:

# export the final shape as a CSV of boundary points
if outfolder is not None:
    f=open(os.path.join(outfolder,'ColoradoPlateauBoundary.csv'),'w')
    f.write("lon,lat\n")
    for i,j in zip(lon,lat):
        f.write(str(i)+","+str(j)+"\n")
    f.close()

I could have written some code to save the data in a shapefile format, but for such a small amount of data I find it easier to save a CSV and just create a Polygon from the list of points as I need it. I’m actually planning to create a Polygon that will be combined with GeoPandas to find sets of points falling within the plateau (GeoPandas lets you do database joins on geospatial data, it’s awesome!).

Running the Code

To run the code:

python colorado_plateau.py /path/to/physio/physio.shp /folder/for/output/

where the first argument is the path to the downloaded and unpacked shapefile and the second argument is the location to save the CSV file (this argument is optional — no data will be saved if not included).

References

[1] Fenneman, N. M., & Johnson, D. W. (1946). Physiographic
divisions of the conterminous U.S. Reston, VA: US Geological Survey,
Physiographic Committee Special Map. https://water.usgs.gov/GIS/metadata/usgswrd/XML/physio.xml

[2] Hopper, E., & Fischer, K. M. (2018), The changing face of the lithosphere-asthenosphere boundary: Imaging continental scale patterns in upper mantle structure across the contiguous U.S. with Sp converted waves. Geochemistry, Geophysics, Geosystems, 19 , 2 593 – 2 614 . https://doi.org/10. 1029/2018GC007476