Geopandas Exercise¶

  • Create a geodataframe from a CSV file with coordinate columns
  • Create a geodataframe from a shapefile
  • Exploring and Plotting geodataframes
  • Attribute subset
  • Dissolving features
  • Spatial subsets
  • Spatial joins
In [1]:
#Read in the packages
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point

1. Fetching and exploring spatial data¶

► Q1.1 Read the Permitted_Animal_Facilities.csv file into a geodataframe called gdf_permits. Assume the coordinate columns (Location Lat Num and Location Long Num) use the NAD 1983 coordinate reference system (WKID/EPSG = 4269).

In [2]:
#Read the CSV into a Pandas dataframe

#Construct geometries from the dataframe's coordinate columns

#Create a geodataframe from the dataframe, the geometry, and the coordinate reference system
gdf_permits = 

#Reveal the columns in the data
gdf_permits.columns
Out[2]:
Index(['Permit Number', 'Facility Name', 'Combined Owner',
       'Regulated Operation', 'Permit Type', 'Regulated Activity',
       'Allowable Count', 'Number Of Lagoons', 'Issued Date', 'Effective Date',
       'Expiration Date', 'Admin Region', 'County Name', 'Location Lat Num',
       'Location Long Num', 'Address 1', 'Address 2', 'City', 'State', 'Zip',
       'geometry'],
      dtype='object')

► Q1.2 Plot the geodataframe as a simple point map.

In [3]:
#Plot the geodataframe as simple points

► Q1.3 Subset points where values in the State column equal "NC" into a new dataframe called gdf_permits_nc.

In [4]:
#Subset NC records
gdf_permits_nc = 
#Report the shape of the geo dataframe
gdf_permits_nc.shape
Out[4]:
(2450, 21)

► Q1.4 Plot the subset features. For full credit:

  • Assign different colors to each unique value in the Permit Type column
  • Use the Set3 color map
  • Draw markers something other than the default, using an opacity of 0.4
  • Set the figuire size to 12 x 6
In [5]:
#Plot the NC records

► Q4 Read the HUC12.shp shapefile into a geodataframe named gdf_HUC12, and show the columns in the geodataframe.

In [6]:
#Read in the HUC12 feature class

#Reveal the columns
gdf_HUC12.columns
Out[6]:
Index(['OBJECTID_2', 'OBJECTID', 'HUC_8', 'HUC_10', 'HUC_12', 'ACRES',
       'NCONTRB_A', 'HU_10_GNIS', 'HU_12_GNIS', 'HU_10_DS', 'HU_10_NAME',
       'HU_10_MOD', 'HU_10_TYPE', 'HU_12_DS', 'HU_12_NAME', 'HU_12_MOD',
       'HU_12_TYPE', 'META_ID', 'STATES', 'GlobalID', 'SHAPE_Leng', 'GAZ_ID',
       'WBD_Date', 'VPUID', 'Shape_Le_1', 'Shape_STAr', 'Shape_STLe',
       'geometry'],
      dtype='object')

► Q5 Show whether the HUC_12 geodataframes has the same coordinate reference system as the Animal Permit (NC subset) one.

In [7]:
#Reveal whether this and the animal permit geodataframe  
# share the same coordinate reference system
Out[7]:
True

► Q6 Plot the features so that each HU_10_NAME appears as a different color

In [8]:
#Plot the features, showing HUC_10s as different colors

► Q7 Dissolve HUC_12 features based on the HUC_10 attribute, naming the output gdf_HUC10. The output should include the sum of the ACRES column and the first HU_10_NAME value.

In [9]:
#Dissolve HUC12 features on the HUC_10 attributes & display
gdf_HUC10 = 
gdf_HUC10.head()
Out[9]:
geometry ACRES HU_10_NAME
HUC_10
0304010101 POLYGON ((-81.39609 36.05608, -81.39592 36.055... 112698.0 Headwaters Yadkin River
0304010102 POLYGON ((-81.31413 36.00396, -81.31409 36.003... 120847.0 W Kerr Scott Reservoir-Yadkin River
0304010103 POLYGON ((-81.14188 36.05491, -81.14221 36.054... 134933.0 Reddies River-Yadkin River
0304010104 POLYGON ((-81.00537 36.14105, -81.00539 36.141... 117712.0 Roaring River-Yadkin River
0304010105 POLYGON ((-80.78394 36.39482, -80.78381 36.394... 69230.0 Mitchell River

► Q8 Subset the NC animal permit locations that fall within the HUC_10 features. Save the output as gdf_HUC_permits

In [10]:
#Intersect the two dataframes
gdf_HUC_permits = 

#Plot

► Challenge question: The gdf_HUC_permits dataframe now includes the HUC 10 name (in the HU_10_NAME attribute) for each animal permit location. We can now count the total number of animal permits in each HUC 10. Furthermore, because the Animal Permit data includes an attribute of the number of animals (Allowable Count attribute), we can also compute the total number of permitted animals for each HUC 10.

With this information, see if you can:

  • C1 Create a bar plot showing the number of permits by HUC 10
  • *TIP*: The value_counts() function can be useful here...
  • C2 Create a bar plot showing the total allowable count of animals in each HUC 10
  • *TIP*: You'll need to aggregate your gdf_HUC_animal data either via Pandas' groupby or GeoPandas' dissolve function... to compute the sum of allowable counts across each HUC10.
  • C3 Create a plot of HUC10s colored by the number of permits per 1000 acres.
  • I actually give you the code to plot # of permits per HUC_10 (below), but see if you can alter the code to show the total allowable count by HUC_10. Then, see if you can divide the values by area to compute density of permits and/or allowable count in each HUC_10...
In [11]:
#Create a bar plot of # of permits in each HUC 10
In [12]:
#Create a bar plot of the total # of allowable animals in each HUC 10
In [13]:
## Plot the # of permits in each HUC 10

#Spatially join the animal permits to each HUC 10 feature
gdf_sjoin = gpd.sjoin(left_df=gdf_HUC10,
                      right_df=gdf_permits_nc,
                      how='left',
                      predicate='contains')

#Dissolve on HUC 10
gdf_sjoin_HUC10 = gdf_sjoin.dissolve(
    'HU_10_NAME',
    aggfunc={'Allowable Count':'count','ACRES':'sum'})

#Plot based on Allowable Count attribute
gdf_sjoin_HUC10.plot('Allowable Count',legend=True);
In [14]:
#Plot the # of permits per 1000 acres