Geoprocessing Workflows in Python

ENV 859 - Geospatial Data Analytics   |   Fall 2025   |   Instructor: John Fay  

Introduction & Learning Objectives

Having now covered the basics of Python, including how to work with built-in and third-party packages such as ArcPy, we are now ready to explore how spatial analysis can be done with Python. Using the Hurricane Mapping tool as a template, we’ll replicate the geoprocessing workflow we produced in the ArcGIS Pro geoprocessing modeler, but now in a fully transparent, fully reproducible Jupyter notebook. In doing so, we’ll cover the following:

Learning Objectives:

  • Continue to hone our basic Python coding skills, including interacting with the ESRI help documentation on ArcPy
  • Continue to develop our scripting skills, understanding that writing scripts is not necessarily a linear process
  • Creating an organized coding workspace
  • Importing the arcpy package alongside other useful Python packages
  • Finding and implementing the proper syntax for an ArcPy geoprocessing tool
  • Setting coding variables, including pathnames to spatial datasets
  • Using pathlib’s Path class to work with relative paths
  • Setting ArcPy environment variables using the arcpy.env module
  • Work with geoprocessing tool outputs
  • Execute geoprocessing tools in sequence
  • Include process update messages as well as map outputs in our code

✅ Task 1: Preparing the Workspace

As with all our spatial analysis tasks, everything begins with creating a tidy project workspace with subfolders to keep files organized.

  • Create a project folder on your V: drive. Name it whatever you want, but be sure it includes no spaces or unusual characters.
  • Within this project folder, create folders for your data:
    • First create a “data” folder, and within this data folder create folders called “raw” and “processed”
  • Within this project folder, create a folder for your code and an initial notebook file:
    • Create a folder named “src”
    • In this folder, create a new text file, renaming it “HurricaneTracker_v1.ipynb”
  • Download and unzip the North Atlantic IBTrACS point feature class into your data/raw folder.
  • Create a Readme.txt file in the project folder, and in this file include a short description of the project, your email, and the date.

Your workspace should now resemble this schematic:

    Project_folder/
         |
         ├ data/
         |   |
         |   ├ raw/
         |   |	├ IBTrACS.NA.list.v04r00.points.dbf
         |   |	├ IBTrACS.NA.list.v04r00.points.prj
         |   |	├ IBTrACS.NA.list.v04r00.points.shp
         |   |	└ IBTrACS.NA.list.v04r00.points.shx
         |   |
         |   └ processed/
         |
         ├ src/
         |   |
         |   └ HurricaneTracker_v1.ipynb
         |
         └ Readme.txt
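
If you would rather script this setup than create the folders by hand, the same layout can be sketched with pathlib. This is only a sketch and not part of the exercise: the helper name create_workspace and the placeholder Readme text are hypothetical, and the IBTrACS shapefile and notebook file still have to be added yourself.

```python
from pathlib import Path


def create_workspace(project_folder):
    """Create the data/raw, data/processed, and src folders plus a Readme.txt."""
    root = Path(project_folder)
    # parents=True builds intermediate folders; exist_ok=True makes reruns safe
    for subfolder in ("data/raw", "data/processed", "src"):
        (root / subfolder).mkdir(parents=True, exist_ok=True)
    readme = root / "Readme.txt"
    if not readme.exists():
        # Placeholder text: replace with your own description, email, and date
        readme.write_text("Project description, email, and date go here.\n")
    return root
```

Calling create_workspace with the path to your project folder (for example, a folder on your V: drive) builds the empty skeleton; running it a second time leaves existing folders and the Readme untouched.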

✅ Task 2: Code your Workflow

2.1 Initialize your notebook and add a short description

One main advantage of Jupyter notebooks is that Markdown cells make our code easy to follow. So we often start with a Markdown cell that provides a little background on what the notebook will do. I also find that adding a short description, sometimes even a bulleted workflow, helps me stay focused on the coding task at hand.

  • Open your project folder in VSCode
  • Open the Jupyter notebook file in the VSCode Editor
  • Set the Kernel to use the arcgispro-py3 kernel
  • Add a markdown cell, and in that cell add:
    • A notebook title, in a large font
    • A short description of what the code will do
    • Your name and the date

2.2 Import packages

Scripts and notebooks using packages typically import those packages early on in the code. This practice allows others to see up front what packages are required before running any code.

  • Add a new code cell to your notebook
  • Add a comment line indicating what this cell does: #Import packages
  • Import the ArcPy module: import arcpy
  • From the pathlib package, import the Path class: from pathlib import Path

2.3 Begin the workflow: Subset the TrackPoint shapefiles

Similar to adding a new process to our ArcGIS Pro geoprocessing modeler, we’ll start by adding code to execute a single tool, in this case the Select tool. Here, of course, we can’t simply drag and drop the tool into our code; instead, we need to determine the proper syntax for the tool we want to use.

2.3.1 Get the syntax for the tool

You have many options for identifying the proper syntax of a tool, but the most reliable is the ArcGIS Pro help.

  • Open the ArcGIS Pro on-line help: https://pro.arcgis.com/en/pro-app/latest/help/main/welcome-to-the-arcgis-pro-app-help.htm

  • Click on the Tool Reference menu bar and expand the Geoprocessing Tools menu list on the left side.

  • Find the Select tool from the Analysis>Extract toolbox.

  • Navigate to the Parameter section and click the Python tab to expose the Python syntax for the Select tool.

    Notes on ArcPy geoprocessing tools

    A few important facets of the structure and syntax of all ArcPy geoprocessing tools:

    • Each tool name is preceded by “arcpy.” and then the name of the toolbox in which the tool is found: arcpy.analysis.Select.
    • Geoprocessing tools can have a mix of required and optional parameters; in the documentation, the latter are enclosed in curly braces (“{}”).
    • For tools that have a spatial dataset as an input or output parameter, we provide the path string to the dataset.

So, to execute the Select tool, we need to provide the input feature class (‘in_features’), the output feature class (‘out_feature_class’), and optionally the selecting SQL expression (‘where_clause’).

2.3.2 Code the tool

Add a new code cell and type in the Select command, setting the parameters as follows:

Parameter            Value
in_features          The absolute path to the IBTrACS shapefile…
out_feature_class    memory\\TrackPoints (We’ll use the in-memory workspace for intermediate files!)
where_clause         "SEASON = 2018 And NAME = 'FLORENCE'"

While not necessary for code execution, I recommend making your code as legible as possible. This means: include the parameter names in your code, and enter each parameter on a separate line. Thus, your code would look something like:

#Process: Select the track points for the chosen storm
arcpy.analysis.Select(
    in_features = "V:\\HurricaneTracker_arcpy\\data\\raw\\IBTrACS.NA.list.v04r00.points.shp",
    out_feature_class = "memory\\TrackPoints",
    where_clause = "SEASON = 2018 And NAME = 'FLORENCE'"
)
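
The doubled backslashes in the path above are needed because a single backslash begins an escape sequence in an ordinary Python string. As a small aside, the sketch below compares that spelling with two equivalent ones: a raw string, and forward slashes passed through pathlib’s PureWindowsPath. The shortened path is only an illustration.

```python
from pathlib import PureWindowsPath

# Three ways to write the same Windows path (illustrative path only)
escaped = "V:\\HurricaneTracker_arcpy\\data\\raw"
raw = r"V:\HurricaneTracker_arcpy\data\raw"
from_slashes = str(PureWindowsPath("V:/HurricaneTracker_arcpy/data/raw"))

# All three hold exactly the same characters
print(escaped == raw == from_slashes)
```

Whichever spelling you choose, using it consistently makes path strings easier to compare when debugging.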

2.3.3 Test the tool

Now run your code. If all goes well, it should generate a message that looks something like the one shown below, and you should have a new feature class stored at the path “memory\\TrackPoints”.


Messages

Start Time: Wednesday, April 9, 2027 8:39:01 AM
Succeeded at Wednesday, April 9, 2027 8:39:03 AM (Elapsed Time: 1.52 seconds)

If not, you’ll have to debug your error.

2.4 Continue the workflow: Convert the points to a line

Repeat the steps above, but for the Points To Line tool (in the Data Management toolbox).

A few important notes:

  • The Input_Features parameter will be set to the path string used in the Select tool, i.e., memory\\TrackPoints
  • You can set the Output_Feature_Class to also be an in-memory feature class, e.g. memory\\TrackLines
  • If you include the parameter names in your code, you can add parameters in any order you want. If you omit them, however, you’ll need to code parameters in the exact order shown in the documentation. (And you can skip optional parameters by setting their values to be empty strings: "")

Code:

#Code to convert track points to a line feature, sorting on the ISO_TIME field
arcpy.management.PointsToLine(
    Input_Features="memory\\TrackPoints",
    Output_Feature_Class="memory\\Tracklines",
    Sort_Field="ISO_TIME"
)
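
The note about parameter order is ordinary Python behavior and can be tried without ArcGIS. The sketch below uses a hypothetical stand-in function, fake_points_to_line, whose four parameter names are meant to mirror the first four parameters of Points To Line; it only echoes its arguments and is not part of ArcPy.

```python
def fake_points_to_line(Input_Features, Output_Feature_Class,
                        Line_Field="", Sort_Field=""):
    """Hypothetical stand-in for a geoprocessing tool: echoes its arguments."""
    return {
        "Input_Features": Input_Features,
        "Output_Feature_Class": Output_Feature_Class,
        "Line_Field": Line_Field,
        "Sort_Field": Sort_Field,
    }


# Named parameters can be given in any order
by_name = fake_points_to_line(
    Sort_Field="ISO_TIME",
    Output_Feature_Class="memory\\Tracklines",
    Input_Features="memory\\TrackPoints",
)

# Unnamed parameters must follow the documented order; the empty string
# skips the optional Line_Field so that ISO_TIME lands in Sort_Field
by_position = fake_points_to_line(
    "memory\\TrackPoints", "memory\\Tracklines", "", "ISO_TIME"
)

print(by_name == by_position)
```

Both calls pass identical values; the named version is longer but is harder to get wrong when a tool has many optional parameters.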

2.5 Continue the workflow: Select Intersecting Counties

2.5.1 Set a variable to the URL of the USA Counties Feature Service

Recall that the US Counties dataset is an on-line feature service, accessed by providing its URL. We could embed this URL in the Select Features By Location geoprocessing tool, but as these URLs tend to be long and confusing-looking, I prefer to set them to a variable outside the code that executes the geoprocessing tool.

usa_counties = 'https://services.arcgis.com/P3ePLMYs2RVChkJx/arcgis/rest/services/USA_Counties_Generalized_Boundaries/FeatureServer/0'

2.5.2 Code the “Select Features By Location”

We have to treat this geoprocessing tool a bit differently than the two previous ones because it doesn’t generate an output, but rather creates a virtual selection of features from the input feature layer we specify. This also gives us a chance to examine the result object that is generated when a geoprocessing tool is run.

To do this, we simply create a variable, which we’ll call the_result, and set its value to be the output of the geoprocessing tool: the_result = arcpy.management.SelectLayerByLocation(...). When the tool is run, it won’t display a message in our notebook, but will instead store the result in this variable.

From this result object, we have better control over the outputs generated; see the ArcPy documentation on the Result object for more information. The element we want from the result object is the actual output, which is extracted via the result object’s getOutput(0) method.

Code:

#Process: Select by location
the_result = arcpy.management.SelectLayerByLocation(
    in_layer=usa_counties,
    overlap_type='INTERSECT',
    select_features="memory\\Tracklines"
)
#Extract the tool output to a feature layer object
selected_counties_lyr = the_result.getOutput(0)

Now we have a variable, selected_counties_lyr, that we can use in subsequent geoprocessing tools.

2.6 Continue the workflow: Save Selected Counties Features to a New Feature Layer

2.6.1 Set a variable to the path where the output feature class will be created

affected_counties = (And then set an absolute path to a shapefile that will be created in your data/processed folder, e.g. V:\\HurricaneTracker_arcpy\\data\\processed\\affected_counties.shp)

2.6.2 Code the Copy Features tool

Use the Copy Features tool to save the selected_counties_lyr to the feature class path string specified in the previous step.

Code:

#Process: Copy layer to output feature class
print(f'Saving affected counties to {affected_counties}')
arcpy.management.CopyFeatures(
    in_features=selected_counties_lyr,
    out_feature_class=affected_counties
)

2.6.3 Run the code

You should see a new feature class in your data/processed folder.


✅ Task 3: Streamline your workflow

If all went well, your code successfully executes your workflow of extracting storm points for a specific storm, connecting them to a storm track line, selecting counties that intersect that track, and saving the counties to a file. If you tried running your code from start to finish again, however, you’d likely run into an error as ArcPy would stumble in trying to overwrite existing datasets. Also, if you wanted to change storms, you’d have to dig fairly deeply into your code.

Here, we address those issues and also streamline our code to be more readable, easier to recode, and generally more robust. We do this by introducing a few new concepts: pathlib’s Path class, ArcPy’s env class, and some standard good coding practices.
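
One such practice is to collect the values you are likely to change, such as the storm season and name, in variables near the top of the notebook. A minimal sketch, assuming the variable names storm_season and storm_name (these names are ours, not from the code above):

```python
# User-editable values, kept together near the top of the notebook
# (variable names are illustrative assumptions)
storm_season = 2018
storm_name = "FLORENCE"

# Build the SQL expression from the variables instead of hard-coding it
where_clause = f"SEASON = {storm_season} And NAME = '{storm_name}'"
print(where_clause)
```

The resulting string can then be passed as the where_clause parameter of the Select tool, so switching storms means editing two lines rather than hunting through the workflow.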

3.1 Working with paths and the pathlib package

Your code may run now, but if you move your workspace, your path strings will point to incorrect locations. The solution is to use relative paths, and a package that helps with this is pathlib, specifically its Path class.

3.1.1 Import the Path class

  • Import the Path class into your coding environment in the same code cell where you import the ArcPy package.

    from pathlib import Path
    

3.1.2 Examine the Path class

  • Create a new code cell where we can play with the Path class and run the following commands:

    Path.cwd()
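
A few other Path features are worth trying in the same cell. The sketch below uses only standard pathlib calls; the one assumption, flagged in the comments, is that the notebook runs from the src folder so that the project folder is one level up.

```python
from pathlib import Path

# The current working directory, as an absolute Path object
cwd = Path.cwd()
print(cwd)

# Assumption: the notebook runs from the src folder, so the project
# folder is one level up
project_folder = cwd.parent

# The / operator joins path parts; nothing needs to exist on disk yet
raw_folder = project_folder / "data" / "raw"
track_points = raw_folder / "IBTrACS.NA.list.v04r00.points.shp"

print(track_points.name)      # file name, including its extension
print(track_points.suffix)    # the final extension of the file name
print(track_points.exists())  # True only if the file is actually there
```

Because every path is built from the working directory, the same code keeps working if the whole project folder is moved or copied to another drive. Geoprocessing tools expect path strings, so wrap a Path in str() when passing it to a tool.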

3.2 Environment settings in ArcPy with arcpy.env

Just as ArcGIS Pro has environment settings, so does ArcPy. These are set using the arcpy.env class, which is fully documented in the ArcGIS Pro help. We will use these settings to set the default workspace to our data/raw folder. We will also add code that allows us to overwrite existing files.

Environment settings should be made early in your script, but certainly after you import the arcpy package.
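
A minimal sketch of what these two settings might look like, assuming the notebook runs from the src folder. arcpy.env.workspace and arcpy.env.overwriteOutput are existing ArcPy environment settings; the variable names and the guarded import, which simply lets the path logic run on a machine without ArcGIS Pro, are our own additions.

```python
from pathlib import Path

# Assumption: the notebook runs from the src folder, so the project
# folder is one level up from the current working directory
project_folder = Path.cwd().parent
raw_folder = project_folder / "data" / "raw"

try:
    import arcpy  # available in the arcgispro-py3 environment
except ImportError:
    arcpy = None  # lets the path logic above run without ArcGIS Pro

if arcpy is not None:
    # Default workspace: bare dataset names are resolved against this folder
    arcpy.env.workspace = str(raw_folder)
    # Allow tools to overwrite outputs that already exist
    arcpy.env.overwriteOutput = True
```

In the course notebook itself the try/except is unnecessary, since arcpy is already imported in the first code cell.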

3.2.1 Set the default working directory

  • Create a new code cell below the one where you import packages.
  • Add the line arcpy.env.workspace =