Python & GIS - Extending Python

ENV 859 - Geospatial Data Analytics   |   Fall 2024   |   Instructor: John Fay  

To run the exercises included in this session, download the ArcPy1_modules.exe file to a local folder and unpack it.

Introduction

The previous Python sections introduced us to the basics of Python: objects (numbers, strings, etc.), some scripting techniques (looping and conditional processing), and the general anatomy of a Python script (comments, variables, and a sequence of Python statements). While there's a great deal you can do with these basic elements of Python, we've really only scratched the surface.

In this tutorial we look beyond Python's set of built-in functions and into the world of Python modules and packages.

In its simplest form, a Python module (also sometimes called a “library” ) is just a Python script or set of scripts that are called from another Python script, similar to how a geoprocessing model in ArcGIS can contain other geoprocessing models. A number of useful modules are distributed with basic Python, and heaps more developed by the greater Python community can be imported into your Python scripting environment. In the exercises below, we examine some common Python modules in preparation for using the ArcPy module, which serves as the gateway for using ArcGIS within Python.

A Python package refers to a collection of modules or sometimes just to a more complex module. Behind the scenes, packages are organized a bit differently than modules, formally speaking. From a user’s perspective, however, they behave functionally the same. Indeed, the two names are often used interchangeably, as I will do here in this document.

Learning objectives:

A simple example: The math module
• Explain what Python modules and packages are and how they are used
• Import a module into your coding environment
• Use the math module to do computations in your code
• Learn what a module can do using the dir() and help() functions

Python’s built-in modules
• Find a list of all the modules that come packaged with base Python

Some useful Python modules
• Use the os module to interact with your operating system via Python
• Use the sys module to list your machine’s system settings
• Use the sys module’s argv list to read variables into your script
• Use the pathlib module to find files on your computer

More on importing packages
• Import packages so they can be referenced with a name you choose
• Import only sub-components of a package
• Explain the importance of namespace in Python

Introduction to ArcPy
• Import ESRI’s ArcPy package into your coding environment
• List the functions associated with ArcPy
• List the classes associated with ArcPy
• List the sub-modules included with ArcPy and what they do

Resources:


Prologue: The Power of Base Python

We’ve already written scripts that perform some complex tasks using only the most fundamental Python objects. Let’s review one more example.

  • Open the “ArcPy1_modules” folder in VS Code.

  • In VS Code, open the “1_ForestStockDemo.py” script.

    This script allows the user to enter a tree type (“loblolly”, “oak”, “cottonwood”, or “cypress”) and age (in years) and informs the user of the carbon stored in that tree according to known carbon stocking rates. As you can see, it does this with the use of strings, numbers, dictionaries, and some conditional statements.

  • Run the code, playing with different tree types and ages.

1. A simple example: The math module

Let’s begin by revisiting how Python can be used as a calculator.

  • We’ll need access to a Python prompt. From the VS Code Command Palette (Ctrl+Shift+P), run the Python: Start Terminal REPL command. (Tip: Just type REPL to find the command.)

  • At a Python prompt, type the following:

    5 * 5
    8 ** 2
    

Python works nicely to compute our output. But try to do anything beyond basic math, calculating the logarithm of a number for example, and you may encounter errors:

  • Attempt the following in Python:

    log(10)
    cos(2)
    

We get errors (and it’s not just because of the syntax). Turns out that Python itself is not able to compute logarithms, trigonometry, square roots, and other slightly more sophisticated mathematical functions. Surely, Python should be able to do that, but how?

The solution is to augment Python's built-in functionality by importing a module. Here, we examine a simple example of a module in action - the math module which brings, as you might guess, additional mathematical functionality to Python.

» Exercise 1: Importing and using the math module

  1. At the Python prompt, type the following to import the math module:

    import math
    

    This statement imports all the functionality of the math module to your current Python session. Or if this statement were included in a script, it would add the math functionality to your script.

  2. To see a list of all the functions associated with the math module type:

    dir(math)
    

    You may also see a list of the functions in the dropdown menu that appears after typing math. in the Python console…

  3. To get help on, and the syntax of, a specific math function, type the following (here for the log function):

    help(math.log)
    

    Or better yet, you can read the comprehensive module documentation on-line:
    https://docs.python.org/3.6/library/math.html

  4. Great! Now we have a function that can calculate logarithms. Let’s try it:

    log(10)
    
  5. OK, not quite, but we were close. In order to run functions associated with the math module, we need to state explicitly that the function is part of the math module, which is done by preceding the function name with the name of the module:

    math.pi
    math.log(10)
    math.cos(2)
    
  6. Challenges:

    • What is the cosine of pi?
    • Convert 2 * pi radians into degrees…
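The pattern from this exercise can be collected into a short script. The following is a minimal sketch that uses only functions that ship with the math module; the variable names are chosen here for illustration and the challenge questions above are left for you to solve.

```python
import math

# Square roots and logarithms live in the math module
root = math.sqrt(25)              # 5.0
natural_log = math.log(10)        # natural logarithm (base e)
log_base_10 = math.log(100, 10)   # optional second argument sets the base

# Trigonometric functions expect radians, so convert from degrees first
half_turn = math.radians(180)

print(root)
print(round(natural_log, 4))
print(round(log_base_10, 4))
print(math.isclose(half_turn, math.pi))
```

Every call is prefixed with `math.`, which is what the bare `log(10)` attempt above was missing.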

2. Python’s Built-in modules

Python is distributed as a core program, which includes the scripting language and its built-in functions (the part long overseen by Guido van Rossum, Python's creator and former "Benevolent Dictator for Life"), as well as a suite of “built-in” modules that work with the core scripting language, i.e. ones that do not need special installation. The list of modules included with Python is quite extensive and can be seen by clicking on the Global Module Index link at the main documentation site for the version of Python with which you are working:

https://docs.python.org/3/

You'll see that there are a great many modules at your disposal! On top of these, countless more modules are available to download. In fact, Jupyter itself is a Python module!
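The list of modules can also be inspected from within Python itself. The sketch below assumes Python 3.10 or later for `sys.stdlib_module_names`; on older versions it falls back to `sys.builtin_module_names`, which covers only the modules compiled into the interpreter and is therefore a much shorter list.

```python
import sys

# sys.stdlib_module_names was added in Python 3.10; older versions only
# expose the smaller set of modules compiled into the interpreter itself
module_names = getattr(sys, "stdlib_module_names", sys.builtin_module_names)

print(len(module_names))           # how many module names were found
print(sorted(module_names)[:10])   # the first few, alphabetically
print("sys" in module_names)       # sys is present in either list
```

Names beginning with an underscore are internal helpers and can usually be ignored.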

We certainly won’t attempt to learn all these modules, nor will you likely use all or even many of these modules. Instead, we'll focus a bit more on how to incorporate modules in your scripts, and then we will examine a few particularly useful modules, including the ArcPy module that enables us to control ArcGIS from Python.

» Exercise 2: Python’s standard modules

One of my favorite sites for finding useful Python modules and learning about them is Doug Hellmann’s Python Module of the Week: https://pymotw.com/3/. I find this site much more helpful than Python’s own documentation.

  • Have a look at Mr. Hellmann’s description of the math module: https://pymotw.com/3/math/index.html

    • Use the math module to print pi .

    • Using the format command, can you print pi to exactly 10 decimal places?

  • Next, look at the statistics module (https://pymotw.com/3/statistics/index.html)

    • Import the statistics module into your Python session

    • Compute the mean of the following list of numbers: [10,12,14,21,29,8,11,1,30]

  • Finally, have a quick look at the datetime module (https://pymotw.com/3/datetime/index.html)

    • theDate = datetime.date.today() returns what type of variable?
    • What are some operations you can do on this theDate variable?
    • Can you figure out how to print the current year from this theDate variable?
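To show the general shape of working with these two modules without giving away the questions above, here is a small sketch using different functions and made-up values. The list `ages` and the two dates are hypothetical and chosen only for illustration.

```python
import datetime
import statistics

ages = [3, 7, 7, 12, 40]             # hypothetical sample values
middle = statistics.median(ages)     # middle value of the sorted list
spread = statistics.pstdev(ages)     # population standard deviation

start = datetime.date(2024, 1, 1)
end = datetime.date(2024, 3, 1)
elapsed = end - start                # subtracting two dates gives a timedelta

print(middle)
print("{:.2f}".format(spread))       # format() controls the decimal places
print(elapsed.days)
print(start.isoformat())
```

Note that subtracting one date from another returns a new kind of object (a timedelta) rather than a number, which is why `.days` is needed to get the count of days.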

3. Some useful Python modules

The sys and os modules are two built-in Python modules that allow Python to interact with the machine on which it is running and its file system. These modules can be handy for, among other things, enabling your script to interact with things outside your script (e.g. script inputs and outputs), as well as navigate the folders and drives of your machine to nab a particular file or list its contents - something we’ll see later that is essential in using ArcPy.

Help on these modules can be found in the Python Global Module Index and in the tutorial sections of the on-line Python documentation:

os module:

sys module:

» Exercise 3a: The os module → osModuleDemo.py

Open the osModuleDemo.py in VSCode - or simply type the commands below at a Python command prompt or in a Jupyter notebook.

  1. First, import the os module:

    import os

  2. List the contents of your V: drive (returns a list object):

    os.listdir("V:\\")

  3. Get the current working directory (the place from where Python was started):

    os.getcwd()

  4. Create a new folder in your V: drive:

    os.mkdir("V:\\osplayground")

  5. Ensure that the folder you created exists:

    os.path.exists("V:\\osplayground")

  6. Set the current working directory to the new folder, then create a file, write text, and close it:

    os.chdir("V:\\osplayground")
    fileObj = open("myFile.txt",'w')
    fileObj.write("Hi!")
    fileObj.close()

  7. Open the file in a text editor to witness your handiwork:

    os.system("notepad myFile.txt")

    (Close the Notepad application that appears before proceeding.)

  8. Rename the file:

    os.rename("myFile.txt", "theFile.txt")

  9. Create a pathString variable and explore various uses of the os.path sub-module:

    pathString = "V:\\osplayground\\theFile.txt"    #Make a variable of the path to avoid retyping it
    print (os.path.basename(pathString))            #Gets the file name (without the path)
    print (os.path.dirname(pathString))             #Gets the path in which the file exists
    pathItems = os.path.split(pathString)           #Splits the path into a (folder, file name) pair

  10. Use the os.path.join function to create a path string from path components:

    pathString2 = os.path.join(pathItems[0],"MyOtherFile.txt")
    print(pathString2)

  11. Delete the [renamed] text file you just created:

    os.remove("V:\\osplayground\\theFile.txt")

  12. Go back to the V: drive and then delete the folder you created (if it's not in use):

    os.chdir("V:\\")
    os.rmdir("V:\\osplayground\\")

From these examples of some os and os.path commands, you get a feel for the module's ability to navigate and manipulate - even open - various file-system related objects. This can be quite handy when we get into manipulating spatial datasets within Python, as the inputs to many ArcGIS tools are the path names and file names of files on the computer.
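If you do not have a V: drive, the same sequence can be tried anywhere. The sketch below is one possible variation, not the contents of osModuleDemo.py: it assumes a throwaway folder created with the standard tempfile module, and it builds every path with os.path.join so that no drive letter or separator is typed by hand.

```python
import os
import tempfile

# Work inside a throwaway folder so nothing on a real drive is touched
with tempfile.TemporaryDirectory() as base_folder:
    playground = os.path.join(base_folder, "osplayground")
    os.mkdir(playground)

    # Build the file path from parts rather than typing separators
    file_path = os.path.join(playground, "myFile.txt")
    with open(file_path, "w") as file_obj:   # "with" closes the file for us
        file_obj.write("Hi!")

    new_path = os.path.join(playground, "theFile.txt")
    os.rename(file_path, new_path)

    # Split the full path back into its folder and file name
    folder, file_name = os.path.split(new_path)
    contents = os.listdir(playground)
    print(file_name)
    print(contents)

    # Clean up: the file first, then the now-empty folder
    os.remove(new_path)
    os.rmdir(playground)
    still_there = os.path.exists(playground)

print(still_there)
```

Opening the file in a `with` block is a common alternative to calling close() yourself; the file is closed automatically when the block ends.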

While the sys module is capable of many tasks, its primary use as far as we will be concerned is accepting user inputs from outside of Python. This is essential for calling Python scripts from other applications (e.g. from ArcGIS) if we want our scripts to accept variable inputs and/or generate outputs. The sys module also allows us to stop a script in its tracks, in cases when that's appropriate.

» Exercise 3b: Using sys to accept arguments → sysModuleDemo.py

What if we want our code to run with values set outside of the code itself? Previously, we looked at the input() function, but that expects the user to type in a value at a prompt. If we want the code to accept values without a prompt, we can use the sys module.

Here is an example of what we are aiming for:

  1. Open your Python Command Prompt (a shortcut is provided in the workspace)

  2. Type and run the following command at the command prompt:

    python V:\ArcPy1_modules\3b_sysModuleDemo.py John 2010
    
  3. Try again, replacing “John” and “2010” with another name and another year. What happens?

What we just did is run our 3b_sysModuleDemo.py script with the provided inputs of “John” and “2010” (or whatever you typed in at the prompt). These values (there could be one value, 2 values, or 543 values, however many you want) are passed into our script as an argument vector, i.e. a list of values whose items are identified by their order in the list.

The argv object of the sys module holds the contents of this list, with the first item always being the name of the script being called.

Now let’s look at the process in VSCode. It’s a bit tricky, as we have to add our arguments to a debugging file called launch.json (which is provided in our workspace). Open the launch.json file in VSCode. You can hover over the items to get an idea of what they are for, but specifically note the “args” line (line #14). You’ll see the two arguments we are providing to our script when it’s run in VSCode. Really, that’s all you need to know for now.

Tip: To create/access the launch.json file, select Run > Add Configuration... or Run > Open Configuration... respectively.

To run code that uses sys.argv, we run it in debug mode. This tells VS Code to read the launch.json file.

  1. Open the 3b_sysModuleDemo.py script (located in the ArcPy1 folder) in VSCode.

  2. Run the code as is. You will get an “IndexError: list index out of range” error. That’s because no arguments are supplied in basic Run mode.

  3. Run the code in debug mode (F5).

  4. Examine what's written to the Python console. Do you understand what's going on with respect to what’s supplied in the above launch.json file and the script?


The argument vector (sys.argv) will also be quite useful as we get into writing geoprocessing scripts that integrate ArcGIS and Python.

The sys.exit() function allows the script to be halted immediately. This is a useful error handling technique as shown when you put in a year past the current year…
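The two ideas in this section, reading values from the argument vector and halting with sys.exit(), might be combined as in the sketch below. This is not the contents of 3b_sysModuleDemo.py; the function name `read_inputs`, the usage message, and the literal list standing in for sys.argv are all assumptions made for illustration.

```python
import datetime
import sys


def read_inputs(argv):
    """Return (name, year) from an argument vector shaped like sys.argv."""
    # argv[0] is the script name, so two user values means three items
    if len(argv) < 3:
        sys.exit("Usage: python script.py <name> <year>")
    name = argv[1]
    year = int(argv[2])  # arguments always arrive as strings
    if year > datetime.date.today().year:
        sys.exit("The year {} is in the future.".format(year))
    return name, year


# A real script would call read_inputs(sys.argv); a literal list is used
# here so the sketch also runs when no arguments are supplied
name, year = read_inputs(["demo.py", "John", "2010"])
print("{} entered the year {}".format(name, year))
```

Checking the length of the list before indexing into it avoids the “IndexError: list index out of range” error seen when the script is run with no arguments.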

4. More on importing modules into your scripts

In the above exercises, we imported the modules with the simple import statement. There are some variations on how to import modules that lead to different means of using the modules in your script. The exercise below examines a variety of ways of loading a module into your script.

» Exercise 4 - Alternative import statements

  1. Type the following to import just the path component from the os module:

    from os import path
    print(path.join("C:\\temp","data.txt"))
    

    This can be useful if we constantly use the os.path statement in our script and want to avoid typing os. in front of the path. Seems insignificant in this case, but in writing long scripts with modules with long names, this can be handy.

  2. Type the following to import a module but as a shorter name:

    import pandas as pd
    dir(pd)
    

    Again, this serves mostly as a time saver in writing scripts by allowing us to type "pd" instead of the full module name of "pandas".

  3. Import all subcomponents of a module at the root level:

     from math import *
     sqrt(25)
    

    Note that by importing everything, i.e. by using *, we don't have to preface the sqrt function with the math module name (e.g. "math.sqrt"). This form of import places all of the module's functions at the root level, so the module name need not be included when calling them.

    The danger in using these shorthand methods for importing and using modules in our scripts, however, is that functions can get confused when two modules share the same function name. For example, if both moduleA and moduleB had a function called "factorate", and we imported both modules as…

     from moduleA import *
     from moduleB import *
    

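    …then any time we use "factorate" in our script, Python would silently use the version imported last (here, moduleB's), and a reader of the script could not easily tell which module's "factorate" was being called.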

  4. Delete objects from the script/session (these can be modules or variables):

     del path, pd
    

    This "checks back in" the path and pd names; their functions can no longer be used in your session until the modules are imported again.
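The name clash described in this exercise can be reproduced with two real standard-library modules, since math and cmath (complex-number math) both define a function called sqrt. The sketch below imports only that one name from each module to keep things short; a star import would behave the same way for every public name the two modules share.

```python
# Importing a name directly places it in the current namespace
from math import sqrt
value_from_math = sqrt(25)      # math.sqrt returns a float

from cmath import sqrt          # same name, so this replaces math's sqrt
value_from_cmath = sqrt(25)     # the same call now returns a complex number

print(type(value_from_math).__name__)
print(type(value_from_cmath).__name__)

# Keeping the module name (or an alias) in front removes the ambiguity
import math
import cmath as cm
print(math.sqrt(25), cm.sqrt(-25))
```

Nothing warns you when the second import replaces the first, which is why the fully qualified form is usually the safer habit in longer scripts.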


5. Introduction to ArcPy

When ArcGIS is installed on a machine, a Python package called ArcPy also gets installed. When imported into a script, this module grants access to everything ArcGIS, allowing us to do virtually anything we can do in ArcGIS Pro from within a Python script. Before digging too deeply into how spatial analysis is done within Python, let's take a moment to get acquainted with ArcPy.

» Exercise 5 - A quick introduction to ArcPy

  1. In the Python console, import ArcPy:
    import arcpy
    
  2. Take a gander at all the functions associated with ArcPy:

     funcList = dir(arcpy)
     print(len(funcList))
    

    There are > 1300 names (functions, classes, and sub-modules) associated with ArcPy!

  3. Have a look at the raw help file for ArcPy:
     help(arcpy)
    
  4. Display the scripting syntax for a particular ArcPy command (here the Spatial Analyst slope function):

    help(arcpy.sa.Slope)
    
  5. Open up the ESRI help documentation on ArcPy and its functions, classes, and modules (more on what these are will come later):

  6. Familiarize yourself with the ArcGIS Pro help for specific functions.

    import arcpy
       
    # Set the current workspace
    arcpy.env.workspace = "c:/data/DEMS"
       
    # Get and print a list of GRIDs from the workspace
    rasters = arcpy.ListRasters("*", "GRID")
    for raster in rasters:
        print(raster)
    

    Do you understand what each line in this Python script does? Could you implement it for a workspace?

Take a moment and familiarize yourself with all the resources available that provide information on both what's available in ArcPy and how to use it. The above examples all pull from documentation linked to the very version of Python you are using rather than some static document.
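Because dir() returns every name in a package, it can help to sort those names into functions, classes, and sub-modules. The sketch below does this with the standard inspect module. It uses the os module as a stand-in so that it runs without ArcGIS installed; passing arcpy to the hypothetical `summarize_module` function instead is an untested assumption about how it would behave on that package.

```python
import inspect
import os  # stand-in for arcpy so the sketch runs without ArcGIS


def summarize_module(module):
    """Group a module's public names by the kind of object they refer to."""
    summary = {"functions": [], "classes": [], "modules": [], "other": []}
    for name in dir(module):
        if name.startswith("_"):
            continue  # skip private and special names
        obj = getattr(module, name)
        if inspect.ismodule(obj):
            summary["modules"].append(name)
        elif inspect.isclass(obj):
            summary["classes"].append(name)
        elif inspect.isroutine(obj):
            summary["functions"].append(name)
        else:
            summary["other"].append(name)
    return summary


summary = summarize_module(os)
for kind, names in summary.items():
    print(kind, len(names))
```

The "other" group collects plain values such as strings and numbers that a module exposes alongside its functions and classes.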

ESRI's Python package has evolved tremendously with each new version of ArcGIS. Consequently, any printed document or book on the shelf can become outdated quickly. Information accessed from the installation or on-line will be (or at least should be) kept up to date!


Summary

Other than listing some coordinates of Sara the turtle and importing the ArcPy module, these past three tutorials really have not dealt with spatial analysis or ArcGIS at all. But they have laid important groundwork for writing useful and robust scripts that conduct spatial analysis.

By gaining command of numbers, strings, and collections within the scripting world we can manipulate the inputs and arguments for running ArcGIS commands via ArcPy. By writing scripts that can iterate through lines of text in a text document or items in a list, or that can execute statements only when a set of preconditions are met, we have a great deal of control over guiding an analytical workflow from start to end. And finally, by learning what Python modules exist and how to incorporate them into our scripts, we vastly expand the capabilities of our scripts.

What remains is gaining command of the ArcPy module so that we can build scripts that can do virtually anything that Desktop ArcGIS can do - and more, since we now have superior control over iteration, decision making, and even calculations from within a scripting environment than we do from interacting with the desktop user interface or even the model builder.

Up next, then, is one more tutorial that discusses the structure of the ArcPy module and how it's used to execute ArcGIS tools, set geoprocessing environments, interact with feature data, create feature classes, and more.