ENV 859 - Advanced GIS

ArcPy - Scripting with ArcPy

Introduction

In previous sessions, we covered the very basics of Python: variables, data types, controlling the flow of code (conditional and looping statements), and using modules. We also discussed the general steps in writing, running, and debugging Python scripts using PythonWin. Here, we build off our fundamental understanding of writing Python scripts (and get more practice with the concepts mentioned above) by learning the core techniques for using ESRI’s ArcPy module to write scripts that can perform entire geospatial analysis workflows.

The topics covered are listed below. NOTE: To run these, we'll need ArcGIS Pro installed, a cloned Python environment with PythonWin (or Spyder) installed, and the ArcPyDemo1 workspace (located here) downloaded and unpacked on your V: drive. Some exercises also require that the W: drive be properly mapped.

Learning objectives

ArcPy - Scripting with ArcPy

♦ GIS operations from Python:
1. Running geoprocessing tools in Python (link)
  » Exercise 1a: Executing a tool using Python within ArcGIS Pro
  » Exercise 1b: Executing a tool using Python within an IDE
  ► Key takeaways for running geoprocessing tools in Python:
  » Challenge!
2. ArcPy functions (link)
  » Exercise 2: Functions
3. ArcPy classes (link)
  ♦ Properties
  ♦ Methods
  » Exercise 3: Classes
4. Using environment settings in Python (link)
  » Exercise 4: Getting and setting environment values
5. Allowing user inputs in your script: sys.argv and arcpy.GetParameterAsText()
  » Exercise 5: Using sys.argv
6. Running python scripts from ArcGIS & sending status messages (warnings, errors, etc.) (link)
  » Exercise 6: Running scripts from ArcGIS Pro

♦ Working with geospatial data:
7. Describing data (link)
  » Exercise 7: Describing data with the describe object
8. Accessing, updating, and creating data using cursors (link)
  » Exercise 8: Working with Cursors
9. The Geometry object: Reading and Editing feature geometries. (link)
  Point objects vs. Point/Multipoint/Polyline/Polygon Geometry objects


♦ GIS operations from Python:

1. Running geoprocessing tools in Python (link)

One of the key features of ArcPy is its ability to run any ArcGIS tool from Python. Here we examine the fundamentals of running ArcGIS tools from Python. We begin by diving in, and then review a few approaches for teaching ourselves more about how to incorporate ArcGIS tools into our Python scripts.

» Exercise 1a: Executing a tool using Python within ArcGIS Pro

You may have, at one point or another, noticed that you can open a Python window from within ArcGIS. While this Python prompt is not suited for writing scripts (debugging is a challenge), it is useful for running specific Python statements. Here, we'll explore how ArcGIS commands are executed in Python from the ArcGIS Python window.

  1. Unzip the ArcPyDemo1.zip file to your V: drive and open the project. This is a sample dataset of the San Diego area including roads, a set of study quads, areas below 250 m in elevation and areas less than 40% slope.

  2. Open up the Python command line window in ArcGIS Pro: Analysis Tab>Python button.

  3. At the Python prompt, type the following command to buffer the roads feature class 500 meters.

Notice how the intellitype feature of ArcGIS helps you format the command and how some basic information on the tool is provided at the right hand side.

Also notice that there is no need, when running Python from within ArcGIS, to import ArcPy; it is imported automatically for obvious reasons. The Python prompt, however, works exactly like the interactive prompt in PythonWin.

  1. Run the command and examine the output.

  2. Use the up-arrow key on your keyboard at the Python prompt to scroll up to the Python statement you just ran.
    Edit the command so that it now appears as below:

Here we've altered the command to dissolve all the output features. This option to dissolve is the 6th argument in the tool’s syntax. The first 3 are required, but we skip the 4th and 5th ones ("line_side" and "line_end_type") by supplying empty strings, to get to the 6th. Arguments are supplied in a specific order; we cannot skip arguments to get to a specific one. Instead we can supply empty strings to assign default values. We'll talk more about command parameters shortly.
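The commands themselves are not reproduced above, so the following is only a sketch of what the two Buffer calls might look like. The layer name "roads" and the output names are assumptions, and build_buffer_args is a hypothetical helper written just to make the placeholder idea explicit; arcpy.analysis.Buffer is the ArcPy function for the Buffer tool.

```python
def build_buffer_args(in_features, out_features, distance, dissolve_option=None):
    """Return Buffer's positional arguments as a list (hypothetical helper)."""
    args = [in_features, out_features, distance]  # the three required arguments
    if dissolve_option is not None:
        # Empty strings stand in for line_side and line_end_type (4th and 5th),
        # so those keep their defaults and dissolve_option lands in 6th place.
        args += ["", "", dissolve_option]
    return args


def run_buffer(args):
    """Run the Buffer tool; this part needs ArcGIS Pro's Python environment."""
    import arcpy  # imported here so the helper above also works without ArcGIS
    return arcpy.analysis.Buffer(*args)


# Assumed layer and output names, for illustration only
simple = build_buffer_args("roads", "roads_buff", "500 Meters")
dissolved = build_buffer_args("roads", "roads_buff_all", "500 Meters", "ALL")
print(simple)
print(dissolved)
```

In the ArcGIS Pro Python window, run_buffer(dissolved) would then execute the dissolved buffer. Naming the argument instead (dissolve_option="ALL") should also work and avoids the empty strings, though the positional form is the one described here.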

  1. The outputs of the above commands are added to the map, and as part of our map, we can use these outputs in later Python commands just as we did the original feature classes. Alternatively, we can assign the outputs to Python variables and use them in subsequent commands:

  1. Where are we writing all these outputs? Check the environment settings to see whether that's determining where they are saved. Turns out that we have to be explicit about path name if we want to control where the outputs go. We can do that with variables:
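As a rough sketch of controlling output locations with variables: the scratch folder path and file names below are assumptions, and scratch_path and buffer_to_scratch are hypothetical helpers. The idea is simply to build the full output path first and hand that to the tool.

```python
import os

# Assumed location of the scratch folder; adjust to your own setup
scratch_folder = "V:/ArcPyDemo1/Scratch"


def scratch_path(file_name, folder=scratch_folder):
    """Join a folder and a file name into one full output path."""
    return os.path.join(folder, file_name)


def buffer_to_scratch(in_features, file_name, distance):
    """Buffer in_features and write the result to the scratch folder."""
    import arcpy  # needs ArcGIS Pro's Python environment
    out_path = scratch_path(file_name)
    result = arcpy.analysis.Buffer(in_features, out_path, distance)
    # The Result object remembers the output; getOutput(0) returns its path,
    # which can be stored in a variable and passed to the next tool.
    return result.getOutput(0)


print(scratch_path("roads_buff.shp"))
```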

The Python window in ArcGIS is a good way to dive into using tools in Python and should shed a bit more light on how to go about running tools via Python. The intellitype feature and the brief help were useful in structuring our syntax, but there are other places to go to get a more thorough explanation of how specific tools are used in Python.

» Exercise 1b: Executing a tool using Python within an IDE

Here we'll do a task similar to what we did in the previous example, but from entirely within PythonWin. We'll create a script that selects roads of a certain class value and buffers them a given distance.

  1. In the scripts folder of the ArcPyDemo folder, create a new Python script.

  2. View the help on the Select (Analysis) tool (here's a link). Scroll down to the end of the page to see an example of how the tool is used in Python. Let's start by copying what ESRI gives and tailoring it to our example. Copy the Select Example 2 (stand-alone Python script) to the clipboard and paste it into your Python script:

  3. Change arcpy.env.workspace to point to the "SanDiego" folder within your data folder: V:/ArcPyDemo1/Data/SanDiego. (This directory, coincidentally, also contains a shapefile called majorrds.shp!)

  4. Change the out_feature_class variable so that the majorrdsClass4.shp shapefile is created in your scratch folder.

  5. Save and run the Python script. Then check your scratch folder to see whether the file was created. (A link to working code is here)

  6. Now, let's add another set of commands to buffer our "class 4" roads.

    Some things to keep in mind: out_feature_class is actually the input for this tool; buffRoads is the tool output. Also, as in the preceding example, we have two "placeholder" inputs (the empty strings) for the two optional parameters, so that we can assign "ALL" as the dissolve option.

  7. Run the script and examine your scratch folder. Or better yet, open the outputs in ArcGIS. (Working code link.)

    Note you'll have to manually delete the majorrdsClass4.shp file each time you run this. We'll see how to fix this soon!
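Since the script itself is only linked, here is a minimal sketch of how the select-then-buffer steps might fit together. It is not the linked working code: the scratch folder path, the buffRoads.shp file name, the 500 meter distance, and a text field named CLASS are all assumptions, and class_where_clause is a hypothetical helper.

```python
import os


def class_where_clause(field, value):
    """Build a where clause such as "CLASS" = '4' (assumes a text field)."""
    return '"{}" = \'{}\''.format(field, value)


def select_and_buffer(workspace, scratch_folder, road_class="4",
                      distance="500 Meters"):
    """Select roads of one class, then buffer and dissolve them."""
    import arcpy  # needs ArcGIS Pro's Python environment

    arcpy.env.workspace = workspace
    in_features = "majorrds.shp"  # found via the workspace set above
    out_feature_class = os.path.join(scratch_folder, "majorrdsClass4.shp")
    buffRoads = os.path.join(scratch_folder, "buffRoads.shp")

    # Step 1: copy only the roads matching the where clause
    arcpy.analysis.Select(in_features, out_feature_class,
                          class_where_clause("CLASS", road_class))

    # Step 2: the Select output is the Buffer input; the two empty strings
    # are placeholders so that "ALL" reaches the dissolve option
    arcpy.analysis.Buffer(out_feature_class, buffRoads, distance, "", "", "ALL")
    return buffRoads


print(class_where_clause("CLASS", "4"))
# Example call (paths are assumptions):
# select_and_buffer("V:/ArcPyDemo1/Data/SanDiego", "V:/ArcPyDemo1/Scratch")
```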

 

►Key takeaways for running geoprocessing tools in Python:

 

» Challenge!

 

2. ArcPy functions (link)

Functions in ArcPy execute a particular task within a script and can be incorporated into a larger program. All ArcGIS geoprocessing tools are accessible as ArcPy functions, but not all ArcPy functions are ArcGIS tools. Functions can, for example, return a list of fields within a table, set the analysis cell size, or extract the extent of a feature class.

A full list of ArcPy functions is available here:
http://pro.arcgis.com/en/pro-app/arcpy/functions/alphabetical-list-of-arcpy-functions.htm

The following exercise offers some example applications of ArcPy functions...

» Exercise 2: Functions

→ First we'll investigate the Exists function (link):

 

Now we'll explore the ListFields function (link):
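The exercise code is not shown here, so the following is a sketch of how the two functions might be combined: check that a dataset exists, then list its fields. arcpy.Exists and arcpy.ListFields are the ArcPy functions named above; describe_fields and list_fields_if_exists are hypothetical helpers, and the example path is an assumption.

```python
def describe_fields(fields):
    """Return 'name (type)' strings for a list of field objects."""
    return ["{} ({})".format(field.name, field.type) for field in fields]


def list_fields_if_exists(dataset):
    """List a dataset's fields, or return an empty list if it is missing."""
    import arcpy  # needs ArcGIS Pro's Python environment

    if not arcpy.Exists(dataset):
        print("{} does not exist".format(dataset))
        return []
    # ListFields returns a list of field objects, each with name, type, etc.
    return describe_fields(arcpy.ListFields(dataset))


# Example call (path is an assumption):
# print(list_fields_if_exists("V:/ArcPyDemo1/Data/SanDiego/majorrds.shp"))
```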

 

Challenge!

 

3. ArcPy classes (link)

In the exercise above, the ListFields function returns a list. Each item in this list is a field object; more precisely, each item is an ArcPy object belonging to the field class.

A class is roughly the same as what we've been calling "data type". All scripting objects (that is, anything we can assign to a variable in Python) belong to a class. The class to which an object belongs determines what properties it has and what we can do with it, i.e., its methods.

To use a familiar example, in the code myName = "John", the variable myName becomes an instance of the Python string class and thus inherits all the properties (its length, whether it's alphanumeric, etc.) and all the methods (split, index, upper,…) defined by the Python string class.

ArcPy has many, many of its own classes. Some examples are point, field, extent, and value table (e.g. for use in reclassifications). A full list of Classes available in ArcPy is available here
http://pro.arcgis.com/en/pro-app/arcpy/classes/alphabetical-list-of-arcpy-classes.htm

As in all of Python, classes are used in ArcPy to create objects that have a specific set of properties and methods. For instance, the following statement creates a new point object, meaning a script variable assigned to the ArcPy “point” class:

The myPoint variable is now said to be a “Point object”, and it therefore assumes the defined properties and methods of any ArcPy point, listed here: http://pro.arcgis.com/en/pro-app/arcpy/classes/point.htm . What then are these properties and methods?

♦ Properties

An object’s properties define or describe an object. For example, our point object has properties that define its X, Y, and Z coordinates, as well as a measure and ID property.

We can set our point’s X and Y coordinates as follows:

Some properties can be both read and changed within a script: a point object can be moved by changing its X and Y property values. However, other properties can only be read. An example would be a raster object’s “format” property which indicates what kind of raster it is (e.g. GRID, TIFF, etc.), and can only be changed by converting the raster, not simply by changing its property value.

♦ Methods

An object’s methods are all the actions that you can perform using the object. For example, a point object has a method called “within”:

This method allows you, via a script, to determine whether a point falls within another [point, line, or polygon] object.
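To tie the pieces together, here is a sketch of creating a Point object, setting and changing its properties, and calling its within method. The coordinates are made up, move_point and point_demo are hypothetical helpers, and the polygon argument is assumed to be an ArcPy geometry obtained elsewhere.

```python
def move_point(point, dx, dy):
    """Shift a point by changing its X and Y properties in place."""
    point.X += dx
    point.Y += dy
    return point


def point_demo(polygon):
    """Create a point, move it, and test whether it falls within polygon."""
    import arcpy  # needs ArcGIS Pro's Python environment

    myPoint = arcpy.Point()   # a new, empty Point object
    myPoint.X = 480000.0      # made-up coordinates, for illustration only
    myPoint.Y = 3640000.0
    move_point(myPoint, 100.0, 0.0)  # X and Y are read/write properties
    # within() is a method: an action performed using the object
    return myPoint.within(polygon)
```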

Classes, properties, and methods are by no means unique to ArcPy. In fact, whenever we assign a variable in Python, it becomes a member of a class, and we can discover which class using the type() function. For example, if we assign X = "Python is fun", X becomes a member of the Python string class, and it has properties (e.g. a length, whether it's upper case) and methods (capitalize, replace) of its own.
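A quick way to see this for yourself, using only plain Python (no ArcPy needed). Note that for strings, what we loosely call "properties" here are read through the len() function and methods such as isupper().

```python
X = "Python is fun"

print(type(X))                 # the class X belongs to
length = len(X)                # its length
is_upper = X.isupper()         # whether it is all upper case
capitalized = "python".capitalize()
replaced = X.replace("fun", "useful")

print(length, is_upper, capitalized, replaced)
```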

 

» Exercise 3: Classes

 

4. Using environment settings in Python (link)

Environment variables, such as the current and scratch workspace locations, are set as properties of the ArcPy env class. Setting these can be useful and, in some cases, essential. An example of this is seen in Exercise 1b above where we set the "workspace" environment variable (env.workspace = "C:/data"); in doing so, we no longer have to supply full paths to the "majorrds.shp" dataset as ArcPy now assumes the dataset exists in the current workspace.

Environment variables work just as they do in the ArcGIS Pro application: once set, other procedures can refer to those values which facilitates coding.
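A minimal sketch of setting a few environment values follows. The property names workspace, scratchWorkspace, and overwriteOutput belong to arcpy.env; the paths are assumptions, and configure_environment is a hypothetical helper that takes the env object as an argument. Setting overwriteOutput to True lets tools replace existing outputs, which would remove the need to delete majorrdsClass4.shp by hand in Exercise 1b.

```python
def configure_environment(env, workspace, scratch_workspace, overwrite=True):
    """Set common environment values on an env object such as arcpy.env."""
    env.workspace = workspace                  # where inputs are looked up
    env.scratchWorkspace = scratch_workspace   # where temporary data goes
    env.overwriteOutput = overwrite            # allow replacing old outputs
    return env


# Inside ArcGIS Pro's Python environment (paths are assumptions):
#   import arcpy
#   configure_environment(arcpy.env,
#                         "V:/ArcPyDemo1/Data/SanDiego",
#                         "V:/ArcPyDemo1/Scratch")
#   print(arcpy.env.workspace)   # getting a value back
```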

» Exercise 4: Getting and setting environment values

 

5. Allowing user inputs in your script: sys.argv and arcpy.GetParameterAsText()

We can reuse scripts we write by editing the inputs in the script itself, e.g. changing paths to point to different input feature classes or applying different buffer distances. However, it’s far more elegant to instead specify certain variables in our script to be user inputs with values specified at run time. Python enables user input via the argv object, which is contained in the sys module - otherwise called the sys.argv object.

The sys.argv object is, in computer jargon, an argument vector - which just means it’s a list containing arguments. The values of this list are supplied when the user runs the script, either from within PythonWin or ArcGIS. (More on that in a bit...)
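As a small illustration in plain Python, the sketch below splits the argument list into the script name and the user inputs. read_inputs is a hypothetical helper, and the example values in the final comment are assumptions.

```python
import sys


def read_inputs(argv):
    """Split an argument list into (script name, list of user inputs)."""
    script_name = argv[0] if argv else ""   # first item is the script itself
    user_inputs = list(argv[1:])            # user inputs start at index 1
    return script_name, user_inputs


script_name, user_inputs = read_inputs(sys.argv)
print("Script:", script_name)
print("Inputs:", user_inputs)

# Run as:  python buffer.py roads.shp "500 Meters"
# then user_inputs would hold the two values typed after the script name.
```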

» Exercise 5: Using sys.argv

 

ArcPy has its own counterpart to sys.argv for passing values into Python scripts: arcpy.GetParameterAsText(). More on this function here: http://pro.arcgis.com/en/pro-app/arcpy/functions/getparameterastext.htm. It works much like sys.argv, with one key exception: GetParameterAsText does not include the script's name as its first element, so user inputs begin at index 0 rather than 1.

Why use one over the other? sys.argv has limitations on the number of characters it can accept, whereas GetParameterAsText has no character limit, so it would seem the latter is the better choice. However, as we'll see later, it's sometimes useful to know the name of the script, which is stored as sys.argv[0], and sys.argv does not require an expensive ArcGIS license. It's useful to know both are at your disposal.
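The index offset between the two approaches can be sketched as follows. get_input is a hypothetical helper: it uses arcpy.GetParameterAsText when ArcPy is available and otherwise falls back to sys.argv, shifting the index by one to skip the script name. The example argument list is an assumption.

```python
import sys


def get_input(index, argv=None):
    """Return user input number `index`, counting from 0."""
    if argv is not None:
        # An explicit argument list: skip the script name at position 0
        return argv[index + 1]
    try:
        import arcpy  # only available in an ArcGIS Python environment
    except ImportError:
        return sys.argv[index + 1]          # same offset for sys.argv
    return arcpy.GetParameterAsText(index)  # no offset needed here


example_args = ["buffer.py", "roads.shp", "500 Meters"]
print(get_input(0, example_args))
print(get_input(1, example_args))
```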

 

6. Running python scripts from ArcGIS & sending status messages (warnings, errors, etc.) (link)

Now that we have a way of specifying user input in our scripts, we can link our code back to ArcGIS Pro as script tools, i.e., so that they can be run as tools in a geoprocessing model. We can also set our scripts to send messages back to the ArcGIS Status window -- as opposed to "printing" messages to an interactive window that wouldn't appear when run from ArcGIS.
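For the messaging side, ArcPy provides AddMessage, AddWarning, and AddError. The sketch below wraps them in a hypothetical send_message helper; the level names "info", "warning", and "error" are my own labels, not ArcPy terms.

```python
# Maps an informal level name to the ArcPy function that reports it
MESSAGE_FUNCTIONS = {
    "info": "AddMessage",
    "warning": "AddWarning",
    "error": "AddError",
}


def send_message(text, level="info", messenger=None):
    """Send text to the ArcGIS message window at the given level."""
    if messenger is None:
        import arcpy  # needs ArcGIS Pro's Python environment
        messenger = arcpy
    # Look up e.g. arcpy.AddWarning by name and call it with the text
    getattr(messenger, MESSAGE_FUNCTIONS[level])(text)


# Inside a script tool:
#   send_message("Buffering roads...")
#   send_message("Output already exists", "warning")
```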

» Exercise 6: Running scripts from ArcGIS Pro


♦ Working with geospatial data:

7. Describing data (link)

When writing a script for a certain purpose, you may run into a case where you need to know some property of a given geospatial dataset. What is its extent? Is it a point or a polygon feature class? Is it a floating point or integer raster? How many bands does the raster have? What is its spatial reference?

All these questions are answered by creating a describe object for a given dataset, which is simply a Python dictionary listing all the dataset's properties. Here is an example of how to create a describe object, here assigned the variable name dsc for the climate.shp shapefile in the W:\359Data\SanDiego workspace.

» Exercise 7: Describing data with the describe object

Once created, the describe object allows us to extract properties of the climate.shp file. These two lines print out the name of the Object ID field (i.e., each row's internal ID) and whether the climate feature class has a spatial index associated with it.

To show how this might be used programmatically, we can add these two lines which build a spatial index for the climate.shp feature class, but only if it doesn't have one already:

The describe object is a bit tricky in that it is dynamic: its properties vary depending on what type of dataset is being described. For example, a raster dataset has a "cell size" property, but a feature class would not. However, as the describe object is structured as a Python dictionary, we can easily reveal all the associated properties just by looking at the object's keys:
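The steps in this section might look roughly like the sketch below. It assumes the dictionary keys match the documented property names (OIDFieldName, hasSpatialIndex) and uses arcpy.management.AddSpatialIndex to build the index; needs_spatial_index and index_if_missing are hypothetical helpers.

```python
def needs_spatial_index(dsc):
    """True when a describe dictionary reports no spatial index."""
    return not dsc.get("hasSpatialIndex", False)


def index_if_missing(dataset):
    """Describe a dataset, add a spatial index if needed, return its keys."""
    import arcpy  # needs ArcGIS Pro's Python environment

    dsc = arcpy.da.Describe(dataset)   # a plain Python dictionary
    print(dsc["OIDFieldName"])         # name of the Object ID field
    print(dsc["hasSpatialIndex"])      # True or False
    if needs_spatial_index(dsc):
        arcpy.management.AddSpatialIndex(dataset)
    # The keys reveal every property available for this dataset type
    return sorted(dsc.keys())


# Example call; the raw string keeps the backslashes literal:
# index_if_missing(r"W:\359Data\SanDiego\climate.shp")
```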

Note: In previous versions of ArcPy, describe objects were a different, more unwieldy data type, not a dictionary. In fact, a describe object can still be created in the legacy format using arcpy.Describe() (the da is dropped). I see no reason to use this option, though you can read about it here. It's a real pain to use relative to the dictionary returned by arcpy.da.Describe() and is only kept to support old scripts.

 

8. Accessing, updating, and creating data using cursors (link)

Reading and writing data in a feature attribute table is much like reading and writing lines of text in a text file using Python's "file object", though there are a few key differences. In ArcPy, instead of a file object, we link to a feature class, table, or raster attribute table using a cursor. These cursors come in three flavors:

Search Cursors are used simply to retrieve individual features;

Update Cursors are used to retrieve and edit properties of individual features;

Insert Cursors are used to append new records to a table and set their values.

When we create these cursors, we have to specify the data table and the fields in the table we want to retrieve. Optionally we can provide a "where clause" to filter the rows returned. The constraint with cursors, as is the case with file objects, is that data can only be accessed sequentially: if you want to edit the 100th record, you have to move past the first 99 to get to it. However, supplying the where clause can often be used to narrow in on just the rows you want.

Once a cursor is created, we can use a loop (for or while) to iterate through existing records in search and update cursors. Each iteration returns a "row", which is a sequence of field values (a tuple for search cursors, a list for update cursors) in the same order as the fields we specified when creating the cursor.

Here, we'll only examine the Search Cursor. The ArcGIS Pro online help provides good explanations and examples of how the other cursors work. Below is a script example that creates a search cursor for the climate.shp feature class in the SanDiego workspace. (Note that this feature class has two attributes, Climate_ID and Zone, which are referred to in the script below.)
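The original script is not reproduced here; as a stand-in, this is a minimal sketch using arcpy.da.SearchCursor with the two fields named above. zones_by_id and read_climate are hypothetical helpers, and the example where clause is an assumption.

```python
def zones_by_id(rows):
    """Build a {Climate_ID: Zone} dictionary from (Climate_ID, Zone) rows."""
    return {climate_id: zone for climate_id, zone in rows}


def read_climate(feature_class, where_clause=None):
    """Read Climate_ID and Zone for every row matching the where clause."""
    import arcpy  # needs ArcGIS Pro's Python environment

    fields = ["Climate_ID", "Zone"]  # values come back in this order
    # The with statement releases the cursor once the loop is done
    with arcpy.da.SearchCursor(feature_class, fields, where_clause) as cursor:
        return zones_by_id(cursor)


# Example call (where clause is an assumption):
# read_climate("climate.shp", '"Climate_ID" < 10')
```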

» Exercise 8: Working with Cursors

 

9. The Geometry object: Reading and Editing feature geometries. (link)

In the above example, we modified our code to extract Shape@area to return the area of our feature. The bit after the @ is termed the "token" and there are other "tokens" besides "area" that we can use to pull values from geometry fields (link). However, had we omitted the token and just specified Shape@, the value returned would be an ArcPy geometry, in this case a polygon object since our dataset is a polygon feature class.

This is useful because, once we have a specific geometry object, we can do various things to it programmatically...

» Exercise 9: Reading Geometries from a Feature Class

Point objects vs. Point/Multipoint/Polyline/Polygon Geometry objects

In the above example, the coordinates used to define myPoint are assumed to be in the same coordinate system as the feature. But what if they weren't? How would we specify the spatial reference of our point? The answer reveals a subtle difference between two ArcPy classes related to the Geometry class.

An ArcPy Point object is the most fundamental spatial element in ArcPy. It's defined by its X-Y (and sometimes Z or M) coordinate properties, but the Point object does not have any spatial reference property to define what coordinate system these values are in.

A Point Geometry object, however, does have a spatial reference property. And thus, if we wanted to query whether a point falls within a polygon and the point's coordinates are in a different coordinate system, we'd have to first construct a Point object and then construct a Point Geometry object from that point and define its coordinate system. (ESRI seems to go out of its way to make projections, etc. difficult!)

» Exercise 9b: Point Geometries

Let's look at an example: Use the following code to see if our myPoint, defined in geographic coordinates, falls within the feature.

Like Point Geometries, other geometry classes (Multipoint, Polylines, and Polygons) are also constructed of Point objects, or just raw coordinate pairs provided as arrays (i.e. Python lists). And spatial references can be assigned to these objects at the time they are constructed or later, via their spatialReference property.
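The exercise code is not shown above, so here is a sketch of the Point versus Point Geometry distinction. It assumes the geographic coordinates are in WGS 1984 (WKID 4326) and explicitly projects the point into the feature's coordinate system with projectAs before testing; build_point_geometry and point_in_feature are hypothetical helpers, and the coordinates in the final comment are made up.

```python
def build_point_geometry(x, y, wkid=4326, arcpy_module=None):
    """Wrap a Point in a PointGeometry that carries a spatial reference."""
    if arcpy_module is None:
        import arcpy as arcpy_module  # needs ArcGIS Pro's Python environment
    point = arcpy_module.Point(x, y)                   # no coordinate system
    spatial_ref = arcpy_module.SpatialReference(wkid)  # assumed: WGS 1984
    return arcpy_module.PointGeometry(point, spatial_ref)


def point_in_feature(x, y, feature, wkid=4326):
    """Test whether a geographic point falls within a feature geometry."""
    point_geometry = build_point_geometry(x, y, wkid)
    # Put both geometries in the same coordinate system before comparing
    projected = point_geometry.projectAs(feature.spatialReference)
    return projected.within(feature)


# Example call with made-up coordinates near San Diego:
# point_in_feature(-117.1, 32.8, feature)
```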

 

♦ Summary

Clearly, ArcPy does much, much more than what we covered here, but this should at least serve to orient you on how ArcPy is structured and where to go. Really, the key takeaways are:

From here, it's mostly a matter of getting more experience under your belt to become proficient in using ArcGIS in Python.