Geospatial Data Analytics - Course Synopsis

ENV 859 - Geospatial Data Analytics   |   Fall 2025   |   Instructor: John Fay  

Course theme

For many of you, Geospatial Data Analytics may be the last formal GIS class you take. Ever. Ideally then, by the end of this class, you’d have learned everything there is to know about GIS, but of course that won’t be the case. There’s just too much to know and the technology changes too quickly.

So instead, this course aims to prepare you for “life beyond the classroom”. In other words, I want you to leave this class (and NSOE) with enough know-how and confidence to confront any kind of geospatial challenge and make steadfast progress toward meeting that challenge.

To get there, we’ll cover a set of topics related to GIS, starting with some familiar geoprocessing in ArcGIS Pro and progressing into topics that may be completely new to you. Often, we’ll just barely introduce these topics before moving on to the next, but this is by design. These quick introductions will expose you to a new facet of GIS and give you enough of a footing in the topic to continue learning more on your own, if desired. In the end, you should discover that many complicated technologies that seem completely beyond your grasp often require just a bit of guided curiosity, patience, and determination to learn and use.

I also, however, want to instill in you the notion of geospatial analysis as a branch of the broader data analytics and artificial intelligence “revolution”. Over the past several years, the explosive growth of new, large datasets and of the computing power to handle them has expanded the very types of questions we can explore - hence the data analytics revolution. Through the topics we cover, I will emphasize how each has its role in broader data analytics, and in the end recap how the skills you’ve learned give you greater command to forge raw data into actionable intelligence and informed decisions.


Topic 1: Advanced Geoprocessing

Having explored ArcGIS Pro’s geoprocessing tools and workflows in previous GIS courses, we begin this class by taking a deeper dive into designing reproducible workflows using ArcGIS Pro’s ModelBuilder. Specifically, we explore how to set up a robust and distributable workspace, and how to create your own distributable geoprocessing tools: tools that allow users to specify inputs, iterate processes through a number of files or values, and execute processes only if specific conditions are met, all through a user-friendly, well-documented interface. Finally, we learn how to share these workflows with others in convenient, user-friendly packaging.

Topic 2: Python 101

Topic 1 exposes you to some powerful capabilities of ArcGIS Pro’s ModelBuilder, but also to some of its limitations. The Python scripting language lifts many of these limitations, but it requires an investment in learning the language. Here, we introduce Python, striving to develop a foundational understanding of the basic concepts of the language: data types, language structure, reading and writing files, iterating through lists, and conditional statements. We see how these basic concepts come together by writing and executing simple code blocks within Jupyter notebooks. Also, while learning the basics of Python, we pay attention to ways to continue your learning journey as efficiently as possible, pointing out useful coding resources and discussing judicious use of AI chatbots and coding assistants.
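
To give a flavor of how these basic concepts fit together, here is a minimal sketch. It is not actual course material: the site names, depth values, threshold, and file name are all made up for illustration.

```python
import os
import tempfile

# Hypothetical data (an assumption for illustration): site name -> depth in meters
site_depths = {"SiteA": 2.5, "SiteB": 11.0, "SiteC": 6.2}


def classify(depth_m, threshold_m=5.0):
    """Conditional statement: label a depth as 'deep' or 'shallow'."""
    if depth_m > threshold_m:
        return "deep"
    return "shallow"


# Writing a file: one comma-separated line per site
out_path = os.path.join(tempfile.gettempdir(), "site_depths_demo.txt")
with open(out_path, "w") as out_file:
    for name, depth in site_depths.items():
        out_file.write(f"{name},{depth}\n")

# Reading the file back, iterating through its lines, and building a list
labels = []
with open(out_path) as in_file:
    for line in in_file:
        name, depth_text = line.strip().split(",")
        labels.append((name, classify(float(depth_text))))

print(labels)
```

Each piece here (a dictionary, a function with an if statement, a for loop, and file reading and writing) is small on its own; the point is seeing them work together.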

Topic 3: Scripting with Python

Often overlooked when learning a coding language are the tools and techniques to write scripts with the language. In this section, we continue with more practice with the Python language, but now in the context of writing a sequence of Python commands to execute a defined and somewhat complex task, i.e. a Python script. We discuss how to approach a scripting task by developing “pseudo-code”, and then the often circuitous route of writing the code, while continuing to learn new things about Python. We see how [free] tools such as Visual Studio Code and Git/GitHub facilitate this process. This is all done in the context of a real-world example of processing sea-turtle tracking data, with acknowledgement of the importance of reproducibility and transparency in doing any kind of analysis.
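
As a rough illustration of moving from pseudo-code to a working script, consider the sketch below. The record layout (date, location-quality class, latitude, longitude), the sample values, and the function name parse_tracks are all assumptions made up for this example; they are not the actual tracking data format used in class.

```python
# Hypothetical tracking records (made up for illustration):
# date, location-quality class, latitude, longitude
raw_lines = [
    "date,quality,lat,lon",
    "2003-07-01,3,33.90,-78.10",
    "2003-07-02,B,33.95,-78.02",
    "2003-07-03,2,34.10,-77.85",
    "2003-07-04,1,34.30,-77.60",
]

# Pseudo-code, written before any real code:
#   1. Skip the header line
#   2. Split each remaining line into its fields
#   3. Keep only records whose quality class is acceptable
#   4. Store the kept locations in a dictionary keyed by date
#   5. Look up the location for a date of interest


def parse_tracks(lines, accepted=("1", "2", "3")):
    """Return {date: (lat, lon)} for records with an accepted quality class."""
    locations = {}
    for line in lines[1:]:                               # step 1
        date, quality, lat, lon = line.split(",")        # step 2
        if quality in accepted:                          # step 3
            locations[date] = (float(lat), float(lon))   # step 4
    return locations


locations = parse_tracks(raw_lines)
print(locations.get("2003-07-03"))                       # step 5
```

Keeping the pseudo-code as comments alongside the code is one simple way to make a script more transparent to others (and to your future self).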

Topic 4: GIS & Python

With our Python foundation beginning to take shape, we now see what exactly our investment in the language enables us to do. Here, we begin with a look at the vast number of Python packages accessible to us and how we tap into them. We then dive into the ArcPy Python package, which grants us access to everything ArcGIS Pro can do, and more, from within Python.

Here, we examine how to run ArcGIS Pro geoprocessing tools with Python, along with the various other functions, classes, and modules made available through ArcPy, doing so in a Jupyter notebook that mimics the geoprocessing workflow we developed in Topic 1. We also see how ArcPy allows us to automate repetitive and complex spatial analysis tasks when embedded in Python scripts.

Topic 5: GIS in the context of Data Science: “Spatial Data Science”

Data Science is a fast-emerging and “sexy” topic these days, but what is it? Here we’ll discuss what’s behind the big movement and the role GIS plays in it. We’ll examine the data science workflow of data engineering, data visualization/exploration, analysis/modeling/scripting, and sharing/collaboration. We’ll also discuss the importance of reproducibility and transparency. Additionally, we’ll examine key data structures used in data science, specifically the dataframe and its spatial counterpart, the spatially enabled dataframe, learning how these are constructed from various data sources and used in analyses. It’s here where we’ll take a deep dive into ESRI’s ArcGIS API for Python, a powerful new package that links GIS, data science, and our next topic - cloud-based GIS.

Time permitting, we’ll also examine the non-ESRI, open-source alternatives to include spatial analysis in data science tasks. These include GDAL, GeoPandas, Shapely, Fiona, OSM, and Folium. We may also explore technologies such as machine learning, artificial intelligence, and image processing from a spatial analysis perspective. We could also examine spatial analysis tools supported in R.

Additional Topics & Course Project

Towards the end of the semester, you will likely see that GIS + Python + Spatial Data Science can take you in a vast number of directions. Some of you may know exactly where you want to go next, and others will still want more guidance. This section is designed to cater to both crowds. Based on the demand of the class, I will organize a number of short demonstrations on topics identified by the class. I will also offer guidance on targeted research topics that students and teams of students want to pursue. Underlying all these efforts will be a focus not just on how to execute various tasks, but how to go about learning how to execute these tasks.

Potential topics are listed below:

Cloud-based GIS

The paradigm of computing is changing. Rather than downloading datasets to our local machine, we are accessing remote data services. And rather than crunching analyses on our local CPU, we are tapping into remote processing services. In this section, we explore the concept of client-server architecture as it applies to technologies such as ArcGIS Online. We reveal that the ArcGIS Python API is actually a wrapper for something that is far more powerful and likely to be the dominant platform for geospatial analysis in the not-so-distant future.

More specifically, we’ll design, execute, and share spatial analysis workflows using ArcGIS Online. We’ll examine other useful on-line tools ESRI provides: Story Maps, Dashboards, and Insights. Then we’ll peek “behind the curtain” of these technologies, into the application programming interfaces, or APIs, that drive them and how we can control these APIs using Python to do things like automate data download, perform spatial analysis, and develop analytical dashboards.
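
To hint at what controlling such an API from Python can look like, the sketch below only builds a query URL; it does not send any request. The layer URL is a placeholder, the function name build_query_url and the field names are made up, and the where, outFields, and f parameter names are used on the assumption that the service follows the ArcGIS REST API conventions for feature layer queries.

```python
from urllib.parse import urlencode


def build_query_url(layer_url, where="1=1", out_fields="*"):
    """Assemble a feature-layer query URL (no request is sent)."""
    params = {
        "where": where,           # attribute filter, SQL-like syntax
        "outFields": out_fields,  # comma-separated field names, or * for all
        "f": "json",              # requested response format
    }
    return f"{layer_url}/query?{urlencode(params)}"


# Placeholder URL (an assumption): substitute a real feature layer endpoint
layer_url = "https://example.com/arcgis/rest/services/Demo/FeatureServer/0"
url = build_query_url(layer_url, where="STATE = 'NC'", out_fields="NAME,POP")
print(url)
```

The point is that a request to a web service is, at bottom, a carefully assembled URL, which is exactly the kind of thing a script can generate over and over with different parameters.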

Data Engineering & Visualization in ArcGIS Pro

Vast amounts of new data are constantly being made available. These datasets, however, seldom follow any consistent format and come with varying levels of background randomness or “noise”. As such, the process of operationalizing and exploring data is necessarily hands-on, iterative, and non-systematic; the phrase “your mileage may vary” is quite appropriate here. Fortunately, the data engineering tools in ArcGIS Pro (as well as in other applications and coding languages) facilitate open-ended exploration both within and among feature attributes, which in turn reveals new questions you might want to ask of your data and new hypotheses to test via further analysis.

Here, we walk through one example of data engineering. Specifically, we look at EPA Air Quality Data for the United States. These data are not as messy as some other data you might encounter, yet you’ll see that there are still some key steps required to get these data into our analytical environment (ArcGIS Pro), and a broad swath of questions you can explore with these data.

Spatial Statistics

While we won’t have the time to go deep into the statistical concepts underlying these topics, we will look at the workflows within ArcGIS Pro that enable some quite advanced spatial statistical analyses. These include: cluster analysis (supervised and machine learning approaches), analyzing spatio-temporal data to develop space-time cubes for analyzing and predicting trends across space and time, and geospatial techniques for making predictions. These topics will lean heavily on resources ESRI has developed.

Google Earth Engine

GEE is a popular and powerful platform for analyzing spatial data in the cloud. The service leverages Google’s massive cloud computing technology and is offered freely to academic and non-profit groups. It is particularly adept at broad-scale analyses and requires minimal local resources. In addition to its powerful analytical abilities, GEE also provides instant access to a vast amount of remotely sensed and other geospatial data.

GEE is natively controlled via JavaScript commands. You can write and execute JavaScript code in its on-line interface, saving scripts to linked GitHub accounts. We, however, will leverage our existing knowledge of Python to run GEE via Jupyter notebooks using Qiusheng Wu’s wonderful and amazing geemap package. This package not only provides access to virtually all of GEE’s capability via Python commands, but it also includes excellent documentation, tutorials, and even recorded workshops.

GIS in R & R-Studio

The capacity for geospatial analysis in R and R-Studio is growing rapidly. Here we’ll dabble in the ‘sf’ and ‘rgdal’ packages. We may also [quickly] examine the significance of R-Markdown and R-Shiny in the scripting landscape.