Beyond ArcPy2 : Web Services, APIs, & “Cloud Based GIS”
[TOC]
Introduction
In the last session, we examined ways Python can access and organize raw data stored locally and from the web. Here, we look further into how we can leverage the vast number of internet based resources from the Python coding environment. First, however, we step back and review the basic mechanism that allows us to pull data from remote servers to our local machine, something we term ==web services==. Then we introduce the concept of the application programming interface, or **API**: what they are, how they are structured, and ways Python can interact with them. Then we re-examine GIS in the context of using these APIs, a combination often termed “web-based” or “cloud-based GIS”.
Lab Prep
For this exercise, you’ll need to fork and clone this GitHub repository: https://github.com/ENV859/UsingAPIs
- Navigate to the URL above
- Log in to GitHub with your username and password
- Fork the repository
- Clone your forked repository to your desktop/virtual machine (using GitHub Desktop)
♦ Web services and “REST”
Web services
Web services are the engines from which we get “answers” from the internet. An example we’ve all used is a Google search: you ask Google to find web pages related to a search term and voilà! You get a list of useful links.
» Try this: http://lmgtfy.com/?q=nicholas+school+of+the+environment
What is actually going on is that we, the client, are sending a request to a remote service provider (a server or bank of servers). These servers process this request and send back a response. This all happens fairly seamlessly in our web browsers, because browsers are designed to do precisely that: formulate a request, send it off to the correct service provider, wait for a response, and handle that response in a useful manner.
This all works because clients and servers adhere to a set of protocols (“HTTP” stands for “hypertext transfer protocol”). How these protocols work is a bit beyond what we actually need to know, but it’s important to understand that the interaction between client and server is like a language both sides understand. Once we understand some rules of this language, we can send requests to a server without using a browser (e.g. with Python), which is quite powerful!
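To make the client side of this exchange concrete, here is a minimal sketch using only Python's standard library. It builds a request object for the same Google search address used in this section and inspects it without sending anything. The `send` helper and the `env859-demo` User-Agent label are illustrative names, not part of the course materials, and a server may refuse requests that do not come from a browser.

```python
from urllib.request import Request, urlopen

# The same kind of address a browser would request.
url = "https://www.google.com/search?q=nicholas+school+of+the+environment"

# A Request object describes what the client will send; nothing is sent yet.
# The User-Agent value is a made-up label identifying this script.
request = Request(url, headers={"User-Agent": "env859-demo"})

print(request.get_method())  # the HTTP verb used when no body is attached
print(request.host)          # the server the request is addressed to


def send(req, timeout=10):
    """Send the request and return the response body as text."""
    with urlopen(req, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Uncomment to actually contact the server (requires an internet connection):
# html = send(request)
# print(html[:200])
```

Keeping the network call inside a function that is not run by default lets you look at the request before deciding to send it.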
“Representational State Transfer” or “REST”
Really, all we need to know about REST here is that it’s a very useful format for sending requests and receiving responses. With “REST-based” (or “RESTful”) web services (as opposed to SOAP-based web services, which we’ll just ignore here), communication between client and server is entirely text-based (vs. sending specifically formatted programming objects). There’s obviously more to it than that, but for now content yourself that our discussion from here pertains to REST-based web services, of which there are a LOT of useful ones.
» **Exercise: Google Search as a REST based search**
- Above, you used the “Let me Google that for you” site to mimic how you’d search for “Nicholas School of the Environment”. If you haven’t already, click the `Google Search` button to complete the search.
- Once the search result appears, note the URL (i.e., the web address): https://www.google.com/search?q=nicholas+school+of+the+environment
- In the URL, change `nicholas+school+of+the+environment` to `Duke University Marine Lab` and hit `Enter`.
- Check out the web page that comes up…
Perhaps this result is not too extraordinary, but what you just did by altering the URL was modify the REST command sent to the Google Search server. Yep, that’s kind of it. In web browsers, the URL is where you send a REST request, which is a text string. This request has specific components:
- `https://www.google.com` is the address of the server to which you are sending the request,
- `search?` is the name of the service you are using, and
- `Duke University Marine Lab` is an argument sent to the service.
And yes, the response is also text, even though it has nice formatting in our web page. All web pages are text (some text invokes fancy JavaScript code, but the page is still text). To see that text, right click on your page and hit “View Page Source” (or hit `ctrl-u`).
So, in a nutshell, REST is the mechanism of sending a text-based request to a server, and receiving the [usually] text-based response.
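The components of the Google request can also be pulled apart and reassembled in Python. The sketch below uses the standard library's `urllib.parse` to split the URL from the exercise into server, service, and arguments, then swaps in the new search term. The variable names are only illustrative.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qs, urlencode

url = "https://www.google.com/search?q=nicholas+school+of+the+environment"

# Split the request into its parts.
parts = urlsplit(url)
server = f"{parts.scheme}://{parts.netloc}"  # where the request is sent
service = parts.path                         # the service being used
arguments = parse_qs(parts.query)            # arguments after the "?"

print(server)     # https://www.google.com
print(service)    # /search
print(arguments)  # {'q': ['nicholas school of the environment']}

# Build a new request by replacing the search term.
# urlencode turns spaces into "+" so the text is safe inside a URL.
new_query = urlencode({"q": "Duke University Marine Lab"})
new_url = urlunsplit((parts.scheme, parts.netloc, parts.path, new_query, ""))
print(new_url)    # https://www.google.com/search?q=Duke+University+Marine+Lab
```

Note that the browser quietly did this encoding for you when you typed spaces into the address bar.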
Our Google search is quite simple. Let’s examine a more complex one involving National Water Information System (NWIS) data.
» Exercise: A more complex REST request
- Go to the NWIS web site that we used to download data for one of our first tutorials: http://waterdata.usgs.gov/nwis
- Click on the Current Conditions button, then the Build Current Conditions Table link.
- Next, check Site Number and click Submit
- In the next page, enter the site number `02085070`, accept all the other defaults, and click the Submit button at the very bottom of the page.
- In the next page, click the link in the Site Number column of the table.
- In the next page, change ‘Output Format’ to Tab-Separated and ‘Days’ to 1. Clear out the dates in the “Begin date” and “End date” boxes, then hit `GO`.
The result should be a text screen in your browser listing all data collected in the last 24 hours for site 02085070 - Eno River Near Durham. More importantly, take a look at the URL for the current page:
Edit this URL in your web browser, changing the `site_no` from `02085070` to `02087183`. You’ll see that the page now lists data for site 02087183 - Neuse River near Falls Lake, NC.
In other words, this URL is essentially a command string sent to the NWIS server with arguments embedded within it!
Generally speaking, arguments in a web service URL follow the `?` and are separated by `&`. We can then parse this URL into the parts listed below. We can deduce what they might mean through some keen observation (e.g. look at the web page used to generate the URL), and perhaps through experimentation (i.e. change the values and see what happens).
component | meaning |
---|---|
http://waterdata.usgs.gov/nwis | the web-service provider |
uv | the service provided |
cb_00060=on | include discharge data in the output |
cb_00065=on | include gage height data in the output |
format=rdb | return the output as tab-separated text |
period=1 | days to include in the output table |
site_no=02085070 | the site number |
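Since the request is just a base address plus `key=value` arguments, it can be assembled in Python from the components in the table. This is a sketch under the assumption that those components are sufficient; the URL your browser shows may carry additional parameters. The function name `build_nwis_url` is hypothetical.

```python
from urllib.parse import urlencode

# Provider and service, taken from the table above.
NWIS_UV = "http://waterdata.usgs.gov/nwis/uv"


def build_nwis_url(site_no, period=1):
    """Assemble an NWIS request from the components listed in the table."""
    params = {
        "cb_00060": "on",    # include discharge
        "cb_00065": "on",    # include gage height
        "format": "rdb",     # tab-separated output
        "period": period,    # days of data to include
        "site_no": site_no,  # the gage site number (kept as text: leading zero)
    }
    return f"{NWIS_UV}?{urlencode(params)}"


# The same service, two different sites: only one argument changes.
for site in ("02085070", "02087183"):
    print(build_nwis_url(site))
```

Keeping the site number as a string matters here, because converting it to an integer would drop the leading zero.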
♦ Application Programming Interfaces, or APIs
APIs are really just formalized web services. They typically use the REST interface, just like in the NWIS request example above, but they often come with more documentation, making them easier to use - though some are much easier to use than others. APIs are everywhere and are quite useful for pulling specific data from repositories – and much much more.
Let’s look at another example. This one works much the same as the NWIS one above, but this one appears much more formal, more designed to be used as a web service.
» Exercise: The BISON API:
Another, more documented example of this can be seen in USGS web services hosted at the USGS’ Biodiversity Information Serving Our Nation (BISON), which provides its own API: https://bison.usgs.gov/doc/api.jsp
The BISON API provides a number of services and documents their parameters.
- Copy and paste their example URL in a browser:
http://bison.usgs.gov/api/search.json?species=Bison%20bison&type=scientific_name&start=0&count=1
You see the result is in JSON format (which we’ll discuss later), and you can probably guess how you might edit this URL to return results for a different species.
- Modify the URL to show data for “`Cryptobranchus alleganiensis`” (aka the “hellbender”).
The REST API section of the web site documents various parameters you can invoke in the URL to filter the records returned. These include a filter for the county FIPS code. We can use this to return a set of records for a specific county, e.g. Durham (FIPS 37063).
- Try this URL, which returns the first 100 records in Durham County.
https://bison.usgs.gov/api/search.json?count=100&countyFips=37063
With some editing you could shape this data into a table which you could import into GIS…
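The BISON requests above can be generated from Python rather than edited by hand. The sketch below rebuilds the example URLs from their parameters and includes a generic helper for reading a JSON response. `build_bison_url` and `fetch_json` are illustrative names, the structure of the returned JSON is not assumed here, and the service must be reachable for the fetch to succeed.

```python
import json
from urllib.parse import urlencode, quote
from urllib.request import urlopen

BISON_SEARCH = "https://bison.usgs.gov/api/search.json"


def build_bison_url(**params):
    """Build a BISON search URL; quote_via=quote encodes spaces as %20."""
    return f"{BISON_SEARCH}?{urlencode(params, quote_via=quote)}"


def fetch_json(url, timeout=30):
    """Request a URL and parse the JSON response into Python objects."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# The documented example, the hellbender variant, and the Durham County filter.
bison = build_bison_url(species="Bison bison", type="scientific_name",
                        start=0, count=1)
hellbender = build_bison_url(species="Cryptobranchus alleganiensis",
                             type="scientific_name", start=0, count=1)
durham = build_bison_url(count=100, countyFips=37063)

for url in (bison, hellbender, durham):
    print(url)

# Uncomment to send a request (requires an internet connection):
# records = fetch_json(durham)
# print(type(records))
```

Once parsed, the response is ordinary Python dictionaries and lists, which is what makes reshaping it into a table feasible.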
→ [See the `2a-Exploring-the-BISON-API.ipynb` notebook in your `UsingAPIs` repository...]
Other example APIs
The pool of APIs is vast and growing. Some make a big effort at publicizing themselves, while others kind of just “exist”. Below are a few more examples of well documented data APIs:
- https://www.waterqualitydata.us/portal/ - They have full documentation and a nice link at the bottom that shows how selections in the graphical interface appear in the RESTful URL request.
- https://www.census.gov/data/developers/data-sets.html - A number of specific API endpoints, each with examples and documentation.
  → [See the `2b-Census.ipynb` notebook in your `UsingAPIs` repository...]
- http://data.neonscience.org/data-api - A bit more challenging to use, but lots of interesting biological data.
- https://regulationsgov.github.io/developers - For those policy wonks out there…
♦ ESRI REST-based web services
ESRI has certainly recognized the utility of the API framework, from both the client and server ends. On the client side, ArcGIS Pro is a redesign of ArcGIS desktop with much more internet integration. ESRI also provides the nascent ArcGIS Python API, which we’ll examine shortly, as a scripting platform for accessing internet based resources. On the server side, ESRI’s ArcGIS Server and ArcGIS Online platforms facilitate sharing of data and geoprocessing services that are accessible via fairly well documented REST-based APIs.
Let’s now take a look at ESRI web services and how we can tap into them using Python. First, when searching for some on-line data or at some other point surfing the web you may have come across a web site that looks like this one:
http://sampleserver1.arcgisonline.com/ArcGIS/rest/services.
More and more of these sites are popping up. Here are just a few:
- http://services.nationalmap.gov/ArcGIS/rest/services
- https://hydro.nationalmap.gov/arcgis/rest/services/nhd/MapServer
- https://gis.ngdc.noaa.gov/arcgis/rest/services
- http://tigerweb.geo.census.gov/arcgis/rest/services
- http://gis.ncdcr.gov/ArcGIS/rest/services
What are all these sites?? Well, they are the endpoints for a vast number of spatially enabled web services hosted using ESRI’s ArcGIS Server or ArcGIS Online technologies, and we can access these services using the REST interface.
■ In fact, a Google search of “`inurl:arcgis/rest/services`”, perhaps followed by an organization name or keyword, is a great way to search for data!
Exploring ESRI based web services
→ [See the `3-ArcGIS-REST_Service-Demo.ipynb` notebook in your `UsingAPIs` repository...]
Let’s take a look at an example using a service I’ve created on a server hosted here in the Nicholas School:
https://ns-win2012test.win.duke.edu/arcgis/rest/services
The services are organized as a series of folders. Click on the ENV859 folder and you’ll see a few services available: 1 map server and 2 geoprocessing servers. The map server serves data and the GPServers serve geoprocessing functionality.
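The same folder listing can be read by a script. ArcGIS REST services directories return a machine-readable listing when `f=json` is added to the URL, with `folders` and `services` entries. The sketch below assumes that response layout; the helper names are hypothetical, and the sample dictionary is a hand-written stand-in containing only the Discharge map service mentioned in this section, not a real server reply.

```python
import json
from urllib.request import urlopen

CATALOG = "https://ns-win2012test.win.duke.edu/arcgis/rest/services"


def catalog_url(folder=None):
    """URL for the services directory (or one folder in it) in JSON form."""
    base = f"{CATALOG}/{folder}" if folder else CATALOG
    return f"{base}?f=json"


def list_services(catalog):
    """Return (name, type) pairs from a parsed services-directory response."""
    return [(s["name"], s["type"]) for s in catalog.get("services", [])]


def fetch_catalog(folder=None, timeout=30):
    """Download and parse the directory listing (needs network access)."""
    with urlopen(catalog_url(folder), timeout=timeout) as response:
        return json.load(response)


# Hand-written stand-in for a response, for illustration only.
sample = {
    "folders": [],
    "services": [{"name": "ENV859/Discharge", "type": "MapServer"}],
}
print(catalog_url("ENV859"))
print(list_services(sample))

# Uncomment to query the server itself:
# print(list_services(fetch_catalog("ENV859")))
```

The service type in each pair tells you whether an entry serves data (`MapServer`) or geoprocessing (`GPServer`).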
Click on the Discharge map server and you’ll see properties for this service. What I like to do first when investigating a map service is to look at the data. The fastest way to see the data is to click on the “View In: ArcGIS JavaScript” link. This opens a new window displaying the data using ArcGIS’s JavaScript API – an alternative to the Google Maps API (more info on that later).
This map service only serves one data layer: North Carolina HUCs, but it could provide many (e.g: https://tigerweb.geo.census.gov/arcgis/rest/services/TIGERweb/tigerWMS_Census2010/MapServer).
At the bottom of the map service page is a list of the supported operations on the service. Click the Export Map link. This is a (slightly clumsy) interface to build a custom request on these features. If you enter values into the blanks then hit “Export Map Image (GET)” it will send the request to the server using those parameters. It will also create the URL used to format that request.
Try filling out the form like this (below) and then click “Export Map Image (GET)” at the very bottom.
(the coordinates are `-79.3, 35, -77.9, 36.5`)
You’ll see an image zoomed to the Upper Neuse HUC. Also, the URL of the resulting page contains the REST-format request, which can be broken down like this:
https://ns-win2012test.win.duke.edu/
arcgis/rest/services/ENV859/Discharge/MapServer/export
?bbox=-79.3%2C+35%2C+-77.9%2C+36.5
&bboxSR=4269
&layers=
&layerDefs=
&size=
&imageSR=
&format=png
&transparent=false
&dpi=
&time=
&layerTimeOptions=
&dynamicLayers=
&gdbVersion=
&mapScale=
&f=html
This URL is what we can manipulate to modify our request programmatically. To see what each of these parameters does, we consult the documentation on the ArcGIS Map Server REST API for the Export Map operation:
https://developers.arcgis.com/rest/services-reference/export-map.htm
- Try tweaking the URL so the output is in an image format (`&f=image` at the very end of the URL). This produces a PNG image of the layer clipped to the bounds supplied in the bounding box.
- Lastly, for Layer Definitions enter: `0:HUC_NAME = 'Upper Neuse'`. This instructs the server to, for the first layer (at `index = 0`), select features where `HUC_NAME = 'Upper Neuse'` and only generate output for those features.
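The Export Map request from this walkthrough can be assembled in Python instead of through the web form. This is a sketch that sends only the parameters filled in above and assumes the server applies its defaults to the ones left out; `build_export_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

EXPORT = ("https://ns-win2012test.win.duke.edu/"
          "arcgis/rest/services/ENV859/Discharge/MapServer/export")


def build_export_url(bbox, layer_defs=None, image_format="png", output="image"):
    """Build an Export Map request for the Discharge map service."""
    params = {
        "bbox": ", ".join(str(v) for v in bbox),  # xmin, ymin, xmax, ymax
        "bboxSR": 4269,                           # spatial reference of the bbox
        "format": image_format,
        "transparent": "false",
        "f": output,                              # "image" returns the picture itself
    }
    if layer_defs:
        params["layerDefs"] = layer_defs          # per-layer attribute filter
    return f"{EXPORT}?{urlencode(params)}"


url = build_export_url(
    bbox=(-79.3, 35, -77.9, 36.5),
    layer_defs="0:HUC_NAME = 'Upper Neuse'",
)
print(url)

# Decoding the query shows the arguments exactly as they were entered.
print(parse_qs(urlsplit(url).query))
```

Because `urlencode` handles the commas, spaces, and quotes, the bounding box comes out in the same encoded form seen in the browser's URL.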
In short, we have quite a powerful interface to spatial data. Time permitting, we will examine some interesting uses for these REST based services.
♦ REST based APIs that process data
Web services and APIs can process data as well as return data. A good example of this is a geocoding service: you provide an address, and the server returns the coordinates of that address. OpenStreetMap provides one such geocoding web service, and to capitalize on it, you again simply need to figure out how to construct the request and handle the response. The documentation for this API is provided here: http://wiki.openstreetmap.org/wiki/Nominatim
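As a rough illustration of constructing the request and handling the response, the sketch below targets Nominatim's `search` endpoint with its `q`, `format`, and `limit` parameters. The helper names and the `env859-demo` User-Agent are made up for this example, and the sample response holds invented placeholder values, not real geocoding output. Check the Nominatim usage policy before sending requests in bulk.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_geocode_url(address, limit=1):
    """Construct the request: the address goes in the 'q' argument."""
    params = {"q": address, "format": "json", "limit": limit}
    return f"{NOMINATIM_SEARCH}?{urlencode(params)}"


def first_coordinates(results):
    """Handle the response: return (lat, lon) of the first match, or None."""
    if not results:
        return None
    # Coordinates arrive as text, so convert them to numbers.
    return float(results[0]["lat"]), float(results[0]["lon"])


def geocode(address, user_agent="env859-demo"):
    """Send the request and return coordinates (needs network access)."""
    request = Request(build_geocode_url(address),
                      headers={"User-Agent": user_agent})
    with urlopen(request, timeout=30) as response:
        return first_coordinates(json.load(response))


# Invented placeholder response, for illustration only.
sample = [{"lat": "36.0", "lon": "-78.9", "display_name": "made-up place"}]
print(build_geocode_url("Durham, NC"))
print(first_coordinates(sample))

# Uncomment to geocode a real address:
# print(geocode("Durham, NC"))
```

Separating URL construction from response handling makes each half easy to check without contacting the server.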
Recap & what’s next
Knowing Python has opened many doors for accessing and analyzing data. The built-in Python classes (numbers, strings, lists, dictionaries, etc.) and functions (loops, flow control, etc.) provide ample flexibility to get things done with scripts.
Adding 3rd party packages to our base installation can vastly simplify and expand what we can do with our Python foundation, and beyond that, we see that there’s a world of web services and APIs that extends our abilities much, much further!
Up next, we’ll concentrate on what to do with all these data: how to manage large datasets in Python (with NumPy, Pandas, and GeoPandas), and also how to communicate these data using more Python packages (matplotlib, seaborn, and ggplot) as well as more APIs (leaflet/folium, Google Maps). And finally we’ll examine frameworks for putting all these components together to assemble some handy interactive apps with plotly, dash, and r-shiny!