Python 101 - Python Data Structures

ENV 859 - Geospatial Data Analytics   |   Fall 2023   |   Instructor: John Fay  

Introduction

Data structures are collections of objects organized in a way that allows us to add, track, update, remove, and extract specific values in these collections. Basic Python offers four types of data structures: lists, tuples, dictionaries, and sets*, each with distinct uses and syntax. We describe these in this lesson. Additionally, we reexamine string objects, this time as a data structure in their own right.

* Python supports additional data structures (e.g. "arrays", "data frames", "panels"), but those come from Python add-ons and are not "built-in" data structures. We'll discuss these objects a bit later.

The keys to mastering the use of the various types of data structures include:

  • Determining unique features, properties, and methods associated with each data structure object;
  • Determining the syntax for creating each type of data structure object;
  • Understanding how specific elements are identified, added, removed, etc. from the data structure.

As in our previous session, we again lean on VanderPlas’ Whirlwind Tour of Python materials for detailed descriptions of these objects and examples of their use. As before, I provide a bulleted list of the essentials, and we will review all material in class, with recorded videos provided.

The materials and exercises required for this lesson are included in the zip files downloaded in the previous lesson.

Topics and Learning Objectives

Topic: On completion, you should be able to…

1. Lists
  • Store and access objects in a Python list
  • Describe the importance of list objects being ordered & mutable
  • Add single items or multiple items to an existing list using list methods
  • Modify list contents using list arithmetic
  • Sort items in a list in ascending or descending order
  • Describe what indices are in the context of lists & how they are used
  • Extract single elements & slices of elements from a list

2. Tuples
  • Store and access objects in a Python tuple
  • Describe what “immutability” is and how it relates to Python tuples
  • Describe cases where tuples might be more appropriate than lists

3. Dictionaries
  • Store and access objects in a Python dictionary
  • Explain the concept and advantages of key:value pairs with respect to dictionaries
  • Add and update values stored in a dictionary
  • List the keys, values, and items associated with a dictionary

4. Sets
  • Store and access objects in a Python set
  • Convert a list or tuple to a set object
  • Use set commands to compare collections

5. More on Strings
  • Create a multi-line string
  • Use Python’s escaping character to enable special behavior in strings
  • Print formatted values in the context of other strings
  • Use list methods on string objects
  • Reverse the order of a string
  • Determine whether a substring occurs within a string

1. Lists

Reading: WToP: 06-Built-in-Data-Structures.ipynb

Key concepts:

  • Lists are ordered and mutable collections
    • Resemble a vector
    • Items in a list can be of different data types
  • Created using square brackets: myList = [1,2,"Apple"]
  • List arithmetic: you can add two lists together (+) and multiply a list by an integer (*), but not subtract or divide
  • List indexing and slicing retrieve list items; indices are zero-based, so the first item is myList[0]
  • Functions to manipulate lists: use tab-complete or help
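
The list concepts above can be sketched in a few lines. This is a minimal illustration, not the course's exercise solution; the variable names other than myList are made up for the example.

```python
# A list is ordered and mutable, and its items may be of different types.
myList = [1, 2, "Apple"]

# Indexing is zero-based; negative indices count from the end.
first_item = myList[0]        # 1
last_item = myList[-1]        # "Apple"

# Slicing returns a new list; the stop index is excluded.
first_two = myList[0:2]       # [1, 2]

# List arithmetic: + concatenates, * repeats (the multiplier must be an integer).
combined = myList + [3.5, True]     # [1, 2, "Apple", 3.5, True]
repeated = [0, 1] * 3               # [0, 1, 0, 1, 0, 1]

# append() adds a single item; extend() adds each item of another collection.
numbers = [5, 3, 8]
numbers.append(1)             # [5, 3, 8, 1]
numbers.extend([9, 2])        # [5, 3, 8, 1, 9, 2]

# sort() reorders the list in place (ascending by default).
numbers.sort()                # [1, 2, 3, 5, 8, 9]

# sorted() returns a new list; reverse=True gives descending order.
descending = sorted(numbers, reverse=True)   # [9, 8, 5, 3, 2, 1]

print(numbers)
print(descending)
```

Note that sort() changes the list itself and returns None, while sorted() leaves the original list untouched and hands back a new one.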

Exercises:

  • 1a thru 1i in PythonExercises2.ipynb

2. Tuples

Reading: WToP: 06-Built-in-Data-Structures.ipynb

Key concepts:

  • Like a list, but immutable: Cannot add, remove, or rearrange items.
  • Why and when to use tuples over lists.
  • Created using parentheses, or not: myTuple = (1, 2, "Apple") or myTuple = 1,2,"Apple"
  • “Modify” by creating a new tuple: myTuple += (4, 5, True)
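
A short sketch of how immutability plays out in practice. The coordinate pair and site name at the end are hypothetical values, included only to illustrate one case where a tuple fits better than a list.

```python
# Parentheses are optional when creating a tuple.
myTuple = (1, 2, "Apple")
sameTuple = 1, 2, "Apple"
print(myTuple == sameTuple)     # True

# Items are read by index, just as with lists.
print(myTuple[2])               # Apple

# Assigning to an item raises a TypeError because tuples are immutable.
try:
    myTuple[0] = 99
    changed = True
except TypeError:
    changed = False
print(changed)                  # False

# += builds a NEW tuple and rebinds the name; the original object is untouched.
original = myTuple
myTuple += (4, 5, True)
print(myTuple)                  # (1, 2, 'Apple', 4, 5, True)
print(original)                 # (1, 2, 'Apple')

# Tuples of immutable items are hashable, so they can serve as dictionary keys.
# (Hypothetical example: a latitude/longitude pair mapped to a site name.)
siteNames = {(35.99, -78.90): "Site A"}
print(siteNames[(35.99, -78.90)])   # Site A
```

A list could not be used as a dictionary key here, which is one reason to reach for a tuple when a group of values should stay fixed.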

Exercises:

  • 2a thru 2d in PythonExercises2.ipynb

3. Dictionaries

Reading: WToP: 06-Built-in-Data-Structures.ipynb

Key concepts:

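  • A collection of objects, like a list, but items are referred to by a ‘key’, not a positional index (since Python 3.7 dictionaries remember insertion order, but items still cannot be retrieved by position)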
  • Created using curly braces with key/value pairs: playerCount = {'Volleyball': 6, 'Baseball': 9}
  • An item is retrieved by its key: x = playerCount['Volleyball']
  • Items can be updated: playerCount['Volleyball'] = 2
  • New items can be added: playerCount['Soccer'] = 11
  • Dictionaries have functions…
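
As a minimal sketch of the dictionary operations listed above, using the same playerCount example (the 'Cricket' lookup is a made-up key used to show what happens when a key is missing):

```python
# Create a dictionary from key:value pairs.
playerCount = {'Volleyball': 6, 'Baseball': 9}

# Retrieve a value by its key.
x = playerCount['Volleyball']       # 6

# Update an existing value, then add a new key:value pair.
playerCount['Volleyball'] = 2
playerCount['Soccer'] = 11

# keys(), values(), and items() list the contents of the dictionary.
sports = list(playerCount.keys())       # ['Volleyball', 'Baseball', 'Soccer']
counts = list(playerCount.values())     # [2, 9, 11]
pairs = list(playerCount.items())       # [('Volleyball', 2), ('Baseball', 9), ('Soccer', 11)]

# Asking for a missing key with [] raises a KeyError;
# get() returns a default value instead.
missing = playerCount.get('Cricket', 0)     # 0

# The "in" operator tests whether a key exists.
hasSoccer = 'Soccer' in playerCount         # True

print(sports, counts, missing, hasSoccer)
```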

Exercises:

  • 3a thru 3g in PythonExercises2.ipynb

4. Sets

Reading: WToP: 06-Built-in-Data-Structures.ipynb

Key concepts:

  • Collection of unordered, unique objects
  • Created using curly braces: primes = {2, 3, 5, 7}
  • Can perform set functions: union, intersection, difference, symmetric difference…
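
The set operations named above can be sketched as follows. The odds set and the rawValues list are illustrative values, not part of the course materials.

```python
# Sets hold unique items; duplicates are dropped on creation.
primes = {2, 3, 5, 7}
odds = {1, 3, 5, 7, 9}

# Converting a list (or tuple) to a set removes repeated values.
rawValues = [1, 2, 2, 3, 3, 3]
uniqueValues = set(rawValues)               # {1, 2, 3}

# Each operation has a method form and an operator form.
union = primes | odds                       # same as primes.union(odds)
intersection = primes & odds                # same as primes.intersection(odds)
difference = primes - odds                  # same as primes.difference(odds)
symDifference = primes ^ odds               # same as primes.symmetric_difference(odds)

# Sets are unordered, so sort them before printing for a predictable display.
print(sorted(union))            # [1, 2, 3, 5, 7, 9]
print(sorted(intersection))     # [3, 5, 7]
print(sorted(difference))       # [2]
print(sorted(symDifference))    # [1, 2, 9]
```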

Exercises:

  • 4a thru 4f in PythonExercises2.ipynb

5. More on Strings

Reading: WToP: 14-Strings-and-Regular-Expressions.ipynb

Key Concepts:

  • Single or double quotes can be used to define a string, as long as they match:

    • myName = 'John'
    • myName = "John"
  • Triplets of single or double quotes (''' or """) around a string allow us to define multi-line strings:

    • myHaiku = '''
      to code in Python
      is not very difficult
      in fact, it is fun
      '''
      
  • The backslash \ allows special characters

    • We can include a quote in a string by preceding it with a backslash: "\""
    • Including \n in a string inserts a new line; adding a \t inserts a tab
  • Preceding a string with the letter r tells Python to treat the string as a “raw” string, meaning backslashes lose their magical powers: r"V:\Python\MyProject.ipynb"

  • Strings have many functions to add and remove spaces to your string…

  • .find() and .index() are useful functions to locate substrings within strings.

  • Formatting: Python has some nice tricks to insert variables/values into strings

    • %s in a string will be substituted with a string:

      name = "Elizabeth"
      print("Queen %s is England's monarch" %name)
      
    • %.3f in a string will be substituted with a floating point object with 3 decimals

      pi = 3.141592
      print("pi = %.3f" %pi)
      
    • Alternatively, the format() method works similarly to %s or %f substitution

      name = "Elizabeth"
      print("Queen {0} is {1}'s monarch".format(name,"England"))
      
      pi = 3.141592
      print("pi = {0:.3f}".format(pi))
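
The remaining string concepts (escaping, raw strings, trimming spaces, locating substrings, and treating a string like a list of characters) can be sketched as below. The sample strings are illustrative; only the file path comes from the notes above.

```python
# Escape characters: \" embeds a quote, \n starts a new line, \t inserts a tab.
quote = "She said \"hi\""
twoLines = "line one\nline two"
print(quote)                    # She said "hi"
print(twoLines)                 # printed on two separate lines

# A raw string keeps its backslashes literally (handy for Windows paths).
path = r"V:\Python\MyProject.ipynb"
print(path)                     # V:\Python\MyProject.ipynb

# strip(), lstrip(), and rstrip() remove surrounding whitespace.
padded = "  Apple  "
trimmed = padded.strip()        # "Apple"

# find() returns the position of a substring, or -1 if it is absent;
# index() does the same but raises a ValueError when the substring is absent.
text = "to code in Python"
position = text.find("Python")      # 11
notFound = text.find("Java")        # -1
try:
    text.index("Java")
    raisedError = False
except ValueError:
    raisedError = True

# The "in" operator tests whether a substring occurs within a string.
hasCode = "code" in text            # True

# Strings support list-style indexing and slicing; a step of -1 reverses them.
firstWord = text[0:2]               # "to"
lastWord = text[-6:]                # "Python"
backwards = text[::-1]              # "nohtyP ni edoc ot"
print(backwards)
```

Unlike lists, strings are immutable, so slicing and methods such as strip() always return a new string rather than changing the original.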
      

Exercises:

  • 3.6.1 thru 3.6.4 in PythonExercises3.ipynb

♦ Recap & What’s next?

Scalar variables (integers, floats, strings) and data structures (lists, tuples, dictionaries, and sets) are the elemental building blocks of Python code.

Next up are the programming constructs that enable us to develop workflows with these building blocks. These include:

  • “For” and “While” loops that allow us to iterate through items in a collection.
  • “If/else” logic that allows us to selectively execute code based on some condition.

We will also take a deeper dive into strings, as they are prevalent in coding various tasks, and, interestingly, quite important when we start doing GIS in Python.

Also, in assembling these workflows, we are going to move beyond Jupyter notebooks, which are great for tinkering with code snippets, and start working in an integrated development environment, or IDE, which gives us more control when stepping through code and debugging it.