In [1]:
#Import pandas
import pandas as pd
In [2]:
#Initial import of surveys dataset
survey_df = pd.read_csv('../data/surveys.csv',
                        index_col='record_id',
                        dtype={'plot_id':'str'})
survey_df.head()
Out[2]:
month day year plot_id species_id sex hindfoot_length weight
record_id
1 7 16 1977 2 NL M 32.0 NaN
2 7 16 1977 3 NL M 33.0 NaN
3 7 16 1977 2 DM F 37.0 NaN
4 7 16 1977 7 DM M 36.0 NaN
5 7 16 1977 3 DM M 35.0 NaN

1. Calculating and updating fields

A typical operation in dataframes is creating new fields from values in other fields. We'll demonstrate this by converting hindfoot_length (currently in mm) to inches (25.4 mm per inch).

In [3]:
#Create a field of hindfoot length in inches
survey_df['hindfoot_length_in'] = survey_df['hindfoot_length'] / 25.4

#Review the output
survey_df.head()
Out[3]:
month day year plot_id species_id sex hindfoot_length weight hindfoot_length_in
record_id
1 7 16 1977 2 NL M 32.0 NaN 1.259843
2 7 16 1977 3 NL M 33.0 NaN 1.299213
3 7 16 1977 2 DM F 37.0 NaN 1.456693
4 7 16 1977 7 DM M 36.0 NaN 1.417323
5 7 16 1977 3 DM M 35.0 NaN 1.377953

► YOU TRY IT:

  • Create a new column "weight_oz" converting weight (grams) into ounces (28.35 g/oz). One possible solution appears after the output below.
In [4]:
#Create a field of weight in ounces


#Review the output
survey_df.tail()
Out[4]:
month day year plot_id species_id sex hindfoot_length weight hindfoot_length_in
record_id
35545 12 31 2002 15 AH NaN NaN NaN NaN
35546 12 31 2002 15 AH NaN NaN NaN NaN
35547 12 31 2002 10 RM F 15.0 14.0 0.590551
35548 12 31 2002 7 DO M 36.0 51.0 1.417323
35549 12 31 2002 5 NaN NaN NaN NaN NaN
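
One possible solution, mirroring the hindfoot conversion above (28.35 g per ounce):

#Create a field of weight in ounces
survey_df['weight_oz'] = survey_df['weight'] / 28.35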

2. Selecting data from a dataframe

We often want to isolate specific rows and/or columns of data from our dataframe for further analysis. Pandas offers many ways to do this. Here we'll review these techniques.

2a. Selecting single rows, row sets, or row slices using iloc

Recall that dataframes can be considered lists (rows) of lists (columns). As such, we can select specific rows by their integer index, either individually or in slices. This is done using the dataframe's iloc method.

Refer to the Pandas documentation on iloc for more info on this functionality. (The name iloc is short for "integer location".)

In [5]:
#Fetch the 4th row of data
survey_df.iloc[3]
Out[5]:
month                       7
day                        16
year                     1977
plot_id                     7
species_id                 DM
sex                         M
hindfoot_length            36
weight                    NaN
hindfoot_length_in    1.41732
Name: 4, dtype: object
In [6]:
#Fetch the 101st through 110th rows of data
survey_df.iloc[100:110]
Out[6]:
month day year plot_id species_id sex hindfoot_length weight hindfoot_length_in
record_id
101 8 20 1977 5 DM F 35.0 46.0 1.377953
102 8 20 1977 23 DM F 35.0 40.0 1.377953
103 8 20 1977 18 DM F 35.0 30.0 1.377953
104 8 20 1977 11 DS M 43.0 NaN 1.692913
105 8 20 1977 19 DM M 35.0 39.0 1.377953
106 8 20 1977 12 NL NaN NaN NaN NaN
107 8 20 1977 18 NL NaN NaN NaN NaN
108 8 20 1977 11 PP M 21.0 NaN 0.826772
109 8 20 1977 6 DM M 35.0 34.0 1.377953
110 8 20 1977 10 DS NaN NaN NaN NaN
In [7]:
#Fetch the 3rd, 5th, and 10th rows
survey_df.iloc[[2,4,9]]
Out[7]:
month day year plot_id species_id sex hindfoot_length weight hindfoot_length_in
record_id
3 7 16 1977 2 DM F 37.0 NaN 1.456693
5 7 16 1977 3 DM M 35.0 NaN 1.377953
10 7 16 1977 6 PF F 20.0 NaN 0.787402

2b. Selecting rows and columns using iloc

The iloc method can also be used to select columns: the columns to select are passed as the second argument, after the rows we want to fetch. To retrieve the selected columns for all rows, we can just pass a ":" as the first argument.

In [8]:
#Select all rows and the 4th column of the dataset
survey_df.iloc[:,3]
Out[8]:
record_id
1         2
2         3
3         2
4         7
5         3
         ..
35545    15
35546    15
35547    10
35548     7
35549     5
Name: plot_id, Length: 35549, dtype: object
In [9]:
#Select the first 5 rows and the 4th column of the dataset
survey_df.iloc[:5,3]
Out[9]:
record_id
1    2
2    3
3    2
4    7
5    3
Name: plot_id, dtype: object

→ So you see, selecting subsets is just a matter of supplying the right row and column integer indices to the iloc method.
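
For instance, a quick sketch combining a row slice with a list of column positions (rows 101 through 105, plot_id and species_id columns):

#Fetch the 101st-105th rows and the 4th and 5th columns
survey_df.iloc[100:105, [3, 4]]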

2c. Selecting rows and columns using loc

While iloc selects rows and columns by their absolute position, loc selects rows and columns by their labels. Yes, each row and column has a label: a column's label is its column name; a row's label is its index value (not to be confused with its integer position, which is simply its row number).

When Pandas reads in a CSV file, the CSV file determines the column names, but row labels are assigned sequential numeric values by default. When we read the data into our survey_df, we specified that values in the record_id column become our index, i.e., our row labels. We can now use these labels to select rows, and combine them with column names to select row/column combinations:

In [10]:
#Fetch the row with row label of 10
survey_df.loc[10]
Out[10]:
month                        7
day                         16
year                      1977
plot_id                      6
species_id                  PF
sex                          F
hindfoot_length             20
weight                     NaN
hindfoot_length_in    0.787402
Name: 10, dtype: object
In [11]:
#Fetch value in the species_id column in the row with row label of 10 
survey_df.loc[10,'species_id']
Out[11]:
'PF'

Because our dataframe's row index (i.e. its row labels) happens to be numeric, we can fetch slices of values. Note that, unlike iloc, loc slices include both endpoints: the slice 10:20 below returns 11 rows. We can also fetch multiple columns by passing them in as a list.

In [12]:
#Fetch the species_id and sex columns for rows with labels between 10 and 20
survey_df.loc[10:20,['species_id','sex']]
Out[12]:
species_id sex
record_id
10 PF F
11 DS F
12 DM M
13 DM M
14 DM NaN
15 DM F
16 DM F
17 DS F
18 PP M
19 PF NaN
20 DS F

2d. Selecting rows based on criteria - using queries

Instead of selecting data by position or by label, we can also query our dataframe for all records meeting specific criteria. In this first example, we'll use the dataframe's query() function.

In [13]:
#Fetch all records from the year 1997 into a new variable
records_1997 = survey_df.query('year == 1997')
records_1997.shape
Out[13]:
(2493, 9)
In [14]:
#Fetch all records from 1997 for female individuals
females_1997 = survey_df.query('year == 1997 & sex == "F"')
females_1997.shape
Out[14]:
(1071, 9)
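
As an aside (not part of the original lesson), query() can also reference Python variables using the "@" prefix, which keeps criteria out of hard-coded strings:

#Reference an external variable in a query with "@"
target_year = 1997
survey_df.query('year == @target_year & sex == "F"').shape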

2e. Selecting rows based on criteria - using masks

Queries are handy, but as more criteria are added, the query statement can become cumbersome. Masks provide a somewhat simpler, or at least more modular, approach.

A "mask" is a Pandas series (i.e. one column of data) holding a Boolean value for each row:

In [15]:
survey_df
Out[15]:
month day year plot_id species_id sex hindfoot_length weight hindfoot_length_in
record_id
1 7 16 1977 2 NL M 32.0 NaN 1.259843
2 7 16 1977 3 NL M 33.0 NaN 1.299213
3 7 16 1977 2 DM F 37.0 NaN 1.456693
4 7 16 1977 7 DM M 36.0 NaN 1.417323
5 7 16 1977 3 DM M 35.0 NaN 1.377953
... ... ... ... ... ... ... ... ... ...
35545 12 31 2002 15 AH NaN NaN NaN NaN
35546 12 31 2002 15 AH NaN NaN NaN NaN
35547 12 31 2002 10 RM F 15.0 14.0 0.590551
35548 12 31 2002 7 DO M 36.0 51.0 1.417323
35549 12 31 2002 5 NaN NaN NaN NaN NaN

35549 rows × 9 columns

In [16]:
#Create a mask for all records with hindfeet smaller than 35mm
mask_smallfeet = survey_df['hindfoot_length'] < 35
mask_smallfeet
Out[16]:
record_id
1         True
2         True
3        False
4        False
5        False
         ...  
35545    False
35546    False
35547     True
35548    False
35549    False
Name: hindfoot_length, Length: 35549, dtype: bool

We can now apply this mask to select the records where the mask value is True, using it with the loc method.

In [17]:
survey_df.loc[mask_smallfeet]
Out[17]:
month day year plot_id species_id sex hindfoot_length weight hindfoot_length_in
record_id
1 7 16 1977 2 NL M 32.0 NaN 1.259843
2 7 16 1977 3 NL M 33.0 NaN 1.299213
6 7 16 1977 1 PF M 14.0 NaN 0.551181
9 7 16 1977 1 DM F 34.0 NaN 1.338583
10 7 16 1977 6 PF F 20.0 NaN 0.787402
... ... ... ... ... ... ... ... ... ...
35540 12 31 2002 15 PB F 26.0 23.0 1.023622
35541 12 31 2002 15 PB F 24.0 31.0 0.944882
35542 12 31 2002 15 PB F 26.0 29.0 1.023622
35543 12 31 2002 15 PB F 27.0 34.0 1.062992
35547 12 31 2002 10 RM F 15.0 14.0 0.590551

18172 rows × 9 columns

What's nice about masks is that we can combine multiple masks. Think of each Boolean value as an on/off switch that determines whether or not the row is included in the final set. We use & (the "and" switch) to combine masks such that each switch must be "on", and | (the "or" switch) to require only one switch to be on.

In [18]:
#Create a mask of males
mask_male = survey_df['sex'] == 'M'
mask_male

#Create a mask for records collected in April
mask_april = survey_df['month'] == 4
In [19]:
#Filter records meeting all criteria
data_subset = survey_df.loc[mask_smallfeet & mask_male & mask_april]
data_subset.shape
Out[19]:
(790, 9)
In [20]:
#Filter records meeting any criteria
data_subset = survey_df.loc[mask_smallfeet | mask_male | mask_april]
data_subset.shape
Out[20]:
(27550, 9)
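
Masks can also be inverted with the "~" operator; a quick sketch (an aside, not part of the lesson):

#Filter records that are NOT male (females plus records with no recorded sex)
survey_df.loc[~mask_male].shape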

►Knowledge check:

  • Use the query() function to select all male individuals with weight greater than 41 g.
    • What is the median hindfoot length of this population?
  • Perform the same analysis using masks.
In [21]:
#Extract males weighing more than 41 g - using query
subset = survey_df.query("sex == 'M' and weight > 41")
subset['hindfoot_length'].median()
Out[21]:
36.0
In [22]:
#Extract males weighing more than 41 g - using masks
mask_weight = survey_df['weight'] > 41
subset_2 = survey_df.loc[mask_male & mask_weight]
subset_2['hindfoot_length'].median()
Out[22]:
36.0

2f. Updating values in selected rows/columns

We've learned how to select data in specific rows and columns. Now we'll update those values.

Our dataset has some erroneous data: records occurring on the "31st" day of April and September, months with only 30 days. We'll change those dates to the 30th. To do this, we first select the rows where the day is 31 and the month is 4 (April) or 9 (September). Then, with those rows selected, we'll update the values in the day column to be 30.

In [23]:
#Create masks for day = 31 and months 4 (April) and 9 (September)
mask_day = survey_df['day'] == 31
mask_april = survey_df['month'] == 4 
mask_sept = survey_df['month'] == 9

Applying the masks returns a dataframe of the selected rows:

In [24]:
#Apply the masks
survey_df.loc[mask_day & (mask_april | mask_sept)]
Out[24]:
month day year plot_id species_id sex hindfoot_length weight hindfoot_length_in
record_id
30650 4 31 2000 6 PP F 22.0 19.0 0.866142
30651 4 31 2000 6 PB M 25.0 32.0 0.984252
30652 4 31 2000 6 PB F 25.0 30.0 0.984252
30653 4 31 2000 6 PP M 23.0 20.0 0.905512
30654 4 31 2000 6 PP M 23.0 24.0 0.905512
... ... ... ... ... ... ... ... ... ...
31500 9 31 2000 16 PM M 17.0 11.0 0.669291
31501 9 31 2000 16 OT F 20.0 NaN 0.787402
31502 9 31 2000 5 NaN NaN NaN NaN NaN
31503 9 31 2000 7 NaN NaN NaN NaN NaN
31504 9 31 2000 10 NaN NaN NaN NaN NaN

136 rows × 9 columns

We can add a column designation to fetch just one column for the selected rows:

In [25]:
#Fetch just the day column of the filtered rows
survey_df.loc[mask_day & (mask_april | mask_sept), 'day']
Out[25]:
record_id
30650    31
30651    31
30652    31
30653    31
30654    31
         ..
31500    31
31501    31
31502    31
31503    31
31504    31
Name: day, Length: 136, dtype: int64

Rather than just displaying the selection, we can assign a new value to it:

In [26]:
#Update the values
survey_df.loc[mask_day & (mask_april | mask_sept), 'day'] = 30

Check our results...

In [27]:
#What's the largest value in the day column for records where month is 4 or 9?
survey_df.query('month == 4 or month == 9')['day'].max()
Out[27]:
30

With the dates fixed, we can now create a date attribute

In [28]:
survey_df['date'] = pd.to_datetime(survey_df[['year','month','day']])
survey_df.head()
Out[28]:
month day year plot_id species_id sex hindfoot_length weight hindfoot_length_in date
record_id
1 7 16 1977 2 NL M 32.0 NaN 1.259843 1977-07-16
2 7 16 1977 3 NL M 33.0 NaN 1.299213 1977-07-16
3 7 16 1977 2 DM F 37.0 NaN 1.456693 1977-07-16
4 7 16 1977 7 DM M 36.0 NaN 1.417323 1977-07-16
5 7 16 1977 3 DM M 35.0 NaN 1.377953 1977-07-16

3. Grouping data

We often want to calculate summary statistics grouped by the unique values of a categorical field in our data. For example, we might want to calculate the average weight of all individuals per some variable like species_id, plot_id, or sex. This is done with Pandas' groupby() function.

So let's go through the process of grouping data by sex (a variable without a lot of unique values) and computing mean weight and mean hindfoot length of males vs females.

First, some review:

  • We can calculate summary statistics for all records in a single column using the syntax below:
In [29]:
survey_df['weight'].describe()
Out[29]:
count    32283.000000
mean        42.672428
std         36.631259
min          4.000000
25%         20.000000
50%         37.000000
75%         48.000000
max        280.000000
Name: weight, dtype: float64
  • We can also extract each single metric separately if we wish:
In [30]:
print(" Min: ", survey_df['weight'].min())
print(" Max: ", survey_df['weight'].max())
print(" Mean: ", survey_df['weight'].mean())
print(" Std Dev: ", survey_df['weight'].std())
print(" Count: ", survey_df['weight'].count())
 Min:  4.0
 Max:  280.0
 Mean:  42.672428212991356
 Std Dev:  36.63125947458358
 Count:  32283
  • To summarize by a categorical variable, we group the data on that variable. The result of this operation is a new type of object - a Pandas "DataFrameGroupBy" object - which serves as an intermediate for analyzing grouped data.
In [49]:
#Group the data by unique values in the `sex` field
grouped_data = survey_df.groupby('sex')

#This creates a Panda's "grouped dataframe" object
type(grouped_data)
Out[49]:
pandas.core.groupby.generic.DataFrameGroupBy
  • To analyze the grouped dataframe object, we supply an aggregate function like min(), median(), count() etc.
In [50]:
#Compute the median value of all numeric fields for each group
grouped_data.median()
Out[50]:
month day year hindfoot_length weight hindfoot_length_in
sex
F 7 16 1990 27.0 34.0 1.062992
M 6 16 1990 34.0 39.0 1.338583
In [55]:
# Provide a list of fields if you want to restrict what columns are reported
grouped_data[['hindfoot_length','weight']].median()
Out[55]:
hindfoot_length weight
sex
F 27.0 34.0
M 34.0 39.0
In [33]:
# Or, use the describe function to reveal all summary stats for the grouped data
grouped_data['weight'].describe()
Out[33]:
count mean std min 25% 50% 75% max
sex
F 15303.0 42.170555 36.847958 4.0 20.0 34.0 46.0 274.0
M 16879.0 42.995379 36.184981 4.0 20.0 39.0 49.0 280.0

Challenge - Summary Data

  1. How many recorded individuals are female (F) and how many are male (M)?
In [57]:
# Challenge 1: Show the *count* of records, grouped by sex
#grouped_data.█()
grouped_data.count()
Out[57]:
month day year plot_id species_id hindfoot_length weight hindfoot_length_in date
sex
F 15690 15690 15690 15690 15690 14894 15303 14894 15690
M 17348 17348 17348 17348 17348 16476 16879 16476 17348
  2. The statement above produces another dataframe. Show just the count of the records in the month field of the resulting dataframe.
In [58]:
# Challege 2: Just show the 'month' column in the above statement
#grouped_data.█()['█']
grouped_data.count()['month']
Out[58]:
sex
F    15690
M    17348
Name: month, dtype: int64
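
As a side note, an equivalent (and slightly leaner) approach selects the column before counting, so only one column is tallied:

#Count records per group for a single column
grouped_data['month'].count()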
  3. What happens when you group by two columns using the following syntax and then grab mean values?
    • grouped_data2 = survey_df.groupby(['plot_id','sex'])
    • grouped_data2.mean()
In [64]:
# Challenge 3
grouped_data2 = survey_df.groupby(['plot_id','sex'])
grouped_data2.mean()
Out[64]:
month day year hindfoot_length weight hindfoot_length_in
plot_id sex
1 F 6.597877 15.338443 1990.933962 31.733911 46.311138 1.249367
M 6.121461 15.905936 1990.091324 34.302770 55.950560 1.350503
10 F 5.588652 16.964539 1989.248227 18.641791 17.094203 0.733929
M 5.718310 16.739437 1989.007042 19.567164 19.971223 0.770361
11 F 6.759124 16.272506 1989.836983 32.029299 43.515075 1.260996
M 6.374150 15.968902 1989.856171 32.078014 43.366197 1.262914
12 F 6.509434 16.305660 1990.266981 30.975124 49.831731 1.219493
M 6.304167 16.367500 1990.400833 31.762489 48.909710 1.250492
13 F 6.802548 16.216561 1990.619427 27.201014 40.524590 1.070906
M 6.480204 16.000000 1989.911877 27.893793 40.097754 1.098181
14 F 6.510578 16.672779 1989.974612 32.973373 47.355491 1.298164
M 6.660748 16.499065 1990.587850 32.961802 45.159378 1.297709
15 F 6.569647 16.149688 1992.523909 21.949891 26.670236 0.864169
M 6.185819 17.405868 1990.770171 21.803109 27.523691 0.858390
16 F 6.360360 16.810811 1991.351351 23.144928 25.810427 0.911218
M 6.201465 16.615385 1990.926740 23.480916 23.811321 0.924446
17 F 6.650224 15.892377 1990.785874 30.918536 48.176201 1.217265
M 6.569801 16.183286 1991.331434 32.227634 47.558853 1.268804
18 F 6.698013 15.960265 1990.536424 26.690341 36.963514 1.050801
M 6.610224 16.797125 1988.551118 27.703072 43.546952 1.090672
19 F 6.701149 15.226054 1993.417625 21.257937 21.978599 0.836927
M 6.533563 16.647160 1991.740103 21.071685 20.306878 0.829594
2 F 6.426804 16.288660 1990.449485 30.161220 52.561845 1.187450
M 6.340035 15.440559 1990.756119 30.353760 51.391382 1.195030
20 F 6.743077 16.026154 1990.253846 27.069193 52.624406 1.065716
M 6.489396 16.375204 1989.243067 27.908451 44.197279 1.098758
21 F 6.860884 16.307692 1993.878887 22.366554 25.974832 0.880573
M 6.671840 16.203991 1992.199557 21.736721 22.772622 0.855776
22 F 6.651267 15.521610 1990.973174 34.108320 53.647059 1.342847
M 6.381872 16.682021 1990.650817 33.359746 54.572531 1.313376
23 F 6.860606 16.036364 1989.024242 20.051948 20.564417 0.789447
M 6.391304 16.077295 1989.961353 19.850000 18.941463 0.781496
24 F 6.596708 16.393004 1987.485597 26.993377 47.914405 1.062731
M 6.360825 16.971134 1988.641237 25.786996 39.321503 1.015236
3 F 6.604703 16.155655 1992.013438 23.774044 31.215349 0.935986
M 6.271429 16.444048 1992.275000 23.833744 34.163241 0.938336
4 F 6.442661 15.741972 1990.235092 33.249102 46.818824 1.309020
M 6.430097 16.500971 1991.000971 34.097959 48.888119 1.342439
5 F 6.142315 15.722960 1986.485769 28.921844 40.974806 1.138655
M 6.194539 15.703072 1986.817406 29.694794 40.708551 1.169086
6 F 6.628223 16.301221 1991.579376 26.981322 36.352288 1.062257
M 6.035340 16.083770 1990.556283 27.425591 36.867388 1.079748
7 F 6.385075 15.313433 1991.441791 19.779553 20.006135 0.778723
M 6.719870 15.778502 1991.462541 20.536667 21.194719 0.808530
8 F 6.632143 15.825000 1991.267857 32.187578 45.623011 1.267227
M 6.571719 15.840285 1991.686673 33.751059 49.641372 1.328782
9 F 6.499414 15.547479 1989.303634 35.126092 53.618469 1.382917
M 6.361554 15.201195 1990.632470 34.175732 49.519309 1.345501
  4. Display summary statistics of hindfoot length values for each plot_id in your data.
    HINT: First group the data, then extract the one variable you want to summarize, and finally use the describe function on that result to compute summary statistics...
In [ ]:
# Challenge 4
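#One possible solution, following the hint (group, extract, describe):
survey_df.groupby('plot_id')['hindfoot_length'].describe()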

More complex aggregating functions...

We can also supply a dictionary of aggregating functions so that each column in the grouped result is aggregated exactly how we want (i.e. instead of computing just the sum or mean of all columns). The dictionary is built by specifying the column name as the key and the aggregate function(s) as the values. Below is an example that aggregates the data by sex, computing the count, minimum, and maximum of the year, the median of the hindfoot_length, and the mean weight.

In [66]:
#Create the aggregate function dictionary
aggFuncs = {
    "year": ['count','min','max'],#compute count, min, and max of the year attribute
    "hindfoot_length": 'median',  #compute the median hindfoot length
    "weight": 'mean'}             #compute the mean weight
#Apply the dictionary
survey_df.groupby('sex').agg(aggFuncs)
Out[66]:
year hindfoot_length weight
count min max median mean
sex
F 15690 1977 2002 27.0 42.170555
M 17348 1977 2002 34.0 42.995379

4. Transforming data with pivot tables

And finally, we can transform our data via pivot tables. In this operation, we take two categorical variables in our dataset, using the unique values of one as row labels and those of the other as column names, then aggregate a third value across each row/column combination.

In [84]:
survey_df.pivot_table(
    values = 'weight',
    index = 'species_id',
    columns= 'year',
    aggfunc ='mean'
).fillna(-1).T.style.background_gradient(cmap = 'YlGnBu')
Out[84]:
species_id BA DM DO DS NL OL OT OX PB PE PF PH PI PL PM PP PX RF RM RO RX SF SH SO SS
year
1977 -1.000000 41.141304 42.666667 121.437500 -1.000000 21.666667 21.000000 21.333333 -1.000000 19.500000 7.173913 -1.000000 -1.000000 -1.000000 -1.000000 15.333333 -1.000000 -1.000000 10.000000 -1.000000 -1.000000 -1.000000 -1.000000 -1.000000 -1.000000
1978 -1.000000 40.795455 45.000000 115.198653 184.656250 31.027027 23.220000 -1.000000 -1.000000 20.428571 7.088235 -1.000000 -1.000000 -1.000000 20.666667 14.869565 -1.000000 -1.000000 7.500000 -1.000000 -1.000000 -1.000000 89.000000 -1.000000 130.000000
1979 -1.000000 43.507317 45.862069 111.796954 138.000000 33.717391 23.075758 -1.000000 -1.000000 20.529412 7.529412 -1.000000 -1.000000 -1.000000 23.666667 15.105263 -1.000000 -1.000000 8.333333 -1.000000 -1.000000 -1.000000 -1.000000 -1.000000 -1.000000
1980 -1.000000 44.136082 48.058824 120.725664 158.982143 33.107143 24.083333 -1.000000 -1.000000 22.450000 7.455556 -1.000000 -1.000000 -1.000000 21.000000 14.176471 -1.000000 -1.000000 10.227273 -1.000000 -1.000000 -1.000000 -1.000000 -1.000000 -1.000000
1981 -1.000000 44.032907 49.111111 125.818444 165.903226 33.296296 24.415385 -1.000000 -1.000000 20.558140 7.152542 -1.000000 -1.000000 -1.000000 19.933333 13.950000 -1.000000 -1.000000 11.178571 -1.000000 -1.000000 -1.000000 -1.000000 -1.000000 57.000000
1982 -1.000000 41.975042 47.919643 115.647929 160.613208 35.590361 24.925532 24.000000 -1.000000 21.173077 6.918919 -1.000000 -1.000000 -1.000000 21.391304 16.125000 -1.000000 11.500000 10.436242 -1.000000 -1.000000 -1.000000 79.000000 -1.000000 -1.000000
1983 -1.000000 40.944551 47.150602 122.033088 156.691489 34.584615 24.697674 -1.000000 -1.000000 20.864865 6.833333 -1.000000 -1.000000 -1.000000 22.023810 15.468750 -1.000000 -1.000000 9.872000 -1.000000 -1.000000 -1.000000 -1.000000 -1.000000 -1.000000
1984 -1.000000 40.765306 48.415094 124.082192 150.095238 32.550725 22.416667 18.000000 -1.000000 20.210526 7.428571 28.000000 -1.000000 -1.000000 19.545455 15.307692 -1.000000 -1.000000 11.152542 -1.000000 -1.000000 -1.000000 -1.000000 -1.000000 -1.000000
1985 -1.000000 41.507645 47.956731 124.326316 148.829268 32.108696 22.064516 -1.000000 -1.000000 20.360000 -1.000000 32.666667 -1.000000 -1.000000 19.000000 15.764706 -1.000000 -1.000000 8.371429 -1.000000 -1.000000 -1.000000 -1.000000 -1.000000 -1.000000
1986 -1.000000 43.361596 49.372727 128.071429 159.724138 30.880952 21.500000 -1.000000 -1.000000 22.520000 8.000000 39.000000 -1.000000 -1.000000 21.696970 16.750000 -1.000000 -1.000000 10.750000 -1.000000 -1.000000 -1.000000 55.000000 -1.000000 -1.000000
1987 -1.000000 43.232609 50.087379 126.383838 158.840000 30.779221 23.069767 -1.000000 -1.000000 21.625899 7.500000 -1.000000 -1.000000 -1.000000 22.297710 17.840909 -1.000000 -1.000000 10.657609 -1.000000 -1.000000 -1.000000 73.800000 -1.000000 -1.000000
1988 -1.000000 43.397790 51.463768 129.490566 163.104167 30.306122 24.070588 -1.000000 -1.000000 22.625668 8.000000 33.000000 -1.000000 -1.000000 21.759259 18.280000 -1.000000 13.818182 10.322115 -1.000000 -1.000000 -1.000000 72.806452 -1.000000 -1.000000
1989 7.000000 44.349206 51.025641 121.896552 151.278689 31.947368 24.663366 20.000000 -1.000000 21.935673 7.230769 30.500000 -1.000000 -1.000000 20.222222 17.409091 -1.000000 13.346939 10.411311 -1.000000 -1.000000 54.800000 76.345455 -1.000000 -1.000000
1990 8.000000 41.769912 48.512048 121.187500 154.275862 31.200000 23.675325 -1.000000 -1.000000 21.671233 7.117647 31.142857 -1.000000 -1.000000 -1.000000 16.029412 -1.000000 12.916667 10.305677 -1.000000 -1.000000 52.611111 76.888889 -1.000000 -1.000000
1991 9.240000 43.148338 49.695652 113.000000 148.785714 28.171429 24.588235 -1.000000 -1.000000 21.435484 7.827586 -1.000000 -1.000000 -1.000000 -1.000000 17.904762 -1.000000 -1.000000 10.498305 11.000000 -1.000000 93.166667 63.000000 53.909091 -1.000000
1992 7.833333 43.877966 48.367816 112.352941 139.000000 28.454545 24.928571 -1.000000 -1.000000 22.750000 8.767857 31.142857 18.000000 -1.000000 -1.000000 17.479675 -1.000000 -1.000000 10.904762 -1.000000 -1.000000 43.000000 -1.000000 55.263158 -1.000000
1993 -1.000000 43.373984 48.461538 105.722222 127.391304 27.545455 22.363636 -1.000000 -1.000000 19.230769 8.238806 -1.000000 -1.000000 -1.000000 -1.000000 17.954023 -1.000000 -1.000000 10.747475 -1.000000 -1.000000 44.000000 -1.000000 55.600000 -1.000000
1994 -1.000000 42.373288 47.750000 106.625000 186.875000 21.900000 25.285714 -1.000000 -1.000000 18.000000 7.855072 -1.000000 -1.000000 -1.000000 -1.000000 17.585714 -1.000000 -1.000000 10.675000 -1.000000 -1.000000 -1.000000 -1.000000 62.333333 -1.000000
1995 -1.000000 44.806147 49.592593 -1.000000 155.833333 27.296296 24.868421 -1.000000 34.000000 21.230769 8.780645 35.333333 -1.000000 22.333333 27.375000 16.844444 -1.000000 -1.000000 12.653846 -1.000000 20.000000 -1.000000 -1.000000 -1.000000 -1.000000
1996 -1.000000 44.506173 47.234940 -1.000000 148.000000 27.769231 23.453704 -1.000000 32.578947 22.027778 8.393846 -1.000000 -1.000000 21.166667 20.224490 18.095563 -1.000000 -1.000000 11.455556 -1.000000 -1.000000 -1.000000 -1.000000 -1.000000 -1.000000
1997 -1.000000 44.708551 48.177866 111.000000 150.688889 33.625000 24.785156 -1.000000 31.085603 20.929825 8.448649 22.000000 -1.000000 18.789474 21.126394 18.175000 19.000000 20.000000 11.230769 -1.000000 11.000000 -1.000000 50.857143 54.666667 -1.000000
1998 -1.000000 43.131403 49.731183 116.000000 159.466667 -1.000000 24.675676 -1.000000 30.082237 20.304348 8.720000 -1.000000 -1.000000 16.714286 20.591398 16.266990 -1.000000 -1.000000 13.100000 -1.000000 -1.000000 -1.000000 -1.000000 -1.000000 -1.000000
1999 -1.000000 43.945402 48.012048 120.714286 182.200000 -1.000000 25.723810 -1.000000 31.710037 25.428571 -1.000000 -1.000000 -1.000000 -1.000000 22.523810 16.521212 -1.000000 -1.000000 10.555556 -1.000000 -1.000000 -1.000000 54.000000 -1.000000 -1.000000
2000 -1.000000 43.126638 49.224719 -1.000000 179.307692 -1.000000 25.303448 -1.000000 30.878899 21.615385 -1.000000 -1.000000 -1.000000 21.000000 20.500000 16.788618 -1.000000 -1.000000 11.400000 -1.000000 -1.000000 -1.000000 73.375000 -1.000000 -1.000000
2001 -1.000000 45.442177 52.233766 -1.000000 167.851064 -1.000000 23.707792 -1.000000 32.798851 20.400000 8.538462 -1.000000 18.000000 -1.000000 26.666667 17.752896 -1.000000 -1.000000 11.333333 -1.000000 -1.000000 -1.000000 79.900000 -1.000000 -1.000000
2002 -1.000000 46.168317 49.514403 -1.000000 182.159091 25.428571 23.833333 -1.000000 32.359447 21.719298 7.388889 -1.000000 20.000000 -1.000000 19.000000 17.018617 -1.000000 -1.000000 10.000000 10.142857 -1.000000 62.166667 64.666667 -1.000000 -1.000000