Lab 2. Principles of Cartographic Design

Overview

In this lab, you’ll gain familiarity with the GeoDa software.

Objectives for this practical:

  • Wrangle a data file to prepare for merge
  • Join non-spatial data to spatial data
  • Define data classification and intervals
  • Crate Choropleth maps in GeoDa

Research Question

Environment Setup

We will be working with GeoDa for the practical. Ensure you’ve properly downloaded the software and are able to successfully open.

Documentation of any opensource software is always a must when first using it. Find the documentation for “Basic mapping” on the GeoDa site to get more details, screenshots, and overview of techniques we will be doing here.

Download Data

First, download the data required in the lab.

  • Opioid-related overdose cases (CSV Format)
  • Community area boundaries (Shapefile Format)

Opioid cases can be downloaded from the Chicago Health Atlas (https://www.chicagohealthatlas.org/indicators/opioid-related-overdose-deaths). Community area boundaries can be found at the City of Chicago Data Portal (https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Boundaries-Community-Areas-current-/cauq-8yn6). Unzip and add all the data to a new folder saved for your classwork.

Inspect Data

Now, open and inspect the CSV files in Excel (or software of your choice), and identify the geography columns. We see that this data is in “long” format, meaning every row has a value at a specific geography (citywide or community area) and year. We need to update the data so there are only 77 rows, one corresponding to each community area. To do this, select and copy all rows with Community Area geographies in 2016, and paste in a new tab. There should only be 77 rows. Save this new data as CSV.

Open in GeoDa

Next, we will open the newly created Community Area boundaries data in GeoDa.

  • Open GeoDa
  • In the folder browse button, select ESRI Shapefile, and locate and open your Community Area boundary file. Alternatively, you can drag and drop your file.
  • In the new Map window, click the Basemap icon (second from right) to add a Basemap.
  • To open the corresponding attributes, click on the Table icon in the main toolbar (to the left of the W icon)

Merge data

Now we merge the opioid data to the community areas data. In GeoDa, this is very straightforward: * Under Table in GeoDa’s tabs, select Merge Table * Navigate to your dataset * Merge by key values. Select the community area number (our key) you will join on. In the Community Area shapefile, that was “commarea.” What is it in the other dataset? If you’re not sure, close the Merge window, check, and then return. * Sometimes, field names may be too long for import. This is a function of rule names established for shapefiles, and several GIS software will support and uphold that. Rename field names to shorter versions if necessary. * File merged successfully!

Data Classification

Choropleth maps use data classification techniques to segment the data into different intervals. Each interval then gets mapped with a different color. How you classify the data impacts the intervals, which then impacts your choropleth and interpretation. Before we can start choropleth mapping, we need to better understand how this works. There are different data classification techniques, with some details here directly from GeoDa documentation by Anselin (2018):

Quantiles: Data is grouped into equal bins. A quantile map of 4 bins will group the data into four groups of equal numbers, based on where these breaks are in the data. Sometimes there is data with several fields having the same number, or ties. This can be a problem in quantile maps. GeoDa will move observations to new bins (commonly the next bin up) with there are ties.

Any time there are ties in the ranking of observations that align with the values for the breakpoints, the classification in a quantile map will be problematic and result in categories with an unequal number of observations (Anselin 2018).

Natural Breaks: A natural breaks map uses a nonlinear algorithm to group observations such that the within-group variation is maximized, following the work of Fisher (1958) and Jenks (1977). This is an essentially a clustering algorithm in one dimension to determine the break points that yield groups with the largest internal similarity (Anselin 2018).

Equal Intervals: An equal intervals map uses the same principle as a histogram to organize the observations into categories that divide the range of the variable into equal interval bins. This contrasts with the quantile map, where the number of observations in each bin is equal, but the range for each bin is not. For the equal interval classification, the value range between the lower and upper bound in each bin is constant across bins, but the number of observations in each bin is typically not equal.

Choropleth Mapping in GeoDa

We are ready for choropleth mapping! First, construct a histogram of the data so we can see how the data is distributed in a non-spatial way. Click the “histogram icon” in the main toolbar. Select “Count” as your field; this is the count of all opioid-related overdose deaths in 2016. Inspect the data, selecting intervals of interest to see where that data plots in your main map.

Under the Map tab, select and try the following: * Quantile Map with 4 bins * Quantile Map with 6 bins * Equal Interval Map with 4 bins * Natural Breaks Map with 4 bins

For each, select the last interval in the legend to see corresponding data in the histogram. In the attribute table, data selected will show up in yellow. You can right-click in the attribute table and “move up selected values to top” as well.

Next, try some additional bins, specifically the Standard Deviation Map and a Box Map (with 1.5 hinge). Find the corresponding section in GeoDa documentation that explains these data classification techniques.

To the right of the Histogram icon, click the Box Plot icon. Use a 1.5-hinge for the Count field. Inspect for any outliers. Drag-select regions of the box plot and view how those areas are highlighted in the data table and map views.

Next Steps

With each choropleth map, a new insight can be found. What is similar across all maps? What is changing, and why? If a spatial pattern emerges in a consistent way, it is more likely to be a true phenomenon, in contrast to a data artifact or poor classification choice.

Determine the best choropleth map to present. Update styles by right-clicking on the icons in the legend to change the colors if required. For thematic maps, having more than 5 bins is generally not recommended, though each classification strategy is unique to the data at hand. Go to Color Brewer (http://colorbrewer2.org/#type= sequential&scheme=BuGn&n=3) to explore and view different color strategies. Return to GeoDa to update, using the HEX code for new colors if importing a specific color scheme. Finally, update with your favorite basemap for this data. Update transparency if needed.

Consider additional data that may make this more meaningful. Could you add additional columns for previous years, following what was available in the original dataset? What about additional socioeconomic or built environment indicators?

Search for an additional column to join, merge, and generate a new choropleth map for that variable. What was the added value?