Lab 5. Modeling Neighborhoods
Overview
In this practical, we will identify and quantify crimes at a local areal level for two major cities using data from open data portals. This week’s objectives will be to:
- wrangle and clean raw crime data in R
- convert CSV to spatial points for analysis & visualization
- enable and transform Coordinate Reference Systems (CRS)
- spatially join points to tracts (‘point in polygon’ analysis)
Research Question
This lab practical reflects work done in an actual research study. A team of clinicians, epidemiologists, and other health researchers was interested in comparing access to trauma hospitals across three major cities. Victims of violent crime need quick access to trauma hospitals to ensure optimal results; if areas with disproportionately high homicide counts are especially far from trauma ERs, health outcomes can be disproportionately worse. The goal of this practical is to generate a new spatial variable, the total number of homicides per census tract, in two major cities (LA and NYC) using raw data from each city’s data portal, for further analysis. Homicides were used as a proxy for violent crime because reporting structures for violent crime differed too much between cities for effective comparison otherwise.
You can read more about the study at: Tung, E. L., Hampton, D. A., Kolak, M., Rogers, S. O., Yang, J. P., & Peek, M. E. (2019). Race/Ethnicity and Geographic Access to Urban Trauma Care. JAMA Network Open, 2(3), e190138.
Environment Setup
For this lab, you’ll need to have R and RStudio downloaded and installed on your system. We will work with the following libraries, so please be sure you have already installed them:
- sf
- tmap
- leaflet
- data.table
- tidyverse
First, load the libraries we’ll need for our lab.
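The chunk that produced the startup messages below is not shown in this handout. A minimal sketch of what it likely looks like, assuming the five packages listed above are already installed (the load order here is a guess based on the conflict messages):

```r
# Assumes these packages are already installed, e.g. via install.packages().
library(sf)          # spatial data classes and operations
library(tmap)        # thematic maps
library(leaflet)     # interactive web maps
library(data.table)  # fast CSV reading with fread()
library(tidyverse)   # dplyr, ggplot2, and related packages
```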
## Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0
## ── Attaching packages ──────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.0 ✓ purrr 0.3.4
## ✓ tibble 3.0.1 ✓ dplyr 0.8.5
## ✓ tidyr 1.0.3 ✓ stringr 1.4.0
## ✓ readr 1.3.1 ✓ forcats 0.5.0
## ── Conflicts ─────────────────── tidyverse_conflicts() ──
## x dplyr::between() masks data.table::between()
## x dplyr::filter() masks stats::filter()
## x dplyr::first() masks data.table::first()
## x dplyr::lag() masks stats::lag()
## x dplyr::last() masks data.table::last()
## x purrr::transpose() masks data.table::transpose()
Set your working directory.
#setwd("~/Desktop/Lab5-LACrimes")
Clean & Wrangle Data
We will be working with crime data from the Los Angeles open data portal first. When working with crime data, we will filter to the year of interest within the data portal, and then download the filtered dataset. This is because downloading all crimes produces an unnecessarily large dataset, from which we just need a short period of data. Unless you are interested in all crimes across time, start with a smaller subset that is more closely matched to your period of interest. As you get more comfortable with coding and optimizing speed and efficiency, your process may change.
In this lab practical, all violent crimes from 2015 are provided as the filtered, downloaded dataset that we start with. From these violent crimes, we are tasked with identifying “homicides” to make more comparable variables between cities. We must identify rows coded as homicides in LA, and will later do the same for NYC. Note that each police department jurisdiction codes slightly differently. Identifying data for true and meaningful comparison is an important step in the research process.
Load CSV with fread
Load the filtered CSV of crimes in Los Angeles from 2015. Here we use the fread function from the data.table package, which reads in CSV data much more quickly and efficiently than the base R read.csv function.
LAcrime <- fread("LAPD2015_Violent.csv", header = TRUE)
head(LAcrime)
## DR Number Date Reported Date Occurred Time Occurred Area ID Area Name
## 1: 151504287 1/5/15 1/5/15 1320 15 N Hollywood
## 2: 151504288 1/5/15 1/5/15 1320 15 N Hollywood
## 3: 151504289 1/5/15 1/5/15 1320 15 N Hollywood
## 4: 151504298 1/6/15 1/5/15 1140 15 N Hollywood
## 5: 150104246 1/6/15 1/6/15 600 1 Central
## 6: 151704164 1/6/15 1/6/15 520 17 Devonshire
## Reporting District Crime Code Crime Code Description
## 1: 1599 231 ASSAULT WITH DEADLY WEAPON ON POLICE OFFICER
## 2: 1599 231 ASSAULT WITH DEADLY WEAPON ON POLICE OFFICER
## 3: 1599 231 ASSAULT WITH DEADLY WEAPON ON POLICE OFFICER
## 4: 1599 231 ASSAULT WITH DEADLY WEAPON ON POLICE OFFICER
## 5: 152 231 ASSAULT WITH DEADLY WEAPON ON POLICE OFFICER
## 6: 1785 231 ASSAULT WITH DEADLY WEAPON ON POLICE OFFICER
## MO Codes Victim Age Victim Sex Victim Descent Premise Code
## 1: 1212 1100 NA M B 501
## 2: 1212 NA M O 501
## 3: 1212 NA M O 501
## 4: 1212 31 M H 710
## 5: 416 20 M H 101
## 6: 1271 1309 2003 1822 NA F H 101
## Premise Description Weapon Used Code Weapon Description
## 1: SINGLE FAMILY DWELLING 109 SEMI-AUTOMATIC PISTOL
## 2: SINGLE FAMILY DWELLING 109 SEMI-AUTOMATIC PISTOL
## 3: SINGLE FAMILY DWELLING 109 SEMI-AUTOMATIC PISTOL
## 4: OTHER PREMISE 102 HAND GUN
## 5: STREET 500 UNKNOWN WEAPON/OTHER WEAPON
## 6: STREET 307 VEHICLE
## Status Code Status Description Crime Code 1 Crime Code 2 Crime Code 3
## 1: AA Adult Arrest 231 NA NA
## 2: AA Adult Arrest 231 NA NA
## 3: AA Adult Arrest 231 NA NA
## 4: AA Adult Arrest 231 NA NA
## 5: AA Adult Arrest 231 NA NA
## 6: AA Adult Arrest 231 NA NA
## Crime Code 4 Address
## 1: NA 7700 SKYHILL DR
## 2: NA 7700 SKYHILL DR
## 3: NA 7700 SKYHILL DR
## 4: NA 7700 SKYHILL DR
## 5: NA GRAND AV
## 6: NA PRAIRIE
## Cross Street location latitude longitude
## 1: 34.133, -118.3644 34.133 -118.3644
## 2: 34.133, -118.3644 34.133 -118.3644
## 3: 34.133, -118.3644 34.133 -118.3644
## 4: 34.133, -118.3644 34.133 -118.3644
## 5: 6TH ST 34.0486, -118.2554 34.0486 -118.2554
## 6: RESEDA 34.2391, -118.5361 34.2391 -118.5361
Identify Unique Codes
Which column stores crime information? Inspect all of them in detail. In this dataset, the “Crime Code Description” attribute field seems to include information we can explore in further detail to filter out homicides.
Identify all possible crimes in the code description field using the unique function. Where such information is contained will change depending on the city and specific dataset you’re working with.
unique(LAcrime$`Crime Code Description`)
## [1] "ASSAULT WITH DEADLY WEAPON ON POLICE OFFICER"
## [2] "ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT"
## [3] "ATTEMPTED ROBBERY"
## [4] "BATTERY - SIMPLE ASSAULT"
## [5] "CRIMINAL HOMICIDE"
## [6] "OTHER ASSAULT"
## [7] "ROBBERY"
Subset Data
Let’s subset the data to only include homicides. First, let’s make sure our file is a proper data frame.
LAcrime.df<- as.data.frame(LAcrime)
Next, we subset the data. In this base R form of subsetting, we keep all rows whose crime code description is “CRIMINAL HOMICIDE”. Note that because there are spaces in the column heading for this variable, we had to wrap the column heading in backticks.
# Base R subset:
s1<-subset(LAcrime.df,LAcrime.df$`Crime Code Description`== "CRIMINAL HOMICIDE")
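Here, we pass the subset through rbind and give it a new name. With a single subset this simply returns the same rows as a data frame; rbind becomes useful when several subsets (for example, one per homicide code) need to be ‘stuck’ back together.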
LAcrime.hom<-rbind(s1)
Challenge: This base R approach is successful here, but there are many other ways of subsetting data in R that are more elegant (and don’t require the data frame check or rbind step). As a challenge, explore more options on your own! Hint: check filter from dplyr.
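As a starting point for the challenge, here is a minimal sketch of the dplyr approach on a small made-up table. The object names toy_crime and toy_hom and all values are hypothetical; only the column names mirror the LAPD file.

```r
library(dplyr)

# Toy stand-in for the LAPD table; the values are made up for illustration.
toy_crime <- data.frame(
  `DR Number` = c(1L, 2L, 3L, 4L),
  `Crime Code Description` = c("ROBBERY", "CRIMINAL HOMICIDE",
                               "OTHER ASSAULT", "CRIMINAL HOMICIDE"),
  check.names = FALSE  # keep the spaces in the column names
)

# Backticks are needed because the column name contains spaces.
toy_hom <- dplyr::filter(toy_crime,
                         `Crime Code Description` == "CRIMINAL HOMICIDE")

nrow(toy_hom)  # number of homicide rows in the toy table
```

Calling dplyr::filter with the package prefix avoids any confusion with stats::filter, which the startup messages list as masked.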
Inspect Data
Let’s preview the first six rows of the data subset, and check dimensions. How many homicides were in LA in 2015? Hint: the total number of observations = crimes.
head(s1)
## DR Number Date Reported Date Occurred Time Occurred Area ID Area Name
## 29097 151015594 10/2/15 1/1/15 6 10 West Valley
## 29098 151604033 1/2/15 1/2/15 755 16 Foothill
## 29099 150200508 1/4/15 1/3/15 2045 2 Rampart
## 29100 150604256 1/4/15 1/4/15 2355 6 Hollywood
## 29101 151804154 1/5/15 1/4/15 1900 18 Southeast
## 29102 150104310 1/7/15 1/6/15 1047 1 Central
## Reporting District Crime Code Crime Code Description
## 29097 1047 110 CRIMINAL HOMICIDE
## 29098 1675 110 CRIMINAL HOMICIDE
## 29099 211 110 CRIMINAL HOMICIDE
## 29100 646 110 CRIMINAL HOMICIDE
## 29101 1821 110 CRIMINAL HOMICIDE
## 29102 165 110 CRIMINAL HOMICIDE
## MO Codes Victim Age Victim Sex Victim Descent
## 29097 24 F W
## 29098 0430 1100 1402 1414 27 F B
## 29099 1100 0430 0906 1402 28 M H
## 29100 1100 0430 53 M B
## 29101 0906 1270 0302 0334 0430 1100 1407 28 M H
## 29102 1218 0411 34 M B
## Premise Code Premise Description Weapon Used Code
## 29097 149 RIVER BED* 500
## 29098 127 TRASH CAN/TRASH DUMPSTER 106
## 29099 102 SIDEWALK 106
## 29100 101 STREET 109
## 29101 301 GAS STATION 102
## 29102 102 SIDEWALK 204
## Weapon Description Status Code Status Description Crime Code 1
## 29097 UNKNOWN WEAPON/OTHER WEAPON IC Invest Cont 110
## 29098 UNKNOWN FIREARM IC Invest Cont 110
## 29099 UNKNOWN FIREARM AA Adult Arrest 110
## 29100 SEMI-AUTOMATIC PISTOL AA Adult Arrest 110
## 29101 HAND GUN IC Invest Cont 110
## 29102 FOLDING KNIFE AA Adult Arrest 110
## Crime Code 2 Crime Code 3 Crime Code 4
## 29097 NA NA NA
## 29098 998 NA NA
## 29099 998 NA NA
## 29100 998 NA NA
## 29101 NA NA NA
## 29102 NA NA NA
## Address
## 29097 BALBOA BL
## 29098 9100 DE GARMO AV
## 29099 600 N ALEXANDRIA AV
## 29100 1600 N CAHUENGA BL
## 29101 9900 S HOOVER ST
## 29102 600 WALL ST
## Cross Street location latitude longitude
## 29097 S VICTORY BL 34.1775, -118.5088 34.1775 -118.5088
## 29098 34.2346, -118.3741 34.2346 -118.3741
## 29099 34.0813, -118.2981 34.0813 -118.2981
## 29100 34.0998, -118.3295 34.0998 -118.3295
## 29101 33.9464, -118.2869 33.9464 -118.2869
## 29102 34.0433, -118.2488 34.0433 -118.2488
Save Cleaned Data
Finally, let’s write this subset to a CSV of homicide data in LA from 2015 to archive our data as we’ve wrangled it.
write.csv(LAcrime.hom,"LAcrime_hom.csv")
Convert CSV to Points
While our CSV file includes location information, it is still not spatial data because the spatial dimension has not been enabled. Let’s enable it.
First, identify which long/lat fields we are using.
Identify X,Y locations
Coordinate information is stored as columns “longitude” and “latitude” (or X,Y coordinates) corresponding to:
glimpse(LAcrime.hom[,c("longitude","latitude")])
## Rows: 282
## Columns: 2
## $ longitude <chr> "-118.5088", "-118.3741", "-118.2981", "-118.3295", "-118.2…
## $ latitude <chr> "34.1775", "34.2346", "34.0813", "34.0998", "33.9464", "34.…
Now we convert the crimes to points using the long/lat fields, and assign the WGS84 geographic coordinate reference system (EPSG:4326) based on our “best guess” of the actual CRS.
Remember that long = x, lat = y! If you mix these up, your points will end up on the other side of the world.
Even if we want to convert to another CRS later, we must first “respect” the CRS that the long/lat data is currently in. We will use the st_as_sf function from the sf package for the conversion, after first checking the coordinate fields.
Check long/lat structure
Let’s check the data structure of long/lat to see whether they are numeric:
str(LAcrime.hom[,c("longitude", "latitude")])
## 'data.frame': 282 obs. of 2 variables:
## $ longitude: chr "-118.5088" "-118.3741" "-118.2981" "-118.3295" ...
## $ latitude : chr "34.1775" "34.2346" "34.0813" "34.0998" ...
They are stored as character strings; we need numeric values. A quick online search shows multiple ways to convert data structures in R. We will use the as.numeric function to convert these fields.
LAcrime.hom$latitude <- as.numeric(LAcrime.hom$latitude)
LAcrime.hom$longitude <- as.numeric(LAcrime.hom$longitude)
## Warning: NAs introduced by coercion
We get a warning stating that “NAs introduced by coercion.” That suggests at least one observation is missing a usable long or lat value. Such observations will not be included in the final analysis, and we can’t convert to a spatial data format until we remove them. To see this, run the following:
LAcrime.pts <- st_as_sf(LAcrime.hom, coords = c("longitude","latitude"), crs = 4326)
## Error in st_as_sf.data.frame(LAcrime.hom, coords = c("longitude", "latitude"), : missing values in coordinates not allowed
If we run this expression as is, we get an error that “missing values in coordinates not allowed.”
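To see where such NAs come from, it can help to compare the raw strings with their converted values. The sketch below uses a made-up vector (lon_raw, with one deliberately malformed entry); it does not reproduce the actual faulty value in the LAPD file.

```r
# Toy coordinate strings; the third value is deliberately malformed.
lon_raw <- c("-118.5088", "-118.3741", "-118.5.", "-118.3295")

# suppressWarnings() hides the "NAs introduced by coercion" warning.
lon_num <- suppressWarnings(as.numeric(lon_raw))

# Positions that were present as text but failed to convert.
bad <- which(is.na(lon_num) & !is.na(lon_raw))
bad           # index of the failing value
lon_raw[bad]  # the original string, for inspection
```

Running the same comparison on a copy of the character columns, before overwriting them, would show exactly which raw entries could not be parsed.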
ID Missing Data
Let’s see if we can identify which observation(s) are faulty. We’ll use the subset() function to see which crimes have an NA value, using the is.na() function to check for missing values in the long or lat fields:
LAcrime.hom.na <- subset(LAcrime.hom, is.na(longitude) | is.na(latitude))
glimpse(LAcrime.hom.na) # 1 observation
## Rows: 1
## Columns: 28
## $ `DR Number` <int> 151716991
## $ `Date Reported` <chr> "10/4/15"
## $ `Date Occurred` <chr> "10/4/15"
## $ `Time Occurred` <int> 2100
## $ `Area ID` <int> 17
## $ `Area Name` <chr> "Devonshire"
## $ `Reporting District` <int> 1796
## $ `Crime Code` <int> 110
## $ `Crime Code Description` <chr> "CRIMINAL HOMICIDE"
## $ `MO Codes` <chr> "1309 0554 0416 1402"
## $ `Victim Age` <int> 35
## $ `Victim Sex` <chr> "M"
## $ `Victim Descent` <chr> "B"
## $ `Premise Code` <int> 101
## $ `Premise Description` <chr> "STREET"
## $ `Weapon Used Code` <int> 307
## $ `Weapon Description` <chr> "VEHICLE"
## $ `Status Code` <chr> "AA"
## $ `Status Description` <chr> "Adult Arrest"
## $ `Crime Code 1` <int> 110
## $ `Crime Code 2` <int> 998
## $ `Crime Code 3` <int> NA
## $ `Crime Code 4` <int> NA
## $ Address <chr> "16900 NAPA ST"
## $ `Cross Street` <chr> ""
## $ location <chr> "34.2267, -118.5"
## $ latitude <dbl> 34.2267
## $ longitude <dbl> NA
Remove NA values
In this case we have one observation that seems to have been incorrectly coded; while location information is present, the longitude value is empty. We could assign the location value, but in this case, we will be extra cautious and remove the observation. Here we just grab the long/lat fields and the unique ID.
LAcrime.hom2 <- na.omit(LAcrime.hom[,c("DR Number","longitude", "latitude")])
str(LAcrime.hom2)
## 'data.frame': 281 obs. of 3 variables:
## $ DR Number: int 151015594 151604033 150200508 150604256 151804154 150104310 151204583 150504595 151804468 151204896 ...
## $ longitude: num -119 -118 -118 -118 -118 ...
## $ latitude : num 34.2 34.2 34.1 34.1 33.9 ...
## - attr(*, "na.action")= 'omit' Named int 225
## ..- attr(*, "names")= chr "29321"
We use the str() function to confirm we have one fewer observation, for a total of 281 (out of 282).
Challenge: How would you get all the variables, minus the NA’s in long/lat?
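One possible approach to this challenge, sketched on a made-up table, is to build a logical row filter from the two coordinate columns with complete.cases and apply it to the full data frame. The names toy_hom, keep, and toy_hom_clean are hypothetical.

```r
# Toy homicide table with one missing longitude; values are made up.
toy_hom <- data.frame(
  id = 1:4,
  weapon = c("HAND GUN", "VEHICLE", "UNKNOWN FIREARM", "FOLDING KNIFE"),
  longitude = c(-118.5088, NA, -118.2981, -118.3295),
  latitude = c(34.1775, 34.2267, 34.0813, 34.0998)
)

# TRUE for rows where both coordinate columns are present.
keep <- complete.cases(toy_hom[, c("longitude", "latitude")])

# Keep every column, but only the rows with usable coordinates.
toy_hom_clean <- toy_hom[keep, ]
dim(toy_hom_clean)
```

Unlike na.omit on the whole table, this only drops rows for missing coordinates, so NA values in unrelated columns (such as the extra crime code fields) do not remove observations.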
Convert & Inspect
Now we can successfully convert the data frame into a spatial data frame.
LAcrime.pts <- st_as_sf(LAcrime.hom2, coords = c("longitude","latitude"), crs = 4326)
Let’s plot our points to make sure they look like LA:
plot(LAcrime.pts)
Again, if these points were not plotting correctly, you would need to check: (1) if you specified long/lat correctly or if they were flipped by accident, and (2) if the CRS you used was in fact the real CRS of the coordinates.
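Besides plotting, a quick numeric check of the bounding box can reveal flipped coordinates. This is a small sketch on made-up points (toy and toy_pts are hypothetical names), assuming the sf package is installed.

```r
library(sf)

# Three made-up points in the Los Angeles area.
toy <- data.frame(
  id = 1:3,
  longitude = c(-118.5088, -118.3741, -118.2981),
  latitude = c(34.1775, 34.2346, 34.0813)
)

# coords is given as c(x, y), i.e. longitude first, then latitude.
toy_pts <- st_as_sf(toy, coords = c("longitude", "latitude"), crs = 4326)

# The bounding box should fall in a plausible range for Los Angeles:
# x (longitude) near -118, y (latitude) near 34.
bb <- st_bbox(toy_pts)
bb
```

If the x range came back near 34 and the y range near -118, the two coordinate columns were most likely swapped in the coords argument.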
Save Clean Data
We now have a subset of crime data for Los Angeles in 2015 that only includes homicides, recorded as a CSV, and now as a spatial point data frame. We’ll write the homicide data with all features available to a shapefile for archiving. Uncomment and run.
#st_write(LAcrime.pts,"LAcrime_hom.shp")
Rinse and Repeat
Next, do the same for the NYC dataset. What crime code description did you use? How many total homicides were there in NYC in 2015? Were there any NA values you had to deal with in the lat/long fields? Save the cleaned NYC homicide dataset as a CSV and the cleaned NYC points as a SHP.
Standardize CRS
Load & Inspect
We’ll rename the cleaned crime dataset to make it easier for analysis here. You could also load the new point shapefile you generated instead.
LAcrimes<-LAcrime.pts
Next load the LA tract shapefile, as provided in the lab materials.
LAtracts <- st_read("LAC_Shape.shp")
## Reading layer `LAC_Shape' from data source `/Users/HIPark/Documents/micrometcalf/Intro2GIS/book/LAC_Shape.shp' using driver `ESRI Shapefile'
## Simple feature collection with 1009 features and 12 fields
## geometry type: POLYGON
## dimension: XY
## bbox: xmin: -118.6983 ymin: 33.69692 xmax: -118.149 ymax: 34.34164
## CRS: 4269
Overlay Points & Polygons
We can plot these quickly to ensure they are overlaying correctly. If they are, our coordinate systems are working correctly.
## 1st layer (gets plotted first)
tm_shape(LAtracts) + tm_borders(alpha = 0.4) +
## 2nd layer (overlay)
tm_shape(LAcrime.pts) + tm_dots(size = 0.1, col="red")
Check CRS
Check the Coordinate System/Projection for your data.
st_crs(LAcrimes)
## Coordinate Reference System:
## User input: EPSG:4326
## wkt:
## GEOGCS["WGS 84",
## DATUM["WGS_1984",
## SPHEROID["WGS 84",6378137,298.257223563,
## AUTHORITY["EPSG","7030"]],
## AUTHORITY["EPSG","6326"]],
## PRIMEM["Greenwich",0,
## AUTHORITY["EPSG","8901"]],
## UNIT["degree",0.0174532925199433,
## AUTHORITY["EPSG","9122"]],
## AUTHORITY["EPSG","4326"]]
Are the coordinate systems for crime points and tracts the same?
st_crs(LAtracts)
## Coordinate Reference System:
## User input: 4269
## wkt:
## GEOGCS["NAD83",
## DATUM["North_American_Datum_1983",
## SPHEROID["GRS 1980",6378137,298.257222101,
## AUTHORITY["EPSG","7019"]],
## TOWGS84[0,0,0,0,0,0,0],
## AUTHORITY["EPSG","6269"]],
## PRIMEM["Greenwich",0,
## AUTHORITY["EPSG","8901"]],
## UNIT["degree",0.0174532925199433,
## AUTHORITY["EPSG","9122"]],
## AUTHORITY["EPSG","4269"]]
If they match, we are ready for a point-in-polygon (PIP), or spatial join, operation. R is very finicky about wanting an identical CRS specification. Since the two do not match exactly by R standards (WGS84 for the points, NAD83 for the tracts), we need to transform our files into the same CRS.
Transform CRS
We’ll use the LAtracts CRS as our main CRS. We then transform LAcrimes into this CRS using the st_transform function.
CRS.new <- st_crs(LAtracts)
LAcrimes <- st_transform(LAcrimes, CRS.new)
Check the CRS of both datasets again. If they are identical, you’re ready to move on to the next step!
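Rather than comparing the printed output by eye, the two CRS objects can be compared directly. The sketch below uses a single made-up point and assumes sf is installed; pts, poly_crs, and pts_nad83 are hypothetical names.

```r
library(sf)

# One made-up point in WGS84, and the NAD83 CRS used by the tract file.
pts <- st_sfc(st_point(c(-118.25, 34.05)), crs = 4326)
poly_crs <- st_crs(4269)

# Compare before transforming.
st_crs(pts) == poly_crs

# Transform, then compare again.
pts_nad83 <- st_transform(pts, poly_crs)
st_crs(pts_nad83) == poly_crs
```

In the lab itself, the equivalent check would be st_crs(LAcrimes) == st_crs(LAtracts).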
Point-in-Polygon
Our LA Crimes dataset is very small; we just kept the ID per point in case we need to merge more information from the raw data file later. Next, we need to identify which census tract each crime occurred in.
glimpse(LAcrimes)
## Rows: 281
## Columns: 2
## $ `DR Number` <int> 151015594, 151604033, 150200508, 150604256, 151804154, 15…
## $ geometry <POINT [°]> POINT (-118.5088 34.1775), POINT (-118.3741 34.2346…
Spatial Join
First we will spatially join Crimes and Tracts. This operation uses a within predicate to essentially “stick” all attributes from census tracts to the crime data file, based on the spatial location of crimes within tracts.
crime_in_tract <- st_join(LAcrimes, LAtracts, join = st_within)
## although coordinates are longitude/latitude, st_within assumes that they are planar
glimpse(crime_in_tract)
## Rows: 281
## Columns: 14
## $ `DR Number` <int> 151015594, 151604033, 150200508, 150604256, 151804154, 15…
## $ STATEFP10 <chr> "06", "06", "06", "06", "06", "06", "06", "06", "06", "06…
## $ COUNTYFP10 <chr> "037", "037", "037", "037", "037", "037", "037", "037", "…
## $ TRACTCE10 <chr> "139001", "121102", "192610", "190700", "240402", "206300…
## $ GEOID10 <chr> "06037139001", "06037121102", "06037192610", "06037190700…
## $ NAME10 <chr> "1390.01", "1211.02", "1926.10", "1907", "2404.02", "2063…
## $ NAMELSAD10 <chr> "Census Tract 1390.01", "Census Tract 1211.02", "Census T…
## $ MTFCC10 <chr> "G5020", "G5020", "G5020", "G5020", "G5020", "G5020", "G5…
## $ FUNCSTAT10 <chr> "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S…
## $ ALAND10 <dbl> 1239788, 7458077, 415408, 642502, 601299, 616657, 742058,…
## $ AWATER10 <dbl> 0, 82268, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ INTPTLAT10 <chr> "+34.1754113", "+34.2375326", "+34.0813650", "+34.0986626…
## $ INTPTLON10 <chr> "-118.5117180", "-118.3841883", "-118.2961539", "-118.333…
## $ geometry <POINT [°]> POINT (-118.5088 34.1775), POINT (-118.3741 34.2346…
Note that the st_join function assumed planar coordinates, though our actual CRS is geographic (longitude/latitude), not planar. For our purposes, because the geographic space of LA is relatively small (compared to the surface of the Earth), it will be okay. For larger areas or to be more precise, you would need to research, identify, and transform all files into a suitable projected CRS.
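For readers who want to try the more precise route, here is a minimal sketch that projects both layers before joining. The toy square, the made-up tract ID, and the choice of EPSG:3310 (NAD83 / California Albers) are all assumptions for illustration, not the study's method.

```r
library(sf)

# Toy tract: a small square in lon/lat (NAD83), with a made-up tract ID.
sq <- st_polygon(list(rbind(
  c(-118.30, 34.00), c(-118.20, 34.00),
  c(-118.20, 34.10), c(-118.30, 34.10),
  c(-118.30, 34.00)
)))
tracts <- st_sf(TRACTCE10 = "000001", geometry = st_sfc(sq, crs = 4269))

# Toy crimes: one point inside the square, one outside it.
crimes <- st_as_sf(
  data.frame(id = 1:2, lon = c(-118.25, -118.50), lat = c(34.05, 34.20)),
  coords = c("lon", "lat"), crs = 4269
)

# EPSG:3310 is an assumed choice of planar CRS for California.
tracts_p <- st_transform(tracts, 3310)
crimes_p <- st_transform(crimes, 3310)

# Same join as in the lab, now on planar coordinates.
joined <- st_join(crimes_p, tracts_p, join = st_within)
joined$TRACTCE10  # tract ID for each point; NA when no tract contains it
```

Because both layers are transformed to the same projected CRS before the join, the planar-assumption message should no longer apply.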
Count Crimes per Tract
Next, we’ll count all crimes by tract using the table function. There are many ways to do this operation in R. Inspect the output.
crime_tract_count <- as.data.frame(table(crime_in_tract$TRACTCE10))
glimpse(crime_tract_count)
## Rows: 206
## Columns: 2
## $ Var1 <fct> 101300, 104108, 104203, 104310, 104320, 104401, 104404, 104701, …
## $ Freq <int> 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1…
As a challenge, explore more options on your own using online R resources. See if you can identify ways to do this using dplyr functions.
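One dplyr option is count, sketched here on a made-up table of joined crimes (toy_join and toy_counts are hypothetical names, and the tract IDs are reused only as example values).

```r
library(dplyr)

# Toy stand-in for the joined table: one row per crime.
toy_join <- data.frame(
  id = 1:5,
  TRACTCE10 = c("101300", "104701", "104701", "104108", "101300")
)

# count() tallies rows per tract; name = sets the count column's name,
# which would make a separate renaming step unnecessary.
toy_counts <- dplyr::count(toy_join, TRACTCE10, name = "CrimeCt")
toy_counts
```

With the real crime_in_tract object, which carries point geometry, it is probably simplest to drop the geometry first with st_drop_geometry from sf so that the result is a plain table.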
Rename Column Names
We will next rename column names.
names(crime_tract_count) <- c("TRACTCE10","CrimeCt")
glimpse(crime_tract_count)
## Rows: 206
## Columns: 2
## $ TRACTCE10 <fct> 101300, 104108, 104203, 104310, 104320, 104401, 104404, 104…
## $ CrimeCt <int> 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1,…
Merge Data
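Now we can merge our count table back to our master LAtracts spatial file. We will use the common key “TRACTCE10” to merge these. By default, merge keeps only rows whose key appears in both tables, so tracts with no recorded homicides are dropped from the result. Inspect.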
LAtracts_new <- merge(LAtracts, crime_tract_count, by="TRACTCE10")
glimpse(LAtracts_new)
## Rows: 206
## Columns: 14
## $ TRACTCE10 <chr> "101300", "104108", "104203", "104310", "104320", "104401"…
## $ STATEFP10 <chr> "06", "06", "06", "06", "06", "06", "06", "06", "06", "06"…
## $ COUNTYFP10 <chr> "037", "037", "037", "037", "037", "037", "037", "037", "0…
## $ GEOID10 <chr> "06037101300", "06037104108", "06037104203", "06037104310"…
## $ NAME10 <chr> "1013", "1041.08", "1042.03", "1043.10", "1043.20", "1044.…
## $ NAMELSAD10 <chr> "Census Tract 1013", "Census Tract 1041.08", "Census Tract…
## $ MTFCC10 <chr> "G5020", "G5020", "G5020", "G5020", "G5020", "G5020", "G50…
## $ FUNCSTAT10 <chr> "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S"…
## $ ALAND10 <dbl> 2580401, 1096680, 707955, 1508623, 1212162, 681115, 539620…
## $ AWATER10 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 7144, 12394, 0, 3298, 0, 0, 0, …
## $ INTPTLAT10 <chr> "+34.2487776", "+34.2731497", "+34.2790581", "+34.2763731"…
## $ INTPTLON10 <chr> "-118.2709988", "-118.3985703", "-118.4115543", "-118.4292…
## $ CrimeCt <int> 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1…
## $ geometry <POLYGON [°]> POLYGON ((-118.2667 34.2484..., POLYGON ((-118.399…
Our new spatially calculated variable “CrimeCt” is successfully added to our master spatial file.
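If tracts without homicides should stay in the file with a count of zero, a left join is one option. The sketch below shows the idea on plain made-up data frames (toy_tracts, toy_counts, inner, and full are hypothetical names); it is an illustration, not the approach used in the study.

```r
# Toy tract table (three tracts) and toy counts (only two tracts have crimes).
toy_tracts <- data.frame(TRACTCE10 = c("101300", "104108", "104203"))
toy_counts <- data.frame(TRACTCE10 = c("101300", "104203"),
                         CrimeCt = c(1L, 2L))

# Default merge: only tracts present in both tables are kept.
inner <- merge(toy_tracts, toy_counts, by = "TRACTCE10")

# all.x = TRUE keeps every tract; tracts without a match get NA ...
full <- merge(toy_tracts, toy_counts, by = "TRACTCE10", all.x = TRUE)
# ... which can then be recoded to zero.
full$CrimeCt[is.na(full$CrimeCt)] <- 0L

nrow(inner)
full
```

Keeping zero-count tracts matters for any later rate or accessibility calculation, since dropping them would leave gaps in the map and in the tract-level table.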
Conclusions
Visualize Output
We’ll use tmap to quickly plot our map.
tm_shape(LAtracts_new) + tm_fill("CrimeCt", n = 4, palette = "BuPu", title = "LA Homicides in 2015")
Next, let’s generate an interactive map. Change the tmap mode to “view” – note that it was in the “plot” mode by default.
tmap_mode("view")
## tmap mode set to interactive viewing
Now input the same code to map as before, and explore!
tm_shape(LAtracts_new) + tm_fill("CrimeCt", n = 4, palette = "BuPu", title = "LA Homicides in 2015")
Save your Shapefile
Save the new shapefile you made using st_write.
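A minimal sketch of the save step on a made-up spatial object, assuming sf is installed. The object toy_sf and the temporary output path are hypothetical; in the lab you would pass LAtracts_new and a file name of your choice.

```r
library(sf)

# Two made-up points carrying a count column.
toy_sf <- st_as_sf(
  data.frame(CrimeCt = c(1L, 2L),
             lon = c(-118.25, -118.30),
             lat = c(34.05, 34.10)),
  coords = c("lon", "lat"), crs = 4269
)

# Hypothetical output path in a temporary folder.
out_path <- file.path(tempdir(), "toy_crime_counts.shp")

# delete_dsn = TRUE replaces an existing file of the same name.
st_write(toy_sf, out_path, delete_dsn = TRUE, quiet = TRUE)
file.exists(out_path)
```

Shapefile field names are limited to 10 characters, so short column names such as CrimeCt and TRACTCE10 survive the export unchanged.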
Interpretations
There is some apparent clustering of homicides in the central part of LA in 2015, and to a lesser extent in the northern section of the city. However, most tracts had no homicides, and the total number for the year remains relatively small. In the next phase of analysis, this number will be used with additional data resources to better evaluate disparities in trauma hospital accessibility.
Compare Across Cities
Using the spatial data file you generated for NYC in the previous section, attempt to repeat the PIP operation with the NYC tract dataset. Are there any new errors that come up? Visualize your final NYC dataset showing the count of homicides by tract. Save as a shapefile. Revisit your interpretations.