Almost three years ago I ran the numbers to identify “Vancouver’s most lucrative fire hydrant”.
As a card-carrying Shoupista, I figure it’s high time for an update. Looking back, I can’t help but notice how much my approach to data analysis, even about such trivial things as parking tickets, has changed since then. Back then I scripted makeshift analysis in a general-purpose language. Nowadays I work in R or Python and am much more structured in my approach, with an emphasis on reproducibility and transparency. That covers the entire pipeline, from data acquisition to data cleaning, analysis and visualization.
In this case we are working with City of Vancouver Open Data, and data acquisition happens through my relatively new VancouvR package, which ties into the new City of Vancouver Open Data API and is now on CRAN. Usually I hide the code from the post and make it available on GitHub in case people are interested, but this time I am leaving some of the code blocks visible to showcase the VancouvR package, give people an idea of what it looks like, and advertise for more people to come work with Vancouver data.
First up, let’s check for parking ticket related datasets.
    library(tidyverse)
    library(VancouvR)

    ticket_datasets <- search_cov_datasets("parking tickets")
    ticket_datasets %>%
      select(dataset_id, title) %>%
      pretty_table()
| dataset_id | title |
|---|---|
| parking-tickets-2017-2019 | Parking tickets 2017-2019 |
| parking-tickets-2010-2013 | Parking tickets 2010-2013 |
| parking-tickets-2014-2016 | Parking tickets 2014-2016 |
Before accessing the data it’s often useful to take a peek at the metadata to get an overview of what to expect from the dataset.
    get_cov_metadata(ticket_datasets$dataset_id %>% first) %>%
      pretty_table()
| name | type | label | description |
|---|---|---|---|
| block | int | Block | Block level of the street where the infraction occurred. |
| street | text | Street | Name of the street where the infraction occurred |
| entrydate | date | EntryDate | Date the infraction occurred |
| bylaw | int | Bylaw | Specific parking bylaw which the parking ticket was issued under |
| section | text | Section | Specific section of the bylaw which the infraction pertains |
| status | text | Status | Status of the parking ticket. – CA = One time courtesy cancellation (no longer exists), IS = Issued, RA = Cancelled due to Paid by Phone, VA = Void, VR = Void request, VS = Auto-void, WR = Warning |
| infractiontext | text | InfractionText | Description of the infraction |
| year | text | Year | Year the infraction occurred |
We are only interested in parking tickets related to fire hydrants. One great feature of the new City of Vancouver Open Data API is that we can compute some basic summary statistics on their server, greatly simplifying and speeding up the analysis. The API accepts an SQL-like dialect that allows us to specify basic where and group_by clauses, as well as select statements, which in our R package default to simply counting the number of rows.
    agg <- aggregate_cov_data("parking-tickets-2017-2019",
                              where = "infractiontext LIKE 'FIRE'",
                              group_by = "bylaw,section,infractiontext,status")
    agg %>% pretty_table()
| infractiontext | status | section | bylaw | count |
|---|---|---|---|---|
| STOP WITHIN 5 METRES OF A FIRE HYDRANT | IS | 17.2(C) | 2849 | 9510 |
| STOP WITHIN 5 METRES OF A FIRE HYDRANT | VA | 17.2(C) | 2849 | 1765 |
| STOP WITHIN 5 METRES OF A FIRE HYDRANT | WR | 17.2(C) | 2849 | 14 |
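For intuition, the server-side aggregation is roughly what a grouped row count would look like if we had the raw rows locally. Here is a minimal sketch on a handful of made-up rows. The toy data, the placeholder "OTHER INFRACTION" row and the substring match via `grepl` are my own assumptions for illustration; I am not claiming the API's LIKE operator behaves exactly like a substring match.

```r
library(dplyr)

# Toy stand-in for raw ticket rows; the rows are invented for illustration.
tickets <- tribble(
  ~bylaw, ~section,  ~infractiontext,                          ~status,
  2849,   "17.2(C)", "STOP WITHIN 5 METRES OF A FIRE HYDRANT", "IS",
  2849,   "17.2(C)", "STOP WITHIN 5 METRES OF A FIRE HYDRANT", "IS",
  2849,   "17.2(C)", "STOP WITHIN 5 METRES OF A FIRE HYDRANT", "VA",
  0,      "0.0",     "OTHER INFRACTION",                       "IS"
)

local_agg <- tickets %>%
  # Rough local analogue of the where clause (assumed substring semantics).
  filter(grepl("FIRE", infractiontext, fixed = TRUE)) %>%
  # Analogue of group_by plus the default row count.
  count(bylaw, section, infractiontext, status, name = "count")

print(local_agg)
```

The point of doing this on the server instead is that only the small summary table travels over the wire, not every ticket.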
We learn that 9,510 tickets that were neither voided nor disputed have been issued for “STOP WITHIN 5 METRES OF A FIRE HYDRANT” in the 2017-2019 dataset. Armed with that knowledge, we now query more detailed data on all these tickets. One hiccup is that the datasets for different time frames are inconsistently formatted; turning off automatic type-casting, which relies on the inconsistent metadata, makes the data easier to work with.
    fire_hydrant_tickets <- ticket_datasets$dataset_id %>%
      lapply(function(ds) get_cov_data(ds,
                                       where = "section = '17.2(C)' and status = 'IS'",
                                       cast_type = FALSE)) %>%
      bind_rows

    fire_hydrant_tickets %>%
      ggplot(aes(x = year)) +
      geom_bar(fill = "steelblue") +
      scale_y_continuous(labels = scales::comma) +
      plot_theme +
      labs(title = "City of Vancouver parking tickets", x = "", y = "Number of issued tickets")
So the City has issued around 3,000 tickets a year for parking within 5 metres of a fire hydrant. The last data entry we have for 2019 is from 2019-09-30, so there is still time for that number to grow. The Parking Bylaw calls for a $100 penalty for parking within 5 metres of a fire hydrant, as measured along the curb from the closest point to the hydrant. So that comes out to about $300k a year in fines for blocking fire hydrants in the City of Vancouver.
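The back-of-the-envelope arithmetic behind that figure, as a tiny sketch. The function name and the `paid_share` knob are hypothetical and only there to make the implicit assumption visible: the $300k estimate assumes every issued ticket is paid at the full $100.

```r
# Rough revenue estimate from ticket counts.
# `paid_share` is a hypothetical parameter: the share of tickets paid at the full fine.
annual_fine_revenue <- function(tickets_per_year, fine = 100, paid_share = 1) {
  stopifnot(paid_share >= 0, paid_share <= 1)
  tickets_per_year * fine * paid_share
}

# Around 3,000 tickets a year at $100 each, all assumed paid in full.
estimate <- annual_fine_revenue(3000)
print(estimate)
```

Lowering `paid_share` shows how quickly the number shrinks if a good share of tickets is paid at a discount or not at all.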
Next up, let’s check the top 5 most heavily ticketed fire hydrants in our 9-year period.
    fire_hydrant_tickets %>%
      count(block, street) %>%
      top_n(5) %>%
      arrange(-n) %>%
      pretty_table()
| block | street | n |
|---|---|---|
| 2100 | 40th Ave W. | 683 |
| 2100 | W 40TH AVE | 483 |
Looking at the list we immediately notice something odd. Numbers 2 and 4 appear to be the same block and street, just written differently. The addresses aren’t properly normalized, so we will have to do some data cleaning first. What a hassle! We hide the code for that behind a function call.
    top_hydrants <- fire_hydrant_tickets %>%
      normalize_addresses() %>%
      count(Address) %>%
      top_n(5)

    top_hydrants %>%
      arrange(-n) %>%
      pretty_table()
| Address | n |
|---|---|
| 2100 W 40TH AV | 1166 |
| 1100 HARO ST | 770 |
| 400 KEEFER ST | 717 |
| 2100 PINE ST | 337 |
| 5600 ORMIDALE ST | 332 |
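To give a flavour of what this kind of cleaning involves, here is a minimal sketch. It is not the hidden `normalize_addresses()` from this post; `normalize_street()` is a hypothetical helper with a few assumed rules (upper-case, drop periods, move a trailing direction letter to the front, abbreviate AVENUE/AVE to AV) that happen to be enough to merge the two spellings we saw earlier.

```r
library(dplyr)

# Hypothetical helper, NOT the post's normalize_addresses().
normalize_street <- function(street) {
  s <- toupper(street)
  s <- gsub(".", "", s, fixed = TRUE)                       # drop periods
  s <- trimws(gsub("\\s+", " ", s))                         # collapse whitespace
  s <- sub("^(.*) ([NSEW])$", "\\2 \\1", s)                 # "40TH AVE W" -> "W 40TH AVE"
  s <- gsub("\\b(AVENUE|AVE)\\b", "AV", s, perl = TRUE)     # "W 40TH AVE" -> "W 40TH AV"
  s
}

# The two differently spelled rows from the earlier table.
raw_counts <- tibble(
  block  = c("2100", "2100"),
  street = c("40th Ave W.", "W 40TH AVE"),
  n      = c(683, 483)
)

merged <- raw_counts %>%
  mutate(Address = paste(block, normalize_street(street))) %>%
  group_by(Address) %>%
  summarise(n = sum(n))

print(merged)
```

Real address data has many more quirks than this, so a handful of regular expressions like these should be treated as a starting point and checked against the actual street names.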
The clear winner is the hydrant on the 2100 block of W 40TH AVE. Checking Google Street View, there is only one hydrant on the block. It was hard to find for a fitting reason: two cars blocked the view of it.
To see how things have evolved over time, we can check which hydrant topped the list in each year.
    fire_hydrant_tickets %>%
      normalize_addresses() %>%
      count(Address, year) %>%
      group_by(year) %>%
      top_n(1) %>%
      arrange(year, -n) %>%
      pretty_table()
| Address | year | n |
|---|---|---|
| 1100 HARO ST | 2011 | 103 |
| 400 KEEFER ST | 2012 | 121 |
| 400 KEEFER ST | 2013 | 129 |
| 400 KEEFER ST | 2014 | 163 |
| 400 KEEFER ST | 2015 | 112 |
| 2100 W 40TH AV | 2016 | 143 |
| 2100 W 40TH AV | 2017 | 237 |
| 2100 W 40TH AV | 2018 | 347 |
| 5600 ORMIDALE ST | 2019 | 168 |
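The "winner per year" pattern is a grouped top-1 selection. A small sketch of the same idea using `slice_max()`, which newer dplyr versions recommend over `top_n()`. The two winner counts are taken from the table; the runner-up rows are invented so that there is something to beat.

```r
library(dplyr)

yearly <- tribble(
  ~Address,         ~year,  ~n,
  "2100 W 40TH AV", "2017", 237,  # from the table
  "1100 HARO ST",   "2017", 50,   # invented runner-up
  "2100 W 40TH AV", "2018", 347,  # from the table
  "400 KEEFER ST",  "2018", 60    # invented runner-up
)

winners <- yearly %>%
  group_by(year) %>%
  # Keep the single most-ticketed address per year; break ties arbitrarily.
  slice_max(n, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  arrange(year)

print(winners)
```

Setting `with_ties = FALSE` guarantees exactly one row per year, whereas `top_n(1)` keeps all tied rows.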
The winner each year is from our overall top 5 list. Current front runner for 2019 is the one on the 5600 block of Ormidale St, which deserves a closer look.
    fire_hydrant_tickets %>%
      normalize_addresses() %>%
      filter(Address == "5600 ORMIDALE ST") %>%
      mutate(month = strftime(entrydate, "%m")) %>%
      mutate(Date = as.Date(paste0(year, "-", month, "-15"))) %>%
      ggplot(aes(x = Date)) +
      geom_bar(fill = "brown") +
      plot_theme +
      labs(title = "Fire hydrant on the 5600 block of Ormidale St", x = "Month", y = "Number of tickets")
It looks like the hydrant was a fairly low-key affair until 2014, when it dropped off the map, and then took off around 2018. A quick check with Google Street View indicates that it sits in front of a new development. Checking through the timeline, the site shows a house in May 2009, which had been torn down by June 2012, although it was still possible to illegally park in front of it. By May 2014 and July 2014 the neighbouring house is gone too and there is a hole in the ground with heavy machinery digging a foundation, but cars can still illegally park there. In June 2015 and May 2016 it’s a full-on construction site with no options to park illegally any more. August 2017 marks the end of construction and the first people seem to have moved in. And the parking tickets start ramping up, with many more people trying to park on the street now.
Lastly, let’s get a high-level view on all the parking tickets issued throughout the city. But here things get a little ugly as we have to first geocode the blocks, and we will hide the code from now on.
In particular, the fire hydrants that attract lots of tickets should probably be reviewed by the engineering department. While people should pay more attention to where they park, there are some straightforward ways to make things easier. Simply painting the curb red would probably fix this for most hydrants and help keep them free of obstructions and easy to access in case of a fire.
More parking tickets
Fire hydrants are just one way to get a parking ticket. We can of course continue this and see which areas got the most tickets overall, and for what reasons. We will concentrate on the tickets issued from 2017 to 2019.
The distribution of tickets across the city is fairly consistent across years, with total ticket counts peaking in downtown, as well as the central Broadway corridor and in Kits.
To understand these patterns better, it is useful to look at the top reasons parking tickets have been issued.
The presence of parking meters clearly plays a role in parking tickets: 596,944 of the 1,074,220 infractions (about 56%) reference some kind of violation involving a parking meter.
There seems to be a clear relationship between the number of tickets and the number of parking meters in each area. We can normalize the meter-related parking tickets by the number of meters in each area to get the average number of tickets per meter in each of the areas.
This shows a much more uniform pattern, with a clear outlier in Strathcona which might be worth looking into further.
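The normalization itself is just a join and a division. A minimal sketch with invented numbers; the area labels and counts below are made up and are not the Vancouver figures, and the column names are my own choice.

```r
library(dplyr)

# Invented example data: meter-related tickets and meter counts per area.
tickets_by_area <- tibble(area = c("A", "B", "C"), tickets = c(1200, 300, 90))
meters_by_area  <- tibble(area = c("A", "B", "C"), meters  = c(400, 100, 10))

per_meter <- tickets_by_area %>%
  inner_join(meters_by_area, by = "area") %>%
  filter(meters > 0) %>%                            # avoid dividing by zero
  mutate(tickets_per_meter = tickets / meters) %>%
  arrange(desc(tickets_per_meter))

print(per_meter)
```

In this toy example areas A and B look very different in raw counts but identical per meter, while the small area C stands out once we normalize, which is the kind of outlier the per-meter view is meant to surface.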
Now that we have some understanding of meter-related tickets, we can take a look at the remaining non-meter related tickets.
The remaining tickets distribute quite well over a range of categories.
Geographically, highly ticketed blocks cluster in the downtown core and surrounding areas, and spill out along commercial corridors. One can’t help but notice the correlation with meter locations, possibly due to ticketing officers focusing their efforts on those areas.
That’s a wrap for tonight’s quick run-through on how to use our new-ish VancouvR package to easily access Vancouver Open Data. As usual, the code is available for anyone to download and adapt for their own purposes.