On the heels of the new assessment data we can start to slice the data in different ways to understand various aspects of the real estate landscape in Vancouver. The fact that Vancouver Open Data makes historic data available gives the ability to look for changes over time.
Our maps explore this by visualizing some aspects of these changes for all properties, but it might also be useful to filter the properties we show to focus in on specific criteria.
“Teardowns” always trigger lots of emotions in Vancouver. Leaving the emotional side out and trying to avoid any judgement, we will investigate the data to understand what buildings have been torn down recently and predict which buildings will get torn down next. And map them. Long story short, we predict that 1 in 6 buildings on this map (and then some more with lower teardown probability) will get torn down and rebuilt by 2026.
Building age temporal distribution
To start understanding teardowns and rebuilds let’s look at the age of the building stock.
To get a better overview of the building stock through time we can graph the number of buildings by age. We look at buildings, not units. So a stratified building with 100 units would still count as one building in our graph. And it is not looking at how many buildings were built in each year, but how many buildings that were built in a given year are still standing today.
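As a rough illustration of the "buildings, not units" counting described above, here is a minimal Python sketch. The field names `land_coordinate` and `year_built` and the sample records are assumptions made for this example, not the actual column names or data from the city's dataset.

```python
from collections import Counter

def buildings_by_year(records):
    """Count surviving buildings per "year built", one count per building.

    `records` is an iterable of dicts with the hypothetical keys
    "land_coordinate" and "year_built". Strata units that share a land
    coordinate are collapsed into a single building.
    """
    year_of = {}
    for rec in records:
        year = rec.get("year_built")
        if year is None:
            continue  # no usable year, skip the record
        # Keep the first year seen for each land coordinate.
        year_of.setdefault(rec["land_coordinate"], year)
    return Counter(year_of.values())

# Made-up sample: two strata units in one building, plus three other lots.
sample = [
    {"land_coordinate": "A1", "year_built": 1992},
    {"land_coordinate": "A1", "year_built": 1992},  # second unit, same building
    {"land_coordinate": "B2", "year_built": 1992},
    {"land_coordinate": "C3", "year_built": 1955},
    {"land_coordinate": "D4", "year_built": None},
]
counts = buildings_by_year(sample)
```

Here the two strata units at `A1` count once, so 1992 ends up with two buildings rather than three.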
We still have 7 buildings in Vancouver that were built before 1900 (the earliest from 1800). Skipping these we graph the rest to get:
Starting with 1950 the distribution of buildings by age is quite uniform, with a short peak around the early 1990s.
The dip at the end is due to some lag in new buildings showing up in the property dataset. Looking at the more recent history it is safe to assume that the number of buildings still standing corresponds well to the number of buildings built in that year. So the pace of new buildings right now seems to fit in quite well with the recent history and is a little lower than the peak in the early 1990s.
Recent Building stock (and recent teardowns) spatial distribution
The next step is to focus on the spatial distribution of recent redevelopment by filtering out older buildings. Being too lazy to add a brush for dynamic selection of time ranges I just made a static (in time) view only showing the 6883 properties built after 2006. It is quite safe to assume that most of those new buildings replaced older ones that were torn down. So this map of new buildings is also a map of locations of buildings that were torn down in the last 10 years.
What’s interesting is that when selecting the relative building value view, some recently re-developed properties show up with incredibly low building value, like the property at 5649 Dunbar St. This gives a window into some of the imperfections of the BC Assessment process, where the building value after re-development is not properly reflected in their dataset. In this case it seems to be a property whose only “improvement” is the pavement on it.
It also shows that recent building (or teardown) activity is fairly uniform across the city, with only some areas standing out as having little development like the West End, parts of Kitsilano and Strathcona.
What gets torn down and rebuilt next?
The big question is of course where new buildings get built next. In a built up space like Vancouver there are few sites left where building a new building does not mean tearing down an old one. So another way to ask that question is: What gets torn down next?
Predicting which building will get torn down next is of course impossible. So what we try to do is assign a “teardown probability” to each building.
Let’s first try to understand why a particular building might get torn down as opposed to the one next door. Typically buildings get torn down at the time when they change ownership. So if a building is not sold, it is far less likely to get torn down. So what makes a building more likely to get torn down when it is sold? One hypothesis would be that the value of the building relative to the land should play an important factor. Let’s test this hypothesis using the data.
We take the 2006 tax dataset as a baseline and check how many of the buildings have been torn down by 2016. Refer to the Methodology and Data section at the bottom for the messy details. We only count buildings, so we count a strata lot with 100 units in the same building as one building. Then we use the 2016 dataset to check how many of them are still around, identifying them by their land coordinate and again asking that they be marked as being built no later than 2006.
These criteria capture well what we are looking for, but they are not perfect. As a predictive variable we use the “teardown coefficient”, the building value as a percentage of the total value of the property.
So we sort the properties by their teardown coefficient using the 2006 tax assessment data and we check how each group fares.
First up, a graph of the distribution of buildings in 2006 by their teardown coefficient.
Next up the number of buildings in each category that got torn down and rebuilt by 2016:
We see that our initial hypothesis seems to hold up quite well. The number of buildings that got torn down and rebuilt decreases as the teardown coefficient increases. Remember that we defined the teardown coefficient to be the percentage of the building value out of the total value of the property.
Refer to the methodology and data section for further information on how these numbers were extracted.
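To make the definition of the teardown coefficient concrete, here is a minimal Python sketch. It assumes the total property value is simply land value plus building value; the function name and the example values are made up for illustration.

```python
def teardown_coefficient(land_value, building_value):
    """Building value as a percentage of total property value.

    Assumes total value = land value + building value. Returns None
    when the total is not positive, since the ratio is undefined then.
    """
    total = land_value + building_value
    if total <= 0:
        return None
    return 100.0 * building_value / total

# Made-up example: a lot assessed at 950,000 with a 50,000 building on it.
example = teardown_coefficient(950_000, 50_000)
```

A property like this one, where the building makes up 5% of the total value, sits right at the boundary of the lowest bin used in the graphs.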
To explore this further let’s graph the frequency with which a building in a given teardown coefficient range gets torn down. To keep things cleaner we only plot up to a teardown coefficient of 50%:
We see that the teardown coefficient has high predictive value for a building to be torn down and being rebuilt in the following 10 years. Buildings with a teardown coefficient below 5% have about an 18% chance, and the probability declines exponentially down to zero at a teardown coefficient of about 50%.
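The frequency computation behind a graph like this can be sketched in a few lines of Python. The 5-point bin width, the input shape (pairs of coefficient and a torn-down flag) and the sample values are assumptions for this illustration, not the original scripts or data.

```python
def teardown_frequency(buildings, bin_width=5, max_coeff=50):
    """Share of buildings torn down per teardown-coefficient bin.

    `buildings` is an iterable of (coefficient, was_torn_down) pairs.
    Bin i covers [i * bin_width, (i + 1) * bin_width). Buildings at or
    above `max_coeff` are ignored. Empty bins yield None.
    """
    n_bins = max_coeff // bin_width
    totals = [0] * n_bins
    torn = [0] * n_bins
    for coeff, was_torn_down in buildings:
        if coeff < 0 or coeff >= max_coeff:
            continue
        i = int(coeff // bin_width)
        totals[i] += 1
        if was_torn_down:
            torn[i] += 1
    return [t / n if n else None for t, n in zip(torn, totals)]

# Made-up sample: four buildings below 5%, one of them torn down.
sample = [(2.0, True), (3.0, False), (4.9, False), (4.0, False),
          (12.0, False), (60.0, True)]
freq = teardown_frequency(sample)
```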
If we were more serious about this we would fit an exponential curve to the data and compute how well it fits, repeat the computation for other time frames, run it on individual neighbourhoods and maybe also on data from other municipalities to properly validate our model. We could also refine the model by refining our filters, see the methodology and data section for more details.
And we could add other factors that likely affect the teardown probability, like building age, proximity to arterials and others. Of course these are not independent factors, so this kind of analysis requires more time.
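For anyone who wants to try the exponential fit mentioned above, one simple option is a least-squares line through the logarithm of the frequencies. This is a sketch of that swapped-in technique, not something that was done for this post. The model form p = a * exp(-b * c) and the synthetic test points are assumptions for illustration.

```python
import math

def fit_exponential(points):
    """Fit p = a * exp(-b * c) by least squares on (c, ln p).

    `points` is a list of (coefficient, frequency) pairs. Bins with a
    frequency of zero are skipped, because their logarithm is undefined.
    Returns the pair (a, b).
    """
    usable = [(c, math.log(p)) for c, p in points if p > 0]
    n = len(usable)
    if n < 2:
        raise ValueError("need at least two bins with non-zero frequency")
    mean_c = sum(c for c, _ in usable) / n
    mean_y = sum(y for _, y in usable) / n
    sxx = sum((c - mean_c) ** 2 for c, _ in usable)
    sxy = sum((c - mean_c) * (y - mean_y) for c, y in usable)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_c
    return math.exp(intercept), -slope

# Synthetic points generated from a = 0.2, b = 0.1, purely to exercise the fit.
synthetic = [(c, 0.2 * math.exp(-0.1 * c)) for c in (2.5, 7.5, 12.5, 17.5)]
a, b = fit_exponential(synthetic)
```

Fitting in log space weights small frequencies more heavily than a direct non-linear fit would, so the result is best treated as a first look rather than a validated model.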
Now to the main part: Predicting teardowns. How many buildings will get torn down and rebuilt in the next 10 years? Let’s use what we have just learned to extrapolate.
First up the graph of the 2016 building stock by teardown coefficient:
To estimate how many buildings will get torn down and rebuilt in each category we simply multiply each bin with the teardown probability from the frequency graph above:
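The bin-by-bin multiplication described above amounts to the following Python sketch. The stock counts and probabilities in the example are placeholders chosen for easy arithmetic, not the values from the graphs.

```python
def predict_teardowns(stock_by_bin, probability_by_bin):
    """Expected teardowns per bin: building count times teardown probability."""
    if len(stock_by_bin) != len(probability_by_bin):
        raise ValueError("bins must line up")
    return [n * p for n, p in zip(stock_by_bin, probability_by_bin)]

# Placeholder numbers for three bins, for illustration only.
example_stock = [1000, 500, 200]
example_probability = [0.18, 0.10, 0.05]
per_bin = predict_teardowns(example_stock, example_probability)
total = sum(per_bin)  # 180 + 50 + 10 expected teardowns
```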
Bottom line, we predict around 8,000 buildings to be torn down and rebuilt by 2026. That’s significantly more than the around 5,900 buildings that we identified as going through this process during the prior 10 years.
There are lots of assumptions that went into this estimate. While we are confident in our analysis that properties with low teardown coefficient are the ones most likely to be torn down, it is less clear if the number of properties being torn down grows linearly as the number of properties with low teardown coefficient grows. In our case the number of properties with teardown coefficient below 5% grew from 20492 (21% of the 2006 stock) to 32509 (33.5% of the 2016 stock), which may be out of the range where our simplistic extrapolation holds. One could try to understand this by carefully analyzing all available tax years, and not just the two extremes of the available spectrum.
Now that we understand how to assign a teardown probability to buildings, let’s map them! To keep things as simple as possible let’s focus in on the homes with a teardown coefficient below 5%. They make up the bulk in our prediction and have the simple interpretation that a little more than 1 in 6 of these will get replaced by something else by 2026. So here is the interactive map of just these 31301 buildings, where we have filtered out some parks, marinas and rail lines. The map only accounts for the roughly 5,700 predicted teardowns with a teardown coefficient below the 5% cutoff and neglects the roughly 2,000 more that are predicted to be torn down with a teardown coefficient above 5%.
Methodology and Data
Only for people who love getting their hands dirty or who want to reproduce or expand on the analysis.
The first thing to note is that there is no way to directly detect “teardowns” in the dataset; the only way is to look at what has been rebuilt and what has ‘dropped off’. To be more precise, the data fields to look at are the “land coordinate”, which links a taxable property to a physical structure, and the “year built”. And both fields have problems.
The “land coordinate” gets de-commissioned and re-assigned during certain re-developments. And the city dataset provides no way to link the old one to the new one. One way to do that would be through the polygons that mark the property boundaries, which would allow tracking of complex re-assemblies of land. But the city does not publish historic records of property polygons.
The “year built” also has lots of issues. Sometimes it is blank even though the record shows the value of the building as greater than zero. Sometimes the “year built” will be set to a date later than the date of the dataset, for example the 2006 tax dataset has buildings with “year built” all the way up to 2013.
Then comes the issue of filtering. We decided to filter out parks, rail lines and marinas without structures on them. The algorithm is somewhat simplistic, it’s the same one that was used to filter properties for the maps. Additionally we filter out properties from the heritage dataset. There is definitely room for improvement here, but without a clear question of what exactly to measure (only single family homes, or also condos or apartments, treat commercial separately, …) it does not make much sense to invest energy into this. After all, this is just looking for a rough model.
So how do we detect rebuilds? We take the land coordinates from properties identified as park or heritage and sieve through the 2006 tax data to retrieve all records that don’t match these land coordinates and have a “year built” column set as 2006 or earlier or don’t have a “year built” set at all but change from zero to non-zero building value from 2006 to 2016.
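One possible reading of these rules, written out as a Python sketch. The record layout, the function name and the exact way the two criteria combine are assumptions made for this example; the original scripts may well differ.

```python
BASELINE_YEAR = 2006

def classify_2006_buildings(tax_2006, tax_2016, excluded):
    """Split 2006 baseline buildings into 'standing' and 'rebuilt'.

    tax_2006 / tax_2016 map a land coordinate to a dict with the
    hypothetical keys "year_built" (int or None) and "building_value".
    `excluded` holds land coordinates flagged as park or heritage.
    Coordinates missing from the 2016 data cannot be traced and are skipped.
    """
    standing, rebuilt = set(), set()
    for coord, old in tax_2006.items():
        if coord in excluded:
            continue
        new = tax_2016.get(coord)
        if new is None:
            continue  # land coordinate was de-commissioned
        old_year, new_year = old["year_built"], new["year_built"]
        old_ok = old_year is not None and old_year <= BASELINE_YEAR
        if old_ok and new_year is not None and new_year > BASELINE_YEAR:
            rebuilt.add(coord)  # same lot, newer building
        elif (new_year is None and old["building_value"] == 0
              and new["building_value"] > 0):
            rebuilt.add(coord)  # no year recorded, but a building appeared
        elif old_ok and new_year is not None:
            standing.add(coord)
    return standing, rebuilt

# Made-up records for illustration.
tax_2006 = {
    "A": {"year_built": 1950, "building_value": 10_000},
    "B": {"year_built": 1980, "building_value": 200_000},
    "C": {"year_built": None, "building_value": 0},
    "D": {"year_built": 1960, "building_value": 5_000},
    "P": {"year_built": 1970, "building_value": 1_000},
}
tax_2016 = {
    "A": {"year_built": 2010, "building_value": 900_000},
    "B": {"year_built": 1980, "building_value": 250_000},
    "C": {"year_built": None, "building_value": 400_000},
    "P": {"year_built": 2012, "building_value": 50_000},
}
standing, rebuilt = classify_2006_buildings(tax_2006, tax_2016, excluded={"P"})
```

In this sample, `D` drops out because its land coordinate no longer exists in 2016, which is exactly the kind of case the dataset gives no way to trace.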
Pretty messy. We mapped about 6,900 buildings that were built after 2006, but only traced 5,869 buildings in the 2006 tax dataset as being torn down and rebuilt. That difference is largely explained by different selection criteria. The map only considers properties with a “year built” field set, but for the analysis we also added properties that don’t have that field set but go from zero building value in 2006 to non-zero building value in 2016, which gets us to 7,784 “rebuilds”. On the other hand, in the analysis we don’t consider the roughly 140 heritage buildings that would pass our filter of being built after 2006, and the 2016 tax dataset has 2,422 more buildings than the 2006 dataset, some of which can be seen on this map and are due to subdivisions being split off of the original property.
Anyway, if you want to get your hands dirty on this, shoot me a message and I will hook you up with my scripts.