Rental data is an important ingredient in understanding the housing market. Unfortunately we, or maybe just I, don’t have good data on this. And the little data we have we, or maybe just I, don’t understand well. So I spent some time making sense of what we’ve got. Here is what I learned.
Rental Data
Ideally we would like to know what people pay for rent, where they live, how much money they make, when they first moved in and signed the contract, the term of their contract, how many bedrooms they have and how large their unit is, what kind of building they live in, …
Sadly we don’t know these things, at least not at the individual tenant level. But we have various sources that tell different parts of the story.
Census
The census comes closest in allowing us to cross-tabulate most of these variables, but it is only administered once every five years (and is only a ~1/4 subsample). Housing data from the 2016 census won’t get released until October 25.
CMHC
CMHC has several surveys that can help us figure out what happens in between the censuses. They keep track of the size and makeup of the rental market, for example tracking purpose built rental construction and the purpose built rental stock, subsidized rentals and “secondary market” rentals. They also keep track of vacancy rates and rents in the purpose-built and secondary markets. Their data comes with a consistent methodology, but with varying timelines and geographic resolution, only sometimes with information on building type and number of bedrooms, and with varying data quality. CMHC rental data measures “stock” rents, that is average or median rents across the existing rental stock.
Rental Listing Services
Rental listing services aren’t strictly speaking a data source, but they can become one if one keeps track of the listings that are posted. This can be done manually or one can try to automate the “scraping” of these listings. These data sources exist, but the legality of “scraping” is not entirely clear, which means people are generally not willing to share data obtained this way. This causes friction in the effort to understand the rental market.
Data obtained this way measures “turnover” rents, so the amount people pay at the time they sign the lease.
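To make the difference between “stock” and “turnover” rents concrete, here is a minimal sketch on a made-up table of tenancies. The data frame, its column names and all numbers are invented for illustration and don’t come from any of the sources above.

```r
# Hypothetical tenancies: year the lease was signed and current monthly rent.
tenancies <- data.frame(
  lease_year = c(2009, 2012, 2015, 2017, 2017),
  rent       = c(1000, 1200, 1400, 1800, 2000)
)

# "Stock" rent: average over all existing tenancies, whenever they started.
stock_rent <- mean(tenancies$rent)

# "Turnover" rent: average over leases signed in the most recent year only.
latest_year   <- max(tenancies$lease_year)
turnover_rent <- mean(tenancies$rent[tenancies$lease_year == latest_year])

stock_rent     # 1480
turnover_rent  # 1900
```

The invented numbers mimic a market with rising rents, where newly signed leases tend to sit above what long-term tenants pay, so the two measures can diverge quite a bit.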
Comparing Sources
Each of these sources comes with its own advantages and limitations. To understand these, let’s check how these different sources relate.
Part 1: Rental Stock
We divide the rental stock into 4 main categories:
- Purpose Built Rental (Primary Market)
- Private Rental (Secondary Market)
- Subsidized Rental
- Vacant Rental
Using CMHC and census data we can get a good estimate of how large each of these segments is.
library(cancensus)
## Loading required package: dplyr
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(cmhc)
rental_data <- function(dataset,level,name){
  # look up the census region matching the given level and name
  regions=list_census_regions(dataset,use_cache = TRUE,quiet=TRUE) %>%
    filter(level==!!level,name==!!name) %>% as_census_region_list
  vectors=c("v_CA11N_2252","v_CA11N_2253","v_CA11N_2254","v_CA11N_2255","v_CA11N_2289")
  census_data <- get_census(dataset=dataset, regions=regions, vectors=vectors, level='Regions', labels="short") %>%
    mutate(tenure=v_CA11N_2252, owner=v_CA11N_2253, renter=v_CA11N_2254, band=v_CA11N_2255, subsidized=round(v_CA11N_2289/100*renter))

  cmhc_stock_params=cmhc_timeseries_params(table_id = cmhc_table_list["Rms Rental Universe Time Series"], region = cmhc_region_params(name,level))
  cmhc_vacancy_params=cmhc_timeseries_params(table_id = cmhc_table_list["Rms Vacancy Rate Time Seris"], region = cmhc_region_params(name,level))
  cmhc_secondary_vacancy_params=cmhc_timeseries_params(table_id = cmhc_table_list["Srms Vacancy Rate Time Seris"], region = cmhc_region_params(name,"CMA")) # can't get CSD level vacancy rates

  cmhc_stock_data <- get_cmhc(cmhc_stock_params) %>% rename(Date=X1) %>% select(-X2)
  cmhc_primary_vacancy_data <- get_cmhc(cmhc_vacancy_params) %>% rename(Date=X1)
  cmhc_secondary_vacancy_data <- get_cmhc(cmhc_secondary_vacancy_params) %>% rename(Date=X1)

  # pick the 2011 values to match the 2011 census data
  pb_stock=filter(cmhc_stock_data,grepl("^2011.*$",Date))$Total
  primary_vacancy=filter(cmhc_primary_vacancy_data,grepl("^2011.*$",Date))$Total/100
  secondary_vacancy=filter(cmhc_secondary_vacancy_data,grepl("^2011.*$",Date))$Total/100
  all=census_data$renter
  primary_stock=round(pb_stock / (1-primary_vacancy))
  secondary_stock=round((all-primary_stock-census_data$subsidized)/(1-secondary_vacancy))
  subsidized_stock=round(census_data$subsidized / (1-primary_vacancy))
  vacant_stock=round((census_data$subsidized+primary_stock)*primary_vacancy+secondary_stock*secondary_vacancy)
  df <- tibble(type=c("Purpose Built Rental","Subsidized Rental","Private Investor Rental"),count=c(primary_stock,subsidized_stock,secondary_stock))
  return (df)
}
df <- rental_data("CA11","CSD","Vancouver")
## Warning in list_census_regions(dataset, use_cache = TRUE, quiet = TRUE):
## Cached regions list may be out of date. Set `use_cache = FALSE` to update
## it.
## Reading vectors data from local cache.
## Warning in strptime(x, fmt, tz = "GMT"): unknown timezone 'default/America/
## Vancouver'
total=sum(df$count)
df$ratio=df$count/total
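For readers without API access, the arithmetic inside rental_data can be followed with a small stand-alone sketch. It mirrors the same formulas, but every input below is invented and is not an actual Vancouver figure; the variable names are chosen for illustration only.

```r
# Made-up inputs, NOT actual census or CMHC values.
renter_households <- 100000  # census: renter households
subsidized_share  <- 0.15    # census: share of renters in subsidized housing
pb_universe       <- 45000   # CMHC: purpose-built rental universe
primary_vacancy   <- 0.01    # CMHC: primary market vacancy rate
secondary_vacancy <- 0.02    # CMHC: secondary market vacancy rate

subsidized <- round(subsidized_share * renter_households)

# Same formulas as in rental_data(): gross each segment up by a vacancy rate.
primary_stock    <- round(pb_universe / (1 - primary_vacancy))
secondary_stock  <- round((renter_households - primary_stock - subsidized) /
                            (1 - secondary_vacancy))
subsidized_stock <- round(subsidized / (1 - primary_vacancy))

stock <- data.frame(
  type  = c("Purpose Built Rental", "Subsidized Rental", "Private Investor Rental"),
  count = c(primary_stock, subsidized_stock, secondary_stock)
)
# Share of each segment in the total, as computed for df above.
stock$ratio <- stock$count / sum(stock$count)
stock
```

The secondary market is obtained as a residual: whatever renter households are left after taking out the purpose-built and subsidized segments. Any error in those two segments therefore ends up in the secondary estimate.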
Part 2: Stock Rents
Part 3: Turnover Rents
Part 4: Vacancy Rates and Stock Rents
Part 5: Stock Rents vs Turnover Rents
We start with CMHC and census data. Both are freely available and both measure “stock” rents. We have convenient access to census data using our cancensus package and to CMHC data using our cmhc package.
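#devtools::install_github("mountainmath/cancensus")
library(cancensus)
#devtools::install_github("mountainmath/cmhc")
library(cmhc)
library(reshape2) # assumed source of the melt() calls below
library(ggplot2)  # for the plots below
currency_format <- scales::dollar # assumption: this helper is not defined in the post, any dollar label formatter works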
Let’s read in the historical average rent data from CMHC.
prep <- function(data) {
  variables=c("Bachelor","1 Bedroom","2 Bedroom","3 Bedroom +","Total","Single","Semi / Row / Duplex","Other- Primarily Accessory Suites")
  data$Date=as.integer(sub("\\s.+$","",data$X1))
  return(data %>% select(c("Date","name",intersect(names(data), variables))))
}

get_median_rent_data <- function(places,geography_type){
  average_primary_rents <- do.call(rbind,lapply(places, function(place_name){
    get_cmhc(cmhc_timeseries_params(table_id = "2.2.11", region=cmhc_region_params(geography = place_name, type=geography_type))) %>%
      mutate(name=place_name) %>% prep
  }))
  average_condo_rents <- do.call(rbind,lapply(places, function(place_name){
    get_cmhc(cmhc_timeseries_params(table_id = "4.4.2", region=cmhc_region_params(geography = place_name, type=geography_type))) %>%
      mutate(name=place_name) %>% prep
  }))
  average_other_secondary_rents <- do.call(rbind,lapply(places, function(place_name){
    get_cmhc(cmhc_timeseries_params(table_id = "4.6.2", region=cmhc_region_params(geography = place_name, type=geography_type))) %>%
      mutate(name=place_name) %>% prep
  }))

  plot_data <- do.call(rbind,list(
    average_primary_rents %>% rename(`Total Primary` = Total) %>% melt(id=c("Date","name")),
    average_condo_rents %>%
      rename(`Bachelor Condo`=`Bachelor`,
             `1 Bedroom Condo`=`1 Bedroom`,
             `2 Bedroom Condo`=`2 Bedroom`,
             `3 Bedroom + Condo`=`3 Bedroom +`,
             `Total Condo` = Total) %>%
      melt(id=c("Date","name")),
    average_other_secondary_rents %>% rename(`Total Other` = Total) %>% melt(id=c("Date","name"))
  ))
  return(plot_data)
}
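The only non-obvious step in prep is how the year is pulled out of the first column. A minimal sketch of that step, assuming the period labels look like a year followed by a month name (the exact format returned by CMHC is an assumption here, and the sample values are made up):

```r
# Hypothetical period labels standing in for the X1 column.
x1 <- c("2015 October", "2016 October")

# Drop everything from the first whitespace onward, then convert to integer.
years <- as.integer(sub("\\s.+$", "", x1))
years  # 2015 2016
```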
And plot the time series.
places=c("Vancouver","Toronto","Calgary","Victoria")
geography_type="CMA"

plot_data <- get_median_rent_data(places,geography_type)
ggplot(plot_data %>% arrange(Date) %>% filter(name=="Vancouver"),
aes(x=Date, y=value, colour=variable)) +
geom_line() +
geom_point() +
scale_y_continuous(labels=currency_format) +
labs(title="Vancouver Stock Median Rents", x="Year", y="Monthly Rent")
Looking at 1 bedroom rents across the cities we get
ggplot(plot_data %>% filter(variable %in% c("1 Bedroom","1 Bedroom Condo")) %>%
mutate(type=paste0(name," ",variable)) %>%
arrange(Date,type),
aes(x=Date, y=value, colour=type)) +
geom_line() +
geom_point() +
scale_color_brewer("Metro / Unit Type",palette = 'Paired') +
scale_y_continuous(labels=currency_format) +
labs(title="1 Bedroom CMA Stock Median Rents", x="Year", y="Monthly Rent")
With a similar pattern for 2 bedroom rents
ggplot(plot_data %>% filter(variable %in% c("2 Bedroom","2 Bedroom Condo")) %>%
mutate(type=paste0(name," ",variable)) %>%
arrange(Date,type),
aes(x=Date, y=value, colour=type)) +
geom_line() +
geom_point() +
scale_color_brewer("Metro / Unit Type",palette = 'Paired') +
scale_y_continuous(labels=currency_format) +
labs(title="2 Bedroom CMA Stock Median Rents", x="Year", y="Monthly Rent")
We can narrow it down to just the cities (census subdivisions).
Citation
@misc{rental-data.2017,
author = {{von Bergmann}, Jens},
title = {Rental {Data}},
date = {2017-09-12},
url = {https://doodles.mountainmath.ca/posts/2017-09-12-rental-data/},
langid = {en}
}