The election data got posted on the Vancouver Open Data website, so we decided to take a very quick peek at how the candidates fared by polling station. Citizens can vote at any station they want, so there are no voting districts. But proximity to home is probably a large factor in determining where people vote, although some may choose locations close to work or somewhere else convenient. For anyone who wants to refine the analysis, the R Notebook that generated this post lives on GitHub.
Not sure how long this has been live, but this morning fellow cancensus developer Dmitry flagged a new StatCan feature: interactive thematic web maps. Essentially, it enables users to choose from a selection of 2016 census variables and map them. You can zoom and pan around, and select the aggregation level at which to display the data, down to census tracts. And there is an option to download single variables as CSV.
Earlier today I came across Gil Meslin’s tweet suggesting that someone reproduce this rent graph for neighbourhoods in Toronto. I agree that this would be fun to do. All it requires is mixing the Toronto neighbourhoods with rental listings data, which I happen to have handy. So time to get working. Neighbourhoods To do this we need to grab the Toronto neighbourhoods, which can be found on Toronto’s open data website.
After the BC government stopped publishing foreign buyer data after May this year, it reversed course and gave the data out to media outlets earlier this week. It started being released to the general public only earlier today, with the complete data becoming available around noon. The Data There are a number of metrics in the data; the ones we will focus on are the share, median dollar value and total dollar volume of foreign buyer purchases.
I want to do a short post to gently remind people of the pitfalls of relying too heavily on medians for understanding complex issues. Medians are useful because they take a complex distribution and break it down into a single, simple-to-understand number. This works well as long as it does not mask other aspects of the distribution that are important in the context in which it is used. A good example of the dangers of relying too heavily on medians is the “median multiple” metric that gets used a lot: the median dwelling value divided by the median household income in an area.
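To make the arithmetic concrete, here is a minimal Python sketch of one way the median multiple can mask what is happening at the household level. The five (dwelling value, household income) pairs are made up purely for illustration and are not taken from any dataset. The sketch compares the ratio of the two medians with the median of each household's own value-to-income ratio, which may or may not be the particular pitfall the post has in mind.

```python
from statistics import median

# Hypothetical toy data, invented for illustration only:
# (dwelling value, household income) for five households in one area.
households = [
    (300_000, 30_000),
    (400_000, 100_000),
    (500_000, 50_000),
    (2_000_000, 80_000),
    (2_500_000, 250_000),
]

values = [value for value, _ in households]
incomes = [income for _, income in households]

# The "median multiple": median dwelling value over median household income.
# The two medians generally come from different households.
median_multiple = median(values) / median(incomes)

# Alternative view: compute each household's own multiple, then take the median.
household_multiples = [value / income for value, income in households]
median_of_multiples = median(household_multiples)

print(f"median multiple:               {median_multiple:.2f}")
print(f"median of household multiples: {median_of_multiples:.2f}")
```

In this toy data the median multiple comes out at 6.25, while the median household pays 10 times its income. The two numbers differ because the median dwelling value and the median income belong to different households.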