Machine Vision for Urban Planning

As part of the Neo-demographics project, a large amount of geodemographic data has been collected in Dar es Salaam, Tanzania, over the last year. This has been made possible by:

  1. Crowdsourcing using a large team of local community members in Tanzania.
  2. Mining of Call Detail Records (CDR) via collaboration with the African telecommunications company, TiGo.

The result has been a set of gold standard statistics that splits the city into 641 regions. Each region is labelled with 45 geodemographic features (e.g. transport access, education levels, etc.) and 20 behavioural features (e.g. average mobility, number of mobile transactions, etc.).

This data is the first of its kind in the region and valuable to a wide range of applications, from local policy (e.g. for transportation planning) to business investment (e.g. store locations). However, if one wishes to replicate and extend this analysis to other countries, one would be dependent both on the availability of private sector data, which is typically hard to obtain, and on further intensive crowdsourcing investment.

An opportunity to bypass these requirements has arisen from a parallel data collection activity. Using a pair of senseFly eBee mini-drones, the project collected high-resolution 2.5-dimensional aerial RGB imagery for the entire city. This imagery is very large: over 762 GB covering 88 km², consisting of more than 55 billion pixels. The hypothesis is that it should now be possible to model the geodemographic features (for which we have ground truth) from image analysis of this rich visual dataset. The large amount of data suggests that Deep Learning techniques would be well suited to this task.
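To give a feel for the scale of the imagery, the figures quoted above can be turned into an approximate ground footprint per pixel. This is only a back-of-the-envelope sketch: it assumes the pixels tile the 88 km² area evenly with no overlap, takes "more than 55 billion" as exactly 55e9, and reads GB as 10^9 bytes. The variable names are illustrative.

```python
import math

# Figures quoted in the text (treated as exact for this rough estimate).
AREA_KM2 = 88          # area covered by the imagery
PIXELS = 55e9          # "more than 55 billion" pixels, so a lower bound
SIZE_BYTES = 762e9     # "over 762 GB", assuming 1 GB = 10^9 bytes

area_m2 = AREA_KM2 * 1_000_000        # 1 km^2 = 10^6 m^2
m2_per_pixel = area_m2 / PIXELS       # ground area represented by one pixel
gsd_m = math.sqrt(m2_per_pixel)       # side length of a square pixel footprint
bytes_per_pixel = SIZE_BYTES / PIXELS # storage ratio only, not a format claim

print(f"~{gsd_m * 100:.1f} cm per pixel")
print(f"~{bytes_per_pixel:.1f} bytes per pixel")
```

Under these assumptions each pixel covers a square of roughly 4 cm on the ground. Because the pixel count is stated as a lower bound, the true footprint would be no larger than this.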

Deep Learning has recently attracted a lot of attention, as it has been shown to outperform other machine learning techniques on many problems rooted in large amounts of data. These neural network techniques have become practical thanks to novel non-linear activation functions applied inside each neuron, novel unsupervised weight initialisation schemes, and the availability of large datasets and faster computation.

The goal of the project is two-fold: 1) to use Deep Learning to determine the outlines of dwellings from the imagery, and 2) to generate visual features and find relationships between these and the geodemographic features. While the demographic variables could probably be detected just as well from the visual data without knowing the dwelling sizes, estimating the sizes of houses explicitly will give insight into how the Deep Learning process makes its decisions. This is a small step towards ‘whitening the black box’ of machine learning.
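As an illustration of how a detected dwelling outline could be turned into an explicit size estimate, the sketch below applies the shoelace formula to a polygon in pixel coordinates and scales it to square metres. It is not the project's method: the function names, the example outline and the 0.04 m per pixel resolution are all hypothetical placeholders, and it assumes the outline is a simple (non-self-intersecting) polygon.

```python
def polygon_area_px(outline):
    """Area in square pixels of a simple polygon given as (x, y) vertices,
    computed with the shoelace formula."""
    total = 0.0
    n = len(outline)
    for i in range(n):
        x0, y0 = outline[i]
        x1, y1 = outline[(i + 1) % n]  # wrap around to close the polygon
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def dwelling_area_m2(outline, metres_per_pixel):
    """Convert a pixel-space outline to a ground area in square metres."""
    return polygon_area_px(outline) * metres_per_pixel ** 2


# Hypothetical rectangular roof outline, 150 x 100 pixels.
outline = [(0, 0), (150, 0), (150, 100), (0, 100)]

# Hypothetical ground resolution; a real value would come from the survey metadata.
area = dwelling_area_m2(outline, metres_per_pixel=0.04)
print(round(area, 2))
```

Per-region summaries of such areas (for example the mean or median dwelling size) are one kind of interpretable visual feature that could then be compared against the geodemographic labels.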