Search
  • James Goodall

#IronQuest: Black & White

Updated: Dec 8, 2020

After telling myself I’d be entering #IronQuest on multiple occasions and failing in all my attempts to do so, I decided that this month would be different - I would actually do something about it and decided to rise to the challenge of the black & white round.


This blog post will go through my thought processes along the way, as well as the details of how I built my entry to this round



I had an idea to use curved lines fixed to latitude & longitude coordinates, in the same way as I did for #MakeoverMonday week 44, but needed to find the right data for this to work.


From previous pieces of work I’ve completed I know that the police make data available via the website data.police.uk, and within this there were latitude and longitude coordinates of where the crimes took place, as well as the LSOAs of the crimes.


So I had it narrowed down to crimes data, but what would the subject be that I’d want to focus on? Well, 2020 being what it is, the first thing that comes to mind is, of course, COVID-19, and I had the thought that with the impact of lockdowns and general reduction in social activity throughout the country, this must surely have reduced the volumes of crimes being reported & investigated right? Well, we’ll come back to that later.


I therefore decided to download crimes data for the period January 2019 – October 2019, and compare that against the crimes data for the same period in 2020 (as that was the latest date data was available to at the time of development). The data downloads all came in individual files for each month, so I had a task of unioning them all together and doing a little bit of data prep. I used Safe Software’s FME to complete this, as it’s a tool I’m very familiar with and could rattle through the task quite quickly. In the end I actually didn’t use the lat/long coordinates of the crimes themselves and instead sourced the lat/longs of the LSOA centroids to use as the destination points for my curvy lines.


As it turned out, I would actually go on to use 4 different data sources for this workbook, comprised of:


  • Row level crimes data

  • LSOA data with lat/longs for the curved lines

  • Aggregated crimes data by LSOA joined to an LSOA shapefile

  • A separate data source for a curved timeline chart that features later in the build


(I actually used 5 but one of them is just for a generic Sankey template I built that I won’t go into in this post)


I’d seen a couple of diamond shaped heatmap style charts floating around on social media (I believe it was this post by Chris Love and this post by Wendy Shijia that burrowed into my subconscious) and so I wanted this to feature at the top of my viz with everything else flowing down from here


So how to go about building this...


First of all, I needed the right number of dimension values to form a diamond (/square), namely, 4, 9, 16, 32 etc.


In my data I had 14…



I got around this by grouping the ‘Crime type’ fields to get this down to 9 (still meaningful) groups of crimes



The focus of the dashboard was the change in crime volumes between 2019 – 2020, so I needed to create some calculated fields to work this out


I created a fixed LOD for the 2019 values as follows:



Followed by another fixed LOD field for the 2020 values



With these two set, I could then work out the percentage change between the two years for each crime group



The main structure of this chart would be 9 diamond shapes sitting on X/Y coordinates. To work out what these coordinates would be, in this workbook I just took the dimension values sorted by ‘% Difference’ values and fixed them to coordinates based on their ranking, however I will be posting a blog shortly about how you can do this without fixing the coordinates against specific dimension values, and using ranks instead (so keep your eyes peeled for that one if you're interested).




Then it’s just a simple case of adding them to the ’Columns’ and ‘Rows’ shelves (as dimensions). I actually found that I needed a bit more space to play around with the sizing after I’d loaded the sheet onto my dashboard so I ended up multiplying the X and Y fields by 100 on the shelves. I dragged the ‘SUM(% Difference)’ field to colour, added the ‘Crime type (group)’ field to detail and played around with the sizes to fit (note this will take some fiddling around on the dashboard to make it look right for you)



To create the ‘curved lines map’ sheet , I borrowed what I learned from the #MakeoverMonday previously mentioned, which itself borrowed techniques I’d seen used by Christian Felix and Agata Ketterick.


I joined (1 to 1, full-outer) a ‘join.csv’ file that simply had the numbers 1 to 49 recorded in a single ‘Point’ column onto my data source that had the LSOAs in question with corresponding lat/long coordinates and went about building the sheet



There were a number of calculated fields required to make this happen:


The first one is just a fixed value that effectively fixes the length of the lines horizontally



The second is another fixed value that sets the starting point of the line on the chart horizontally (this again takes some playing around with to find out what works for you)



The last fixed value field is one that sets the starting point of the lines vertically (again, fiddly to get right - you get the picture)



(If you want multiple line start point locations you’ll need to use an IF condition to set these)


Next I set the sigmoid field up, which is a little more complicated than the sigmoid functions you may have used if you’ve created sankeys before, but follow the basic same principle, just with a few more conditions added to frame them for the purpose of this chart



Next I needed to set the revised X/Y fields that will produce the ‘curved’ lines (in reality just a series of points connected)




From there the chart is fairly simple to build – just drag the ‘CurveX’ and ‘CurveY’ fields onto the Columns and Rows shelves and add the LSOA name to the Details mark then play around with the sizes / colours etc



I wanted these to feed into a map of LSOAs, so that was the next bit to create


I joined a shapefile of LSOAs to my data that had a summary of total count of crimes by LSOA, and re-created a couple of fields used in the previous data source but with slight tweaks as the data had already been aggregated at this point





The map itself was created by double clicking on the ‘Geometry’ field then dragging the ‘SUM(% Difference)’ field to colour, the LSOA name field to detail and formatting to fit the aesthetic of the rest of the workbook



Because the curved lines map and the LSOA shapefile map were different sheets these needed manually lining up on the dashboard. I did this by enabling the ‘County borders’ map layer on the curved line sheet and lining up & resizing the LSOA map until it fit the boundary, then removed the map layer again




The next chart to add was a nice easy one - a line chart. But in this case I wanted to compare a period in 2019 against a period in 2020, but how was I going to do that as surely on a line chart this would just show as a continuous line from left to right?


Right! So I needed to do something to equalise these. I created a new calculated field called ‘Equalised Date’ as follows:



I could then simply drag this onto the ‘Columns’ shelf as a ‘MONTH’ datepart (of the ‘Equalised Date’ field) and add a count of crimes against the Y axis, with the ‘YEAR’ datepart (of the original date field) added to the ‘Colour’ mark



The last chart I wanted to add was a kind of butterfly chart with rounded bars that showed the change in rank from one year to the next. I would do this via a combination of 3 different sheets


The first two were easy, I created rounded bars with ‘Crime type’ on the ‘Rows’ shelf and the count of crimes on the ‘Columns’ shelf and ordered them by count of crimes descending



A blog post on how to create rounded bar charts can be found here, so I won’t go into that on this post


The sheet for 2019 had the X axis reversed



I then dragged these onto the dashboard in a vertical floating container with a blank container in the middle to make space for a curved timeline chart that would link the two together



I could do this via a standard slope chart, but to fit the aesthetic of my workbook, I wanted these to be curved, and to do this, I borrowed a technique I have used in previous dashboards (such as this one) as detailed in this data surfers blog post


Unfortunately this also meant creating yet another data source to use where I could fix the ranking values into the data. Again I used FME to do this by creating a field called ‘Rank’ that was a count of crimes by crime type, sorting them in descending order, then adding a local counter, starting at 1 for each year



As the data surfers blog post explains how to do this I won’t cover this again in this blog post but I would strongly recommend having a look as that has certainly helped me!


With this chart created I simply dragged this on to the dashboard as a floating object and moved it until it lined up with the rounded bars



The final touches were just creating a few extra sheets to add as floating sheets to use as legends / descriptions for the diamond heatmap at the top and adding some additional text throughout the dashboard (floating blank objects with a 1px height were used for the lines)



Well with all of those charts built, what was the data actually telling us? Well, despite my assumptions earlier in the process, there were a few surprising outcomes, namely that during the summer, crimes in general were higher than in the same period in 2020! There will obviously be a number of different factors feeding into this, but I suspect this may be partly due to both a number of different protests that took place over this period as well as the pent up desires of the British public to famously go out and drink during the warmer weather, having not had the opportunity during the spring period (potentially also combined with more police resources available to target this following a reduction in other crimes types such as burglary (again likely due to the lockdowns causing more people to be at home)).


I most definitely enjoyed this opportunity to take part in #IronQuest, forcing myself to push harder to achieve things that I might not have otherwise allowed myself to do (for example during #MakeoverMonday tasks, I try to limit the amount of time I’m spending on dashboard development). Combined with this, the challenge to use only black/white/greyscale colours forced me to think outside of the norms of using colour to emphasise a point, or draw attention to an aspect on a dashboard, yet still communicate a story in the data. So a massive thanks to Sarah Bartlett for this amazing initiative and I look forward to participating in future rounds from here on!


Tableau Public version:




118 views0 comments