#ReViz – Battling Infectious Diseases in the 20th Century

You have probably seen me write this before, but this visualisation is one of my favourites.

The way in which it puts the message across, with such simplicity, is astounding.

2016-06-12_09-22-01.png

This visualisation was part of a series of visualisations produced in an article by the Wall Street Journal graphics department, the article is well worth a read, and discuses the positive impact of vaccinations.

I will also make a second shout out to the Project Tycho team at the University of Pittsburgh. An initiative to advance the availability of public health data in the US. If you are interested in public health data then it is well worth signing up, they have put together an amazing database.

Now, you may perceive this as something that would be easy to build in Tableau, my response would be yes and no, there were definitely some challenges on the way.

I guess the main reason you have visited this post is to see the how to guide on creating the dynamic key, by Friday this week I intend to write and link a separate blog describing this process here, but for now, the process is outlined towards the bottom of this article.


Data Structure.

After downloading the data set from the Project Tycho site here is the format with which it came in. We have the number of incidents, by states and by week/year.

First things first, we need to pivot the data. To bring all our incident numbers into a singular column. So we would be pivoting each of the state columns around our [YEAR] and [WEEK] fields.

2016-06-12_09-38-18 Pivot

The weekly level of detail was also not necessary, so the next task was to remove this, I used the summarise tool to return the number of incidents simply by year and state.

Okay, i’ll admit I didn’t see this until after building version 1 of the viz, but the incident rates in the WSJ viz are given per 100,000 members of the population. This meant some data blending was required. I found population by state data from the US Cencus and blending this with my data set before creating the incident rate by doing the calculation; ([Sum_Cases]/[Population])*100,000.

The data set I used can be found here.


Bulding the chart

So this wasn’t a simple heat map in that we are using two discrete dimensions and a measure. In order to get the constant lines on our visualisation we needed what would have traditionally been our 2nd dimension, year, to be continuous.

2016-06-13_18-59-46.png

Heat maps usually use the square mark type, however when using just one discrete dimension it is not possible to do this without getting overlapping marks.

2016-06-13_19-02-53.png

To the right side of this image you can see that the squares layer on top of each other.

However you can also see that now we have changed our year to continuous, we know have the ability to select a constant line.

One way in which we can prevent this overlapping is to change the mark type to ‘Gantt Bar’.

We can then create a calculated field which will be the average of 1, AVG(1). In reality it can be any number, providing it assigns the same value to each of our state/year combinations. Which may not be the case if you use the SUM function.

By then dropping this onto the size shelf you see we get our heat map, with the added ability to add features from the analytics pane.

The State(Copy) pill you see on the rows shelf was created to convert the names from capitals to a title case.

2016-06-13_19-10-00

It was then a simple case of adding our two constant lines which represents the points in time when the vaccine was firstly introduced, and secondly reintroduced with an improved formula.


Creating a custom colour palette

You may have noticed that the colour scale used in this visualisation is not a native tableau colour palette, however with Tableau you have the ability to create your own, I knew this was a feature of Tableau, but It was something that I had yet to have a go at.

I followed their great knowledge base article on the subject.

There are three types of colour palette used in Tableau.

  1. a categorical colour palette – ‘A categorical colour palette contains several distinct colours that can be assigned to discrete dimension members.’
  2. a sequential colour palette – ‘Another type of palette is the sequential colour palette. Typically, this type of palette shows a single colour, varying in intensity. This type of colour palette is used for continuous fields, typically for measures.’
  3. a custom diverging colour palette – shows two ranges of values using color intensity to show the magnitude of the number and the actual color to show which range the number is from. Diverging palettes are most commonly used to show the difference between positive and negative numbers.

Here I am just going to outline the reasons why I created the code (see below), rather than take you through the step by step guide on how to, when the Tableau article is so great.

<color-palette name=”Measles Viz Colour Scale” type=”ordered-diverging” >
<color>#ddebf8</color>
<color>#0c9dc5</color>
<color>#44ae57</color>
<color>#fcd43d</color>
<color>#e59e26</color>
<color>#e48925</color>
<color>#e95034</color>
<color>#da4b31</color>
<color>#cf472e</color>
</color-palette>

Note, the XML tag for color is spelt the American (wrong) way, so make sure you are using American English when writing your XML code.

I simply wrote this code in notepad.

A diverging colour palette was the type I chose because it is about blending colours (although usually two/three, in my case 9), and using these colours to then show the intensity of events, in my case the number of measles related incidences.

You can also see that I weighted my colour palette, as in the original visualisation. In order to do this I simply added more of variants of that colour to the palette. Of course they are not identical colours because we don’t want a lower number of incidents to have the same colour as the higher number of incidents, so I therefor made these a couple of shades lighter.

2016-06-13_20-38-16


Building the dynamic key.

Okay, i’ll be honest, it took me a fair few attempts to perfect the method for building the dynamic legend. I knew this would be challenging when I looked at doing this as a #ReViz project and this was something that pushed me towards doing it.

‘Stop mambling, blah blah blah, tell us the how’

Part 1 – Creating the colour legend

  1. Duplicate and then union this duplicate to your original data set. I used alteryx for this step, but you could try using Tableau’s new union feature.
  2. Create an additional column, in  my case [RowCount].
  3. Assign 1 as the [RowCount] value for the original data set
  4. Assign 2 as the [RowCount] value for the duplicated data set

    2016-06-12_10-25-29

  5. Create a calculated field in Tableau that converts your [RowCount] values into the min/max values of your legend. In my case, 0 and 3000.if [RowCount] = 1 then 0 else 3000 ENDChanged the type to discrete.
  6. Create bins based on this field. The bin size will be 1.

    bins

  7. Create a  calculated field, simply; index()
  8. Drag your index() calculated field to columns
  9. Drag your bins onto colour
  10. Drag your index() calculated field to colour, and change the compute using to the bins.
  11. Make the colour scale match that of your chart
  12. Change the mark type to Bar (if it isn’t already)Part 2 – Creating the arrow
  13. Drag the pill used to colour your scale on your chart onto the columns shelf.
  14. Remove everything from the marks card for this 2nd axis.
  15. Change the mark type to shape
  16. Select the “▼” symbol
  17. Dual axis your chart.Part 3 – Adding your interactivity
  18. Create your dashboard
  19. Apply any necessary action filters to create the interactivity between the chart and
    dynamic key.

    2016-06-12_11-01-38

  20. One final change I made was that I did not want to show the arrow unless the filter had been applied. In order to do this I created a calculated field which I placed on my 2nd axis instead of the field used in the chart.if COUNTD([State]) = 1 then avg([Per 1000]) END.This calculated field means that the value will only be returned if there is only one state selected. I.e. if someone is hovering over a value on the chart.If they are not hovering over the chart, then all states are in the view so the COUNTD([State]) will equal 50 and thus the value will be NULL and the arrow will not show.

After this it was a case of formatting the dashboard into the style which you see in the Tableau Public visualisation; I won’t bore you with that.

Ben

#VizLikeAnArtist

 

Advertisements

#ReViz – Circles in Circles

My latest purchase ‘Infographica’ by Martin and Simon Toseland inspired this #ReViz project, a penny for your thoughts.

I saw it and immediately wanted to replicate it. It’s beautiful, the use of colours are excellent, and most important it tells the story extremely well (and quickly). There was also the challenge that this was something I have yet to make in Tableau.

IMG_1084

In this blog i’m going to teach you how I made these ‘circles in circles’ charts.

With this #ReViz project I decided to use my own dataset, and just focus on recreating the viz type used in the piece rather than replicating it entirely.

The data set has salary data for the UK by region. For each region we have two measures, we have the average (median) salary for that region, and we also have the median UK salary; a similar structure to that outlined in the visualisation above.

So once I had imported this into Tableau I simply did the following…

  1. Drag my dimension, [Region], onto columns (or rows, which ever you choose)
  2. Create a calculated field, simply with the value 0, and drag this calculated field onto the rows shelf.
  3. Drag one of my measures, [Median gross annual earnings (£) by region], onto size
  4. Drag the calculated field [0] onto rows for a second time (creating a second pane).
  5. Place our second measure  [Median gross annual earnings (£) UK-Wide] onto the detail shelf for the second axis.
  6. Change the mark type to ‘Shape’, and if it hasn’t already, change the ‘Shape’ to the circle outline.
  7. In order to ensure your visualisation is true we must adjust the size scale for each axis so that they are identical.
  8. Dual axis the chart, and synchronize your axis

You may note in my published viz that the regions are actually not in a single line. I did this by replacing the 0 values with plotted X and Y points and creating my axis from these instead. My ‘Shape mapping in Tableau’ blog may provide you with an idea on how I did this.

Ben

#VizLikeAnAritst

Penny-thoughts

#ReViz – LGBT | Gay rights by state

This week whilst searching for inspiration for the Tableau Iron Viz competition I came across this unbelievably beautiful and informative visualisation from The Guardian US Interactive Team. I love it.

If you were to ask me where I would put this viz on the ‘Functional Beautiful quadrant‘ it would be very close to being in the top  right.

This viz type is also one that features on the front cover of ‘Information is Beautiful’, a book with which I have gained all sorts of insight on the subject of data visualisation and ideas for my work.

For this chart type you need two dimensions and a measure. In the case of this ReViz our first dimension is our states, these are our segments, our second dimension is our law type, represented by the different rings around the circle.

You could almost treat these as rows and columns.

 

Our measure is represented by the level at which the laws exist; the measure is represented by the colour of each sector.

3


 

Remembering that we are working with a circle chart type and polygons, completing the data preparation in Alteryx consisted of a number of similar practices to those learnt in my previous #ReViz project so I will only point out the differences and how to adjust the formulas to account for this.

Difference number 1:

In contrast to our ‘coxcomb’ chart where we are dealing with sectors (or a slice of pie), this time we are dealing with a ‘donut segment’.

 

esf.png

 

Difference number 2:

Our radius is not determined by the measure value but instead the dimension ‘row’ that it is on. Our first set of points, from 1-5 on diagram c will have a radius of lets say 1, our 2nd set of points for that segment, 6-10 will have a radius of lets say 2.

The segment that would lie above this, the first set of points would have a radius of 2. The 2nd set 3, and the points are offset in this way.


So how do these differences affect the formulas introduced in the previous #ReViz blog. Lets start with our angle formula that we can use to identify the angle at each point.

This is the formula we used last week, with our first and last points representing our central points and therefore; whilst the path points between represented our angles at the points on the edge of the circle.

IF [PathID] = 1 or [PathID] = 13 then 0 else ((360/[Number of Segments])*([SegmentID]-1))+ (([PathID]-2)*(360/[Number of Segments])/10) endif

Instead this time we need to find the angle of our lower row first, path points 1-5 (diagram c), before then finding the angle for path points 6-10.

A quick point. We know that points 5 and 6 have the same angle, as do 4 and 7, 3 and 8, 2 and 9 and 1 and 10.

We also have this same fields available, where ‘[number of segments]’ is the number of dimensions, in our case 51 states; [Segment ID], an individual ID for each of these segments, from 1-51; and finally a [Path ID], in the case of our example 1-10. Although In my viz I used 40 points to ensure a smooth curve.

Here is the formula that I used to generate the angles at each Path ID.

IF [Path ID] <= 5 then

(([Segment ID]-1)*((360/[Number of Segments])))

+

((((360/[Number of Segments]))/4)*[Path ID])-(((360/[Number of Segments]))/4)

else

(([Segment ID]-1)*((360/[Number of Segments])))

+

((((360/[Number of Segments]))/4)*(10-[Path ID]))

endif

So what is this actually doing?

So our first condition IF [Path ID] <= 5. Here I am segregating our 1st half of path points to our second, as mentioned above the calculation will need to be different.

With this part of the formula (([Segment ID]-1)*((360/[Number of Segments]))) I am outlining the initial angle for that segment. Dividing 360 by the total number of segments will return the angle covered by each segment. Multiplying this by our [Segment ID]-1 will therefore give us our start point. Segment 1 will start at 0 degrees. Segment 2 will start at 1 * the angle for each segment, segment 3 will be two times this value and so on.

The following aspect of the formula will return the angle increment from the original starting angle for that segment.

((((360/[Number of States]))/4) returns the angle increment between each path. Multiplying this by the Path ID gives us the increment value from the original angle for the segment. -1 from our Path ID ensures that we do not increment at the point of our first Path, and therefore it remains at the original angle. This is similar to what we do above with our segments.

(((360/[Number of States]))/4)*([Path ID]-1)

Okay, so this is our value when the points are less than or equal to 5. The next part of the formula focusses on where this condition is false; i.e. points 6-10 when our angle returns back on its self.

The only difference is in the 2nd part of the formula.

((((360/[Number of States]))/4)*((11-[Path ID])-1))

The part (11-[Path ID]) gives us the inverse of what our Path ID value should actually be, so as highlighted above, when my Path ID = 6 I will get the angle for the same as I would for the first part of the formula for Path ID 5 (remembering basic maths that 11-6 = 5).


 

Now that we have the angle for each of our points It is now a case of how do we offset the radius and therefor the location of our x,y points as highlight in the difference 2 section

Well I have ID’d my 2nd dimension (rows) in a similar way to that of my segments. So we can use a formula and this value to create our offset. Remembering that we must convert our angle to radians in order to calculate our x,y points.

Again I will need to condition my formula based on the Path ID. If it is less than 5 I want a lower radius value than if it above 5, thus giving us our offset.

The value for X can be calculated as:

IF [Path ID] <= 5

then sin([Radians])*([Row ID]+2)

else

sin([Radians])*([Row ID]+3)

endif

Why +2 and +3 I here you ask, well the difference between these 2 values (2 v 3) gives us the offset between the points below or above 5.

Starting at +2 gives our lowest Row ID (1) a value of 3. This part determines the size of the hole in the centre of our target chart. If we wanted we could make our starting value -1 and therefore our calculation will start at 0 and thus our chart would start from the centre (where the radius is 0).

2016-04-28_18-15-10

 

Our value of Y can be calculated in the same manner, swapping Sin for Cos.

IF [Path ID] <= 5

then cos([Radians])*([Row ID]+2)

else cos([Radians])*([Row ID]+3)

endif


 

With this visualisations there are a lot of actions which I plan to outlined in a separate Tableau tips blog.

I hope you enjoyed the read. The finished workflow is available here.

Here is a link the the final visualisation.

Ben

#VizLikeAnAritst

#ReViz – Coxcombs in Tableau – Part 2

So yesterday I finished off part 1 by outlining the aspects needed to create our coxcomb visualization and identify the points for the edge of our segment in order to do that we need to know:

  1. The angle
  2. The radius (SQRT([Measure])/((PI()*(1^2))/[Count])))
  3. The total number of segments
  4. Some mathematical understanding

So I will show you how I used Alteryx to allow me to do this. I will also remind you of the type of data input that we have. A series of dimensions and measures, pretty simple.

2016-04-19_12-55-01

So the first question we need to ask is how many dimensions are in our dataset? We can use a summarize tool to do this. Creating a count distinct calculation of our dimension members. In our example this value will be 12.

We need to know this information for each of our dimensions to allow us to calculate our points. So to do this I am going to use the append tool to add this information as a column to the right of our original dataset.

At this stage it is also important to note that we may have multiple lines for the same dimension. So we should also summarize our original dataset to ensure our data is aggregated for each dimension member; grouping  by our [Dimension Member] column and summing the [Measure] column. Of course you may wish to aggregate your data in a different way.

2016-04-20_17-38-32

We now have a dataset that looks something like this…

2016-04-20_17-49-24

Now I wanted to just add a RecordID to our segments, we will use this RecordID (which I renamed to SegmentID), to enable us to calculate our angles at each point).

Now in order to draw the points for each of our segments we need as a minimum 4 points as outlined with the image shown in my 1st post.

Of course realistically in order to create a smoother curve we should use more points. I used 13 in my workflow this will give us 11 points around the edge of our circle for each segment and our 2 (0,0) points.

dfsvds

Now in order to use the Path functionality of tableau our Path ID’s must be in one column on different rows. So if we want 13 Path ID’s for each segment we need to generate 12 additional rows for each. In Alteryx there is a tool for this, aptly named ‘Generate Rows’. In order to use this we must set an ‘initial expression’, ‘condition expression’ and ‘loop expression’.

Generate Rows

In our case the configuration of this expression is pretty simple. Our initial expression is ‘1’, this is the value of our row for the dimension member that already exists.

Our loop expression represents how we want the PathID to change dependent on the previous value. We simply want to increment ours by 1, so PathID+1.

Our condition expression represents when we want Alteryx to stop generating rows. Well we want 13 rows in total, so on the 12th row, when we are incrementing by 1, the PathID will be 13. Therefore we can make our conditional expression PathID <= 13.

2016-04-20_18-48-35


From here on in we just need one tool, the formula tool, in this we are going to create 5 different formulas which calculate the;

  1. angle
  2. angle in radians
  3. radius
  4. x
  5. y

We are almost there now, it is a case of calculating the angle at each point and the distance, with which we can then calculate the x,y point for each PathID.

Well remembering that PathID 1 and Path ID 13 represent our centre point for each SegmentID then we can just 0 these values. So,

IF [PathID] = 1 or [PathID] = 13 then 0

Unfortunately the next part isn’t quite as simple.

We need to first identify the increment for each PathID between 1 and 13.

Well (360/Number of Segments) will give us the angle of each segment, 30 degrees. There are 10 increments for each segment. Therefore, (360/Number of Segments)/10) gives us our incremental value.

We can also use our PathID to acknowledge the point of the incrimental proccess for each segment. I.e. PathID 2 should have no incrimental value as it is the first point; so we now know the incrimental value for each PathID is equal to;

(([PathID]-2)*(360/Number of Segments)/10)

We now just need to calculate the base value for each segment to which we add this incremental value. This base value is the starting angle for each segment, for SegmentID 1 we need a value 0 degrees, for SegmentID 2, we need a value of 30 degrees, 60 degrees for SegmentID 3 and so on.

This can be calculated as ((360/[Number of Segments])*([SegmentID]-1))

Using this information we can now build out our angle formula

IF

[PathID] = 1 or [PathID] = 13

then

0

 

 

else

((360/[Number of Segments])*([SegmentID]-1))+ (([PathID]-2)*(360/[Number of Segments])/10)

endif

2016-04-20_19-28-17

Unfortunately for Alteryx users there is no Radians function so one final thing we need to do is change our angle into Radians using: Angle*(PI()/180)


So now to calculate our Radius for each segment. This is relatively straight forward when compared with the above. We just need our measure value for the specific segment, and the total number of segments, please do not ask me to explain the underlying math behind this, I have no clue:

SQRT([Measure]/((PI()*POW(1,2))/[Number of Segments]))

or

SQRT

([Measure]

/

PI()*POW(1,2)

/

[Number of Segments])

probably makes it slightly easier to read.


We now finally have everything on our list!

  1. The angle – CHECK
  2. The radius (SQRT([Measure])/((PI()*(1^2))/[Count]))) – CHECK
  3. The total number of segments – CHECK
  4. Some mathematical understanding – CHECK

So lets now calculate our x,y values and export to Tableau!

(again remembering that PathID 1 and PathID 13 are 0,0)

x = if [PathID] = 1 or [PathID] = 13 then 0 else Sin([Radians])*[Radius] endif

y = if [PathID] = 1 or [PathID] = 13 then 0 else Cos([Radians])*[Radius] endif

Again I have made the workflow available so people can have a better look if they wish.

At this point it is simply a case of building your coxcomb chart in Tableau.

Of course with the Florence Nightingale visualisation I created there was a significant amount of formatting and I also had to layer three different outputs on top of each other to give me the three different causes of death, please reach out to me if you want to know more.

Ben

#VizLikeAnArtist

2016-04-20_19-43-45

 

#ReViz – Coxcombs in Tableau – Part 1

This ReViz project is probably my favourite that I have undertook so far. Remaking something of such historical significance was great, and building it was incredibly challenging but fun as well.

2016-04-19_19-14-57

You have probably heard of this visualisation’s designer, Florence Nightingale, a British nurse famed for her contribution to modern nursing and the graphical representation of data.

This visualisation was produced during the Crimean war where Nightingale meticulously collected data on mortality rates of soldiers whilst serving in the British camp. The visualisation captured the attention of key stakeholders who quickly saw that sanitation was a key influencing factor on death rates in war zone hospitals, and introduced policy to control this.

It is seen as one of the finest examples of how visualising data can play a key role in triggering conversation and aiding the decision making process, especially given what was at stake; peoples lives.

As a result of Nightingales visualisation and the policies introduced, the death rate was seen to fall from 42% to 2% (Dictionary of National Biography, 1911).

Because of this visualisations significance in the data world I decided to give it a go using Tableau and Alteryx. Of course, this type of chart, a ‘coxcomb’ is not a supported chart type so I new that I would have to create the segments using polygons.

The ‘coxcomb’ whilst looking complex, is in principal very simple, the number of dimension members represents the number of segments; the measure value represents the size.

I decided to first use a very basic dataset, with 12 dimension members with a measure value. I find that by bringing things to their most basic point it helps make understanding what you are doing much clearer.

2016-04-19_12-55-01

Next you need to understand how Tableau draws polygons. you must have an X and Y point for each mark, Tableau draws lines between these marks based on the order you define or ‘Path’.

2016-04-19_13-01-11

We already know two points for each of our segments. The starting point is going to be the centre of the pie (0,0) and the end point is going to be the centre of the pie (0,0). The x and y coordinates of the points between depend on the angle and size at that point.

dfsvds

So how do you calculate the x and y value of a point on the edge of a circle. Well this is where my data school colleague Simona Loffredo’s ‘math blog’ came in handy.

x = r*cos(alpha) – r being our radius, alpha being the radians value of our angle, which can be calculated as angle*(PI()/180)

y = r*sin(alpha)

For an examples sake lets say point 2 is at 0 degrees and the radius is 1.

so x = 1*cos(0) = 0
y = 1*sin(0) = 1

So point 2 is at (0,1).

Of course, with a coxcomb chart the radius is not a set value, it depends on the measure value it’s self. So woo, more maths!

The radius in this case for each individual segment can be calculated as:

SQRT([Measure])/((PI()*(1^2))/[Count]))

Where our [Measure] is the value for that dimension member and the [Count] represents the number of segments.

So what the hell does this actually mean.

Well for each point on the edge of the circle we need the following information:

  1. The angle
  2. The radius
  3. The total number of segments
  4. Some mathematical understanding

Given the length of this post I have decided to split it into two posts. For this first part we have looked at the underlying principals to help us build our coxcomb visualisation. The next part focusses on the actual build of the visualisation data in Atleryx.

Ben

#VizLikeAnArtist

 

 

 

#ReViz – A way to learn Atleryx and Tableau

Those of you that have followed my existence in the Tableausphere will probably already know that one of my favourite hobbies is to rip off other peoples work. I even used my guest blog for Tableau to write about it.

It’s quite a simple process, I find a visualisation I like, I recreate it. ‘Steal like an artist’ as the phrase goes.

I do have one condition on my mind when doing these projects, the visualisation will be something that I have never made in Tableau before.

So why do I do it?

Well to me, it is the best way to learn Tableau.

Recreating visualisations stretches the limits of Tableau’s ability and allows you to spot small areas that can help your understanding of the software. Plus it also allows you to to practice new techniques and explore different ways of visualising data.

From here on in, I am going to produce a ReViz every Tuesday and blog its creation (although you can expect the first blog to land tomorrow!).

Here are some examples of recreations that I have made previously, all of which can be seen in more detail here.

#VizLikeAnArtist

ReViz.png

 

#ReViz – City of Crime

This week I found this beautiful visualisation in ‘London: The Information Capital’ about crime in London. A great subject and a great viz, but I knew it would be challenging to build in Tableau. So I gave it a go.

image (3).jpeg

I knew where I needed to start. I needed the data for crimes in London, and I needed a shape file of London.

Well the police forces of England, Wales and Northern Ireland have recently made a huge push towards providing open data. They have created a site where you can easily pull off csv’s which list all recorded crimes going back to December 2010. This data gives details on the crime type, it’s location (longitude and latitudes) and the current outcome.

I downloaded each month for 2016 and focussed my download to the Metropolitan Police (the policy force of Greater London).

My data school colleague Simona Loffredo had also recently created a shape file of all regions in Great Britain, which she has made available here.

Now I had all my data; next step.

I had a list of things that I knew I needed to do from here;

  1. Create shapes of each 200m square area in London
  2. Plot each of the crimes in London on top of this shape file
  3. Use this to summarize the largest crime type for each 200m square area
  4. Create in Tableau

It was pretty evident from the get go that Alteryx was needed for this ReViz, given it’s spatial analytics capabilities.

So here’s how I did it.

  1. Input the Great Britain Regions shape file into Alteryx, and filter to just London
  2. Bring in the ‘Make Grid’ tool. Reference the spatial object from your shape file, and select the grid size, in my case ‘0.2 Kilometers’.Make Grid

    This turned my one shape file of London into 40,453 individual spatial objects.This tool was something that I had never seen before in Alteryx. In fact I started trying to make my own grid using some complex maths and formulas before realising it was there. Here is how Alteryx defines the tool.’The Make Grid tool takes a specified spatial object (point or polygon ) and creates a grid based on the spatial object. The resulting grid is either a single grid, bound to the extent of the input spatial objects, or individual grids that dissect each input polygon.’

     

  3. Next was to input my crime data. I had 12 files (one for each month from 2016. A quick tip here is that you can bring in multiple files at the same time by using the * as a  wildcard character.
  4. I now needed to use my longitudes and latitudes to make a spatial match between my crimes and my grids. Think of this as a spatial join or spatial lookup. Alteryx is going to return me the spatial object with which my crimes lie.SFGF

    In order to do this I must first use the ‘create points’ tool to turn my latitude and longitude data into a spatial object.Workflow.png

    The configuration for your spatial match is dependant upon how you configure your inputs. In my case my crimes were directed into the ‘T-Input’ and my areas the ‘U-Input’. So therefore my ‘join’ was set as ‘Where Target Within Universe’.

    2016-04-12_21-14-30

  5. Now I just wanted to return my top crimes for each of my areas. So I used the summarize tool, grouping by ‘grid name’ and ‘crime type’ and also doing a count of ‘crime type’. I also needed to bring through my spatial object. This left me with a list of my grid areas, the different crime types and how often the crime types had occurred within that area.SummarizeI now needed to rank my crimes within each grid area. So I first sorted my data (using the ‘sort tool’, first by ‘grid name’ and then my count of crimes.Now in cases where we have groups in our data set, then we cannot simply rank using the ‘Record ID’ tool and instead must use the ‘Multi-Row Formula’ tool which allows us to reference rows above or below that of the cell that is being calculated.

    if [Row-1:GridName] != [GridName] then 1 else [Row-1:Rank]+1 endif

    This formula evaluates whether the ‘Grid Name’ in the cell below is equal to that of the current cell. If it is not we want our ranking to start again (so 1), otherwise we want to use the ranking of our previous row, and add 1 (thus increasing our rank value).

    Now we just need to filter out those values where the rank does not equal 1.

    WOO, we have our data set. We have a list of each 200m area in London, alongside the top crime in that area, as well as a spatial object for that area.

  6. There is one final stage to our Alteryx workflow. At present Tableau does not support spatial data types so we must use Alteryx to turn our spatial objects into polygons that we can use in Tableau.Thankfully, there’s an app for thatcreated by Info Lab Tableau Zen, Craig Bloodworth.I simply add the macro in, reference my spatial object, and then add an output as a TDE file.
  7. Now time to build the viz. Well Craig has created this video which shows users how to use the output to create your polygon map in Tableau. (In short, double click latitude, double click longitude, change the mark type to Polygon, place ‘PointID’ on the Path shelf (making sure it is discrete) and place ‘PolygonID’ on the detail shelf (again making sure it is discrete).It is then a simple case of dragging crime type to colour.(I of course did some minor formatting before publishing to my Tableau Public profile.)Finished.png

    I think the design process for recreating this visualisation shows just why I do this sort of thing. It was challenging, I found out more about the tools available in alteryx, and I learnt a new visualisation technique.

    What more could you want from a nights work!

  8.  

    For those interested in seeing the full workflow I have made it available for download.

    See you again soon.

    Ben

    #VizLikeAnArtist