R and Tableau. Writing your entire data source to an R data-frame.

In isolation this may seem like a strange thing to do, but when placed hand in hand with more advanced analytics it can be hugely powerful; one example of which I shall post next week, and discuss briefly towards the end of this article.

Essentially it gives us the power to manipulate our data in any way we wish before passing the data back into Tableau (as a second data source).


The structure of the script and set-up is fairly simple, and is outlined below (note this trick relies on having a RowID type field within your data set). We can use this to return any number of columns in our data source at a row level.

1st things first, as always, initiate your R serve connection.

Now comes writing the script…

SCRIPT_REAL(“

// Open our calculation

RowID <- data.frame(RowID = unlist(.arg1))

// Create a data frame that consists of our list of RowIDs

ColumnOne <- data.frame(ColumnOne = unlist(.arg2))
ColumnOne$RowID <- 1:nrow(ColumnOne)

// Create a data frame that consists of a list of all values

ColumnTwo <- data.frame(ColumnTwo = unlist(.arg3))
ColumnTwo$RowID <- 1:nrow(ColumnTwo)

// repeat script for each column

ColumnN <- data.frame(ColumnN = unlist(.argN+1))
ColumnN$RowID <- 1:nrow(ColumnN)

data <- Reduce(function(x, y) merge(x,y, all=TRUE),list(RowID, ColumnOne, ColumnTwo, ColumnN,…))

// Merge each of the individual data frames together on a common field.

write.csv(data,’TableauDataSource.csv’)

// As a test write our data frame to a .csv file to inspect

1

// Return the value 1 for every mark in our data as we must return a value from our calculation

“,attr([RowID]),attr([ColumnOne]),attr([ColumnTwo]),attr([ColumnN])…)

// Close script and declare the columns which relate to the different arguments passed into the script


Before we close the calculation we should be sure to change our script so it is computed along ‘Row ID’, meaning we will receive every single row without our data set, if we were to compute along a different variable then we would receive an aggregated version of our data set.

Now in order for our script to execute we need to build a view with the script in. You can do this by simply placing your RowID column on detail, before bringing our script onto detail also. The workbook can be downloaded from this link.

2017-01-18_19-18-05.png

The result…

2017-01-19_18-24-10.png


So what can this offer us? Well alongside a couple of colleagues of mine we have been working on creating a ‘loop’ between Tableau, R, SQL and Tableau.

By being able to write our entire data source to an R data frame we then have the power to manipulate our data in any way, shape or form. We can then write this new data structure back into Tableau (via SQL), as a second data source, which can be queried from within Tableau using filters and parameters, I like to label it as ‘filterable data prep’.

2017-01-18_15-31-45.png

An example is highlighted in the image below. Here I have started with Superstore as my base data structure. I have then written this structure into an R data frame before manipulating it in a way that performs and writes a Market Basket analysis of our initial data set, which can be filtered using any of the columns and values that may exist within our initial data set (in the example below I have excluded ‘technology’ products for instance.

Tableau - R - SQL - Tableau Loop.png

I will write and give an example of this loop in my next blog which I hope will add further clarity to how powerful this can become.

Ben

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s