Attribution modelling in R

Attribution modelling in R: an example

Here I walk through some examples of attribution modelling in R. It is a complex topic, and much more can be said about it than I can cover here. I will mostly go hands-on with the Markov model, using the ChannelAttribution package in R.

For this example we pull the data into a data frame from our REST API.

Pulling data via our REST API

Diving straight into code here:

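# httr makes the HTTP request; jsonlite parses the JSON response
library(httr)
library(jsonlite)

# Replace USER_NAME and API_KEY with your own credentials
data_req <- GET("https://api.windsor.ai/USER_NAME/USER_NAME_attribution/public/conversions?api_key=API_KEY")
stop_for_status(data_req)  # fail early on an HTTP error

# Parse the JSON body into a flat data frame
journey_data <- content(data_req, as = "text", encoding = "UTF-8")
journey_data_json <- fromJSON(journey_data, flatten = TRUE)
journey_data_df <- as.data.frame(journey_data_json)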

The data retrieved then looks like this.

[Image: customer journeys for attribution modelling]

The data contains both converting and non-converting journeys. This is important for the model to give reliable attribution values. In the above example the journeys contain both clicks and impressions. The image shows source, medium and campaign along the customer journey, but the modelling can just as easily go down to keyword level. That way one gets a data-driven attributed value for every keyword along every customer journey.
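
To follow the rest of the post without API access, a small hand-made data frame with the same shape can stand in for journey_data_df. This is only a sketch: the column names sourcepath, totalconversions and totalconversionvalue are taken from the calls further down, while all values and the totalnulls column (a count of non-converting journeys per path) are invented for illustration. The ">" separator is the default path separator in ChannelAttribution.

```r
# Toy stand-in for journey_data_df; all values are invented for illustration.
toy_journeys <- data.frame(
  sourcepath = c("google > facebook > google",
                 "facebook > newsletter",
                 "google",
                 "newsletter > google"),
  totalconversions     = c(2, 1, 3, 0),       # converting journeys per path
  totalconversionvalue = c(200, 50, 240, 0),  # revenue per path
  totalnulls           = c(5, 8, 4, 6),       # hypothetical column: non-converting journeys
  stringsAsFactors = FALSE
)

# Split one path into its touchpoints to check the separator works as expected.
touchpoints <- trimws(strsplit(toy_journeys$sourcepath[1], ">", fixed = TRUE)[[1]])
print(touchpoints)
```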

Loading R packages and calculating the attributions

We use the following R packages for this example.

# Install these libraries (only do this once)
# install.packages("ChannelAttribution")
# install.packages("reshape")
# install.packages("ggplot2")
# install.packages("dplyr")  # provides top_n(), used further down

# Load the packages
library(ChannelAttribution)
library(reshape)
library(ggplot2)
library(dplyr)

Here we calculate the first-touch, last-touch and linear-touch models.

H <- heuristic_models(journey_data_df, 'sourcepath', 'totalconversions', var_value='totalconversionvalue')
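
To make the three heuristics concrete, here is a minimal base-R sketch of the credit rules on a made-up data set. It is not the package's implementation, and the helper name attribute_all is hypothetical. In this sketch first touch gives all credit to the first channel in a path, last touch gives it to the last, and linear touch splits it evenly across the touchpoints.

```r
# Invented example paths and conversion counts.
paths       <- c("google > facebook > google", "facebook > newsletter", "google")
conversions <- c(2, 1, 3)

# Hypothetical helper: total credit per channel under one rule.
attribute_all <- function(rule) {
  per_path <- lapply(seq_along(paths), function(i) {
    steps <- trimws(strsplit(paths[i], ">", fixed = TRUE)[[1]])
    n <- length(steps)
    weights <- switch(rule,
      first  = as.numeric(seq_len(n) == 1),  # all credit to the first touchpoint
      last   = as.numeric(seq_len(n) == n),  # all credit to the last touchpoint
      linear = rep(1 / n, n)                 # equal credit to every touchpoint
    )
    data.frame(channel_name = steps, credit = conversions[i] * weights,
               stringsAsFactors = FALSE)
  })
  long <- do.call(rbind, per_path)
  tapply(long$credit, long$channel_name, sum)
}

first_touch  <- attribute_all("first")
last_touch   <- attribute_all("last")
linear_touch <- attribute_all("linear")

heuristics <- data.frame(
  channel_name = names(first_touch),
  first_touch  = as.numeric(first_touch),
  last_touch   = as.numeric(last_touch),
  linear_touch = as.numeric(linear_touch)
)
print(heuristics)
```

Each column sums to the total number of conversions; the rules only differ in how they distribute it.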

And here we calculate the Markov model.

M <- markov_model(journey_data_df, 'sourcepath', 'totalconversions', var_value='totalconversionvalue', order = 1)
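
The idea behind a first-order Markov attribution can be sketched in base R: estimate the transition probabilities between channels (plus a start, a conversion and a null state), compute the probability of reaching a conversion, and then measure for each channel how much that probability drops when the channel is removed (the removal effect). Conversions are shared out in proportion to the removal effects. The data, the state names and the helper conversion_prob below are invented for illustration, and the package's own estimation may differ in detail. In the package, non-converting counts are passed via the var_null argument of markov_model; the call above does not use it.

```r
# Invented example data.
paths <- c("google > facebook", "facebook", "google", "newsletter > google")
conv  <- c(3, 1, 2, 0)  # converting journeys per path
nulls <- c(2, 4, 3, 5)  # non-converting journeys per path

split_path <- function(p) trimws(strsplit(p, ">", fixed = TRUE)[[1]])
channels <- sort(unique(unlist(lapply(paths, split_path))))
states   <- c("(start)", channels, "(conversion)", "(null)")

# Count transitions between consecutive states over all journeys.
counts <- matrix(0, length(states), length(states), dimnames = list(states, states))
for (i in seq_along(paths)) {
  steps <- c("(start)", split_path(paths[i]))
  for (k in seq_len(length(steps) - 1)) {
    counts[steps[k], steps[k + 1]] <- counts[steps[k], steps[k + 1]] + conv[i] + nulls[i]
  }
  last <- steps[length(steps)]
  counts[last, "(conversion)"] <- counts[last, "(conversion)"] + conv[i]
  counts[last, "(null)"]       <- counts[last, "(null)"] + nulls[i]
}

# Row-normalise to transition probabilities; absorbing rows stay all zero.
P <- counts / pmax(rowSums(counts), 1)

# Probability of ending in "(conversion)" when starting from "(start)".
conversion_prob <- function(P, removed = NULL) {
  transient <- setdiff(rownames(P), c("(conversion)", "(null)"))
  Q <- P[transient, transient, drop = FALSE]
  if (!is.null(removed)) Q[, removed] <- 0  # traffic into the removed channel is lost
  absorb <- solve(diag(length(transient)) - Q, P[transient, "(conversion)"])
  as.numeric(absorb)[match("(start)", transient)]
}

base_prob <- conversion_prob(P)
removal_effect <- sapply(channels, function(ch) 1 - conversion_prob(P, ch) / base_prob)

# Share total conversions in proportion to the removal effects.
markov_conversions <- sum(conv) * removal_effect / sum(removal_effect)
print(round(markov_conversions, 3))
```

In this toy data newsletter never converts directly, yet it still receives some credit because it feeds journeys into google.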

Then we join the two data frames by channel name so the attribution models are easier to compare.

attributions <- merge(H, M, by='channel_name')

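We remove the columns we don't need and keep only the interesting ones.

# Keep the channel name, the three heuristic conversion columns and the Markov conversions
# (check colnames(M): depending on the package version the Markov column may be named 'total_conversions')
attributions <- attributions[, colnames(attributions) %in% c('channel_name', 'first_touch_conversions', 'last_touch_conversions', 'linear_touch_conversions', 'total_conversion')]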

# Renames the columns
colnames(attributions) <- c('channel_name', 'first_touch', 'last_touch', 'linear_touch', 'markov_model')

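Before plotting we need to filter the data frame, as in our case there were more than 500 different converting sources. Here we keep the 10 channels with the most Markov-attributed conversions.

# top_n() comes from dplyr (install.packages("dplyr") if it is not installed)
attributions <- dplyr::top_n(attributions, 10, markov_model)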

Here we transform the data frame into long format so ggplot can use it more easily.

attributions <- melt(attributions, id.vars = 'channel_name')
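
For readers unfamiliar with melt(), the sketch below does the same wide-to-long reshaping in base R on an invented three-row table, so the layout ggplot expects is visible. The numbers are made up, and the column names variable and value mirror what melt() produces.

```r
# Invented wide table: one row per channel, one column per model.
wide <- data.frame(
  channel_name = c("facebook", "google", "newsletter"),
  first_touch  = c(1, 5, 0),
  markov_model = c(2.6, 2.6, 0.8),
  stringsAsFactors = FALSE
)

value_cols <- setdiff(names(wide), "channel_name")

# Long table: one row per (channel, model) pair.
long <- data.frame(
  channel_name = rep(wide$channel_name, times = length(value_cols)),
  variable     = rep(value_cols, each = nrow(wide)),
  value        = unlist(wide[value_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)
print(long)
```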

Plotting the data

And here we can plot the conversions in a bar chart.

ggplot(attributions, aes(channel_name, value, fill = variable)) +
  geom_bar(stat = 'identity', position = 'dodge') +
  ggtitle('Attributed conversions with the different models') +
  theme(axis.title.x = element_text(vjust = -2)) +
  theme(axis.title.y = element_text(vjust = +2)) +
  theme(title = element_text(size = 16)) +
  theme(plot.title = element_text(size = 20)) +
  ylab("")

In this example the chart looks like this.

[Image: attribution modelling in R]

To make attribution modelling more actionable, one has to join it with the cost data to get a ROAS or a CPA based on the chosen attribution model. That way one can allocate budget where it has the biggest impact. Multi-touch attribution models help significantly here because they simplify the analysis: one does not have to look at bounce rates, click-through rates and so on separately. Their effect is already reflected in the attributed conversions, which are then put into perspective against how much was spent on each channel.
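
As a sketch of that join, the example below merges attributed conversions with a cost table and computes CPA (cost divided by attributed conversions) and ROAS (attributed value divided by cost). Both tables, their column names and all numbers are invented for illustration; in practice the attributed values would come from the model output and the costs from the ad platforms.

```r
# Invented attribution results per channel.
attributed <- data.frame(
  channel_name       = c("facebook", "google", "newsletter"),
  markov_conversions = c(20, 50, 5),
  markov_value       = c(1000, 4000, 300),
  stringsAsFactors = FALSE
)

# Invented spend per channel.
costs <- data.frame(
  channel_name = c("facebook", "google", "newsletter"),
  cost         = c(500, 1000, 100),
  stringsAsFactors = FALSE
)

performance <- merge(attributed, costs, by = "channel_name")
performance$cpa  <- performance$cost / performance$markov_conversions  # cost per acquisition
performance$roas <- performance$markov_value / performance$cost        # return on ad spend

# Highest return first.
performance <- performance[order(-performance$roas), ]
print(performance)
```

Swapping the Markov columns for first-touch or last-touch values shows how much the ranking depends on the chosen model.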

Budget optimisations

Optimising the budget and making the data actionable is where our budget optimiser comes in handy. The budget optimiser in our software takes the impact of budget changes into account and gives you prioritised optimisations. Get in touch for a demo or sign up for a free trial!

Through our REST API it is also possible to pull the attributed conversions per keyword directly into Excel or a Google Sheet. Let us know if you are interested in this! The API documentation is here: https://windsor.ai/api-documentation/

This post has also been submitted to https://www.r-bloggers.com/, which has many more practical tips on how to use R.
