Using propensity models to predict customer behavior

This article is part of Faraday's Out of the Lab series, which highlights initiatives our Data Science team undertakes and challenges they solve.

Machine learning models and rich data are at the core of what makes consumer predictions so powerful. They provide a wealth of information about customer behavior to brands looking to improve their predictive capabilities and level up their marketing strategies.

To predict customer behavior, Faraday builds propensity models — statistical analyses that estimate the likelihood of an individual performing a certain action. This type of prediction helps brands home in on the leads and customers who are likely to be the most integral to driving revenue and growing the brand.

Selecting a propensity model

Our prediction system classifies against outcomes using an array of predictive algorithms, including individual decision trees, random decision forests, logistic regressions, and neural networks. At run time, we test each case to determine which algorithm works best; usually it’s the random decision forest (RDF).

Random decision forests are made up of individual decision trees, which are classifier algorithms that look like flow charts, showing the choices made to reach a certain outcome.

[Figure: a decision tree drawn as a flowchart, splitting on age, income level, and whether or not the customer has kids, with each resulting group labeled by whether it is likely to convert.]

Predictions aren’t based on a single tree, though; we use dozens of trees to improve the accuracy of the algorithm, hence the random decision forest. We look at the universe of trees, their splits and branches, and choose the ones most promising for creating a path toward the desired outcome. This path leads from seemingly arbitrary data to a classification: a “yes” or “no” on the outcome. In other words, we’re deciding whether someone is a “good potential customer.” (In the image above, the “yes” is “mostly converted” and the “no” is “mostly did not convert.”)
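As a rough sketch of the idea, here is a random decision forest built with scikit-learn. This is illustrative only, not Faraday's production system: the feature names mirror the figure above, and the data and conversion rule are synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic customer records: age, income, has_kids (illustrative features)
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([
    rng.integers(18, 80, n),           # age
    rng.integers(20_000, 200_000, n),  # income
    rng.integers(0, 2, n),             # has_kids
])
# Toy outcome: in this fake data, higher-income customers with kids convert
y = ((X[:, 1] > 100_000) & (X[:, 2] == 1)).astype(int)

# Dozens of trees, each trained on a random subsample of rows and features;
# the forest's classification is the majority vote across trees
forest = RandomForestClassifier(n_estimators=50, random_state=0)
forest.fit(X, y)

# Classify a new individual: 1 means "likely to convert"
print(forest.predict([[35, 150_000, 1]]))
```

Each tree alone may overfit its slice of the data; averaging dozens of them is what makes the forest robust.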

While the classification seems binary, the reality is that often there's not a clear “yes” or “no.” We are dealing with the likelihood that someone performs an action, and that is never definitively 100% or 0%. During the prediction process, we employ propensity scores, which are a way of saying, “This person is more likely to be a good potential customer than not.” Every individual in the group we’re looking at is scored, and those in the top percentiles are recommended to the client as the best fits for the outcome they want to realize.
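A minimal sketch of that last step, using only the standard library: given a propensity score per individual, rank everyone and keep the top slice. The ids, scores, and cutoff are made up for illustration.

```python
def top_percentile(scores, pct=10):
    """Return the ids whose propensity scores fall in the top `pct` percent."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    cutoff = max(1, len(ranked) * pct // 100)
    return [person_id for person_id, _ in ranked[:cutoff]]

# Hypothetical propensity scores: P(person performs the action)
scores = {"a": 0.92, "b": 0.15, "c": 0.78, "d": 0.51, "e": 0.33,
          "f": 0.88, "g": 0.07, "h": 0.64, "i": 0.45, "j": 0.29}

print(top_percentile(scores, pct=20))  # → ['a', 'f']
```

No one here is a guaranteed “yes”; the output is simply the individuals most likely to be good potential customers.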

Using random decision forests to predict propensity

Now, if there are so many predictive algorithms we can use, why do we choose random decision forests? Well, for one, they’re the most explainable: with so many decision trees at play, we have a strong sense of the feature importances at each node split in a tree. These importances show us which attributes contribute most toward the success of an outcome, or most toward failure.
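For instance, scikit-learn exposes these split-based importances directly on a fitted forest. The dataset below is synthetic, with only the first feature actually driving the outcome, so its importance should dominate:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = (X[:, 0] > 0).astype(int)  # only feature 0 determines the label

forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Impurity-based importances, one per feature, summing to 1
for name, imp in zip(["age", "income", "has_kids"], forest.feature_importances_):
    print(f"{name}: {imp:.2f}")
```

The driving attribute surfaces with a much larger importance than the noise features, which is exactly the kind of explanation we can hand to a brand.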

RDF also handles missing data very well, especially compared to a logistic regression or neural network, which both tend to require complete values to make predictions. In a decision tree, if a value is missing, the record moves on to the next node because there are alternative decision paths for missing data. When you scale that to a whole RDF, the prediction remains accurate and isn’t perturbed by missing data.
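A toy illustration of that routing, in plain Python (this is a deliberately simplified stand-in, not Faraday's implementation): when the attribute a node would split on is missing, the record takes a fallback branch that splits on a different attribute instead of failing.

```python
def classify(record):
    """One hand-written tree path with a fallback branch for missing income."""
    income = record.get("income")
    if income is None:
        # Missing income: route down an alternative split on another attribute
        return "likely" if record.get("has_kids") else "unlikely"
    return "likely" if income > 100_000 else "unlikely"

print(classify({"income": 150_000}))                  # → likely
print(classify({"income": None, "has_kids": True}))   # → likely
print(classify({"income": None, "has_kids": False}))  # → unlikely
```

A regression or neural network given the same record would typically need the missing value imputed first; the tree just keeps walking.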

A third reason we use random decision forests is collinearity. RDFs may see that some data are correlated — say, mortgage value and income. A lot of the attributes the algorithm deals with are linearly related but don’t necessarily directly drive a certain customer behavior. An RDF finds areas of greatest information gain, meaning that it may realize there is interdependence between attributes, but it will choose the feature that gives the algorithm the greatest leverage over the prediction. This is particularly helpful for us, as we deal with tremendous amounts of customer data, and data on humans is generally noisy and often subverts expected distributions and assumptions.
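To make “greatest information gain” concrete, here is a stdlib-only sketch: a candidate split is scored by how much it reduces the entropy of the labels, and between two correlated attributes the tree keeps whichever scores higher. The incomes, mortgage values, and thresholds are invented for illustration.

```python
from math import log2

def entropy(labels):
    """Shannon entropy of a binary label list."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def information_gain(values, labels, threshold):
    """Entropy reduction from splitting records at `values <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted

# income and mortgage value rise together, but income splits the labels cleanly
income    = [40, 45, 60, 90, 110, 120, 150, 160]   # in $1,000s
mortgage  = [100, 160, 150, 310, 300, 230, 420, 400]
converted = [0, 0, 0, 0, 1, 1, 1, 1]

gain_income = information_gain(income, converted, threshold=100)
gain_mortgage = information_gain(mortgage, converted, threshold=250)
print(gain_income, gain_mortgage)  # income wins: it separates the classes perfectly
```

Both features carry overlapping signal, but the split on income yields the larger entropy reduction, so that is the feature the tree would choose at this node.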

How do brands act on propensity models’ predictions?

Brands can execute hyper-targeted marketing campaigns with these predictions, prioritizing the individuals ranked in the top percentile of the analyzed group, as they have the highest likelihood of performing the behavior the campaign is meant to elicit.

These predictions can support marketing initiatives across the entire customer lifecycle, predicting conversion, future engagement behaviors, churn, and even reactivation. Propensity predictions are crucial to marketing strategies today because they allow brands to be proactive rather than reactive, anticipating their customers’ behavior and serving up ads and content that will nudge them down the right path.

If you’re interested in learning more about how propensity models work and how your brand can utilize them, tune in to our Out of the Lab webinar series on Thursday, March 18th at 1:00PM EST.