This article is part of Faraday's Out of the Lab series, which highlights the initiatives our Data Science team undertakes and the challenges it solves.
At Faraday, one of the key services we provide is segmenting our clients' customer bases into distinct groups, or personas. Understanding these personas enables our clients to easily create personalized marketing campaigns at scale.
For instance, our personas model might discover a cohesive group of young professionals living in cities, allowing marketers to target those customers with ad creative that is distinct from other campaigns targeting, say, mothers of large suburban households. Each group is discovered from client data using a time-tested method called k-means clustering.
What is k-means clustering?
K-means clustering is an unsupervised clustering algorithm that was first introduced in 1957 by Stuart Lloyd of Bell Labs. An unsupervised algorithm does not require the data to be labeled in order to train the model. This is an important feature of the k-means algorithm because it allows for the discovery of subgroups within the data without any previous assumptions about possible groupings — in this case, personas.
How does k-means clustering work?
When presented with a data set, the k-means algorithm first places random center points around which the data will eventually be clustered (circles, Fig. 1). Each data point is then assigned to the group of the center to which it is closest.
This happens by calculating the distance from a data point to each cluster center using an equation known as the Euclidean distance measure. In two dimensions, the Euclidean distance between a point (x₁, y₁) and a center (x₂, y₂) is √((x₁ − x₂)² + (y₁ − y₂)²) — the length of the straight line connecting them. The point is then assigned to the cluster center with the shortest connecting line (Fig. 2).
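The assignment step can be sketched in a few lines of Python. The coordinates below are illustrative 2-D points, not real customer data:

```python
import math

def euclidean(p, q):
    """Straight-line (Euclidean) distance between two 2-D points."""
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

def assign(points, centers):
    """Return, for each point, the index of its nearest center."""
    return [min(range(len(centers)), key=lambda i: euclidean(p, centers[i]))
            for p in points]

points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 7.5)]
centers = [(1.0, 1.5), (8.5, 8.0)]
print(assign(points, centers))  # → [0, 0, 1, 1]
```

The first two points land in the group of the first center, the last two in the group of the second — exactly the nearest-center rule described above.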
Once each point has been assigned to the randomly chosen cluster centers, the centers then "update," moving to the center of the group by recalculating the mean (average) position of all the points in the group (Fig. 3).
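The update step is just an average. Here is a minimal sketch, again on made-up coordinates, that moves each center to the mean position of its assigned points:

```python
def update_centers(points, labels, k):
    """Return new centers: the mean (x, y) of the points in each group."""
    centers = []
    for i in range(k):
        members = [p for p, lbl in zip(points, labels) if lbl == i]
        # This sketch assumes every group keeps at least one point.
        centers.append((sum(x for x, _ in members) / len(members),
                        sum(y for _, y in members) / len(members)))
    return centers

points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 7.5)]
labels = [0, 0, 1, 1]  # assignments from the previous step
print(update_centers(points, labels, 2))  # → [(1.25, 1.5), (8.5, 7.75)]
```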
After the cluster centers have updated to their new positions, each point is reassigned to a group using the Euclidean distance measure again. Most of the points will remain in their original groups, but some will move to a new group (Fig. 4). The centers must then be updated again to sit at the center of their new groups. This assign-and-update cycle repeats until the centers no longer move.
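Putting the two steps together gives the whole algorithm (often called Lloyd's algorithm). This is a compact sketch, assuming 2-D points and that no group goes empty:

```python
import math

def kmeans(points, centers, max_iter=100):
    """Repeat assign/update until the centers stop moving."""
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center's group.
        labels = [min(range(len(centers)),
                      key=lambda i: math.dist(p, centers[i])) for p in points]
        # Update step: each center moves to the mean of its group.
        new_centers = []
        for i in range(len(centers)):
            members = [p for p, lbl in zip(points, labels) if lbl == i]
            new_centers.append(tuple(sum(c) / len(members)
                                     for c in zip(*members)))
        if new_centers == centers:  # converged: no center moved
            break
        centers = new_centers
    return centers, labels

points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 7.5)]
centers, labels = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(centers, labels)  # → [(1.25, 1.5), (8.5, 7.75)] [0, 0, 1, 1]
```

In practice the final clusters can depend on where the random starting centers land, which is why library implementations typically run the algorithm several times and keep the best result.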
K-means and personas in the present day
All of the above describes k-means clustering as it worked back in 1957. Since then, extensions have been developed that allow the algorithm to handle categorical features — non-numeric attributes such as yes/no or multiple-choice values — alongside numeric ones. At Faraday, we create persona groups using a mix of these features.
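One common way to make categorical features usable by a distance-based algorithm like k-means is one-hot encoding, which turns each category into a 0/1 indicator. This is an illustrative sketch of the general idea, not necessarily the exact technique Faraday uses:

```python
def one_hot(values):
    """Map each categorical value to a 0/1 indicator vector.

    Example categories here (renter/owner) are made up for illustration.
    """
    categories = sorted(set(values))
    return [[1.0 if v == c else 0.0 for c in categories] for v in values]

print(one_hot(["renter", "owner", "renter"]))
# → [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
```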
As far as algorithms go, k-means is rather straightforward, but it has one major drawback: you have to tell it how many groups to look for, and it will always succeed in finding that many groups. That means if you tell the algorithm to look for 500 groups, it will find them — even if there is no reason for there to be 500 distinct groups within the data. If you're a marketer, you probably don't want to create 500 unique ads for every campaign. But should you make four? Or six? Or ten?
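A common ingredient in answering that question — one widely used heuristic, not necessarily Faraday's own method — is to measure how tight the clusters are for each candidate number of groups. The usual measure is the within-cluster sum of squared distances (often called inertia); as you try larger values of k, you look for the point where this number stops dropping sharply:

```python
import math

def inertia(points, centers, labels):
    """Within-cluster sum of squared distances: lower means tighter clusters."""
    return sum(math.dist(p, centers[lbl]) ** 2
               for p, lbl in zip(points, labels))

# Two points clustered around one center, for illustration:
print(inertia([(1.0, 1.0), (2.0, 2.0)], [(1.5, 1.5)], [0, 0]))  # → 1.0
```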
At Faraday, we target the optimal number of groups hidden within the data, rather than guessing and hoping we are right. Here's how we find the optimal number of personas when using the k-means algorithm for persona development.
Want to learn more? Watch our webinar!
Using Shopify? Our new app uses this approach to develop your very own personas. Try the Faraday Personas & Insights app for free!