Customer Lifetime Value (CLTV) is a big deal, particularly in online clothing retail, where free delivery and returns mean retailers may spend more on their worst customers’ postage than those customers spend on stuff.

Last year, ASOS – the UK-based retailer – announced a potentially significant, machine-learning-based improvement in how it calculates CLTV and applies it in marketing. For the technically inclined, a full description can be found here, but I’ll pull out the most important point for our purposes:

We show how automatically learned features can be combined with handcrafted features to produce a model that is both aware of domain knowledge and can learn rich patterns of customer behavior from raw data.

Retailers build models that classify a given customer as valuable, and potentially how valuable (either as a ranking or in categories), according to signals associated with the customer. These signals are what is meant by ‘handcrafted features’, and for ASOS they include customer demographics, purchase history, returns history, and web and app session logs.

They are called handcrafted because people define them and supply them to the model, say from a data warehouse. One feature was country, for example (imagine that being British is predictive of high CLTV). Another was days since last order. ASOS included 132 of these features in their model.
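To make the idea concrete, here is a minimal sketch of what a few handcrafted features might look like in code. The field names, the customer record and the dates are all hypothetical, made up for illustration; they are not ASOS's actual features or data.

```python
from datetime import date

def handcrafted_features(customer, today):
    """Turn warehouse-style fields into a flat feature dict.

    All field and feature names here are hypothetical illustrations.
    """
    last_order = max(customer["order_dates"])
    return {
        # A country signal, encoded as a 0/1 flag.
        "country_is_gb": 1 if customer["country"] == "GB" else 0,
        # Recency: days elapsed since the most recent order.
        "days_since_last_order": (today - last_order).days,
        "n_orders": len(customer["order_dates"]),
        # Share of purchased items sent back (guarding against divide-by-zero).
        "return_rate": customer["n_items_returned"] / max(customer["n_items_bought"], 1),
    }

# A made-up customer record for illustration only.
customer = {
    "country": "GB",
    "order_dates": [date(2017, 1, 5), date(2017, 3, 20)],
    "n_items_bought": 8,
    "n_items_returned": 2,
}
features = handcrafted_features(customer, today=date(2017, 4, 1))
```

Each customer ends up as one row of numbers like this, which is the form a classifier can consume directly.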

ASOS’ data scientists combined this model, built on handcrafted features, with new signals learned from the data – signals that would previously have been impossible to access. The belief is that the richer model will provide more precise calculations of CLTV, and more profit. Here’s how they derived the learned signals:

High-value customers tend to browse products of higher value, less popular products and products that may not be at the lowest price on the market….lower-value customers will tend to appear together in product sequences during sales periods or for products that are priced below the market. This information is difficult to incorporate into the model using hand-crafted features as the number of sequences of product views grows combinatorially.

High-value customers tend to look at different products, at different times, than low-value customers. So the sequence of who looks at which product page may tell you a lot about the value of a customer. However, the sequences in which all ASOS customers view all ASOS products are not a feature you can simply put in a model, like country or days since last order.

The data science team used a machine learning framework that uses embeddings. It looks across the sequence of customers who viewed each product, then assigns each customer a numerical representation based on the kinds of customers he or she appears close to in those view sequences. This gives the team a data output that differentiates each customer based on the kinds of products they look at across platforms, with the assumption that this can help separate high-value from low-value customers.
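The intuition can be sketched with something much cruder than learned embeddings: simply counting which customers appear near each other in each product's view sequence. This is a stand-in for illustration, not the method ASOS used, and the customer names, product names and `window` parameter are all invented.

```python
from collections import Counter, defaultdict
from math import sqrt

def cooccurrence_vectors(view_sequences, window=1):
    """For each customer, count the other customers who appear within
    `window` positions of them in any product's view sequence."""
    vectors = defaultdict(Counter)
    for seq in view_sequences.values():
        for i, customer in enumerate(seq):
            lo, hi = max(0, i - window), min(len(seq), i + window + 1)
            for j in range(lo, hi):
                if j != i and seq[j] != customer:
                    vectors[customer][seq[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical data: product -> customers in the order they viewed it.
view_sequences = {
    "premium_coat": ["ann", "bob", "cat"],
    "designer_bag": ["bob", "ann", "cat"],
    "sale_tshirt": ["dan", "eve", "fay"],
    "sale_socks": ["eve", "dan", "fay"],
}
vectors = cooccurrence_vectors(view_sequences)
same_group = cosine(vectors["ann"], vectors["bob"])   # both browse premium items
cross_group = cosine(vectors["ann"], vectors["dan"])  # never appear together
```

Customers who keep turning up in the same sequences end up with similar vectors, while customers who never share a sequence have nothing in common. A learned embedding compresses the same kind of signal into a short dense vector per customer, which can then sit alongside the handcrafted features.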

Will it work? Their analysis concludes:

We obtained a significant uplift in AUC using embeddings of customers…This result shows that this approach is highly relevant and we are working to incorporate the technique into our live system at the point of writing this paper.

An uplift in AUC (area under the ROC curve) means that the system with the learned features was better than the system without them at distinguishing between the two classes (high-value and low-value) – enough to justify putting the model with handcrafted and learned features into production. They’re also planning to explore a model with only learned features.
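For readers who want to see what AUC measures, here is a small sketch. It uses the pairwise definition: the probability that a randomly chosen high-value customer gets a higher score than a randomly chosen low-value one. The labels and scores below are invented purely to illustrate an uplift; they are not ASOS's results.

```python
def auc(labels, scores):
    """Fraction of (positive, negative) pairs in which the positive
    example is scored higher; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Made-up example: 1 = high-value customer, 0 = low-value customer.
labels = [1, 1, 1, 0, 0, 0]
baseline_scores = [0.9, 0.4, 0.6, 0.5, 0.3, 0.2]  # hypothetical model without learned features
enriched_scores = [0.9, 0.7, 0.6, 0.5, 0.3, 0.2]  # hypothetical model with learned features

baseline_auc = auc(labels, baseline_scores)
enriched_auc = auc(labels, enriched_scores)
uplift = enriched_auc - baseline_auc
```

An AUC of 0.5 is no better than guessing and 1.0 is a perfect ranking, so a positive `uplift` means the second model puts more high-value customers ahead of low-value ones.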

What’s it mean?

First, it’s an interesting advance in how analysis of digital behavior (all of it) can inform better marketing activities. Second, it opens up new angles of analytical attack for ASOS, untested by rivals. Third, it demonstrates how new kinds of natural language processing and machine learning might be employed by retailers and marketers against a number of potentially unrelated challenges.