
Customer segmentation is a powerful tool that lets you get to know your clients, treat them differently and better fulfil their various needs, which is a key success factor in the retail industry. In this blog post, I want to dive into the data science behind customer segmentation and explain one of the most widely used techniques: RFM analysis.

What is RFM analysis?

RFM stands for Recency, Frequency, Monetary Value. RFM analysis is a technique for segmenting customers based on their transaction history. It allows us to collect insights about consumer behaviour and optimize marketing strategy accordingly. In particular, one can leverage RFM to create personalized special offers that improve sales and increase customer retention.

The RFM analysis is based on three metrics, which measure different (but equally important) customer characteristics: How much time has passed since the last purchase? (Recency), How many transactions were made? (Frequency), and How much money was spent? (Monetary Value).

 Without further ado, let me show you how to conduct the RFM analysis. 

Datasets used for RFM analysis

For our RFM analysis, we used the ‘Online Retail II UCI’ dataset from Kaggle (available at https://www.kaggle.com/mashlyn/online-retail-ii-uci/data).

This Kaggle dataset contains information about transactions: the date (‘trans_date’), the amount of money spent (‘trans_amount’) and the customer id (‘customer_id’).

Customer_id   Trans_date   Trans_amount
CS5295        2013-02-11   35
CS4768        2015-03-15   39
CS2122        2013-02-26   52
CS1217        2011-11-16   99
CS1850        2013-11-20   78

To perform the analysis, we need to transform the dataset in such a way that each row contains data regarding one customer:

  • number of months since the last purchase (Recency),
  • number of purchases made (Frequency),
  • the total amount of money spent (Monetary Value).

Below, you can see a fragment of the transformed data frame.

Customer_id   Recency   Frequency   Monetary value
CS1112        2.004     15          1012
CS1113        1.150     20          1490
CS1114        1.051     19          1432
CS1115        0.361     22          1659
CS1116        6.670     13          857
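The transformation described above can be sketched with pandas. This is only an illustration under stated assumptions: the toy transactions are made up, the helper name `build_rfm` is hypothetical, recency is measured from the latest transaction date in the data, and a month is approximated as 30.44 days. The original analysis may have used a different reference date or month length.

```python
import pandas as pd

# Toy transactions using the column names from the post; the values are made up.
transactions = pd.DataFrame({
    "customer_id": ["CS1", "CS1", "CS2", "CS3", "CS3", "CS3"],
    "trans_date": ["2015-01-10", "2015-03-01", "2014-06-15",
                   "2015-02-20", "2015-03-10", "2015-03-15"],
    "trans_amount": [35, 39, 52, 99, 78, 20],
})
transactions["trans_date"] = pd.to_datetime(transactions["trans_date"])


def build_rfm(df, snapshot_date=None, days_per_month=30.44):
    """Return one row per customer: recency (months), frequency, monetary value."""
    if snapshot_date is None:
        # Assumption: recency is measured from the latest transaction in the data.
        snapshot_date = df["trans_date"].max()
    grouped = df.groupby("customer_id").agg(
        last_purchase=("trans_date", "max"),
        frequency=("trans_date", "count"),
        monetary_value=("trans_amount", "sum"),
    )
    # Days since the last purchase, converted to (approximate) months.
    grouped["recency"] = (snapshot_date - grouped["last_purchase"]).dt.days / days_per_month
    return grouped[["recency", "frequency", "monetary_value"]].reset_index()


rfm = build_rfm(transactions)
print(rfm)
```

Passing an explicit `snapshot_date` (for example, the day the analysis is run) keeps recency values comparable between runs on a growing dataset.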

Assigning RFM scores

The first step of the RFM analysis is to score each customer based on transaction characteristics.

There are several ways of doing this:

  • by using quantiles: we rank the customers using the chosen metric from the best to the worst one, then divide ranked customers into groups of equal sizes and assign each group a score. 
  • by using predefined boundaries: we predefine what score is assigned to the given value of a metric based on business knowledge. For example, for frequency, customers who made 0-10 purchases get score 1, those who made 11-20 purchases get score 2, and so on.
  • by using machine learning: we will cover this case in the next post.
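As a rough illustration of the predefined-boundaries option, `pandas.cut` can map a metric onto fixed score bands. The purchase counts and the bin edges beyond 20 are made up for this sketch, and it assumes a customer with exactly 10 purchases falls into the first band.

```python
import pandas as pd

# Hypothetical purchase counts; the bin edges are illustrative, not the author's.
frequency = pd.Series([3, 10, 11, 20, 27, 45], name="frequency")

# Bins are right-closed: [0, 10], (10, 20], (20, 30], (30, inf).
bins = [0, 10, 20, 30, float("inf")]
labels = [1, 2, 3, 4]
frequency_score = pd.cut(
    frequency, bins=bins, labels=labels, include_lowest=True
).astype(int)

print(frequency_score.tolist())  # [1, 1, 2, 2, 3, 4]
```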

In our case, we used the quantiles method. For each metric, we divided customers into five groups and assigned each group a score from 1 (the “worst”) to 5 (the “best”).

Recency score 1 is given to customers who made their last purchase a long time ago, and 5 to those who bought something recently. For frequency and monetary value, score 1 means the lowest number of transactions or amount of money spent, whereas 5 means the greatest.
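One possible way to implement the quantile scoring is `pandas.qcut`. The sketch below is not the code used in the original analysis: the ten customers are made up, the helper `quantile_score` is hypothetical, and ranking with `method="first"` is an assumption made so that tied values do not produce duplicate bin edges (as a side effect, customers with equal values can land in neighbouring groups).

```python
import pandas as pd

# Hypothetical RFM table with ten customers (values are made up).
rfm = pd.DataFrame({
    "customer_id": [f"C{i}" for i in range(10)],
    "recency": [0.4, 1.1, 1.2, 2.0, 2.5, 3.1, 4.0, 5.2, 6.7, 8.0],
    "frequency": [22, 20, 19, 15, 15, 14, 13, 12, 9, 4],
    "monetary_value": [1659, 1490, 1432, 1012, 990, 930, 857, 700, 450, 120],
})


def quantile_score(values, higher_is_better=True, n_groups=5):
    """Score customers from 1 (worst) to n_groups (best) using equal-sized groups."""
    # Unique ranks keep the quantile edges distinct; ties are broken by row order.
    ranks = values.rank(method="first")
    labels = list(range(1, n_groups + 1))
    if not higher_is_better:
        labels = labels[::-1]  # e.g. a low recency should get a high score
    return pd.qcut(ranks, q=n_groups, labels=labels).astype(int)


rfm["r_score"] = quantile_score(rfm["recency"], higher_is_better=False)
rfm["f_score"] = quantile_score(rfm["frequency"])
rfm["m_score"] = quantile_score(rfm["monetary_value"])
print(rfm)
```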

Creating segments for the RFM analysis

There are a few approaches to creating segments for the RFM analysis. I want to focus on the following two:

  1. based on a 3-digit code created by concatenating scores,
  2. based on the sum of scores.

Concat scores

In this method, all scores are concatenated into a 3-digit code. The “worst” customers will have the code 111 and the “best” ones 555. These codes are then used to assign customers to segments.
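The concatenation itself is a one-liner in pandas. The sketch below uses made-up scores and hypothetical column names (`r_score`, `f_score`, `m_score`), and assumes the digits are ordered recency, frequency, monetary value. The rules that map each code to a named segment are not shown, since the post does not list them.

```python
import pandas as pd

# Hypothetical per-customer scores (1 = worst, 5 = best).
scores = pd.DataFrame({
    "customer_id": ["A", "B", "C"],
    "r_score": [5, 1, 3],
    "f_score": [5, 1, 4],
    "m_score": [5, 1, 2],
})

# Concatenate the three scores as strings, in R-F-M order.
scores["rfm_code"] = (
    scores["r_score"].astype(str)
    + scores["f_score"].astype(str)
    + scores["m_score"].astype(str)
)

print(scores["rfm_code"].tolist())  # ['555', '111', '342']
```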

In the table below you can find a description of each segment.

Segment              Characteristic
Champions            The best customers, they bought and spent a lot and made their latest purchase recently.
Loyal Customers      Very good customers, they spent a lot.
Potential Loyalist   They are recent customers, but they already spent a lot.
New Customers        Recent customers, who made only some purchases.
Promising            Bought often and spent quite a lot, but made their last purchase some time ago.
Need Attention       Recency and monetary value above average.
About To Sleep       Below average recency and monetary value.
At Risk              They bought frequently but didn’t make any purchase for a long time.
Cannot Lose Them     The customers who spent a lot, but have been inactive for a while.
Hibernating          Customers with low frequency and monetary value, who have not bought anything for a long time.
Lost                 The worst customers, they didn’t make any purchase for a long time and they have never spent a lot.

Now, we can look at our segments.

Sum scores

A simpler approach is to divide customers into groups based on the sum of their scores (S), which in our case varies between 3 and 15. The choice of segment boundaries is arbitrary. In this analysis, we divide customers into 3 groups:

  • bronze: S < 5
  • silver: 5 <= S < 10
  • gold: S >= 10

This method gives a quick insight into which customers are more valuable. A drawback, however, is that customers with different buying behaviours can end up in the same segment.

Segment   Mean recency   Mean frequency   Mean monetary value   Number of users
Gold      1.6            21.8             1510.9                3675
Silver    2.8            14.9             889.6                 2352
Bronze    6.7            11.3             556.0                 862
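The sum-based segmentation and a per-segment summary like the one above can be sketched as follows. The scores and monetary values are made up, and the helper `sum_segment` is a hypothetical name; only the bronze, silver and gold boundaries come from the text.

```python
import pandas as pd

# Hypothetical scores and monetary values for five customers.
scores = pd.DataFrame({
    "r_score": [5, 1, 3, 2, 4],
    "f_score": [5, 1, 4, 2, 1],
    "m_score": [5, 2, 2, 5, 4],
    "monetary_value": [1500.0, 300.0, 900.0, 1100.0, 800.0],
})


def sum_segment(total):
    """Map the score sum S to a segment: bronze S < 5, silver 5 <= S < 10, gold S >= 10."""
    if total < 5:
        return "bronze"
    if total < 10:
        return "silver"
    return "gold"


scores["score_sum"] = scores[["r_score", "f_score", "m_score"]].sum(axis=1)
scores["segment"] = scores["score_sum"].apply(sum_segment)

# Per-segment summary: mean monetary value and number of customers.
summary = scores.groupby("segment")["monetary_value"].agg(["mean", "count"])
print(summary)
```

In this toy data, three customers with quite different score profiles (3-4-2, 2-2-5 and 4-1-4) all sum to 9 and end up in silver, which is the drawback mentioned above.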

 

Closing notes

The final step of the RFM analysis is to use knowledge about customers in practice by creating a marketing strategy and personalized offers. 

Stay tuned and find out how to use machine learning in customer analysis in my next article very soon.

Author: Agata Hanas, Machine Learning Researcher (NeuroSYS)