DAZL Documentation | Data Analytics A-to-Z Processing Language


Contents

rfm

business analytics

slug: step-rfm

RFM Step

Purpose

Performs RFM (Recency, Frequency, Monetary) analysis on customer transaction data to segment customers based on their purchase behavior. Creates powerful customer segmentation for targeted marketing campaigns and customer relationship management.

When to Use

  • Segment customers based on purchasing behavior
  • Identify high-value customers and at-risk customers
  • Prioritize marketing efforts and resource allocation
  • Create targeted marketing campaigns
  • Understand customer lifecycle stages
  • Analyze customer retention and churn potential
  • Develop loyalty programs based on behavior patterns

How It Works

  1. Takes transaction data with customer IDs, transaction dates, and monetary values
  2. Calculates three key metrics for each customer:
    • Recency: How recently the customer made a purchase (in days)
    • Frequency: How often the customer makes purchases (count of transactions)
    • Monetary: How much money the customer spends (total or average value)
  3. Scores each customer on each dimension using the selected scoring method
  4. Returns a comprehensive customer-level dataset with all RFM metrics and scores
  5. Preserves the original customer identifiers for easy integration with other data

Parameters

Required

  • idCol (string) - Column name containing customer identifiers (default: customerID)
  • dateCol (string) - Column name containing transaction dates (default: transactionDate)
  • amtCol (string) - Column name containing transaction amounts (default: amount)

Optional

  • type (string) - Scoring method to use (default: quintile):
    • quintile - 5 equal groups (1-5 scoring)
    • decile - 10 equal groups (1-10 scoring)
    • quartile - 4 equal groups (1-4 scoring)
    • custom - Custom number of groups specified by k parameter
  • k (integer) - Number of groups for custom scoring (only used when type is custom)
  • evenBinCounts (string) - Whether to ensure equal number of customers in each bin (default: false)

Input Requirements

  • Transaction-level data (multiple transactions per customer)
  • Valid customer identifier column
  • Valid transaction date column (must be in a format that can be parsed as a date)
  • Valid transaction amount column (must be numeric)

Output Columns

  • Original customer identifier (preserved from input)
  • recentDate - Most recent transaction date for the customer
  • recency - Days since most recent purchase
  • freq - Total number of transactions for the customer
  • monetary - Total monetary value of all transactions for the customer
  • rScore - Recency score (higher = more recent purchase)
  • fScore - Frequency score (higher = more frequent purchases)
  • mScore - Monetary score (higher = higher spending)

Interpretation Guide

Scoring Scale

  • For quintile scoring (default):
    • 5 = Top 20% (best customers)
    • 4 = Next 20%
    • 3 = Middle 20%
    • 2 = Next 20%
    • 1 = Bottom 20% (worst customers)

Common Segments

  • Champions (555): Recent, frequent buyers with high spend
  • Loyal Customers (454, 544, 545): Regular buyers with good spend
  • Potential Loyalists (543, 444, 435): Recent customers with average frequency/spend
  • New Customers (531, 521, 511): Bought recently, but not frequently
  • At-Risk (355, 354, 345): Above average recency, frequency and monetary
  • Cannot Lose (155, 154, 144): Used to buy frequently but haven't returned
  • Hibernating (111, 121, 131): Last purchase long ago, low frequency and value

Example Usage

Basic RFM Analysis

RFM:
  idCol: customer_id
  dateCol: purchase_date
  amtCol: total_amount

Custom Scoring Approach

RFM:
  idCol: member_number
  dateCol: transaction_date
  amtCol: revenue
  type: custom
  k: 7
  evenBinCounts: true

Example Output

Input Data (Transaction Level)

customer_id purchase_date total_amount
C001 2024-09-15 125.99
C001 2024-08-20 89.50
C001 2024-07-05 215.75
C002 2024-06-18 45.99
C003 2024-09-30 320.00
C003 2024-09-02 150.25
C004 2024-05-10 75.50

Output Data (Customer Level with Quintile Scoring)

customer_id recentDate recency freq monetary rScore fScore mScore
C001 2024-09-15 39 3 431.24 4 5 5
C002 2024-06-18 128 1 45.99 2 3 2
C003 2024-09-30 24 2 470.25 5 4 5
C004 2024-05-10 167 1 75.50 1 3 3

Next Steps After RFM

  • Calculate composite scores: rfmScore = rScore + fScore + mScore
  • Create customer segments: Use calculate-step to define segment labels
  • Visualize segments: Use chart-step to create RFM heatmaps or distribution charts
  • Target campaigns: Filter for specific segments for marketing actions
  • Track movement: Use the lag-step to track segment changes over time

Related Documentation

  • filter step - Pre-filter transaction data
  • calculate step - Create composite scores and segment labels
  • sort step - Sort customers by RFM scores
  • chart step - Visualize RFM segments