DAZL Documentation | Data Analytics A-to-Z Processing Language


How To Use Market Basket Analysis

business analytics

There are three fundamental metrics in market basket analysis: support, confidence, and lift.

1. Support - "How often does this appear?"

Formula: Support(X) = (Transactions containing X) / (Total Transactions)

What it measures: Frequency or popularity

Example from your data:

  • Milk appears in 10 out of 15 transactions
  • Support(Milk) = 10/15 = 0.67 = 67%

For a rule (X → Y):

  • Support(Milk → Bread) = Transactions with BOTH / Total = 6/15 = 40%

Why it matters:

  • Low support = rare pattern (might be noise, or might be a valuable niche)
  • High support = common pattern (but might just be popular items)

Business use: Filter out patterns that happen too rarely to be actionable
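
As a rough illustration of the support formula, the sketch below counts how many baskets contain a given set of items. The 15-basket list is a hypothetical stand-in, built only so that its counts match the figures quoted above (Milk in 10 baskets, Milk and Bread together in 6); it is not the tutorial's actual dataset, and the name support is an illustrative helper, not a DAZL function.

```python
# Hypothetical 15-basket dataset, constructed only to reproduce the counts
# quoted in the text (Milk in 10 baskets, Milk and Bread together in 6).
transactions = (
    [{"Milk", "Butter", "Bread"}] * 3
    + [{"Milk", "Butter"}] * 3
    + [{"Milk", "Bread"}] * 3
    + [{"Milk"}, {"Butter"}, {"Peanut Butter", "Jelly"},
       {"Bread"}, {"Eggs"}, {"Eggs"}]
)


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for basket in baskets if itemset <= basket)
    return hits / len(baskets)


milk_support = support({"Milk"}, transactions)           # 10 of 15 baskets
rule_support = support({"Milk", "Bread"}, transactions)  # 6 of 15 baskets

print(f"Support(Milk) = {milk_support:.2f}")
print(f"Support(Milk -> Bread) = {rule_support:.2f}")
```

Note that the support of a rule depends only on the items involved, not on the direction of the arrow: Support(Milk → Bread) and Support(Bread → Milk) are the same number.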


2. Confidence - "How reliable is this pattern?"

Formula: Confidence(X → Y) = Support(X and Y) / Support(X)

What it measures: Conditional probability - "If someone buys X, what % chance they also buy Y?"

Example from your data:

  • Butter → Milk has confidence = 0.8571 = 86%
  • This means: Of the 7 people who bought Butter, 6 also bought Milk (6/7 = 86%)

Why it matters:

  • High confidence = strong predictive power
  • If confidence(X → Y) = 100%, then EVERY time someone buys X, they buy Y

Business use: Determine how likely a recommendation will be relevant

Important asymmetry:

  • Confidence(Butter → Milk) = 6/7 = 86%
  • Confidence(Milk → Butter) = 6/10 = 60%
  • These differ because the denominators differ: 7 Butter transactions versus 10 Milk transactions
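
The asymmetry can be checked directly from raw counts. This is a minimal sketch that assumes the counts quoted in this tutorial (Butter in 7 transactions, Milk in 10, both together in 6); the function name confidence is an illustrative helper, not part of DAZL.

```python
def confidence(count_both, count_antecedent):
    """Confidence(X -> Y) = count(X and Y) / count(X)."""
    if count_antecedent == 0:
        raise ValueError("antecedent never occurs, so confidence is undefined")
    return count_both / count_antecedent


# Counts taken from the text: Butter in 7 baskets, Milk in 10, both in 6.
butter_to_milk = confidence(count_both=6, count_antecedent=7)
milk_to_butter = confidence(count_both=6, count_antecedent=10)

print(f"Confidence(Butter -> Milk) = {butter_to_milk:.2%}")
print(f"Confidence(Milk -> Butter) = {milk_to_butter:.2%}")
```

The numerator is the same in both directions; only the antecedent count in the denominator changes.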

3. Lift - "Is this a special relationship?"

Formula: Lift(X → Y) = Support(X and Y) / (Support(X) × Support(Y))

Or equivalently: Lift = Confidence(X → Y) / Support(Y)

What it measures: How much more likely Y is purchased when X is purchased, compared to Y's baseline popularity

Interpretation:

  • Lift = 1.0 → No association (items are independent)
  • Lift > 1.0 → Positive association (buying X increases likelihood of buying Y)
  • Lift < 1.0 → Negative association (buying X decreases likelihood of buying Y)

Example from your data:

Peanut Butter → Jelly: Lift = 15

  • Jelly normally appears in 1/15 = 6.7% of transactions
  • When someone buys Peanut Butter, Jelly appears in 100% of transactions
  • 100% / 6.7% = 15× more likely!

Butter → Milk: Lift ≈ 1.29

  • Milk normally appears in 67% of transactions
  • When someone buys Butter, Milk appears in 86% of transactions
  • 86% / 67% ≈ 1.29, so Butter buyers are only about 1.3× as likely to buy Milk
  • Milk is already in two-thirds of all transactions, so the lift of any rule that predicts Milk cannot exceed 1.5; this is a weak association

Why it matters:

  • Lift tells you if an association is meaningful vs just popular items appearing together
  • High lift = surprising, actionable insight
  • Lift near 1 = items are just both popular

Business use: Focus on high-lift rules for cross-selling and bundling
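
The two lift formulas above can be compared numerically. The sketch below computes lift both ways from raw counts, assuming the figures quoted in this tutorial (15 transactions; Butter in 7, Milk in 10, both in 6; Peanut Butter and Jelly each in 1, together in 1, as implied by the 100% confidence). The function names are illustrative, not DAZL functions.

```python
def lift(count_both, count_x, count_y, total):
    """Lift(X -> Y) = Support(X and Y) / (Support(X) * Support(Y))."""
    support_xy = count_both / total
    support_x = count_x / total
    support_y = count_y / total
    return support_xy / (support_x * support_y)


def lift_via_confidence(count_both, count_x, count_y, total):
    """Equivalent form: Confidence(X -> Y) / Support(Y)."""
    confidence_xy = count_both / count_x
    support_y = count_y / total
    return confidence_xy / support_y


# Counts assumed from the text (see the paragraph above).
pb_jelly = lift(count_both=1, count_x=1, count_y=1, total=15)
butter_milk = lift(count_both=6, count_x=7, count_y=10, total=15)
butter_milk_alt = lift_via_confidence(6, 7, 10, 15)

print(f"Lift(Peanut Butter -> Jelly) = {pb_jelly:.2f}")
print(f"Lift(Butter -> Milk) = {butter_milk:.2f}")
print(f"Same via confidence form = {butter_milk_alt:.2f}")
```

Unlike confidence, lift is symmetric: swapping X and Y leaves the value unchanged, because the denominator is the product of both supports.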


How They Work Together:

Metric      Question It Answers               Business Concern
Support     How often does this happen?       Is this pattern frequent enough to act on?
Confidence  How reliable is this prediction?  Will this recommendation be relevant?
Lift        Is this relationship special?     Is this insight surprising and valuable?

Practical Example from Your Data:

Rule: Peanut Butter → Jelly

  • Support = 6.7% (happens in 1/15 transactions)
  • Confidence = 100% (everyone who buys PB also buys Jelly)
  • Lift = 15 (15× more likely than random)

Interpretation:

  • ✅ Perfect predictive power (confidence = 100%)
  • ✅ Extremely strong association (lift = 15)
  • ⚠️ Rare occurrence (support = 6.7%, a single transaction, so the 100% confidence rests on one observation)

Business action: Create PB&J bundle, display together, but don't expect huge volume


Rule: Butter → Milk

  • Support = 40% (happens in 6/15 transactions)
  • Confidence = 86% (most butter buyers also buy milk)
  • Lift ≈ 1.29 (only a modest uplift - Milk is popular on its own)

Interpretation:

  • ✅ Frequent pattern (support = 40%)
  • ✅ Good predictive power (confidence = 86%)
  • ⚠️ Weak relationship (lift ≈ 1.29, close to 1)

Business action: Less interesting for targeted promotions, since both items are already bought frequently anyway


The Sweet Spot:

Ideal rules for action:

  • Medium-to-high support (frequent enough to matter)
  • High confidence (reliable prediction)
  • High lift (surprising, meaningful relationship)
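
One way to apply these three criteria together is to check each rule against a minimum value for every metric. The sketch below is only an illustration: the cutoffs MIN_SUPPORT, MIN_CONFIDENCE, and MIN_LIFT are hypothetical values chosen for this example (the tutorial does not prescribe any), and the Rule class and checks function are illustrative names, not part of DAZL. The counts are the ones quoted in this tutorial.

```python
from dataclasses import dataclass

# Hypothetical cutoffs for illustration only; tune them to your own data.
MIN_SUPPORT = 0.10
MIN_CONFIDENCE = 0.80
MIN_LIFT = 1.5


@dataclass(frozen=True)
class Rule:
    name: str
    count_both: int  # transactions containing X and Y
    count_x: int     # transactions containing X
    count_y: int     # transactions containing Y
    total: int       # all transactions

    @property
    def support(self):
        return self.count_both / self.total

    @property
    def confidence(self):
        return self.count_both / self.count_x

    @property
    def lift(self):
        return self.confidence / (self.count_y / self.total)


def checks(rule):
    """Return a pass/fail flag per metric for one rule."""
    return {
        "support": rule.support >= MIN_SUPPORT,
        "confidence": rule.confidence >= MIN_CONFIDENCE,
        "lift": rule.lift >= MIN_LIFT,
    }


rules = [
    Rule("Peanut Butter -> Jelly", count_both=1, count_x=1, count_y=1, total=15),
    Rule("Butter -> Milk", count_both=6, count_x=7, count_y=10, total=15),
]

for rule in rules:
    result = checks(rule)
    verdict = "in the sweet spot" if all(result.values()) else "needs review"
    print(rule.name, result, verdict)
```

With these particular cutoffs, neither example rule passes all three checks: the Peanut Butter rule falls short on support and the Butter rule falls short on lift, which mirrors the interpretation given for each rule above.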