DAZL Documentation | Data Analytics A-to-Z Processing Language

How To Use Market Basket Analysis

business analytics

There are three fundamental metrics in market basket analysis: support, confidence, and lift.

1. Support - "How often does this appear?"

Formula: Support(X) = (Transactions containing X) / (Total Transactions)

What it measures: Frequency or popularity

Example from your data:

Milk appears in 10 out of 15 transactions
Support(Milk) = 10/15 = 0.67 = 67%

For a rule (X → Y):

Support(Milk → Bread) = Transactions with BOTH / Total = 6/15 = 40%
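
The support formula can be sketched in a few lines of Python. This is only an illustration: the `support` function and the `baskets` list are hypothetical, and the toy baskets are not the 15-transaction dataset discussed above.

```python
from typing import Iterable, List, Set


def support(transactions: List[Set[str]], items: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `items`."""
    if not transactions:
        raise ValueError("need at least one transaction")
    wanted = set(items)
    # A basket counts as a hit only if it contains all wanted items.
    hits = sum(1 for basket in transactions if wanted <= basket)
    return hits / len(transactions)


# Hypothetical toy baskets, not the dataset discussed in the text.
baskets = [
    {"milk", "bread"},
    {"milk", "butter"},
    {"bread"},
    {"milk", "bread", "butter"},
    {"eggs"},
]

print(support(baskets, ["milk"]))           # milk is in 3 of 5 baskets
print(support(baskets, ["milk", "bread"]))  # both items are in 2 of 5 baskets
```

Passing a single item gives the support of that item; passing both sides of a rule gives the support of the rule.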

Why it matters:

Low support = rare pattern (might be noise, or might be a valuable niche)
High support = common pattern (but might just be popular items)

Business use: Filter out patterns that happen too rarely to be actionable

2. Confidence - "How reliable is this pattern?"

Formula: Confidence(X → Y) = Support(X and Y) / Support(X)

What it measures: Conditional probability - "If someone buys X, how likely are they to also buy Y?"

Example from your data:

Butter → Milk has confidence = 0.8571 = 86%
This means: Of the 7 people who bought Butter, 6 also bought Milk (6/7 = 86%)

Why it matters:

High confidence = strong predictive power
If confidence(X → Y) = 100%, then EVERY time someone buys X, they buy Y

Business use: Determine how likely a recommendation will be relevant

Important asymmetry:

Confidence(Butter → Milk) = 6/7 = 86%
Confidence(Milk → Butter) = 6/10 = 60%
These differ because the denominators differ: 7 Butter transactions versus 10 Milk transactions.
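
The asymmetry is easy to check from raw counts. Because the total number of transactions cancels out, Support(X and Y) / Support(X) reduces to count(X and Y) / count(X). The sketch below assumes the counts stated in the examples (7 Butter transactions, 10 Milk transactions, 6 containing both); the `confidence` helper is a hypothetical name.

```python
def confidence(count_both: int, count_antecedent: int) -> float:
    """Confidence(X -> Y) = count(X and Y) / count(X)."""
    if count_antecedent <= 0:
        raise ValueError("the antecedent must occur at least once")
    return count_both / count_antecedent


# Counts taken from the examples in the text.
BUTTER, MILK, BOTH = 7, 10, 6

butter_to_milk = confidence(BOTH, BUTTER)  # denominator: Butter transactions
milk_to_butter = confidence(BOTH, MILK)    # denominator: Milk transactions

print(f"Butter -> Milk: {butter_to_milk:.2%}")
print(f"Milk -> Butter: {milk_to_butter:.2%}")
```

The numerator is the same in both directions; only the denominator changes, which is why the two confidences differ.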

3. Lift - "Is this a special relationship?"

Formula: Lift(X → Y) = Support(X and Y) / (Support(X) × Support(Y))

Or equivalently: Lift = Confidence(X → Y) / Support(Y)

What it measures: How much more likely Y is purchased when X is purchased, compared to Y's baseline popularity

Interpretation:

Lift = 1.0 → No association (items are independent)
Lift > 1.0 → Positive association (buying X increases likelihood of buying Y)
Lift < 1.0 → Negative association (buying X decreases likelihood of buying Y)

Example from your data:

Peanut Butter → Jelly: Lift = 15

Jelly normally appears in 1/15 = 6.7% of transactions
When someone buys Peanut Butter, Jelly appears in 100% of transactions
100% / 6.7% = 15× more likely!

Butter → Milk: Lift ≈ 1.29

Milk normally appears in 67% of transactions
When someone buys Butter, Milk appears in 86% of transactions
86% / 67% ≈ 1.29, so Butter buyers are only about 1.3× more likely than average to buy Milk
Milk is already very common, so most of this co-occurrence is what you'd expect by chance
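
Both forms of the lift formula can be worked through from counts. The sketch below is illustrative only: the function names are hypothetical, and the Peanut Butter count of 1 is inferred from the stated support (1/15) and confidence (100%) rather than quoted directly.

```python
from math import isclose


def lift(n_both: int, n_x: int, n_y: int, n_total: int) -> float:
    """Lift(X -> Y) = Support(X and Y) / (Support(X) * Support(Y))."""
    if min(n_x, n_y, n_total) <= 0:
        raise ValueError("counts must be positive")
    s_both, s_x, s_y = n_both / n_total, n_x / n_total, n_y / n_total
    return s_both / (s_x * s_y)


def lift_from_confidence(n_both: int, n_x: int, n_y: int, n_total: int) -> float:
    """Equivalent form: Confidence(X -> Y) / Support(Y)."""
    return (n_both / n_x) / (n_y / n_total)


TOTAL = 15  # transactions in the example dataset

# Peanut Butter -> Jelly: each item appears once, and they appear together.
pb_jelly = lift(n_both=1, n_x=1, n_y=1, n_total=TOTAL)

# Butter -> Milk: 7 Butter, 10 Milk, 6 with both.
butter_milk = lift(n_both=6, n_x=7, n_y=10, n_total=TOTAL)

print(round(pb_jelly, 2), round(butter_milk, 2))

# The two formulas agree up to floating-point error.
assert isclose(butter_milk, lift_from_confidence(6, 7, 10, TOTAL))
```

With these counts, Peanut Butter → Jelly comes out at 15 and Butter → Milk at 9/7, roughly 1.29.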

Why it matters:

Lift tells you if an association is meaningful vs just popular items appearing together
High lift = surprising, actionable insight
Lift near 1 = items co-occur about as often as chance predicts (often just two popular items)

Business use: Focus on high-lift rules for cross-selling and bundling

How They Work Together:

Metric	Question It Answers	Business Concern
Support	How often does this happen?	Is this pattern frequent enough to act on?
Confidence	How reliable is this prediction?	Will this recommendation be relevant?
Lift	Is this relationship special?	Is this insight surprising and valuable?

Practical Example from Your Data:

Rule: Peanut Butter → Jelly

Support = 6.7% (happens in 1/15 transactions)
Confidence = 100% (everyone who buys PB also buys Jelly)
Lift = 15 (Jelly is 15× more likely than its baseline rate)

Interpretation:

✅ Perfect predictive power (confidence = 100%)
✅ Extremely strong association (lift = 15)
⚠️ Rare occurrence (support = 6.7%)

Business action: Create PB&J bundle, display together, but don't expect huge volume

Rule: Butter → Milk

Support = 40% (happens in 6/15 transactions)
Confidence = 86% (most butter buyers also buy milk)
Lift ≈ 1.29 (only slightly above 1 - largely explained by Milk's popularity)

Interpretation:

✅ Frequent pattern (support = 40%)
✅ Good predictive power (confidence = 86%)
⚠️ Weak relationship (lift ≈ 1.29, close to 1)

Business action: Less interesting for targeted promotions, since both items are already bought frequently anyway

The Sweet Spot:

Ideal rules for action:

Medium-to-high support (frequent enough to matter)
High confidence (reliable prediction)
High lift (surprising, meaningful relationship)
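
One way to apply these three criteria is a simple threshold filter. The sketch below is a hypothetical illustration: the `Rule` class, the `failed_criteria` function, and the threshold values (20% support, 70% confidence, lift of 1.5) are all assumptions chosen for the example, not recommended settings. The two rules use the counts from the examples above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    antecedent: str
    consequent: str
    support: float
    confidence: float
    lift: float


def failed_criteria(
    rule: Rule,
    min_support: float = 0.20,     # assumed threshold, tune for your data
    min_confidence: float = 0.70,  # assumed threshold
    min_lift: float = 1.50,        # assumed threshold
) -> List[str]:
    """Return the names of the criteria the rule does not meet."""
    failed = []
    if rule.support < min_support:
        failed.append("support")
    if rule.confidence < min_confidence:
        failed.append("confidence")
    if rule.lift < min_lift:
        failed.append("lift")
    return failed


rules = [
    # support 1/15, confidence 1/1, lift 15
    Rule("Peanut Butter", "Jelly", 1 / 15, 1.0, 15.0),
    # support 6/15, confidence 6/7, lift (6/7) / (10/15) = 9/7
    Rule("Butter", "Milk", 6 / 15, 6 / 7, 9 / 7),
]

for rule in rules:
    problems = failed_criteria(rule)
    verdict = "actionable" if not problems else "fails on " + ", ".join(problems)
    print(f"{rule.antecedent} -> {rule.consequent}: {verdict}")
```

Under these assumed thresholds, Peanut Butter → Jelly is held back only by its low support and Butter → Milk only by its low lift, which mirrors the interpretation of the two rules above.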