Market Basket / Association Analysis
business analytics
slug: topic-map-business-analytics-market-basket-association-analysis
Vocabulary:
- market_basket: Set of items purchased together in a single transaction
- association_rule: If X then Y relationship (X→Y)
- antecedent: The "if" part (left-hand side) of rule
- consequent: The "then" part (right-hand side) of rule
- support: Fraction of transactions containing the itemset (P(X))
- confidence: Conditional probability of the consequent given the antecedent (P(Y|X))
- lift: Ratio of observed to expected co-occurrence (P(Y|X) / P(Y))
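- conviction: Implication strength, (1 - P(Y)) / (1 - P(Y|X)); values above 1 mean X→Y fails less often than independence would predict
- leverage: Difference between observed and expected co-occurrence (P(X,Y) - P(X) × P(Y))
- itemset: Set of one or more items
- frequent_itemset: Itemset whose support meets or exceeds the minimum support threshold
- apriori: Classic level-wise algorithm for mining frequent itemsets, from which association rules are generated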
Concepts:
- co_occurrence: Items appearing together in the same transaction; of interest when this happens more often than chance predicts
- positive_association: Lift > 1, items appear together more than expected
- negative_association: Lift < 1, items appear together less than expected
- independence: Lift ≈ 1, no association
- rule_quality: Balancing support, confidence, and lift
- actionable_rules: Rules useful for business decisions
- spurious_association: Statistical association produced by chance or a confounder rather than a causal relationship
- minimum_thresholds: Support, confidence, lift cutoffs to filter rules
- rule_redundancy: Multiple rules saying essentially the same thing
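The lift-based concepts and rule_redundancy above can be sketched in a few lines. This is a minimal illustration, not a standard API: the function names, the tolerance band `tol` around lift = 1, and the redundancy criterion (a rule is dropped when a simpler rule with the same consequent is at least as confident) are all assumptions chosen for the example.

```python
def classify_lift(lift, tol=0.05):
    """Label a rule by its lift; `tol` is an assumed band around 1.0."""
    if abs(lift - 1.0) <= tol:
        return "independence"
    return "positive_association" if lift > 1.0 else "negative_association"


def prune_redundant(rules):
    """Drop a rule when a simpler rule (proper-subset antecedent, same
    consequent) already reaches at least the same confidence.

    Each rule is a tuple: (antecedent frozenset, consequent frozenset, confidence).
    """
    kept = []
    for ante, cons, conf in rules:
        redundant = any(
            o_cons == cons and o_ante < ante and o_conf >= conf
            for o_ante, o_cons, o_conf in rules
        )
        if not redundant:
            kept.append((ante, cons, conf))
    return kept


# Hypothetical rules for illustration only.
rules = [
    (frozenset({"bread"}), frozenset({"butter"}), 0.80),
    (frozenset({"bread", "jam"}), frozenset({"butter"}), 0.78),   # adds nothing
    (frozenset({"bread", "milk"}), frozenset({"butter"}), 0.90),  # more specific and stronger
]
pruned = prune_redundant(rules)
```

Other redundancy criteria exist (for example comparing lift instead of confidence); the one shown is only one reasonable choice.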
Concepts_advanced:
- closed_itemsets: Frequent itemsets with no proper superset of the same support
- maximal_itemsets: Frequent itemsets with no frequent proper superset
- sequential_patterns: Order matters (A then B then C)
- hierarchical_associations: Rules at different aggregation levels
- multi_dimensional_associations: Rules involving multiple attributes beyond items
- negative_rules: X→NOT Y relationships
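The difference between closed_itemsets and maximal_itemsets is easiest to see on a small example. The sketch below assumes the frequent itemsets and their support counts have already been mined and are passed in as a dictionary; the function name and the toy counts are hypothetical.

```python
def closed_and_maximal(frequent):
    """Split frequent itemsets into closed and maximal ones.

    `frequent` maps frozenset itemsets to support counts and must contain
    every frequent itemset. A superset with the same support as a frequent
    itemset is itself frequent, so checking only frequent supersets suffices.
    """
    closed, maximal = set(), set()
    for itemset, count in frequent.items():
        supersets = [s for s in frequent if itemset < s]
        if not supersets:
            maximal.add(itemset)  # no frequent proper superset
        if all(frequent[s] < count for s in supersets):
            closed.add(itemset)   # every proper superset has lower support
    return closed, maximal


# Toy counts for illustration only.
frequent = {
    frozenset({"a"}): 4,
    frozenset({"b"}): 3,
    frozenset({"a", "b"}): 3,
}
closed, maximal = closed_and_maximal(frequent)
```

Here {a} is closed but not maximal, {b} is neither (its superset {a, b} has the same count), and {a, b} is both. Every maximal itemset is also closed, which is why maximal itemsets give the more compact summary.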
Procedures:
- identify_transactions: Define what constitutes a "basket"
- create_itemsets: List all unique items or dimension values
- calculate_support: Count frequency of each itemset
- filter_by_min_support: Keep only frequent itemsets
- generate_rules: Create X→Y from frequent itemsets
- calculate_confidence: P(Y|X) for each rule
- calculate_lift: Observed/expected ratio, confidence(X→Y) / support(Y), for each rule
- filter_by_min_confidence: Keep strong rules
- filter_by_min_lift: Keep rules with lift > 1 (or higher threshold)
- rank_rules: Order by lift, confidence, support, or combination
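The procedure steps above can be strung together into one small pipeline. The following is a sketch under stated assumptions: it enumerates every possible itemset by brute force (fine for a handful of items, impractical for a real catalogue), the function name `mine_rules`, the default thresholds, and the toy transactions are all made up for illustration, and ranking uses lift first with confidence and support as tie-breakers.

```python
from itertools import combinations


def mine_rules(transactions, min_support=0.4, min_confidence=0.6, min_lift=1.0):
    # identify_transactions: each inner list is treated as one basket
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    # create_itemsets: start from all unique items
    items = sorted(set().union(*baskets))

    # calculate_support + filter_by_min_support (brute force over all subsets)
    frequent = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            support = sum(itemset <= b for b in baskets) / n
            if support >= min_support:
                frequent[itemset] = support

    # generate_rules: split each frequent itemset into antecedent X and consequent Y
    rules = []
    for itemset, support_xy in frequent.items():
        for k in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), k):
                x = frozenset(antecedent)
                y = itemset - x
                # Subsets of a frequent itemset are frequent, so both lookups succeed.
                confidence = support_xy / frequent[x]  # calculate_confidence
                lift = confidence / frequent[y]        # calculate_lift
                # filter_by_min_confidence + filter_by_min_lift
                if confidence >= min_confidence and lift > min_lift:
                    rules.append({
                        "antecedent": tuple(sorted(x)),
                        "consequent": tuple(sorted(y)),
                        "support": support_xy,
                        "confidence": confidence,
                        "lift": lift,
                    })

    # rank_rules: lift first, then confidence, then support
    rules.sort(key=lambda r: (r["lift"], r["confidence"], r["support"]), reverse=True)
    return rules


# Toy transactions for illustration only.
transactions = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["bread", "jam"],
    ["milk", "jam"],
    ["bread", "butter", "jam"],
]
rules = mine_rules(transactions)
```

On this toy data only the bread/butter rules survive: jam→bread passes the confidence cutoff but has lift below 1, which shows why the lift filter is applied in addition to confidence.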
Procedures_detailed:
- support_calculation: count(X) / total_transactions
- confidence_calculation: count(X,Y) / count(X)
- lift_calculation: confidence(X→Y) / support(Y) = P(Y|X) / P(Y)
- conviction_calculation: [1 - support(Y)] / [1 - confidence(X→Y)] (infinite when confidence = 1)
- leverage_calculation: support(X,Y) - [support(X) × support(Y)]
- apriori_algorithm: Iteratively generate candidate k-itemsets from frequent (k-1)-itemsets, pruning any candidate that has an infrequent subset
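The formulas above translate directly into code, and the Apriori loop can be sketched alongside them. This is a minimal, unoptimized illustration: the function names, the count-based signature of `rule_metrics`, and the toy transactions are assumptions for the example, and production implementations use far more efficient counting.

```python
from itertools import combinations


def rule_metrics(n, n_x, n_y, n_xy):
    """Metrics for X→Y from raw counts: n transactions in total, n_x
    containing X, n_y containing Y, n_xy containing both."""
    support_x, support_y, support_xy = n_x / n, n_y / n, n_xy / n
    confidence = n_xy / n_x
    lift = confidence / support_y
    # Conviction divides by (1 - confidence), so treat confidence = 1 as infinite.
    conviction = float("inf") if confidence == 1 else (1 - support_y) / (1 - confidence)
    leverage = support_xy - support_x * support_y
    return {"support": support_xy, "confidence": confidence, "lift": lift,
            "conviction": conviction, "leverage": leverage}


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset, level by level."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def support(itemset):
        return sum(itemset <= b for b in baskets) / n

    # Level 1 candidates: every single item.
    candidates = {frozenset([item]) for b in baskets for item in b}
    frequent = {}
    k = 1
    while candidates:
        # Count candidates and keep those meeting the threshold.
        level = {c: support(c) for c in candidates}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
    return frequent


# Toy transactions for illustration only.
transactions = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["bread", "jam"],
    ["milk", "jam"],
    ["bread", "butter", "jam"],
]
frequent = apriori(transactions, min_support=0.4)
# bread→butter: 5 baskets, 4 with bread, 3 with butter, 3 with both
m = rule_metrics(5, 4, 3, 3)
```

The prune step is what distinguishes Apriori from brute-force enumeration: on the toy data the candidate {bread, butter, jam} is discarded without being counted, because its subset {butter, jam} is not frequent.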
Topics:
- product_recommendations
- cross_sell_optimization
- bundle_design
- store_layout_optimization
- customer_segment_profiling
- content_recommendation
- fraud_pattern_detection
- diagnostic_associations
- clickstream_analysis
- menu_optimization
Categories:
- pattern_mining
- co_occurrence_analysis
- rule_discovery
- recommendation_systems
- behavioral_analysis
Themes:
- discovering_hidden_patterns: Find non-obvious associations
- actionable_recommendations: Rules that drive business decisions
- statistical_vs_causal: Association doesn't imply causation
- incremental_revenue: Driving add-on purchases
Trends:
- personalized_associations: Rules specific to customer segments
- real_time_recommendations: Live association-based suggestions
- deep_learning_embeddings: Neural nets learn item associations
- graph_based_associations: Network analysis of item relationships
- contextual_associations: Rules vary by time, location, season
Use_cases:
- retail: "Customers who buy diapers also buy beer (lift=2.3) - place nearby or bundle"
- ecommerce: "Users viewing laptops then view laptop bags (confidence=65%, lift=3.1)"
- restaurant: "Customers ordering steak also order wine (lift=2.8) - train staff to upsell"
- streaming: "Users watching Show A also watch Show B (lift=4.2) - recommend"
- healthcare: "Patients with Symptom X often have Diagnosis Y (lift=3.5) - screening protocol"
- banking: "Customers with checking also get savings (lift=2.1) - cross-sell campaign"
- telecom: "Subscribers with unlimited data also get streaming bundle (lift=3.9)"
- hospitality: "Guests booking spa also book late checkout (lift=2.6) - package deal"
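When a use case quotes both confidence and lift, as the ecommerce example does, the two numbers together imply the consequent's base rate, because lift = confidence / support(Y). The short sketch below works that out; the function name is hypothetical and the inputs are simply the figures quoted in that example.

```python
def implied_base_rate(confidence, lift):
    """Rearranges lift = confidence / support(Y) to recover support(Y)."""
    return confidence / lift


# ecommerce example: confidence = 65%, lift = 3.1
base_rate = implied_base_rate(0.65, 3.1)
print(f"Implied share of users viewing laptop bags overall: {base_rate:.1%}")
```

So the rule says roughly one in five users views laptop bags in general, rising to about two in three among users who viewed laptops. Reading a lift figure without this base rate can overstate or understate how actionable a rule is.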