DAZL Documentation | Data Analytics A-to-Z Processing Language



Market Basket / Association Analysis

business analytics

slug: topic-map-business-analytics-market-basket-association-analysis

Vocabulary:

  • market_basket: Set of items purchased together
  • association_rule: "If X then Y" relationship, written X→Y
  • antecedent: The "if" part (left-hand side) of a rule
  • consequent: The "then" part (right-hand side) of a rule
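  • support: Fraction of transactions that contain the itemset, P(X)
  • confidence: Conditional probability of Y given X, P(Y|X)
  • lift: Ratio of observed to expected co-occurrence, P(Y|X) / P(Y)
  • conviction: Implication strength: how often X would occur without Y under independence, relative to how often it actually does; values above 1 mean the rule fails less often than chance
  • leverage: Difference between observed and expected co-occurrence, P(X,Y) - P(X)P(Y)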
  • itemset: Collection of one or more items
  • frequent_itemset: Itemset whose support meets the minimum support threshold
  • apriori: Classic level-wise algorithm for mining frequent itemsets, from which association rules are derived

Concepts:

  • co_occurrence: Items appearing together in the same basket; of interest when this happens more often than chance would predict
  • positive_association: Lift > 1, items appear together more than expected
  • negative_association: Lift < 1, items appear together less than expected
  • independence: Lift ≈ 1, no association
  • rule_quality: Balancing support, confidence, and lift
  • actionable_rules: Rules useful for business decisions
  • spurious_association: Statistical association without a causal relationship
  • minimum_thresholds: Support, confidence, and lift cutoffs used to filter rules
  • rule_redundancy: Multiple rules saying essentially the same thing
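
The lift-based concepts above (positive_association, negative_association, independence) can be illustrated with a minimal Python sketch. The function name `classify_lift` and the `tolerance` band that decides what counts as "≈ 1" are assumptions made for illustration; the source does not specify a cutoff.

```python
def classify_lift(lift: float, tolerance: float = 0.1) -> str:
    """Label a rule by its lift.

    `tolerance` is an assumed band around 1.0 that is treated as
    "no association"; choose it to suit the data volume and noise.
    """
    if lift > 1 + tolerance:
        return "positive_association"
    if lift < 1 - tolerance:
        return "negative_association"
    return "independence"


print(classify_lift(2.3))   # positive_association
print(classify_lift(0.5))   # negative_association
print(classify_lift(1.02))  # independence
```

A wider band is usually safer for rules with low support, where lift estimates are noisy.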

Concepts_advanced:

  • closed_itemsets: Itemsets with no proper superset of the same support
  • maximal_itemsets: Frequent itemsets with no proper superset that meets minimum support
  • sequential_patterns: Patterns in which order matters (A, then B, then C)
  • hierarchical_associations: Rules at different aggregation levels
  • multi_dimensional_associations: Rules involving multiple attributes beyond items
  • negative_rules: X→NOT Y relationships
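
To make the difference between closed_itemsets and maximal_itemsets concrete, here is a small Python sketch. It assumes the input already contains every frequent itemset with its support count; the function name `closed_and_maximal` and the sample counts are hypothetical.

```python
from typing import Dict, FrozenSet, Set, Tuple

Itemset = FrozenSet[str]


def closed_and_maximal(
    frequent: Dict[Itemset, int]
) -> Tuple[Set[Itemset], Set[Itemset]]:
    """Split frequent itemsets into closed and maximal ones.

    `frequent` maps every frequent itemset to its support count.
    """
    closed, maximal = set(), set()
    for itemset, count in frequent.items():
        # Proper supersets that are themselves frequent.
        supersets = [s for s in frequent if itemset < s]
        if not supersets:
            maximal.add(itemset)  # nothing larger is frequent
        if all(frequent[s] < count for s in supersets):
            closed.add(itemset)   # every superset is strictly rarer
    return closed, maximal


# Hypothetical support counts, with a minimum support count of 2.
counts = {
    frozenset({"a"}): 4,
    frozenset({"b"}): 3,
    frozenset({"c"}): 2,
    frozenset({"a", "b"}): 3,
}
closed, maximal = closed_and_maximal(counts)
```

Here {b} is not closed because {a, b} has the same count, while {a} is closed but not maximal because {a, b} is still frequent. Every maximal itemset is also closed.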

Procedures:

  • identify_transactions: Define what constitutes a "basket"
  • create_itemsets: List all unique items or dimension values
  • calculate_support: Count frequency of each itemset
  • filter_by_min_support: Keep only frequent itemsets
  • generate_rules: Create X→Y from frequent itemsets
  • calculate_confidence: P(Y|X) for each rule
  • calculate_lift: Observed/expected ratio
  • filter_by_min_confidence: Keep strong rules
  • filter_by_min_lift: Keep rules with lift > 1 (or a higher threshold)
  • rank_rules: Order by lift, confidence, support, or combination
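
The procedure list above can be sketched end to end in plain Python. This is an illustrative, unoptimized Apriori-style implementation, not DAZL syntax; the function names, the sample baskets, and the threshold values are all assumptions.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search; returns {itemset: support}."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    # Level 1 candidates: every unique item.
    unique_items = sorted({item for b in baskets for item in b})
    candidates = [frozenset([item]) for item in unique_items]
    support, k = {}, 1
    while candidates:
        # calculate_support and filter_by_min_support for this level.
        level = {}
        for cand in candidates:
            s = sum(1 for b in baskets if cand <= b) / n
            if s >= min_support:
                level[cand] = s
        support.update(level)
        k += 1
        # Join frequent (k-1)-itemsets into k-itemsets, then prune any
        # candidate that has an infrequent (k-1)-subset.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return support


def generate_rules(support, min_confidence, min_lift):
    """Build X→Y rules from frequent itemsets, filter them, rank by lift."""
    rules = []
    for itemset, s_xy in support.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                x = frozenset(antecedent)
                y = itemset - x
                confidence = s_xy / support[x]
                lift = confidence / support[y]
                if confidence >= min_confidence and lift >= min_lift:
                    rules.append({"antecedent": x, "consequent": y,
                                  "support": s_xy, "confidence": confidence,
                                  "lift": lift})
    # rank_rules: lift first, then confidence, then support.
    rules.sort(key=lambda r: (r["lift"], r["confidence"], r["support"]),
               reverse=True)
    return rules


# identify_transactions: each set is one hypothetical basket.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
support = frequent_itemsets(transactions, min_support=0.5)
rules = generate_rules(support, min_confidence=0.7, min_lift=1.0)
for rule in rules:
    print(sorted(rule["antecedent"]), "->", sorted(rule["consequent"]),
          round(rule["confidence"], 2), round(rule["lift"], 2))
```

With these baskets only the two diapers/beer rules survive the filters, each with a lift of about 1.25. Pairs such as bread and milk are frequent but have lift below 1, so the lift filter removes them.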

Procedures_detailed:

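  • support_calculation: count(X) / total_transactions, where count(X) is the number of transactions containing X
  • confidence_calculation: count(X,Y) / count(X), where count(X,Y) is the number of transactions containing both X and Y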
  • lift_calculation: confidence(X→Y) / support(Y) = P(Y|X) / P(Y)
  • conviction_calculation: [1 - support(Y)] / [1 - confidence(X→Y)] (undefined when confidence = 1; conventionally treated as infinite)
  • leverage_calculation: support(X,Y) - [support(X) × support(Y)]
  • apriori_algorithm: Iteratively generate candidate k-itemsets from frequent (k-1)-itemsets, pruning any candidate that has an infrequent subset
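
The formulas above can be checked with a short Python sketch that works directly from transaction counts. The function name `rule_metrics` and the sample counts are hypothetical; returning infinity for conviction when confidence equals 1 is a common convention rather than something the source states.

```python
import math


def rule_metrics(count_x, count_y, count_xy, total):
    """Compute the metrics for a rule X→Y from raw transaction counts."""
    support_x = count_x / total
    support_y = count_y / total
    support_xy = count_xy / total
    confidence = count_xy / count_x
    lift = confidence / support_y
    leverage = support_xy - support_x * support_y
    # The conviction formula divides by zero when the rule never fails.
    if confidence == 1:
        conviction = math.inf
    else:
        conviction = (1 - support_y) / (1 - confidence)
    return {"support": support_xy, "confidence": confidence, "lift": lift,
            "leverage": leverage, "conviction": conviction}


# Hypothetical counts: 1000 baskets, X in 200, Y in 250, both in 100.
metrics = rule_metrics(count_x=200, count_y=250, count_xy=100, total=1000)
print(metrics)
```

For these counts, support is 0.1, confidence is 0.5, lift is 2.0, leverage is 0.05, and conviction is 1.5: Y is twice as likely in baskets that contain X as it is overall.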

Topics:

  • product_recommendations
  • cross_sell_optimization
  • bundle_design
  • store_layout_optimization
  • customer_segment_profiling
  • content_recommendation
  • fraud_pattern_detection
  • diagnostic_associations
  • clickstream_analysis
  • menu_optimization

Categories:

  • pattern_mining
  • co_occurrence_analysis
  • rule_discovery
  • recommendation_systems
  • behavioral_analysis

Themes:

  • discovering_hidden_patterns: Find non-obvious associations
  • actionable_recommendations: Rules that drive business decisions
  • statistical_vs_causal: Association doesn't imply causation
  • incremental_revenue: Driving add-on purchases

Trends:

  • personalized_associations: Rules specific to customer segments
  • real_time_recommendations: Live association-based suggestions
  • deep_learning_embeddings: Neural nets learn item associations
  • graph_based_associations: Network analysis of item relationships
  • contextual_associations: Rules vary by time, location, season

Use_cases:

  • retail: "Customers who buy diapers also buy beer (lift=2.3) - place nearby or bundle"
  • ecommerce: "Users viewing laptops then view laptop bags (confidence=65%, lift=3.1)"
  • restaurant: "Customers ordering steak also order wine (lift=2.8) - train staff to upsell"
  • streaming: "Users watching Show A also watch Show B (lift=4.2) - recommend"
  • healthcare: "Patients with Symptom X often have Diagnosis Y (lift=3.5) - screening protocol"
  • banking: "Customers with checking accounts also open savings accounts (lift=2.1) - cross-sell campaign"
  • telecom: "Subscribers with unlimited data also get streaming bundle (lift=3.9)"
  • hospitality: "Guests booking spa also book late checkout (lift=2.6) - package deal"