DAZL Documentation | Data Analytics A-to-Z Processing Language


Contents

index

business analytics

slug: step-index

Index Step

Purpose

Calculates performance indices that compare segment rates against a baseline, highlighting over/under-performing segments with statistical significance flags. Essential for identifying efficiency, effectiveness, and opportunity areas within data.

When to Use

  • Compare segment performance against overall average
  • Identify over/under-performing categories, regions, or segments
  • Analyze marketing efficiency across channels
  • Detect consumer behavior patterns by demographic
  • Compare conversion rates against baselines
  • Normalize data for fair cross-segment comparison
  • Find outlier performance requiring further investigation

How It Works

  1. Takes hierarchical data (typically from a cube step) with a measure and rate-based metric
  2. Establishes a baseline value based on the specified indexing approach
  3. Calculates an index value for each segment using the formula: (segment_rate / baseline_rate) × 100
  4. Identifies statistically significant deviations from the baseline (index value of 100)
  5. Optionally calculates segment composition percentages
  6. Adds flags for significantly over/under-performing segments
  7. Provides comprehensive statistics about index distribution

Parameters

Required

  • measure (string) - Base column to analyze (e.g., "conversion")
  • metric (string) - Rate-based metric to use (e.g., "mean", "median")

Optional

  • indexAgainst (string) - Baseline selection method (default: level0):
    • level0 - Use grand total (top level) as baseline
    • parent - Use parent segment as baseline (hierarchical comparison)
    • Custom filter criteria can be specified via baselineFilter
  • baselineFilter (object) - Custom criteria to identify baseline row
  • indexColumn (string) - Name for the index column (default: {measure}Index)
  • significanceThreshold (number) - Minimum deviation to flag as significant (default: 20)
  • includeComposition (boolean) - Add composition percentage column (default: true)
  • flagThreshold (boolean) - Add index flag columns (default: true)

Index Interpretation

  • Index = 100: Segment performs exactly at baseline level
  • Index > 100: Segment over-performs the baseline (e.g., 120 = 20% higher)
  • Index < 100: Segment under-performs the baseline (e.g., 80 = 20% lower)
  • Significant deviation: When absolute difference from 100 exceeds threshold

Input Requirements

  • Data must contain the measure column specified in the format measure.metric
  • Rate-based metrics are required (mean, median, etc.), not sum or count
  • For hierarchical indexing, a 'level' column is expected (typically from cube step)
  • For parent-based indexing, hierarchical structure must be clearly defined

Output Columns

All original columns plus:

  • segmentRate - Rate value for this segment
  • baselineRate - Baseline rate value used for comparison
  • {measure}Index - Calculated index value (segment/baseline × 100)
  • composition - Percentage of total frequency (optional)
  • indexFlag - Significance flag: 'significant' or 'at_par' (optional)
  • overUnderFlag - Direction flag: 'over', 'under', or null (optional)

Extras (Statistical Summary)

Detailed statistics stored in extras.stats.index:

  • measure - Measure used in analysis
  • metric - Rate-based metric used
  • indexAgainst - Baseline approach used
  • baselineRate - Value of the baseline rate
  • significanceThreshold - Deviation threshold for significance
  • segmentsAnalyzed - Number of segments analyzed
  • overIndexing - Count of significantly over-indexing segments
  • atPar - Count of segments at parity with baseline
  • underIndexing - Count of significantly under-indexing segments
  • strongestOverIndex - Details about highest over-indexing segment
  • weakestUnderIndex - Details about lowest under-indexing segment

Example Usage

Basic Index Analysis Against Total

index:
  measure: conversion
  metric: mean
  significanceThreshold: 15

Parent-Level Comparison

index:
  measure: engagement
  metric: mean
  indexAgainst: parent
  includeComposition: true
  flagThreshold: true

Custom Baseline

index:
  measure: response_rate
  metric: mean
  baselineFilter:
    region: "National"
    segment: "All"

Example Output

Input Data (From Cube Step)

level region customer_segment conversion.mean freq
0 All All 3.5 1000
1 North All 4.2 400
1 South All 3.0 600
2 North Premium 5.8 150
2 North Standard 3.2 250
2 South Premium 4.5 200
2 South Standard 2.2 400

Output Data (Index Analysis)

level region customer_segment conversion.mean freq segmentRate baselineRate conversionIndex composition indexFlag overUnderFlag
0 All All 3.5 1000 3.5 3.5 100.00 100.00 at_par null
1 North All 4.2 400 4.2 3.5 120.00 40.00 significant over
1 South All 3.0 600 3.0 3.5 85.71 60.00 significant under
2 North Premium 5.8 150 5.8 3.5 165.71 15.00 significant over
2 North Standard 3.2 250 3.2 3.5 91.43 25.00 at_par null
2 South Premium 4.5 200 4.5 3.5 128.57 20.00 significant over
2 South Standard 2.2 400 2.2 3.5 62.86 40.00 significant under

Extras (Statistical Summary)

extras.stats.index = {
  "measure": "conversion",
  "metric": "mean",
  "indexAgainst": "level0",
  "baselineRate": 3.5,
  "significanceThreshold": 20,
  "segmentsAnalyzed": 7,
  "overIndexing": 3,
  "atPar": 2,
  "underIndexing": 2,
  "strongestOverIndex": {
    "segment": "North|Premium",
    "index": 165.71,
    "composition": 15.00
  },
  "weakestUnderIndex": {
    "segment": "South|Standard",
    "index": 62.86,
    "composition": 40.00
  }
}

Analysis Applications

  • Marketing Optimization: Target segments with high composition but low index
  • Resource Allocation: Prioritize segments with high index and high composition
  • Opportunity Identification: Investigate significant deviations for insights
  • Performance Benchmarking: Compare performance across different dimensions
  • Normalization: Account for structural differences when comparing segments

Related Documentation

  • cube step - Create hierarchical data for index analysis
  • filter step - Focus on specific segments for detailed analysis
  • calculate step - Create custom scoring based on index values
  • chart step - Visualize index analysis with charts
  • sort step - Sort index results by significance