DAZL Documentation | Data Analytics A-to-Z Processing Language


Recipe: Predict Customer Spending Using Machine Learning

category: business analytics / predictive modeling

Problem

You want to predict spending for new customers based on historical data so that you can:

  • Estimate potential revenue per customer
  • Identify high-value or low-value prospects
  • Inform resource allocation or marketing strategies
  • Evaluate model performance and refine predictive features

Solution

Train a machine learning model on historical customer data, then apply it to new customers to generate predicted outcomes:

  • loadInline step — Define historical training data and new customer data inline
  • trainModel step — Train a predictive model using selected features and target
  • useModel step — Apply the trained model to new customers to predict outcomes
  • print step — Display predictions for analysis or reporting

Step Sequence

loadInline step → trainModel step → loadInline step → useModel step → print step

Input Datasets

  • Training dataset: Historical records including features and target variable
    • Features: numeric or categorical fields used for prediction (e.g., age, income)
    • Target: outcome variable to predict (e.g., spending)
  • Prediction dataset: New records with feature values to score
  • Both datasets must be structured as arrays of associative arrays (rows)
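The "array of associative arrays" shape can be pictured outside DAZL as a list of dictionaries. The Python sketch below is only an illustration of that shape: validate_rows is a hypothetical helper, not a DAZL function, and the rule that every row must carry the same fields is an assumption made for the example.

```python
def validate_rows(rows):
    """Return the field names shared by all rows, or raise if the shape is wrong."""
    if not isinstance(rows, list) or not rows:
        raise ValueError("dataset must be a non-empty list of rows")
    expected = None
    for index, row in enumerate(rows):
        if not isinstance(row, dict):
            raise ValueError(f"row {index} is not an associative array")
        if expected is None:
            expected = set(row)  # first row defines the field names
        elif set(row) != expected:
            raise ValueError(f"row {index} has different fields than row 0")
    return sorted(expected)


# Same rows as the recipe's example workflow.
training_rows = [
    {"age": 22, "income": 38000, "spend": 800},
    {"age": 25, "income": 45000, "spend": 1200},
    {"age": 29, "income": 56000, "spend": 1800},
]
prediction_rows = [
    {"age": 28, "income": 52000},
    {"age": 42, "income": 81000},
]

print(validate_rows(training_rows))    # ['age', 'income', 'spend']
print(validate_rows(prediction_rows))  # ['age', 'income']
```

The training rows carry the target field (spend) in addition to the features; the prediction rows carry the features only.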

Output Dataset

  • Original dataset enriched with predictions
  • Key columns: all input features plus predicted_<target> (customizable via output_column)
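As a rough picture of what "enriched with predictions" means, the Python sketch below appends one prediction per row under a column named predicted_<target> unless an explicit column name is given. append_predictions is a hypothetical helper written for this illustration, not part of DAZL, and the prediction values are copied from this recipe's Example Output rather than computed.

```python
def append_predictions(rows, predictions, target, output_column=None):
    """Return copies of rows with one prediction column added to each."""
    if len(rows) != len(predictions):
        raise ValueError("need exactly one prediction per row")
    # Default column name follows the predicted_<target> convention.
    column = output_column or f"predicted_{target}"
    return [{**row, column: value} for row, value in zip(rows, predictions)]


new_customers = [{"age": 28, "income": 52000}, {"age": 42, "income": 81000}]

# Values taken from the Example Output section, not computed here.
scored = append_predictions(new_customers, [1420, 2800], target="spend")
print(scored[0])  # {'age': 28, 'income': 52000, 'predicted_spend': 1420}

# An explicit name plays the role of output_column.
renamed = append_predictions(new_customers, [1420, 2800], "spend", "spend_forecast")
print(renamed[1])  # {'age': 42, 'income': 81000, 'spend_forecast': 2800}
```

The input rows are copied rather than modified, so the original dataset stays available for other steps.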

Step-By-Step Explanation

Step | Purpose | Notes
loadInline step | Load historical training data | Inline data definition for small or test datasets
trainModel step | Train ML model | Select target, features, and modelType; store trained model in workflow extras
loadInline step | Load new customers | Define dataset for scoring
useModel step | Apply trained model | Predictions added as a new column (output_column)
print step | Display predictions | Can visualize predicted outcomes or export results
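To make the train-then-apply sequence concrete, here is a minimal Python sketch of the same idea outside DAZL. It is not DAZL's implementation: fit_linear and predict are hypothetical names, and the sketch uses a closed-form least-squares fit on standardized features with no test split, which is an assumption. Its numbers should therefore not be expected to match what DAZL prints.

```python
from statistics import mean, pstdev


def fit_linear(rows, features, target):
    """Fit least squares on standardized features; return a plain-dict model."""
    means = {f: mean(r[f] for r in rows) for f in features}
    # "or 1.0" avoids dividing by zero when standardizing a constant column.
    stds = {f: pstdev([r[f] for r in rows]) or 1.0 for f in features}
    z = [[(r[f] - means[f]) / stds[f] for f in features] for r in rows]
    y_mean = mean(r[target] for r in rows)
    y = [r[target] - y_mean for r in rows]
    k = len(features)
    # Augmented normal equations: (Z^T Z | Z^T y).
    a = [
        [sum(zr[i] * zr[j] for zr in z) for j in range(k)]
        + [sum(zr[i] * yv for zr, yv in zip(z, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting (assumes Z^T Z is invertible).
    for col in range(k):
        pivot = max(range(col, k), key=lambda i: abs(a[i][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    weights = [a[i][k] / a[i][i] for i in range(k)]
    return {"features": features, "means": means, "stds": stds,
            "weights": weights, "intercept": y_mean}


def predict(model, row):
    """Score one row with the statistics and weights stored at training time."""
    z = [(row[f] - model["means"][f]) / model["stds"][f] for f in model["features"]]
    return model["intercept"] + sum(w * v for w, v in zip(model["weights"], z))


training_rows = [
    {"age": 22, "income": 38000, "spend": 800},
    {"age": 25, "income": 45000, "spend": 1200},
    {"age": 29, "income": 56000, "spend": 1800},
]
new_customers = [{"age": 28, "income": 52000}, {"age": 42, "income": 81000}]

model = fit_linear(training_rows, ["age", "income"], "spend")  # "trainModel"
scored = [{**row, "predicted_spend": round(predict(model, row), 2)}
          for row in new_customers]                            # "useModel"
print(scored)                                                  # "print"
```

The model dictionary keeps the training means and standard deviations so that scoring applies the same scaling as training, which is the reason the trained model is stored and passed to the scoring step rather than refit.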

Variations & Extensions

  • Train different model types (linear regression, logistic regression, k-NN) using modelType
  • Normalize numeric features (for example, normalize: true under params) or handle missing values via parameters
  • Split data into training and test sets using test_size for model evaluation
  • Integrate predicted results with charts or dashboards for visualization
  • Combine with filter step to score only a subset of customers

Concepts Demonstrated

  • Training and applying machine learning models in a workflow
  • Inline data definition for rapid prototyping
  • Feature selection and predictive modeling parameters
  • Appending predictions to new datasets for analysis

Related Recipes

  • Benchmark segments against historical averages (Index)
  • Classify customers by predicted outcomes for targeted campaigns
  • Contribution + Pareto analysis to identify drivers of spending

Notes & Best Practices

  • Ensure the prediction dataset contains the same feature fields, with the same names, that the model was trained on
  • Normalize or preprocess features consistently between training and scoring
  • Choose appropriate model type for target variable (numeric vs categorical)
  • Use output_column to clearly label predicted results
  • Optionally evaluate model performance using a test split or cross-validation
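Two of these checks are easy to automate before a workflow runs. The Python sketch below is a hedged illustration, not DAZL behavior: check_features and suggest_model_family are hypothetical helpers, and the rule "numeric target means regression, anything else means classification" is a simplifying assumption.

```python
def check_features(model_features, rows):
    """Raise ValueError if a scoring row lacks a feature used in training."""
    for index, row in enumerate(rows):
        missing = [name for name in model_features if name not in row]
        if missing:
            raise ValueError(f"row {index} is missing features: {missing}")


def suggest_model_family(rows, target):
    """Suggest regression for numeric targets, classification otherwise."""
    values = [row[target] for row in rows]
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "regression" if numeric else "classification"


features = ["age", "income"]
new_customers = [{"age": 28, "income": 52000}, {"age": 42, "income": 81000}]
check_features(features, new_customers)  # passes silently

print(suggest_model_family([{"spend": 800}, {"spend": 1200}], "spend"))
# regression
print(suggest_model_family([{"tier": "gold"}, {"tier": "basic"}], "tier"))
# classification
```

A spend target is numeric, so a regression-style model such as the linear one in this recipe fits; a label such as a customer tier would call for a classifier instead.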

Example Workflow (DSL)

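steps:
  # Load the historical training rows (features plus the spend target)
  - loadInline:
      data:
        - {age: 22, income: 38000, spend: 800}
        - {age: 25, income: 45000, spend: 1200}
        - {age: 29, income: 56000, spend: 1800}
      output: trainingData

  # Train a linear model that predicts spend from age and income
  - trainModel:
      dataset: trainingData
      target: spend
      features: [age, income]
      modelType: linear
      params:
        normalize: true      # normalize numeric features
        test_size: 0.2       # fraction of rows held out for evaluation
        learning_rate: 0.01
        max_iterations: 1000
      output: spendModel

  # Load the new customers to score (features only, no target)
  - loadInline:
      data:
        - {age: 28, income: 52000}
        - {age: 42, income: 81000}
      output: newCustomers

  # Apply the trained model; predictions land in predicted_spend
  - useModel:
      dataset: newCustomers
      model: spendModel
      output_column: predicted_spend

  # Display the scored rows
  - print:
      title: "Predicted Customer Spending"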

Example Output

age | income | predicted_spend
28 | 52000 | 1420
42 | 81000 | 2800