DAZL Documentation | Data Analytics A-to-Z Processing Language


slug: recipe-exploratory-statistics-regression-analysis

Recipe: Regression analysis

category: exploratory statistics

Problem

You need to model relationships between variables:

  • predict numeric outcomes based on one or more predictors
  • quantify the strength and significance of relationships
  • identify key drivers of observed patterns

Solution

Follow these steps to perform regression analysis:

  • load the dataset
  • select dependent and independent variables
  • fit a regression model (simple linear, multiple, or custom)
  • review coefficients, p-values, and model diagnostics
  • optionally use predictions for downstream analysis
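
The fitting step above can be pictured outside DAZL with a minimal Python sketch. It is not DAZL syntax and not necessarily how [step-reg] is implemented: it assumes ordinary least squares solved through the normal equations, and the helper names (solve_linear_system, fit_ols) and the sample rows are hypothetical. The sample values were constructed to follow total_sales = 3 + 10 * quantity - 20 * discount exactly, so the fitted coefficients are easy to check by eye.

```python
# Hypothetical rows standing in for transactions_clean (constructed, not real data).
rows = [
    {"quantity": 1, "discount": 0.0, "total_sales": 13.0},
    {"quantity": 2, "discount": 0.1, "total_sales": 21.0},
    {"quantity": 3, "discount": 0.0, "total_sales": 33.0},
    {"quantity": 4, "discount": 0.2, "total_sales": 39.0},
    {"quantity": 5, "discount": 0.1, "total_sales": 51.0},
    {"quantity": 6, "discount": 0.3, "total_sales": 57.0},
]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be perfectly collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, outcome, predictors):
    """Return {term: coefficient} for outcome ~ intercept + predictors."""
    # Design matrix with a leading column of ones for the intercept.
    design = [[1.0] + [float(r[p]) for p in predictors] for r in rows]
    y = [float(r[outcome]) for r in rows]
    k = len(predictors) + 1
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    return dict(zip(["intercept"] + list(predictors), beta))


coefficients = fit_ols(rows, "total_sales", ["quantity", "discount"])
print(coefficients)
```

The normal equations keep the sketch dependency-free; a production fit would more likely rely on a numerically sturdier method such as a QR decomposition.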

Step Sequence

load step -> [step-reg] -> calculate step -> chart step

Input Datasets

  • transactions_clean — cleaned transactional data
  • Notes: requires a numeric dependent variable (e.g., total_sales) and one or more predictor variables (e.g., quantity, discount, and customer_segment encoded numerically)

Output Dataset

  • regression_results — table containing coefficients, standard errors, p-values, and optionally fitted values
  • Notes: can be used to interpret drivers or generate predictions
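
To make the shape of such a results table concrete, here is a hedged Python sketch for the simplest case, a single predictor. The function name simple_regression_table, the column names (term, coefficient, std_error, t_stat), and the sample values are assumptions for illustration, not the actual layout of regression_results. p-values are left out because they need a t-distribution CDF, which the Python standard library does not provide.

```python
import math


def simple_regression_table(xs, ys, predictor_name):
    """Fit y ~ intercept + slope * x and return one result row per term."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # Residual variance uses n - 2 degrees of freedom (two estimated parameters).
    s2 = sum(r ** 2 for r in residuals) / (n - 2)
    se_slope = math.sqrt(s2 / sxx)
    se_intercept = math.sqrt(s2 * (1 / n + mean_x ** 2 / sxx))
    return [
        {"term": "intercept", "coefficient": intercept,
         "std_error": se_intercept, "t_stat": intercept / se_intercept},
        {"term": predictor_name, "coefficient": slope,
         "std_error": se_slope, "t_stat": slope / se_slope},
    ]


# Hypothetical sample values, not taken from any real dataset.
quantity = [1, 2, 3, 4, 5]
total_sales = [12.0, 25.0, 31.0, 44.0, 53.0]

table = simple_regression_table(quantity, total_sales, "quantity")
for row in table:
    print(row)
```

A large absolute t_stat relative to the coefficient's standard error is what a small p-value in the real output would reflect.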

Step-By-Step Explanation

| Step | Purpose | Notes |
| --- | --- | --- |
| load step | Load dataset | Supports local file, database, or API sources |
| [step-reg] | Fit regression model | Example: linear regression of total_sales ~ quantity + discount |
| calculate step | Compute derived metrics or predictions | Optional: create predicted sales or residuals |
| chart step | Visualize relationships or residuals | Optional: scatterplots, regression lines, or residual plots |
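
The optional calculate step (predicted sales and residuals) amounts to the arithmetic in the following Python sketch. The coefficient values, the sample rows, and the helper add_predictions are hypothetical; in a real pipeline the coefficients would come from the fitted model rather than being typed in.

```python
# Hypothetical coefficients, as if read from a fitted model of
# total_sales ~ quantity + discount.
coefficients = {"intercept": 3.0, "quantity": 10.0, "discount": -20.0}

# Hypothetical rows standing in for transactions_clean.
rows = [
    {"quantity": 2, "discount": 0.1, "total_sales": 22.5},
    {"quantity": 4, "discount": 0.2, "total_sales": 38.0},
    {"quantity": 5, "discount": 0.0, "total_sales": 54.0},
]


def add_predictions(rows, coefficients, outcome):
    """Return copies of rows with 'predicted' and 'residual' columns added."""
    enriched = []
    for row in rows:
        predicted = coefficients["intercept"] + sum(
            coef * row[name]
            for name, coef in coefficients.items()
            if name != "intercept"
        )
        # Residual = observed minus predicted; positive means the model under-predicts.
        enriched.append({**row, "predicted": predicted, "residual": row[outcome] - predicted})
    return enriched


scored = add_predictions(rows, coefficients, "total_sales")
for row in scored:
    print(row)
```

The residual column is what a residual plot in the chart step would draw against the predicted values.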

Variations & Extensions

  • Extend the model to multiple regression by adding further independent variables
  • Use filter step to subset dataset before regression
  • Combine with [step-corr] to check predictor multicollinearity
  • Include dashboard step to display results interactively
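
The multicollinearity check mentioned above can be sketched in Python as a pairwise Pearson correlation scan over the predictors. This illustrates the idea only and is not the behavior of [step-corr]: the helper names, the extra units_shipped column, and the 0.8 cutoff (a common rule of thumb, not a DAZL default) are all assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def collinear_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for predictor pairs with |r| >= threshold."""
    names = list(columns)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                flagged.append((a, b, r))
    return flagged


# Hypothetical predictor columns; units_shipped is deliberately 2 * quantity.
predictors = {
    "quantity": [1, 2, 3, 4, 5, 6],
    "discount": [0.1, 0.0, 0.2, 0.0, 0.3, 0.1],
    "units_shipped": [2, 4, 6, 8, 10, 12],
}

flagged = collinear_pairs(predictors)
print(flagged)
```

A flagged pair suggests dropping or combining one of the two predictors before fitting, since near-duplicate predictors inflate coefficient standard errors.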

Concepts Demonstrated

  • Regression modeling
  • Predictor and outcome analysis
  • Model diagnostics and interpretation
  • Sequencing analysis and visualization steps

Related Recipes

  • Univariate analysis of numeric variables
  • Correlation analysis between numeric variables

Notes & Best Practices

  • Check assumptions: linearity, independence of errors, constant residual variance (homoscedasticity), and normality of residuals
  • Document variables and transformations applied
  • Use visualization to confirm model fit and identify outliers
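
Alongside plots, two quick numeric checks of model fit can be sketched in Python: R-squared and a scan for unusually large residuals. The observed and predicted values below are hypothetical, and the cutoff of 2 standard deviations is a rule-of-thumb assumption rather than a fixed standard.

```python
import math


def r_squared(observed, predicted):
    """Share of variance in observed values explained by the predictions."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def flag_outliers(observed, predicted, z_cutoff=2.0):
    """Return indices whose residual lies more than z_cutoff sample SDs from the mean residual."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    n = len(residuals)
    mean_res = sum(residuals) / n
    sd = math.sqrt(sum((r - mean_res) ** 2 for r in residuals) / (n - 1))
    return [i for i, r in enumerate(residuals) if abs((r - mean_res) / sd) > z_cutoff]


# Hypothetical values; the sixth observation is deliberately far from its prediction.
observed = [13.0, 21.5, 32.0, 39.5, 51.0, 70.0, 62.5, 73.0]
predicted = [13.0, 21.0, 33.0, 39.0, 51.0, 57.0, 63.0, 73.0]

fit_quality = r_squared(observed, predicted)
outlier_indices = flag_outliers(observed, predicted)
print(fit_quality, outlier_indices)
```

Flagged rows are candidates for inspection, not automatic removal; a residual plot remains the better tool for spotting curvature or changing variance.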

Metadata


title: "Regression analysis"
category: "exploratory statistics"
difficulty: "Intermediate"
tags: [regression, modeling, numeric, EDA]
inputs: [transactions_clean]
outputs: [regression_results]
steps: [step-load, step-reg, step-calculate, step-chart]
author: "Tom Argiro"
last_updated: "2025-10-25"
doc_type: "recipe"