DAZL Documentation | Data Analytics A-to-Z Processing Language

slug: recipe-exploratory-statistics-analysis-of-numeric-variables

Recipe: Analysis of numeric variables

category: exploratory statistics

Problem

You need to understand the distribution and characteristics of numeric variables so that you can:

detect outliers or extreme values
summarize key statistics (mean, median, standard deviation)
identify patterns for further analysis or transformation

Solution

Follow these steps to perform univariate analysis:

load the dataset
apply univariate analysis to the target numeric fields
review summary statistics and visualizations
optionally filter or flag outliers for cleaning

Step Sequence

load step -> univariate step -> filter step (optional)
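
The load step's job can be sketched in plain Python. This is not DAZL syntax; it only illustrates the idea of reading a dataset and pulling out its numeric fields before any statistics are computed. The sample rows, the `SAMPLE_CSV` name, and the `load_numeric_columns` helper are hypothetical, and skipping blank or non-numeric cells is an assumed policy rather than documented DAZL behavior.

```python
import csv
import io

# Hypothetical sample standing in for transactions_clean; the values are invented.
SAMPLE_CSV = """id,amount,quantity,discount
1,19.99,2,0.00
2,5.50,1,0.10
3,,3,0.05
4,120.00,1,n/a
"""

def load_numeric_columns(text, fields):
    """Parse CSV text and return {field: [float, ...]}, skipping blank or non-numeric cells."""
    columns = {name: [] for name in fields}
    for row in csv.DictReader(io.StringIO(text)):
        for name in fields:
            try:
                columns[name].append(float(row[name]))
            except (TypeError, ValueError):
                # Blank or non-numeric cell: leave it out rather than guess a value.
                continue
    return columns

columns = load_numeric_columns(SAMPLE_CSV, ["amount", "quantity", "discount"])
# Report how many usable values each field contributed.
print({name: len(values) for name, values in columns.items()})
```

Counting the usable values per field at this point makes it easier to notice when a summary statistic downstream rests on fewer rows than expected.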

Input Datasets

transactions_clean — cleaned transactional data
Notes: focus on numeric fields such as amount, quantity, and discount

Output Dataset

numeric_summary — table summarizing each numeric variable
Notes: includes count, mean, median, min, max, standard deviation, and optionally flagged outliers
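
As a rough illustration of what a numeric_summary table might contain, the sketch below builds one row per variable using Python's standard `statistics` module. It is not DAZL code; the `summarize` helper, the `std_dev` column name, and the sample values are assumptions made for the example, and the sample standard deviation (n - 1 denominator) is an assumed choice.

```python
import statistics

def summarize(name, values):
    """Return one summary row for a numeric field."""
    return {
        "variable": name,
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        # Sample standard deviation (n - 1 denominator); needs at least two values.
        "std_dev": statistics.stdev(values) if len(values) > 1 else None,
    }

# Invented values for illustration only.
columns = {
    "amount": [10.0, 12.0, 14.0, 16.0, 18.0],
    "quantity": [1, 2, 2, 3, 7],
}

numeric_summary = [summarize(name, values) for name, values in columns.items()]
for row in numeric_summary:
    print(row)
```

In the invented `quantity` data the mean sits above the median, which is the kind of gap that hints at a right-skewed field or an extreme value worth inspecting.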

Step-By-Step Explanation

Step	Purpose	Notes
load step	Load the dataset	Supports local file, database, or API sources
univariate step	Compute summary statistics for numeric variables	Example: calculate the mean, median, and standard deviation of `amount`
filter step	Optionally flag or remove outliers	Example: flag or remove transactions whose amount lies more than 3 standard deviations from the mean
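
The 3-standard-deviation rule from the filter step example can be sketched as follows. This is plain Python, not DAZL; the `flag_outliers` helper and the sample amounts are hypothetical, and the sketch assumes the sample standard deviation and a strict "greater than" comparison.

```python
import statistics

def flag_outliers(values, threshold=3.0):
    """Return booleans: True where a value lies more than `threshold` sample SDs from the mean."""
    if len(values) < 2:
        return [False] * len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        # All values identical: nothing can be an outlier.
        return [False] * len(values)
    return [abs(v - mean) / sd > threshold for v in values]

# Invented data: twenty ordinary amounts and one extreme one.
amounts = [10.0] * 20 + [1000.0]
flags = flag_outliers(amounts)

# Flag first, then decide: keep both views so extreme values can be inspected.
flagged = [v for v, is_outlier in zip(amounts, flags) if is_outlier]
kept = [v for v, is_outlier in zip(amounts, flags) if not is_outlier]
print(flagged, len(kept))
```

One caveat with this rule: the extreme value itself inflates the mean and the standard deviation. With the sample standard deviation, no value can sit more than (n - 1) / sqrt(n) deviations from the mean, so a 3-SD threshold cannot flag anything in a dataset of 10 or fewer rows.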

Variations & Extensions

Apply the chart step to visualize distributions (histogram, boxplot)
Combine with the calculate step to create normalized or transformed fields
Use the compare step to compare numeric distributions across different datasets or periods
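
Two of these extensions can be illustrated in plain Python: a normalized field (here a z-score, one common choice) and the bin counts behind a histogram. Neither function is DAZL syntax; `zscores`, `histogram_counts`, the equal-width binning, and the sample values are assumptions made for the example.

```python
import statistics

def zscores(values):
    """Normalize values to zero mean and unit sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        raise ValueError("cannot normalize a constant field")
    return [(v - mean) / sd for v in values]

def histogram_counts(values, bins=5):
    """Count values in `bins` equal-width intervals spanning min..max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot bin a constant field")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value would index one past the end; clamp it into the last bin.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts

# Invented values for illustration only.
amounts = [0.0, 1.0, 3.0, 3.0, 5.0, 8.0, 10.0]
normalized = zscores(amounts)
counts = histogram_counts(amounts, bins=5)

# A crude text histogram: one '#' per value in each bin.
for i, count in enumerate(counts):
    print(f"bin {i}: {'#' * count}")
```

Running the same two functions on each dataset or period gives comparable bin counts and normalized values, which is one simple way to approach the comparison described above.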

Concepts Demonstrated

Univariate statistical analysis
Outlier detection
Data summarization for numeric fields
Sequencing statistical steps

Related Recipes

Frequency analysis of categorical data
Correlation analysis between numeric variables

Notes & Best Practices

Always inspect extreme values before filtering or transforming
Document transformations applied for reproducibility
Visualize distributions (histogram, boxplot) to reveal skew or multiple peaks that summary statistics alone can hide