DAZL Documentation | Data Analytics A-to-Z Processing Language


Sort Step

Purpose

Reorders dataset records by one or more columns, each in ascending or descending order. The sort criteria can be written as a single column name, a list of column names, or a map of column names to directions.

When to Use

  • Create rank-ordered views of data
  • Prepare data for reporting or visualization where order matters
  • Group similar values together
  • Find highest/lowest values in specific columns
  • Implement custom sorting logic for dashboards or exports

How It Works

  1. Normalizes various sorting specifications into a standardized format
  2. Creates a custom comparison function based on sort specifications
  3. Applies PHP's usort() to reorder the entire dataset
  4. Handles special cases like null values, mixed data types, and case-insensitivity
  5. Tracks sorting metadata in extras

Parameters

Required

  • by - Specifies sorting criteria using one of the following formats:
    • String: Single column name (e.g., "last_name")
    • Array of strings: Multiple columns with optional descending prefix (e.g., ["last_name", "-age"])
    • Associative array: Column names with explicit directions (e.g., {"last_name": "asc", "age": "desc"})
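The three notations for by can all be reduced to one list of column and direction pairs. The following is a minimal Python sketch of that normalization, not DAZL's actual PHP code; the function name normalize_by and the pair format are assumptions made for illustration.

```python
from typing import Dict, List, Tuple, Union

SortSpec = Union[str, List[str], Dict[str, str]]


def normalize_by(by: SortSpec) -> List[Tuple[str, str]]:
    """Turn any of the three `by` notations into [(column, "asc" | "desc"), ...]."""
    if isinstance(by, str):
        by = [by]  # a single column name behaves like a one-element list
    if isinstance(by, dict):
        # Associative form: {"last_name": "asc", "age": "desc"}
        return [(column, str(direction).lower()) for column, direction in by.items()]
    normalized = []
    for entry in by:
        if entry.startswith("-"):
            normalized.append((entry[1:], "desc"))  # "-age" means age, descending
        else:
            normalized.append((entry, "asc"))  # ascending is the default
    return normalized


print(normalize_by("last_name"))
print(normalize_by(["last_name", "-age"]))
print(normalize_by({"last_name": "asc", "age": "desc"}))
```

The last two calls produce the same list, which is why the list and associative notations can be used interchangeably.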

Sorting Behavior

  • Null values always sort to the end regardless of direction
  • Numeric comparison is used when both values are numeric
  • Case-insensitive string comparison is used for text values
  • Multi-column sorting evaluates columns in order until a difference is found
  • If by contains no sort columns, data remains in its original order
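The rules above can be sketched as a comparison function. This is an illustrative Python analogue of a usort()-style comparator, not the step's real implementation; the helper names (is_numeric, compare_values, make_comparator) and the treatment of numeric strings as numbers are assumptions.

```python
from functools import cmp_to_key


def is_numeric(value):
    """Assumed numeric check: ints, floats, or strings that parse as a number."""
    if isinstance(value, bool):
        return False
    if isinstance(value, (int, float)):
        return True
    if isinstance(value, str):
        try:
            float(value)
            return True
        except ValueError:
            return False
    return False


def compare_values(a, b, descending):
    # Nulls sort last in both directions, so they return before the direction flip.
    if a is None and b is None:
        return 0
    if a is None:
        return 1
    if b is None:
        return -1
    if is_numeric(a) and is_numeric(b):
        left, right = float(a), float(b)
    else:
        # Case-insensitive text comparison
        left, right = str(a).casefold(), str(b).casefold()
    result = (left > right) - (left < right)
    return -result if descending else result


def make_comparator(spec):
    """spec is a list of (column, direction) pairs, e.g. [("age", "desc")]."""
    def compare(row_a, row_b):
        for column, direction in spec:
            # dict.get returns None for a missing column, so it is treated as null.
            result = compare_values(row_a.get(column), row_b.get(column), direction == "desc")
            if result != 0:
                return result  # first column that differs decides the order
        return 0
    return compare


rows = [
    {"name": "bob", "age": 30},
    {"name": "Alice", "age": None},
    {"name": "alice", "age": "41"},
]
spec = [("name", "asc"), ("age", "desc")]
ordered = sorted(rows, key=cmp_to_key(make_comparator(spec)))
print([(r["name"], r["age"]) for r in ordered])
# [('alice', '41'), ('Alice', None), ('bob', 30)]
```

Here "Alice" and "alice" tie on name because the comparison ignores case, so the age column breaks the tie, and the null age lands after "41" even though age is sorted descending.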

Input Requirements

  • Any dataset with columns referenced in sort specifications
  • Column names should exist in the dataset; a missing column does not raise an error but is treated as null for every record, so it has no effect on the order

Sort Direction Notations

Three ways to specify descending order:

  1. Prefix a column name with - inside a list: ["-age", "last_name"]
  2. Give explicit directions in an associative array: {"age": "desc", "last_name": "asc"}
  3. Prefix the column name with - when sorting by a single column: {"by": "-age"}

Ascending is the default when direction is not specified.

Output

Data

  • Reordered version of the input dataset

Extras

  • sort_applied - Timestamp when sorting was performed
  • sort_columns - List of columns used for sorting
  • records_sorted - Number of records that were sorted
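As a rough picture of the extras listed above, the sketch below builds a comparable structure in Python. The function name build_sort_extras, the ISO 8601 timestamp format, and the choice to list column names without directions are all assumptions; the step's actual values may be shaped differently.

```python
from datetime import datetime, timezone


def build_sort_extras(spec, rows):
    """spec is a list of (column, direction) pairs; rows is the sorted dataset."""
    return {
        # Timestamp format is an assumption (UTC, ISO 8601)
        "sort_applied": datetime.now(timezone.utc).isoformat(),
        "sort_columns": [column for column, _direction in spec],
        "records_sorted": len(rows),
    }


extras = build_sort_extras([("department", "asc"), ("salary", "desc")], [{}, {}, {}])
print(extras["sort_columns"], extras["records_sorted"])
```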

Example Usage

# Simple single-column sort
sort:
  by: "last_name"

# Multi-column sort with mixed directions
sort:
  by:
    - "department"
    - "-salary"
    - "hire_date"

# Explicit direction specification
sort:
  by:
    revenue: "desc"
    customer_name: "asc"

Example Output

Input Data

id  last_name  first_name  department   salary
1   Smith      John        Sales        75000
2   Jones      Sarah       Marketing    82000
3   Davis      Michael     Sales        65000
4   Brown      Jessica     Engineering  92000
5   Wilson     Robert      Marketing    null

Sorted Output (Using by: ["department", "-salary", "last_name"])

id  last_name  first_name  department   salary
4   Brown      Jessica     Engineering  92000
2   Jones      Sarah       Marketing    82000
5   Wilson     Robert      Marketing    null
1   Smith      John        Sales        75000
3   Davis      Michael     Sales        65000
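The sorted output above can be checked independently of DAZL. The sketch below reproduces it in Python using a different technique from a single comparison function: one stable sort pass per column, applied from the least significant column to the most significant. The helper sort_pass is hypothetical, and it assumes each column holds values of a single type.

```python
def sort_pass(rows, column, descending=False):
    """One stable pass over a single column, keeping nulls at the end."""
    present = [r for r in rows if r.get(column) is not None]
    missing = [r for r in rows if r.get(column) is None]

    def key(row):
        value = row[column]
        return value.casefold() if isinstance(value, str) else value

    # sorted() is stable, including with reverse=True, so earlier passes survive ties.
    return sorted(present, key=key, reverse=descending) + missing


rows = [
    {"id": 1, "last_name": "Smith", "first_name": "John", "department": "Sales", "salary": 75000},
    {"id": 2, "last_name": "Jones", "first_name": "Sarah", "department": "Marketing", "salary": 82000},
    {"id": 3, "last_name": "Davis", "first_name": "Michael", "department": "Sales", "salary": 65000},
    {"id": 4, "last_name": "Brown", "first_name": "Jessica", "department": "Engineering", "salary": 92000},
    {"id": 5, "last_name": "Wilson", "first_name": "Robert", "department": "Marketing", "salary": None},
]

# Equivalent of by: ["department", "-salary", "last_name"], applied last column first
result = sort_pass(rows, "last_name")
result = sort_pass(result, "salary", descending=True)
result = sort_pass(result, "department")
print([r["id"] for r in result])  # [4, 2, 5, 1, 3]
```

Within Marketing, Wilson's null salary falls after Jones even though salary is sorted descending, which matches the rule that nulls always sort to the end.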

Related Documentation

  • filter step - Filter records before sorting