DAZL Documentation | Data Analytics A-to-Z Processing Language


Contents

calculate

data management

slug: step-calculate

Calculate Step

Purpose

Creates new calculated fields or transforms existing ones using either direct assignments or conditional logic. Safely evaluates expressions to derive new data points from existing values.

When to Use

  • Create derived metrics from raw data
  • Standardize or transform field values
  • Implement business rules as calculated fields
  • Add flags or indicators based on conditions
  • Simplify complex calculations with reusable expressions
  • Perform data type conversions or formatting

How It Works

  1. Processes each record in the dataset sequentially
  2. Applies two types of calculations:
    • Assign: Direct field assignments using expressions or literals
    • Case: Conditional logic with when/then rules (similar to SQL CASE statements)
  3. Evaluates expressions in a sandboxed environment with access to record fields
  4. Automatically adds new fields to the PDV (Physical Data View) metadata
  5. Tracks metrics about the calculation process in the extras

Parameters

Required

At least one of:

  • assign (object) - Map of field names to expressions/values
  • case (object) - Map of field names to conditional rules

Optional

  • strict_mode (boolean) - When true, throws exceptions for failed expressions. Default: false

Security Features

  • Blacklist of dangerous PHP functions to prevent code injection
  • Expression validation before evaluation
  • Exception handling for malformed expressions

Input Requirements

  • Any dataset with fields referenced in calculation expressions
  • Field names used in expressions must exist in the dataset

Expression Syntax

Assign Expressions

  • Direct value assignment: "fieldName": 123
  • String literal: "status": "Active"
  • Basic arithmetic: "total": "$price * $quantity"
  • Function calls: "fullName": "trim($firstName . ' ' . $lastName)"

Case Expressions

Structure for each field:

"fieldName": [
  {"when": "condition1", "then": "value1"},
  {"when": "condition2", "then": "value2"},
  {"else": "defaultValue"}
]
  • Conditions are evaluated in order until one matches
  • Optional else clause provides a default value

Output

Data

  • Original dataset with new or modified fields based on calculations

PDV Updates

Adds metadata for new fields including:

  • name - Field name
  • type - Detected data type (N for numeric, C for character)
  • length - Estimated field length
  • format - Default null (can be updated later)
  • label - Same as field name
  • source - Set to 'calculated'

Extras

  • calculate_applied - Timestamp when calculation was performed
  • fields_calculated - Number of fields added to PDV
  • assign_processed - Number of assign operations performed
  • case_processed - Number of case operations performed

Example Usage

calculate:
  assign:
    total: "$price * $quantity"
    discount: "$total * 0.1"
    netTotal: "$total - $discount"
    timestamp: "date('Y-m-d H:i:s')"

  case:
    priceTier:
      - {when: "$price < 10", then: "Low"}
      - {when: "$price < 50", then: "Medium"}
      - {when: "$price >= 50", then: "Premium"}

    shippingClass:
      - {when: "$weight > 20", then: "Heavy"}
      - {when: "$fragile == true", then: "Careful"}
      - {else: "Standard"}

  strict_mode: true

Example Output

Original Data

id price quantity weight fragile
1 45 2 15 false
2 8.99 3 25 true

Calculated Data

id price quantity weight fragile total discount netTotal timestamp priceTier shippingClass
1 45 2 15 false 90 9 81 2024-10-24 12:30:15 Medium Standard
2 8.99 3 25 true 26.97 2.697 24.273 2024-10-24 12:30:15 Low Careful

Related Documentation

  • [step-filter]] - Apply filters using calculated fields
  • sort step - Sort data based on calculated values