DAZL Documentation | Data Analytics A-to-Z Processing Language


load

Category: data management

Purpose

Initializes the data pipeline by loading a dataset into memory. This step acts as the entry point for all downstream transformations and analyses in a workflow.

When to Use

  • Start a new workflow with a dataset reference (SQL, API, or virtual source)
  • Pass already loaded data into subsequent steps
  • Standardize input structure for the pipeline

How It Works

  1. The orchestrator (executeStep) resolves the dataset reference and fetches the corresponding data.
  2. The step handler (handle_load) receives the dataset, wraps it into a consistent structure (data, pdv, extras, plus an outputType marker), and returns it to the orchestrator.
  3. The resulting structure becomes the working dataset for subsequent steps.
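
The handler side of this contract is small enough to sketch. The following is an illustrative plain-PHP version, not the actual implementation: only the names handle_load, $params['data'], and the output keys (data, pdv, extras, outputType) come from this page; everything else is an assumption.

// Illustrative sketch only; the real handle_load() may differ.
// By this point the orchestrator has already fetched the data.
function handle_load(array $params): array
{
    $data = $params['data'] ?? [];

    return [
        'data'       => $data,                            // records, passed through untouched
        'pdv'        => $params['pdv'] ?? [],             // column metadata, if provided
        'extras'     => ['record_count' => count($data)], // basic diagnostics
        'outputType' => 'work',                           // in-memory dataset
    ];
}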

Parameters

Required

  • dataset (object) — Defines the data source to load.

    • source (string) — Name or identifier of the dataset (e.g., table name or virtual dataset).
    • type (string) — Type of source (e.g., sql, api, or array).

Optional

  • output (string) — Name of the dataset alias for downstream reference.

Security Features

  • All external access (SQL, API, etc.) is handled by the orchestrator, not directly within the handle_load() function.
  • The handler itself performs no data fetching or evaluation — it only returns normalized structures.
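
Concretely, the boundary can be pictured as follows. Only the names executeStep and handle_load come from this page; resolveDataset() is a hypothetical stand-in for whatever SQL/API resolution the orchestrator actually performs.

// Hypothetical stand-in for the orchestrator's fetch logic (SQL, API, ...).
function resolveDataset(array $dataset): array
{
    // e.g., dispatch on $dataset['type'] ('sql', 'api', 'array', ...)
    return [['priority' => 'High', 'region' => 'East']];
}

// Assumed shape of the orchestrator: external access happens here,
// never inside the handler (see the handle_load sketch above).
function executeStep(array $step): array
{
    $records = resolveDataset($step['dataset']);
    return handle_load(['data' => $records]);
}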

Input Requirements

  • Input may be empty or contain pre-loaded data.
  • The orchestrator provides $params['data'] when applicable.

Output

Data

  • Returns the dataset provided by the orchestrator in $inArray['data'].

PDV

  • Metadata about the dataset’s columns (if available).

Extras

  • Includes basic diagnostics such as record counts.

Output Structure

Key          Description
data         Array of dataset records
pdv          Column metadata (if provided)
extras       Record count and other optional info
outputType   "work" — signals the step produced an in-memory dataset

Example Usage

steps:
  - load:
      dataset:
        source: freqTest
        type: sql
      output: dataToAnalyze

  - freq:
      dataset: dataToAnalyze
      columns: [priority, region]
      output: "Two Column Summary"

Example Output

{
  "data": [
    {"priority": "High", "region": "East"},
    {"priority": "Medium", "region": "West"}
  ],
  "pdv": {},
  "extras": {"record_count": 2},
  "outputType": "work"
}

Related Documentation

  • loadInline step – Load small inline datasets
  • filter step – Filter records after loading
  • calculate step – Add or modify columns after loading
  • sort step – Arrange the dataset
  • keep step – Specify which columns to keep
  • drop step – Specify which columns to remove