DAZL Documentation | Data Analytics A-to-Z Processing Language


catalog

presentation

Purpose

Provides a complete inventory of all active datasets, reports, and workflow logs (waterfalls) in the current interpreter session. Useful for auditing, reporting, and understanding the available data and artifacts in a workflow.

When to Use

  • Explore all datasets and reports currently loaded in the session
  • Audit workflow outputs and track the state of datasets
  • Generate summary tables for reporting or monitoring
  • Identify datasets with associated metadata or extras

How It Works

  1. Queries the interpreter for all objects in the workflow:

    • Datasets from the working memory (work)
    • Reports from HTML outputs (html)
    • Waterfalls from logged workflow steps (waterfalls)
  2. For each object type, builds a summary record:

    • Dataset: Name, row count, column count, PDV presence, extras count
    • Report: Name, size in KB
    • Waterfall: Name, number of steps
  3. Combines all objects into a single catalog array.
  4. Generates an HTML table of the catalog with manageTable() for easy viewing.
  5. Returns the catalog as both structured data (data) and HTML (html).
  6. Provides summary statistics in extras.stats for total counts of datasets, reports, and waterfalls.
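
The steps above can be sketched in a few lines of Python. This is only an illustration of the logic, not DAZL's actual implementation: the function name build_catalog, the dataset layout (a dict with data, pdv, and extras keys), the size calculation from UTF-8 byte length, and the plain two-column table used in place of manageTable() are all assumptions made for this example.

```python
from html import escape


def build_catalog(work, reports, waterfalls):
    """Illustrative stand-in for the catalog step; not DAZL's actual code."""
    catalog = []

    # Datasets: assumed layout {"data": [row dicts], "pdv": ..., "extras": {...}}
    for name, dataset in work.items():
        rows = dataset.get("data") or []
        extras = dataset.get("extras") or {}
        catalog.append({
            "name": name,
            "type": "dataset",
            "rows": len(rows),
            "columns": len(rows[0]) if rows else 0,  # counted from the first row
            "has_pdv": bool(dataset.get("pdv")),
            "has_extras": bool(extras),
            "extras_items": len(extras),
        })

    # Reports: assumed to be HTML strings; size is the UTF-8 byte length in KB
    for name, markup in reports.items():
        size_kb = round(len(markup.encode("utf-8")) / 1024, 2)
        catalog.append({"name": name, "type": "report", "size_kb": size_kb})

    # Waterfalls: assumed to be lists of logged steps
    for name, steps in waterfalls.items():
        catalog.append({"name": name, "type": "waterfall", "steps": len(steps)})

    # Plain two-column table standing in for the manageTable() output
    body = "".join(
        f"<tr><td>{escape(item['name'])}</td><td>{item['type']}</td></tr>"
        for item in catalog
    )
    table = f"<table><tr><th>name</th><th>type</th></tr>{body}</table>"

    stats = {
        "total_datasets": len(work),
        "total_reports": len(reports),
        "total_waterfalls": len(waterfalls),
    }
    return {"data": catalog, "html": table, "extras": {"stats": stats}}


# Hypothetical session contents used only to exercise the sketch
sales_rows = [
    {"id": 1, "name": "Alice", "sales": 200},
    {"id": 2, "name": "Bob", "sales": 350},
]
result = build_catalog(
    work={"salesData": {"data": sales_rows}},
    reports={"summary": "<p>ok</p>"},
    waterfalls={"main": ["loadInline", "catalog"]},
)
print(result["extras"]["stats"])
```

Keeping the three object types in separate loops mirrors the description: each type contributes its own record shape, and the totals in stats come straight from the sizes of the three sources.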

Parameters

Required

  • _interpreter (object) – The workflow interpreter instance that stores datasets, reports, and waterfalls.

Optional

  • None. This step uses all available objects in the interpreter session.

Input Requirements

  • A running interpreter session with datasets (work), HTML reports (html), or waterfall logs (waterfalls).

Output

Data (data)

An array of catalog items, each with fields depending on type:

  • Dataset

    • name – Dataset name
    • type – 'dataset'
    • rows – Number of rows
    • columns – Number of columns
    • has_pdv – Boolean, indicates if PDV metadata exists
    • has_extras – Boolean, indicates if extras exist
    • extras_items – Number of items in extras
  • Report

    • name – Report name
    • type – 'report'
    • size_kb – Approximate size in KB
  • Waterfall

    • name – Waterfall log name
    • type – 'waterfall'
    • steps – Number of steps logged

HTML (html)

  • HTML table visualizing the catalog with rows for datasets, reports, and waterfalls

Extras (extras)

  • stats:

    • total_datasets – Count of datasets
    • total_reports – Count of reports
    • total_waterfalls – Count of waterfalls
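
Because every record carries a type field, the catalog data is easy to process downstream. The Python sketch below assumes the catalog has already been exported as a list of dicts with the fields listed above; the sample records and the helper names stats_from_catalog and datasets_with_extras are hypothetical and exist only for this illustration.

```python
from collections import Counter

# Hypothetical catalog contents, shaped like the records described above
catalog = [
    {"name": "salesData", "type": "dataset", "rows": 2, "columns": 3,
     "has_pdv": False, "has_extras": False, "extras_items": 0},
    {"name": "summary", "type": "report", "size_kb": 1.5},
    {"name": "main", "type": "waterfall", "steps": 2},
]


def stats_from_catalog(items):
    """Recompute the extras.stats totals by counting records per type."""
    counts = Counter(item["type"] for item in items)
    return {
        "total_datasets": counts["dataset"],
        "total_reports": counts["report"],
        "total_waterfalls": counts["waterfall"],
    }


def datasets_with_extras(items):
    """Names of dataset records whose has_extras flag is set."""
    return [
        item["name"]
        for item in items
        if item["type"] == "dataset" and item["has_extras"]
    ]


stats = stats_from_catalog(catalog)
flagged = datasets_with_extras(catalog)
```

Recomputing the totals this way gives a simple consistency check against the values reported in extras.stats.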

Output Type

  • outputType: 'html'

Example Usage

steps:
  - loadInline:
      data:
        - {id: 1, name: Alice, sales: 200}
        - {id: 2, name: Bob,   sales: 350}
      output: salesData

  - catalog:

Explanation:

  • The catalog step produces an overview of all datasets, HTML outputs, and workflow logs.
  • Useful for tracking session contents, auditing outputs, or feeding summary dashboards.
  • The HTML table can be displayed directly in a report or UI.

Notes & Best Practices

  • Run this step after key workflow steps to capture the current state of datasets and reports.
  • Useful for debugging or validating that datasets and reports were generated as expected.
  • extras.stats provides quick high-level metrics for workflow monitoring.
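
As one way to apply the validation advice above, a script that receives the catalog data could compare it against the objects the workflow is expected to produce. This is a minimal Python sketch under the assumption that the catalog is available as a list of dicts; the helper names missing_from_catalog and empty_datasets and the sample records are hypothetical, not part of DAZL.

```python
def missing_from_catalog(catalog, expected):
    """Return the (name, type) pairs in `expected` that the catalog lacks."""
    present = {(item["name"], item["type"]) for item in catalog}
    return [pair for pair in expected if pair not in present]


def empty_datasets(catalog):
    """Return the names of dataset records that report zero rows."""
    return [
        item["name"]
        for item in catalog
        if item["type"] == "dataset" and item["rows"] == 0
    ]


# Hypothetical catalog records and expectations for illustration
catalog = [
    {"name": "salesData", "type": "dataset", "rows": 2, "columns": 3,
     "has_pdv": False, "has_extras": False, "extras_items": 0},
    {"name": "staging", "type": "dataset", "rows": 0, "columns": 0,
     "has_pdv": False, "has_extras": False, "extras_items": 0},
]
expected = [("salesData", "dataset"), ("salesReport", "report")]

missing = missing_from_catalog(catalog, expected)
empty = empty_datasets(catalog)
```

Here missing lists the report that was expected but never generated, and empty flags the dataset that loaded without rows.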