data management
slug: step-compare

Compares two or more datasets to identify differences, similarities, changes, or patterns across them. Provides multiple comparison modes for different analytical needs.

Parameters:

- `datasets` (array) - Names of datasets to compare
- `on` (string) - Key field used to identify and match records across datasets
- `mode` (string) - Comparison operation to perform:
  - `added` - Records in the second dataset but not in the first
  - `removed` - Records in the first dataset but not in the second
  - `intersect` - Records common to all datasets
  - `union` - Records present in any dataset
  - `analyze` - Detailed field-level comparison between datasets
  - `merge` - Combine datasets with field-level resolution
- `compare_fields` (array) - Specific fields to compare (for `analyze` and `merge` modes)

Dataset count requirements:

- For `diff`, `added`, `removed`, and `analyze` modes, exactly 2 datasets are required
- For `union` and `intersect` modes, at least 2 datasets are required

In `added` mode, the step returns records that exist in the second dataset but not in the first.

In `removed` mode, the step returns records that exist in the first dataset but not in the second.

In `intersect` mode, the step returns only records that exist in all of the specified datasets.

In `union` mode, the step returns all records from all datasets, with duplicates removed based on the key field.
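
The four key-based modes described above behave like set operations on the key field. The following is a minimal Python sketch of that idea, not the step's actual implementation: the function names are hypothetical, and two choices are assumptions that the description does not specify (which copy of a record `intersect` returns, and which duplicate `union` keeps).

```python
from typing import Any

Record = dict[str, Any]


def added(first: list[Record], second: list[Record], key: str) -> list[Record]:
    """Records whose key appears in `second` but not in `first`."""
    first_keys = {record[key] for record in first}
    return [record for record in second if record[key] not in first_keys]


def removed(first: list[Record], second: list[Record], key: str) -> list[Record]:
    """Records whose key appears in `first` but not in `second`."""
    return added(second, first, key)


def intersect(datasets: list[list[Record]], key: str) -> list[Record]:
    """Records whose key appears in every dataset.

    Assumption: the copy from the first dataset is returned.
    """
    common = set.intersection(*({record[key] for record in ds} for ds in datasets))
    return [record for record in datasets[0] if record[key] in common]


def union(datasets: list[list[Record]], key: str) -> list[Record]:
    """All records from all datasets, de-duplicated on the key field.

    Assumption: the first record seen for a key is the one kept.
    """
    by_key: dict[Any, Record] = {}
    for ds in datasets:
        for record in ds:
            by_key.setdefault(record[key], record)
    return list(by_key.values())


customers_2023 = [
    {"customer_id": 1001, "name": "John Smith", "status": "Active"},
    {"customer_id": 1002, "name": "Mary Jones", "status": "Inactive"},
    {"customer_id": 1003, "name": "David Lee", "status": "Active"},
]
customers_2024 = [
    {"customer_id": 1001, "name": "John Smith", "status": "Active"},
    {"customer_id": 1003, "name": "David Lee", "status": "Inactive"},
    {"customer_id": 1004, "name": "Sarah Wilson", "status": "Active"},
]

for name, result in [
    ("added", added(customers_2023, customers_2024, "customer_id")),
    ("removed", removed(customers_2023, customers_2024, "customer_id")),
    ("intersect", intersect([customers_2023, customers_2024], "customer_id")),
    ("union", union([customers_2023, customers_2024], "customer_id")),
]:
    print(name, [record["customer_id"] for record in result])
```

Matching is done on the key alone, so a record whose other fields changed (such as customer 1003 here) still counts as common to both datasets.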
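
In `analyze` mode, the step performs a detailed field-by-field comparison between two datasets, identifying records that were added, removed, or modified and, for each modified field, its old and new value.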
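
A field-level comparison of this kind can be sketched in Python as follows. This is an illustration under assumptions, not the step's real code: the `analyze` function and its row layout are hypothetical and modelled on the columns of the inventory example in this document (`status`, `field`, `old_value`, `new_value`, `change_type`), and sorting the output by key is a choice made here for readability.

```python
from typing import Any, Optional

Record = dict[str, Any]


def analyze(before: list[Record], after: list[Record], key: str,
            compare_fields: list[str]) -> list[Record]:
    """Return one row per added record, removed record, or changed field."""
    old = {record[key]: record for record in before}
    new = {record[key]: record for record in after}

    def row(k: Any, change_type: str, field: Optional[str] = None,
            old_value: Any = None, new_value: Any = None) -> Record:
        return {key: k, "status": change_type.capitalize(), "field": field,
                "old_value": old_value, "new_value": new_value,
                "change_type": change_type}

    changes = []
    for k in sorted(old.keys() | new.keys()):
        if k not in new:
            changes.append(row(k, "removed"))
        elif k not in old:
            changes.append(row(k, "added"))
        else:
            # Only the requested fields are compared; a field missing from
            # both records compares equal and produces no row.
            for field in compare_fields:
                old_value, new_value = old[k].get(field), new[k].get(field)
                if old_value != new_value:
                    changes.append(row(k, "modified", field, old_value, new_value))
    return changes


inventory_before = [
    {"sku": "ABC123", "product_name": "Widget Pro", "price": 29.99, "stock_level": 150},
    {"sku": "DEF456", "product_name": "Gadget Plus", "price": 49.99, "stock_level": 75},
    {"sku": "GHI789", "product_name": "Tech Tool", "price": 19.99, "stock_level": 200},
]
inventory_after = [
    {"sku": "ABC123", "product_name": "Widget Pro", "price": 34.99, "stock_level": 120},
    {"sku": "DEF456", "product_name": "Gadget Plus", "price": 49.99, "stock_level": 50},
    {"sku": "JKL012", "product_name": "Smart Device", "price": 39.99, "stock_level": 100},
]

changes = analyze(inventory_before, inventory_after, "sku",
                  ["price", "stock_level", "category"])
for change in changes:
    print(change)
```

With the inventory data from this document, the sketch yields five rows: two modified fields for ABC123, one for DEF456, a removed GHI789, and an added JKL012.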

In `merge` mode, the step combines two datasets with conflict resolution, prioritizing the most recent or valid values.
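
The description does not define "most recent" or "valid", so the Python sketch below makes both explicit as assumptions: the second dataset is treated as the more recent one, and a value counts as valid when it is neither `None` nor an empty string. The `merge` and `is_valid` names, and the `category` values in the sample data, are made up for illustration.

```python
from typing import Any

Record = dict[str, Any]


def is_valid(value: Any) -> bool:
    """Assumed validity rule: anything except None or an empty string."""
    return value is not None and value != ""


def merge(older: list[Record], newer: list[Record], key: str) -> list[Record]:
    """Combine two datasets field by field, preferring valid values from `newer`."""
    merged: dict[Any, Record] = {record[key]: dict(record) for record in older}
    for record in newer:
        target = merged.setdefault(record[key], {})
        for field, value in record.items():
            # A newer value replaces the older one only if it is valid;
            # fields the older record lacks are always copied over.
            if is_valid(value) or field not in target:
                target[field] = value
    return list(merged.values())


older = [{"sku": "ABC123", "price": 29.99, "category": "tools"}]
newer = [
    {"sku": "ABC123", "price": 34.99, "category": ""},
    {"sku": "JKL012", "price": 39.99, "category": "devices"},
]

merged = merge(older, newer, "sku")
for record in merged:
    print(record)
```

Here ABC123 takes the newer price but keeps its older category, because the newer category is empty and therefore not valid under the assumed rule.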

The output contains the result records (in `analyze` mode, with detailed change information) together with:

- `comparison` - Metadata about the comparison operation:
  - `mode` - Comparison mode used
  - `datasets` - Names of datasets compared
  - `key_field` - Field used for record matching
- `stats` - Statistics about the comparison results:
  - `added`/`removed`: counts of matching and different records
  - `analyze`: counts of changed fields and types of changes
  - `union`/`intersect`: record counts by source

Example configuration (`added` mode):

    compare:
      datasets:
        - customers_2023
        - customers_2024
      mode: added
      on: customer_id

Example configuration (`analyze` mode with selected fields):

    compare:
      datasets:
        - inventory_before
        - inventory_after
      mode: analyze
      on: sku
      compare_fields:
        - price
        - stock_level
        - category
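
The dataset-count rules stated for each mode can be checked before a configuration is run. The Python sketch below assumes the mapping under `compare:` has already been parsed into a dictionary; the `validate_compare_config` helper is hypothetical and not part of the tool. It enforces only the two rules this document states, so `merge` is left unchecked.

```python
EXACTLY_TWO = {"diff", "added", "removed", "analyze"}
AT_LEAST_TWO = {"union", "intersect"}


def validate_compare_config(config: dict) -> list[str]:
    """Return a list of problems found in a parsed `compare` mapping."""
    errors = []
    datasets = config.get("datasets") or []
    mode = config.get("mode")
    if mode in EXACTLY_TWO and len(datasets) != 2:
        errors.append(f"mode '{mode}' requires exactly 2 datasets, got {len(datasets)}")
    elif mode in AT_LEAST_TWO and len(datasets) < 2:
        errors.append(f"mode '{mode}' requires at least 2 datasets, got {len(datasets)}")
    return errors


ok = validate_compare_config({
    "datasets": ["customers_2023", "customers_2024"],
    "mode": "added",
    "on": "customer_id",
})
too_many = validate_compare_config({
    "datasets": ["inventory_before", "inventory_after", "inventory_archive"],
    "mode": "analyze",
    "on": "sku",
})
print(ok)
print(too_many)
```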

Dataset 1 (`customers_2023`):

| customer_id | name | status |
|---|---|---|
| 1001 | John Smith | Active |
| 1002 | Mary Jones | Inactive |
| 1003 | David Lee | Active |

Dataset 2 (`customers_2024`):

| customer_id | name | status |
|---|---|---|
| 1001 | John Smith | Active |
| 1003 | David Lee | Inactive |
| 1004 | Sarah Wilson | Active |

Result (`added` mode):

| customer_id | name | status |
|---|---|---|
| 1004 | Sarah Wilson | Active |

Dataset 1 (`inventory_before`):

| sku | product_name | price | stock_level |
|---|---|---|---|
| ABC123 | Widget Pro | 29.99 | 150 |
| DEF456 | Gadget Plus | 49.99 | 75 |
| GHI789 | Tech Tool | 19.99 | 200 |

Dataset 2 (`inventory_after`):

| sku | product_name | price | stock_level |
|---|---|---|---|
| ABC123 | Widget Pro | 34.99 | 120 |
| DEF456 | Gadget Plus | 49.99 | 50 |
| JKL012 | Smart Device | 39.99 | 100 |

Result (`analyze` mode):

| sku | status | field | old_value | new_value | change_type |
|---|---|---|---|---|---|
| ABC123 | Modified | price | 29.99 | 34.99 | modified |
| ABC123 | Modified | stock_level | 150 | 120 | modified |
| DEF456 | Modified | stock_level | 75 | 50 | modified |
| GHI789 | Removed | - | - | - | removed |
| JKL012 | Added | - | - | - | added |