machine learning
slug: step-usemodel

Applies a previously trained machine learning model to a dataset to generate predictions. This step lets workflows score new records, append predicted values as a new column, and integrate ML outputs directly into the data pipeline.
How it works:

- Takes `data`, `pdv`, and `extras` as input.
- Looks up the trained model through the `$params.model` reference.
- Uses the `MachineLearning` class to generate predictions for each row in the dataset.
- Stores the predictions in the column named by the `output_column` parameter (default: `prediction`).

Parameters:

- `model` (string): Reference to the trained model within the workflow interpreter.
- `_interpreter`: Internal workflow object used to fetch the trained model from the pipeline.
- `output_column` (string): Name of the column that stores predictions. Default: `"prediction"`.
- `categorical` (string or array): Columns treated as categorical. Default: `'auto'`.
- `missing_values` (string): How to handle missing values. Default: `'error'`.
- `normalize` (boolean): Whether numeric features are normalized. Default: `true`.
- `test_size` (float): Reserved for compatibility. Default: `null`.
- `random_state` (int): Random seed for reproducibility. Default: `42`.
- `k` (int): Number of neighbors for k-NN models. Default: `5`.
- `distance_metric` (string): Metric for distance-based models. Default: `'euclidean'`.

Requirements:

- The input (`data`) must be an array of associative arrays (rows).
- The `model` parameter must reference a previously trained model stored in `extras['ml']`.

Output:

| Key | Description |
|---|---|
| `data` | Original dataset with a new column containing predictions |
| `pdv` | Metadata about dataset columns |
| `extras` | Pipeline metadata and diagnostics |
| `outputType` | `"array"`, indicating structured array output |
steps:
  - loadInline:
      data:
        - {age: 22, income: 38000, spend: 800}
        - {age: 25, income: 45000, spend: 1200}
        - {age: 29, income: 56000, spend: 1800}
      output: scoringData
  - useModel:
      dataset: scoringData
      model: trainedModel
      output_column: predictedOutcome
      _interpreter: __interpreter__
[
{"age":22,"income":38000,"spend":800,"predictedOutcome":1},
{"age":25,"income":45000,"spend":1200,"predictedOutcome":0},
{"age":29,"income":56000,"spend":1800,"predictedOutcome":1}
]
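The behavior shown in the example can be approximated in a few lines. The following is a minimal Python sketch, not the step's actual implementation: `use_model`, the shape of `extras['ml']` (a mapping from model name to a callable), and the toy scoring rule are all assumptions made for illustration. It skips the `_interpreter` lookup entirely, and its predictions are not meant to match the documented output above.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def use_model(
    data: List[Row],
    pdv: Dict[str, Any],
    extras: Dict[str, Any],
    params: Dict[str, Any],
) -> Dict[str, Any]:
    """Hypothetical stand-in for the useModel step: score rows, append a column."""
    # Requirement: the input must be a list of row dictionaries.
    if not isinstance(data, list) or not all(isinstance(r, dict) for r in data):
        raise TypeError("data must be a list of row dictionaries")

    # Requirement: the model must already be stored under extras['ml'].
    # (Keying the store by model name is an assumption of this sketch.)
    model_name = params["model"]
    models = extras.get("ml", {})
    if model_name not in models:
        raise KeyError(f"no trained model named {model_name!r} in extras['ml']")
    predict: Callable[[Row], Any] = models[model_name]

    # Copy each row so the caller's input is not mutated.
    column = params.get("output_column", "prediction")
    scored = [{**row, column: predict(row)} for row in data]

    return {"data": scored, "pdv": pdv, "extras": extras, "outputType": "array"}


# Toy model (illustrative only): flag rows whose spend exceeds 1000.
extras = {"ml": {"trainedModel": lambda row: int(row["spend"] > 1000)}}
rows = [
    {"age": 22, "income": 38000, "spend": 800},
    {"age": 25, "income": 45000, "spend": 1200},
    {"age": 29, "income": 56000, "spend": 1800},
]
result = use_model(
    rows,
    pdv={},
    extras=extras,
    params={"model": "trainedModel", "output_column": "predictedOutcome"},
)
print(result["data"])
```

The sketch returns the same four keys as the output table (`data`, `pdv`, `extras`, `outputType`) and fails early when either requirement is not met, which mirrors the default `missing_values: 'error'` style of strictness without claiming to reproduce it.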
extras['ml']
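For distance-based models, the `k`, `distance_metric`, and `normalize` parameters interact: without normalization, a wide-ranging feature such as income can dominate the Euclidean distance. The sketch below illustrates this with a plain k-NN majority vote. It is an illustration under assumptions, not the `MachineLearning` class: `knn_predict` and its helpers are hypothetical names, min-max scaling is assumed as the normalization method (the source does not say which one is used), and the training rows and labels are made up.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def fit_min_max(rows: List[Vector]) -> Tuple[List[float], List[float]]:
    """Return the per-feature minimum and range learned from the training rows."""
    mins = [min(col) for col in zip(*rows)]
    # Fall back to 1.0 for constant features to avoid division by zero.
    spans = [(max(col) - lo) or 1.0 for col, lo in zip(zip(*rows), mins)]
    return mins, spans


def scale(row: Vector, mins: List[float], spans: List[float]) -> List[float]:
    """Min-max scale one row using statistics from the training set."""
    return [(v - lo) / span for v, lo, span in zip(row, mins, spans)]


def knn_predict(
    train_x: List[Vector],
    train_y: List[int],
    query: Vector,
    k: int = 5,
    normalize: bool = True,
) -> int:
    """Predict a label by majority vote among the k nearest training rows."""
    if normalize:
        mins, spans = fit_min_max(train_x)
        train_x = [scale(r, mins, spans) for r in train_x]
        query = scale(query, mins, spans)
    # Euclidean distance from the query to every training row, nearest first.
    ranked = sorted(
        zip((math.dist(r, query) for r in train_x), train_y),
        key=lambda pair: pair[0],
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Made-up training data: (age, income) with a binary label.
train_x = [(20, 30000), (23, 36000), (40, 90000), (45, 99000), (50, 31000)]
train_y = [0, 0, 1, 1, 1]
query = (48, 34000)

print(knn_predict(train_x, train_y, query, k=3, normalize=True))
print(knn_predict(train_x, train_y, query, k=3, normalize=False))
```

On this toy data the two calls disagree: with normalization the query's closest neighbor is the row with a similar age, while without it the income column decides the ranking. That difference is the practical reason a `normalize` default of `true` is a sensible choice for distance-based models.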