DAZL Documentation | Data Analytics A-to-Z Processing Language



A practical quick-reference table for training ML models, showing which parameters to adjust, for which model types, and what effect they typically have.


trainModel Parameter Tuning Cheat Sheet

| Parameter | Model Type | Purpose / Effect | Guidance / Tuning Tips |
| --- | --- | --- | --- |
| learning_rate | Linear, Logistic | Controls the step size during gradient descent | Lower → more stable but slower; higher → faster convergence but may overshoot |
| max_iterations | Linear, Logistic | Maximum training iterations for gradient descent | Increase if the model hasn't converged; decrease if convergence is fast |
| normalize | Linear, Logistic, k-NN | Scales numeric features | Keep true for k-NN and gradient-based models; ensures balanced feature contributions |
| k | k-NN | Number of neighbors | Smaller → sensitive to noise (overfitting); larger → smoother predictions (underfitting) |
| distance_metric | k-NN | How distances are calculated | euclidean (default) or manhattan; affects which neighbors are selected |
| categorical | All | Columns treated as categorical | 'auto' usually works; specify columns manually if automatic detection fails |
| missing_values | All | Handling of missing data | 'error' → fail on missing values; 'ignore' → skip affected rows; 'impute' → fill missing values |
| test_size | All | Fraction of data reserved for testing | Use 0.1–0.3; smaller datasets may require less, larger datasets more |
| random_state | All | Seed for reproducibility | Pick any integer; keeps train/test splits and model initialization consistent |
| params | Model-specific | Hyperparameters for optimization | e.g., linear regression → learning_rate, max_iterations; tune incrementally for best performance |
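
Since this cheat sheet describes behavior rather than DAZL call syntax, the sketches on this page approximate each idea in plain Python. As a first illustration, the missing_values and categorical options behave roughly like the following pandas operations (the file name data.csv is a hypothetical stand-in for your input):

    import pandas as pd

    df = pd.read_csv("data.csv")  # hypothetical input file

    # missing_values='impute' -> fill numeric gaps (median is one common choice)
    imputed = df.fillna(df.median(numeric_only=True))

    # missing_values='ignore' -> skip rows that contain missing values
    ignored = df.dropna()

    # categorical='auto' analog: rely on inferred dtypes, one-hot encode text columns
    encoded = pd.get_dummies(imputed)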

Suggested Workflow for Tuning

  1. Baseline Run

    • Train with default parameters
    • Evaluate test-set metrics (MAE, MSE, R² for regression; accuracy/F1 for classification); the first sketch after this list shows one way to compute them
  2. Adjust Parameters Incrementally

    • learning_rate and max_iterations for gradient-based models
    • k and distance_metric for k-NN (see the second sketch after this list)
    • Missing-value handling or normalization if performance is poor
  3. Evaluate Effects

    • Compare metrics on the training set vs. the test set (printed side by side in the first sketch below)
    • Detect overfitting (high train score, low test score) or underfitting (low scores on both)
  4. Feature Engineering

    • Add or remove features to improve predictive power
    • Transform skewed distributions if needed (see the third sketch below)
  5. Document Each Run

    • Record parameter settings and resulting metrics (the fourth sketch below shows a lightweight log)
    • This helps identify the combination that generalizes best
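
Sketch 1 (steps 1 and 3): a baseline run followed by a train/test comparison. This is a Python/scikit-learn analogy for what trainModel does internally, not DAZL code; X and y are assumed to be an already-loaded feature matrix and target, and SGDRegressor's eta0/max_iter stand in for learning_rate/max_iterations.

    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler
    from sklearn.linear_model import SGDRegressor
    from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

    # test_size / random_state play the same roles as the parameters above
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)

    # normalize=true analog: scale features so gradient descent is well behaved
    scaler = StandardScaler().fit(X_train)
    X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

    # learning_rate / max_iterations map to eta0 / max_iter in this model
    model = SGDRegressor(learning_rate="constant", eta0=0.01, max_iter=1000,
                         random_state=42).fit(X_train, y_train)

    for split, Xs, ys in (("train", X_train, y_train), ("test", X_test, y_test)):
        pred = model.predict(Xs)
        print(split, "MAE", mean_absolute_error(ys, pred),
              "MSE", mean_squared_error(ys, pred),
              "R2", r2_score(ys, pred))
    # A high train R2 with a much lower test R2 suggests overfitting;
    # low R2 on both splits suggests underfitting.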
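
Sketch 2 (step 2): adjusting one parameter at a time for a k-NN model, reusing the split from Sketch 1; scikit-learn's n_neighbors and metric correspond to k and distance_metric above.

    from sklearn.neighbors import KNeighborsRegressor
    from sklearn.metrics import mean_absolute_error

    # Change one parameter at a time and watch the held-out metric
    for k in (3, 5, 7, 11):
        for metric in ("euclidean", "manhattan"):
            knn = KNeighborsRegressor(n_neighbors=k, metric=metric)
            knn.fit(X_train, y_train)
            mae = mean_absolute_error(y_test, knn.predict(X_test))
            print(f"k={k:2d}  metric={metric:9s}  test MAE={mae:.3f}")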
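
Sketch 3 (step 4): one common transform for a skewed distribution; the income column here is purely hypothetical.

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({"income": [20_000, 35_000, 50_000, 1_200_000]})
    # log1p compresses the long right tail while keeping order intact
    df["income_log"] = np.log1p(df["income"])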
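
Sketch 4 (step 5): a lightweight way to document runs; the parameter settings, metric values, and file name below are placeholders.

    import csv

    runs = []

    def record_run(params, metrics):
        runs.append({**params, **metrics})

    # Call record_run after every training run (placeholder values shown)
    record_run({"learning_rate": 0.01, "max_iterations": 1000},
               {"train_mae": 2.1, "test_mae": 2.4})

    with open("tuning_log.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=runs[0].keys())
        writer.writeheader()
        writer.writerows(runs)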