Model Methods Reference

Core functions for training models, making predictions, and creating machine learning pipelines.

Core ML Functions

ML.FIT()

Trains an estimator or transformer on the provided data.

Syntax:

=ML.FIT(model, X, y)

Parameters:

model (Object, Required): Untrained model or transformer object
X (Object, Required): Training features (DataFrame or array)
y (Object, Optional): Training target (for supervised learning)

Returns: Trained model object

Use Case: Train supervised/unsupervised models, fit transformers

Example:

# Train regression model
Cell A1: =ML.REGRESSION.LINEAR()
Cell B1: =ML.FIT(A1, X_train, y_train)

# Fit transformer (no y needed)
Cell C1: =ML.PREPROCESSING.STANDARD_SCALER()
Cell D1: =ML.FIT(C1, X_train)

# Train clustering (no y needed)
Cell E1: =ML.CLUSTERING.KMEANS(3)
Cell F1: =ML.FIT(E1, X_data)
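
To make the fit step concrete, here is a minimal Python sketch of what fitting could look like for a one-feature linear regression. It is not the add-in's implementation: the names LinearModel and fit are hypothetical, and returning a new object rather than changing the input is an assumption suggested by the examples above, where the untrained and trained objects sit in separate cells.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class LinearModel:
    """Hypothetical model object; None parameters mean 'not yet fitted'."""
    slope: Optional[float] = None
    intercept: Optional[float] = None


def fit(model: LinearModel, X: Sequence[float], y: Sequence[float]) -> LinearModel:
    """Return a new fitted model using ordinary least squares on one feature."""
    n = len(X)
    mean_x = sum(X) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in X)
    sxy = sum((x - mean_x) * (t - mean_y) for x, t in zip(X, y))
    slope = sxy / sxx
    return LinearModel(slope=slope, intercept=mean_y - slope * mean_x)


untrained = LinearModel()                     # plays the role of cell A1
trained = fit(untrained, [1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])  # cell B1
print(trained)                                # data follows y = 2x + 1
```

The untrained object is left unchanged, so it can be fitted again on different data without affecting the first result.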

ML.PREDICT()

Makes predictions using a trained model.

Syntax:

=ML.PREDICT(model, X)

Parameters:

model (Object, Required): Trained model object
X (Object, Required): Features for prediction

Returns: Predictions (array or DataFrame)

Use Case: Generate predictions for regression, classification, or clustering

Example:

# Predict with regression model
Cell A1: =ML.PREDICT(trained_regression, X_test)

# Predict with classifier
Cell B1: =ML.PREDICT(trained_classifier, X_test)

# Get cluster labels
Cell C1: =ML.PREDICT(trained_kmeans, X_data)
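
For a clustering model, prediction usually means assigning each row to its nearest learned cluster center. The sketch below illustrates that idea in plain Python. The function name predict_clusters and the centroid values are hypothetical, and the add-in may compute labels differently.

```python
def predict_clusters(centroids, points):
    """Label each point with the index of its nearest centroid."""
    labels = []
    for point in points:
        # Squared Euclidean distance to every centroid
        distances = [
            sum((a - b) ** 2 for a, b in zip(point, centroid))
            for centroid in centroids
        ]
        labels.append(distances.index(min(distances)))
    return labels


# Centroids as a fitted 3-cluster model might hold them (made-up values)
centroids = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
points = [(1.0, 0.5), (4.0, 6.0), (9.0, 1.0)]
labels = predict_clusters(centroids, points)
print(labels)  # [0, 1, 2]
```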

ML.TRANSFORM()

Transforms data using a fitted transformer.

Syntax:

=ML.TRANSFORM(transformer, X, y)

Parameters:

transformer (Object, Required): Fitted transformer object
X (Object, Required): Data to transform
y (Object, Optional): Target variable (rarely used)

Returns: Transformed data

Use Case: Apply preprocessing transformations, dimensionality reduction

Example:

# Scale test data using fitted scaler
Cell A1: =ML.TRANSFORM(fitted_scaler, X_test)

# Apply PCA transformation
Cell B1: =ML.TRANSFORM(fitted_pca, X_test)

# Encode categorical features
Cell C1: =ML.TRANSFORM(fitted_encoder, categories_test)

ML.FIT_TRANSFORM()

Fits a transformer and transforms the data in one step.

Syntax:

=ML.FIT_TRANSFORM(transformer, X, y)

Parameters:

transformer (Object, Required): Unfitted transformer object
X (Object, Required): Data to fit and transform
y (Object, Optional): Target variable (rarely used)

Returns: Transformed data

Use Case: Quick fit and transform (use only on training data)

Example:

# Fit and transform training data
Cell A1: =ML.PREPROCESSING.STANDARD_SCALER()
Cell B1: =ML.FIT_TRANSFORM(A1, X_train)

# Fit PCA and project training data onto 2 components
Cell C1: =ML.DIM_REDUCTION.PCA(2)
Cell D1: =ML.FIT_TRANSFORM(C1, X_train)

# IMPORTANT: For test data, use fitted transformer
Cell E1: =ML.TRANSFORM(C1, X_test)  # Don't fit_transform test!
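
The relationship between the fit, transform, and fit-transform steps can be sketched in plain Python. The class StandardScaler1D is a hypothetical, single-column stand-in for a standard scaler, not the add-in's code. It shows that fit-then-transform and fit_transform give the same result on training data, and that test data is scaled with the training statistics.

```python
from statistics import fmean, pstdev


class StandardScaler1D:
    """Hypothetical one-column scaler: learns mean and std, then rescales."""

    def fit(self, values):
        self.mean_ = fmean(values)
        self.scale_ = pstdev(values) or 1.0  # avoid dividing by zero
        return self

    def transform(self, values):
        return [(v - self.mean_) / self.scale_ for v in values]

    def fit_transform(self, values):
        # Same as calling fit() and then transform() on the same data
        return self.fit(values).transform(values)


x_train = [2.0, 4.0, 6.0, 8.0]
x_test = [10.0]

scaler = StandardScaler1D()
train_scaled = scaler.fit_transform(x_train)   # learns mean and std from train
test_scaled = scaler.transform(x_test)         # reuses them, learns nothing new

two_step = StandardScaler1D().fit(x_train).transform(x_train)
print(train_scaled == two_step)                # True
```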

ML.PIPELINE()

Creates a pipeline of transformers and estimators.

Syntax:

=ML.PIPELINE(*args)

Parameters:

*args (Objects, Required): Sequence of transformers and final estimator
- All but last must be transformers (have fit/transform)
- Last can be transformer or estimator

Returns: Pipeline object

Use Case: Chain preprocessing steps with model, prevent data leakage

Example:

# Simple pipeline: scaler + model
Cell A1: =ML.PREPROCESSING.STANDARD_SCALER()
Cell A2: =ML.REGRESSION.LINEAR()
Cell B1: =ML.PIPELINE(A1, A2)

# Complex pipeline: multiple transformers + model
Cell C1: =ML.IMPUTE.SIMPLE_IMPUTER("mean")
Cell C2: =ML.PREPROCESSING.STANDARD_SCALER()
Cell C3: =ML.DIM_REDUCTION.PCA(10)
Cell C4: =ML.CLASSIFICATION.SVM()
Cell D1: =ML.PIPELINE(C1, C2, C3, C4)

# Fit and use pipeline
Cell E1: =ML.FIT(D1, X_train, y_train)
Cell F1: =ML.PREDICT(E1, X_test)
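
The chaining rule above (every step but the last is a transformer, the last may be an estimator) can be sketched as a small Python class. Everything here is illustrative: Center, ThroughOriginRegressor, and this Pipeline class are hypothetical toy objects, not the add-in's internals.

```python
from statistics import fmean


class Center:
    """Toy transformer: subtracts the mean learned during fit."""

    def fit(self, X, y=None):
        self.mean_ = fmean(X)
        return self

    def transform(self, X):
        return [x - self.mean_ for x in X]


class ThroughOriginRegressor:
    """Toy estimator: least-squares line through the origin."""

    def fit(self, X, y):
        self.slope_ = sum(x * t for x, t in zip(X, y)) / sum(x * x for x in X)
        return self

    def predict(self, X):
        return [self.slope_ * x for x in X]


class Pipeline:
    def __init__(self, *steps):
        if not steps:
            raise ValueError("a pipeline needs at least one step")
        self.steps = steps

    def fit(self, X, y=None):
        data = X
        for step in self.steps[:-1]:       # fit each transformer, pass its output on
            data = step.fit(data, y).transform(data)
        self.steps[-1].fit(data, y)        # final step sees fully transformed data
        return self

    def predict(self, X):
        data = X
        for step in self.steps[:-1]:       # transform only, using fitted parameters
            data = step.transform(data)
        return self.steps[-1].predict(data)


pipeline = Pipeline(Center(), ThroughOriginRegressor())
pipeline.fit([1.0, 2.0, 3.0], [-2.0, 0.0, 2.0])
predictions = pipeline.predict([4.0])
print(predictions)  # [4.0]
```

Because predict only calls transform on the earlier steps, new data is always processed with the parameters learned during fit.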

ML.OBJECT_INFO()

Returns information about a model or transformer object.

Syntax:

=ML.OBJECT_INFO(obj)

Parameters:

obj (Object, Required): Any ML object

Returns: String with object information

Use Case: Debug, inspect object state

Example:

# Get info about trained model
Cell A1: =ML.OBJECT_INFO(trained_model)

Common Patterns

Complete Training Workflow

# Create model
Cell A1: =ML.CLASSIFICATION.LOGISTIC()

# Fit on training data
Cell B1: =ML.FIT(A1, X_train, y_train)

# Make predictions
Cell C1: =ML.PREDICT(B1, X_test)

# Evaluate
Cell D1: =ML.EVAL.SCORE(B1, X_test, y_test)

Preprocessing + Model Pipeline

# Create components
Cell A1: =ML.PREPROCESSING.STANDARD_SCALER()
Cell A2: =ML.CLASSIFICATION.SVM(1.0, "rbf")

# Create pipeline
Cell B1: =ML.PIPELINE(A1, A2)

# Train pipeline (auto scales then trains)
Cell C1: =ML.FIT(B1, X_train, y_train)

# Predict (auto scales then predicts)
Cell D1: =ML.PREDICT(C1, X_test)

Dimensionality Reduction Workflow

# Create PCA transformer
Cell A1: =ML.DIM_REDUCTION.PCA(2)

# Fit and transform training data
Cell B1: =ML.FIT_TRANSFORM(A1, X_train)

# Transform test data (use same PCA fit)
Cell C1: =ML.TRANSFORM(A1, X_test)

# Sample transformed data
Cell D1: =ML.DATA.SAMPLE(B1, 10)
Cell D2: =ML.DATA.SAMPLE(C1, 10)

Feature Scaling Best Practice

# Create scaler
Cell A1: =ML.PREPROCESSING.STANDARD_SCALER()

# Fit on training data only!
Cell B1: =ML.FIT(A1, X_train)

# Transform both train and test
Cell C1: =ML.TRANSFORM(B1, X_train)  # Training data
Cell C2: =ML.TRANSFORM(B1, X_test)   # Test data

# Or use FIT_TRANSFORM for training
Cell D1: =ML.FIT_TRANSFORM(A1, X_train)  # Equivalent to C1
Cell D2: =ML.TRANSFORM(A1, X_test)       # Must use TRANSFORM for test
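
A small numeric sketch shows why the scaler is fit on training data only. The helpers fit_scaler and transform are hypothetical and the values are made up for illustration. When the test row is included in the fit, the learned mean and spread shift, and the test row looks far less unusual than it really is.

```python
from statistics import fmean, pstdev


def fit_scaler(values):
    """Return the (mean, std) a standard scaler would learn."""
    return fmean(values), pstdev(values)


def transform(params, values):
    mean, scale = params
    return [(v - mean) / scale for v in values]


x_train = [1.0, 2.0, 3.0]
x_test = [10.0]  # an unusually large value relative to the training data

correct = fit_scaler(x_train)           # fit on training data only
leaky = fit_scaler(x_train + x_test)    # WRONG: the test row influences the fit

correct_z = transform(correct, x_test)[0]
leaky_z = transform(leaky, x_test)[0]
print(round(correct_z, 2), round(leaky_z, 2))
```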

Multi-Step Pipeline

# Step 1: Impute missing values
Cell A1: =ML.IMPUTE.SIMPLE_IMPUTER("mean")

# Step 2: Scale features
Cell A2: =ML.PREPROCESSING.STANDARD_SCALER()

# Step 3: Reduce dimensions
Cell A3: =ML.DIM_REDUCTION.PCA(20)

# Step 4: Train classifier
Cell A4: =ML.CLASSIFICATION.RANDOM_FOREST_CLF(100)

# Create pipeline
Cell B1: =ML.PIPELINE(A1, A2, A3, A4)

# Fit entire pipeline
Cell C1: =ML.FIT(B1, X_train, y_train)

# Predict (all steps applied automatically)
Cell D1: =ML.PREDICT(C1, X_test)

Transformer-Only Pipeline

# Create preprocessing-only pipeline
Cell A1: =ML.PREPROCESSING.ROBUST_SCALER()
Cell A2: =ML.DIM_REDUCTION.PCA(10)

# Combine transformers
Cell B1: =ML.PIPELINE(A1, A2)

# Fit and transform
Cell C1: =ML.FIT_TRANSFORM(B1, X_train)
Cell C2: =ML.TRANSFORM(B1, X_test)

# Now use transformed data for modeling
Cell D1: =ML.REGRESSION.RIDGE(1.0)
Cell E1: =ML.FIT(D1, C1, y_train)

Pipeline with Grid Search

# Create pipeline
Cell A1: =ML.PREPROCESSING.STANDARD_SCALER()
Cell A2: =ML.CLASSIFICATION.SVM()
Cell B1: =ML.PIPELINE(A1, A2)

# Parameter grid for pipeline steps
# Model | Parameter | Value1 | Value2 | Value3
# One row per parameter, starting at H1 (columns H to L)
Cell H1: "model" | "C" | 0.1 | 1 | 10
Cell H2: "model" | "kernel" | "linear" | "rbf" |

# Grid search on pipeline
Cell D1: =ML.EVAL.GRID_SEARCH(B1, H1:L2, "accuracy", 5, TRUE)
Cell E1: =ML.FIT(D1, X_train, y_train)

# Get best parameters
Cell F1: =ML.EVAL.BEST_PARAMS(E1)
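
To see how much work a grid like this implies, the sketch below expands the two parameter rows into every combination. It is plain Python and not the add-in's code. Treating the 5 in the call as the number of cross-validation folds is an assumption, since that argument is not documented in this section.

```python
from itertools import product

# The same grid as the two parameter rows above
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}

names = list(param_grid)
candidates = [
    dict(zip(names, values))
    for values in product(*(param_grid[name] for name in names))
]

n_folds = 5  # assumption: the 5 in the grid search call means 5 CV folds
total_fits = len(candidates) * n_folds

for candidate in candidates:
    print(candidate)
print(len(candidates), "candidates,", total_fits, "fits")
```

Each added value multiplies the number of candidates, so large grids on complex pipelines can become slow.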

Reusing Fitted Transformers

# Fit scaler once
Cell A1: =ML.PREPROCESSING.STANDARD_SCALER()
Cell B1: =ML.FIT(A1, X_train)

# Use for multiple purposes
Cell C1: =ML.TRANSFORM(B1, X_train)  # Scaled training
Cell C2: =ML.TRANSFORM(B1, X_test)   # Scaled test
Cell C3: =ML.TRANSFORM(B1, X_new)    # Scaled new data

# All use same scaling parameters

Tips and Best Practices

Fit/Transform Pattern
- FIT on training data only
- TRANSFORM both train and test
- FIT_TRANSFORM = convenience for train only
- Never fit on test data (data leakage!)

Pipeline Benefits
- Prevents preprocessing leakage when the pipeline is fit on training data only
- Ensures correct preprocessing order
- Simplifies code and deployment
- Works seamlessly with grid search

When to Use Each Function
- FIT: Supervised and unsupervised models, transformers
- PREDICT: After fitting estimators
- TRANSFORM: After fitting transformers
- FIT_TRANSFORM: Quick train preprocessing
- PIPELINE: Combine multiple steps

Pipeline Best Practices
- Order: imputation → encoding → scaling → dim reduction → model
- All steps except last must be transformers
- Last step can be estimator or transformer
- Use clear, descriptive step names

Common Workflows

Regression: Scale → Model → Predict
Classification: Encode → Scale → Model → Predict
Clustering: Scale → Cluster → Labels
Dim Reduction: Scale → PCA → Transform

Avoiding Data Leakage
- ✅ Fit the pipeline on training data only so no step sees test data
- ✅ Fit transformers on train only
- ✅ Use same fitted objects for test
- ❌ Never fit_transform on test
- ❌ Never include test in fitting

Memory and Performance
- FIT_TRANSFORM can be more efficient than separate FIT + TRANSFORM calls
- Pipelines cache intermediate results
- Reuse fitted objects when possible
- Consider data size for complex pipelines

Debugging Pipelines
- Test each step individually first
- Use ML.DATA.SAMPLE to inspect outputs
- Check shapes at each step
- Use ML.INSPECT.GET_PARAMS for diagnostics

Related Functions

ML.PREPROCESSING Functions - Data scaling and encoding
ML.DIM_REDUCTION Functions - PCA and more
ML.IMPUTE Functions - Handle missing values
ML.EVAL Functions - Model evaluation
ML.COMPOSE Functions - Advanced pipelines
