Function Reference

Complete reference for all FormulaML functions organized by namespace. Each function includes syntax, parameters, return values, and practical examples.

Function Categories

📊 Data Functions

Functions for loading, exploring, and manipulating data.

  • ML.DATASETS.* - Built-in datasets (Iris, Diabetes, Digits, OpenML)
  • ML.DATA.* - Data manipulation and exploration

📈 Regression Models

Functions for predicting continuous values.

  • ML.REGRESSION.LINEAR - Linear Regression
  • ML.REGRESSION.RIDGE - Ridge Regression (L2 regularization)
  • ML.REGRESSION.LASSO - Lasso Regression (L1 regularization)
  • ML.REGRESSION.ELASTIC_NET - Elastic Net (L1 + L2)
  • ML.REGRESSION.RANDOM_FOREST_REG - Random Forest Regression ⭐

🎯 Classification Models

Functions for categorizing data.

  • ML.CLASSIFICATION.LOGISTIC - Logistic Regression
  • ML.CLASSIFICATION.SVM - Support Vector Machines
  • ML.CLASSIFICATION.RANDOM_FOREST_CLF - Random Forest Classifier ⭐

🔍 Clustering Models

Functions for finding groups in data.

  • ML.CLUSTERING.KMEANS - K-Means clustering with advanced parameters

⚙️ Preprocessing Functions

Functions for preparing data.

  • ML.PREPROCESSING.TRAIN_TEST_SPLIT - Split data into train and test sets
  • ML.PREPROCESSING.STANDARD_SCALER - Standardize features
  • ML.PREPROCESSING.MIN_MAX_SCALER - Scale to range [0,1]
  • ML.PREPROCESSING.ROBUST_SCALER - Scale features in a way that is robust to outliers
  • ML.PREPROCESSING.ONE_HOT_ENCODER - One-hot encode categories
  • ML.PREPROCESSING.ORDINAL_ENCODER - Ordinal encode categories
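
To make the difference between standardizing and scaling to [0,1] concrete, here is a minimal plain-Python sketch of the arithmetic conventionally associated with the two operations. It assumes FormulaML follows the usual definitions, which this page does not confirm; the helper names `standard_scale` and `min_max_scale` are hypothetical and not part of FormulaML.

```python
from statistics import mean, pstdev

def standard_scale(values):
    """Shift a column to zero mean and scale it to unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # would be 0 for a constant column; not handled in this sketch
    return [(v - mu) / sigma for v in values]

def min_max_scale(values):
    """Map a column linearly so its minimum becomes 0 and its maximum becomes 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

column = [2.0, 4.0, 6.0, 8.0]
standardized = standard_scale(column)  # centered on 0
scaled = min_max_scale(column)         # spans exactly [0, 1]
```

Standardizing keeps the relative spacing of values but has no fixed bounds, while min-max scaling guarantees the [0,1] range at the cost of being sensitive to extreme values.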

📏 Evaluation Functions

Functions for assessing model performance.

  • ML.EVAL.SCORE - R² or accuracy score
  • ML.EVAL.CV_SCORE - Cross-validation ⭐
  • ML.EVAL.GRID_SEARCH - Hyperparameter tuning ⭐
  • ML.EVAL.BEST_PARAMS - Extract best parameters ⭐
  • ML.EVAL.BEST_SCORE - Get best CV score ⭐
  • ML.EVAL.SEARCH_RESULTS - Detailed search results ⭐
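
The two metrics named for ML.EVAL.SCORE can be sketched in plain Python as follows. This shows the standard textbook definitions of accuracy and R², on the assumption that FormulaML reports the same quantities; the function names are illustrative only and are not FormulaML functions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Classification: 3 of 4 labels are predicted correctly.
acc = accuracy(["a", "b", "a", "b"], ["a", "b", "b", "b"])

# Regression: only the last prediction is off, by 1.0.
r2 = r2_score([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```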

🔧 Core ML Functions

Essential functions for model training and prediction.

  • ML.FIT - Train models and transformers
  • ML.PREDICT - Make predictions
  • ML.TRANSFORM - Transform data
  • ML.FIT_TRANSFORM - Fit and transform in one step
  • ML.PIPELINE - Create ML workflows
  • ML.OBJECT_INFO - Inspect object details

📉 Dimensionality Reduction

Functions for reducing feature dimensions.

  • ML.DIM_REDUCTION.PCA - Principal Component Analysis
  • ML.DIM_REDUCTION.PCA.RESULTS - PCA detailed results
  • ML.DIM_REDUCTION.KERNEL_PCA - Non-linear PCA ⭐

🔬 Inspection Tools

Functions for model analysis and visualization.

  • ML.INSPECT.GET_PARAMS - Extract model parameters
  • ML.INSPECT.DECISION_BOUNDARY - Visualize decision boundaries

🧩 Compose Functions

Functions for advanced column transformations.

  • ML.COMPOSE.COLUMN_TRANSFORMER - Apply transformer to columns
  • ML.COMPOSE.DATA_TRANSFORMER - Combine transformers
  • ML.COMPOSE.COLUMN_SELECTOR - Select columns by pattern/type
  • ML.COMPOSE.TRANSFORMERS.DROP - Drop columns
  • ML.COMPOSE.TRANSFORMERS.PASSTHROUGH - Pass columns unchanged

🔧 Impute & Feature Selection

Functions for handling missing values and selecting features.

  • ML.IMPUTE.SIMPLE_IMPUTER - Impute missing values
  • ML.FEATURE_SELECTION.SELECT_PERCENTILE - Select top features

Function Naming Convention

All FormulaML functions follow a consistent naming pattern:

ML.[NAMESPACE].[FUNCTION_NAME](parameters)

Examples:

  • ML.DATASETS.IRIS() - Load Iris dataset
  • ML.REGRESSION.LINEAR() - Create linear regression model
  • ML.EVAL.SCORE() - Evaluate model performance

Free vs Premium Functions

✅ Free Functions

Most core functionality is available in the free tier:

  • Basic models (Linear, Logistic, SVM, K-Means)
  • Data handling and exploration
  • Model training and prediction
  • Basic evaluation

⭐ Premium Functions

Advanced capabilities require a premium subscription:

  • Random Forest models
  • Cross-validation (ML.EVAL.CV_SCORE)
  • Grid search (ML.EVAL.GRID_SEARCH) and its result functions (ML.EVAL.BEST_PARAMS, ML.EVAL.BEST_SCORE, ML.EVAL.SEARCH_RESULTS)
  • Kernel PCA (ML.DIM_REDUCTION.KERNEL_PCA)
  • OpenML datasets (ML.DATASETS.OPENML)

Premium functions are marked with a ⭐ icon in the documentation.

Understanding Object Handles

Many functions return or accept “object handles” - references to complex data structures:

Cell A1: =ML.DATASETS.IRIS()           → Returns: <Dataset>
Cell A2: =ML.REGRESSION.LINEAR()       → Returns: <LinearRegression>
Cell A3: =ML.FIT(A2, features, target) → Returns: <LinearRegression> (with 🧠 brain icon)

A handle lets a single cell stand in for a complex ML object, so other formulas can use that object by referencing the cell (as A3 does with A2 above).
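
One way to picture a handle is as a short text key that points to an object stored elsewhere. The sketch below is a conceptual plain-Python model of that idea, not FormulaML's actual implementation. The `HandleRegistry` class, the stand-in `LinearRegression` class, and the `#1` counter suffix are all assumptions made for illustration.

```python
import itertools

class HandleRegistry:
    """Conceptual model: cells hold short text handles, the registry holds the objects."""

    def __init__(self):
        self._objects = {}
        self._counter = itertools.count(1)

    def register(self, obj):
        # The '#n' suffix is a made-up detail that keeps handles unique in this sketch.
        handle = f"<{type(obj).__name__}#{next(self._counter)}>"
        self._objects[handle] = obj
        return handle

    def resolve(self, handle):
        try:
            return self._objects[handle]
        except KeyError:
            raise LookupError("Object handle not found") from None

class LinearRegression:
    """Stand-in for a real model object."""

registry = HandleRegistry()
cell_a2 = registry.register(LinearRegression())  # what the cell would display
model = registry.resolve(cell_a2)                # what a later formula would receive
```

In this model, a cell reference that does not contain a registered handle fails at `resolve`, which is analogous to the "Object handle not found" error described later on this page.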

Common Parameters

Frequently Used Parameters

random_state

  • Type: Integer
  • Purpose: Ensures reproducible results
  • Example: 42 (any integer works)
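
The effect of a fixed seed can be sketched with Python's standard `random` module. This illustrates the general idea of seeded randomness, not FormulaML's internals; `shuffled_rows` is a hypothetical helper.

```python
import random

def shuffled_rows(n_rows, random_state):
    """Return row indices 0..n_rows-1 in a pseudo-random order fixed by the seed."""
    rng = random.Random(random_state)  # private generator seeded with random_state
    indices = list(range(n_rows))
    rng.shuffle(indices)
    return indices

# The same seed yields the same ordering on every call.
first = shuffled_rows(10, random_state=42)
second = shuffled_rows(10, random_state=42)
```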

fit_intercept

  • Type: Boolean (TRUE/FALSE)
  • Purpose: Whether to calculate the intercept
  • Default: TRUE

alpha

  • Type: Float
  • Purpose: Regularization strength
  • Range: > 0 (higher = more regularization)
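
To show why a higher alpha means more regularization, the sketch below uses the simplest L2-regularized case: one feature, no intercept, where the coefficient minimizing Σ(y − w·x)² + alpha·w² has the closed form w = Σxy / (Σx² + alpha). This is a textbook illustration in plain Python under those assumptions, not FormulaML's solver; `ridge_coefficient` is a hypothetical helper.

```python
def ridge_coefficient(x, y, alpha):
    """Closed-form ridge coefficient for a single feature with no intercept."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / (sxx + alpha)

x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]  # exactly y = 2x

# alpha = 0.0 is included only as the unregularized reference point;
# the coefficient shrinks toward zero as alpha grows.
coefficients = {alpha: ridge_coefficient(x, y, alpha) for alpha in (0.0, 1.0, 10.0)}
```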

n_estimators

  • Type: Integer
  • Purpose: Number of trees in the ensemble
  • Default: 100

max_iter

  • Type: Integer
  • Purpose: Maximum iterations
  • Default: Varies by algorithm

Return Value Types

Functions return different types of values:

  1. Object Handles: Complex objects (models, dataframes)
    • Example: <SVC>
  2. Numeric Values: Single numbers
    • Example: 0.95 (accuracy score)
  3. Arrays: Multiple values
    • Example: Cross-validation scores
  4. DataFrames: Tabular data
    • Example: Sample data, parameters

Quick Function Lookup

By Task

Load Data:

  • ML.DATASETS.IRIS() - Classification dataset
  • ML.DATASETS.DIABETES() - Regression dataset
  • ML.DATA.CONVERT_TO_DF() - Excel to DataFrame

Explore Data:

  • ML.DATA.INFO() - Data structure
  • ML.DATA.DESCRIBE() - Statistics
  • ML.DATA.SAMPLE() - View rows

Prepare Data:

  • ML.DATA.SELECT_COLUMNS() - Choose columns
  • ML.PREPROCESSING.TRAIN_TEST_SPLIT() - Split data
  • ML.PREPROCESSING.STANDARD_SCALER() - Scale features

Train Models:

  • ML.FIT() - Train any model
  • ML.PREDICT() - Make predictions
  • ML.TRANSFORM() - Transform data

Evaluate:

  • ML.EVAL.SCORE() - Basic scoring
  • ML.EVAL.CV_SCORE() - Cross-validation ⭐
  • ML.EVAL.GRID_SEARCH() - Hyperparameter tuning ⭐

Error Messages

Common error messages and their meanings:

“Object handle not found”

  • The referenced cell doesn’t contain a valid object
  • Solution: Check that the cell reference is correct

“Invalid parameter value”

  • Parameter is outside acceptable range
  • Solution: Check parameter constraints

“Dimension mismatch”

  • Data shapes don’t match
  • Solution: Ensure X and y have the same number of rows
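
The row-count rule behind this error can be sketched as a simple pre-flight check. This is plain Python for illustration only; `check_same_rows` is a hypothetical helper, and its message wording is not FormulaML's exact error text.

```python
def check_same_rows(X, y):
    """Raise if the feature rows and target values differ in count."""
    if len(X) != len(y):
        raise ValueError(
            f"Dimension mismatch: X has {len(X)} rows, y has {len(y)}"
        )

features = [[5.1, 3.5], [4.9, 3.0], [6.2, 2.9]]  # 3 rows, 2 columns
target = [0, 0, 1]                                # 3 values

check_same_rows(features, target)  # passes silently because 3 == 3
```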

“Premium function”

  • Function requires a premium subscription
  • Solution: Upgrade or use free alternative

Best Practices

  1. Always use consistent data shapes
    • Features (X) and target (y) must have the same number of rows
  2. Set random_state for reproducibility
    • Use the same seed value across related operations
  3. Check data types
    • Ensure numerical data isn’t stored as text
  4. Handle missing values
    • Use ML.IMPUTE.SIMPLE_IMPUTER or clean the data before analysis
  5. Start with simple models
    • Use them as a baseline before trying complex models
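
Practices 3 and 4 can be illustrated with a small plain-Python sketch that converts numbers stored as text and fills blanks with the column mean. Mean imputation is one common strategy, chosen here as an assumption; this page does not state which strategy ML.IMPUTE.SIMPLE_IMPUTER uses by default. `clean_numeric_column` is a hypothetical helper.

```python
from statistics import mean

def clean_numeric_column(cells):
    """Parse text cells to floats and replace blanks or None with the column mean."""
    parsed = []
    for cell in cells:
        if cell is None or str(cell).strip() == "":
            parsed.append(None)  # mark as missing
        else:
            parsed.append(float(str(cell).strip()))
    observed = [v for v in parsed if v is not None]
    fill = mean(observed)  # assumes at least one non-missing value
    return [fill if v is None else v for v in parsed]

raw = ["1.5", 2, "", " 4.0 ", None]  # mixed text, number, blank, padded text, missing
cleaned = clean_numeric_column(raw)
```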
