Evaluation Functions Reference
Functions for evaluating model performance, cross-validation, and hyperparameter optimization.
ML.EVAL Namespace
ML.EVAL.SCORE()
Evaluates model performance on test data.
Syntax:
=ML.EVAL.SCORE(model, X, y)
Parameters:
- model (Object, Required): Trained model object
- X (Object, Required): Test features
- y (Object, Required): True target values
Returns: Float score value
- Regression: R² score (coefficient of determination)
- Classification: Mean accuracy
Use Case: Quick model performance evaluation
Example:
# Evaluate regression model
Cell A1: =ML.EVAL.SCORE(trained_regression, X_test, y_test)
Result: 0.85 # R² score
# Evaluate classifier
Cell B1: =ML.EVAL.SCORE(trained_classifier, X_test, y_test)
Result: 0.92 # Accuracy
ML.EVAL.CV_SCORE() ⭐
Performs cross-validation on a model (Premium feature).
Syntax:
=ML.EVAL.CV_SCORE(model, X, y, cv, scoring)
Parameters:
- model (Object, Required): Unfitted model object
- X (Object, Required): Training features
- y (Object, Required): Training target
- cv (Integer, Required): Number of cross-validation folds
- scoring (String, Required): Scoring metric
  - Regression: "r2", "neg_mean_squared_error", "neg_mean_absolute_error"
  - Classification: "accuracy", "precision", "recall", "f1"
Returns: Array of scores (one per fold)
Use Case: Robust model evaluation, detect overfitting
Example:
# 5-fold cross-validation
Cell A1: =ML.EVAL.CV_SCORE(model, X_train, y_train, 5, "accuracy")
Result: [0.89, 0.91, 0.88, 0.90, 0.92] # 5 scores
# Average CV score
Cell B1: =AVERAGE(A1#)
Result: 0.90
ML.EVAL.GRID_SEARCH() ⭐
Performs exhaustive hyperparameter search (Premium feature).
Syntax:
=ML.EVAL.GRID_SEARCH(model, param_grid, scoring, cv, refit)
Parameters:
- model (Object, Required): Unfitted model or pipeline
- param_grid (DataFrame, Required): Parameter combinations to test
  - Format: Model | Parameter | Value1 | Value2 | …
- scoring (String/Array, Optional): Scoring metric(s)
- cv (Integer, Optional): Cross-validation folds
- refit (Boolean, Optional): Refit best model on full data (default: TRUE)
Returns: GridSearchCV object with best model
Use Case: Find optimal hyperparameters automatically
Example:
# Create parameter grid
# Cell B1:E3
# Model | Parameter | Value1 | Value2 | Value3
# model | C | 0.1 | 1 | 10
# model | kernel | linear | rbf |
Cell A1: =ML.CLASSIFICATION.SVM()
Cell A2: =ML.EVAL.GRID_SEARCH(A1, B1:E3, "accuracy", 5, TRUE)
Cell A3: =ML.FIT(A2, X_train, y_train)
# Now A3 contains best model
ML.EVAL.BEST_PARAMS() ⭐
Extracts best parameters from grid search (Premium feature).
Syntax:
=ML.EVAL.BEST_PARAMS(grid_search_model)
Parameters:
- grid_search_model (Object, Required): Fitted GridSearchCV object
Returns: DataFrame with best parameters
- Columns: Model | Parameter | Value
Use Case: Identify optimal hyperparameters
Example:
# After grid search
Cell A1: =ML.EVAL.BEST_PARAMS(fitted_grid_search)
Result:
# Model | Parameter | Value
# model | C | 10
# model | kernel | rbf
ML.EVAL.BEST_SCORE() ⭐
Gets the best cross-validation score from grid search (Premium feature).
Syntax:
=ML.EVAL.BEST_SCORE(grid_search_model)
Parameters:
- grid_search_model (Object, Required): Fitted GridSearchCV object
Returns: Float - best CV score achieved
Use Case: Compare grid search results
Example:
Cell A1: =ML.EVAL.BEST_SCORE(fitted_grid_search)
Result: 0.9456 # Best cross-validation score
ML.EVAL.SEARCH_RESULTS() ⭐
Returns detailed grid search results (Premium feature).
Syntax:
=ML.EVAL.SEARCH_RESULTS(grid_search_model)
Parameters:
- grid_search_model (Object, Required): Fitted GridSearchCV object
Returns: DataFrame with all parameter combinations and scores
Use Case: Analyze all tested combinations, identify patterns
Example:
Cell A1: =ML.EVAL.SEARCH_RESULTS(fitted_grid_search)
# Returns table with all parameter combos and their scores
Common Patterns
Basic Model Evaluation
# Train model
Cell A1: =ML.REGRESSION.LINEAR()
Cell B1: =ML.FIT(A1, X_train, y_train)
# Evaluate on test set
Cell C1: =ML.EVAL.SCORE(B1, X_test, y_test)
Result: 0.847 # R² score
# Check predictions
Cell D1: =ML.PREDICT(B1, X_test)
Cell E1: =ML.DATA.SAMPLE(D1, 10)
Cross-Validation Workflow
# Create model
Cell A1: =ML.CLASSIFICATION.SVM(1.0, "rbf")
# 10-fold cross-validation
Cell B1: =ML.EVAL.CV_SCORE(A1, X_train, y_train, 10, "accuracy")
# Calculate mean and std
Cell C1: =AVERAGE(B1#) # Mean: 0.913
Cell C2: =STDEV(B1#) # Std: 0.032
# Final training on full dataset
Cell D1: =ML.FIT(A1, X_train, y_train)
Cell E1: =ML.EVAL.SCORE(D1, X_test, y_test)
Complete Grid Search Workflow
# Create base model
Cell A1: =ML.CLASSIFICATION.RANDOM_FOREST_CLF()
# Define parameter grid
# Model | Parameter | V1 | V2 | V3
Cell B1: "model" | "n_estimators" | 50 | 100 | 200
Cell B2: "model" | "max_depth" | 5 | 10 | 20
Cell B3: "model" | "min_samples_split" | 2 | 5 | 10
# Grid search
Cell C1: =ML.EVAL.GRID_SEARCH(A1, B1:E3, "accuracy", 5, TRUE)
Cell D1: =ML.FIT(C1, X_train, y_train)
# Get best parameters
Cell E1: =ML.EVAL.BEST_PARAMS(D1)
Cell E2: =ML.EVAL.BEST_SCORE(D1)
# Evaluate on test set
Cell F1: =ML.EVAL.SCORE(D1, X_test, y_test)
# Detailed results
Cell G1: =ML.EVAL.SEARCH_RESULTS(D1)
Pipeline Grid Search
# Create pipeline
Cell A1: =ML.PREPROCESSING.STANDARD_SCALER()
Cell A2: =ML.CLASSIFICATION.SVM()
Cell B1: =ML.PIPELINE(A1, A2)
# Pipeline parameter grid (use step__param format)
# Model | Parameter | V1 | V2 | V3
Cell C1: "model" | "C" | 0.1 | 1.0 | 10
Cell C2: "model" | "kernel" | "linear" | "rbf" |
# Grid search on pipeline
Cell D1: =ML.EVAL.GRID_SEARCH(B1, C1:E2, "accuracy", 5, TRUE)
Cell E1: =ML.FIT(D1, X_train, y_train)
# Best params and score
Cell F1: =ML.EVAL.BEST_PARAMS(E1)
Cell F2: =ML.EVAL.BEST_SCORE(E1)
Comparing Multiple Models
# Create different models
Cell A1: =ML.CLASSIFICATION.LOGISTIC()
Cell A2: =ML.CLASSIFICATION.SVM()
Cell A3: =ML.CLASSIFICATION.RANDOM_FOREST_CLF()
# Cross-validate each
Cell B1: =ML.EVAL.CV_SCORE(A1, X_train, y_train, 5, "accuracy")
Cell B2: =ML.EVAL.CV_SCORE(A2, X_train, y_train, 5, "accuracy")
Cell B3: =ML.EVAL.CV_SCORE(A3, X_train, y_train, 5, "accuracy")
# Compare mean scores
Cell C1: =AVERAGE(B1#) # Logistic
Cell C2: =AVERAGE(B2#) # SVM
Cell C3: =AVERAGE(B3#) # Random Forest
# Select best and train on full data
Cell D1: =ML.FIT(A3, X_train, y_train) # Assuming RF was best
Cell E1: =ML.EVAL.SCORE(D1, X_test, y_test)
Multi-Metric Grid Search
# Create model
Cell A1: =ML.CLASSIFICATION.LOGISTIC()
# Parameter grid
Cell B1: "model" | "C" | 0.01 | 0.1 | 1.0 | 10
Cell B2: "model" | "penalty" | "l1" | "l2" |
# Grid search with multiple metrics
Cell C1: =ML.EVAL.GRID_SEARCH(A1, B1:E2, {"accuracy","precision","recall"}, 5, TRUE)
Cell D1: =ML.FIT(C1, X_train, y_train)
# Get results for all metrics
Cell E1: =ML.EVAL.SEARCH_RESULTS(D1)
Regression Model Tuning
# Create regression model
Cell A1: =ML.REGRESSION.RANDOM_FOREST_REG()
# Parameter grid
Cell B1: "model" | "n_estimators" | 100 | 200 | 300
Cell B2: "model" | "max_depth" | 10 | 20 | 30
Cell B3: "model" | "min_samples_leaf" | 1 | 2 | 5
# Grid search with R² scoring
Cell C1: =ML.EVAL.GRID_SEARCH(A1, B1:D3, "r2", 5, TRUE)
Cell D1: =ML.FIT(C1, X_train, y_train)
# Best parameters
Cell E1: =ML.EVAL.BEST_PARAMS(D1)
Cell E2: =ML.EVAL.BEST_SCORE(D1)
# Test set performance
Cell F1: =ML.EVAL.SCORE(D1, X_test, y_test)
Tips and Best Practices
Choosing Evaluation Metrics (see the example after this list)
- Regression: R², MSE, MAE
  - R²: Overall fit quality (1.0 is a perfect fit)
  - MSE: Penalizes large errors heavily
  - MAE: More robust to outliers
- Classification: Accuracy, Precision, Recall, F1
  - Accuracy: Best for balanced datasets
  - Precision: Minimize false positives
  - Recall: Minimize false negatives
  - F1: Balances precision and recall
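A rough sketch of how the metric choice plugs into cross-validation; the cell layout and the X_train/y_train names are illustrative placeholders.
# Same classifier scored with two different metrics
Cell A1: =ML.CLASSIFICATION.LOGISTIC()
Cell B1: =ML.EVAL.CV_SCORE(A1, X_train, y_train, 5, "accuracy")
Cell B2: =ML.EVAL.CV_SCORE(A1, X_train, y_train, 5, "f1")
# Compare the averages; a large gap can signal class imbalance
Cell C1: =AVERAGE(B1#)
Cell C2: =AVERAGE(B2#)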
Cross-Validation Strategy (see the sketch after this list)
- 3-fold: Quick testing
- 5-fold: Good default
- 10-fold: More reliable, but slower
- Always use the same cv value when comparing models
- Stratified CV is applied automatically for classification
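A minimal sketch comparing fold counts on the same model; the data names are the placeholders used in earlier examples.
# Same model, different fold counts
Cell A1: =ML.CLASSIFICATION.SVM()
Cell B1: =ML.EVAL.CV_SCORE(A1, X_train, y_train, 3, "accuracy")
Cell B2: =ML.EVAL.CV_SCORE(A1, X_train, y_train, 5, "accuracy")
Cell B3: =ML.EVAL.CV_SCORE(A1, X_train, y_train, 10, "accuracy")
# Means should be similar; more folds mainly tighten the estimate at extra cost
Cell C1: =AVERAGE(B1#)
Cell C2: =AVERAGE(B2#)
Cell C3: =AVERAGE(B3#)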
Grid Search Best Practices (see the sketch after this list)
- Start with a wide range, then narrow around the best values
- Use logarithmic scales for regularization-style parameters (e.g., C: 0.001, 0.01, 0.1, 1, 10)
- Limit grid size: the number of combinations multiplies with every parameter added
- Always use cross-validation (the cv parameter)
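A sketch of the wide-then-narrow approach using the documented Model | Parameter | Values grid layout; the cell addresses and chosen values are illustrative, not prescriptive.
Cell A1: =ML.CLASSIFICATION.SVM()
# Pass 1: wide logarithmic range for C, laid out in B1:G1
Cell B1: "model" | "C" | 0.01 | 0.1 | 1 | 10
Cell A2: =ML.EVAL.GRID_SEARCH(A1, B1:G1, "accuracy", 5, TRUE)
Cell A3: =ML.FIT(A2, X_train, y_train)
Cell A4: =ML.EVAL.BEST_PARAMS(A3) # suppose the best C turns out to be 1
# Pass 2: narrower range around that value, laid out in B2:G2
Cell B2: "model" | "C" | 0.5 | 1 | 2 | 5
Cell A5: =ML.EVAL.GRID_SEARCH(A1, B2:G2, "accuracy", 5, TRUE)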
Parameter Ranges
- SVM C: [0.001, 0.01, 0.1, 1, 10, 100]
- SVM gamma: ['scale', 'auto', 0.001, 0.01, 0.1]
- Random Forest n_estimators: [50, 100, 200, 500]
- Random Forest max_depth: [5, 10, 20, None]
- Regularization alpha: [0.001, 0.01, 0.1, 1, 10]
Interpreting Results (see the sketch after this list)
- High training score, low test score: overfitting
- Low training score, low test score: underfitting
- High CV standard deviation (e.g., > 0.05): model performance is unstable across folds
- Test score well below the CV mean: investigate further (possible leakage or an unrepresentative split)
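A minimal sketch of the train-versus-test comparison; fitted_model and the data names are placeholders.
# Score the same fitted model on training and test data
Cell A1: =ML.EVAL.SCORE(fitted_model, X_train, y_train)
Cell A2: =ML.EVAL.SCORE(fitted_model, X_test, y_test)
# e.g. A1 = 0.99 with A2 = 0.78 suggests overfitting; both low suggests underfitting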
Avoiding Common Mistakes
- ❌ Fitting on the test set (data leakage)
- ❌ Running grid search on the test set
- ❌ Skipping cross-validation
- ❌ Using overly large parameter grids
- ✅ Always keep a separate test set
- ✅ Use CV for model selection
- ✅ Report both CV and test scores
Optimization Workflow (condensed cell-level sketch after this list)
1. Train a baseline model
2. Cross-validate to check stability
3. Grid search for hyperparameters
4. Retrain with the best parameters
5. Final evaluation on the test set
6. Report both CV and test scores
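The same workflow condensed into cells, as a sketch only; the parameter grid is assumed to be laid out in B1:F2, and the data names follow the earlier examples.
# 1. Baseline model
Cell A1: =ML.CLASSIFICATION.LOGISTIC()
Cell A2: =ML.FIT(A1, X_train, y_train)
# 2. Cross-validate to check stability
Cell A3: =ML.EVAL.CV_SCORE(A1, X_train, y_train, 5, "accuracy")
# 3. Grid search over the assumed grid in B1:F2
Cell A4: =ML.EVAL.GRID_SEARCH(A1, B1:F2, "accuracy", 5, TRUE)
# 4. Retrain with the best parameters
Cell A5: =ML.FIT(A4, X_train, y_train)
# 5-6. Final evaluation; report both the CV score and the test score
Cell A6: =ML.EVAL.BEST_SCORE(A5)
Cell A7: =ML.EVAL.SCORE(A5, X_test, y_test)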
Performance Tips
- Reduce cv for faster experimentation
- Limit grid size (or consider a randomized search instead of an exhaustive grid)
- Request only the scoring metrics you actually need
- Cache results when possible
Scoring Metrics Reference
Regression Metrics
- r2: R² score (default)
- neg_mean_squared_error: Negative MSE
- neg_mean_absolute_error: Negative MAE
- neg_root_mean_squared_error: Negative RMSE
Classification Metrics
- accuracy: Accuracy (default)
- precision: Precision
- recall: Recall (Sensitivity)
- f1: F1 Score
- roc_auc: ROC AUC
- f1_weighted: Weighted F1 (multi-class)
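A brief usage sketch passing one of these metric strings to ML.EVAL.CV_SCORE; the data names are placeholders, and the assumption here is that the neg_ metrics are negated so that higher is always better.
# Cross-validate a regression model with negative MSE
Cell A1: =ML.REGRESSION.LINEAR()
Cell B1: =ML.EVAL.CV_SCORE(A1, X_train, y_train, 5, "neg_mean_squared_error")
# Negate the average to recover the usual MSE
Cell C1: =-AVERAGE(B1#)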
Related Functions
- ML.FIT() - Train models
- ML.PREDICT() - Make predictions
- Model Functions - All model types
- ML.INSPECT Functions - Model inspection