Evaluation Metrics Reference

Reference for FormulaML standalone evaluation metric functions for regression, classification, and clustering.

Table of Contents

Evaluation Metrics Reference
  • Regression Metrics (ML.EVAL.REGRESSION)
  • Classification Metrics (ML.EVAL.CLASSIFICATION)
  • Clustering Metrics (ML.EVAL.CLUSTERING)

These metric functions work on arrays of values — typically the spilled output of ML.DATA.SAMPLE on a predictions handle, or a column range from a DataFrame preview. They accept raw cell ranges directly, not object handles. For the model-level score shortcut (a single number from a fitted model), use ML.EVAL.SCORE instead.

Typical metric workflow:

Cell A31: =ML.FIT(A30, B19, B20)           <- Fitted model handle
Cell A32: =ML.PREDICT(A31, B19)            <- Predictions handle
Cell A35: =ML.DATA.SAMPLE(A32, -1)         <- SPILLS y_pred values down column A
Cell C35: =ML.DATA.SAMPLE(B20, -1)         <- SPILLS y_true values down column C
Cell E1:  =ML.EVAL.CLASSIFICATION.ACCURACY(C35:C184, A35:A184)
Cell E2:  =ML.EVAL.CLASSIFICATION.F1(C35:C184, A35:A184, , , "macro")
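
If you only need the model's built-in score (the shortcut mentioned above), a single call on the handles replaces the spill-and-compare steps. A minimal sketch, assuming the argument order is (model, X, y):

Cell E3:  =ML.EVAL.SCORE(A31, B19, B20)    <- single score from the fitted model (argument order assumed)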

For score-based classification metrics (ROC_AUC, LOG_LOSS, etc.) you need probabilities — call ML.PREDICT_PROBA(model, X) and spill that 2-D result to pass as y_score.
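
Continuing the workflow above, a sketch of the probability path (the three-column range G35:I184 assumes a 3-class target):

Cell A33: =ML.PREDICT_PROBA(A31, B19)      <- Probabilities handle
Cell G35: =ML.DATA.SAMPLE(A33, -1)         <- SPILLS y_score values (one column per class)
Cell E4:  =ML.EVAL.CLASSIFICATION.LOG_LOSS(C35:C184, G35:I184)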


Regression Metrics (ML.EVAL.REGRESSION)

These metrics compare a 1-D array of ground-truth values (y_true) to a 1-D array of predicted values (y_pred) and return a single float (or an array when multioutput="raw_values").

Common parameters

  • y_true (Range, Required): Ground-truth target values (1-D).
  • y_pred (Range, Required): Predicted target values (1-D).
  • sample_weight (Range, Optional): Per-sample weights (same length as y_true).
  • multioutput (String, Default: "uniform_average"): "raw_values", "uniform_average", or "variance_weighted" (where supported).

Shared example setup (used in examples below):

Cell A1: =ML.DATASETS.DIABETES()
Cell B1: =ML.DATA.SELECT_COLUMNS(A1, "0:9")  <- X (features)
Cell C1: =ML.DATA.SELECT_COLUMNS(A1, 10)     <- y (target)
Cell D1: =ML.REGRESSION.LINEAR()
Cell E1: =ML.FIT(D1, B1, C1)
Cell F1: =ML.PREDICT(E1, B1)
Cell A10: =ML.DATA.SAMPLE(C1, -1)            <- SPILLS y_true (442 rows: A10:A451)
Cell C10: =ML.DATA.SAMPLE(F1, -1)            <- SPILLS y_pred (442 rows: C10:C451)
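
The optional arguments are positional; leave a slot empty to skip one, as in the workflow at the top of this reference. A quick sketch (the weight column G10:G451 is hypothetical):

Cell H1: =ML.EVAL.REGRESSION.MEAN_ABSOLUTE_ERROR(A10:A451, C10:C451, G10:G451)        <- sample-weighted MAE
Cell H2: =ML.EVAL.REGRESSION.MEAN_ABSOLUTE_ERROR(A10:A451, C10:C451, , "raw_values")  <- skip sample_weight, return per-output values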

ML.EVAL.REGRESSION.R2_SCORE()

R² (coefficient of determination). Best possible score is 1.0; can be negative for poor fits.

Syntax:

=ML.EVAL.REGRESSION.R2_SCORE(y_true, y_pred, [sample_weight], [multioutput], [force_finite])

Parameters:

  • y_true (Range, Required): Ground-truth target values.
  • y_pred (Range, Required): Predicted target values.
  • sample_weight (Range, Optional): Per-sample weights.
  • multioutput (String, Default: "uniform_average"): Aggregation method.
  • force_finite (Boolean, Default: TRUE): Replace non-finite R² values with a finite score (1.0 for perfect predictions, 0.0 otherwise).

Returns: float (or array if multioutput="raw_values")

Example:

Cell E10: =ML.EVAL.REGRESSION.R2_SCORE(A10:A451, C10:C451)
Result: 0.517

ML.EVAL.REGRESSION.MEAN_ABSOLUTE_ERROR()

Mean absolute error (MAE). Lower is better.

Syntax:

=ML.EVAL.REGRESSION.MEAN_ABSOLUTE_ERROR(y_true, y_pred, [sample_weight], [multioutput])

Parameters:

  • y_true (Range, Required): Ground-truth target values.
  • y_pred (Range, Required): Predicted target values.
  • sample_weight (Range, Optional): Per-sample weights.
  • multioutput (String, Default: "uniform_average"): Aggregation method.

Returns: float (or array)

Example:

Cell E11: =ML.EVAL.REGRESSION.MEAN_ABSOLUTE_ERROR(A10:A451, C10:C451)
Result: 44.3

ML.EVAL.REGRESSION.MEAN_SQUARED_ERROR()

Mean squared error (MSE). Lower is better.

Syntax:

=ML.EVAL.REGRESSION.MEAN_SQUARED_ERROR(y_true, y_pred, [sample_weight], [multioutput])

Parameters:

  • y_true (Range, Required): Ground-truth target values.
  • y_pred (Range, Required): Predicted target values.
  • sample_weight (Range, Optional): Per-sample weights.
  • multioutput (String, Default: "uniform_average"): Aggregation method.

Returns: float (or array)

Example:

Cell E12: =ML.EVAL.REGRESSION.MEAN_SQUARED_ERROR(A10:A451, C10:C451)
Result: 2861.4

ML.EVAL.REGRESSION.ROOT_MEAN_SQUARED_ERROR()

Root mean squared error (RMSE). Lower is better. Requires scikit-learn ≥ 1.4.

Syntax:

=ML.EVAL.REGRESSION.ROOT_MEAN_SQUARED_ERROR(y_true, y_pred, [sample_weight], [multioutput])

Parameters:

  • y_true (Range, Required): Ground-truth target values.
  • y_pred (Range, Required): Predicted target values.
  • sample_weight (Range, Optional): Per-sample weights.
  • multioutput (String, Default: "uniform_average"): Aggregation method.

Returns: float (or array)

Example:

Cell E13: =ML.EVAL.REGRESSION.ROOT_MEAN_SQUARED_ERROR(A10:A451, C10:C451)
Result: 53.5

ML.EVAL.REGRESSION.MEAN_SQUARED_LOG_ERROR()

Mean squared logarithmic error. Use for targets that grow exponentially. Both arrays must be non-negative.

Syntax:

=ML.EVAL.REGRESSION.MEAN_SQUARED_LOG_ERROR(y_true, y_pred, [sample_weight], [multioutput])

Parameters:

  • y_true (Range, Required): Ground-truth target values (non-negative).
  • y_pred (Range, Required): Predicted target values (non-negative).
  • sample_weight (Range, Optional): Per-sample weights.
  • multioutput (String, Default: "uniform_average"): Aggregation method.

Returns: float (or array)

Example:

Cell E14: =ML.EVAL.REGRESSION.MEAN_SQUARED_LOG_ERROR(A10:A451, C10:C451)
Result: 0.082

Notes:

  • Both y_true and y_pred must be non-negative.

ML.EVAL.REGRESSION.ROOT_MEAN_SQUARED_LOG_ERROR()

Root mean squared logarithmic error. Requires scikit-learn ≥ 1.4. Both arrays must be non-negative.

Syntax:

=ML.EVAL.REGRESSION.ROOT_MEAN_SQUARED_LOG_ERROR(y_true, y_pred, [sample_weight], [multioutput])

Parameters:

  • y_true (Range, Required): Ground-truth target values (non-negative).
  • y_pred (Range, Required): Predicted target values (non-negative).
  • sample_weight (Range, Optional): Per-sample weights.
  • multioutput (String, Default: "uniform_average"): Aggregation method.

Returns: float (or array)

Example:

Cell E15: =ML.EVAL.REGRESSION.ROOT_MEAN_SQUARED_LOG_ERROR(A10:A451, C10:C451)
Result: 0.286

Notes:

  • Both y_true and y_pred must be non-negative.

ML.EVAL.REGRESSION.MEDIAN_ABSOLUTE_ERROR()

Median absolute error. Robust to outliers. Note: parameter order places multioutput before sample_weight.

Syntax:

=ML.EVAL.REGRESSION.MEDIAN_ABSOLUTE_ERROR(y_true, y_pred, [multioutput], [sample_weight])

Parameters:

  • y_true (Range, Required): Ground-truth target values.
  • y_pred (Range, Required): Predicted target values.
  • multioutput (String, Default: "uniform_average"): Aggregation method.
  • sample_weight (Range, Optional): Per-sample weights.

Returns: float (or array)

Example:

Cell E16: =ML.EVAL.REGRESSION.MEDIAN_ABSOLUTE_ERROR(A10:A451, C10:C451)
Result: 37.1
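
Because sample_weight comes after multioutput here, leave the third slot empty to pass weights (the weight column G10:G451 is hypothetical):

Cell F16: =ML.EVAL.REGRESSION.MEDIAN_ABSOLUTE_ERROR(A10:A451, C10:C451, , G10:G451)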

ML.EVAL.REGRESSION.MAX_ERROR()

Maximum residual error (worst-case error). 1-D only — multioutput is not supported.

Syntax:

=ML.EVAL.REGRESSION.MAX_ERROR(y_true, y_pred)

Parameters:

  • y_true (Range, Required): Ground-truth target values.
  • y_pred (Range, Required): Predicted target values.

Returns: float

Example:

Cell E17: =ML.EVAL.REGRESSION.MAX_ERROR(A10:A451, C10:C451)
Result: 210.4

ML.EVAL.REGRESSION.EXPLAINED_VARIANCE_SCORE()

Explained variance regression score. Best possible score is 1.0.

Syntax:

=ML.EVAL.REGRESSION.EXPLAINED_VARIANCE_SCORE(y_true, y_pred, [sample_weight], [multioutput], [force_finite])

Parameters:

  • y_true (Range, Required): Ground-truth target values.
  • y_pred (Range, Required): Predicted target values.
  • sample_weight (Range, Optional): Per-sample weights.
  • multioutput (String, Default: "uniform_average"): Aggregation method.
  • force_finite (Boolean, Default: TRUE): Replace non-finite scores with finite values.

Returns: float (or array)

Example:

Cell E18: =ML.EVAL.REGRESSION.EXPLAINED_VARIANCE_SCORE(A10:A451, C10:C451)
Result: 0.518

ML.EVAL.REGRESSION.MEAN_POISSON_DEVIANCE()

Mean Poisson deviance regression loss. Use for count targets. y_true must be non-negative; y_pred must be strictly positive.

Syntax:

=ML.EVAL.REGRESSION.MEAN_POISSON_DEVIANCE(y_true, y_pred, [sample_weight])

Parameters:

  • y_true (Range, Required): Ground-truth target values (non-negative).
  • y_pred (Range, Required): Predicted target values (strictly positive).
  • sample_weight (Range, Optional): Per-sample weights.

Returns: float

Example:

Cell E19: =ML.EVAL.REGRESSION.MEAN_POISSON_DEVIANCE(A10:A451, C10:C451)
Result: 0.31

Notes:

  • y_true must be non-negative; y_pred must be strictly positive.

ML.EVAL.REGRESSION.MEAN_GAMMA_DEVIANCE()

Mean Gamma deviance regression loss. Use for strictly positive continuous targets. Both arrays must be strictly positive.

Syntax:

=ML.EVAL.REGRESSION.MEAN_GAMMA_DEVIANCE(y_true, y_pred, [sample_weight])

Parameters:

  • y_true (Range, Required): Ground-truth target values (strictly positive).
  • y_pred (Range, Required): Predicted target values (strictly positive).
  • sample_weight (Range, Optional): Per-sample weights.

Returns: float

Example:

Cell E20: =ML.EVAL.REGRESSION.MEAN_GAMMA_DEVIANCE(A10:A451, C10:C451)
Result: 0.18

Notes:

  • Both arrays must be strictly positive.

ML.EVAL.REGRESSION.MEAN_ABSOLUTE_PERCENTAGE_ERROR()

Mean absolute percentage error (MAPE). Sensitive to small y_true values.

Syntax:

=ML.EVAL.REGRESSION.MEAN_ABSOLUTE_PERCENTAGE_ERROR(y_true, y_pred, [sample_weight], [multioutput])

Parameters:

  • y_true (Range, Required): Ground-truth target values.
  • y_pred (Range, Required): Predicted target values.
  • sample_weight (Range, Optional): Per-sample weights.
  • multioutput (String, Default: "uniform_average"): Aggregation method.

Returns: float (or array)

Example:

Cell E21: =ML.EVAL.REGRESSION.MEAN_ABSOLUTE_PERCENTAGE_ERROR(A10:A451, C10:C451)
Result: 0.34

ML.EVAL.REGRESSION.D2_ABSOLUTE_ERROR_SCORE()

D² score, fraction of absolute error explained. Analogue of R² for MAE.

Syntax:

=ML.EVAL.REGRESSION.D2_ABSOLUTE_ERROR_SCORE(y_true, y_pred, [sample_weight], [multioutput])

Parameters:

  • y_true (Range, Required): Ground-truth target values.
  • y_pred (Range, Required): Predicted target values.
  • sample_weight (Range, Optional): Per-sample weights.
  • multioutput (String, Default: "uniform_average"): Aggregation method.

Returns: float (or array)

Example:

Cell E22: =ML.EVAL.REGRESSION.D2_ABSOLUTE_ERROR_SCORE(A10:A451, C10:C451)
Result: 0.46

ML.EVAL.REGRESSION.D2_PINBALL_SCORE()

D² score, fraction of pinball loss explained. Used for quantile regression evaluation.

Syntax:

=ML.EVAL.REGRESSION.D2_PINBALL_SCORE(y_true, y_pred, [sample_weight], [alpha], [multioutput])

Parameters:

  • y_true (Range, Required): Ground-truth target values.
  • y_pred (Range, Required): Predicted target values.
  • sample_weight (Range, Optional): Per-sample weights.
  • alpha (Float, Default: 0.5): Quantile level (0.5 = median).
  • multioutput (String, Default: "uniform_average"): Aggregation method.

Returns: float (or array)

Example:

Cell E23: =ML.EVAL.REGRESSION.D2_PINBALL_SCORE(A10:A451, C10:C451)
Result: 0.46
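
To score a quantile other than the median, skip the sample_weight slot and set alpha, e.g. the 90th percentile:

Cell F23: =ML.EVAL.REGRESSION.D2_PINBALL_SCORE(A10:A451, C10:C451, , 0.9)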

ML.EVAL.REGRESSION.D2_TWEEDIE_SCORE()

D² score, fraction of Tweedie deviance explained.

Syntax:

=ML.EVAL.REGRESSION.D2_TWEEDIE_SCORE(y_true, y_pred, [sample_weight], [power])

Parameters:

  • y_true (Range, Required): Ground-truth target values.
  • y_pred (Range, Required): Predicted target values.
  • sample_weight (Range, Optional): Per-sample weights.
  • power (Float, Default: 0): Tweedie power — 0=Normal, 1=Poisson, 2=Gamma, 3=Inverse Gaussian.

Returns: float

Example:

Cell E24: =ML.EVAL.REGRESSION.D2_TWEEDIE_SCORE(A10:A451, C10:C451)
Result: 0.52
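
To evaluate under a different deviance, skip the sample_weight slot and set power, e.g. Gamma deviance (power=2, which requires strictly positive values):

Cell F24: =ML.EVAL.REGRESSION.D2_TWEEDIE_SCORE(A10:A451, C10:C451, , 2)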

Classification Metrics (ML.EVAL.CLASSIFICATION)

These metrics evaluate classifier predictions. Two input shapes are used depending on the metric:

  • Label-based metrics take y_pred (predicted class labels from ML.PREDICT)
  • Score-based metrics take y_score (probabilities from ML.PREDICT_PROBA or decision-function output)

Important: Passing 1-D integer labels to a score-based metric will raise an error. Score-based metrics require continuous probability scores, not class labels.

Common parameters

  • y_true (Range, Required): Ground-truth class labels.
  • y_pred or y_score (Range, Required): Predicted labels OR probability/score matrix (depending on metric — see each entry).
  • sample_weight (Range, Optional): Per-sample weights.
  • labels (Range, Optional): Subset/order of class labels.
  • pos_label (Integer or String, Default: 1): Positive class label for binary metrics.
  • average (String, Default: "binary" or "macro"): "binary", "micro", "macro", "weighted", "samples", or None (returns per-class array).
  • zero_division (String or Float, Default: "warn"): Behavior on 0/0 division — "warn", 0, 1, or np.nan.

Shared example setup (used in label-based examples below):

Cell A1: =ML.DATASETS.IRIS()
Cell B1: =ML.DATA.SELECT_COLUMNS(A1, "0:3")  <- X (features)
Cell C1: =ML.DATA.SELECT_COLUMNS(A1, 4)      <- y (target)
Cell D1: =ML.CLASSIFICATION.LOGISTIC()
Cell E1: =ML.FIT(D1, B1, C1)
Cell F1: =ML.PREDICT(E1, B1)
Cell A10: =ML.DATA.SAMPLE(C1, -1)             <- SPILLS y_true (150 rows: A10:A159)
Cell C10: =ML.DATA.SAMPLE(F1, -1)             <- SPILLS y_pred (150 rows: C10:C159)

Label-based metrics

These metrics compare y_true to y_pred (class labels, typically from =ML.PREDICT(...)).

ML.EVAL.CLASSIFICATION.ACCURACY()

Accuracy classification score — fraction of correctly classified samples.

Syntax:

=ML.EVAL.CLASSIFICATION.ACCURACY(y_true, y_pred, [normalize], [sample_weight])

Parameters:

  • y_true (Range, Required): Ground-truth class labels.
  • y_pred (Range, Required): Predicted class labels.
  • normalize (Boolean, Default: TRUE): If FALSE, returns raw count of correct predictions.
  • sample_weight (Range, Optional): Per-sample weights.

Returns: float (or int if normalize=FALSE)

Example:

Cell E10: =ML.EVAL.CLASSIFICATION.ACCURACY(A10:A159, C10:C159)
Result: 0.97
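
With normalize=FALSE the same call returns the raw count of correct predictions instead of a fraction:

Cell F10: =ML.EVAL.CLASSIFICATION.ACCURACY(A10:A159, C10:C159, FALSE)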

ML.EVAL.CLASSIFICATION.BALANCED_ACCURACY()

Balanced accuracy — macro-average of recall per class. Useful for imbalanced datasets.

Syntax:

=ML.EVAL.CLASSIFICATION.BALANCED_ACCURACY(y_true, y_pred, [sample_weight], [adjusted])

Parameters:

  • y_true (Range, Required): Ground-truth class labels.
  • y_pred (Range, Required): Predicted class labels.
  • sample_weight (Range, Optional): Per-sample weights.
  • adjusted (Boolean, Default: FALSE): Adjust so that chance performance equals 0.

Returns: float

Example:

Cell E11: =ML.EVAL.CLASSIFICATION.BALANCED_ACCURACY(A10:A159, C10:C159)
Result: 0.97

ML.EVAL.CLASSIFICATION.F1()

F1 score — harmonic mean of precision and recall.

Syntax:

=ML.EVAL.CLASSIFICATION.F1(y_true, y_pred, [labels], [pos_label], [average], [sample_weight], [zero_division])

Parameters:

  • y_true (Range, Required): Ground-truth class labels.
  • y_pred (Range, Required): Predicted class labels.
  • labels (Range, Optional): Subset/order of class labels.
  • pos_label (Integer or String, Default: 1): Positive class for binary metrics.
  • average (String, Default: "binary"): Aggregation method.
  • sample_weight (Range, Optional): Per-sample weights.
  • zero_division (String or Float, Default: "warn"): Behavior on 0/0.

Returns: float (or array if average=None)

Example:

Cell E12: =ML.EVAL.CLASSIFICATION.F1(A10:A159, C10:C159, , , "macro")
Result: 0.97

ML.EVAL.CLASSIFICATION.PRECISION()

Precision score — TP / (TP + FP).

Syntax:

=ML.EVAL.CLASSIFICATION.PRECISION(y_true, y_pred, [labels], [pos_label], [average], [sample_weight], [zero_division])

Parameters:

  • y_true (Range, Required): Ground-truth class labels.
  • y_pred (Range, Required): Predicted class labels.
  • labels (Range, Optional): Subset/order of class labels.
  • pos_label (Integer or String, Default: 1): Positive class for binary metrics.
  • average (String, Default: "binary"): Aggregation method.
  • sample_weight (Range, Optional): Per-sample weights.
  • zero_division (String or Float, Default: "warn"): Behavior on 0/0.

Returns: float (or array if average=None)

Example:

Cell E13: =ML.EVAL.CLASSIFICATION.PRECISION(A10:A159, C10:C159, , , "macro")
Result: 0.97

ML.EVAL.CLASSIFICATION.RECALL()

Recall score — TP / (TP + FN).

Syntax:

=ML.EVAL.CLASSIFICATION.RECALL(y_true, y_pred, [labels], [pos_label], [average], [sample_weight], [zero_division])

Parameters:

  • y_true (Range, Required): Ground-truth class labels.
  • y_pred (Range, Required): Predicted class labels.
  • labels (Range, Optional): Subset/order of class labels.
  • pos_label (Integer or String, Default: 1): Positive class for binary metrics.
  • average (String, Default: "binary"): Aggregation method.
  • sample_weight (Range, Optional): Per-sample weights.
  • zero_division (String or Float, Default: "warn"): Behavior on 0/0.

Returns: float (or array if average=None)

Example:

Cell E14: =ML.EVAL.CLASSIFICATION.RECALL(A10:A159, C10:C159, , , "macro")
Result: 0.97

ML.EVAL.CLASSIFICATION.JACCARD()

Jaccard similarity coefficient — |intersection| / |union|.

Syntax:

=ML.EVAL.CLASSIFICATION.JACCARD(y_true, y_pred, [labels], [pos_label], [average], [sample_weight], [zero_division])

Parameters:

  • y_true (Range, Required): Ground-truth class labels.
  • y_pred (Range, Required): Predicted class labels.
  • labels (Range, Optional): Subset/order of class labels.
  • pos_label (Integer or String, Default: 1): Positive class for binary metrics.
  • average (String, Default: "binary"): Aggregation method.
  • sample_weight (Range, Optional): Per-sample weights.
  • zero_division (String or Float, Default: "warn"): Behavior on 0/0.

Returns: float (or array if average=None)

Example:

Cell E15: =ML.EVAL.CLASSIFICATION.JACCARD(A10:A159, C10:C159, , , "macro")
Result: 0.94

ML.EVAL.CLASSIFICATION.MATTHEWS_CORRCOEF()

Matthews correlation coefficient. Range [-1, 1]; 0 = random. Robust to class imbalance.

Syntax:

=ML.EVAL.CLASSIFICATION.MATTHEWS_CORRCOEF(y_true, y_pred, [sample_weight])

Parameters:

  • y_true (Range, Required): Ground-truth class labels.
  • y_pred (Range, Required): Predicted class labels.
  • sample_weight (Range, Optional): Per-sample weights.

Returns: float

Example:

Cell E16: =ML.EVAL.CLASSIFICATION.MATTHEWS_CORRCOEF(A10:A159, C10:C159)
Result: 0.96

Score-based metrics

These metrics require continuous probability scores, not class labels. Use =ML.PREDICT_PROBA(model, X) and spill the result into a 2-D range to use as y_score.

Warning: Passing integer class labels (from ML.PREDICT) to these functions will raise an error.

Shared example setup for score-based metrics:

Cell A1: =ML.DATASETS.IRIS()
Cell B1: =ML.DATA.SELECT_COLUMNS(A1, "0:3")  <- X (features)
Cell C1: =ML.DATA.SELECT_COLUMNS(A1, 4)      <- y (target)
Cell D1: =ML.CLASSIFICATION.LOGISTIC()
Cell E1: =ML.FIT(D1, B1, C1)
Cell G1: =ML.PREDICT_PROBA(E1, B1)           <- Probabilities handle
Cell A10: =ML.DATA.SAMPLE(C1, -1)             <- SPILLS y_true (150 rows: A10:A159)
Cell C10: =ML.DATA.SAMPLE(G1, -1)             <- SPILLS y_score (150 rows × 3 columns, filling C10:E159)

ML.EVAL.CLASSIFICATION.TOP_K_ACCURACY()

Top-k accuracy — counts a prediction correct if the true label is among the top-k predicted classes. Requires y_score (probabilities or decision scores).

Syntax:

=ML.EVAL.CLASSIFICATION.TOP_K_ACCURACY(y_true, y_score, [k], [normalize], [sample_weight], [labels])

Parameters:

  • y_true (Range, Required): Ground-truth class labels.
  • y_score (Range, Required): Probability or decision score matrix (2-D).
  • k (Integer, Default: 2): Number of top predictions to consider.
  • normalize (Boolean, Default: TRUE): If FALSE, returns raw count.
  • sample_weight (Range, Optional): Per-sample weights.
  • labels (Range, Optional): Subset/order of class labels.

Returns: float (or int)

Example:

Cell E17: =ML.EVAL.CLASSIFICATION.TOP_K_ACCURACY(A10:A159, C10:E159, 2)
Result: 0.99

ML.EVAL.CLASSIFICATION.ROC_AUC()

Area under the ROC curve. Requires y_score (probabilities or decision function output).

Syntax:

=ML.EVAL.CLASSIFICATION.ROC_AUC(y_true, y_score, [average], [sample_weight], [max_fpr], [multi_class], [labels])

Parameters:

  • y_true (Range, Required): Ground-truth class labels.
  • y_score (Range, Required): Probability or decision score matrix.
  • average (String, Default: "macro"): "micro", "macro", "weighted", "samples", or None.
  • sample_weight (Range, Optional): Per-sample weights.
  • max_fpr (Float, Optional): Compute standardized partial AUC over [0, max_fpr].
  • multi_class (String, Default: "raise"): "raise", "ovr", or "ovo"; set to "ovr" or "ovo" when there are more than two classes.
  • labels (Range, Optional): Subset/order of class labels.

Returns: float

Example:

Cell E18: =ML.EVAL.CLASSIFICATION.ROC_AUC(A10:A159, C10:E159, "macro", , , "ovr")
Result: 0.999

Notes:

  • Errors if only one class is present in y_true.

ML.EVAL.CLASSIFICATION.AVERAGE_PRECISION()

Average precision — area under the precision-recall curve. Requires y_score.

Syntax:

=ML.EVAL.CLASSIFICATION.AVERAGE_PRECISION(y_true, y_score, [average], [pos_label], [sample_weight])

Parameters:

  • y_true (Range, Required): Ground-truth class labels.
  • y_score (Range, Required): Probability or decision score matrix.
  • average (String, Optional): Aggregation method.
  • pos_label (Integer or String, Default: 1): Positive class for binary metrics.
  • sample_weight (Range, Optional): Per-sample weights.

Returns: float

Example:

Cell E19: =ML.EVAL.CLASSIFICATION.AVERAGE_PRECISION(A10:A159, C10:E159)
Result: 0.98

ML.EVAL.CLASSIFICATION.BRIER_SCORE_LOSS()

Brier score — mean squared error between predicted probabilities and true labels. Lower is better. Requires y_score.

Syntax:

=ML.EVAL.CLASSIFICATION.BRIER_SCORE_LOSS(y_true, y_score, [sample_weight], [pos_label])

Parameters:

  • y_true (Range, Required): Ground-truth class labels (binary).
  • y_score (Range, Required): Predicted probabilities for the positive class.
  • sample_weight (Range, Optional): Per-sample weights.
  • pos_label (Integer or String, Optional): Positive class label.

Returns: float

Example:

Cell E20: =ML.EVAL.CLASSIFICATION.BRIER_SCORE_LOSS(A10:A159, C10:C159)
Result: 0.03

Notes:

  • Binary targets only: y_true must contain exactly two classes, and y_score is a single column of positive-class probabilities (one column of the ML.PREDICT_PROBA spill), so the three-class Iris setup above does not apply directly.

ML.EVAL.CLASSIFICATION.LOG_LOSS()

Cross-entropy / log loss. Requires y_score (probabilities). Lower is better.

Syntax:

=ML.EVAL.CLASSIFICATION.LOG_LOSS(y_true, y_score, [normalize], [sample_weight], [labels])

Parameters:

  • y_true (Range, Required): Ground-truth class labels.
  • y_score (Range, Required): Predicted probabilities (2-D for multi-class).
  • normalize (Boolean, Default: TRUE): If FALSE, returns sum of per-sample losses.
  • sample_weight (Range, Optional): Per-sample weights.
  • labels (Range, Optional): Subset/order of class labels.

Returns: float

Example:

Cell E21: =ML.EVAL.CLASSIFICATION.LOG_LOSS(A10:A159, C10:E159)
Result: 0.07
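
With normalize=FALSE the same call returns the sum of per-sample losses rather than their mean:

Cell F21: =ML.EVAL.CLASSIFICATION.LOG_LOSS(A10:A159, C10:E159, FALSE)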

ML.EVAL.CLASSIFICATION.D2_LOG_LOSS_SCORE()

D² score, fraction of log loss explained. Analogue of R² for log loss. Requires y_score.

Syntax:

=ML.EVAL.CLASSIFICATION.D2_LOG_LOSS_SCORE(y_true, y_score, [sample_weight], [labels])

Parameters:

  • y_true (Range, Required): Ground-truth class labels.
  • y_score (Range, Required): Predicted probabilities (2-D for multi-class).
  • sample_weight (Range, Optional): Per-sample weights.
  • labels (Range, Optional): Subset/order of class labels.

Returns: float

Example:

Cell E22: =ML.EVAL.CLASSIFICATION.D2_LOG_LOSS_SCORE(A10:A159, C10:E159)
Result: 0.91

Clustering Metrics (ML.EVAL.CLUSTERING)

These metrics compare two label arrays: labels_true (ground-truth cluster assignments, when available) and labels_pred (predicted cluster assignments, e.g., from ML.PREDICT on a fitted KMeans). All functions take 1-D arrays of integer labels and return a single float.

Common parameters

  • labels_true (Range, Required): Ground-truth cluster assignments (1-D integer array).
  • labels_pred (Range, Required): Predicted cluster assignments (1-D integer array).

Shared example setup (used in examples below):

Cell A1: =ML.DATASETS.IRIS()
Cell B1: =ML.DATA.SELECT_COLUMNS(A1, "0:3")  <- X (features)
Cell C1: =ML.DATA.SELECT_COLUMNS(A1, 4)      <- y_true (ground-truth labels)
Cell D1: =ML.CLUSTERING.KMEANS(3)
Cell E1: =ML.FIT(D1, B1)
Cell F1: =ML.PREDICT(E1, B1)                 <- Cluster predictions handle
Cell A10: =ML.DATA.SAMPLE(C1, -1)             <- SPILLS labels_true (150 rows: A10:A159)
Cell C10: =ML.DATA.SAMPLE(F1, -1)             <- SPILLS labels_pred (150 rows: C10:C159)
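
These functions compare any two label assignments, so they can also measure agreement between two clusterings. A sketch using a second KMeans run with four clusters (cells D2, E2, F2, G10, and E20 are assumed to be free):

Cell D2: =ML.CLUSTERING.KMEANS(4)
Cell E2: =ML.FIT(D2, B1)
Cell F2: =ML.PREDICT(E2, B1)
Cell G10: =ML.DATA.SAMPLE(F2, -1)             <- SPILLS labels from the 4-cluster model
Cell E20: =ML.EVAL.CLUSTERING.ADJUSTED_RAND_SCORE(C10:C159, G10:G159)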

ML.EVAL.CLUSTERING.ADJUSTED_RAND_SCORE()

Adjusted Rand index. Range [-0.5, 1.0]; 0 = random, 1 = perfect match. Adjusted for chance.

Syntax:

=ML.EVAL.CLUSTERING.ADJUSTED_RAND_SCORE(labels_true, labels_pred)

Parameters:

  • labels_true (Range, Required): Ground-truth cluster assignments.
  • labels_pred (Range, Required): Predicted cluster assignments.

Returns: float

Example:

Using the setup above (KMeans on Iris, comparing predicted clusters to ground-truth species):

Cell E10: =ML.EVAL.CLUSTERING.ADJUSTED_RAND_SCORE(A10:A159, C10:C159)
Result: 0.73

ML.EVAL.CLUSTERING.RAND_SCORE()

Rand index — similarity measure between clusterings. Range [0, 1]. Not adjusted for chance.

Syntax:

=ML.EVAL.CLUSTERING.RAND_SCORE(labels_true, labels_pred)

Parameters:

  • labels_true (Range, Required): Ground-truth cluster assignments.
  • labels_pred (Range, Required): Predicted cluster assignments.

Returns: float

Example:

Cell E11: =ML.EVAL.CLUSTERING.RAND_SCORE(A10:A159, C10:C159)
Result: 0.88

ML.EVAL.CLUSTERING.MUTUAL_INFO_SCORE()

Mutual information between two clusterings.

Syntax:

=ML.EVAL.CLUSTERING.MUTUAL_INFO_SCORE(labels_true, labels_pred)

Parameters:

  • labels_true (Range, Required): Ground-truth cluster assignments.
  • labels_pred (Range, Required): Predicted cluster assignments.

Returns: float

Example:

Cell E12: =ML.EVAL.CLUSTERING.MUTUAL_INFO_SCORE(A10:A159, C10:C159)
Result: 0.87

ML.EVAL.CLUSTERING.ADJUSTED_MUTUAL_INFO_SCORE()

Adjusted mutual information. Range [0, 1]; adjusted for chance.

Syntax:

=ML.EVAL.CLUSTERING.ADJUSTED_MUTUAL_INFO_SCORE(labels_true, labels_pred, [average_method])

Parameters:

  • labels_true (Range, Required): Ground-truth cluster assignments.
  • labels_pred (Range, Required): Predicted cluster assignments.
  • average_method (String, Default: "arithmetic"): "arithmetic", "geometric", "min", or "max".

Returns: float

Example:

Cell E13: =ML.EVAL.CLUSTERING.ADJUSTED_MUTUAL_INFO_SCORE(A10:A159, C10:C159)
Result: 0.74

ML.EVAL.CLUSTERING.NORMALIZED_MUTUAL_INFO_SCORE()

Mutual information normalized by the average of the entropies. Range [0, 1].

Syntax:

=ML.EVAL.CLUSTERING.NORMALIZED_MUTUAL_INFO_SCORE(labels_true, labels_pred, [average_method])

Parameters:

  • labels_true (Range, Required): Ground-truth cluster assignments.
  • labels_pred (Range, Required): Predicted cluster assignments.
  • average_method (String, Default: "arithmetic"): "arithmetic", "geometric", "min", or "max".

Returns: float

Example:

Cell E14: =ML.EVAL.CLUSTERING.NORMALIZED_MUTUAL_INFO_SCORE(A10:A159, C10:C159)
Result: 0.76

ML.EVAL.CLUSTERING.HOMOGENEITY_SCORE()

Homogeneity — each cluster contains only members of a single class. Range [0, 1].

Syntax:

=ML.EVAL.CLUSTERING.HOMOGENEITY_SCORE(labels_true, labels_pred)

Parameters:

  • labels_true (Range, Required): Ground-truth cluster assignments.
  • labels_pred (Range, Required): Predicted cluster assignments.

Returns: float

Example:

Cell E15: =ML.EVAL.CLUSTERING.HOMOGENEITY_SCORE(A10:A159, C10:C159)
Result: 0.75

ML.EVAL.CLUSTERING.COMPLETENESS_SCORE()

Completeness — all members of a given class are assigned to the same cluster. Range [0, 1].

Syntax:

=ML.EVAL.CLUSTERING.COMPLETENESS_SCORE(labels_true, labels_pred)

Parameters:

  • labels_true (Range, Required): Ground-truth cluster assignments.
  • labels_pred (Range, Required): Predicted cluster assignments.

Returns: float

Example:

Cell E16: =ML.EVAL.CLUSTERING.COMPLETENESS_SCORE(A10:A159, C10:C159)
Result: 0.76

ML.EVAL.CLUSTERING.V_MEASURE_SCORE()

V-measure — harmonic mean of homogeneity and completeness. Range [0, 1].

Syntax:

=ML.EVAL.CLUSTERING.V_MEASURE_SCORE(labels_true, labels_pred, [beta])

Parameters:

  • labels_true (Range, Required): Ground-truth cluster assignments.
  • labels_pred (Range, Required): Predicted cluster assignments.
  • beta (Float, Default: 1.0): Weight of homogeneity vs completeness (> 1 weights completeness more).

Returns: float

Example:

Cell E17: =ML.EVAL.CLUSTERING.V_MEASURE_SCORE(A10:A159, C10:C159)
Result: 0.76
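
Setting beta above 1 weights completeness more heavily than homogeneity:

Cell F17: =ML.EVAL.CLUSTERING.V_MEASURE_SCORE(A10:A159, C10:C159, 2)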

ML.EVAL.CLUSTERING.FOWLKES_MALLOWS_SCORE()

Fowlkes-Mallows index — geometric mean of pairwise precision and recall. Range [0, 1].

Syntax:

=ML.EVAL.CLUSTERING.FOWLKES_MALLOWS_SCORE(labels_true, labels_pred, [sparse])

Parameters:

  • labels_true (Range, Required): Ground-truth cluster assignments.
  • labels_pred (Range, Required): Predicted cluster assignments.
  • sparse (Boolean, Default: FALSE): Use sparse matrix internally for large label arrays.

Returns: float

Example:

Cell E18: =ML.EVAL.CLUSTERING.FOWLKES_MALLOWS_SCORE(A10:A159, C10:C159)
Result: 0.82