Evaluation Metrics Reference
These metric functions work on arrays of values — typically the spilled output of ML.DATA.SAMPLE on a predictions handle, or a column range from a DataFrame preview. They accept raw cell ranges directly, not object handles. For the model-level score shortcut (a single number from a fitted model), use ML.EVAL.SCORE instead.
Typical metric workflow:
Cell A31: =ML.FIT(A30, B19, B20) <- Fitted model handle
Cell A32: =ML.PREDICT(A31, B19) <- Predictions handle
Cell A35: =ML.DATA.SAMPLE(A32, -1) <- SPILLS y_pred values down column A
Cell C35: =ML.DATA.SAMPLE(B20, -1) <- SPILLS y_true values down column C
Cell E1: =ML.EVAL.CLASSIFICATION.ACCURACY(C35:C184, A35:A184)
Cell E2: =ML.EVAL.CLASSIFICATION.F1(C35:C184, A35:A184, , , "macro")
For score-based classification metrics (ROC_AUC, LOG_LOSS, etc.) you need probabilities — call ML.PREDICT_PROBA(model, X) and spill that 2-D result to pass as y_score.
Regression Metrics (ML.EVAL.REGRESSION)
These metrics compare a 1-D array of ground-truth values (y_true) to a 1-D array of predicted values (y_pred) and return a single float (or an array when multioutput="raw_values").
Common parameters
- y_true (Range, Required): Ground-truth target values (1-D).
- y_pred (Range, Required): Predicted target values (1-D).
- sample_weight (Range, Optional): Per-sample weights (same length as y_true; see the weighted sketch after the setup below).
- multioutput (String, Default: "uniform_average"): "raw_values", "uniform_average", or "variance_weighted" (where supported).
Shared example setup (used in examples below):
Cell A1: =ML.DATASETS.DIABETES()
Cell B1: =ML.DATA.SELECT_COLUMNS(A1, "0:9") <- X (features)
Cell C1: =ML.DATA.SELECT_COLUMNS(A1, 10) <- y (target)
Cell D1: =ML.REGRESSION.LINEAR()
Cell E1: =ML.FIT(D1, B1, C1)
Cell F1: =ML.PREDICT(E1, B1)
Cell A10: =ML.DATA.SAMPLE(C1, -1) <- SPILLS y_true (442 rows: A10:A451)
Cell C10: =ML.DATA.SAMPLE(F1, -1) <- SPILLS y_pred (442 rows: C10:C451)
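A hedged sketch of passing the optional sample_weight argument positionally after y_pred (the weights range D10:D451 is hypothetical; it is not created by the setup above):
Cell G1: =ML.EVAL.REGRESSION.MEAN_ABSOLUTE_ERROR(A10:A451, C10:C451, D10:D451) <- weighted MAE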
ML.EVAL.REGRESSION.R2_SCORE()
R² (coefficient of determination). Best possible score is 1.0; can be negative for poor fits.
Syntax:
=ML.EVAL.REGRESSION.R2_SCORE(y_true, y_pred, [sample_weight], [multioutput], [force_finite])
Parameters:
- y_true (Range, Required): Ground-truth target values.
- y_pred (Range, Required): Predicted target values.
- sample_weight (Range, Optional): Per-sample weights.
- multioutput (String, Default: "uniform_average"): Aggregation method.
- force_finite (Boolean, Default: TRUE): Replace non-finite R² with 1.0 (perfect) or 0.0.
Returns: float (or array if multioutput="raw_values")
Example:
Cell E10: =ML.EVAL.REGRESSION.R2_SCORE(A10:A451, C10:C451)
Result: 0.517
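When y_true is constant, R² is not finite; a sketch that disables the substitution to surface the raw value (the empty arguments skip sample_weight and multioutput):
Cell F10: =ML.EVAL.REGRESSION.R2_SCORE(A10:A451, C10:C451, , , FALSE)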
ML.EVAL.REGRESSION.MEAN_ABSOLUTE_ERROR()
Mean absolute error (MAE). Lower is better.
Syntax:
=ML.EVAL.REGRESSION.MEAN_ABSOLUTE_ERROR(y_true, y_pred, [sample_weight], [multioutput])
Parameters:
- y_true (Range, Required): Ground-truth target values.
- y_pred (Range, Required): Predicted target values.
- sample_weight (Range, Optional): Per-sample weights.
- multioutput (String, Default: "uniform_average"): Aggregation method.
Returns: float (or array)
Example:
Cell E11: =ML.EVAL.REGRESSION.MEAN_ABSOLUTE_ERROR(A10:A451, C10:C451)
Result: 44.3
ML.EVAL.REGRESSION.MEAN_SQUARED_ERROR()
Mean squared error (MSE). Lower is better.
Syntax:
=ML.EVAL.REGRESSION.MEAN_SQUARED_ERROR(y_true, y_pred, [sample_weight], [multioutput])
Parameters:
- y_true (Range, Required): Ground-truth target values.
- y_pred (Range, Required): Predicted target values.
- sample_weight (Range, Optional): Per-sample weights.
- multioutput (String, Default: "uniform_average"): Aggregation method.
Returns: float (or array)
Example:
Cell E12: =ML.EVAL.REGRESSION.MEAN_SQUARED_ERROR(A10:A451, C10:C451)
Result: 2861.4
ML.EVAL.REGRESSION.ROOT_MEAN_SQUARED_ERROR()
Root mean squared error (RMSE). Lower is better. Requires scikit-learn ≥ 1.4.
Syntax:
=ML.EVAL.REGRESSION.ROOT_MEAN_SQUARED_ERROR(y_true, y_pred, [sample_weight], [multioutput])
Parameters:
- y_true (Range, Required): Ground-truth target values.
- y_pred (Range, Required): Predicted target values.
- sample_weight (Range, Optional): Per-sample weights.
- multioutput (String, Default: "uniform_average"): Aggregation method.
Returns: float (or array)
Example:
Cell E13: =ML.EVAL.REGRESSION.ROOT_MEAN_SQUARED_ERROR(A10:A451, C10:C451)
Result: 53.5
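RMSE is the square root of MSE, so the two entries can be cross-checked with the spreadsheet's own SQRT (assuming E12 holds the MSE result above):
Cell F13: =SQRT(E12) <- returns ≈ 53.5, matching E13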
ML.EVAL.REGRESSION.MEAN_SQUARED_LOG_ERROR()
Mean squared logarithmic error. Use for targets that grow exponentially. Both arrays must be non-negative.
Syntax:
=ML.EVAL.REGRESSION.MEAN_SQUARED_LOG_ERROR(y_true, y_pred, [sample_weight], [multioutput])
Parameters:
- y_true (Range, Required): Ground-truth target values (non-negative).
- y_pred (Range, Required): Predicted target values (non-negative).
- sample_weight (Range, Optional): Per-sample weights.
- multioutput (String, Default: "uniform_average"): Aggregation method.
Returns: float (or array)
Example:
Cell E14: =ML.EVAL.REGRESSION.MEAN_SQUARED_LOG_ERROR(A10:A451, C10:C451)
Result: 0.082
Notes:
- Both y_true and y_pred must be non-negative.
ML.EVAL.REGRESSION.ROOT_MEAN_SQUARED_LOG_ERROR()
Root mean squared logarithmic error. Requires scikit-learn ≥ 1.4. Both arrays must be non-negative.
Syntax:
=ML.EVAL.REGRESSION.ROOT_MEAN_SQUARED_LOG_ERROR(y_true, y_pred, [sample_weight], [multioutput])
Parameters:
- y_true (Range, Required): Ground-truth target values (non-negative).
- y_pred (Range, Required): Predicted target values (non-negative).
- sample_weight (Range, Optional): Per-sample weights.
- multioutput (String, Default: "uniform_average"): Aggregation method.
Returns: float (or array)
Example:
Cell E15: =ML.EVAL.REGRESSION.ROOT_MEAN_SQUARED_LOG_ERROR(A10:A451, C10:C451)
Result: 0.286
Notes:
- Both y_true and y_pred must be non-negative.
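Likewise, RMSLE is the square root of MSLE, which makes a quick sanity check (assuming E14 holds the MSLE result above):
Cell F15: =SQRT(E14) <- returns ≈ 0.286, matching E15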
ML.EVAL.REGRESSION.MEDIAN_ABSOLUTE_ERROR()
Median absolute error. Robust to outliers. Note: parameter order places multioutput before sample_weight.
Syntax:
=ML.EVAL.REGRESSION.MEDIAN_ABSOLUTE_ERROR(y_true, y_pred, [multioutput], [sample_weight])
Parameters:
- y_true (Range, Required): Ground-truth target values.
- y_pred (Range, Required): Predicted target values.
- multioutput (String, Default: "uniform_average"): Aggregation method.
- sample_weight (Range, Optional): Per-sample weights.
Returns: float (or array)
Example:
Cell E16: =ML.EVAL.REGRESSION.MEDIAN_ABSOLUTE_ERROR(A10:A451, C10:C451)
Result: 37.1
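Because multioutput occupies the third slot here, a weighted call must leave that slot empty (D10:D451 is a hypothetical weights range, not created by the setup above):
Cell F16: =ML.EVAL.REGRESSION.MEDIAN_ABSOLUTE_ERROR(A10:A451, C10:C451, , D10:D451)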
ML.EVAL.REGRESSION.MAX_ERROR()
Maximum residual error (worst-case error). 1-D only — multioutput is not supported.
Syntax:
=ML.EVAL.REGRESSION.MAX_ERROR(y_true, y_pred)
Parameters:
- y_true (Range, Required): Ground-truth target values.
- y_pred (Range, Required): Predicted target values.
Returns: float
Example:
Cell E17: =ML.EVAL.REGRESSION.MAX_ERROR(A10:A451, C10:C451)
Result: 210.4
ML.EVAL.REGRESSION.EXPLAINED_VARIANCE_SCORE()
Explained variance regression score. Best possible score is 1.0.
Syntax:
=ML.EVAL.REGRESSION.EXPLAINED_VARIANCE_SCORE(y_true, y_pred, [sample_weight], [multioutput], [force_finite])
Parameters:
- y_true (Range, Required): Ground-truth target values.
- y_pred (Range, Required): Predicted target values.
- sample_weight (Range, Optional): Per-sample weights.
- multioutput (String, Default: "uniform_average"): Aggregation method.
- force_finite (Boolean, Default: TRUE): Replace non-finite scores with finite values.
Returns: float (or array)
Example:
Cell E18: =ML.EVAL.REGRESSION.EXPLAINED_VARIANCE_SCORE(A10:A451, C10:C451)
Result: 0.518
ML.EVAL.REGRESSION.MEAN_POISSON_DEVIANCE()
Mean Poisson deviance regression loss. Use for count targets. y_true must be non-negative; y_pred must be strictly positive.
Syntax:
=ML.EVAL.REGRESSION.MEAN_POISSON_DEVIANCE(y_true, y_pred, [sample_weight])
Parameters:
- y_true (Range, Required): Ground-truth target values (non-negative).
- y_pred (Range, Required): Predicted target values (strictly positive).
- sample_weight (Range, Optional): Per-sample weights.
Returns: float
Example:
Cell E19: =ML.EVAL.REGRESSION.MEAN_POISSON_DEVIANCE(A10:A451, C10:C451)
Result: 0.31
Notes:
- y_true must be non-negative; y_pred must be strictly positive.
ML.EVAL.REGRESSION.MEAN_GAMMA_DEVIANCE()
Mean Gamma deviance regression loss. Use for strictly positive continuous targets. Both arrays must be strictly positive.
Syntax:
=ML.EVAL.REGRESSION.MEAN_GAMMA_DEVIANCE(y_true, y_pred, [sample_weight])
Parameters:
- y_true (Range, Required): Ground-truth target values (strictly positive).
- y_pred (Range, Required): Predicted target values (strictly positive).
- sample_weight (Range, Optional): Per-sample weights.
Returns: float
Example:
Cell E20: =ML.EVAL.REGRESSION.MEAN_GAMMA_DEVIANCE(A10:A451, C10:C451)
Result: 0.18
Notes:
- Both arrays must be strictly positive.
ML.EVAL.REGRESSION.MEAN_ABSOLUTE_PERCENTAGE_ERROR()
Mean absolute percentage error (MAPE). Sensitive to small y_true values.
Syntax:
=ML.EVAL.REGRESSION.MEAN_ABSOLUTE_PERCENTAGE_ERROR(y_true, y_pred, [sample_weight], [multioutput])
Parameters:
- y_true (Range, Required): Ground-truth target values.
- y_pred (Range, Required): Predicted target values.
- sample_weight (Range, Optional): Per-sample weights.
- multioutput (String, Default: "uniform_average"): Aggregation method.
Returns: float (or array)
Example:
Cell E21: =ML.EVAL.REGRESSION.MEAN_ABSOLUTE_PERCENTAGE_ERROR(A10:A451, C10:C451)
Result: 0.34
ML.EVAL.REGRESSION.D2_ABSOLUTE_ERROR_SCORE()
D² score, fraction of absolute error explained. Analogue of R² for MAE.
Syntax:
=ML.EVAL.REGRESSION.D2_ABSOLUTE_ERROR_SCORE(y_true, y_pred, [sample_weight], [multioutput])
Parameters:
- y_true (Range, Required): Ground-truth target values.
- y_pred (Range, Required): Predicted target values.
- sample_weight (Range, Optional): Per-sample weights.
- multioutput (String, Default: "uniform_average"): Aggregation method.
Returns: float (or array)
Example:
Cell E22: =ML.EVAL.REGRESSION.D2_ABSOLUTE_ERROR_SCORE(A10:A451, C10:C451)
Result: 0.46
ML.EVAL.REGRESSION.D2_PINBALL_SCORE()
D² score, fraction of pinball loss explained. Used for quantile regression evaluation.
Syntax:
=ML.EVAL.REGRESSION.D2_PINBALL_SCORE(y_true, y_pred, [sample_weight], [alpha], [multioutput])
Parameters:
- y_true (Range, Required): Ground-truth target values.
- y_pred (Range, Required): Predicted target values.
- sample_weight (Range, Optional): Per-sample weights.
- alpha (Float, Default: 0.5): Quantile level (0.5 = median).
- multioutput (String, Default: "uniform_average"): Aggregation method.
Returns: float (or array)
Example:
Cell E23: =ML.EVAL.REGRESSION.D2_PINBALL_SCORE(A10:A451, C10:C451)
Result: 0.46
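A sketch scoring the 90th percentile rather than the median (the empty argument skips sample_weight):
Cell F23: =ML.EVAL.REGRESSION.D2_PINBALL_SCORE(A10:A451, C10:C451, , 0.9)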
ML.EVAL.REGRESSION.D2_TWEEDIE_SCORE()
D² score, fraction of Tweedie deviance explained.
Syntax:
=ML.EVAL.REGRESSION.D2_TWEEDIE_SCORE(y_true, y_pred, [sample_weight], [power])
Parameters:
- y_true (Range, Required): Ground-truth target values.
- y_pred (Range, Required): Predicted target values.
- sample_weight (Range, Optional): Per-sample weights.
- power (Float, Default: 0): Tweedie power — 0=Normal, 1=Poisson, 2=Gamma, 3=Inverse Gaussian.
Returns: float
Example:
Cell E24: =ML.EVAL.REGRESSION.D2_TWEEDIE_SCORE(A10:A451, C10:C451)
Result: 0.52
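A sketch using Gamma deviance (power=2) instead of the default squared error; note that power values of 1 or higher carry positivity requirements on the inputs, as with the deviance losses above (the empty argument skips sample_weight):
Cell F24: =ML.EVAL.REGRESSION.D2_TWEEDIE_SCORE(A10:A451, C10:C451, , 2)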
Classification Metrics (ML.EVAL.CLASSIFICATION)
These metrics evaluate classifier predictions. Two input shapes are used depending on the metric:
- Label-based metrics take y_pred (predicted class labels from ML.PREDICT)
- Score-based metrics take y_score (probabilities from ML.PREDICT_PROBA or decision-function output)
Important: Passing 1-D integer labels to a score-based metric will raise an error. Score-based metrics require continuous probability scores, not class labels.
Common parameters
- y_true (Range, Required): Ground-truth class labels.
- y_pred or y_score (Range, Required): Predicted labels OR probability/score matrix (depending on metric — see each entry).
- sample_weight (Range, Optional): Per-sample weights.
- labels (Range, Optional): Subset/order of class labels.
- pos_label (Integer or String, Default: 1): Positive class label for binary metrics.
- average (String, Default: "binary" or "macro"): "binary", "micro", "macro", "weighted", "samples", or None (returns per-class array).
- zero_division (String or Float, Default: "warn"): Behavior on 0/0 division — "warn", 0, 1, or np.nan.
Shared example setup (used in label-based examples below):
Cell A1: =ML.DATASETS.IRIS()
Cell B1: =ML.DATA.SELECT_COLUMNS(A1, "0:3") <- X (features)
Cell C1: =ML.DATA.SELECT_COLUMNS(A1, 4) <- y (target)
Cell D1: =ML.CLASSIFICATION.LOGISTIC()
Cell E1: =ML.FIT(D1, B1, C1)
Cell F1: =ML.PREDICT(E1, B1)
Cell A10: =ML.DATA.SAMPLE(C1, -1) <- SPILLS y_true (150 rows: A10:A159)
Cell C10: =ML.DATA.SAMPLE(F1, -1) <- SPILLS y_pred (150 rows: C10:C159)
Label-based metrics
These metrics compare y_true to y_pred (class labels, typically from =ML.PREDICT(...)).
ML.EVAL.CLASSIFICATION.ACCURACY()
Accuracy classification score — fraction of correctly classified samples.
Syntax:
=ML.EVAL.CLASSIFICATION.ACCURACY(y_true, y_pred, [normalize], [sample_weight])
Parameters:
- y_true (Range, Required): Ground-truth class labels.
- y_pred (Range, Required): Predicted class labels.
- normalize (Boolean, Default: TRUE): If FALSE, returns raw count of correct predictions.
- sample_weight (Range, Optional): Per-sample weights.
Returns: float (or int if normalize=FALSE)
Example:
Cell E10: =ML.EVAL.CLASSIFICATION.ACCURACY(A10:A159, C10:C159)
Result: 0.97
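A sketch with normalize=FALSE, which returns the raw count of correct predictions instead of a fraction:
Cell F10: =ML.EVAL.CLASSIFICATION.ACCURACY(A10:A159, C10:C159, FALSE)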
ML.EVAL.CLASSIFICATION.BALANCED_ACCURACY()
Balanced accuracy — macro-average of recall per class. Useful for imbalanced datasets.
Syntax:
=ML.EVAL.CLASSIFICATION.BALANCED_ACCURACY(y_true, y_pred, [sample_weight], [adjusted])
Parameters:
- y_true (Range, Required): Ground-truth class labels.
- y_pred (Range, Required): Predicted class labels.
- sample_weight (Range, Optional): Per-sample weights.
- adjusted (Boolean, Default: FALSE): Adjust so that chance performance equals 0.
Returns: float
Example:
Cell E11: =ML.EVAL.CLASSIFICATION.BALANCED_ACCURACY(A10:A159, C10:C159)
Result: 0.97
ML.EVAL.CLASSIFICATION.F1()
F1 score — harmonic mean of precision and recall.
Syntax:
=ML.EVAL.CLASSIFICATION.F1(y_true, y_pred, [labels], [pos_label], [average], [sample_weight], [zero_division])
Parameters:
- y_true (Range, Required): Ground-truth class labels.
- y_pred (Range, Required): Predicted class labels.
- labels (Range, Optional): Subset/order of class labels.
- pos_label (Integer or String, Default: 1): Positive class for binary metrics.
- average (String, Default: "binary"): Aggregation method.
- sample_weight (Range, Optional): Per-sample weights.
- zero_division (String or Float, Default: "warn"): Behavior on 0/0.
Returns: float (or array if average=None)
Example:
Cell E12: =ML.EVAL.CLASSIFICATION.F1(A10:A159, C10:C159, , , "macro")
Result: 0.97
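A sketch with average="weighted", which weights each class's F1 by its support (useful when classes are imbalanced):
Cell F12: =ML.EVAL.CLASSIFICATION.F1(A10:A159, C10:C159, , , "weighted")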
ML.EVAL.CLASSIFICATION.PRECISION()
Precision score — TP / (TP + FP).
Syntax:
=ML.EVAL.CLASSIFICATION.PRECISION(y_true, y_pred, [labels], [pos_label], [average], [sample_weight], [zero_division])
Parameters:
- y_true (Range, Required): Ground-truth class labels.
- y_pred (Range, Required): Predicted class labels.
- labels (Range, Optional): Subset/order of class labels.
- pos_label (Integer or String, Default: 1): Positive class for binary metrics.
- average (String, Default: "binary"): Aggregation method.
- sample_weight (Range, Optional): Per-sample weights.
- zero_division (String or Float, Default: "warn"): Behavior on 0/0.
Returns: float (or array if average=None)
Example:
Cell E13: =ML.EVAL.CLASSIFICATION.PRECISION(A10:A159, C10:C159, , , "macro")
Result: 0.97
ML.EVAL.CLASSIFICATION.RECALL()
Recall score — TP / (TP + FN).
Syntax:
=ML.EVAL.CLASSIFICATION.RECALL(y_true, y_pred, [labels], [pos_label], [average], [sample_weight], [zero_division])
Parameters:
- y_true (Range, Required): Ground-truth class labels.
- y_pred (Range, Required): Predicted class labels.
- labels (Range, Optional): Subset/order of class labels.
- pos_label (Integer or String, Default: 1): Positive class for binary metrics.
- average (String, Default: "binary"): Aggregation method.
- sample_weight (Range, Optional): Per-sample weights.
- zero_division (String or Float, Default: "warn"): Behavior on 0/0.
Returns: float (or array if average=None)
Example:
Cell E14: =ML.EVAL.CLASSIFICATION.RECALL(A10:A159, C10:C159, , , "macro")
Result: 0.97
ML.EVAL.CLASSIFICATION.JACCARD()
Jaccard similarity coefficient — |intersection| / |union|.
Syntax:
=ML.EVAL.CLASSIFICATION.JACCARD(y_true, y_pred, [labels], [pos_label], [average], [sample_weight], [zero_division])
Parameters:
- y_true (Range, Required): Ground-truth class labels.
- y_pred (Range, Required): Predicted class labels.
- labels (Range, Optional): Subset/order of class labels.
- pos_label (Integer or String, Default: 1): Positive class for binary metrics.
- average (String, Default: "binary"): Aggregation method.
- sample_weight (Range, Optional): Per-sample weights.
- zero_division (String or Float, Default: "warn"): Behavior on 0/0.
Returns: float (or array if average=None)
Example:
Cell E15: =ML.EVAL.CLASSIFICATION.JACCARD(A10:A159, C10:C159, , , "macro")
Result: 0.94
ML.EVAL.CLASSIFICATION.MATTHEWS_CORRCOEF()
Matthews correlation coefficient. Range [-1, 1]; 0 = random. Robust to class imbalance.
Syntax:
=ML.EVAL.CLASSIFICATION.MATTHEWS_CORRCOEF(y_true, y_pred, [sample_weight])
Parameters:
- y_true (Range, Required): Ground-truth class labels.
- y_pred (Range, Required): Predicted class labels.
- sample_weight (Range, Optional): Per-sample weights.
Returns: float
Example:
Cell E16: =ML.EVAL.CLASSIFICATION.MATTHEWS_CORRCOEF(A10:A159, C10:C159)
Result: 0.96
Score-based metrics
These metrics require continuous probability scores, not class labels. Use =ML.PREDICT_PROBA(model, X) and spill the result into a 2-D range to use as y_score.
Warning: Passing integer class labels (from ML.PREDICT) to these functions will raise an error.
Shared example setup for score-based metrics:
Cell A1: =ML.DATASETS.IRIS()
Cell B1: =ML.DATA.SELECT_COLUMNS(A1, "0:3") <- X (features)
Cell C1: =ML.DATA.SELECT_COLUMNS(A1, 4) <- y (target)
Cell D1: =ML.CLASSIFICATION.LOGISTIC()
Cell E1: =ML.FIT(D1, B1, C1)
Cell G1: =ML.PREDICT_PROBA(E1, B1) <- Probabilities handle
Cell A10: =ML.DATA.SAMPLE(C1, -1) <- SPILLS y_true (150 rows: A10:A159)
Cell C10: =ML.DATA.SAMPLE(G1, -1) <- SPILLS probability scores (150×3, filling C10:E159)
ML.EVAL.CLASSIFICATION.TOP_K_ACCURACY()
Top-k accuracy — counts a prediction correct if the true label is among the top-k predicted classes. Requires y_score (probabilities or decision scores).
Syntax:
=ML.EVAL.CLASSIFICATION.TOP_K_ACCURACY(y_true, y_score, [k], [normalize], [sample_weight], [labels])
Parameters:
- y_true (Range, Required): Ground-truth class labels.
- y_score (Range, Required): Probability or decision score matrix (2-D).
- k (Integer, Default: 2): Number of top predictions to consider.
- normalize (Boolean, Default: TRUE): If FALSE, returns raw count.
- sample_weight (Range, Optional): Per-sample weights.
- labels (Range, Optional): Subset/order of class labels.
Returns: float (or int)
Example:
Cell E17: =ML.EVAL.CLASSIFICATION.TOP_K_ACCURACY(A10:A159, C10:E159, 2)
Result: 0.99
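With k=1, top-k accuracy reduces to ordinary accuracy, which makes a quick consistency check against ML.EVAL.CLASSIFICATION.ACCURACY:
Cell F17: =ML.EVAL.CLASSIFICATION.TOP_K_ACCURACY(A10:A159, C10:E159, 1)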
ML.EVAL.CLASSIFICATION.ROC_AUC()
Area under the ROC curve. Requires y_score (probabilities or decision function output).
Syntax:
=ML.EVAL.CLASSIFICATION.ROC_AUC(y_true, y_score, [average], [sample_weight], [max_fpr], [multi_class], [labels])
Parameters:
- y_true (Range, Required): Ground-truth class labels.
- y_score (Range, Required): Probability or decision score matrix.
- average (String, Default: "macro"): "micro", "macro", "weighted", "samples", or None.
- sample_weight (Range, Optional): Per-sample weights.
- max_fpr (Float, Optional): Compute standardized partial AUC over [0, max_fpr].
- multi_class (String, Default: "raise"): "raise", "ovr", or "ovo" (required for > 2 classes).
- labels (Range, Optional): Subset/order of class labels.
Returns: float
Example:
Cell E18: =ML.EVAL.CLASSIFICATION.ROC_AUC(A10:A159, C10:E159, "macro", , , "ovr")
Result: 0.999
Notes:
- Errors if only one class is present in y_true.
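For multiclass problems, "ovo" (one-vs-one) is the alternative to "ovr"; a sketch:
Cell F18: =ML.EVAL.CLASSIFICATION.ROC_AUC(A10:A159, C10:E159, "macro", , , "ovo")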
ML.EVAL.CLASSIFICATION.AVERAGE_PRECISION()
Average precision — area under the precision-recall curve. Requires y_score.
Syntax:
=ML.EVAL.CLASSIFICATION.AVERAGE_PRECISION(y_true, y_score, [average], [pos_label], [sample_weight])
Parameters:
- y_true (Range, Required): Ground-truth class labels.
- y_score (Range, Required): Probability or decision score matrix.
- average (String, Optional): Aggregation method.
- pos_label (Integer or String, Default: 1): Positive class for binary metrics.
- sample_weight (Range, Optional): Per-sample weights.
Returns: float
Example:
Cell E19: =ML.EVAL.CLASSIFICATION.AVERAGE_PRECISION(A10:A159, C10:E159)
Result: 0.98
ML.EVAL.CLASSIFICATION.BRIER_SCORE_LOSS()
Brier score — mean squared error between predicted probabilities and true labels. Lower is better. Requires y_score.
Syntax:
=ML.EVAL.CLASSIFICATION.BRIER_SCORE_LOSS(y_true, y_score, [sample_weight], [pos_label])
Parameters:
- y_true (Range, Required): Ground-truth class labels (binary).
- y_score (Range, Required): Predicted probabilities for the positive class.
- sample_weight (Range, Optional): Per-sample weights.
- pos_label (Integer or String, Optional): Positive class label.
Returns: float
Example:
Cell E20: =ML.EVAL.CLASSIFICATION.BRIER_SCORE_LOSS(A10:A159, C10:C159)
Result: 0.03
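Brier score compares a single positive-class probability column against a binary (two-class) ground truth, so the three-class iris spill does not fit it directly. A hedged sketch with an explicit pos_label, assuming a binary target in A10:A159 and the matching positive-class probabilities in C10:C159 (both assumptions, not produced by the multiclass setup above):
Cell F20: =ML.EVAL.CLASSIFICATION.BRIER_SCORE_LOSS(A10:A159, C10:C159, , 1)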
ML.EVAL.CLASSIFICATION.LOG_LOSS()
Cross-entropy / log loss. Requires y_score (probabilities). Lower is better.
Syntax:
=ML.EVAL.CLASSIFICATION.LOG_LOSS(y_true, y_score, [normalize], [sample_weight], [labels])
Parameters:
- y_true (Range, Required): Ground-truth class labels.
- y_score (Range, Required): Predicted probabilities (2-D for multi-class).
- normalize (Boolean, Default: TRUE): If FALSE, returns sum of per-sample losses.
- sample_weight (Range, Optional): Per-sample weights.
- labels (Range, Optional): Subset/order of class labels.
Returns: float
Example:
Cell E21: =ML.EVAL.CLASSIFICATION.LOG_LOSS(A10:A159, C10:E159)
Result: 0.07
ML.EVAL.CLASSIFICATION.D2_LOG_LOSS_SCORE()
D² score, fraction of log loss explained. Analogue of R² for log loss. Requires y_score.
Syntax:
=ML.EVAL.CLASSIFICATION.D2_LOG_LOSS_SCORE(y_true, y_score, [sample_weight], [labels])
Parameters:
- y_true (Range, Required): Ground-truth class labels.
- y_score (Range, Required): Predicted probabilities (2-D for multi-class).
- sample_weight (Range, Optional): Per-sample weights.
- labels (Range, Optional): Subset/order of class labels.
Returns: float
Example:
Cell E22: =ML.EVAL.CLASSIFICATION.D2_LOG_LOSS_SCORE(A10:A159, C10:E159)
Result: 0.91
Clustering Metrics (ML.EVAL.CLUSTERING)
These metrics compare two label arrays: labels_true (ground-truth cluster assignments, when available) and labels_pred (predicted cluster assignments, e.g., from ML.PREDICT on a fitted KMeans). All functions take 1-D arrays of integer labels and return a single float.
Common parameters
- labels_true (Range, Required): Ground-truth cluster assignments (1-D integer array).
- labels_pred (Range, Required): Predicted cluster assignments (1-D integer array).
Shared example setup (used in examples below):
Cell A1: =ML.DATASETS.IRIS()
Cell B1: =ML.DATA.SELECT_COLUMNS(A1, "0:3") <- X (features)
Cell C1: =ML.DATA.SELECT_COLUMNS(A1, 4) <- y_true (ground-truth labels)
Cell D1: =ML.CLUSTERING.KMEANS(3)
Cell E1: =ML.FIT(D1, B1)
Cell F1: =ML.PREDICT(E1, B1) <- Cluster predictions handle
Cell A10: =ML.DATA.SAMPLE(C1, -1) <- SPILLS labels_true (150 rows: A10:A159)
Cell C10: =ML.DATA.SAMPLE(F1, -1) <- SPILLS labels_pred (150 rows: C10:C159)
ML.EVAL.CLUSTERING.ADJUSTED_RAND_SCORE()
Adjusted Rand index. Range [-0.5, 1.0]; 0 = random, 1 = perfect match. Adjusted for chance.
Syntax:
=ML.EVAL.CLUSTERING.ADJUSTED_RAND_SCORE(labels_true, labels_pred)
Parameters:
- labels_true (Range, Required): Ground-truth cluster assignments.
- labels_pred (Range, Required): Predicted cluster assignments.
Returns: float
Example:
Using the setup above (KMeans on Iris, comparing predicted clusters to ground-truth species):
Cell E10: =ML.EVAL.CLUSTERING.ADJUSTED_RAND_SCORE(A10:A159, C10:C159)
Result: 0.73
ML.EVAL.CLUSTERING.RAND_SCORE()
Rand index — similarity measure between clusterings. Range [0, 1]. Not adjusted for chance.
Syntax:
=ML.EVAL.CLUSTERING.RAND_SCORE(labels_true, labels_pred)
Parameters:
- labels_true (Range, Required): Ground-truth cluster assignments.
- labels_pred (Range, Required): Predicted cluster assignments.
Returns: float
Example:
Cell E11: =ML.EVAL.CLUSTERING.RAND_SCORE(A10:A159, C10:C159)
Result: 0.88
ML.EVAL.CLUSTERING.MUTUAL_INFO_SCORE()
Mutual information between two clusterings.
Syntax:
=ML.EVAL.CLUSTERING.MUTUAL_INFO_SCORE(labels_true, labels_pred)
Parameters:
- labels_true (Range, Required): Ground-truth cluster assignments.
- labels_pred (Range, Required): Predicted cluster assignments.
Returns: float
Example:
Cell E12: =ML.EVAL.CLUSTERING.MUTUAL_INFO_SCORE(A10:A159, C10:C159)
Result: 0.87
ML.EVAL.CLUSTERING.ADJUSTED_MUTUAL_INFO_SCORE()
Adjusted mutual information. Range [0, 1]; adjusted for chance.
Syntax:
=ML.EVAL.CLUSTERING.ADJUSTED_MUTUAL_INFO_SCORE(labels_true, labels_pred, [average_method])
Parameters:
- labels_true (Range, Required): Ground-truth cluster assignments.
- labels_pred (Range, Required): Predicted cluster assignments.
- average_method (String, Default: "arithmetic"): "arithmetic", "geometric", "min", or "max".
Returns: float
Example:
Cell E13: =ML.EVAL.CLUSTERING.ADJUSTED_MUTUAL_INFO_SCORE(A10:A159, C10:C159)
Result: 0.74
ML.EVAL.CLUSTERING.NORMALIZED_MUTUAL_INFO_SCORE()
Mutual information normalized by the average of the entropies. Range [0, 1].
Syntax:
=ML.EVAL.CLUSTERING.NORMALIZED_MUTUAL_INFO_SCORE(labels_true, labels_pred, [average_method])
Parameters:
- labels_true (Range, Required): Ground-truth cluster assignments.
- labels_pred (Range, Required): Predicted cluster assignments.
- average_method (String, Default: "arithmetic"): "arithmetic", "geometric", "min", or "max".
Returns: float
Example:
Cell E14: =ML.EVAL.CLUSTERING.NORMALIZED_MUTUAL_INFO_SCORE(A10:A159, C10:C159)
Result: 0.76
ML.EVAL.CLUSTERING.HOMOGENEITY_SCORE()
Homogeneity — each cluster contains only members of a single class. Range [0, 1].
Syntax:
=ML.EVAL.CLUSTERING.HOMOGENEITY_SCORE(labels_true, labels_pred)
Parameters:
- labels_true (Range, Required): Ground-truth cluster assignments.
- labels_pred (Range, Required): Predicted cluster assignments.
Returns: float
Example:
Cell E15: =ML.EVAL.CLUSTERING.HOMOGENEITY_SCORE(A10:A159, C10:C159)
Result: 0.75
ML.EVAL.CLUSTERING.COMPLETENESS_SCORE()
Completeness — all members of a given class are assigned to the same cluster. Range [0, 1].
Syntax:
=ML.EVAL.CLUSTERING.COMPLETENESS_SCORE(labels_true, labels_pred)
Parameters:
- labels_true (Range, Required): Ground-truth cluster assignments.
- labels_pred (Range, Required): Predicted cluster assignments.
Returns: float
Example:
Cell E16: =ML.EVAL.CLUSTERING.COMPLETENESS_SCORE(A10:A159, C10:C159)
Result: 0.76
ML.EVAL.CLUSTERING.V_MEASURE_SCORE()
V-measure — harmonic mean of homogeneity and completeness. Range [0, 1].
Syntax:
=ML.EVAL.CLUSTERING.V_MEASURE_SCORE(labels_true, labels_pred, [beta])
Parameters:
- labels_true (Range, Required): Ground-truth cluster assignments.
- labels_pred (Range, Required): Predicted cluster assignments.
- beta (Float, Default: 1.0): Weight of homogeneity vs completeness (> 1 weights completeness more).
Returns: float
Example:
Cell E17: =ML.EVAL.CLUSTERING.V_MEASURE_SCORE(A10:A159, C10:C159)
Result: 0.76
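A sketch weighting completeness more heavily than homogeneity (beta > 1 in the third slot):
Cell F17: =ML.EVAL.CLUSTERING.V_MEASURE_SCORE(A10:A159, C10:C159, 2)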
ML.EVAL.CLUSTERING.FOWLKES_MALLOWS_SCORE()
Fowlkes-Mallows index — geometric mean of pairwise precision and recall. Range [0, 1].
Syntax:
=ML.EVAL.CLUSTERING.FOWLKES_MALLOWS_SCORE(labels_true, labels_pred, [sparse])
Parameters:
- labels_true (Range, Required): Ground-truth cluster assignments.
- labels_pred (Range, Required): Predicted cluster assignments.
- sparse (Boolean, Default: FALSE): Use sparse matrix internally for large label arrays.
Returns: float
Example:
Cell E18: =ML.EVAL.CLUSTERING.FOWLKES_MALLOWS_SCORE(A10:A159, C10:C159)
Result: 0.82
Navigation
- Back to Function Reference
- Evaluation Functions — model-level scoring, cross-validation, grid search
- Data Functions — loading and manipulating data
- Model Methods — ML.FIT, ML.PREDICT, ML.TRANSFORM
- Documentation Home