Data Functions Reference
Complete reference for FormulaML data handling functions including datasets, data manipulation, and exploration.
Functions for loading, exploring, and manipulating data.
ML.DATASETS.* - Built-in datasets (Iris, Diabetes, Digits, OpenML)
ML.DATA.* - Data manipulation and exploration

Functions for predicting continuous values.
ML.REGRESSION.LINEAR - Linear Regression
ML.REGRESSION.RIDGE - Ridge Regression (L2 regularization)
ML.REGRESSION.LASSO - Lasso Regression (L1 regularization)
ML.REGRESSION.ELASTIC_NET - Elastic Net (L1 + L2)
ML.REGRESSION.RANDOM_FOREST_REG - Random Forest Regression ⭐

Functions for categorizing data.
ML.CLASSIFICATION.LOGISTIC - Logistic Regression
ML.CLASSIFICATION.SVM - Support Vector Machines
ML.CLASSIFICATION.RANDOM_FOREST_CLF - Random Forest Classifier ⭐

Functions for finding groups in data.
ML.CLUSTERING.KMEANS - K-Means clustering with advanced parameters

Functions for preparing data.
ML.PREPROCESSING.TRAIN_TEST_SPLIT - Split train/test sets
ML.PREPROCESSING.STANDARD_SCALER - Standardize features
ML.PREPROCESSING.MIN_MAX_SCALER - Scale to range [0,1]
ML.PREPROCESSING.ROBUST_SCALER - Scale robust to outliers
ML.PREPROCESSING.ONE_HOT_ENCODER - One-hot encode categories
ML.PREPROCESSING.ORDINAL_ENCODER - Ordinal encode categories

Functions for assessing model performance.
ML.EVAL.SCORE - R² or accuracy score
ML.EVAL.CV_SCORE - Cross-validation ⭐
ML.EVAL.GRID_SEARCH - Hyperparameter tuning ⭐
ML.EVAL.BEST_PARAMS - Extract best parameters ⭐
ML.EVAL.BEST_SCORE - Get best CV score ⭐
ML.EVAL.SEARCH_RESULTS - Detailed search results ⭐

Essential functions for model training and prediction.
ML.FIT - Train models and transformers
ML.PREDICT - Make predictions
ML.TRANSFORM - Transform data
ML.FIT_TRANSFORM - Fit and transform in one step
ML.PIPELINE - Create ML workflows
ML.OBJECT_INFO - Inspect object details

Functions for reducing feature dimensions.
ML.DIM_REDUCTION.PCA - Principal Component Analysis
ML.DIM_REDUCTION.PCA.RESULTS - PCA detailed results
ML.DIM_REDUCTION.KERNEL_PCA - Non-linear PCA ⭐

Functions for model analysis and visualization.
ML.INSPECT.GET_PARAMS - Extract model parameters
ML.INSPECT.DECISION_BOUNDARY - Visualize decision boundaries

Functions for advanced column transformations.
ML.COMPOSE.COLUMN_TRANSFORMER - Apply transformer to columns
ML.COMPOSE.DATA_TRANSFORMER - Combine transformers
ML.COMPOSE.COLUMN_SELECTOR - Select columns by pattern/type
ML.COMPOSE.TRANSFORMERS.DROP - Drop columns
ML.COMPOSE.TRANSFORMERS.PASSTHROUGH - Pass columns unchanged

Functions for handling missing values and selecting features.
ML.IMPUTE.SIMPLE_IMPUTER - Impute missing values
ML.FEATURE_SELECTION.SELECT_PERCENTILE - Select top features

All FormulaML functions follow a consistent naming pattern:
ML.[NAMESPACE].[FUNCTION_NAME](parameters)
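The pattern is regular enough to check mechanically. The following Python sketch is illustrative only and is not part of FormulaML: the helper `split_function_name` and the regular expression are assumptions. It treats everything between the leading `ML` and the last segment as the namespace, which leaves the namespace empty for top-level functions such as ML.FIT and lets nested names such as ML.DIM_REDUCTION.PCA.RESULTS pass.

```python
import re
from typing import Tuple

# Assumed shape of a name: "ML" followed by one or more upper-case segments.
# Wildcards such as ML.DATASETS.* are deliberately not handled.
NAME_RE = re.compile(r"^ML(\.[A-Z][A-Z0-9_]*)+$")


def split_function_name(name: str) -> Tuple[str, str]:
    """Split a function name into (namespace, function_name).

    The namespace is empty for top-level functions such as ML.FIT.
    """
    if not NAME_RE.match(name):
        raise ValueError(f"not a FormulaML-style name: {name!r}")
    parts = name.split(".")
    # parts[0] is "ML", parts[-1] is the function, the rest is the namespace.
    return ".".join(parts[1:-1]), parts[-1]


if __name__ == "__main__":
    for name in ("ML.DATASETS.IRIS", "ML.REGRESSION.LINEAR", "ML.FIT"):
        print(name, "->", split_function_name(name))
```

Parentheses and parameters are left out of the check on purpose, since the parameter lists differ per function.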
Examples:
ML.DATASETS.IRIS() - Load Iris dataset
ML.REGRESSION.LINEAR() - Create linear regression model
ML.EVAL.SCORE() - Evaluate model performance

Most core functionality is available in the free tier.
Advanced capabilities require a premium subscription:
ML.EVAL.CV_SCORE
ML.EVAL.GRID_SEARCH
ML.DIM_REDUCTION.KERNEL_PCA
ML.DATASETS.OPENML

Premium functions are marked with a ⭐ icon in the documentation.
Many functions return or accept “object handles” - references to complex data structures:
Cell A1: =ML.DATASETS.IRIS() → Returns: <Dataset>
Cell A2: =ML.REGRESSION.LINEAR() → Returns: <LinearRegression>
Cell A3: =ML.FIT(A2, features, target) → Returns: <LinearRegression> (with 🧠 brain icon)
These handles allow Excel to manage complex ML objects efficiently.
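One way to picture a handle is as a short string key into a store of live objects. The Python sketch below is a conceptual illustration under that assumption, not FormulaML's actual implementation: `HandleRegistry`, the numeric suffix in the handle text, and the placeholder `LinearRegression` class are all hypothetical.

```python
import itertools


class HandleRegistry:
    """Hypothetical store that maps short handle strings to Python objects."""

    def __init__(self):
        self._objects = {}
        self._counter = itertools.count(1)

    def register(self, obj) -> str:
        # The handle shows the type name, like <LinearRegression> above.
        # The "#n" suffix is an assumption added here to keep handles unique.
        handle = f"<{type(obj).__name__}#{next(self._counter)}>"
        self._objects[handle] = obj
        return handle

    def resolve(self, handle: str):
        # Look the object up again when another formula passes the handle in.
        try:
            return self._objects[handle]
        except KeyError:
            raise LookupError("Object handle not found") from None


class LinearRegression:
    """Placeholder class for the example, not a real model."""


registry = HandleRegistry()
handle = registry.register(LinearRegression())  # what a cell would display
model = registry.resolve(handle)                # what a later call would receive
```

Under this picture the cell holds only the short string, while the object itself stays in memory on the add-in side.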
random_state - 42 (any integer works)
fit_intercept
alpha
n_estimators
max_iter
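The effect of a fixed random_state can be shown with a small stand-in. This Python sketch uses only the standard library and is not the implementation behind ML.PREPROCESSING.TRAIN_TEST_SPLIT. The local function `train_test_split` and its `test_size` default are assumptions made for illustration.

```python
import random


def train_test_split(rows, test_size=0.25, random_state=None):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    rng = random.Random(random_state)  # a fixed seed gives a repeatable shuffle
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(20))
first = train_test_split(rows, random_state=42)
second = train_test_split(rows, random_state=42)
# first and second are identical because both calls used the same seed.
```

With random_state left unset, each call would shuffle differently, so scores computed on the resulting splits would not be comparable between runs.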
Functions return different types of values:
Object Handles: Complex objects (models, dataframes), e.g. <SVC>
Numeric Values: Single numbers, e.g. 0.95 (accuracy score)
Arrays: Multiple values
DataFrames: Tabular data
Load Data:
ML.DATASETS.IRIS() - Classification dataset
ML.DATASETS.DIABETES() - Regression dataset
ML.DATA.CONVERT_TO_DF() - Excel to DataFrame

Explore Data:
ML.DATA.INFO() - Data structure
ML.DATA.DESCRIBE() - Statistics
ML.DATA.SAMPLE() - View rows

Prepare Data:
ML.DATA.SELECT_COLUMNS() - Choose columns
ML.PREPROCESSING.TRAIN_TEST_SPLIT() - Split data
ML.PREPROCESSING.STANDARD_SCALER() - Scale features

Train Models:
ML.FIT() - Train any model
ML.PREDICT() - Make predictions
ML.TRANSFORM() - Transform data

Evaluate:
ML.EVAL.SCORE() - Basic scoring
ML.EVAL.CV_SCORE() - Cross-validation ⭐
ML.EVAL.GRID_SEARCH() - Hyperparameter tuning ⭐

Common error messages and their meanings:
“Object handle not found”
“Invalid parameter value”
“Dimension mismatch”
“Premium function”
Always use consistent data shapes
Set random_state for reproducibility
Check data types
Handle missing values
Start with simple models
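The first and fourth points above (consistent shapes, missing values) can be checked before any model is trained. The Python sketch below is a minimal illustration and not part of FormulaML. It assumes the features are a list of numeric rows and the target is a list with one entry per row. The helper `check_training_data` is hypothetical.

```python
import math


def check_training_data(features, target):
    """Return a list of problems found in features (rows of numbers) and target."""
    problems = []
    # Every feature row needs exactly one target value.
    if len(features) != len(target):
        problems.append(
            f"Dimension mismatch: {len(features)} feature rows vs {len(target)} targets"
        )
    # All rows should have the same number of columns.
    widths = {len(row) for row in features}
    if len(widths) > 1:
        problems.append(f"Inconsistent row widths: {sorted(widths)}")
    # Report each row that contains None or NaN once.
    for i, row in enumerate(features):
        for value in row:
            if value is None or (isinstance(value, float) and math.isnan(value)):
                problems.append(f"Missing value in row {i}")
                break
    return problems


clean = check_training_data([[1.0, 2.0], [3.0, 4.0]], [0, 1])
broken = check_training_data([[1.0, 2.0], [3.0, None]], [0])
```

Here `clean` is an empty list, while `broken` reports both the row-count mismatch and the missing value in the second row.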
Browse functions by category:
Complete reference for FormulaML data handling functions including datasets, data manipulation, and exploration.
Complete reference for FormulaML regression models including Linear, Ridge, Lasso, Elastic Net, and Random Forest regression.
Complete reference for FormulaML classification models including Logistic Regression, SVM, and Random Forest classification.
Complete reference for FormulaML clustering models including K-Means for unsupervised learning.
Complete reference for FormulaML preprocessing functions including scaling, encoding, and data splitting.
Complete reference for FormulaML model methods including fit, predict, transform, and pipeline creation.
Complete reference for FormulaML evaluation functions including scoring, cross-validation, and hyperparameter tuning.
Complete reference for FormulaML dimensionality reduction functions including PCA and Kernel PCA.
Complete reference for FormulaML inspection tools including parameter extraction and decision boundary visualization.
Complete reference for FormulaML compose functions for building advanced transformation pipelines.