Impute & Feature Selection Reference
Complete reference for FormulaML imputation and feature selection functions.
Functions for loading, exploring, and manipulating data.
ML.DATASETS.* - Built-in datasets (Iris, Diabetes, Digits, OpenML)
ML.DATA.* - Data manipulation and exploration

Functions for predicting continuous values.
ML.REGRESSION.LINEAR - Linear Regression
ML.REGRESSION.RIDGE - Ridge Regression (L2 regularization)
ML.REGRESSION.LASSO - Lasso Regression (L1 regularization)
ML.REGRESSION.ELASTIC_NET - Elastic Net (L1 + L2)
ML.REGRESSION.RANDOM_FOREST_REG - Random Forest Regression ⭐

Functions for categorizing data.
ML.CLASSIFICATION.LOGISTIC - Logistic Regression
ML.CLASSIFICATION.SVM - Support Vector Machines
ML.CLASSIFICATION.RANDOM_FOREST_CLF - Random Forest Classifier ⭐

Functions for finding groups in data.
ML.CLUSTERING.KMEANS - K-Means clustering with advanced parameters

Functions for preparing data.
ML.PREPROCESSING.TRAIN_TEST_SPLIT - Split train/test sets
ML.PREPROCESSING.STANDARD_SCALER - Standardize features
ML.PREPROCESSING.MIN_MAX_SCALER - Scale to range [0,1]
ML.PREPROCESSING.ROBUST_SCALER - Scale robust to outliers
ML.PREPROCESSING.ONE_HOT_ENCODER - One-hot encode categories
ML.PREPROCESSING.ORDINAL_ENCODER - Ordinal encode categories

Functions for assessing model performance.
ML.EVAL.SCORE - R² or accuracy score
ML.EVAL.CV_SCORE - Cross-validation ⭐
ML.EVAL.GRID_SEARCH - Hyperparameter tuning ⭐
ML.EVAL.BEST_PARAMS - Extract best parameters ⭐
ML.EVAL.BEST_SCORE - Get best CV score ⭐
ML.EVAL.SEARCH_RESULTS - Detailed search results ⭐

Essential functions for model training and prediction.
ML.FIT - Train models and transformers
ML.PREDICT - Make predictions
ML.TRANSFORM - Transform data
ML.FIT_TRANSFORM - Fit and transform in one step
ML.PIPELINE - Create ML workflows
ML.OBJECT_INFO - Inspect object details

Functions for reducing feature dimensions.
ML.DIM_REDUCTION.PCA - Principal Component Analysis
ML.DIM_REDUCTION.PCA.RESULTS - PCA detailed results
ML.DIM_REDUCTION.KERNEL_PCA - Non-linear PCA ⭐

Functions for model analysis and visualization.
ML.INSPECT.GET_PARAMS - Extract model parameters
ML.INSPECT.DECISION_BOUNDARY - Visualize decision boundaries

Functions for advanced column transformations.
ML.COMPOSE.COLUMN_TRANSFORMER - Apply transformer to columns
ML.COMPOSE.DATA_TRANSFORMER - Combine transformers
ML.COMPOSE.COLUMN_SELECTOR - Select columns by pattern/type
ML.COMPOSE.TRANSFORMERS.DROP - Drop columns
ML.COMPOSE.TRANSFORMERS.PASSTHROUGH - Pass columns unchanged

Functions for handling missing values and selecting features.
ML.IMPUTE.SIMPLE_IMPUTER - Impute missing values
ML.FEATURE_SELECTION.SELECT_PERCENTILE - Select top features

All FormulaML functions follow a consistent naming pattern:
ML.[NAMESPACE].[FUNCTION_NAME](parameters)
Examples:
ML.DATASETS.IRIS() - Load Iris dataset
ML.REGRESSION.LINEAR() - Create linear regression model
ML.EVAL.SCORE() - Evaluate model performance

Most core functionality is available in the free tier.
Advanced capabilities require a premium subscription:
Cross-validation (ML.EVAL.CV_SCORE)
Hyperparameter tuning (ML.EVAL.GRID_SEARCH)
Non-linear PCA (ML.DIM_REDUCTION.KERNEL_PCA)
OpenML datasets (ML.DATASETS.OPENML)

Premium functions are marked with a ⭐ icon in the documentation.
Many functions return or accept “object handles” - references to complex data structures:
Cell A1: =ML.DATASETS.IRIS() → Returns: <Dataset>
Cell A2: =ML.REGRESSION.LINEAR() → Returns: <LinearRegression>
Cell A3: =ML.FIT(A2, features, target) → Returns: <LinearRegression> (with 🧠 brain icon)
These handles allow Excel to manage complex ML objects efficiently.
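The handle idea can be illustrated outside Excel with a short Python sketch. This is not FormulaML's implementation; `HandleRegistry`, `LinearModel`, and the `<Name#n>` handle format are hypothetical names chosen only to show how a short text value in a cell could stand in for a larger object held elsewhere.

```python
import itertools


class HandleRegistry:
    """Hypothetical store that maps short text handles to Python objects."""

    def __init__(self):
        self._objects = {}
        self._ids = itertools.count(1)

    def register(self, obj):
        # Build a handle such as "<LinearModel#1>" from the type name and a counter.
        handle = f"<{type(obj).__name__}#{next(self._ids)}>"
        self._objects[handle] = obj
        return handle

    def resolve(self, handle):
        # An unknown handle is reported as an error instead of returning None.
        if handle not in self._objects:
            raise KeyError(f"Object handle not found: {handle}")
        return self._objects[handle]


class LinearModel:
    """Stand-in for a model object; not a real FormulaML or library class."""

    def __init__(self):
        self.fitted = False


registry = HandleRegistry()
cell_a2 = registry.register(LinearModel())  # the text a cell would display
model = registry.resolve(cell_a2)           # what a later formula would look up
model.fitted = True                         # state changes live on the stored object
```

In this sketch the cell only ever holds the handle text, so passing `cell_a2` to another function is cheap regardless of how large the underlying object is.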
random_state - 42 (any integer works)
fit_intercept
alpha
n_estimators
max_iter
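Why a fixed `random_state` matters can be shown with a small Python sketch. The helper `train_test_split_rows` below is hypothetical and uses only the standard library; it is meant to illustrate the general idea of a seeded shuffle, not how FormulaML performs its split.

```python
import random


def train_test_split_rows(rows, test_size=0.25, random_state=None):
    """Shuffle a copy of rows and split it (hypothetical helper, not a FormulaML function)."""
    shuffled = list(rows)
    # A private Random instance keeps the seed from affecting other code.
    random.Random(random_state).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_size))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(20))
# The same seed produces the same shuffle, so both calls return identical splits.
train_a, test_a = train_test_split_rows(rows, random_state=42)
train_b, test_b = train_test_split_rows(rows, random_state=42)
```

Any integer seed behaves the same way; what matters is reusing the same value between runs.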
Functions return different types of values:
Object Handles: Complex objects (models, dataframes), e.g. <SVC>
Numeric Values: Single numbers, e.g. 0.95 (accuracy score)
Arrays: Multiple values
DataFrames: Tabular data
Load Data:
ML.DATASETS.IRIS() - Classification dataset
ML.DATASETS.DIABETES() - Regression dataset
ML.DATA.CONVERT_TO_DF() - Excel to DataFrame

Explore Data:
ML.DATA.INFO() - Data structure
ML.DATA.DESCRIBE() - Statistics
ML.DATA.SAMPLE() - View rows

Prepare Data:
ML.DATA.SELECT_COLUMNS() - Choose columns
ML.PREPROCESSING.TRAIN_TEST_SPLIT() - Split data
ML.PREPROCESSING.STANDARD_SCALER() - Scale features

Train Models:
ML.FIT() - Train any model
ML.PREDICT() - Make predictions
ML.TRANSFORM() - Transform data

Evaluate:
ML.EVAL.SCORE() - Basic scoring
ML.EVAL.CV_SCORE() - Cross-validation ⭐
ML.EVAL.GRID_SEARCH() - Hyperparameter tuning ⭐

Common error messages and their meanings:
“Object handle not found”
“Invalid parameter value”
“Dimension mismatch”
“Premium function”
Always use consistent data shapes
Set random_state for reproducibility
Check data types
Handle missing values
Start with simple models
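As a rough illustration of the "handle missing values" advice, the Python sketch below fills each missing entry with the mean of its column. It is a conceptual example only: `impute_column_means` is a hypothetical helper, and it is an assumption that mean replacement resembles one strategy of ML.IMPUTE.SIMPLE_IMPUTER, whose parameters are not described here.

```python
from statistics import mean


def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in the same column."""
    columns = list(zip(*rows))
    # Column means are computed from observed (non-missing) values only.
    # A column with no observed values at all would raise StatisticsError.
    means = [mean(v for v in col if v is not None) for col in columns]
    return [
        [means[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


data = [
    [1.0, 10.0],
    [None, 20.0],
    [3.0, None],
]
filled = impute_column_means(data)  # the original rows are left unchanged
```

Computing the means from the training rows only, and reusing them for new data, is the usual way to avoid leaking information from a test set into the model.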