About MLExplain
An interactive machine learning model explainer
What is MLExplain?
MLExplain is an educational and practical tool for exploring how different machine learning algorithms perform on classic datasets. It provides an intuitive web interface for training models, inspecting feature importance, studying confusion matrices, and comparing algorithms side by side.
Built with Flask and scikit-learn, MLExplain runs entirely locally and requires no external AI APIs. Every experiment is stored in a SQLite database, which keeps a history of past runs and supports reproducibility.
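As a rough illustration of how experiment history could be kept in SQLite, the sketch below stores one row per training run. The table layout, column names, and the `save_experiment` and `load_experiment` helpers are assumptions made for this example, not MLExplain's actual schema.

```python
import json
import sqlite3

# Hypothetical schema: MLExplain's real table layout is not documented here.
SCHEMA = """
CREATE TABLE IF NOT EXISTS experiments (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    dataset TEXT NOT NULL,
    algorithm TEXT NOT NULL,
    params TEXT NOT NULL,
    accuracy REAL NOT NULL,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
)
"""


def save_experiment(conn, dataset, algorithm, params, accuracy):
    """Insert one experiment row and return its id."""
    cur = conn.execute(
        "INSERT INTO experiments (dataset, algorithm, params, accuracy) "
        "VALUES (?, ?, ?, ?)",
        # Hyperparameters are serialised as JSON so the run can be repeated.
        (dataset, algorithm, json.dumps(params, sort_keys=True), accuracy),
    )
    conn.commit()
    return cur.lastrowid


def load_experiment(conn, experiment_id):
    """Return the stored experiment as a dict, or None if the id is unknown."""
    row = conn.execute(
        "SELECT dataset, algorithm, params, accuracy "
        "FROM experiments WHERE id = ?",
        (experiment_id,),
    ).fetchone()
    if row is None:
        return None
    dataset, algorithm, params, accuracy = row
    return {
        "dataset": dataset,
        "algorithm": algorithm,
        "params": json.loads(params),
        "accuracy": accuracy,
    }


# In-memory database for the demo; a real deployment would use a file path.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
exp_id = save_experiment(
    conn, "iris", "decision_tree", {"max_depth": 3, "random_state": 42}, 0.95
)
print(load_experiment(conn, exp_id))
```

Storing the hyperparameters, including any random seed, alongside the score is what makes a stored run repeatable rather than just a logged number.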
Supported Algorithms
Decision Tree
Interpretable tree-based classifier that splits data based on feature thresholds. Provides direct feature importance via Gini impurity reduction.
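To make "Gini impurity reduction" concrete, here is a minimal pure-Python sketch that scores a single threshold split. The six measurements and labels are made-up toy values, and the function names are illustrative only.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def impurity_reduction(values, labels, threshold):
    """Parent impurity minus the size-weighted impurity of the two children."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted


# Toy data (not taken from any real dataset): one feature, two classes.
feature = [1.4, 1.3, 1.5, 4.7, 4.5, 4.9]
labels = ["a", "a", "a", "b", "b", "b"]

# The parent node has impurity 0.5; a threshold of 2.5 separates the classes
# perfectly, so both children have impurity 0 and the reduction is 0.5.
print(impurity_reduction(feature, labels, threshold=2.5))
```

A tree learner evaluates many candidate thresholds per feature and keeps the split with the largest reduction; a feature's importance accumulates the reductions from every split that uses it.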
Random Forest
Ensemble of decision trees trained with bagging (each tree is fit on a bootstrap sample of the training data). Reduces overfitting relative to a single tree and yields feature importances averaged across all trees.
Support Vector Machine
Kernel-based classifier that finds the maximum-margin hyperplane separating the classes. Effective in high-dimensional spaces.
K-Nearest Neighbours
Instance-based learner that classifies each sample by a majority vote among its k nearest neighbours in feature space.
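The majority-vote rule can be sketched in a few lines of plain Python. This is an illustration of the idea on made-up 2-D points using Euclidean distance, not the scikit-learn implementation MLExplain relies on.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair every training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data: two well-separated clusters.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
train_y = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_X, train_y, (1.1, 1.0)))
print(knn_predict(train_X, train_y, (4.0, 4.0)))
```

Because the vote depends on raw distances, features on larger scales dominate unless the data is standardised first. An odd k avoids tied votes in two-class problems.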
Logistic Regression
Linear classifier that models the log-odds of a class as a weighted sum of the features. Fast to train and outputs class probabilities, which can be read as confidence scores.
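The link between log-odds and probabilities is the logistic (sigmoid) function. The sketch below shows the binary case with hand-picked weights; the weights and feature values are invented for illustration and do not come from a trained model.

```python
import math


def predict_proba(weights, bias, features):
    """Probability of the positive class for one sample (binary case)."""
    # The linear part of the model is the log-odds of the positive class.
    log_odds = bias + sum(w * x for w, x in zip(weights, features))
    # The sigmoid maps log-odds in (-inf, inf) to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical weights: log-odds = 0.5 + 2.0 * 1.0 - 1.0 * 0.5 = 2.0
p = predict_proba(weights=[2.0, -1.0], bias=0.5, features=[1.0, 0.5])
print(round(p, 4))
```

A log-odds of 0 corresponds to a probability of exactly 0.5, which is the usual decision boundary.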
Built-in Datasets
Iris
150 samples · 4 features · 3 classes (setosa, versicolor, virginica)
Wine
178 samples · 13 features · 3 cultivar classes
Breast Cancer
569 samples · 30 features · 2 classes (malignant, benign)
Digits
1,797 samples · 64 features · 10 classes (0-9)
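The sample, feature, and class counts above match the copies of these datasets that ship with scikit-learn. Assuming MLExplain loads them through `sklearn.datasets` (which the source does not state explicitly), the figures can be checked like this:

```python
from sklearn.datasets import load_breast_cancer, load_digits, load_iris, load_wine

# Dictionary keys are illustrative; MLExplain's own dataset names may differ.
loaders = {
    "iris": load_iris,
    "wine": load_wine,
    "breast_cancer": load_breast_cancer,
    "digits": load_digits,
}

shapes = {}
for name, loader in loaders.items():
    data = loader()
    n_samples, n_features = data.data.shape
    n_classes = len(data.target_names)
    shapes[name] = (n_samples, n_features, n_classes)
    print(f"{name}: {n_samples} samples, {n_features} features, {n_classes} classes")
```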
Explanation Methods
Tree-based Feature Importance
Available for Decision Tree and Random Forest. Uses the total Gini impurity reduction contributed by each feature across all splits in the tree(s).
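In scikit-learn, impurity-based importances are exposed through the fitted estimator's `feature_importances_` attribute, normalised so that they sum to 1. The sketch below assumes MLExplain reads that attribute, which the source does not confirm; the hyperparameters are arbitrary choices for the example.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data, iris.target

# Both estimators use the Gini criterion, which is scikit-learn's default.
tree = DecisionTreeClassifier(criterion="gini", random_state=0).fit(X, y)
forest = RandomForestClassifier(
    n_estimators=100, criterion="gini", random_state=0
).fit(X, y)

for name, model in [("decision tree", tree), ("random forest", forest)]:
    # Rank features from most to least important.
    ranked = sorted(
        zip(iris.feature_names, model.feature_importances_),
        key=lambda pair: pair[1],
        reverse=True,
    )
    print(name)
    for feature, importance in ranked:
        print(f"  {feature}: {importance:.3f}")
```

For the forest, the reported value is the average of the per-tree importances, which tends to be more stable than the ranking from any single tree.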
Permutation Importance
Available for all models. Measures the decrease in model accuracy when a single feature's values are randomly shuffled, breaking the relationship between the feature and the target.
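The shuffle-and-rescore procedure is model-agnostic, so it can be sketched without any ML library. The version below is a simplified illustration with a toy rule-based "model" and made-up data; scikit-learn ships a full implementation as `sklearn.inspection.permutation_importance`, and whether MLExplain uses that function is an assumption.

```python
import random


def accuracy(predict, X, y):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy per feature after shuffling that feature's column."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the data with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def predict(row):
    """Toy model that looks only at feature 0 and ignores feature 1."""
    return int(row[0] > 0.5)


# Toy data: the label is fully determined by feature 0.
X = [[0.1, 5.0], [0.2, 1.0], [0.9, 4.0], [0.8, 2.0], [0.3, 3.0], [0.7, 6.0]]
y = [0, 0, 1, 1, 0, 1]

importances = permutation_importance(predict, X, y)
print(importances)
```

Feature 1 scores exactly 0 because the toy model never reads it, so shuffling it cannot change any prediction. In practice the scores should be computed on held-out data, since importances measured on the training set can reflect overfitting.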
Technology Stack
Flask · scikit-learn · SQLite
REST API
MLExplain provides a full REST API for programmatic access to all features:
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/health | Health check |
| GET | /api/datasets | List datasets |
| GET | /api/datasets/<name> | Dataset info |
| POST | /api/train | Train a model |
| GET | /api/experiments | List experiments |
| GET | /api/experiments/<id> | Experiment details |
| DELETE | /api/experiments/<id> | Delete experiment |
| GET | /api/experiments/<id>/importance | Feature importance |
| GET | /api/experiments/<id>/confusion | Confusion matrix |
| GET | /api/experiments/<id>/metrics | All metrics |
| POST | /api/predict/<id> | Prediction |
| POST | /api/compare | Compare models |
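
A training request could be issued from Python's standard library roughly as follows. The base URL assumes Flask's default development address, and the JSON field names (`dataset`, `algorithm`) are guesses for illustration; the actual request schema is not documented in this page.

```python
import json
import urllib.request

# Assumption: the app is served at Flask's default development address.
BASE_URL = "http://127.0.0.1:5000"


def build_train_request(dataset, algorithm, base_url=BASE_URL):
    """Build (but do not send) a POST /api/train request."""
    # Hypothetical payload fields; check the server code for the real names.
    payload = {"dataset": dataset, "algorithm": algorithm}
    return urllib.request.Request(
        url=f"{base_url}/api/train",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


req = build_train_request("iris", "random_forest")
print(req.get_method(), req.full_url)
# With a running server: result = send(req)
```

Building the request separately from sending it keeps the example runnable without a live server.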