In this tutorial, we'll train two models, log both to MLflow, and explore what was captured in the MLflow UI.
In Part 1, we'll use autologging to capture a classification model automatically. In Part 2, we'll log a regression model by hand so we can see exactly what gets recorded and why. In Part 3, we'll compare both runs side by side in the UI.
Part 1 uses the Iris dataset, a sample dataset included with scikit-learn, and trains a logistic regression model — a standard algorithm for classification tasks. Part 2 uses the diabetes dataset, also included with scikit-learn, and trains a random forest regressor.
By the end, we'll have two logged training runs in the MLflow UI, a clear understanding of the difference between autologging and manual logging, and a comparison view showing how the two approaches differ in what they capture.
Before you begin
- Python installed (3.8 or later)
- `pip` available in your environment
- MLflow installed (if not, run `pip install mlflow`)
- scikit-learn installed (if not, run `pip install scikit-learn`)
- A terminal or command prompt with Python accessible
⏱ Time required: approximately 30 to 60 minutes
Part 1: Log a run with autologging
Autologging is the fastest way to get a training run into MLflow. A single call to `mlflow.autolog()` before training is all it takes: MLflow captures parameters, metrics, and the trained model automatically.
Step 1: Write and run the script
MLflow organizes training runs into experiments, so we'll name one to hold our runs. If the experiment doesn't exist yet, MLflow creates it automatically; if it already exists, MLflow connects to it. Either way, the call produces no output.
Create a new .py file and add the following code.
The `set_tracking_uri` line is required when logging to a local MLflow server. Update the port number if your server is running on a port other than 5000.
```python
import mlflow
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Point MLflow at the local tracking server and select the experiment.
mlflow.set_tracking_uri("http://127.0.0.1:5000")
mlflow.set_experiment("MLflow Quick Start")

# Load the Iris dataset and hold out 20% of it for testing.
X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

params = {"solver": "lbfgs", "max_iter": 1000, "random_state": 8888}

# Enable autologging before training so the fit() call is captured.
mlflow.autolog()

lr = LogisticRegression(**params)
lr.fit(X_train, y_train)
predictions = lr.predict(X_test)
```
`mlflow.autolog()` enables automatic logging for scikit-learn. When `lr.fit()` runs, MLflow captures the hyperparameters, performance metrics, and the trained model without any additional logging code.
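If you want to confirm what autologging captured before opening the UI, one option is to fetch the run MLflow just finished. A minimal sketch, assuming it runs in the same Python session right after the script above:

```python
# Sketch: inspect the run that autologging just recorded (same session).
run = mlflow.last_active_run()
print("Run ID: ", run.info.run_id)
print("Params: ", run.data.params)   # solver, max_iter, random_state, ...
print("Metrics:", run.data.metrics)  # accuracy and related training metrics
```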
Step 2: Start the MLflow UI
The MLflow UI lets us explore logged runs, compare metrics, and inspect saved models.
If the server isn't already running, open a terminal and run:
```
mlflow server --port 5000
```
Then open http://127.0.0.1:5000 in a browser.
Step 3: Find your run
Now we'll locate the run we logged in Step 1 and review what MLflow captured automatically.
- Select MLflow Quick Start from the Experiments list.
- In the Runs table, select the run listed there.
- On the Overview page, scroll down to review the Parameters, Metrics, and Logged model sections.
The Parameters section lists `solver`, `max_iter`, and `random_state`. The Metrics section shows accuracy. The Logged model section shows the saved model artifact.
Part 2: Log a run manually
Manual logging gives us control over exactly what gets captured and when. We'll log a second run using the diabetes dataset, this time writing each logging call instead of relying on autologging. Each call records a specific aspect of the run. See MLflow Tracking API for full details.
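Before running the full script, it helps to see the shape of the manual logging API. A minimal sketch of the pattern; the parameter, metric, and tag names here are illustrative, not part of this tutorial's script:

```python
import mlflow

with mlflow.start_run():                # everything inside is one run
    mlflow.log_param("alpha", 0.5)      # record a single hyperparameter
    mlflow.log_metric("rmse", 0.72)     # record a single evaluation metric
    mlflow.set_tag("stage", "example")  # attach free-form metadata to the run
```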
Step 1: Write and run the script
Create a new .py file in the same directory as your Part 1 script and add the following code.
The `set_tracking_uri` line is required when logging to a local MLflow server. Update the port number if your server is running on a port other than 5000.
```python
import mlflow
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Point MLflow at the local tracking server and reuse the same experiment.
mlflow.set_tracking_uri("http://127.0.0.1:5000")
mlflow.set_experiment("MLflow Quick Start")

# Load the diabetes dataset and split it for training and testing.
db = load_diabetes()
X_train, X_test, y_train, y_test = train_test_split(db.data, db.target)

# Wrap the raw data as an MLflow dataset so it can be logged as an input.
dataset = mlflow.data.from_numpy(db.data, targets=db.target, name="diabetes")

params = {"n_estimators": 100, "max_depth": 6, "max_features": 3}

with mlflow.start_run():
    # Each call below records one aspect of the run.
    mlflow.log_input(dataset, context="training")
    mlflow.log_params(params)

    rf = RandomForestRegressor(**params)
    rf.fit(X_train, y_train)

    predictions = rf.predict(X_test)
    mse = mean_squared_error(y_test, predictions)
    mlflow.log_metric("mse", mse)

    mlflow.sklearn.log_model(sk_model=rf, name="diabetes_model")

print("script completed")
```
When the script finishes, it prints `script completed`. The run appears in the MLflow UI under the MLflow Quick Start experiment with parameters, a metric, a dataset, and a saved model.
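To use the logged model for predictions later, you can load it back by its model URI. A sketch, assuming you modify the script above to capture the return value of `log_model` (for example, `model_info = mlflow.sklearn.log_model(...)`) and run this in the same session:

```python
# Sketch: reload the model MLflow just saved and score the held-out split.
# Assumes model_info = mlflow.sklearn.log_model(...) was captured above
# and X_test is still in scope.
loaded = mlflow.sklearn.load_model(model_info.model_uri)
print(loaded.predict(X_test)[:5])
```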
Step 2: View the run in the UI
If you closed the browser, open http://127.0.0.1:5000. If it's still open, refresh the page. Navigate to the MLflow Quick Start experiment.
- Select the most recent run in the table.
- Review the Metrics section: `mse` should appear.
- Review the Parameters section: `n_estimators`, `max_depth`, and `max_features` should appear.
- Review the Datasets section: the diabetes dataset should appear.
- Scroll to the Model section and confirm the `diabetes_model` artifact is present.
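The checks above can also be run from code with the MLflow client. A sketch; the run ID placeholder is hypothetical, so copy the real one from the run's Overview page:

```python
from mlflow.tracking import MlflowClient

client = MlflowClient(tracking_uri="http://127.0.0.1:5000")
run = client.get_run("<run_id>")  # paste the run ID from the Overview page

print(run.data.params)   # n_estimators, max_depth, max_features
print(run.data.metrics)  # {'mse': ...}
print([d.dataset.name for d in run.inputs.dataset_inputs])  # ['diabetes']
```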
Part 3: Compare runs in the UI
We already have two runs in the MLflow Quick Start experiment: one logged automatically in Part 1, one logged manually in Part 2.
Step 1: Select both runs
Open http://127.0.0.1:5000 (or refresh if it's already open) and navigate to the MLflow Quick Start experiment.
- Check the box next to the Part 1 run.
- Check the box next to the Part 2 run.
- Select Compare above the runs table.
Step 2: Review the comparison
The comparison view shows parameters, metrics, and datasets for each run in parallel. We'll use it to see exactly what each logging approach captured.
- Review the Parameters section. The Part 1 run shows `solver`, `max_iter`, and `random_state`. The Part 2 run shows `n_estimators`, `max_depth`, and `max_features`. The parameter sets are different because the two runs used different models.
- Review the Metrics section. Part 1 logged `accuracy`. Part 2 logged `mse`. Again, different models, different metrics.
- Review the Dataset section. Part 1 shows two datasets (training and test splits, captured automatically). Part 2 shows one dataset, logged explicitly with `mlflow.log_input()`.
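The same comparison can be pulled into code. A sketch using `mlflow.search_runs`, which returns the experiment's runs as a pandas DataFrame; metric and parameter columns are prefixed with `metrics.` and `params.`, and a run that didn't log a given value shows `NaN` there:

```python
import mlflow

mlflow.set_tracking_uri("http://127.0.0.1:5000")

# Fetch every run in the experiment as a pandas DataFrame.
runs = mlflow.search_runs(experiment_names=["MLflow Quick Start"])
print(runs[["run_id", "metrics.mse", "params.solver", "params.n_estimators"]])
```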
The comparison view also includes several visualization options. The default table view shows parameters and metrics side by side. The parallel coordinates plot maps each run as a line across parameter and metric axes, useful for spotting patterns across many runs. Charts are also available for plotting individual metrics.
What you built
We trained and logged two models to MLflow: one classification model using autologging, and one regression model using manual logging. We then compared both runs in the MLflow UI to see how the two logging approaches differ in what they capture and how they structure it.
This means we can now track every training run in one place, control exactly what gets captured, and evaluate the differences between runs without leaving the UI.