Setting Up MLflow in Google Colab: A Beginner-Friendly Guide
Bidut Sharkar Shemanto
Posted on November 30, 2024
Tracking machine learning experiments is essential for improving model performance and maintaining reproducibility. MLflow is a powerful tool that helps track experiments, log parameters, metrics, and artifacts, and even deploy models. In this blog, I'll guide you through setting up MLflow in Google Colab, complete with an accessible dashboard using ngrok.
Why Use MLflow in Google Colab?
Google Colab offers a free environment to run Python notebooks. Combining it with MLflow enables you to:
- Track your model's performance over time.
- Save and organize metrics, parameters, and artifacts for various experiments.
- Share your experiment dashboard with collaborators using ngrok.
Step 1: Installing MLflow and Dependencies
First, we need to install MLflow and pyngrok, a Python wrapper for ngrok. Run the following commands in your Colab notebook:
!pip install mlflow -q
!pip install pyngrok -q
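Because the `-q` flag hides most of pip's output, it can help to confirm that both packages actually installed. Here is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_versions` is my own and not part of MLflow or pyngrok:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package name: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Report the status of the two packages this guide relies on
for name, ver in installed_versions(["mlflow", "pyngrok"]).items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```

If either line reports NOT INSTALLED, rerun the install cell without `-q` to see the error.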
Step 2: Starting the MLflow Server
Colab notebooks run on a remote virtual machine, so a server listening on localhost there isn’t directly reachable from your browser. To work around this, we’ll use ngrok to expose the MLflow UI through a public URL.
Here’s the complete setup:
import mlflow
import subprocess
from pyngrok import ngrok, conf
import getpass
# Define the MLflow tracking URI with SQLite
MLFLOW_TRACKING_URI = "sqlite:///mlflow.db"
# Start the MLflow server using subprocess
subprocess.Popen(["mlflow", "ui", "--backend-store-uri", MLFLOW_TRACKING_URI, "--port", "5000"])
# Set MLflow tracking URI
mlflow.set_tracking_uri(MLFLOW_TRACKING_URI)
# Set or create an experiment
mlflow.set_experiment("BD House Price Prediction")
# Set up ngrok for exposing the MLflow UI
print("Enter your authtoken, which can be copied from https://dashboard.ngrok.com/auth")
conf.get_default().auth_token = getpass.getpass()
# Expose the MLflow UI on port 5000
port = 5000
public_url = ngrok.connect(port).public_url
print(f' * ngrok tunnel "{public_url}" -> "http://127.0.0.1:{port}"')
Running this cell prints an ngrok tunnel URL. Open that URL in your browser to reach the MLflow UI, where the experiment you just created is listed and where your logged runs will appear.
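One practical wrinkle: `subprocess.Popen` returns immediately, so the MLflow server may still be starting when the tunnel opens, and the URL can show an error for the first few seconds. A small polling helper is one way to handle that. This is a sketch using only the standard library; `wait_for_port` is a hypothetical name, and the timeout values are arbitrary defaults:

```python
import socket
import time


def wait_for_port(port, host="127.0.0.1", timeout=30.0, interval=0.5):
    """Return True once host:port accepts TCP connections, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or connection timed out); retry shortly
            time.sleep(interval)
    return False
```

Calling `wait_for_port(5000)` right after the `subprocess.Popen(...)` line and before `ngrok.connect(port)` would delay the tunnel until the UI is ready. A return value of `False` suggests the server failed to start.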
Step 3: Logging Experiments
You can log metrics, parameters, and even artifacts (e.g., models or plots) in MLflow. Here’s an example:
# Start an MLflow run
with mlflow.start_run():
    # Log parameters and metrics
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_metric("rmse", 0.25)
    print("Run logged successfully!")
Congratulations! You’ve successfully set up MLflow in Google Colab. With this setup, you can track experiments, log results, and share your work effortlessly. MLflow’s flexibility and ease of use make it a must-have tool for any data scientist or machine learning enthusiast.
Follow me on LinkedIn and GitHub
Have fun experimenting and tracking your models like a pro! 🚀