Setting Up MLflow in Google Colab: A Beginner-Friendly Guide

shemanto_sharkar

Bidut Sharkar Shemanto

Posted on November 30, 2024


Tracking machine learning experiments is essential for improving model performance and maintaining reproducibility. MLflow is a powerful tool that helps track experiments, log parameters, metrics, and artifacts, and even deploy models. In this blog, I'll guide you through setting up MLflow in Google Colab, complete with an accessible dashboard using ngrok.

Why Use MLflow in Google Colab?

Google Colab offers a free environment to run Python notebooks. Combining it with MLflow enables you to:

  • Track your model's performance over time.
  • Save and organize metrics, parameters, and artifacts for various experiments.
  • Share your experiment dashboard with collaborators using ngrok.

Step 1: Installing MLflow and Dependencies
First, install MLflow and pyngrok, a Python wrapper that downloads and manages the ngrok client for you. Run the following commands in your Colab notebook:

!pip install mlflow -q
!pip install pyngrok -q

Step 2: Starting the MLflow Server
A Colab notebook runs on a remote virtual machine, so a server listening on that machine's localhost is not reachable from your browser. To work around this, we'll use ngrok to expose the MLflow UI through a public URL.

Here’s the complete setup:

import mlflow
import subprocess
from pyngrok import ngrok, conf
import getpass

# Define the MLflow tracking URI with SQLite
MLFLOW_TRACKING_URI = "sqlite:///mlflow.db"

# Start the MLflow server using subprocess
subprocess.Popen(["mlflow", "ui", "--backend-store-uri", MLFLOW_TRACKING_URI, "--port", "5000"])
# Set MLflow tracking URI
mlflow.set_tracking_uri(MLFLOW_TRACKING_URI)
# Set or create an experiment
mlflow.set_experiment("BD House Price Prediction")
# Set up ngrok for exposing the MLflow UI
print("Enter your authtoken, which can be copied from https://dashboard.ngrok.com/auth")
conf.get_default().auth_token = getpass.getpass()
# Expose the MLflow UI on port 5000
port = 5000
public_url = ngrok.connect(port).public_url
print(f' * ngrok tunnel "{public_url}" -> "http://127.0.0.1:{port}"')

Running this code prints an ngrok tunnel URL. Open that URL in your browser to reach the MLflow UI, where the "BD House Price Prediction" experiment is already set up. Every run you log will show up there.
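One practical wrinkle: `subprocess.Popen` returns immediately, so the tunnel can be opened before the MLflow server has finished starting, and the URL may briefly show a connection error. Here is a minimal sketch of a readiness check that uses only the standard library. The helper name `wait_for_port` and the timeout values are my own choices for illustration; they are not part of MLflow or pyngrok.

```python
import socket
import time


def wait_for_port(port, host="127.0.0.1", timeout=30.0, interval=0.5):
    """Return True once something accepts TCP connections on host:port.

    Hypothetical helper for illustration: it keeps polling until
    `timeout` seconds have passed, then gives up and returns False.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connection means a server is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet (or the connection attempt timed out).
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

In the setup from Step 2, you could call `wait_for_port(5000)` after the `subprocess.Popen(...)` line and only call `ngrok.connect(port)` once it returns `True`.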

Step 3: Logging Experiments
You can log metrics, parameters, and even artifacts (e.g., models or plots) in MLflow. Here’s an example:

# Start an MLflow run
with mlflow.start_run():
    # Log parameters and metrics
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_metric("rmse", 0.25)
    print("Run logged successfully!")

Congratulations! You’ve successfully set up MLflow in Google Colab. With this setup, you can track experiments, log results, and share your work effortlessly. MLflow’s flexibility and ease of use make it a must-have tool for any data scientist or machine learning enthusiast.

Follow me on LinkedIn and GitHub

Have fun experimenting and tracking your models like a pro! 🚀
