How to Use KitOps with MLflow
Jesse Williams
Posted on November 29, 2024
As artificial intelligence (AI) projects grow in complexity, managing dependencies, maintaining reproducibility, and deploying models efficiently become critical challenges. These processes require tools that can streamline development, tracking, and deployment. Tools like KitOps and MLflow simplify these workflows by automating key aspects of the machine learning (ML) project lifecycle. KitOps simplifies AI project setup, while MLflow tracks and manages machine learning experiments. With these tools, developers can create robust, scalable, and reproducible ML pipelines.
In this guide, you’ll learn the step-by-step process to integrate KitOps with MLflow for end-to-end model management. However, before diving into the steps, it’s important to understand why integrating KitOps with MLflow is important.
Why integrate KitOps with MLflow?
KitOps integration with MLflow enables machine learning teams to work more efficiently. KitOps establishes a well-defined framework for workflows, while MLflow focuses on detailed tracking and efficient deployment.
Together, they allow data science teams to set up AI projects in minutes, monitor and compare experiments, and deploy models seamlessly to production.
Here’s how this integration enhances the machine learning project lifecycle for AI/ML engineers:
End-to-end workflow management
KitOps uses the Kitfile to provide a well-structured, modular architecture, while MLflow tracks the details of each experiment. Together, they manage the entire lifecycle, from data ingestion to model training, evaluation, and deployment.
Reproducibility and traceability
Reproducibility ensures that AI systems produce consistent output from the same version of data and code, which keeps AI systems trustworthy. With KitOps, you can version and reuse your workflow components in existing container registries, while MLflow logs every experiment with detailed metadata. This integration keeps every experiment traceable, making collaboration easy.
Collaboration across teams
Machine learning projects often involve multiple stakeholders across teams, from data engineers to ML engineers and DevOps teams. KitOps provides a structured framework that promotes collaboration, while MLflow offers a unified interface for tracking progress. Together, they bridge the gap between development and operations.
Scalability for production
As projects move to production, scalability becomes a priority for ML teams. With KitOps’ modular design, ML teams can scale project workflows easily, while MLflow’s registry and deployment tools handle model lifecycle management at scale. This combination is ideal for organizations looking to operationalize their AI projects.
How to use KitOps with MLflow
The diagram below outlines the steps for using KitOps with MLflow to launch your AI project. This tutorial uses a custom diabetes ML model developed from scratch. With the kit unpack command, you can pull a sample ModelKit from Jozu Hub and follow along.
Each part of the diagram is explained in more detail in the following sections.
Step 1: Install KitOps
First, install and set up KitOps. The exact steps vary by operating system (OS), but the central idea is the same: download the KitOps executable and add it to a directory on your system's PATH so your OS can detect it.
After installation, you can verify the Kit CLI is correctly installed by opening a new terminal and typing the command kit version.
kit version
------------------------
version: 0.4.0
Commit: 4d208b6cccdefdce2e79d3bea2e54d08d65dee8f
Built: 2024-08-26T15:08:11Z
Go version: go1.21.6
Step 2: Configure your Kitfile to create a ModelKit
Create a ModelKit for your AI project. This requires you to initialize its native configuration document, called a Kitfile. To create an empty Kitfile in your development directory, use the command below (this is the Windows syntax; on macOS or Linux, use touch Kitfile instead):
echo. > Kitfile
Open the Kitfile and specify all the folders relevant to your ML development so the Kitfile knows what to track when you package your ModelKit image. This lets you maintain the structure and workflows in your local development environment. The Kitfile is a YAML file; here is a sample.
manifestVersion: v2.0.0
package:
  authors:
    - Chris
  description: This project is used to predict if a patient is diabetic or not
  name: Diabetes
code:
  - description: Jupyter notebook with model training code in Python
    path: ./code
datasets:
  - description: Pima Indians Diabetes dataset (tabular)
    name: data
    path: ./datasets
model:
  framework: Scikit-learn
  name: Diabetes_model
  path: ./models
  version: 0.0.1
  description: Diabetes prediction model using Scikit-learn
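Before packing, it's worth confirming that the directories the Kitfile references actually exist, so nothing is silently missing from the ModelKit. A minimal sketch (the paths come from the sample Kitfile above; the os.makedirs calls are only so the sketch runs anywhere):

```python
import os

# Paths listed in the sample Kitfile's code, datasets, and model sections
kitfile_paths = ["./code", "./datasets", "./models"]

# Create them here so the sketch is self-contained; in a real project
# these directories already hold your notebook, data, and model files.
for p in kitfile_paths:
    os.makedirs(p, exist_ok=True)

# Report any directory the Kitfile references but the project lacks
missing = [p for p in kitfile_paths if not os.path.isdir(p)]
print("missing directories:", missing)
```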
Step 3: Unpack a ModelKit from Jozu Hub or use your custom ML model developed from scratch
You can build your model from scratch or grab any ModelKit from Jozu Hub's container registry. This guide uses a base model developed from scratch. However, with the kit unpack command, you can pull a sample ModelKit from Jozu Hub.
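The MLflow logging code in the next step assumes a few names already exist: a trained estimator lr, its hyperparameters params, the training features X_train, and a held-out accuracy score test_data_accuracy. As a stand-in for the tutorial's Pima Indians model, here is a minimal, self-contained training sketch that binarizes scikit-learn's built-in diabetes dataset (the dataset choice and hyperparameters are illustrative, not the tutorial's actual code):

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Illustrative data: binarize the built-in regression target so we get
# a classification problem resembling "diabetic" vs. "not diabetic"
X, y_raw = load_diabetes(return_X_y=True)
y = (y_raw > np.median(y_raw)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Hyperparameters to log with mlflow.log_params in the next step
params = {"solver": "lbfgs", "max_iter": 1000, "random_state": 42}

lr = LogisticRegression(**params)
lr.fit(X_train, y_train)

# Held-out accuracy to log with mlflow.log_metric
test_data_accuracy = accuracy_score(y_test, lr.predict(X_test))
print(f"test accuracy: {test_data_accuracy:.3f}")
```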
Step 4: Set up your MLflow tracking URI
Next, set up your MLflow tracking_uri and an experiment name to log your model parameters and metrics, as seen below.
import mlflow
from mlflow.models import infer_signature

## MLflow tracking
mlflow.set_tracking_uri(uri="http://127.0.0.1:5000")

## Create an MLflow experiment
mlflow.set_experiment("KitOps_MLflow_Quickstart")

## Start an MLflow run
with mlflow.start_run():
    ## Log the hyperparameters
    mlflow.log_params(params)

    ## Log the accuracy metric
    mlflow.log_metric("accuracy", test_data_accuracy)

    ## Set a tag we can use to remind ourselves what this run is for
    mlflow.set_tag("Training info", "Basic LR model for diabetes data")

    ## Infer the model signature
    signature = infer_signature(X_train, lr.predict(X_train))

    ## Log the model
    model_info = mlflow.sklearn.log_model(
        sk_model=lr,
        artifact_path="diabetes_model",
        signature=signature,
        input_example=X_train,
        registered_model_name="tracking-KitOps_MLflow_Quickstart",
    )
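Once the run is logged, the trained model also needs to land in the ./models directory the Kitfile tracks, so the next step's kit pack picks it up. A minimal sketch using Python's standard pickle module (the file name is illustrative, and the dict here stands in for the trained lr estimator):

```python
import os
import pickle

# Stand-in for the trained estimator from the earlier training step
lr = {"note": "illustrative stand-in for the trained lr estimator"}

# ./models is the path the sample Kitfile's model section points to
os.makedirs("models", exist_ok=True)
model_path = os.path.join("models", "diabetes_model.pkl")

with open(model_path, "wb") as f:
    pickle.dump(lr, f)

print("saved:", model_path)
```

With the serialized model in place, everything the Kitfile references (code, datasets, models) is on disk and ready to be packaged.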
Step 5: Pack the ModelKit
Package the Kitfile in your development directory with an AI project name and a ModelKit tag to create the ModelKit image in your local KitOps registry. Versioning and tagging keep code, data, and models consistent by storing everything in one location. The command is formatted as follows:
kit pack . -t jozu.ml/chukoz71/quick-start:Diabetes
In the command above, a ModelKit tagged **Diabetes** (representing your tag name) is packaged with the kit pack command and addressed to the Jozu Hub repository of a registry user called chukoz71.
kit pack . -t jozu.ml/chukoz71/quick-start:Diabetes
-----------
Saved configuration: sha256:cbda96a6aa80efebb93fbb008af80f225f76ae44d88891b77423c8e1f4a031d9
Saved model layer: sha256:00b46593828668f174341ac834d748b7f6cc118b300708331cc25cbe9ea66ad0
Saved code layer: sha256:756da1aacd63cd0469e5daed4d82daf402b30b1ea2b6916f7a90575f13dc086c
Saved dataset layer: sha256:f307c84c53603257125f28fe808c42bab5e5c02bf1df6962d8df41418a70d743
Saved manifest to storage: sha256:017c18b0e03e9d2c95586114c7b8e780f97bfb420d03df99310ee1387a7e08f1
Model saved: sha256:017c18b0e03e9d2c95586114c7b8e780f97bfb420d03df99310ee1387a7e08f1
The ModelKit automatically tracks and stores updates of the directories you specified in the Kitfile. At every development stage, you only need to repackage the ModelKit to track updates. Packaging your ModelKit from the development phase minimizes errors, secures uniform practices, and even enables easy rollback if needed.
Step 6: Push the ModelKit
After packaging locally, you can push your ModelKit image from the local registry to your configured remote container registry. But first, make sure the ModelKit is tagged with the name of your configured remote registry, which creates a clear reference and destination for the image. Then, push it to your remote container registry; the command is formatted as follows:
kit push "REMOTE_ADDRESS"/"REGISTRY_USER"/"REPOSITORY_NAME":"TAG"
In this example, the command pushes a ModelKit tagged **Diabetes** (representing your tag name) to:
- jozu.ml → your registry address (the Jozu Hub)
- chukoz71 → your registry user or organization name
- quick-start → your repository name
As a result, the command will look like:
kit push jozu.ml/chukoz71/quick-start:Diabetes
--------------
Pushed sha256:017c18b0e03e9d2c95586114c7b8e780f97bfb420d03df99310ee1387a7e08f1
Afterward, log in to your repository on Jozu Hub to confirm that the ModelKit digest has been successfully pushed. The repository page displays the date it was pushed, the ModelKit digest, the tag name, the size, and other evidence of a successful push.
Now, developers can reproduce your ML workflows or extract only relevant assets for further development, testing, integration, or deployment.
Conclusion
Integrating KitOps with MLflow provides a comprehensive solution for managing the complexities of ML workflows. KitOps provides a modular framework for designing ML processes, while MLflow offers a unified platform for experiment tracking, model versioning, and deployment, which guarantees reproducibility, traceability, and operational efficiency.
Following these steps, you've learned how to use KitOps with MLflow to get your AI project off the ground in minutes. You've explored using your custom base model, logging experiments with MLflow, packaging your model as a ModelKit, and pushing it to a remote repository. To learn more about deploying a ModelKit, follow the next steps with KitOps and optimize your AI models for production environments.