ZenML for beautiful beautiful orchestration pt 3: MLflow Aliases/Tags
Akan
Posted on October 16, 2024
Deploying a Local MLOps Solution: The Final Chapter
In the previous parts of this series, we explored the fundamentals of MLOps and set up our local environment. Now, we'll conclude by deploying our solution locally.
Adapting Our Approach: From ZenML to MLflow
Model deployment with the MLflow flavor in ZenML proved challenging on Windows. Since our overarching goal is to create a simple solution that can run exclusively on your local machine, we opted for the best alternative: Model Registry using Tags/Aliases on MLflow.
Instead of deploying the Model object through ZenML, we now handle deployment entirely through MLflow. This change primarily impacts our promotion logic.
MLflow: Model Registry with Tags/Aliases
The MLflow Model Registry documentation is invaluable and will save you a ton of despair.
We've adapted the approach to use the tags [staging, production, archived], using MLflow's client API to manage the tag changes: deleting old tags and setting new ones to reflect the updated status of each model version.
This logic ensures that only better-performing models (based on RMSE) are promoted to production, while maintaining a history of all models in the "archived" state for future reference or rollback if needed.
- We tag a model as "staging" immediately after training (a sketch of this step follows the list).
- We pass it to the promotion stage, where we compare the Root Mean Squared Error (RMSE).
- Based on performance, we update the tag to either "archived" or "production".
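The first bullet, tagging a freshly trained model as "staging", isn't shown in the snippets below, so here is a minimal sketch of what that step could look like. The function name register_as_staging and its parameters are illustrative assumptions, not the exact training step from the repository:

import mlflow
import mlflow.sklearn
from mlflow.tracking import MlflowClient

def register_as_staging(trained_model, rmse: float, model_name: str) -> str:
    """Register a freshly trained model and tag the new version as "staging"."""
    # Log the model and register it, creating a new version in the Model Registry
    with mlflow.start_run():
        mlflow.sklearn.log_model(
            sk_model=trained_model,
            artifact_path="model",
            registered_model_name=model_name,
        )
    client = MlflowClient()
    # Find the newest registered version and tag it as "staging",
    # storing the RMSE as the tag value (the promotion step reads it back later)
    latest = max(
        client.search_model_versions(f"name='{model_name}'"),
        key=lambda v: int(v.version),
    )
    client.set_model_version_tag(
        name=model_name, version=latest.version, key="staging", value=str(rmse)
    )
    return latest.version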
Here's the crux of the promotion_logic, as modified from our previous versions:
# Determine the new tag based on RMSE comparison
if production_model:
    # Compare RMSEs
    if staging_rmse < production_rmse:  # Staging is better, promote it to production
        # Change tag "production" -> "archived"
        mlflowClient.delete_model_version_tag(
            name=model_name, version=str(production_version), key="production"
        )
        mlflowClient.set_model_version_tag(
            name=model_name, version=str(production_version), key="archived", value=str(production_rmse)
        )
        # Change tag "staging" -> "production"
        mlflowClient.delete_model_version_tag(
            name=model_name, version=str(staging_version), key=tag_key
        )
        mlflowClient.set_model_version_tag(
            name=model_name, version=str(staging_version), key="production", value=str(staging_rmse)
        )
        print("Promoted staging model to production. Archived previous production model.")
        return True
    else:  # Production is better, archive staging
        mlflowClient.delete_model_version_tag(
            name=model_name, version=str(staging_version), key=tag_key
        )
        mlflowClient.set_model_version_tag(
            name=model_name, version=str(staging_version), key="archived", value=str(staging_rmse)
        )
        return False
Here is how it works:
- It first checks whether there's an existing production model.
- If a production model exists, it compares the RMSE of the staging model with that of the production model:
  - If the staging model performs better (lower RMSE):
    - The current production model is demoted to "archived" status.
    - The staging model is promoted to "production" status.
    - The function returns True to indicate a promotion occurred.
  - If the production model performs better:
    - The staging model is moved to "archived" status.
    - The production model remains unchanged.
    - The function returns False to indicate no promotion occurred.
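Both the promotion step above and the Streamlit app below rely on a helper that looks up a registered model version by its tag. The original get_model_by_tag isn't reproduced in this post, so here is a minimal sketch of how such a lookup could work with MLflow's client API, assuming the RMSE is stored as the tag's value (as the set_model_version_tag calls above do); the repository's implementation may differ:

import mlflow.pyfunc
from mlflow.tracking import MlflowClient

def get_model_by_tag(tag: str, model_name: str):
    """Return (model, rmse, version) for the latest version carrying the given tag."""
    client = MlflowClient()
    # Search all registered versions of this model and keep those carrying the tag
    versions = client.search_model_versions(f"name='{model_name}'")
    tagged = [v for v in versions if tag in v.tags]
    if not tagged:
        raise ValueError(f"No version of '{model_name}' is tagged '{tag}'")
    latest = max(tagged, key=lambda v: int(v.version))
    # The tag value holds the RMSE written by set_model_version_tag
    rmse = float(latest.tags[tag])
    model = mlflow.pyfunc.load_model(f"models:/{model_name}/{latest.version}")
    return model, rmse, latest.version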
Using the Predictive Service
The culmination of this exercise is hosting a User Interface where
the end-user can use the predictive service.
On the back end, we fetch the best-performing (production) model from the model registry and serve it through a Streamlit app.
Streamlit is an open-source Python framework designed for quickly developing interactive app prototypes like this.
Here's a simple script that demonstrates this:
import pickle

import mlflow
import pandas as pd
import streamlit as st
from mlflow.tracking import MlflowClient

@st.cache_resource
def load_production_model():
    mlflow.set_tracking_uri("file:C:/Users/buasc/zenml_store/mlruns")
    mlflowClient = MlflowClient()
    model_name = "salary_prediction_regression-model"
    try:
        production_model, production_rmse, production_version = get_model_by_tag(
            tag="production",
            model_name=model_name
        )
        st.success(f"Loaded production model version {production_version} with RMSE: {production_rmse}")
        return production_model
    except Exception as e:
        st.warning("No production model found!")
        try:
            local_production_model_pth = "model_dir/5152.801173710256_randomForest_2024-09-18.pkl"
            with open(local_production_model_pth, 'rb') as file:
                production_model = pickle.load(file)
            st.success("Loaded from local dir!")
            return production_model
        except Exception as e:
            st.warning("No Local Production Model Found!")
            return None
This helper function attempts to load the model from MLflow. It first sets the tracking URI and tries to retrieve the model tagged as "production" from the specified MLflow tracking server. If successful, it reports the model's version and RMSE and returns the model. If the production model is not found, it falls back to loading a local model from a specified file path. If both attempts fail, it returns None and displays appropriate warnings.
# Load the model only once
production_model = load_production_model()

st.title("Model Prediction App")

# Create input fields for your features
work_experience = st.number_input(
    "Experience Years",
    min_value=1, value=5, step=1
)

if st.button("Predict"):
    if production_model is not None:
        # Create a DataFrame with the input data
        input_data = pd.DataFrame(
            [[float(work_experience)]],
            columns=["Experience Years"]
        )
        # Make prediction
        prediction = production_model.predict(input_data)
        st.write(f"Prediction: {prediction[0]}")
    else:
        st.error("No model available for prediction.")
With the model loaded, users can enter their years of experience. When the "Predict" button is clicked, the app checks whether a model is available; if so, it makes a prediction and displays the predicted salary.
That's it! With this script, you can now interact with your deployed model through a user-friendly interface.
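To try it locally, save the script and launch it with the Streamlit CLI (app.py is just an assumed filename for the script above):

streamlit run app.py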
Conclusion: Your Local MLOps Pipeline
This concludes our series on building a local MLOps solution. We've created a simple yet effective MLOps pipeline that:
- Fetches and wrangles data
- Trains a defined model
- Stores the model's artifacts
- Registers the model to a model registry
- Iteratively promotes or discards models based on performance
- Retrieves the best-performing model for end-use via a User Interface
This pipeline embodies the essence of ML Engineering, bringing together various components to create a robust, iterative process for managing machine learning models.
The full repository for this project is available here. Please don't hesitate to reach out if you have any questions or need further clarification. And if you are a video person, see the screencast here.
Happy ML Engineering!