Automate SageMaker Real-Time ML Inference in a Serverless Way
prasanth mathesh
Posted on May 19, 2021
Introduction
Amazon SageMaker is a fully managed service that enables data scientists and ML engineers to quickly build, train, and deploy models and ML pipelines in a scalable and cost-effective way. SageMaker was launched around Nov 2017, and I first got to know its built-in algorithms and features from Kris Skrinak during a boot camp roadshow for Amazon Partners. Since then, SageMaker has matured considerably, letting ML engineers deploy and track models quickly and at scale. Beyond its built-in algorithms, it has gained features such as Autopilot, Clarify, Feature Store, and support for custom Docker containers. This blog looks into these newer SageMaker features and a serverless way of handling training, deployment, and real-time inference.
Architecture
The steps for the reference architecture below are explained at the end of the SageMaker Pipelines section of this article.
SageMaker Features
A) Autopilot: Low-Code Machine Learning
- Launched around Dec 2019
- An industry-first automated ML service designed to give control and visibility into the generated ML models
- Does feature preprocessing, picks the best algorithm, and trains and selects the best model with just a few clicks
- Vertical AI services like Amazon Personalize and Amazon Forecast can be used for personalized recommendation and forecasting problems
- Autopilot, by contrast, is a generic ML service for classification and regression problems such as fraud detection, churn analysis, and targeted marketing
- Supports SageMaker built-in algorithms such as XGBoost and Linear Learner
- The default maximum input dataset size is 5 GB; the limit can be raised, but only in whole-GB increments
Auto-Pilot Demo
Data for AutoPilot Experiment
The dataset considered is the public in-vehicle coupon recommendation dataset provided by UCI.
Data Set Information
The survey data describes different driving scenarios, including the destination, current time, weather, and passenger, and then asks whether the person would accept the coupon if they were the driver. The task performed on this dataset is classification.
AutoPilot Experiment
Import the data for training.
%%sh
wget https://archive.ics.uci.edu/ml/machine-learning-databases/00603/in-vehicle-coupon-recommendation.csv
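Before configuring the experiment, the CSV needs to land in S3. As a minimal sketch of that step with the SageMaker Python SDK (the key prefix is illustrative, and the session's default bucket is assumed):

import sagemaker

session = sagemaker.Session()
# Upload the downloaded CSV to the session's default bucket
train_uri = session.upload_data(
    path="in-vehicle-coupon-recommendation.csv",
    key_prefix="autopilot/coupon/input",
)
print(train_uri)  # s3:// URI to use as the training input path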
Once the data is uploaded, the Autopilot experiment can be set up within minutes using SageMaker Studio. Add the training input and output data paths and the label to predict, and enable auto-deployment of the model. SageMaker deploys the best model and creates an endpoint after successful training.
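If you prefer to script the experiment instead of using the Studio UI, the same job can be created through the boto3 API. A sketch, assuming the label column is "Y" (as in the Clarify config later in this article) and using placeholder paths and role:

import boto3

sm = boto3.client("sagemaker")
sm.create_auto_ml_job(
    AutoMLJobName="coupon-autopilot",
    InputDataConfig=[{
        "DataSource": {"S3DataSource": {
            "S3DataType": "S3Prefix",
            "S3Uri": "s3://<bucket>/autopilot/coupon/input/",
        }},
        "TargetAttributeName": "Y",  # label column to predict
    }],
    OutputDataConfig={"S3OutputPath": "s3://<bucket>/autopilot/coupon/output/"},
    ProblemType="BinaryClassification",
    AutoMLJobObjective={"MetricName": "F1"},
    RoleArn="<sagemaker-execution-role-arn>",
)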
Alternatively, one can select a model of their choice and deploy it.
The endpoint configuration and endpoint details of the deployed model can be found in the console.
Infer and Evaluate Model
Take a validation record and invoke the endpoint. The feature engineering tasks are handled by Autopilot, so raw feature data can be sent to the trained model for prediction.
No Urgent Place,Friend(s),Sunny,80,10AM,Carry out & Take away,2h,Female,21,Unmarried partner,1,Some college - no degree,Unemployed,$37500 - $49999,,never,never,,4~8,1~3,1,1,0,0,1,1
Infer against the model with the validation dataset using the code given in the GitHub repo.
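As a sketch, a single raw record can be sent to the endpoint with the SageMaker runtime client (the endpoint name is a placeholder):

import boto3

runtime = boto3.client("sagemaker-runtime")
# Raw validation record from the dataset; it must match the training schema
payload = ("No Urgent Place,Friend(s),Sunny,80,10AM,Carry out & Take away,2h,Female,21,"
           "Unmarried partner,1,Some college - no degree,Unemployed,$37500 - $49999,,"
           "never,never,,4~8,1~3,1,1,0,0,1,1")
response = runtime.invoke_endpoint(
    EndpointName="<autopilot-endpoint-name>",
    ContentType="text/csv",
    Body=payload,
)
print(response["Body"].read().decode("utf-8"))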
B) SageMaker Clarify
- Launched around DEC 2020
- Explains how machine learning (ML) models make predictions during Autopilot experiments
- Monitors Bias Drift for Models in Production
- Provides components that help AWS customers build less biased and more understandable machine learning models
- Provides explanations for individual predictions available via API
- Helps in establishing the model governance for ML applications
The bias information can be generated for the AutoPilot experiment.
import sagemaker

# Where to read the training data and write the bias report
bias_data_config = sagemaker.clarify.DataConfig(
    s3_data_input_path=training_data_s3_uri,
    s3_output_path=bias_report_1_output_path,
    label="Y",
    headers=train_cols,
    dataset_type="text/csv",
)

# The deployed model that Clarify will query for predictions
model_config = sagemaker.clarify.ModelConfig(
    model_name=model_name,
    instance_type=train_instance_type,
    instance_count=1,
    accept_type="text/csv",
)
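These configs are then handed to a Clarify processor. A sketch of the remaining wiring, assuming "gender" as the facet of interest and placeholder role and instance values:

from sagemaker import clarify

clarify_processor = clarify.SageMakerClarifyProcessor(
    role="<sagemaker-execution-role-arn>",
    instance_count=1,
    instance_type="ml.m5.xlarge",
)
# Bias is measured with respect to a facet column; "gender"/"Female" are assumptions here
bias_config = clarify.BiasConfig(
    label_values_or_threshold=[1],
    facet_name="gender",
    facet_values_or_threshold=["Female"],
)
# Runs a processing job and writes the bias report to the configured S3 path
clarify_processor.run_bias(
    data_config=bias_data_config,
    bias_config=bias_config,
    model_config=model_config,
)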
C) SageMaker Feature Store
- Launched around DEC 2020
- Amazon SageMaker Feature Store is a fully managed repository to store, update, retrieve, and share machine learning (ML) features, with the offline store backed by S3
- The feature set that was used to train the model needs to be available to make real-time predictions (inference).
- Data Wrangler of SageMaker Studio can be used to engineer features and ingest features into a feature store
- Both the online and offline stores can be ingested via separate feature engineering pipelines using the SDK
- Streaming sources can directly ingest features to the online feature store for inference or feature creation
- Feature Store automatically builds an AWS Glue Data Catalog when feature groups are created; this can optionally be turned off
The table below shows various data stores used to maintain features. Open-source frameworks like Feast have evolved as feature store platforms, and any key-value data store that supports fast lookups can serve as a feature store.
The feature store is the end stage of the feature engineering pipeline, and features can also be stored in cloud data warehouses such as Snowflake and Redshift, as shown in the image from featurestore.org.
record_identifier_value = str(2990130)
featurestore_runtime.get_record(
    FeatureGroupName=transaction_feature_group_name,
    RecordIdentifierValueAsString=record_identifier_value,
)
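Writes go through the same runtime client. A sketch of ingesting a single record, where the feature names are hypothetical and every record must carry the identifier and event-time features:

import time
import boto3

featurestore_runtime = boto3.client("sagemaker-featurestore-runtime")
featurestore_runtime.put_record(
    FeatureGroupName="coupon",
    Record=[
        {"FeatureName": "record_id", "ValueAsString": "2990130"},
        {"FeatureName": "weather", "ValueAsString": "Sunny"},
        # Event time is required so the store can de-duplicate and version records
        {"FeatureName": "event_time", "ValueAsString": str(int(time.time()))},
    ],
)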
The feature group can be accessed as Hive external table too.
CREATE EXTERNAL TABLE IF NOT EXISTS sagemaker_featurestore.coupon (
  write_time TIMESTAMP,
  event_time TIMESTAMP,
  is_deleted BOOLEAN
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS
  INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
  OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat'
LOCATION 's3://coupon-featurestore/onlinestore/139219451296/sagemaker/ap-south-1/offline-store/coupon-1621050755/data'
D) SageMaker Pipelines
- Launched around DEC 2020
- SageMaker natively supports MLOps via SageMaker Projects, and pipelines are created during project creation
- MLOps is a practice for streamlining the continuous delivery of models; it is essential for a successful production-grade ML application
- A SageMaker pipeline is a series of interconnected steps defined by a JSON pipeline definition, which can build, train, and deploy, or only train and deploy, etc.; see the sketch after this list
- Alternate ways to set up MLOps around SageMaker include MLflow, Airflow, Kubeflow, Step Functions, etc.
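As a rough sketch of what such a JSON-backed pipeline definition looks like when built with the SageMaker Python SDK (the image URI, role ARN, and S3 paths are placeholders):

from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import TrainingStep

# Estimator wrapping the custom training image
estimator = Estimator(
    image_uri="<ecr-image-uri>",
    role="<sagemaker-execution-role-arn>",
    instance_count=1,
    instance_type="ml.m5.large",
    output_path="s3://<bucket>/coupon/output/",
)

train_step = TrainingStep(
    name="TrainCouponModel",
    estimator=estimator,
    inputs={"train": TrainingInput("s3://<bucket>/coupon/train/", content_type="text/csv")},
)

# The Pipeline object serializes to the JSON definition mentioned above
pipeline = Pipeline(name="coupon-pipeline", steps=[train_step])
pipeline.upsert(role_arn="<sagemaker-execution-role-arn>")
pipeline.start()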
Docker Containers
SageMaker Studio itself runs from a Docker container, and Docker containers can be used to migrate existing on-premise live ML pipelines and models into the SageMaker environment.
Both stateful and stateless inference pipelines can be created. For example, anomaly and fraud detection pipelines are stateless, while the example considered in this article is a stateful model inference pipeline.
SageMaker Container Demo
Download the GitHub folder. The container folder should show the files as shown in the image.
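The exact files depend on the repo, but the bring-your-own-container pattern usually follows the standard SageMaker layout: executable train and serve scripts copied into /opt/program. A minimal Dockerfile sketch under that assumption (the coupon/ source folder name is illustrative):

FROM python:3.7-slim

# Libraries the training and serving code depends on
RUN pip install --no-cache-dir numpy pandas scikit-learn flask gunicorn

# SageMaker starts the container with "train" or "serve" as the argument,
# so both scripts must be executable and on the PATH
COPY coupon/ /opt/program/
ENV PATH="/opt/program:${PATH}"
WORKDIR /opt/program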
The dataset is the same as we have considered for Autopilot Experiment.
The scikit-learn library is used for local training and model tuning. After several iterations, features with low importance were removed, and encoding was performed on the key features.
The final encoded features (97 labels) are stored in coupon_train.csv and are used for local training and validation.
Docker Container Build
The following steps have to be performed in order.
- Build the image
docker build -t recommend-in-vehicle-coupon:latest .
- Train the model in local mode
./train_local.sh recommend-in-vehicle-coupon:latest
- Serve the model in local mode
./serve_local.sh recommend-in-vehicle-coupon:latest
The servers are up and waiting for requests.
- Predict locally
The payload.csv file holds the features to predict against. Run the command below to get the model's response for the features in the CSV.
./predict.sh payload.csv
Once the request is accepted, the listening server responds to it.
- Push Image
Once local testing is complete, the container image that trains, deploys, and serves the model can be pushed to Amazon ECR. If any code change is made later, the final build-and-push step alone is enough.
./build_and_push.sh
The Amazon ECR images can be pulled, and containers can be run from Lambda, Amazon EKS, etc.
Lambda Function
The SageMaker API calls for training, deployment, and inference are wrapped in Lambda functions. The deployed Lambda handler is then integrated with API Gateway so that the pipeline can run for any triggered API event.
The Lambda function kept in GitHub has three major blocks.
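A sketch of how such a handler might dispatch on the request body; the key values match the request bodies shown later, while the helper functions are hypothetical stand-ins for the three blocks:

import json
import boto3

client = boto3.client("sagemaker")

def lambda_handler(event, context):
    # With API Gateway proxy integration the request body arrives as a JSON string
    body = json.loads(event.get("body") or "{}")
    key = body.get("key")
    if key == "train_data":
        result = start_training(body)   # wraps create_training_job
    elif key == "deploy_model":
        result = deploy_model(body)     # wraps create_model / create_endpoint
    else:
        result = invoke_model(body)     # wraps invoke_endpoint
    return {"statusCode": 200, "body": json.dumps(result, default=str)}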
Create SageMaker Training Function
The Lambda reads features from S3 and runs the training job.
client = boto3.client("sagemaker", region_name=region)
client.create_training_job(**create_training_params)
status = client.describe_training_job(TrainingJobName=job_name)["TrainingJobStatus"]
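The create_training_params dict is built elsewhere in the function. A sketch of its typical shape for the custom container, with placeholder URIs (job_name and role are defined elsewhere in the Lambda):

create_training_params = {
    "TrainingJobName": job_name,
    "AlgorithmSpecification": {
        "TrainingImage": "<account>.dkr.ecr.<region>.amazonaws.com/recommend-in-vehicle-coupon:latest",
        "TrainingInputMode": "File",
    },
    "RoleArn": role,
    "InputDataConfig": [{
        "ChannelName": "train",
        "ContentType": "text/csv",
        "DataSource": {"S3DataSource": {
            "S3DataType": "S3Prefix",
            "S3Uri": "s3://<bucket>/coupon/train/",
            "S3DataDistributionType": "FullyReplicated",
        }},
    }],
    "OutputDataConfig": {"S3OutputPath": "s3://<bucket>/coupon/output/"},
    "ResourceConfig": {"InstanceType": "ml.m5.large", "InstanceCount": 1, "VolumeSizeInGB": 10},
    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
}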
Create SageMaker Model and Endpoint Function
Create the model
The training job places the model artifacts in S3, and that model has to be registered with SageMaker.
Register the model in the SageMaker environment using the API call below.
create_model_response = client.create_model(
ModelName=model_name, ExecutionRoleArn=role, PrimaryContainer=primary_container
)
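The primary_container referenced above points SageMaker at the serving image and the trained artifacts; a sketch with placeholder values:

# Hypothetical container definition; the image URI and artifact path are placeholders
primary_container = {
    "Image": "<account>.dkr.ecr.<region>.amazonaws.com/recommend-in-vehicle-coupon:latest",
    "ModelDataUrl": "s3://<bucket>/coupon/output/<training-job-name>/output/model.tar.gz",
}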
Create End Point Config
response = client.create_endpoint_config(
EndpointConfigName=endpoint_config_name,
ProductionVariants=[
{
'VariantName': 'variant-1',
'ModelName': model_name,
'InitialInstanceCount': 1,
'InstanceType': 'ml.t2.medium'
}
]
)
Create End Point
response = client.create_endpoint(
EndpointName=endpoint_name,
EndpointConfigName=endpoint_config_name
)
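Endpoint creation runs asynchronously, so before the first invocation the Lambda (or a test script) can block on the built-in waiter; a small sketch:

# Wait until the endpoint reaches the InService state
waiter = client.get_waiter("endpoint_in_service")
waiter.wait(EndpointName=endpoint_name)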
Invoke SageMaker Model Function
Based on the API request body message, the endpoint will be invoked by the Lambda.
response = client.invoke_endpoint(
EndpointName=EndpointName,
Body=event_body.encode('utf-8'),
ContentType='text/csv'
)
The status of the in-service endpoint and the requests made to it can be checked in the CloudWatch logs.
Testing Stateful Real-Time Inference
Trigger SageMaker Training
Once API Gateway and Lambda have been integrated, a training job can be triggered by passing the request body below to the Lambda function.
{"key":"train_data"}
Trigger SageMaker Model and Endpoint Deployment
Once the training job is completed, deploy the model with the request body below. The training job name should be the one just created.
{"key" : "deploy_model",
"training_job" :"<training job name>"
}
Trigger SageMaker Model Endpoint
Invoke the endpoint with the request below. The features must be encoded the same way as the training data.
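The exact body depends on how the handler parses events; following the pattern of the earlier request bodies, it might look like the below (the key value and data field name here are hypothetical):

{"key" : "invoke_model",
"data" : "<encoded feature csv row>"
}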
The predicted response will be as shown below.
The events created during invocation can be viewed in the CloudWatch logs.
Conclusion
Machine learning inference can account for more than 80 percent of the operational cost of running ML workloads. SageMaker capabilities like container orchestration, multi-model endpoints, and serverless inference can save both operational and development costs. Also, event-driven training and inference pipelines can let a non-technical person from a sales or marketing team refresh both batch and real-time predictions at the click of a button, built with mechanisms like APIs and webhooks from their sales portal, on an ad hoc basis before running a campaign.