Debugging Python Code in Amazon SageMaker Locally Using Visual Studio Code and PyCharm: A Step-by-Step Guide

Introduction to Amazon SageMaker and Local Debugging

Many data scientists work on custom models and scripts. Efficient debugging and code browsing in IDEs such as Visual Studio Code (VS Code) or PyCharm enable fast model development and iteration. At the same time, data scientists need to train and deploy increasingly large machine learning models, which require substantial compute and storage resources.

Amazon SageMaker is a flexible machine learning platform for building, training, and deploying machine learning models in production. The Amazon SageMaker Python SDK supports local mode, which lets you create pipelines or estimators and deploy them to your local environment. This is a great way to test your deep learning scripts before running them in SageMaker’s managed training or hosting environments. Local mode is supported for framework images (TensorFlow, MXNet, Chainer, PyTorch, and Scikit-Learn) and for your own custom container images. See this blog for an introduction to the topic.

In this blog we show how you can run and debug your Python code and SageMaker Pipelines locally using VS Code or PyCharm Professional.

Setting Up Your Development Environment

First, let's make sure you have the prerequisites installed and ready:

Install Python, e.g. as described here or here.
Install Docker Desktop (Windows, Mac, or Linux) and docker-compose.
Install your favorite IDE, e.g. VS Code or PyCharm Professional.

If you are on Windows, I recommend using the Windows Subsystem for Linux (WSL).
The approach described here should also work on Ubuntu and other Linux systems.
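
Before moving on, it can help to confirm that the command-line tools from the list above are actually on your PATH. The following is a minimal sketch using only the Python standard library. The function name check_prerequisites is made up for this example, and it assumes the standalone docker-compose binary mentioned above (newer Docker installations may ship Compose as a docker plugin instead, which this check would not detect).

```python
import shutil
import sys


def check_prerequisites(tools=("docker", "docker-compose")):
    """Map each tool name to its resolved path on PATH, or None if it is missing."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Python {version} at {sys.executable}")
    for tool, path in check_prerequisites().items():
        print(f"{tool}: {path or 'NOT FOUND on PATH'}")
```

A tool reported as not found usually means it is either not installed or not visible in the shell (for example WSL) that you launched Python from.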

Let's check out an example SageMaker pipeline that contains multiple steps and start JupyterLab.



git clone https://github.com/aws-samples/amazon-sagemaker-local-mode/
cd amazon-sagemaker-local-mode/general_pipeline_local_debug
python3 -m venv .venv
source .venv/bin/activate
pip install jupyter
jupyter lab

Open sagemaker-pipelines-local-mode-debug.ipynb, which defines an example pipeline with multiple steps.
Make sure that your AWS credentials are set up, and set the right profile in the first cell of the notebook.
Set your SageMaker execution role in default_sagemaker_execution_role.
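
As a rough illustration of these two settings, the sketch below selects a named AWS profile through the AWS_PROFILE environment variable (which the AWS SDKs read when a session is created) and checks that a role value looks like an IAM role ARN. The helper names select_profile and is_role_arn and the example ARN are hypothetical, and the notebook's first cell may configure the profile differently.

```python
import os
import re

# Shape of an IAM role ARN: arn:<partition>:iam::<12-digit account id>:role/<name>
ROLE_ARN_PATTERN = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:role/.+$")


def select_profile(profile_name):
    """Make a named profile the default for AWS SDK sessions created afterwards."""
    os.environ["AWS_PROFILE"] = profile_name


def is_role_arn(value):
    """Return True if the value has the general shape of an IAM role ARN."""
    return bool(ROLE_ARN_PATTERN.match(value))


# Hypothetical placeholder values; replace them with your own profile and role.
select_profile("default")
default_sagemaker_execution_role = "arn:aws:iam::111122223333:role/ExampleSageMakerRole"
if not is_role_arn(default_sagemaker_execution_role):
    raise ValueError("default_sagemaker_execution_role does not look like a role ARN")
```

The check only validates the format of the ARN. It does not verify that the role exists or that it has the permissions SageMaker needs.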

Debugging SageMaker Python Scripts with VS Code

The main steps to debug with VS Code are:

Add code to start debug server
Run the SageMaker pipeline so that the container with the debug code gets started
Connect VS Code to the running container and attach the debugger to the waiting debug server

Add code to start debug server

The first pipeline step is a preprocessing step for feature engineering. We want to debug preprocessing.py while it is executed as part of the SageMaker pipeline. Let's add a hook that starts the debug server when the code runs inside a Docker container. First we install the debugpy package. Then we import debugpy and start the listener.
Note that you could also add the debugpy package to your requirements.txt, or use any other approach that makes the package available in the Python execution environment. For simplicity, we install it directly from within the Python code.



%%writefile code/preprocessing.py
...

# For vscode debugging
import sys
import subprocess
# install debugpy
subprocess.check_call([sys.executable, "-m", "pip", "install", "debugpy"])
# required only for this specific scikit learn container to avoid a package conflict
subprocess.check_call([sys.executable, "-m", "pip", "uninstall", "-y", "typing"])

# import debugpy and start the debug server
import debugpy
debugpy.listen(("0.0.0.0", 5678))  # listen on all interfaces of the container, port 5678
debugpy.wait_for_client()  # blocks execution until a client is attached
breakpoint()  # pause here once the debugger is attached
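
Because wait_for_client() blocks until a debugger attaches, you may not want the hook to run on every execution. One possible variation, sketched below, starts the debug server only when an environment variable is set. The variable name ENABLE_DEBUGPY and both helper functions are assumptions made for this example and are not part of the original sample. How the variable reaches the container (for instance through the env argument of the processor) is also an assumption to verify against your SDK version.

```python
import os
import subprocess
import sys


def debug_requested(env=None):
    """Return True if the (hypothetical) ENABLE_DEBUGPY flag is switched on."""
    env = os.environ if env is None else env
    return env.get("ENABLE_DEBUGPY", "").lower() in {"1", "true", "yes"}


def start_debug_server(host="0.0.0.0", port=5678):
    """Install debugpy, start the listener, and block until a client attaches."""
    subprocess.check_call([sys.executable, "-m", "pip", "install", "debugpy"])
    import debugpy  # imported lazily because it is installed just above

    debugpy.listen((host, port))
    debugpy.wait_for_client()


if __name__ == "__main__":
    # Without the flag the script runs straight through, with no debugger involved.
    if debug_requested():
        start_debug_server()
```

With this guard in place, the same preprocessing.py can stay in the pipeline for regular runs and only pauses for a debugger when you explicitly ask for it.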

Run the SageMaker pipeline to start the container with the debug code

We can now execute all cells of the notebook. This creates a SageMaker pipeline and runs it in local mode using Docker and docker-compose. The preprocessing job will wait for the debugger to connect.

Run → Run All Cells

The first time you run the SageMaker pipeline locally, it might take some time to download the required container images. Subsequent runs execute much faster because Docker caches the images.

Connect VS Code to the running container and attach the debugger to the waiting debug server

First, let's start VS Code. Click the Open a Remote Window button (lower left), then choose to attach to the running container.
VS Code attaches to the running container and installs VS Code Server into it. It then opens a new window that is attached to that container. You can now browse the directories of the container if you want. The preprocessing step code is located at /opt/ml/processing/input/code/preprocessing.py.
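
If several containers are running and it is unclear which one to attach to, listing them with their images can help. The sketch below wraps the docker ps command and parses its output. The function names are made up for this example, and it assumes the docker CLI is on your PATH.

```python
import subprocess


def parse_docker_ps(output):
    """Parse tab-separated `docker ps` output into a list of dicts."""
    containers = []
    for line in output.splitlines():
        if not line.strip():
            continue
        container_id, image, name = line.split("\t", 2)
        containers.append({"id": container_id, "image": image, "name": name})
    return containers


def list_running_containers():
    """Ask the local Docker daemon for the ID, image, and name of running containers."""
    output = subprocess.check_output(
        ["docker", "ps", "--format", "{{.ID}}\t{{.Image}}\t{{.Names}}"],
        text=True,
    )
    return parse_docker_ps(output)
```

Calling list_running_containers() while the preprocessing step is waiting for the debugger should show the container that local mode started, which is the one to pick in the VS Code attach dialog.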

To start debugging, we create a debug configuration for remote attach and hit the play button to run the debugger. This brings us to the breakpoint that we defined in preprocessing.py.
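
For reference, a remote-attach debug configuration lives in .vscode/launch.json. The sketch below generates one with Python. It assumes the port 5678 used in the debugpy.listen call and "localhost" as host, since the VS Code window is attached to the container itself. The "type" value is "debugpy" in recent versions of the Python Debugger extension, while older versions use "python", so treat this as a starting point and not as the exact configuration from the original walkthrough.

```python
import json
from pathlib import Path


def build_attach_config(host="localhost", port=5678):
    """Build a launch.json structure with a single remote-attach configuration."""
    return {
        "version": "0.2.0",
        "configurations": [
            {
                "name": "Python: Remote Attach",
                "type": "debugpy",  # older extension versions expect "python"
                "request": "attach",
                "connect": {"host": host, "port": port},
                "pathMappings": [
                    {"localRoot": "${workspaceFolder}", "remoteRoot": "."}
                ],
            }
        ],
    }


def write_launch_json(workspace_dir, config=None):
    """Write the configuration to <workspace_dir>/.vscode/launch.json."""
    target = Path(workspace_dir) / ".vscode" / "launch.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(config or build_attach_config(), indent=4))
    return target
```

Creating the configuration through the VS Code UI (Run and Debug, then create a launch.json file and choose Remote Attach) produces an equivalent file.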

Great, you are now able to debug your custom code step by step. The gif below shows these steps visually.

Debugging SageMaker Python Scripts with PyCharm

Debugging in PyCharm Professional works slightly differently from VS Code. Instead of the IDE connecting to a listener started by the code, in PyCharm the code connects to a debug server that has to be started in the IDE.
In the following, we debug our preprocessing.py.

The approach is the following:

Start PyCharm Debug Server
Add the debug code to your Python file
Run the SageMaker pipeline, which automatically connects to the debug server

Start PyCharm Debug Server

First we need to start PyCharm and its debug server for Python.
We create a new Python project in PyCharm and then a new debug configuration of type Python Debug Server.
See the screenshots below for a visual flow.

Add the debug code to your Python file, e.g. preprocessing.py

Let's add the code to the Python file that we want to debug.



%%writefile code/preprocessing.py
...
# For PyCharm debugging
import sys
import subprocess
subprocess.check_call([sys.executable, "-m", "pip", "install", "pydevd-pycharm~=232.10203.26"])
import pydevd_pycharm
pydevd_pycharm.settrace('host.docker.internal', port=8200, stdoutToServer=True, stderrToServer=True) # host.docker.internal to route from container to your host
breakpoint()

Make sure that you have specified the right version of the pydevd-pycharm package (e.g. pydevd-pycharm~=232.10203.26). The version needs to match the version shown in the debug configuration screen of PyCharm.

Note that you could also add the specific pydevd-pycharm~=232.10203.26 package to your requirements.txt, or use any other approach that makes the package available in the Python execution environment. For simplicity, we install it directly from within the Python code.
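
To avoid typos when the PyCharm version changes, the requirement string can be built from the build number in one place. This is a small sketch, and the helper name pydevd_requirement is hypothetical. It only formats and sanity-checks the string, so it cannot tell whether the number really matches your PyCharm installation.

```python
import re


def pydevd_requirement(pycharm_build):
    """Build the pip requirement for a PyCharm build number such as '232.10203.26'."""
    if not re.fullmatch(r"\d+(\.\d+)+", pycharm_build):
        raise ValueError(f"Unexpected PyCharm build number: {pycharm_build!r}")
    return f"pydevd-pycharm~={pycharm_build}"


# Example with the build number used in this post; replace it with your own.
requirement = pydevd_requirement("232.10203.26")
print(requirement)
```

The resulting string can be passed to the pip install call in preprocessing.py or written to requirements.txt.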

Run the SageMaker pipeline, which automatically connects to the debug server

Now we can create and run the pipeline by executing all the cells in the notebook. This creates a SageMaker pipeline and runs it in local mode using Docker and docker-compose. The preprocessing job will now connect to the PyCharm debug server.

Run → Run All Cells

Once the container with the preprocessing step is running, it connects to the PyCharm debug server and we can debug our code. We just have to tell PyCharm whether it should auto-detect the location of our Python source file (preprocessing.py) or download the file from the running container. The animation with screenshots below shows this process.
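
If the container does not connect, a common cause is that the debug server is not reachable at the host and port passed to settrace. The sketch below is a generic TCP reachability probe built on the standard library. The function name is made up for this example, and the values host.docker.internal and 8200 are taken from the settrace call above. Note that the probe opens and immediately closes a connection, which may show up briefly in the PyCharm debug console.

```python
import socket


def debug_server_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and host names that do not resolve.
        return False


if __name__ == "__main__":
    # Run inside the container to test the route back to the IDE on the host.
    print(debug_server_reachable("host.docker.internal", 8200))
```

A result of False from inside the container suggests checking that the debug server is started in PyCharm, that the port matches the debug configuration, and that host.docker.internal resolves in your Docker setup.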

In the same way you can also debug all the other steps of the pipeline, e.g. training (abalone.py), evaluation.py or inference.py.

Conclusion

We showed how you can run a full SageMaker pipeline with preprocessing, training, evaluation, and inference steps in local mode and how to debug it locally with VS Code or PyCharm Professional. This allows you to iterate quickly on your code.

The same pipeline can also be executed in the cloud by switching a single parameter (run_locally=False), which allows you to use additional compute and storage resources on AWS.
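
How the notebook implements this switch is not shown here, so the following is only a sketch of one way such a flag could select the session type. It assumes a SageMaker Python SDK version that provides LocalPipelineSession and PipelineSession in sagemaker.workflow.pipeline_context. The function names are hypothetical.

```python
def session_class_name(run_locally):
    """Pick the name of the pipeline session class for local or cloud execution."""
    return "LocalPipelineSession" if run_locally else "PipelineSession"


def make_pipeline_session(run_locally=True):
    """Create the matching session object from the SageMaker Python SDK."""
    # Imported lazily so the module can be loaded without the SDK installed.
    from sagemaker.workflow import pipeline_context

    session_class = getattr(pipeline_context, session_class_name(run_locally))
    return session_class()
```

The session returned by make_pipeline_session would then be passed to the processors, estimators, and the pipeline itself, so that one flag decides where every step runs.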
