Bundling Python Environments in a ZIP Archive
Jürgen Hermann
Posted on March 12, 2020
Shipping dependencies for your scripts as a single file, built with ‘shiv’.
The Basic Idea
If you have a set of Python scripts that all use the same set of required packages, you can distribute those dependencies as a zipapp, i.e. as a single executable file. See Building Zipapps (PEP 441) for details if you're new to the concept of zipped Python application bundles.
Unlike shipping a script in a virtualenv built within a single project, you can have a project for the base libraries and other projects for the scripts, including scripts written by end users who are just using your dependencies.
You can also deploy any PyPI package that way, with a single call to shiv, as shown in the next section using Pandas.
A Practical Example
The following example uses the well-known Pandas data science library, but this works for any project built with setuptools or any other build tool creating Python packages that declare their requirements.
So, to create your base library release artifact, install and call shiv like this:
python3.8 -m pip install --user shiv
python3.8 -m shiv -p '/usr/bin/python3.8 -IS' \
    -o ~/bin/_lib-pandas pandas==1.0.1
Do this in a virtualenv and leave out the --user option if you want to keep your account's home directory clean.
Note that we do not provide an entry point here, which means this zipapp drops into the given Python interpreter and is thus usable as an interpreter, with the contained packages available for import.
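To make the "zipapp as interpreter" idea more tangible, here is a minimal sketch that imitates it with nothing but the standard library's zipapp module. It is not shiv's actual bootstrap code: the `greeting` module, the `_lib-demo.pyz` name, and the tiny `__main__.py` that runs whatever script it is handed are all made up for illustration.

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path

# Stand-in for a bootstrap: run the script named on the command line.
# This is NOT shiv's real bootstrap code, only an illustration.
MAIN = '''\
import runpy
import sys

if len(sys.argv) < 2:
    sys.exit("usage: _lib-demo.pyz SCRIPT [ARGS...]")
sys.argv = sys.argv[1:]  # the script becomes argv[0]
runpy.run_path(sys.argv[0], run_name="__main__")
'''

# Hypothetical bundled dependency that scripts can import.
GREETING = '''\
def greet(name):
    return f"Hello, {name}!"
'''


def build_lib_zipapp(workdir: Path) -> Path:
    """Bundle the demo dependency and the stand-in bootstrap into one archive."""
    source = workdir / "lib_src"
    source.mkdir()
    (source / "greeting.py").write_text(GREETING)
    (source / "__main__.py").write_text(MAIN)
    target = workdir / "_lib-demo.pyz"
    zipapp.create_archive(source, target)
    return target


def run_script_with(archive: Path, script: Path) -> str:
    """Run script through the archive, the way a shebang line would."""
    result = subprocess.run(
        [sys.executable, str(archive), str(script)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        script = workdir / "script.py"
        # The script imports a module that only exists inside the archive.
        script.write_text('from greeting import greet\nprint(greet("zipapp"))\n')
        print(run_script_with(build_lib_zipapp(workdir), script), end="")
```

The import works because Python puts the archive itself first on sys.path when it executes a zipapp. A zipapp built by shiv additionally makes its unpacked site-packages directory importable, which is what allows packages with native code such as Pandas to load.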
Now we can exploit that behavior to write a script that uses the zipapp as its interpreter (the env call in the shebang line requires ~/bin to be on your PATH):
cat >script <<'EOF'
#! /usr/bin/env _lib-pandas
import re
import sys
from pathlib import Path
import pandas as pd
print('Using Pandas from',
      Path(pd.__file__).parent.relative_to(Path.home()),
      '\n\nPython path:')
df = pd.DataFrame(sys.path, columns=['Path'])
# regex=True is spelled out because newer Pandas releases no longer default to it
df.Path = df.Path.str.replace(f'^{re.escape(str(Path.home()))}/', '~/', regex=True)
print(df)
EOF
chmod +x script
./script
Calling the script produces the following output:
Using Pandas from .shiv/_lib-pandas_23b2…d2/site-packages/pandas

Python path:
                                                Path
0                                  ~/bin/_lib-pandas
1                              /usr/lib/python38.zip
2                                 /usr/lib/python3.8
3                     /usr/lib/python3.8/lib-dynload
4  ~/.shiv/_lib-pandas_23b2bb7d64c26139950435a64d...
If you're familiar with Pandas, you'll instantly recognize the Python path output as coming from a Pandas data frame. 🎉
The first execution of the script is a bit slow on startup, because the cache directory you see at the end of the Python path has to be populated first. shiv's bootstrapping code unpacks extension packages containing native code into the file system, so the OS can load them.
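If you want to see what has ended up in that cache, a small helper like the following can list it. This is only a sketch under assumptions: it takes ~/.shiv as the default location (as seen in the output above) and honors the SHIV_ROOT environment variable, which shiv's documentation describes as an override for that directory; the function names are made up for this example.

```python
import os
from pathlib import Path
from typing import List


def shiv_cache_root() -> Path:
    """Return the assumed cache location: SHIV_ROOT if set, else ~/.shiv."""
    return Path(os.environ.get("SHIV_ROOT") or Path.home() / ".shiv")


def cached_environments(root: Path, prefix: str = "") -> List[str]:
    """List unpacked environments below root, optionally filtered by name prefix."""
    if not root.is_dir():
        return []  # nothing has been unpacked yet
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and entry.name.startswith(prefix)
    )


if __name__ == "__main__":
    # Each rebuilt zipapp gets its own directory, so old ones can pile up here.
    for name in cached_environments(shiv_cache_root(), prefix="_lib-pandas_"):
        print(name)
```

Deleting a listed directory should be harmless as long as no script is currently running from it, since the next start simply unpacks the zipapp again and pays the startup cost once more.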
The underscore prefix in the zipapp name indicates that this is not a command humans would normally call directly. Alternatively, and especially in production, you can deploy it into e.g. /usr/local/lib/python3.8/ and then use an absolute path instead of an env call as the script's interpreter.
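For deployments that generate such scripts automatically, the absolute-path variant could be written out like this. Treat it as a sketch: `write_script` is a hypothetical helper, and the interpreter path simply combines the directory mentioned above with the zipapp name from the example.

```python
import stat
import tempfile
from pathlib import Path


def write_script(path: Path, interpreter: str, body: str) -> Path:
    """Write body to path with an absolute-path shebang and mark it executable."""
    if not interpreter.startswith("/"):
        raise ValueError("interpreter must be an absolute path")
    path.write_text(f"#!{interpreter}\n{body}")
    # Equivalent of 'chmod +x': add the execute bits to the existing mode.
    mode = path.stat().st_mode
    path.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        script = write_script(
            Path(tmp) / "script",
            "/usr/local/lib/python3.8/_lib-pandas",  # assumed install location
            "import pandas as pd\nprint(pd.__version__)\n",
        )
        print(script.read_text(), end="")
```

With an absolute path the script no longer depends on the caller's PATH, which matters for cron jobs and service units that run with a minimal environment.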