Python Executable Packaging With pex

cwprogram

Chris White

Posted on August 25, 2023

pex stands for Python EXecutable, and is a way to produce an easy-to-distribute Python package. One important thing to note is that pex doesn't have reliable Windows support, so you'll want to run it on *NIX systems. This article showcases some of the things you can do with pex to make distributing different types of Python projects easier.

Basic Usage

Given that the python interpreter being used for pex packaging matters, it's highly recommended to utilize a virtual environment. As an example I'll use a python 3.11 environment:

$ virtualenv --python=python3.11 venv
$ source venv/bin/activate
$ python -m pip install pex
Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple
Collecting pex
  Using cached https://www.piwheels.org/simple/pex/pex-2.1.144-py2.py3-none-any.whl (2.9 MB)
Installing collected packages: pex
Successfully installed pex-2.1.144

The general format for pex CLI execution is:

pex [MODULES] [OPTIONS]

where [MODULES] is a space-separated list of modules, each written as a pip-style requirement string:

$ pex "requests" "setproctitle==1.3.2" "uvicorn[standard]"
Python 3.11.4 (main, Aug 17 2023, 03:18:09) [GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>>

Without any other options pex will drop into an interactive shell and the modules provided will be available within:

>>> import requests
>>> import setproctitle
>>> import uvicorn
>>> 

After closing out of the console we can see that the virtual environment packages are not affected at all:

$ pip list
Package    Version
---------- -------
pex        2.1.144
pip        23.2.1
setuptools 65.5.0
$

Requirements Management

As listing out each module is generally not ideal, two alternative methods can be used to pass in requirements. The first is to use a requirements.txt file:

requirements.txt

requests
setproctitle==1.3.2
uvicorn[standard]

Then pex can be run with the -r option and the requirements.txt file passed in:

$ pex -r requirements.txt 
Python 3.11.4 (main, Aug 17 2023, 03:18:09) [GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> 

The -r argument can also be passed multiple times in case you are bundling multiple projects. If you already have a virtual environment set up, you can pass the output of pip freeze to pex:

$ pex $(pip freeze)
Python 3.11.4 (main, Aug 17 2023, 03:18:09) [GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>>

The requirements.txt method is a good fit when you have a lot of modules to work with. pip freeze is a good fit when a virtualenv is already set up.
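If the pex call is driven from a build script, the same requirement strings can be assembled in Python. The following is only a sketch: parse_requirements and pex_command are hypothetical helper names, and the parser handles just blank lines and full-line comments, which is far less than pip's real requirements format supports.

```python
import shlex

# Same contents as the requirements.txt shown earlier.
REQUIREMENTS_TXT = """\
# web stack
requests
setproctitle==1.3.2
uvicorn[standard]
"""


def parse_requirements(text):
    """Return requirement strings from requirements.txt-style text.

    Simplified on purpose: only blank lines and lines starting with '#'
    are skipped. pip's real format has more features than this.
    """
    requirements = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        requirements.append(line)
    return requirements


def pex_command(requirements, extra_args=()):
    """Build an argv list of the form: pex [MODULES] [OPTIONS]."""
    return ["pex", *requirements, *extra_args]


reqs = parse_requirements(REQUIREMENTS_TXT)
command = pex_command(reqs)

# shlex.join quotes arguments such as uvicorn[standard] for safe pasting
# into a shell; pass the list itself to subprocess.run to execute it.
print(shlex.join(command))
```

Keeping the command as a list avoids shell quoting problems with extras such as uvicorn[standard].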

Python Project Structured Modules

pex also supports python packages as modules which have a structure similar to the basic one in the python packaging docs. For this example I'll be using the project layout in this git repository. It includes a basic layout with a README, LICENSE, a simple module, and a pyproject.toml. This is enough for it to be recognized by pex much like a development mode pip install:

$ pex .
Python 3.11.4 (main, Aug 17 2023, 03:18:09) [GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> from simple_pex import simple_math
>>> simple_math(3,4)
7
>>>

This was all made possible without having to build the project itself.

Resource Directories

pex can also add in directories for important items such as test data and configurations. In the app repository there's a resources directory which contains a test_data.json file that looks like this:

{
    "a": 1,
    "b": 2
}

We can use pex with the -D argument to add a specific directory for bundling. Then it can be used within the script/interactive prompt like so:

$ pex . -D resources
Python 3.11.4 (main, Aug 17 2023, 03:18:09) [GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> from simple_pex import simple_math
>>> import json
>>> fp = open('resources/test_data.json', 'r')
>>> data = json.load(fp)
>>> fp.close()
>>> simple_math(data['a'], data['b'])
3
>>> 

As you can see, the JSON data is loaded in and then passed to the simple_math function where the proper result is returned.
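In a script, as opposed to the interactive prompt, the same load is usually written with a context manager so the file is closed even if parsing fails. The sketch below makes two assumptions: simple_math is a stand-in that adds its arguments (consistent with the outputs shown, but not taken from the repository), and the JSON file is written to a temporary directory so the example runs on its own.

```python
import json
import tempfile
from pathlib import Path


def simple_math(a, b):
    # Hypothetical stand-in for simple_pex.simple_math; assumed to add.
    return a + b


def load_operands(path):
    """Read the 'a' and 'b' values from a JSON file."""
    with open(path, "r", encoding="utf-8") as fp:
        data = json.load(fp)
    return data["a"], data["b"]


with tempfile.TemporaryDirectory() as tmp:
    # Recreate resources/test_data.json with the contents shown above.
    resource = Path(tmp) / "resources" / "test_data.json"
    resource.parent.mkdir()
    resource.write_text(json.dumps({"a": 1, "b": 2}), encoding="utf-8")

    a, b = load_operands(resource)
    result = simple_math(a, b)
    print(result)
```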

Entry Points

Python packages can declare an entry point so that they run like a regular program. For this example I'll be using code hosted in this repository. What makes this work is the declaration of a console script like so:

[project.scripts]
adder = "cli_pex:run"

This will produce a script called "adder" which executes the run function from the cli_pex package:

import argparse

def run():
    parser = argparse.ArgumentParser()
    parser.add_argument("--integer1", type=int, help="First Integer")
    parser.add_argument("--integer2", type=int, help="Second Integer")
    args = parser.parse_args()
    print(args.integer1 + args.integer2)

While not a very practical program it gets the job done of showing off how pex works with console scripts. To showcase this:

$ pex . -o adder.pex -c adder
$ ./adder.pex --integer1 3 --integer2 4
7

Using -c tells pex we want to use the adder script defined in pyproject.toml. Now when we package everything it acts just like a basic program. There's also an option to utilize fixed arguments so only the execution of the .pex file is necessary:

$ pex . -o adder.pex -c adder --inject-args "--integer1 3 --integer2 4"
$ ./adder.pex 
7

This is useful for making easy to deploy server scripts which take arguments such as bind ports and hostnames.
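To see what fixed arguments mean for the program itself, here is a variant of the adder that accepts an explicit argument list. This is a sketch rather than the repository's code: the argv parameter, the return value, and required=True are additions for illustration, and it assumes the injected string reaches the program split into separate arguments the way a shell would split it.

```python
import argparse
import shlex


def build_parser():
    parser = argparse.ArgumentParser(prog="adder")
    # required=True is an addition: without it, a missing flag leaves None
    # in the namespace and the addition below raises a TypeError.
    parser.add_argument("--integer1", type=int, required=True, help="First Integer")
    parser.add_argument("--integer2", type=int, required=True, help="Second Integer")
    return parser


def run(argv=None):
    """Parse argv (defaults to sys.argv[1:]) and print the sum."""
    args = build_parser().parse_args(argv)
    total = args.integer1 + args.integer2
    print(total)
    return total


# The string passed to --inject-args, split shell-style into arguments.
injected = shlex.split("--integer1 3 --integer2 4")
result = run(injected)
```

Because argv defaults to None, the function still works unchanged as a console script target, where argparse falls back to sys.argv.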

Docker Deployments

To put this all together I'll make a Docker deployment of a pex web application. It will bundle gunicorn with a flask app which will act as the entrypoint for the container. The code that's used in this example can be found here. In this setup there is a simple flask app, a gunicorn configuration file, and a Dockerfile to enable deployment. This time the pyproject.toml declares some dependencies:

dependencies = [
    "flask",
    "gunicorn",
    "setproctitle",
]

Another thing to consider is that pex needs the system doing the packaging to be fairly close to the target system. That means I'll build on an Ubuntu box and base my container on Debian (slimmer, and close enough system-wise). A few other things need to be done:

  • The pex executable needs to point to the gunicorn console script to run the server
  • gunicorn config file will need to be copied over to the system
  • --inject-args will need to have the --config argument set to the gunicorn config
  • The resulting .pex file will need to be set as an entry point
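
The gunicorn configuration file is itself a Python module. The repository's file is not reproduced in this article, so the following is only a guess at a minimal version: bind, workers, and worker_class are real gunicorn settings, while the values are assumptions chosen to match the startup log in this section (listening on 0.0.0.0:8000, sync worker, two workers booted).

```python
# gunicorn.config.py (hypothetical contents, not copied from the repository)

# Address and port to listen on; the port matches EXPOSE 8000 in the Dockerfile.
bind = "0.0.0.0:8000"

# Number of worker processes; two is assumed from the two
# "Booting worker" lines in the log output.
workers = 2

# gunicorn's default synchronous worker type.
worker_class = "sync"
```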

Looking over the requirements, the resulting pex call will be:

pex . -o web_pex.pex -c gunicorn --inject-args "--config /home/gunicorn/app/gunicorn.config.py"

While the Dockerfile will look like:

FROM python:3.11.4-bullseye

USER root
RUN useradd -d /home/gunicorn -r -m -U -s /bin/bash gunicorn

USER gunicorn
RUN mkdir /home/gunicorn/app
COPY config/gunicorn.config.py /home/gunicorn/app
COPY web_pex.pex /home/gunicorn/app

ENTRYPOINT /home/gunicorn/app/web_pex.pex
EXPOSE 8000

Given that my interpreter building the .pex bundle is python 3.11, I set that as the base image. Now all that remains is to build the Dockerfile and then run the resulting image:

$ docker buildx build  -f Dockerfile -t flask/web-pex:latest .
$ docker run -it -p 8000:8000 flask/web-pex:latest
[2023-08-25 00:13:11 +0000] [7] [INFO] Starting gunicorn 21.2.0
[2023-08-25 00:13:11 +0000] [7] [INFO] Listening at: http://0.0.0.0:8000 (7)
[2023-08-25 00:13:11 +0000] [7] [INFO] Using worker: sync
[2023-08-25 00:13:11 +0000] [8] [INFO] Booting worker with pid: 8
[2023-08-25 00:13:11 +0000] [9] [INFO] Booting worker with pid: 9

This will run the newly created flask/web-pex:latest image and expose port 8000. Now to test with curl:

$ curl http://127.0.0.1:8000
Hello World

Thanks to setproctitle the process list also comes out cleaner:

$ ps aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
gunicorn       1  0.0  0.0   2480   512 pts/0    Ss+  00:13   0:00 /bin/sh -c /home/gunicorn/app/web_pex.pex
gunicorn       7  4.5  0.2  53904 48244 pts/0    S+   00:13   0:00 gunicorn: master [gunicorn]
gunicorn       8  1.1  0.3  63244 52084 pts/0    S+   00:13   0:00 gunicorn: worker [gunicorn]
gunicorn       9  0.6  0.3  62024 51644 pts/0    S+   00:13   0:00 gunicorn: worker [gunicorn]
gunicorn      10  0.5  0.0   6052  3784 pts/1    Ss   00:13   0:00 /bin/bash
gunicorn      17  0.0  0.0   8648  3276 pts/1    R+   00:13   0:00 ps aux

This makes it easier to discern the various gunicorn processes on the container.

pex Tools

Another interesting feature is that pex also has some tools available which let us create a more performant docker image. To make this work we need to add --include-tools to the pex build command:

$ pex . -o web_pex.pex -c gunicorn --inject-args "--config /home/gunicorn/app/gunicorn.config.py" --include-tools

The Dockerfile will also be updated to a multi-stage build to produce a finalized image:

FROM python:3.11.4-bullseye as deps
RUN mkdir -p /home/gunicorn/app
COPY web_pex.pex /home/gunicorn/
RUN PEX_TOOLS=1 /usr/local/bin/python3.11 /home/gunicorn/web_pex.pex venv --scope=deps --compile /home/gunicorn/app

FROM python:3.11.4-bullseye as srcs
RUN mkdir -p /home/gunicorn/app
COPY web_pex.pex /home/gunicorn
COPY config/gunicorn.config.py /home/gunicorn/app
RUN PEX_TOOLS=1 /usr/local/bin/python3.11 /home/gunicorn/web_pex.pex venv --scope=srcs --compile /home/gunicorn/app

FROM python:3.11.4-bullseye
RUN useradd -d /home/gunicorn -r -m -U -s /bin/bash gunicorn
COPY --from=deps --chown=gunicorn:gunicorn /home/gunicorn/app /home/gunicorn/app
COPY --from=srcs --chown=gunicorn:gunicorn /home/gunicorn/app /home/gunicorn/app
USER gunicorn
ENTRYPOINT /home/gunicorn/app/pex
EXPOSE 8000

This separates the dependency and source compilation into their own stages. When Python compiles a module it produces interpreter-specific bytecode; doing this at build time means it doesn't have to happen at runtime, which mainly helps startup time. The only change to the docker build is a different Dockerfile, while the run command stays the same:

$ docker buildx build  -f Dockerfile_pex_tools -t flask/web-pex:latest .
$ docker run -it -p 8000:8000 flask/web-pex:latest
[2023-08-25 01:25:47 +0000] [7] [INFO] Starting gunicorn 21.2.0
[2023-08-25 01:25:47 +0000] [7] [INFO] Listening at: http://0.0.0.0:8000 (7)
[2023-08-25 01:25:47 +0000] [7] [INFO] Using worker: sync
[2023-08-25 01:25:47 +0000] [8] [INFO] Booting worker with pid: 8
[2023-08-25 01:25:47 +0000] [9] [INFO] Booting worker with pid: 9

Looking inside the container you can see the layout of pex in the ~/app directory of the gunicorn user:

$ cd ~/app
$ ls
PEX-INFO  __main__.py  __pycache__  bin  gunicorn.config.py  include  lib  lib64  pex  pyvenv.cfg

The cache files also carry timestamps earlier than the time the gunicorn workers spawned, which shows that they are precompiled output and not files Python generated on first import:

$ ls -lah lib/python3.11/site-packages/flask/__pycache__/
total 388K
drwxr-xr-x 2 gunicorn gunicorn 4.0K Aug 25 01:03 .
drwxr-xr-x 4 gunicorn gunicorn 4.0K Aug 25 01:03 ..
-rw-r--r-- 1 gunicorn gunicorn 4.0K Aug 25 01:03 __init__.cpython-311.pyc
-rw-r--r-- 1 gunicorn gunicorn  249 Aug 25 01:03 __main__.cpython-311.pyc
-rw-r--r-- 1 gunicorn gunicorn  86K Aug 25 01:03 app.cpython-311.pyc
-rw-r--r-- 1 gunicorn gunicorn  32K Aug 25 01:03 blueprints.cpython-311.pyc
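
The interpreter-specific naming of those .pyc files can be reproduced with the standard library. This sketch uses py_compile on a throwaway module to illustrate the idea of ahead-of-time bytecode compilation; it is not a claim about what pex runs internally when --compile is given, and the precompile helper name is made up for the example.

```python
import importlib.util
import py_compile
import sys
import tempfile
from pathlib import Path


def precompile(source_path):
    """Compile one source file to bytecode and return the .pyc path."""
    return py_compile.compile(str(source_path), doraise=True)


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "app.py"
    source.write_text("print('Hello World')\n", encoding="utf-8")

    pyc_path = Path(precompile(source))
    # Where the import system will look for cached bytecode of app.py.
    expected = Path(importlib.util.cache_from_source(str(source)))
    pyc_exists = pyc_path.exists()

    # The tag (cpython-311 in the listing above) ties the file to one
    # interpreter version.
    cache_tag = sys.implementation.cache_tag
    print(pyc_path.name)
```

Since the tag is part of the file name, bytecode compiled by one interpreter version is not picked up by another, which is one more reason to keep the build and runtime interpreters aligned.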

Conclusion

This concludes a look at using pex for packaging python code. It's an interesting system and judging from a GitHub issue also has the potential for reproducible builds. Having tools enabled allows for both an easy to work with single package deploy while at the same time enabling a more performant option via multi-stage compilation. I encourage taking a look to see how it can enhance your python projects.
