Python Application Deployment: Protocols

cwprogram

Chris White

Posted on August 17, 2023

Note: Code used in this article can be found at https://github.com/cwgem/python-deployment-protocols

There are several ways to serve dynamic content built with Python. Each of them has a different way of working with Python code to produce output that can be returned to a client. This article will look at how to work with Python using the CGI, FastCGI, SCGI, and uwsgi protocols. First off we'll look at the technology that helped the web become dynamic in the first place: CGI.

Note: I've decided to leave AJP out of this as server implementation would have taken a lot more time than I wanted to invest. It will be covered in later portions of the series though.

CGI

Common Gateway Interface (CGI) was created to bring dynamic content to the web, which in its original form merely shared information through static documents. RFC 3875 defines the standard for CGI. The essential components were a set of specific environment variables passed on to a spawned process, along with any request data on its stdin (standard input). The process would then handle the dynamic request and return the result to the server via stdout (standard output), with any errors on stderr. A very simplified example is shown here:

cgi_server.py

from http.server import ThreadingHTTPServer, BaseHTTPRequestHandler
from io import BytesIO
from multipart import MultipartParser, copy_file
from subprocess import run as process_run, PIPE
import gzip
import os

class MyHTTPHandler(BaseHTTPRequestHandler):
    HTTP_DT_FORMAT = '%a, %d %b %Y %H:%M:%S GMT'

    def write_data(self, bytes_data):
        self.send_header('Content-Encoding', 'gzip')
        return_data = gzip.compress(bytes_data)
        self.send_header('Content-Length', len(return_data))
        self.end_headers()
        self.wfile.write(return_data)

    def handle_post_request(self):
        self.log_message(f"Reading POST request from {self.client_address}")
        content_length = int(self.headers['Content-Length'])
        content_boundary = self.headers['Content-Type'].split('=')[1]

        self.data = MultipartParser(self.rfile, content_boundary, content_length)
        image_data = self.execute_cgi_process(self.data.get('image_data'))
        self.send_response(200)
        self.send_header('Content-Type', 'image/png')
        return image_data

    def execute_cgi_process(self, image_entry):
        image_buffer = BytesIO()
        size = copy_file(image_entry.file, image_buffer)
        cgi_env = {
            **os.environ,
            'SERVER_SOFTWARE': 'PyPy/3.10.12',
            'SERVER_NAME': 'raspberrypi',
            'GATEWAY_INTERFACE': 'CGI/1.1',
            'SERVER_PROTOCOL': 'HTTP/1.1',
            'SERVER_PORT': str(PORT),
            'REQUEST_METHOD': 'POST',
            'QUERY_STRING': '',
            'SCRIPT_NAME': 'cgi_application.py',
            'REMOTE_HOST': self.client_address[0],
            'CONTENT_TYPE': 'image/png',
            'CONTENT_LENGTH': str(size)
        }
        p_handle = process_run(['python', 'cgi_application.py'], input=image_buffer.getvalue(), capture_output=True, env=cgi_env)
        if p_handle.stderr:
            print(p_handle.stderr.decode())
        return p_handle.stdout

    def do_POST(self):
        bytes_data = self.handle_post_request()
        self.write_data(bytes_data)
        self.request.close()

if __name__ == "__main__":
    HOST, PORT = "localhost", 8087

    with ThreadingHTTPServer((HOST, PORT), MyHTTPHandler) as server:
        print(f'Server bound to port {PORT}')
        server.serve_forever()


cgi_application.py

import os
from io import BytesIO
from PIL import Image, ImageDraw
from sys import stdin, stdout

if __name__ == '__main__':
    image_data = stdin.buffer.read(int(os.environ['CONTENT_LENGTH']))
    cgi_headers = [
        "SERVER_SOFTWARE", "SERVER_NAME", "GATEWAY_INTERFACE",
        "SERVER_PROTOCOL", "SERVER_PORT", "REQUEST_METHOD",
        "QUERY_STRING", "SCRIPT_NAME", "REMOTE_HOST",
        "CONTENT_TYPE", "CONTENT_LENGTH"]
    header_output = '\n'.join([ os.environ[x] for x in cgi_headers ])
    with Image.open(BytesIO(image_data)) as im:
        resized_image = im.resize((800,800))
        draw = ImageDraw.Draw(resized_image)
        draw.multiline_text((10,10), header_output, fill="black")
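        # PNG data is binary, so write to stdout's underlying byte stream (text-mode stdout rejects bytes)
        resized_image.save(stdout.buffer, 'png')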

cgi_client.py

import requests
from io import BytesIO
from PIL import Image

img = Image.new('RGB', (400,400), "white")
img_bytes = BytesIO()
img.save(img_bytes, 'png')

multipart_data = {
    'image_data': ('blank-image.png', img_bytes.getbuffer(), 'image/png')
}

r = requests.post('http://127.0.0.1:8087/', files=multipart_data)
with open('image-resized.png', 'wb') as resized_image_fp:
    resized_image_fp.write(r.content)

If you want to run this locally, the following pip modules will be required:

  • Pillow
  • multipart
  • requests (imported by the client script)

You'll want to check the Pillow installation page to get a list of dependencies for the operating system itself. The client first makes a 400x400 blank PNG and then sends it off as a multipart POST request.

CGI Server

The server receives this POST request and starts to process it:

    def handle_post_request(self):
        self.log_message(f"Reading POST request from {self.client_address}")
        content_length = int(self.headers['Content-Length'])
        content_boundary = self.headers['Content-Type'].split('=')[1]

        self.data = MultipartParser(self.rfile, content_boundary, content_length)
        image_data = self.execute_cgi_process(self.data.get('image_data'))
        self.send_response(200)
        self.send_header('Content-Type', 'image/png')
        return image_data

The multipart module is then used to obtain a reference to the image data and some properties about it. Everything is then passed on to the CGI process:

    def execute_cgi_process(self, image_entry):
        image_buffer = BytesIO()
        size = copy_file(image_entry.file, image_buffer)
        cgi_env = {
            **os.environ,
            'SERVER_SOFTWARE': 'PyPy/3.10.12',
            'SERVER_NAME': 'raspberrypi',
            'GATEWAY_INTERFACE': 'CGI/1.1',
            'SERVER_PROTOCOL': 'HTTP/1.1',
            'SERVER_PORT': str(PORT),
            'REQUEST_METHOD': 'POST',
            'QUERY_STRING': '',
            'SCRIPT_NAME': 'cgi_application.py',
            'REMOTE_HOST': self.client_address[0],
            'CONTENT_TYPE': 'image/png',
            'CONTENT_LENGTH': str(size)
        }
        p_handle = process_run(['python', 'cgi_application.py'], input=image_buffer.getvalue(), capture_output=True, env=cgi_env)
        if p_handle.stderr:
            print(p_handle.stderr.decode())
        return p_handle.stdout

First, a BytesIO is used to hold the image data in a buffer. copy_file fills the BytesIO buffer with the image data and returns its size, which is used for the CONTENT_LENGTH header. The unpack operator (**) fills in the base environment, and the CGI-specific headers are added after it. The ones we utilize here are:

  • SERVER_SOFTWARE - "The name and version of the information server software making the CGI request", for which I simply put PyPy and its version. In practice this would be something like the web server or framework name
  • SERVER_NAME - Simply the hostname of the server
  • GATEWAY_INTERFACE - The CGI version, which will be CGI/1.1 a majority of the time
  • SERVER_PROTOCOL - HTTP is passed in, along with the version for the standard used (1.1)
  • SERVER_PORT - This is set to the server's listening port, which is conveniently available via the PORT constant
  • REQUEST_METHOD - This is the HTTP method being used, in this case POST
  • QUERY_STRING - There's no query string so this is just blank
  • SCRIPT_NAME - This is the name of the script. Technically it also includes path info, but I've simplified it since it's in the same directory
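  • REMOTE_HOST - The IP address of the client, which is the first part of the (host, port) tuple of client_address. RFC 3875 intends this variable for the client's hostname and defines REMOTE_ADDR for the address, but it allows the address to be substituted when no hostname is available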
  • CONTENT_TYPE - The MIME type of the content, similar to the HTTP header
  • CONTENT_LENGTH - The length of the content being passed in, similar to the HTTP header. In this case we'll pass the image on directly to the CGI application, so it's just the size of the image from copy_file
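The variable list above can be collected into a small helper. This is only a sketch: build_cgi_env and its parameters are hypothetical names that do not appear in the article's repository, and it covers just the variables discussed here rather than everything RFC 3875 describes.

```python
import os


def build_cgi_env(method, client_host, server_port, content_type,
                  content_length, script_name='cgi_application.py',
                  base_env=None):
    """Return the base environment plus the CGI variables used in this article.

    Hypothetical helper for illustration; not part of the original code.
    """
    env = dict(os.environ if base_env is None else base_env)
    env.update({
        'GATEWAY_INTERFACE': 'CGI/1.1',
        'SERVER_PROTOCOL': 'HTTP/1.1',
        'SERVER_PORT': str(server_port),        # environment values must be strings
        'REQUEST_METHOD': method,
        'QUERY_STRING': '',
        'SCRIPT_NAME': script_name,
        'REMOTE_HOST': client_host,
        'CONTENT_TYPE': content_type,
        'CONTENT_LENGTH': str(content_length),
    })
    return env


env = build_cgi_env('POST', '127.0.0.1', 8087, 'image/png', 1569,
                    base_env={'PATH': '/usr/bin'})
print(env['REQUEST_METHOD'], env['SERVER_PORT'], env['CONTENT_LENGTH'])
```

Converting the port and length with str() matters because subprocess environments only accept string values.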

CGI Process Spawn

For the actual process spawning:

        p_handle = process_run(['python', 'cgi_application.py'], input=image_buffer.getvalue(), capture_output=True, env=cgi_env)
        if p_handle.stderr:
            print(p_handle.stderr.decode())
        return p_handle.stdout

process_run is just an alias for subprocess.run, as run is a bit too generic to comfortably import as-is. The first argument is a list in which the first value is the command to run and the remaining values are its arguments. In this case python is spawned and run against the cgi_application.py script. input is simply the entire stream of bytes from the BytesIO object, obtained via getvalue(). capture_output is required as the CGI process returns its data via stdout. It also captures stderr, so any errors the process reports can be printed for debugging purposes. Finally, env is used to pass in both the parent environment and the custom CGI environment.
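The same subprocess.run call pattern can be tried in isolation. The sketch below makes a few assumptions: an inline child script stands in for cgi_application.py, the payload is placeholder bytes rather than a real image, and sys.executable is used instead of the bare python command so the child runs under the same interpreter as the parent.

```python
import os
import subprocess
import sys

# Stand-in for cgi_application.py: reads CONTENT_LENGTH bytes from stdin and
# reports the byte count plus the request method on stdout.
CHILD_CODE = """
import os, sys
body = sys.stdin.buffer.read(int(os.environ['CONTENT_LENGTH']))
sys.stdout.buffer.write(b'%d %s' % (len(body), os.environ['REQUEST_METHOD'].encode()))
"""

payload = b'not really an image'
child_env = {
    **os.environ,                      # parent environment first
    'REQUEST_METHOD': 'POST',          # then the CGI-specific values
    'CONTENT_LENGTH': str(len(payload)),
}

result = subprocess.run(
    [sys.executable, '-c', CHILD_CODE],
    input=payload,          # delivered to the child's stdin
    capture_output=True,    # collect stdout and stderr as bytes
    env=child_env,
)
if result.stderr:
    print(result.stderr.decode(), file=sys.stderr)
print(result.stdout)
```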

CGI Application

Now on the application side:

import os
from io import BytesIO
from PIL import Image, ImageDraw
from sys import stdin, stdout

if __name__ == '__main__':
    image_data = stdin.buffer.read(int(os.environ['CONTENT_LENGTH']))
    cgi_headers = [
        "SERVER_SOFTWARE", "SERVER_NAME", "GATEWAY_INTERFACE",
        "SERVER_PROTOCOL", "SERVER_PORT", "REQUEST_METHOD",
        "QUERY_STRING", "SCRIPT_NAME", "REMOTE_HOST",
        "CONTENT_TYPE", "CONTENT_LENGTH"]
    header_output = '\n'.join([ os.environ[x] for x in cgi_headers ])
    with Image.open(BytesIO(image_data)) as im:
        resized_image = im.resize((800,800))
        draw = ImageDraw.Draw(resized_image)
        draw.multiline_text((10,10), header_output, fill="black")
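        # PNG data is binary, so write to stdout's underlying byte stream (text-mode stdout rejects bytes)
        resized_image.save(stdout.buffer, 'png')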

The image data is obtained by reading CONTENT_LENGTH bytes from stdin. While this is a controlled test, a practical application would likely read the data in chunks to limit memory use (potentially spooling it to a temp file). A list of CGI-specific header values is obtained and joined with newlines. image_data is wrapped in a BytesIO container so it's in a streamable format for Image.open. The image is resized and all of our headers are drawn onto it in black text. As stdout is the destination, it's passed in as the file handle for Image.save, along with png to designate the output format. After running the server along with the client script in another shell instance:

[Image: Resulting CGI process modified image]

The image is resized to 800x800 and the text of all the CGI header values is shown to confirm the environment is passed in properly.
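The CGI application reads the whole request body with a single read call. The chunked alternative mentioned earlier could look roughly like the sketch below. copy_exactly is a hypothetical helper, the 64 KiB chunk size is an arbitrary choice, and a BytesIO stands in for sys.stdin.buffer so the example runs on its own.

```python
import tempfile
from io import BytesIO


def copy_exactly(src, dst, length, chunk_size=64 * 1024):
    """Copy up to `length` bytes from src to dst in chunks; return bytes copied."""
    remaining = length
    while remaining > 0:
        chunk = src.read(min(chunk_size, remaining))
        if not chunk:           # the stream ended before `length` bytes arrived
            break
        dst.write(chunk)
        remaining -= len(chunk)
    return length - remaining


# In the CGI application the source would be sys.stdin.buffer and the length
# would come from CONTENT_LENGTH; here an in-memory stream is used instead.
fake_stdin = BytesIO(b'x' * 200_000 + b'bytes beyond the declared length')
with tempfile.TemporaryFile() as spool:
    copied = copy_exactly(fake_stdin, spool, 200_000)
    spool.seek(0)               # rewind so the spooled data can be read back
    print(copied, len(spool.read()))
```

Only the declared number of bytes is consumed, so anything after the body stays unread, and memory use is bounded by the chunk size rather than the upload size.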

FastCGI

The problem with this manner of handling things is that a single request is mapped to a single OS process. This becomes difficult to scale, especially if the HTTP server is using worker processes to manage connections. The FastCGI protocol was created as a solution to these issues. Instead of creating a process each time, FastCGI uses a binary protocol to send the request to a long-running server. This also allows for better separation between the front-end and back-end servers.
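To give a feel for the binary protocol, here is a rough sketch of the 8-byte header that starts every FastCGI record, based on the field layout in the FastCGI specification (version, type, request id, content length, padding length, reserved, all big endian). pack_record and unpack_record are hypothetical helper names; in the rest of this article the fastcgi module handles this framing.

```python
import struct

FCGI_VERSION_1 = 1
FCGI_PARAMS = 4   # record type carrying the CGI-style environment
FCGI_STDIN = 5    # record type carrying request body data
# version, type, request id, content length, padding length, reserved
HEADER = struct.Struct('>BBHHBB')


def pack_record(record_type, request_id, content):
    """Frame content as one FastCGI record, padded to an 8-byte boundary."""
    if len(content) > 0xFFFF:
        raise ValueError('a single record holds at most 65535 content bytes')
    padding = -len(content) % 8     # padding is optional; the spec recommends it
    header = HEADER.pack(FCGI_VERSION_1, record_type, request_id,
                         len(content), padding, 0)
    return header + content + bytes(padding)


def unpack_record(data):
    """Return (type, request id, content, total bytes consumed) for one record."""
    _version, record_type, request_id, content_length, padding_length, _ = \
        HEADER.unpack_from(data)
    content = data[HEADER.size:HEADER.size + content_length]
    consumed = HEADER.size + content_length + padding_length
    return record_type, request_id, content, consumed


record = pack_record(FCGI_STDIN, 1, b'hello')
print(len(record), unpack_record(record))
```

A full request is a sequence of such records (begin request, params, stdin), which is where much of the protocol's complexity comes from.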

Nginx Setup

As an example I'll have an nginx server forward traffic to a backend Python server via the FastCGI protocol. nginx has FastCGI support by default, and for this demonstration I'll use the Debian package:

$ sudo apt-get install -y nginx

The package itself gives sensible defaults in /etc/nginx/sites-enabled/default. I'll update this file directly since it's only for the purpose of this article. If you plan to use nginx for other sites you can make a separate file such as /etc/nginx/sites-enabled/fcgi-test instead. The contents are as follows:

server {
    listen 8888 default_server;
    listen [::]:8888 default_server;
    root /var/www/html;
    index index.html index.htm index.nginx-debian.html;
    server_name _;

    location / {
        fastcgi_pass 127.0.0.1:8087;
        include fastcgi_params;
    }
}

This sets up a basic server listening on port 8888. It forwards all traffic to port 8087 on localhost, where a Python script will be set up as the target. include fastcgi_params; ensures that a number of CGI-related environment variables are set as well.

FastCGI Protocol Inspection

Now for the python script to handle the fastcgi part, the fastcgi module will need to be installed first via pip install fastcgi. First I'll do a simple server just to show what's being sent down the wire:

from fastcgi import FcgiHandler
from socketserver import ThreadingTCPServer

class MyFcgiHandler(FcgiHandler):
    def handle(self):
        print(self.environ)
        print(self['stdin'].read())
        self['stdout'].write(b'Content-Type: text/plain\r\n\r\nHello, World')

if __name__ == "__main__":
    HOST, PORT = "localhost", 8087

    with ThreadingTCPServer((HOST, PORT), MyFcgiHandler) as server:
        print(f'Server bound to port {PORT}')
        server.serve_forever()

I'll also modify the client to use the nginx port instead:

import requests
from io import BytesIO
from PIL import Image

img = Image.new('RGB', (400,400), "white")
img_bytes = BytesIO()
img.save(img_bytes, 'png')

multipart_data = {
    'image_data': ('blank-image.png', img_bytes.getbuffer(), 'image/png')
}

r = requests.post('http://localhost:8888/', files=multipart_data)
with open('image-resized.png', 'wb') as resized_image_fp:
    resized_image_fp.write(r.content)

Now I have 3 shells open, one for nginx work, one for the fastcgi server, and one for the fastcgi client:

shell 1

$ sudo systemctl reload nginx

shell 2

$ python fcgi_application.py 
Server bound to port 8087

Now that the services are up I'll run the client and check the output of the server:

$ python fcgi_client.py
{'QUERY_STRING': '', 'REQUEST_METHOD': 'POST', 'CONTENT_TYPE': 'multipart/form-data; boundary=1a29f6cbb68a9a0d01d25a8030646b94', 'CONTENT_LENGTH': '1569', 'SCRIPT_NAME': '/', 'REQUEST_URI': '/', 'DOCUMENT_URI': '/', 'DOCUMENT_ROOT': '/var/www/html', 'SERVER_PROTOCOL': 'HTTP/1.1', 'REQUEST_SCHEME': 'http', 'GATEWAY_INTERFACE': 'CGI/1.1', 'SERVER_SOFTWARE': 'nginx/1.18.0', 'REMOTE_ADDR': '::1', 'REMOTE_PORT': '33474', 'REMOTE_USER': '', 'SERVER_ADDR': '::1', 'SERVER_PORT': '8888', 'SERVER_NAME': '_', 'REDIRECT_STATUS': '200', 'HTTP_HOST': 'localhost:8888', 'HTTP_USER_AGENT': 'python-requests/2.31.0', 'HTTP_ACCEPT_ENCODING': 'gzip, deflate', 'HTTP_ACCEPT': '*/*', 'HTTP_CONNECTION': 'keep-alive', 'HTTP_CONTENT_LENGTH': '1569', 'HTTP_CONTENT_TYPE': 'multipart/form-data; boundary=1a29f6cbb68a9a0d01d25a8030646b94'}
b'--1a29f6cbb68a9a0d01d25a8030646b94\r\nContent-Disposition: form-data; name="image_data"; filename="blank-image.png"\r\nContent-Type: image/png\r\n\r\n\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x01\x90\x00\x00\x01\x90\x08\x02\x00\x00\x00\x0f\xdd\xa1\x9b\x00<snip>

So there are quite a lot of headers here, including both the CGI headers and the various HTTP headers. After that is the multipart encoded data.
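The CONTENT_TYPE value in that output carries the multipart boundary as a parameter. The servers in this article extract it with split('='), which works for the requests client used here. A slightly more defensive sketch, using only the standard library, is shown below; multipart_boundary is a hypothetical helper name and is not part of the article's repository.

```python
from email.message import Message


def multipart_boundary(content_type):
    """Extract the boundary parameter from a multipart/form-data content type."""
    message = Message()
    message['Content-Type'] = content_type
    if message.get_content_type() != 'multipart/form-data':
        raise ValueError(f'not a multipart form: {content_type!r}')
    boundary = message.get_param('boundary')   # surrounding quotes are removed
    if boundary is None:
        raise ValueError('multipart content type without a boundary')
    return boundary


print(multipart_boundary(
    'multipart/form-data; boundary=1a29f6cbb68a9a0d01d25a8030646b94'))
print(multipart_boundary(
    'multipart/form-data; charset=utf-8; boundary="x=y"'))
```

The second call shows a case where splitting on '=' would pick up the wrong value, since another parameter precedes the boundary and the boundary itself contains an equals sign.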

FastCGI Image Resize Server

Since the request carries everything we need (and more), some of the CGI server and application code can be consolidated into the FastCGI application:

from fastcgi import FcgiHandler
from io import BytesIO
from multipart import MultipartParser
from PIL import Image, ImageDraw
from socketserver import ThreadingTCPServer

HEADER_ENCODING = 'iso-8859-1'

class MyFcgiHandler(FcgiHandler):
    def handle(self):
        content_length = int(self.environ['CONTENT_LENGTH'])
        content_boundary = self.environ['CONTENT_TYPE'].split('=')[1]
        self.data = MultipartParser(self['stdin'], content_boundary, content_length)

        image_entry = self.data.get('image_data')
        self['stdout'].write(b'Content-Type: image/png\r\n')
        header_output = '\n'.join([ f'{k}: {v}' for k, v in self.environ.items() ])

        with Image.open(image_entry.file) as im:
            resized_image = im.resize((800,800))
            draw = ImageDraw.Draw(resized_image)
            draw.multiline_text((10,10), header_output, fill="black")
            # Encode the PNG in memory first so Content-Length is the real byte count
            # (resized_image.size is only the (width, height) tuple)
            png_buffer = BytesIO()
            resized_image.save(png_buffer, 'png')
            png_bytes = png_buffer.getvalue()
            self['stdout'].write(f'Content-Length: {len(png_bytes)}\r\n\r\n'.encode(HEADER_ENCODING))
            self['stdout'].write(png_bytes)

        image_entry.file.close()

if __name__ == "__main__":
    HOST, PORT = "localhost", 8087

    with ThreadingTCPServer((HOST, PORT), MyFcgiHandler) as server:
        print(f'Server bound to port {PORT}')
        server.serve_forever()

Now after running the client code again, I notice my image is modified just like the CGI version:

[Image: FastCGI modified image]

While our code still acts as a server, notice I'm not using an HTTP server anymore. Thanks to the protocol, nginx is smart enough to know the FastCGI request was successful. I can just send the content type header along with the content and be on my way. stdin and stdout are also handled slightly differently: they are accessed by dictionary key instead.
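What goes to the FastCGI stdout stream is a CGI-style response: header lines, a blank line, then the body. A minimal sketch of assembling one follows. build_cgi_response is a hypothetical helper, and the optional Status header is the CGI convention (RFC 3875) for signalling something other than a plain success.

```python
def build_cgi_response(body, content_type, status=None):
    """Assemble CGI-style response bytes: headers, a blank line, then the body."""
    headers = []
    if status is not None:
        headers.append(f'Status: {status}')    # e.g. '404 Not Found'
    headers.append(f'Content-Type: {content_type}')
    headers.append(f'Content-Length: {len(body)}')   # byte count of the body
    head = '\r\n'.join(headers) + '\r\n\r\n'
    return head.encode('iso-8859-1') + body


response = build_cgi_response(b'Hello, World', 'text/plain')
print(response)
```

In a handler like the ones above, the result would be passed to self['stdout'].write() in one call.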

Thoughts on FastCGI

Given that the actual application code is part of a server handler in a long-running process, there is more flexibility in how requests are handled, such as using threads or multiprocessing. PyPy in particular benefits from this, as it's able to utilize JIT for the handler code path. While FastCGI introduces multiplexing as a feature, not much benefit was found in implementing it on the backend versus traditional multi-socket requests. While the code is somewhat compact, the protocol itself is a bit on the complicated side.

SCGI

While FastCGI helps with some of the scaling concerns of CGI, the binary format can be complicated to work with. As an answer to this, the aptly named Simple Common Gateway Interface (SCGI) was proposed in 2008. The format is very simple to parse compared to the several binary structures of FastCGI. As with FastCGI, nginx also supports SCGI transport.

SCGI protocol

Unlike FastCGI, the SCGI protocol uses a much simpler format. The basic format is:

len:string,

where len is the length of string in bytes, written as a decimal number, and string is a list of headers. Each header name and each value is terminated by a null byte. The CONTENT_LENGTH header is then used to read in the rest of the request.
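The format can be sketched in a few lines of plain Python. This is an illustration only: encode_scgi_request and decode_scgi_request are hypothetical helpers (the server in the next section uses the scgi module instead), and the 64 KiB header limit is an arbitrary guard value. The SCGI specification requires CONTENT_LENGTH to be the first header and an SCGI header with the value 1 to be present.

```python
def encode_scgi_request(headers, body=b''):
    """Encode headers and body as an SCGI request (netstring of headers + body)."""
    # CONTENT_LENGTH must come first and SCGI=1 must be present.
    ordered = {'CONTENT_LENGTH': str(len(body)), 'SCGI': '1', **headers}
    payload = b''.join(
        name.encode('iso-8859-1') + b'\x00' + value.encode('iso-8859-1') + b'\x00'
        for name, value in ordered.items()
    )
    return str(len(payload)).encode('ascii') + b':' + payload + b',' + body


def decode_scgi_request(data, max_header_bytes=65536):
    """Split raw SCGI request bytes into a header dict and the body."""
    length_text, separator, rest = data.partition(b':')
    if not separator or not length_text.isdigit():
        raise ValueError('missing netstring length prefix')
    length = int(length_text)
    if length > max_header_bytes:
        raise ValueError('header block larger than allowed')
    if rest[length:length + 1] != b',':
        raise ValueError('netstring is not terminated by a comma')
    fields = rest[:length].split(b'\x00')[:-1]   # trailing NUL leaves an empty item
    headers = {
        name.decode('iso-8859-1'): value.decode('iso-8859-1')
        for name, value in zip(fields[::2], fields[1::2])
    }
    body_start = length + 1
    body = rest[body_start:body_start + int(headers['CONTENT_LENGTH'])]
    return headers, body


request = encode_scgi_request(
    {'REQUEST_METHOD': 'POST', 'REQUEST_URI': '/deepthought'},
    b'What is the answer to life?',
)
print(request)
print(decode_scgi_request(request))
```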

SCGI Server

I'll create a new sites-enabled entry that listens on port 9999 and forwards requests to the backend via SCGI:

server {
    listen 9999;
    listen [::]:9999;
    root /var/www/html;
    index index.html index.htm index.nginx-debian.html;
    server_name _;

    location / {
        scgi_pass 127.0.0.1:8087;
        include scgi_params;
    }
}

Only a few minor changes need to be made to support this. I'll update the python client to point to port 9999 and create an SCGI server. This will require installing the scgi module via pip install scgi:

from io import BytesIO
from multipart import MultipartParser
from PIL import Image, ImageDraw
from scgi.util import ns_reads, parse_env
from socketserver import ThreadingTCPServer, StreamRequestHandler

HEADER_ENCODING = 'iso-8859-1'

class MyHandler(StreamRequestHandler):
    def parse_body(self):
        header_segment = ns_reads(self.rfile)
        self.headers = parse_env(header_segment)

    def write_headers(self, content_length):
        response_headers = {
            'Content-Type': 'image/png',
            'Content-Length': content_length
        }
        self.wfile.write(b'HTTP/1.1 200 OK\r\n')
        for k,v in response_headers.items():
            self.wfile.write(f'{k}: {v}\r\n'.encode(HEADER_ENCODING))
        self.wfile.write(b'\r\n')

    def send_return(self):
        content_length = int(self.headers['CONTENT_LENGTH'])
        content_boundary = self.headers['CONTENT_TYPE'].split('=')[1]
        self.data = MultipartParser(self.rfile, content_boundary, content_length)

        image_entry = self.data.get('image_data')
        header_output = '\n'.join([ f'{k}: {v}' for k, v in self.headers.items() ])

        with Image.open(image_entry.file) as im:
            resized_image = im.resize((800,800))
            draw = ImageDraw.Draw(resized_image)
            draw.multiline_text((10,10), header_output, fill="black")
            # Encode the PNG in memory first so Content-Length is the real byte count
            # (resized_image.size is only the (width, height) tuple)
            png_buffer = BytesIO()
            resized_image.save(png_buffer, 'png')
            png_bytes = png_buffer.getvalue()
            self.write_headers(len(png_bytes))
            self.wfile.write(png_bytes)

        image_entry.file.close()

    def handle(self):
        self.headers = {}
        self.content = b''

        self.parse_body()
        self.send_return()
        self.finish()

if __name__ == "__main__":
    HOST, PORT = "localhost", 8087

    with ThreadingTCPServer((HOST, PORT), MyHandler) as server:
        print(f'Server bound to port {PORT}')
        server.serve_forever()

Most of the work is handled by the scgi module via ns_reads and parse_env. After that is reading in the rest of the data containing the image. Results are not much different from the FastCGI version:

[Image: A picture showing the resulting image]

Thoughts on SCGI

While the protocol is simple, header parsing does seem a bit risky. The header length does give an overall guard against reading too much, but the lack of a size attribute for each header/value means having to read until a null byte. As a protocol it is certainly much easier to work with than FastCGI, though less feature rich.

uwsgi

This protocol is a rather confusing result of odd naming decisions. The server, uWSGI, can serve content over the protocol, uwsgi. Not to mention the server itself is not exclusive to WSGI, something that will be covered in another installment of this series. It's a binary protocol much like FastCGI, but closer to the simplicity of SCGI. That said, the documentation on the protocol is somewhat sparse. There is a header segment made up of a modifier, followed by a data size, then another modifier:

<0>(1 byte)<data size>(2 bytes)<0>

While modifiers of 0 are normal for an HTTP request, the protocol defines other types as well. Following the header is a set of key/value pairs which are CGI-like in nature. These take up data size bytes and come directly after the last modifier byte in the header. Each key/value pair is laid out as:

<key size>(2 bytes)<key data>(key size bytes)<value size>(2 bytes)<value data>(value size bytes)

One thing to note here is that, unlike several network protocols, uwsgi data uses little endian byte order. I'll set up a similar nginx config for this site:

server {
    listen 9898;
    listen [::]:9898;
    root /var/www/html;
    index index.html index.htm index.nginx-debian.html;
    server_name _;

    location / {
        uwsgi_pass 127.0.0.1:8087;
        include uwsgi_params;
    }
}


The usual client is modified to use port 9898 instead and the uwsgi protocol server code looks like this:

import struct
from io import BytesIO
from multipart import MultipartParser
from PIL import Image, ImageDraw
from socketserver import ThreadingTCPServer, StreamRequestHandler

class MyHandler(StreamRequestHandler):
    def process_headers(self, header_size):
        bytes_processed = 0

        while bytes_processed < header_size:
            key_size  = struct.unpack('<H', self.rfile.read(2))[0]
            key = struct.unpack(f'<{key_size}s', self.rfile.read(key_size))[0]
            value_size  = struct.unpack('<H', self.rfile.read(2))[0]
            value = struct.unpack(f'<{value_size}s', self.rfile.read(value_size))[0]
            self.headers[key.decode('utf-8')] = value.decode('utf-8')
            bytes_processed += (2 + key_size + 2 + value_size)

    def parse_body(self):
        modifier = self.rfile.read(1)
        header_size = struct.unpack('<H', self.rfile.read(2))[0]
        modifier2 = self.rfile.read(1)
        self.process_headers(header_size)

    def write_headers(self, content_length):
        response_headers = {
            'Content-Type': 'image/png',
            'Content-Length': content_length
        }
        self.wfile.write(b'HTTP/1.1 200 OK\r\n')
        for k,v in response_headers.items():
            self.wfile.write(f'{k}: {v}\r\n'.encode('utf-8'))
        self.wfile.write(b'\r\n')

    def send_return(self):
        content_length = int(self.headers['CONTENT_LENGTH'])
        content_boundary = self.headers['CONTENT_TYPE'].split('=')[1]
        self.data = MultipartParser(self.rfile, content_boundary, content_length)

        image_entry = self.data.get('image_data')
        header_output = '\n'.join([ f'{k}: {v}' for k, v in self.headers.items() ])

        with Image.open(image_entry.file) as im:
            resized_image = im.resize((800,800))
            draw = ImageDraw.Draw(resized_image)
            draw.multiline_text((10,10), header_output, fill="black")
            # Encode the PNG in memory first so Content-Length is the real byte count
            # (resized_image.size is only the (width, height) tuple)
            png_buffer = BytesIO()
            resized_image.save(png_buffer, 'png')
            png_bytes = png_buffer.getvalue()
            self.write_headers(len(png_bytes))
            self.wfile.write(png_bytes)

        image_entry.file.close()

    def handle(self):
        self.headers = {}
        self.content = b''
        self.parse_body()
        self.send_return()
        self.request.close()

if __name__ == "__main__":
    HOST, PORT = "localhost", 8087

    with ThreadingTCPServer((HOST, PORT), MyHandler) as server:
        print(f'Server bound to port {PORT}')
        server.serve_forever()

So the parsing begins with obtaining values from the header:

    def parse_body(self):
        modifier = self.rfile.read(1)
        header_size = struct.unpack('<H', self.rfile.read(2))[0]
        modifier2 = self.rfile.read(1)
        self.process_headers(header_size)

Since the modifiers aren't really of functional concern, they are simply read in. In this case I've left the unused variables around, instead of just calling read() and discarding the result, to show how the code maps to the protocol. Next the data size is read in, with < indicating little endian and H indicating a 2 byte unsigned value. A [0] index is used as struct.unpack returns a tuple. Now for the HTTP headers:

    def process_headers(self, header_size):
        bytes_processed = 0

        while bytes_processed < header_size:
            key_size  = struct.unpack('<H', self.rfile.read(2))[0]
            key = struct.unpack(f'<{key_size}s', self.rfile.read(key_size))[0]
            value_size  = struct.unpack('<H', self.rfile.read(2))[0]
            value = struct.unpack(f'<{value_size}s', self.rfile.read(value_size))[0]
            self.headers[key.decode('utf-8')] = value.decode('utf-8')
            bytes_processed += (2 + key_size + 2 + value_size)

Since the protocol has indicators of byte length all over, we can use the number of bytes processed as our iteration terminator. 2 bytes are read in for both the size of the key and the size of the value. Unpacking the respective values is done via <[size]s, which pulls in [size] number of bytes. Finally the headers are created by decoding the respective key/value binary data so it ends up as a string. The result is still similar to the other servers:

[Image: Resulting image with headers included]
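To exercise a parser like this without nginx in front of it, a request packet can be built by hand. The sketch below mirrors the layout described earlier; build_uwsgi_packet is a hypothetical helper and the two variables are sample values only.

```python
import struct


def build_uwsgi_packet(variables, modifier1=0, modifier2=0):
    """Serialize a dict of str -> str into a uwsgi header plus key/value block."""
    block = b''
    for key, value in variables.items():
        key_bytes = key.encode('utf-8')
        value_bytes = value.encode('utf-8')
        # Each item is a 2-byte little endian size followed by the raw bytes.
        block += struct.pack('<H', len(key_bytes)) + key_bytes
        block += struct.pack('<H', len(value_bytes)) + value_bytes
    # Header: modifier1 (1 byte), block size (2 bytes, little endian), modifier2 (1 byte).
    return struct.pack('<BHB', modifier1, len(block), modifier2) + block


packet = build_uwsgi_packet({'REQUEST_METHOD': 'POST', 'CONTENT_LENGTH': '1569'})
print(packet[:4].hex(), len(packet))
```

Writing these bytes to a socket connected to the handler, followed by the request body, would stand in for what nginx sends over uwsgi_pass.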

Thoughts on uwsgi protocol

Update 08/27 The documentation has been updated across the board so the table width isn't an issue anymore. That said, it could still use some code examples (would make implementation easier).

I like the simplicity of this format, though the documentation feels rather lacking compared to other standards (there's also not enough width for the packet description table, so it's not apparent you can scroll horizontally). The one thing I really like is the emphasis on explicit byte sizes. This could be used to prevent reading in an unusually large value ahead of time by comparing the size against a specified byte limit. Having a simplified Python interface for reading and handling this would be nice to have.
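As a rough idea of what such an interface might look like, here is a sketch of a reader that checks each declared size against a limit before reading the data. read_uwsgi_vars, read_exact, and the default limits are all hypothetical choices for illustration, not an existing library API.

```python
import struct
from io import BytesIO


def read_exact(stream, count):
    """Read exactly `count` bytes or raise if the stream ends early."""
    data = stream.read(count)
    if len(data) != count:
        raise ValueError(f'expected {count} bytes, got {len(data)}')
    return data


def read_uwsgi_vars(stream, max_block=16384, max_item=4096):
    """Parse a uwsgi header and key/value block, rejecting oversized fields."""
    _modifier1, block_size, _modifier2 = struct.unpack('<BHB', read_exact(stream, 4))
    if block_size > max_block:
        raise ValueError(f'variable block of {block_size} bytes exceeds {max_block}')
    block = BytesIO(read_exact(stream, block_size))
    variables = {}
    while block.tell() < block_size:
        pair = []
        for _ in range(2):                      # key first, then value
            (size,) = struct.unpack('<H', read_exact(block, 2))
            if size > max_item:                 # checked before the data is read
                raise ValueError(f'item of {size} bytes exceeds {max_item}')
            pair.append(read_exact(block, size).decode('utf-8'))
        variables[pair[0]] = pair[1]
    return variables


# Hand-built packet: one KEY/value pair followed by bytes that belong to the body.
block = struct.pack('<H', 3) + b'KEY' + struct.pack('<H', 5) + b'value'
packet = struct.pack('<BHB', 0, len(block), 0) + block
print(read_uwsgi_vars(BytesIO(packet + b'body follows')))
```

In the handler above, self.rfile would be passed as the stream, leaving the request body unread for MultipartParser.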

Conclusion

It's amazing to see how far things have come since the CGI days. Out of all of these, the uwsgi protocol is my personal favorite. Some more Python-side support and more server availability could make it a strong contender. While I do like FastCGI, its standard is on the very verbose side and the protocol is more involved. SCGI seems good for simple projects, but the lack of stricter structure on header keys and values is a bit of a concern, as you can indirectly have user-generated data via header forwarding.
