Juan Benito
Posted on June 7, 2020
Many Python developers deploy their applications with Gunicorn, one of the most widely used Python WSGI servers. To better understand how these servers work under the hood, I decided to build one of my own!
The server has the following main features:
- Clean code: The project is educational and aims to show how the WSGI specification works under the hood, so the code is as explicit as possible. It is decoupled into three layers: network server, WSGI (the WSGI specification) and http2 (the domain entities).
- It has to work!: The server must be able to receive incoming requests, load the Python application and return a valid response.
This project is not production ready and may contain bugs. It also doesn't cover every use case or corner case, so it is not safe to deploy your applications with it.
Funicorn
I decided to create Funicorn as the not-production-ready version of Gunicorn.
Overall Architecture
The architecture of the project is pretty simple. It contains three components:
- Network components (Server and Request Handler): They handle the low-level interaction with the OS.
- WSGI layer: It follows the WSGI specification and is able to talk to the Python application.
- Python application: It contains the business logic you want your application to perform. It usually uses a framework such as Flask or Django.
Main Application
The main application is our server entry point. It receives all the configuration needed to run your application, and it is also the place where the server's lifecycle is managed.
The Funicorn class receives only the information it needs to work. It passes the application path (module and object name) to the handler and wraps the server's features.
# funicorn/main.py
from socketserver import TCPServer


class Funicorn:
    def __init__(self, app_path: str, app_obj: str, host: str, port: int):
        # Stored on the handler class because TCPServer creates a new handler per request
        FunicornHandler.app_path = (app_path, app_obj)
        self.server = TCPServer((host, port), FunicornHandler)

    def run(self):
        self.server.serve_forever()

    def stop(self):
        self.server.shutdown()
        self.server.server_close()
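The article does not show how the (app_path, app_obj) pair is turned into a callable, so here is a minimal sketch of one way it could be done with importlib. The helper name load_wsgi_callable is hypothetical and is not taken from the Funicorn code base; the real load_application may do more, such as wrapping the result in the WSGI layer.

```python
import importlib
from typing import Callable


def load_wsgi_callable(module_path: str, attribute: str) -> Callable:
    """Import `module_path` and return the callable stored under `attribute`.

    Hypothetical helper: a stand-in for whatever load_application does.
    """
    module = importlib.import_module(module_path)
    candidate = getattr(module, attribute)
    if not callable(candidate):
        raise TypeError(f"{module_path}:{attribute} is not callable")
    return candidate


if __name__ == "__main__":
    # The standard library ships a small demo WSGI app we can load by name.
    app = load_wsgi_callable("wsgiref.simple_server", "demo_app")
    print(app.__name__)
```

With a Flask project the same call would look like load_wsgi_callable("examples.flask_example", "app"), assuming that module is importable from the current working directory.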
Server
The network server handles the low-level network requirements. It establishes, keeps and closes the network connection: it opens the network socket, binds it to the given host and port, and channels the incoming data to the handler.
To avoid building the server from scratch and handling the network connection myself, I used TCPServer from Python's socketserver module, which provides a few server classes with the basic features you would need.
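To make the TCPServer lifecycle concrete, the following is a small self-contained sketch, not Funicorn's code. UpperCaseHandler and round_trip are hypothetical names used only for illustration. It also shows one detail that is easy to miss: serve_forever() blocks, so shutdown() has to be called from a different thread.

```python
import socket
import threading
from socketserver import BaseRequestHandler, TCPServer


class UpperCaseHandler(BaseRequestHandler):
    """Hypothetical handler: replies with the received bytes in upper case."""

    def handle(self):
        data = self.request.recv(1024)  # self.request is the connected socket
        self.request.sendall(data.upper())


def round_trip(payload: bytes) -> bytes:
    """Start a server on a free port, send one message, then shut it down."""
    server = TCPServer(("127.0.0.1", 0), UpperCaseHandler)  # port 0: the OS picks one
    host, port = server.server_address
    worker = threading.Thread(target=server.serve_forever, daemon=True)
    worker.start()
    try:
        with socket.create_connection((host, port), timeout=5) as client:
            client.sendall(payload)
            reply = client.recv(1024)
    finally:
        server.shutdown()      # blocks until serve_forever() has returned
        server.server_close()  # releases the listening socket
        worker.join(timeout=5)
    return reply
```

The socket is already bound and listening once the TCPServer constructor returns, which is why the client can connect right after the worker thread is started.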
Request Handler
The Funicorn handler takes care of every request the server receives. It loads the Python application, processes the data received by the server and returns the appropriate response.
My handler inherits from BaseRequestHandler, a base class provided by Python's socketserver module.
# funicorn/main.py
from socketserver import BaseRequestHandler
from typing import Optional, Tuple


class FunicornHandler(BaseRequestHandler):
    app_path: Optional[Tuple[str, str]] = None  # set by Funicorn before serving

    def handle(self):
        application = load_application(self.app_path[0], self.app_path[1])
        # Single read: a request larger than 1024 bytes is truncated
        http_request = self.request.recv(1024)
        request: Request = RequestDTO(http_request).request
        response: Response = application.process_request(request)
        response_data = ResponseDTO(response).http_data
        self.request.sendall(response_data)
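RequestDTO is not shown in the article, so the sketch below only illustrates the kind of work it has to do: splitting raw HTTP/1.1 bytes into a method, a path, a query string, headers and a body. The function name parse_http_request is hypothetical, and the sketch assumes the whole request arrived in a single buffer. Header names are normalized to the CONTENT_TYPE style, which is an assumption based on the keys the WSGI layer later looks up.

```python
from typing import Dict, Tuple


def parse_http_request(raw: bytes) -> Tuple[str, str, str, Dict[str, str], bytes]:
    """Split raw HTTP/1.1 bytes into (method, path, query, headers, body)."""
    # The blank line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    request_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    method, target, _protocol = request_line.split(" ", 2)
    path, _, query = target.partition("?")
    headers = {}
    for line in header_lines:
        # Split on the first colon only, so values such as "localhost:8000" survive.
        name, _, value = line.partition(":")
        headers[name.strip().upper().replace("-", "_")] = value.strip()
    return method, path, query, headers, body
```

A real server also needs to keep reading from the socket until Content-Length bytes of body have arrived, which this sketch leaves out.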
Request and Response
To work easily with HTTP and network entities, I created Request and Response classes holding all the necessary information.
These classes simply isolate the request and response data, such as the headers, the method used and the body.
The body must be a file-like object because it is passed to the application. I modeled it with BytesIO and implemented only read and write; PEP 3333 also expects readline, readlines and __iter__ on the input stream, which this simplified version omits.
# funicorn/http2.py
import io
from typing import Dict, List, Optional

# Header (a key/value pair) is defined elsewhere in the project.


# Body comes first because the annotations in Request and Response refer to it
class Body:
    def __init__(self, data: bytes = None):
        self._data = io.BytesIO(data) if data else io.BytesIO()

    def read(self):
        # Returns the whole buffer, regardless of the stream position
        return self._data.getvalue()

    def write(self, data: bytes):
        return self._data.write(data)

    def __len__(self):
        return len(self._data.getvalue())


class Request:
    def __init__(self, method: str, uri: str, headers: Dict[str, Header], body: Body,
                 query_param: Optional[str], server_name: str, ssl=False,
                 protocol: str = 'HTTP/1.1'):
        self.method = method
        self.headers = headers
        self.body = body
        self.uri = uri
        self.server_name = server_name
        self.query_param = query_param or ''
        self.scheme = 'https' if ssl else 'http'
        self.protocol = protocol


class Response:
    def __init__(self, status: str, headers: List[Header], body: Body, protocol: str):
        self.status = status
        self.headers = headers
        self.body = body
        self.protocol = protocol
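On the way out, a Response has to become raw bytes again. ResponseDTO is not shown in the article, so this is only a sketch of the serialization step under simple assumptions: headers are plain (name, value) tuples rather than the project's Header objects, and serialize_response is a hypothetical name.

```python
from typing import List, Tuple


def serialize_response(status: str, headers: List[Tuple[str, str]], body: bytes,
                       protocol: str = "HTTP/1.1") -> bytes:
    """Render a status line, header lines and a body as raw HTTP bytes."""
    lines = [f"{protocol} {status}"]
    lines += [f"{name}: {value}" for name, value in headers]
    # Every line ends with CRLF, and an empty line closes the head.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("iso-8859-1") + body
```

The status argument carries both the code and the reason phrase (for example "200 OK"), which is the same string a WSGI application passes to start_response.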
WSGI
The WSGIApplication entity is the layer that wraps the Python application and the WSGI specification for the rest of the server. It receives a request entity, passes it to the Python application and returns the processed response.
The main method is process_request, where the request is converted into the environ dictionary and passed to the application. The response is then built with the help of the start_response callable (defined in the specification) and returned to the handler.
# funicorn/wsgi.py
from typing import Callable, List, Optional, Tuple


class WSGIApplication:
    def __init__(self, application: Callable):
        self.application = application
        self._request: Optional[Request] = None
        self._response_builder = ResponseBuilder()

    def process_request(self, request: Request) -> Response:
        self._request = request
        environ = WSGIEnvironment(request).environ
        # The application calls start_response and returns an iterable of bytes
        http_body = self.application(environ, self.start_response)
        self.set_response_body(http_body)
        response: Response = self._response_builder.build()
        return response

    # set_response_metadata and set_response_body are omitted from this excerpt
    def start_response(self, status, response_headers: List[Tuple[str, str]], exc_info=None):
        if not exc_info:
            self.set_response_metadata(status, response_headers)
        else:
            raise exc_info[1].with_traceback(exc_info[2])
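To see the calling convention in isolation, here is a minimal sketch of both sides of the WSGI contract with no network involved. hello_app plays the application and call_wsgi_app plays the server; both names are hypothetical and neither is part of Funicorn.

```python
from typing import Callable, Dict, List, Tuple


def hello_app(environ: Dict, start_response: Callable):
    """A tiny WSGI application: announce status and headers, return an iterable of bytes."""
    body = f"Hello from {environ['PATH_INFO']}".encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_wsgi_app(app: Callable, environ: Dict) -> Tuple[str, List[Tuple[str, str]], bytes]:
    """Play the server role: invoke the app and collect status, headers and body."""
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = list(response_headers)
        # The spec expects a write(data) callable back; this sketch ignores writes.
        return lambda data: None

    result = app(environ, start_response)
    try:
        body = b"".join(result)  # the app returns an iterable of byte strings
    finally:
        if hasattr(result, "close"):
            result.close()       # the server must call close() when it exists
    return captured["status"], captured["headers"], body
```

Calling call_wsgi_app(hello_app, {"REQUEST_METHOD": "GET", "PATH_INFO": "/hola"}) returns the status string, the header list and the joined body, which is everything a server needs to write the HTTP response.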
All the information provided by the request is wrapped in the environ dictionary in the following way.
There are many variables you need to provide to the Python application for it to work correctly. They are defined in PEP 3333, and many of them are mandatory.
Many of these variables come from the HTTP protocol, and each one gives the application important information. I have only implemented the mandatory ones.
I won't explain these variables in more detail, because there is already plenty of information about them on the web.
# funicorn/wsgi.py
import sys


class WSGIEnvironment:
    # __init__ is omitted from this excerpt

    def set_cgi_environ(self):
        self.environ['REQUEST_METHOD'] = self._request.method
        self.environ['SCRIPT_NAME'] = ''
        self.environ['PATH_INFO'] = self._request.uri
        self.environ['QUERY_STRING'] = self._request.query_param
        self.environ['SERVER_NAME'] = self._request.server_name
        self.environ['SERVER_PROTOCOL'] = self._request.protocol
        self.environ['SERVER_PORT'] = '8000'  # hard-coded in this version
        # Content-Type and Content-Length carry no HTTP_ prefix
        self.environ['CONTENT_TYPE'] = self._request.headers.pop('CONTENT_TYPE', Header('', '')).value
        self.environ['CONTENT_LENGTH'] = self._request.headers.pop('CONTENT_LENGTH', Header('', '')).value
        # Every remaining request header is exposed as HTTP_<NAME>
        self.environ.update({f'HTTP_{header.key}': header.value for header in self._request.headers.values()})

    def set_wsgi_environ(self):
        # PEP 3333 also lists wsgi.multiprocess as required; it is not set here
        self.environ['wsgi.input'] = self._request.body
        self.environ['wsgi.errors'] = sys.stderr
        self.environ['wsgi.version'] = (1, 0)
        self.environ['wsgi.multithread'] = False
        self.environ['wsgi.input_terminated'] = True
        self.environ['wsgi.url_scheme'] = self._request.scheme
        self.environ['wsgi.run_once'] = True
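The same mapping can be sketched as a single standalone function, which makes it easy to inspect the resulting dictionary. build_environ is a hypothetical name, headers are assumed to be a plain dict of strings, and the host and port defaults are placeholders. Unlike the excerpt above, this sketch takes the port as a parameter and also sets wsgi.multiprocess.

```python
import io
import sys
from typing import Dict


def build_environ(method: str, path: str, query: str,
                  headers: Dict[str, str], body: bytes,
                  host: str = "localhost", port: int = 8000) -> Dict:
    """Build a WSGI environ dict from already-parsed request parts."""
    environ = {
        "REQUEST_METHOD": method,
        "SCRIPT_NAME": "",
        "PATH_INFO": path,
        "QUERY_STRING": query,
        "SERVER_NAME": host,
        "SERVER_PORT": str(port),  # CGI variables are always strings
        "SERVER_PROTOCOL": "HTTP/1.1",
        "wsgi.version": (1, 0),
        "wsgi.url_scheme": "http",
        "wsgi.input": io.BytesIO(body),
        "wsgi.errors": sys.stderr,
        "wsgi.multithread": False,
        "wsgi.multiprocess": False,
        "wsgi.run_once": False,
    }
    for name, value in headers.items():
        key = name.upper().replace("-", "_")
        if key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
            environ[key] = value  # these two carry no HTTP_ prefix
        else:
            environ[f"HTTP_{key}"] = value
    return environ
```

For example, a User-Agent request header ends up under the HTTP_USER_AGENT key, while Content-Type is stored as CONTENT_TYPE.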
Python Application
The Python application run by our server is a simple Flask example. It can handle GET and POST requests.
- GET: The application returns "Hello World".
- POST: The application returns the body of the request (an echo application).
# funicorn/examples/flask_example.py
from flask import Flask, request

app = Flask(__name__)


@app.route("/hola", methods=['GET', 'POST'])
def hello():
    if request.method == 'POST':
        # Echo the parsed JSON body back to the client
        return request.json
    return "Hello World"
Conclusion
Implementing this project was a great exercise because it forces you to understand many details you didn't know about one of the standards most used by the Python community and how it actually works under the hood. It is not magic, it is just code!
I recommend taking a look at the repository, where you can see the whole code and many details I was not able to summarize here. You can clone it and play with it. As I mentioned before, I tried to be as explicit as possible, but if you have any questions, let me know :)
Also, let me know in the comment section if you liked the article and whether it helped you understand how this kind of project works.