An Introduction to Scaling Distributed Python Applications

Python is often dismissed when it comes to building scalable, distributed applications. The trick is knowing the right implementation and tools for writing Python distributed applications that scale horizontally.

With the right methods, technologies, and practices, you can make Python applications fast and able to grow in order to handle more work or requirements.

In this tutorial, we will introduce you to scaling in Python. We'll learn the key things you need to know when building a scalable, distributed system in Python.

This guide at a glance:

What is scaling?
CPU scaling in Python
Daemon Processes in Python
Event loops and Asyncio in Python
Next steps for your learning

What is scaling?

Scalability is a somewhat vague term. A scalable system is able to grow to accommodate required growth, changes, or requirements. Scaling refers to the methods, technologies, and practices that allow an app to grow.

A key part of scaling is building distributed systems. This means that you distribute the workload across multiple workers and with multiple processing units. Workers divide tasks across multiple processors or computers.

Spreading workload over multiple hosts makes it possible to achieve horizontal scalability, which is the ability to add more nodes. It also helps with fault tolerance. If a node fails, another can pick up the traffic.

Before we look at the methods of building scalable systems in Python, let's go over the fundamental properties of distributed systems.

Single-threaded application

This is a type of system that implies no distribution. This is the simplest kind of application. However, they are limited by the power of using a single processor.

Multi-threaded application

Most computers are equipped with this type of system. Multi-threading applications are more error-prone, but they offer few failure scenarios, as no network is involved.

Network distributed application

This type of system is for applications that need to scale significantly. They are the most complicated applications to write, as they require a network.

Multithreading

Scaling across processors is done with multithreading. This means we are running code in parallel with threads, which are contained in a single process. Code will run in parallel only if there is more than one CPU available. Multithreading involves many traps and issues, such as Python's Global Interpreter Lock (GIL).

CPU scaling in Python

Using multiple CPUs is one of the best options for scalability in Python. To do so, we must use concurrency and parallelism, which can be tricky to implement properly. Python offers two options for spreading your workload across multiple local CPUs: threads and processes.

Threads in Python

Threads are a good way to run a function concurrently. If there are multiple CPUs available, threads can be scheduled on multiple processing units. Scheduling is determined by the operating system.

There is only one thread, the main one, by default. This is the thread that runs your Python application. To start another thread, Python offers a threading module.

import threading

def print_something(something):
    print(something)

t = threading.Thread(target=print_something, args=("hello",))
t.start()
print("thread started")
t.join()

Once started, the main thread waits for the second thread to complete by calling its join method. But, if you do not join all your threads, it is possible that the main thread finishes before the other threads can join, and your program will appear to be blocked.

To prevent this, you can configure your threads as daemons. When a thread is a daemon, it is like a background thread and will be terminated once the main thread exits. Note that we don't need to use the join method.

import threading

def print_something(something):
    print(something)

t = threading.Thread(target=print_something, args=("hello",))
t.daemon = True
t.start()
print("thread started")

-->
hello
thread started

Processes in Python

Multithreading is not perfect for scalability due to the Global Interpreter Lock (GIL). We can also use processes instead of threads as an alternative. The multiprocessing package is a good, high-level option for processes. It provides an interface that starts new processes. Each process is a new, independent instance, so each process has its own independent global state.

import random
import multiprocessing

def compute(results):
    results.append(sum(
        [random.randint(1, 100) for i in range(1000000)]))

if __name__ == "__main__":
    with multiprocessing.Manager() as manager:
        results = manager.list()
        workers = [multiprocessing.Process(target=compute, args=(results,))
                   for x in range(8)]
        for worker in workers:
            worker.start()
        for worker in workers:
            worker.join()
        print("Results: %s" % results)

When work can be parallelized for a certain amount of time, it’s better to use multiprocessing and fork jobs. This spreads the workload among several CPU cores.

We can also use multiprocessing.Pool, which is a multiprocessing library that provides a pool mechanism. With multiprocessing.Pool, we don't need to manage the processes manually. It also makes processes reusable.

import multiprocessing
import random

def compute(n):
    return sum(
        [random.randint(1, 100) for i in range(1000000)])

if __name__ == "__main__":
    # Start 8 workers
    pool = multiprocessing.Pool(processes=8)
    print("Results: %s" % pool.map(compute, range(8)))

-->
Results: [50482130, 50512647, 50450070, 50505847, 50513886, 50539079, 50489976, 50518142]

Daemon Processes in Python

As we learned, using multiple processes to schedule jobs is more efficient in Python. Another good option is using daemons, which are long-running, background processes that are responsible for scheduling tasks regularly or processing jobs from a queue.

We can use cotyledon, a Python library for building long-running processes. It can be leveraged to build long-running, background, job workers.

Below, we create a class named PrinterService to implement the method for cotyledon.Service: run. This contains the main loop and terminate. This library does most of its work behind scenes, such as os.fork calls and setting up modes for daemons.

Cotyledon uses several threads internally. This is why the threading.Event object is used to synchronize our run and terminate methods.

import threading
import time
import cotyledon

class PrinterService(cotyledon.Service):
    name = "printer"

    def __init__(self, worker_id):
        super(PrinterService, self).__init__(worker_id)
        self._shutdown = threading.Event()

    def run(self):
        while not self._shutdown.is_set():
            print("Doing stuff")
            time.sleep(1)

    def terminate(self):
        self._shutdown.set()

# Create a manager
manager = cotyledon.ServiceManager()
# Add 2 PrinterService to run
manager.add(PrinterService, 2)
# Run all of that
manager.run()

Cotyledon runs a master process that is responsible for handling all its children. It then starts the two instances of PrinterService and gives new process names so they're easy to track. With Cotyledon, if one of the processes crashes, it is automatically relaunched.

Note: Cotyledon also offers features for reloading a program configuration or dynamically changing the number of workers for a class.

Event loops and Asyncio in Python

An event loop is a type of control flow for a program where messages are pushed into a queue. The queue is then consumed by the event loop, dispatching them to appropriate functions.

A very simple event loop could like this in Python:

while True: message = get_message() if message == quit: break
  process_message(message)

Asyncio is a new, state-of-the-art event loop provided in Python 3. Asyncio stands for asynchronous input-output. It refers to a programming paradigm that achieves high concurrency using a single thread or event loop. This is a good alternative to multithreading for the following reasons:

It’s difficult to write code that is thread safe. With asynchronous code, you know exactly where the code will shift between tasks.
Threads consume a lot of data. With async code, all the code shares the same small stack.
Threads are OS structures, so they require more memory. This is not the case for ssynico.

Asyncio is based on the concept of event loops. When asyncio creates an event loop, the application registers the functions to call back when a specific event happens. This is a type of function called a coroutine. It works similarly to a generator, as it gives back the control to a caller with a yield statement.

import asyncio

async def hello_world():
    print("hello world!")
    return 42

hello_world_coroutine = hello_world()
print(hello_world_coroutine)

event_loop = asyncio.get_event_loop()
try:
    print("entering event loop")
    result = event_loop.run_until_complete(hello_world_coroutine)
    print(result)
finally:
    event_loop.close()

-->
<coroutine object hello_world at 0x7f36a2c4da98>
entering event loop
hello world!
42

Above, the coroutine hello_world is defined as a function, but the keyword used to start its definition is async def. This coroutine will print a message and returns a result. The event loop runs the coroutine and is terminated when the coroutine returns.

Next steps for your learning

Congrats on making it to the end! You should now have a good introduction to the tools we can use to scale in Python. We can leverage these tools to build distributed systems effectively. But there is still more to learn. Next, you'll want to learn about:

Run coroutine cooperatively
aiohttp library
Queue-based distribution
Lock management
Deploying on PaaS

To get started with these concepts, check out Educative's comprehensive course The Hacker's Guide to Scaling Python. You'll cover everything from concurrency to queue-based distribution, lock management, and group memberships. At the end, you'll get hands-on with building a REST API in Python and deploying an app to a PaaS.

By the end, you’ll be more productive with Python, and you’ll be able to write distributed applications.

Happy learning!

Continue reading about Python on Educative

Start a discussion

What do you want to learn about Python next? Was this article helpful? Let us know in the comments below!

Blog