Threading vs Asyncio vs Multiprocessing
Atul Kushwaha
Posted on February 22, 2024
Before Diving into Differences: Threading, Asyncio, and Multiprocessing
Let's solidify our understanding of processes, threads, and context switching before delving into the intricacies of threading, asyncio, and multiprocessing.
Process:
- A process is a program in execution. It's a set of instructions actively being carried out.
- When a processor runs a program, that program becomes a process. It requires resources like RAM and I/O operations, provided by the operating system.
- Each process has its own memory address space, so a malfunction in one process generally won't affect others. This isolation is what makes Chrome's tabbed browsing robust: each tab runs as a separate process, so issues in one tab don't disrupt the others.
Thread:
- A thread is a unit of execution within a process. Like chapters in a book, a process can have multiple threads (at least one, the main thread).
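- Threads within a process share the same memory address space. This makes sharing data between threads cheap, but it also means an unhandled failure in one thread (for example, a crash or corrupted shared data) can affect the entire process.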
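To make the shared address space concrete, here is a minimal sketch in which several threads append to the same list and the main thread sees every change. The names `run_workers` and `worker` are made up for this illustration.

```python
import threading


def run_workers(names):
    shared = []  # one list object, visible to every thread in this process

    def worker(name):
        # No copying and no message passing: each thread
        # mutates the very same list.
        shared.append(name)

    threads = [threading.Thread(target=worker, args=(n,)) for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every thread has finished

    # Thread start order is not guaranteed, so sort for a stable result.
    return sorted(shared)


if __name__ == "__main__":
    print(run_workers(["a", "b", "c"]))
```

The same convenience is what makes threads risky: anything one thread can reach, every other thread can change too.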
Context Switch:
- Imagine reading a book when someone interrupts you with a task. Before switching, you wisely place a bookmark on the page. This allows you to resume reading from the exact spot later.
- Similarly, when one process/thread needs to execute while another is active, a context switch occurs. This involves saving the current state of the active process/thread and restoring the state of the one that needs to run, allowing it to pick up where it left off.
- Context switching between processes is more expensive because each process has its own memory space, so the switch involves changing page tables and other overhead. In contrast, switching between threads of the same process is generally cheaper because the address space stays the same.
Threading
- threading is a Python module that allows us to create and run threads.
- In CPython, threading gives an illusion of parallelism rather than true parallelism because of the GIL (Global Interpreter Lock). Even with multiple cores, threads cannot execute Python code in parallel.
Python's GIL: a mutex (mutual-exclusion lock) which makes sure that only one thread of a process can execute Python bytecode at a given time.
- Under the hood, a thread scheduler (which depends on the OS) schedules the execution of threads by rapidly context switching between them.
- Because of the GIL, threading is not a good candidate for CPU-heavy tasks. So where can we use threads? Threading fits I/O-bound tasks (such as waiting for network requests or reading files): while one thread waits for its I/O to complete, other threads can run.
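The I/O-bound case can be sketched as follows. This is only an illustration: `time.sleep` stands in for a real network or disk call, and `fake_download` and `download_all` are hypothetical names, not part of any library.

```python
import threading
import time


def fake_download(name, delay, results):
    # time.sleep stands in for a blocking network or disk call.
    # A thread blocked here releases the GIL, so other threads can run.
    time.sleep(delay)
    results[name] = f"{name} done"


def download_all(names, delay=0.2):
    results = {}
    threads = [
        threading.Thread(target=fake_download, args=(name, delay, results))
        for name in names
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = download_all(["a", "b", "c", "d"])
    # The waits overlap, so the total should be close to one delay,
    # not four of them.
    print(results, f"{elapsed:.2f}s")
```

Swapping the sleep for a CPU-heavy loop would remove most of the benefit, since only one thread can execute Python bytecode at a time.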
Asyncio
- asyncio is a Python module for asynchronous programming.
- asyncio is similar in effect: it also gives an illusion of parallelism, but its approach is different from threading.
Unlike threading, where most of the scheduling is handled by the OS, in asyncio the event loop handles most scheduling, and all of it typically happens on a single thread, which avoids many explicit threading concerns. However, understanding coroutines and event loops can have a learning curve.
- asyncio leverages an event loop and coroutines (special functions). The event loop manages tasks and schedules them for execution when resources become available (e.g., I/O completes). This minimizes context switching and avoids blocking the main thread.
- Coroutines: These are special functions that can be paused and resumed later. They are the building blocks of asynchronous code in asyncio.
- Event loops: These are responsible for managing the execution of coroutines and handling events.
- Asynchronous I/O: asyncio provides functions for performing I/O operations asynchronously, without blocking the main thread.
- async and await keywords: These keywords are used to define and use coroutines, respectively.
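A minimal sketch of these pieces working together is shown below. `asyncio.sleep` stands in for real asynchronous I/O, and `fake_request` and `fetch_all` are hypothetical names chosen for the example.

```python
import asyncio


async def fake_request(name, delay):
    # "async def" defines a coroutine. "await" pauses it here and hands
    # control back to the event loop until the sleep finishes.
    await asyncio.sleep(delay)
    return f"{name} done"


async def fetch_all(names, delay=0.1):
    # gather runs the coroutines concurrently on a single thread and
    # returns their results in the order they were passed in.
    return await asyncio.gather(*(fake_request(n, delay) for n in names))


if __name__ == "__main__":
    # asyncio.run creates an event loop, runs the coroutine, then closes it.
    print(asyncio.run(fetch_all(["a", "b", "c"])))
```

Note that a blocking call such as `time.sleep` inside a coroutine would stall the whole event loop, because nothing else can run until that coroutine yields control.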
Multiprocessing
- Multiprocessing sidesteps the main limitation of threading: each process has its own interpreter, so the GIL does not stop processes from running in parallel
- Multiprocessing in Python allows you to create multiple processes that leverage multiple CPU cores, significantly improving performance for CPU-bound tasks
- Processes communicate and share data using methods like pipes, queues, and shared memory
- Remember to implement proper synchronization mechanisms like locks and semaphores to prevent race conditions when accessing shared resources
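The points above can be sketched with a queue for passing results back and a lock around a shared counter. This is an illustrative example under assumptions: `count_primes` is a deliberately naive CPU-bound function, and `add_to_total` and `main` are hypothetical names.

```python
import multiprocessing as mp


def count_primes(limit):
    # Deliberately naive CPU-bound work: count primes below limit.
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def add_to_total(limit, total, results):
    found = count_primes(limit)
    # "+=" on shared memory is a read-modify-write, so guard it with
    # the lock that comes with the Value to avoid a race condition.
    with total.get_lock():
        total.value += found
    results.put((limit, found))  # send the result back to the parent


def main():
    limits = [1_000, 2_000, 3_000]
    total = mp.Value("i", 0)  # an integer stored in shared memory
    results = mp.Queue()
    procs = [
        mp.Process(target=add_to_total, args=(lim, total, results))
        for lim in limits
    ]
    for p in procs:
        p.start()
    # Drain the queue before joining so no child blocks on a full pipe.
    per_limit = dict(results.get() for _ in procs)
    for p in procs:
        p.join()
    print(per_limit, total.value)


if __name__ == "__main__":
    main()
```

The `if __name__ == "__main__":` guard matters here: on platforms that start child processes by re-importing the main module, unguarded process creation would run again in every child.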
While multiprocessing does achieve true parallelism, there are some caveats to consider:
- Global Interpreter Lock (GIL): The CPython implementation of Python has a Global Interpreter Lock (GIL) that allows only one thread at a time to execute Python bytecode within a process. Each process has its own interpreter and its own GIL, so separate processes can run Python code in parallel on different cores. Within any single process, however, threads are still serialized by that process's GIL, so adding threads inside a worker process will not speed up CPU-bound code.
- Overhead: Creating and managing processes involves overhead that can negate the benefits of parallelism for smaller tasks.