So, you want to launch several threads in Python and something does not work?
Fernando Tricas García
Posted on July 19, 2021
I was refactoring a program and discovered that my concurrency was not working well: I had scheduled some tasks on a concurrent.futures.ThreadPoolExecutor, but some of them were waiting until the others had finished.
This was a problem because the program is launched once an hour and it was not finishing in time for the next run, so I ended up with several copies of it running at once (not a concurrency problem as such, since the program is mostly well behaved, but definitely an undesirable way of working).
The problem? Well, these executors (ThreadPoolExecutor, ProcessPoolExecutor, ...) have a max_workers limit, whose default is documented as follows:
Changed in version 3.8: Default value of max_workers is changed to min(32, os.cpu_count() + 4). This default value preserves at least 5 workers for I/O bound tasks. It utilizes at most 32 CPU cores for CPU bound tasks which release the GIL. And it avoids using very large resources implicitly on many-core machines.
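A minimal sketch of the effect (io_task, the one-second sleep and the task count are only illustrative, not my real workload):

import concurrent.futures
import os
import time

def io_task(i):
    # stand-in for a light I/O-bound task: read something, wait, write something
    time.sleep(1)
    return i

# The 3.8+ default; on a 1 vCPU machine this is min(32, 1 + 4) = 5
print("default max_workers:", min(32, (os.cpu_count() or 1) + 4))

start = time.monotonic()
with concurrent.futures.ThreadPoolExecutor() as pool:  # default max_workers
    list(pool.map(io_task, range(20)))
print("20 one-second tasks took %.1fs" % (time.monotonic() - start))  # ~4s: only 5 run at a time

With only five workers, the remaining tasks sit in the queue until a worker frees up, which is exactly the waiting I was seeing.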
The machine where this program runs is a cheap 1 vCPU machine, so the default works out to min(32, 1 + 4) = 5 workers, and that is the problem: my tasks are very light (they do some input, wait a while, do some output and that's all), but only five of them can run at a time.
The solution? Count (or at least estimate) the number of threads you need and set an adequate value for the max_workers parameter.
In my case:
with concurrent.futures.ThreadPoolExecutor(max_workers=75) as pool:
...
That is, 75 workers. As stated previously, these tasks are very light, so this value lets the program run all of them concurrently.
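Fleshed out a little, assuming the usual submit/as_completed pattern (light_task and the task count of 75 are placeholders for the real jobs):

import concurrent.futures
import time

def light_task(i):
    # placeholder for the real job: some input, a short wait, some output
    time.sleep(1)
    return i

# around 75 tasks per hourly run in this scenario
tasks = range(75)

# Size the pool to the number of tasks instead of relying on the CPU-based
# default, which on 1 vCPU would leave 70 of them waiting in the queue.
with concurrent.futures.ThreadPoolExecutor(max_workers=75) as pool:
    futures = [pool.submit(light_task, i) for i in tasks]
    for future in concurrent.futures.as_completed(futures):
        print("finished task", future.result())

Since the work is I/O bound and each thread spends most of its time waiting, a pool much larger than the CPU count is a reasonable choice here.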