gs-fastcopy: get CPU count for upload workers

dchaley

David Haley

Posted on July 23, 2024

See previous post: Introducing gs-fastcopy

I shipped the enhancement gs-fastcopy-python#10: Inspect processor count for better upload defaults.

Previously, we defaulted to 8 workers (Google's default). On a system with more than 8 cores, that leaves a lot of them idle!

Now, we inspect the available CPU count. On systems that support it, we honor os.sched_getaffinity, which reports the processors available to this process rather than to the machine in general. Otherwise, we use os.cpu_count().
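Here's a minimal sketch of that kind of lookup. It's an illustration, not the actual gs-fastcopy code: the helper name available_cpu_count and its fallback of 8 are my assumptions for this example.

```python
import os


def available_cpu_count(default: int = 8) -> int:
    """Return how many CPUs this process can use.

    Hypothetical helper for illustration; the name and the
    fallback value are assumptions, not gs-fastcopy's API.
    """
    # os.sched_getaffinity only exists on some platforms (e.g. Linux).
    # It returns the set of CPUs this process is allowed to run on,
    # which can be smaller than the machine's total core count.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))

    # os.cpu_count() reports the machine-wide count, and can return
    # None when the count can't be determined.
    return os.cpu_count() or default


if __name__ == "__main__":
    print(f"Using {available_cpu_count()} workers")
```

The affinity check comes first because in containers or under taskset, the process may be limited to fewer CPUs than the machine has, and sizing the worker pool to the machine-wide count would oversubscribe them.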

Benchmarking results: [source sheet]

[Figure: bar chart showing time taken to complete the upload operation, with and without compressing first]

Note how adding workers speeds up the process, but with diminishing returns. I think that's the point where the network transfer itself becomes the bottleneck, but it's likely that tweaking chunk sizes would help too.

Also note the more dramatic effects when using compression (using pigz, parallel gzip). pigz would've picked up on the max workers before; what's new here is that the upload uses them as well.
