As for 20 ms, if you deal with 20 dependencies in parallel, that's 400ms just to start working.
Shaving half a second off many things makes things fast.
Although, as we saw with zeeek in the other comment, you likely don't need multiprocessing, since the network stack and unzip in the stdlib release the GIL.
Threads are cheaper.
Maybe if you bundled pubgrub as a compiled extension, you could get pretty close to uv's perf.
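To make the threads point concrete, here is a rough sketch of unpacking several archives with a thread pool instead of processes. It assumes the archives are already in memory as bytes; `make_archive`, `extract` and `extract_all` are made-up helper names for illustration, not anything pip or uv actually exposes, and the worker count of 8 is arbitrary.

```python
from __future__ import annotations

import io
import zipfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def make_archive(name: str) -> bytes:
    """Build a tiny in-memory zip standing in for a downloaded wheel (hypothetical helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(f"{name}/__init__.py", "x = 1\n" * 1000)
    return buf.getvalue()


def extract(data: bytes, dest: Path) -> list[str]:
    """Unpack one archive into dest and return the member names."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        zf.extractall(dest)
        return zf.namelist()


def extract_all(archives: dict[str, bytes], dest: Path, workers: int = 8) -> dict[str, list[str]]:
    """Unpack every archive on a thread pool; no multiprocessing involved."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {name: pool.submit(extract, data, dest) for name, data in archives.items()}
        # result() re-raises any exception that happened in the worker thread
        return {name: fut.result() for name, fut in futures.items()}
```

How much this actually overlaps depends on archive size and on how much time is spent in zlib and file I/O versus pure-Python bookkeeping, so measure before trusting it.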
At least one worker for each virtual CPU core you get, for CPU-bound work. I've got 16 on my laptop. My servers have many more.
If I have 64 cores, and 20 dependencies, I do want the 20 of them to be uncompressed in parallel. That's faster and if I'm installing something, I wanna prioritize that workload.
But it doesn't have to be 20. Even say 5 with queues, that's 100ms. It adds up.
... but the archive directory is at the end of the file?
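For anyone who wants to see it, here is a small sketch that locates the central directory by parsing the end-of-central-directory record at the tail of a zip. It assumes a plain archive that is not ZIP64 and fits in memory; `central_directory_location` is a name I made up for this example.

```python
from __future__ import annotations

import struct

EOCD_SIG = b"PK\x05\x06"                # end-of-central-directory signature
EOCD_FMT = "<4sHHHHIIH"                 # fixed part of the record, little-endian
EOCD_SIZE = struct.calcsize(EOCD_FMT)   # 22 bytes


def central_directory_location(data: bytes) -> tuple[int, int, int]:
    """Return (offset, size, entry_count) of the central directory.

    Sketch only: ignores ZIP64 and multi-disk archives.
    """
    # The record sits at the very end, followed only by an optional
    # comment of at most 65535 bytes, so only the tail needs searching.
    tail_start = max(0, len(data) - (EOCD_SIZE + 0xFFFF))
    pos = data.rfind(EOCD_SIG, tail_start)
    if pos < 0:
        raise ValueError("no end-of-central-directory record found")
    (_sig, _disk, _cd_disk, _entries_on_disk,
     total_entries, cd_size, cd_offset, _comment_len) = struct.unpack_from(EOCD_FMT, data, pos)
    return cd_offset, cd_size, total_entries
```

So a reader that wants the file listing first has to get at the tail of the file, either by finishing the download or by fetching the last bytes separately.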
> no python VM startup overhead
This is about 20 milliseconds on my 11-year-old hardware.
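If you want to check the number on your own machine, a quick way to approximate it is below. This is only a sketch: it times a full `python -c pass` subprocess, so it includes process spawn overhead on top of interpreter startup, and `startup_ms` is just a name I picked.

```python
import statistics
import subprocess
import sys
import time


def startup_ms(runs: int = 10) -> float:
    """Median wall-clock time, in milliseconds, to start and exit the interpreter."""
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        # Launch the same interpreter that runs this script, doing nothing
        subprocess.run([sys.executable, "-c", "pass"], check=True)
        samples.append((time.perf_counter() - t0) * 1000)
    return statistics.median(samples)


if __name__ == "__main__":
    print(f"median startup: {startup_ms():.1f} ms")
```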