> Every code path you don’t have is a code path you don’t wait for.
No, every code path you don't execute is that. Like
> No .egg support.
How does that explain anything if the egg format is obsolete and not used?
Similar with spec strictness fallback logic - it's only slow if the packages you're installing are malformed, otherwise the logic will not run and not slow you down.
And in general, instead of a list of irrelevant and potentially relevant things would be great to understand some actual time savings per item (at least those that deliver the most speedup)!
But otherwise great and seemingly comprehensive list!
Even in compiled languages, binaries have to get loaded into memory. For Python it's much worse. On my machine:
$ time python -c 'pass'
real 0m0.019s
user 0m0.013s
sys 0m0.006s
$ time pip --version > /dev/null
real 0m0.202s
user 0m0.182s
sys 0m0.021s
Almost all of that extra time is either the module import process or garbage collection at the end. Even with cached bytecode, the former requires finding and reading from literally hundreds of files, deserializing via `marshal.loads` and then running top-level code, which includes creating objects to represent the functions and classes.
It used to be even worse than this; in recent versions, imports related to Requests are deferred to the first time that an HTTPS request is needed.
> Unless memory mapped by the OS with no impact on runtime for unused parts?
Yeah, this is presumably why a no-op `uv` invocation on my system takes ~50 ms the first time and ~10 ms each other time.
> Exactly, so again have no impact?
Only if your invocation of pip manages to avoid an Internet request. Note: pip will make an Internet request if you try to install a package by symbolic name even if it already has the version it wants in cache, because its cache is an HTTP cache rather than a proper download cache.
But even then, there will be hundreds of imports mainly related to Rich and its dependencies.
I got confused by the direction of the discussion.
My original point was that Requests imports in pip used to not be deferred like that, so you would pay for them up front, even if they turned out to be irrelevant. (But also they are relevant more often than they should be, i.e. the deferral system doesn't work as well as it should.)
Part of the reason you pay for them is to run top-level code (to create function and class objects) that are irrelevant to what the program is actually doing. But another big part is the cost of actually locating the files, reading them, and deserializing bytecode from them. This happens at import time even if you don't invoke any of the functionality.
rtld does a lot of work even in “static” binaries to rewrite relocations even in “unused parts” of any PIE (which should be all of them today) and most binaries need full dyld anyway.
No, every code path you don't execute is that. Like
> No .egg support.
How does that explain anything if the egg format is obsolete and not used?
Similar with spec strictness fallback logic - it's only slow if the packages you're installing are malformed, otherwise the logic will not run and not slow you down.
And in general, instead of a list of irrelevant and potentially relevant things would be great to understand some actual time savings per item (at least those that deliver the most speedup)!
But otherwise great and seemingly comprehensive list!