Avoid running apt-cache policy on every import #4690
FWIW, that apt-cache invocation comes from our datalad.utils:

```python
def get_linux_distribution():
    """Compatibility wrapper for {platform,distro}.linux_distribution().
    """
    if hasattr(platform, "linux_distribution"):
        # Use deprecated (but faster) method if it's available.
        with warnings.catch_warnings():
            warnings.filterwarnings("ignore", category=DeprecationWarning)
            result = platform.linux_distribution()
    else:
        import distro  # We require this for Python 3.8 and above.
        result = distro.linux_distribution(full_distribution_name=False)
    return result
```

and its module-level use:

```python
try:
    linux_distribution_name, linux_distribution_release \
        = get_linux_distribution()[:2]
    on_debian_wheezy = on_linux \
        and linux_distribution_name == 'debian' \
        and linux_distribution_release.startswith('7.')
except:  # pragma: no cover
    # MIH: IndexError?
    on_debian_wheezy = False
    linux_distribution_name = linux_distribution_release = None
```

since it is unlikely that HOME is shared across radically different OSes (and wheezy is in the past, and I do not even see ...)
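One way to avoid paying that cost repeatedly within a single process would be to memoize the wrapper. A minimal sketch, not DataLad's actual code — the `ImportError` fallback tuple is an assumption for illustration (the real code requires `distro` on Python 3.8+):

```python
import functools
import platform
import warnings


@functools.lru_cache(maxsize=1)
def get_linux_distribution():
    """Like the wrapper above, but probes the platform at most once
    per process thanks to the cache."""
    if hasattr(platform, "linux_distribution"):
        # Deprecated, but fast and stdlib-only.
        with warnings.catch_warnings():
            warnings.filterwarnings("ignore", category=DeprecationWarning)
            result = platform.linux_distribution()
    else:
        try:
            # Imported lazily: the cost is paid only on first call,
            # not at module import time.
            import distro
            result = distro.linux_distribution(full_distribution_name=False)
        except ImportError:
            # Fallback for illustration only.
            result = ("", "", "")
    return result
```

This does not help across processes, of course, which is where the bulk of the cost below comes from.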
just confirming that it is indeed about 200ms for the entire distro import/invocation:

```shell
$> python -c 'from time import time; t0=time(); import distro; distro.linux_distribution(full_distribution_name=False); print(time()-t0)'
0.20607256889343262
```
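For a per-module breakdown of where that time goes, CPython's `-X importtime` flag (available since 3.7) can help; a sketch using the stdlib `json` module for illustration, since `distro` may not be installed:

```shell
# Each imported module gets an "import time: self | cumulative | name"
# line on stderr; the slowest imports appear near the bottom.
python3 -X importtime -c 'import json' 2>&1 | tail -n 3
```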
and we missed the point at which our overall import time doubled
Quick note: a backward-compatible fix for the distro dependency could be to rely on what I discovered on Stack Overflow, since the distro module is needed, IIRC, only for Python 3.8 and above. It would then not be imported until asked for, but it remains to be checked whether the import is unavoidable in typical cases...
I guess it happens through some module checks/imports. That adds up to quite a number of invocations when we test our base external remotes, e.g. while testing datalad-crawler:

which is more or less in agreement with the number of special remote invocations:

given that each invocation takes about 200-300ms on my laptop:

that is at least 20 seconds of test run time (it is unlikely to be asynchronous) contributing, and this is just datalad-crawler, whose tests are relatively speedy.
Altogether, running the datalad-crawler tests results in over 100k invocations of external tools.
I hope that identifying the source of some of these invocations and adding caching where possible (for our direct invocations) could at least partially mitigate them. Maybe https://github.com/con/pyfscacher (see con/fscacher#1), whenever it is born, could be of use.
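A crude sketch of the cross-process caching idea — persist the probe result on disk so that the thousands of short-lived processes skip the probe. The cache path and `expensive_probe` are hypothetical; a real implementation (the kind fscacher aims at) would need a proper per-user cache directory and invalidation:

```python
import json
import os
import tempfile

# Hypothetical cache location, for illustration only.
CACHE_PATH = os.path.join(tempfile.gettempdir(), "linux-distribution-cache.json")


def expensive_probe():
    """Stand-in for the real distro/apt-cache probing."""
    return {"name": "someos", "release": "1.0"}


def get_distribution_info():
    """Return the probe result, reading it from the cache file when present."""
    if os.path.exists(CACHE_PATH):
        with open(CACHE_PATH) as f:
            return json.load(f)
    result = expensive_probe()
    with open(CACHE_PATH, "w") as f:
        json.dump(result, f)
    return result
```

Only the first process after the cache is cleared pays the 200-300ms; every subsequent process does a single small file read.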