From: Michael Terry
Subject: Re: [Duplicity-talk] Caching for pwd and grp operations
Date: Wed, 7 Nov 2012 21:33:15 -0500

I'm hitting some nasty performance problems with duplicity and large
/etc/passwd and /etc/group files. See
https://bugs.launchpad.net/duplicity/+bug/1013446 for more details.
But basically lookups using the pwd and grp modules can be pretty
slow, and duplicity does a lot of them. Group lookups are
particularly expensive if you have a group with lots of members
because grp parses the comma-separated member list into a Python list
for every lookup.

Seems like the best way to fix the problem is to cache pwd and grp
lookups in duplicity. I'm happy to submit a patch that does this.
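
To make the caching idea concrete, here is a minimal sketch of what such
a module could look like. It is not the actual patch: the wrapper name
cached_lookup is hypothetical, and the module-level names at the bottom
only mirror the duplicity.cached_ops.getgrgid example floated in this
message. It assumes entries do not change during a single backup run, so
a stale cache is acceptable, and it remembers misses as well as hits.

```python
import functools
import grp
import pwd

# Sentinel marking a key that was looked up and not found.
_MISSING = object()


def cached_lookup(lookup):
    """Wrap a pwd/grp-style lookup so each key hits the system at most once.

    Hypothetical helper, not part of duplicity. Misses (KeyError) are
    cached too, so unknown uids/gids do not trigger repeated scans.
    """
    cache = {}

    @functools.wraps(lookup)
    def wrapper(key):
        if key not in cache:
            try:
                cache[key] = lookup(key)
            except KeyError:
                cache[key] = _MISSING
        result = cache[key]
        if result is _MISSING:
            # Preserve the behaviour callers of pwd/grp already expect.
            raise KeyError(key)
        return result

    return wrapper


# Cached drop-in replacements with the same names as the originals.
getpwuid = cached_lookup(pwd.getpwuid)
getpwnam = cached_lookup(pwd.getpwnam)
getgrgid = cached_lookup(grp.getgrgid)
getgrnam = cached_lookup(grp.getgrnam)
```

Callers would then use getgrgid(gid) from this module instead of
grp.getgrgid(gid); the KeyError behaviour for unknown ids is unchanged.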

Assuming this is a good idea, I've got a question about
duplicity.tarfile. It looks like this is a lightly modified copy of
Python 2.7's tarfile module, and caching would need to be added here
(as well as duplicity.path). Would it be fine to replace calls to
grp.* and pwd.* with cached versions? (E.g., replace calls to
grp.getgrgid with duplicity.cached_ops.getgrgid or something like
that.) Or would it be better to avoid modifications by writing some
nastier code that replaces the actual functions in the grp and pwd
modules?
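
For comparison, the second option (replacing the functions in the grp
and pwd modules themselves) could be sketched roughly as below. The
names _memoize and install_caches are hypothetical and chosen only for
illustration. The sketch assumes every caller should see cached results,
which is what makes this approach avoid changes to duplicity.tarfile but
also what makes it more invasive.

```python
import functools


def _memoize(func):
    """Cache results of a one-argument lookup; misses are not cached."""
    cache = {}

    @functools.wraps(func)
    def wrapper(key):
        if key not in cache:
            # A KeyError from an unknown id propagates and is not stored.
            cache[key] = func(key)
        return cache[key]

    wrapper._cached = True  # marker so we never wrap the same function twice
    return wrapper


def install_caches(module, names):
    """Replace the named functions on `module` with memoized versions."""
    for name in names:
        original = getattr(module, name)
        if not getattr(original, "_cached", False):
            setattr(module, name, _memoize(original))


# Intended use, left commented out so importing this sketch has no side effects:
#   import grp, pwd
#   install_caches(pwd, ("getpwuid", "getpwnam"))
#   install_caches(grp, ("getgrgid", "getgrnam"))
```

The trade-off is that this changes behaviour for all code in the
process, including third-party code that imports grp or pwd, whereas
explicit cached wrappers only affect the call sites that opt in.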
--
Steve Atwell <address@hidden>
_______________________________________________
Duplicity-talk mailing list
address@hidden
https://lists.nongnu.org/mailman/listinfo/duplicity-talk