[Duplicity-talk] Caching for pwd and grp operations
From: Steve Atwell
Subject: [Duplicity-talk] Caching for pwd and grp operations
Date: Wed, 7 Nov 2012 17:09:30 -0800

I'm hitting some nasty performance problems with duplicity when
/etc/passwd and /etc/group are large. See
https://bugs.launchpad.net/duplicity/+bug/1013446 for more details.
Basically, lookups using the pwd and grp modules can be pretty
slow, and duplicity does a lot of them. Group lookups are
particularly expensive if you have a group with lots of members,
because grp parses the comma-separated member list into a Python list
on every lookup.

Seems like the best way to fix the problem is to cache pwd and grp
lookups in duplicity. I'm happy to submit a patch that does this.
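
For illustration only, here is a rough sketch of what such caching
could look like. It is not an actual patch: the helper name `cached`
and the module-level wrappers are hypothetical, and the sketch assumes
the passwd and group databases do not change during a run, since
cached entries are never refreshed.

```python
import grp
import pwd


def cached(lookup):
    """Wrap a one-argument lookup function with an in-memory cache.

    Hypothetical helper for illustration. Failed lookups (KeyError
    raised by the wrapped function) propagate and are not cached.
    """
    cache = {}

    def wrapper(key):
        try:
            return cache[key]
        except KeyError:
            pass
        # First time this key is seen: do the real lookup and keep it.
        result = cache[key] = lookup(key)
        return result

    return wrapper


# Cached drop-in replacements for the lookups mentioned above.
getpwuid = cached(pwd.getpwuid)
getpwnam = cached(pwd.getpwnam)
getgrgid = cached(grp.getgrgid)
getgrnam = cached(grp.getgrnam)
```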

Assuming this is a good idea, I've got a question about
duplicity.tarfile. It looks like this is a lightly modified copy of
Python 2.7's tarfile module, and caching would need to be added there
(as well as in duplicity.path). Would it be fine to replace calls to
grp.* and pwd.* with cached versions (e.g., replace calls to
grp.getgrgid with duplicity.cached_ops.getgrgid or something like
that)? Or would it be better to avoid modifying duplicity.tarfile by
writing some nastier code that replaces the actual functions in the
grp and pwd modules?
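
To make the second option concrete, a minimal sketch of replacing the
functions in the grp and pwd modules themselves might look like the
following. The name `install_cache` is hypothetical, and the sketch
assumes stale entries are acceptable for the lifetime of the process.

```python
import grp
import pwd


def install_cache(module, names):
    """Replace the named lookup functions on `module` with caching wrappers.

    Hypothetical helper for illustration; it patches the module in
    place, so every caller in the process sees the cached versions.
    """
    for name in names:
        original = getattr(module, name)
        cache = {}

        # Default arguments bind this iteration's function and cache,
        # so each wrapper keeps its own state.
        def wrapper(key, _original=original, _cache=cache):
            if key not in _cache:
                _cache[key] = _original(key)
            return _cache[key]

        setattr(module, name, wrapper)


install_cache(pwd, ["getpwuid", "getpwnam"])
install_cache(grp, ["getgrgid", "getgrnam"])
```

With this approach duplicity.tarfile would stay untouched, at the cost
of changing the behavior of pwd and grp for everything else running in
the same process.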
--
Steve Atwell <address@hidden>