savannah-hackers-public

[Savannah-hackers-public] Re: Git hosting techniques


From: Petr Baudis
Subject: [Savannah-hackers-public] Re: Git hosting techniques
Date: Sat, 4 Nov 2006 13:08:45 +0100
User-agent: Mutt/1.5.13 (2006-08-11)

  Hi,

  cc'ing address@hidden since this might be interesting for other
Git people as well.

On Sun, Oct 29, 2006 at 06:54:46PM CET, Sylvain Beucler wrote:
> We're currently setting up something similar at
> http://cvs.sv.gnu.org/gitweb/,

  That's great!

> I would like to know if you considered the ability to autopack
> repositories to optimize space and disk i/o. For example, we're
> experimenting with the coreutils repository which weighs 1.1GB. Since
> you mirror the glibc repository, maybe you have similar issues?

  Currently I do it in a rather silly way: when I do an "all-repo
check" every hour (which updates mirrors of external repositories etc.),
I also check for unpacked objects, and if there are any, I repack
the repository; see

        http://repo.or.cz/w/repo.git?a=blob;f=updatecheck.sh;hb=HEAD
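
  As a rough illustration of the "check for unpacked objects, then
repack" idea, here is a minimal Python sketch. It is not a transcription
of updatecheck.sh; the function names and the `threshold` parameter are
made up for this example. It assumes a bare repository whose loose
objects live under objects/xx/ (two hex digits), and it shells out to
`git repack -a -d` for the full repack.

```python
import os
import subprocess

HEX_DIGITS = "0123456789abcdef"


def count_loose_objects(git_dir):
    """Count files under objects/xx/, where xx is a two-hex-digit fan-out dir."""
    objects_dir = os.path.join(git_dir, "objects")
    if not os.path.isdir(objects_dir):
        return 0
    count = 0
    for name in os.listdir(objects_dir):
        # Skip objects/pack, objects/info and anything else that is not fan-out.
        if len(name) != 2 or any(c not in HEX_DIGITS for c in name):
            continue
        subdir = os.path.join(objects_dir, name)
        if os.path.isdir(subdir):
            count += len(os.listdir(subdir))
    return count


def repack_if_needed(git_dir, threshold=1):
    """Run a full repack only if at least `threshold` loose objects exist.

    `threshold` is a hypothetical knob: 1 mimics "repack if there are any
    unpacked objects"; a larger value would repack less often.
    """
    if count_loose_objects(git_dir) < threshold:
        return False
    subprocess.run(["git", "--git-dir=" + git_dir, "repack", "-a", "-d"],
                   check=True)
    return True
```

  Calling repack_if_needed() for each repository from an hourly job
reproduces the simple behaviour described above; raising the threshold
is one cheap way to avoid repacking on every run.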

  This is not optimal behaviour, for two reasons:

  (i) A full repack can be a lot of work on large repositories, so we
shouldn't *always* repack; more importantly, we should only rarely do
a full repack (see below).

  (ii) This is very unfriendly to those who fetch over HTTP, because
after you do a full repack, they will need to download the whole new
packfile instead of just the missing objects.

  The best solution would be to have a more intelligent repacking
strategy, where you have "archival" packs with very old history and an
active pack with just the new changes, and when you pack the loose
objects they just get appended to the "current" pack. Alternatively,
a slightly more complicated but even more flexible "logarithmic"
repacking strategy could be implemented, see

        http://news.gmane.org/find-root.php?message_id=<address@hidden>
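
  To make the idea a bit more concrete, here is a small Python sketch of
one possible reading of a "logarithmic" strategy. It is my own
illustration, not the scheme from the thread linked above: the function
name, the `factor` parameter and the merge rule are all assumptions.
Given the sizes of the existing packs, it rolls the smallest ones
together until the next pack is at least `factor` times larger than the
combined result, so large "archival" packs are left untouched and HTTP
fetchers only re-download the small merged pack.

```python
def plan_geometric_repack(pack_sizes, factor=2):
    """Decide which packs to merge and which to keep as they are.

    Returns (to_merge, to_keep), both lists of sizes in ascending order.
    Hypothetical rule: always take the smallest pack, then keep absorbing
    the next one while it is smaller than `factor` times the running total.
    """
    sizes = sorted(pack_sizes)
    to_merge = []
    total = 0
    i = 0
    while i < len(sizes) and (i == 0 or sizes[i] < factor * total):
        total += sizes[i]
        to_merge.append(sizes[i])
        i += 1
    return to_merge, sizes[i:]


if __name__ == "__main__":
    # Sizes in MB: one big archival pack plus a few small recent ones.
    merge, keep = plan_geometric_repack([100, 1, 2, 1])
    print("merge:", merge, "keep:", keep)
```

  With the sizes in the example, the three small packs are merged and
the 100 MB pack is kept. A single-element merge list means there is
nothing worth doing on this run.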

  Even with the dumb packing strategy though, I think it pays off if you
have at least a bit of CPU power to spare. The packing savings are
really immense. For example, with the glibc repository, an incremental
CVS import of a few days' worth of changes _doubled_ the size of the
repository (from 100M to 200M), while repacking brought it back to the
original size (100M) + epsilon.

-- 
                                Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)



