monotone-devel

Re: [Monotone-devel] Re: New strange monotone errors


From: Nathaniel Smith
Subject: Re: [Monotone-devel] Re: New strange monotone errors
Date: Thu, 11 Mar 2004 03:50:12 -0800
User-agent: Mutt/1.5.5.1+cvs20040105i

On Mon, Mar 08, 2004 at 10:05:32AM -0500, graydon hoare wrote:
> the upshot was that during netsync, you wrote some manifest certs to 
> disk but not the accompanying manifests (because monotone crashed during 
> formatting of an internal log message). as a result, you had a cert 
> without a manifest, and it complained when you tried to update to the 
> manifest (located by examining certs).
> 
> I should make the update algorithm a bit more tolerant about this. in 
> general, it is always possible to have a cert referring to a 
> not-yet-existent version. perhaps it should issue a warning if it 
> selects one as an update target, but then it should back off and 
> continue updating, not crash.

I think this is important, and maybe a more general solution is
needed.  Perhaps during graph operations, certs that don't have
corresponding manifests should be collapsed out on the fly (i.e., if
a given node doesn't really exist, remove it from the graph and
attach all its parents directly to all its children), and operations
performed on the resulting logical graph?  Or something like that.
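
To make the collapsing idea concrete, here is a minimal sketch in Python. It is not monotone code: the function name collapse_missing, the (parent, child) edge list, and the set of ids that have stored manifests are all assumptions made for illustration. It only builds an in-memory logical view and stores nothing, so no fake ancestry links would be written to the database.

```python
def collapse_missing(edges, present):
    """Return the logical edge set with missing nodes collapsed out.

    edges   -- iterable of (parent, child) id pairs taken from ancestry certs
               (hypothetical representation, assumed to form a DAG)
    present -- set of ids whose manifests are actually stored
    """
    parents, children, nodes = {}, {}, set()
    for p, c in edges:
        nodes.update((p, c))
        children.setdefault(p, set()).add(c)
        parents.setdefault(c, set()).add(p)

    # Remove missing nodes one at a time; chains of missing nodes are
    # handled because each removal rewires the neighbours left behind.
    for node in sorted(nodes - set(present)):
        ps = parents.pop(node, set())
        cs = children.pop(node, set())
        for p in ps:
            children[p].discard(node)
            children[p].update(cs)   # attach each parent to all children
        for c in cs:
            parents[c].discard(node)
            parents[c].update(ps)    # attach each child to all parents

    return {(p, c) for p, cs in children.items() for c in cs}


# Example: B and C have certs but no manifests.
example_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E"), ("E", "D")]
print(sorted(collapse_missing(example_edges, {"A", "D", "E"})))
```

Because the view is recomputed from the full cert graph each time, a manifest that shows up later simply stops being collapsed.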

The problem is that 1) we want to avoid creating fake ancestry links
that hop over missing manifests (as per the discussion a while
back in the context of depots), and 2) we do want to be able to prune
old history -- not just by truncation, but also by general subsets
(e.g., past some date only storing tagged releases).  I propose
therefore that we always keep the entire cert graph around and make
graph operations robust against missing manifests.  That way, if
manifests ever become available they will transparently be used, and
we minimize the chance of getting a disconnected graph.

I don't think storing all certs will cause space problems, even in
projects big enough for pruning to be desirable.  The biggest MT db I
happen to have around right now is an import of the libjava CVS tree,
and here are some numbers:

~/src/monotone/gcc$ ls -l libjava.db 
-rw-r--r--    1 njs      users        133M Oct 21 14:32 libjava.db
~/src/monotone/gcc$ echo .dump manifest_certs | sqlite libjava.db | wc -c
14783793
~/src/monotone/gcc$ echo .dump manifest_certs | sqlite libjava.db | gzip -9 -c | wc -c
7029262
~/src/monotone/gcc$ du -sh gcc/libjava 
27M     gcc/libjava
~/src/monotone/gcc$ tar cf - gcc/libjava | gzip -9 | wc -c 
3622483

I.e., the certs alone take roughly 15M, or 7M compressed (which is the
relevant number when considering the amount of downloading required to
get a starting repository).  A single working directory, for
comparison, takes 27M, or 3.5M (!) compressed.  It'd be interesting to
run similar numbers for, say, the whole gcc repository, but things
don't look immediately prohibitive...
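
For anyone who wants to redo the arithmetic, the comparison can be reproduced from the byte counts in the transcript above. This is only a sketch; the constant and function names are made up for illustration, and the sizes are the wc -c figures quoted above.

```python
# Byte counts copied from the shell transcript above.
CERTS_RAW = 14783793   # .dump manifest_certs, uncompressed
CERTS_GZ = 7029262     # same dump through gzip -9
TREE_GZ = 3622483      # tar of gcc/libjava through gzip -9

MIB = 1024 * 1024


def to_mib(n_bytes):
    """Convert a byte count to mebibytes."""
    return n_bytes / MIB


def certs_to_tree_ratio():
    """Compressed cert dump size relative to one compressed working tree."""
    return CERTS_GZ / TREE_GZ


print(f"certs raw:        {to_mib(CERTS_RAW):.1f} MiB")
print(f"certs compressed: {to_mib(CERTS_GZ):.1f} MiB")
print(f"tree compressed:  {to_mib(TREE_GZ):.1f} MiB")
print(f"certs/tree ratio: {certs_to_tree_ratio():.2f}")
```

So the compressed certs come to a bit under twice the size of one compressed working directory.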

-- Nathaniel

-- 
  /* Tell the world that we're going to be the grim
   * reaper of innocent orphaned children.
   */
-- Linux kernel 2.4.5, main.c



