[Monotone-devel] Re: Support for binary files, scalability and Windows port
From: graydon hoare
Subject: [Monotone-devel] Re: Support for binary files, scalability and Windows port
Date: Tue, 20 Jan 2004 13:25:18 -0500
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6b) Gecko/20031205 Thunderbird/0.4
Asger Kunuk Ottar Alstrup wrote:
> In order to represent this accurately, in the face of distributed use, I
> think you need to represent every single change as an edge in the graph
> somehow, and the order in which they happened. In other words, you
> effectively have to record an ordering of your back-edges or
> cancellation edges.
well, in a sense you're right. I don't want to belabor the point too
much, except to point out again that the DAG is kept over *manifest*
versions, not file versions. so for example if you add even 1 bit to a
ChangeLog file on each revision, the ChangeLog SHA1 changes, and the
manifest ID changes, and I have distinct nodes in my graph again.
granted, this is a bit of a cheap hack; it's just (a) simple and (b) in
the hands of the user. if they want to incorporate the date and time of
the last revision -- or a UUID for that matter -- into the notion of a
"version", it's as simple as making sure it shows up in an easily-merged
file somewhere in the manifest.
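
to make the manifest point concrete, here is a minimal sketch in python. it is not monotone's code: the "<sha1>  <path>" manifest layout and the names file_id and manifest_id are assumptions chosen for illustration. it only shows that appending to ChangeLog changes the file's SHA1, which changes the manifest text, which changes the manifest ID.

```python
import hashlib
from typing import Dict


def file_id(content: bytes) -> str:
    """SHA1 of a file's content, as a hex string."""
    return hashlib.sha1(content).hexdigest()


def manifest_id(files: Dict[str, bytes]) -> str:
    """SHA1 of a manifest listing every path with its file SHA1.

    The line layout here is illustrative only, not monotone's real format.
    """
    lines = [
        "{}  {}\n".format(file_id(data), path)
        for path, data in sorted(files.items())
    ]
    return hashlib.sha1("".join(lines).encode("utf-8")).hexdigest()


# Two revisions whose source files are identical; only ChangeLog grew.
rev1 = {
    "ChangeLog": b"initial import\n",
    "main.c": b"int main(void) { return 0; }\n",
}
rev2 = dict(rev1, ChangeLog=rev1["ChangeLog"] + b"back out the last change\n")

# The manifest IDs differ, so the two revisions are distinct graph nodes.
distinct_nodes = manifest_id(rev1) != manifest_id(rev2)
```

the same trick works with a date or a UUID in place of the ChangeLog line: anything that lands in some file in the manifest ends up in the manifest ID.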
it's a simple model, and simplicity is important to me: if monotone's
model of something grows too complex, my reasoning about the model gets
weak and error-prone, not to mention it becomes harder to explain the
model to users. since users like to consider version control "very
permanent and safe", it's important for them to understand what it's
doing beneath the covers, at least in general.
> That is a good proposal, and that might work for video files. I think I
> need to give you a little background of where I am coming from.
> ...
ahh, here is the juicy part. your needs are clearly formidable, and you
are willing to dedicate some effort to solving them. fair enough. let me
split what I see as your requirements into 3 sections:
- the need to mark some files as "opaque", in the sense that they are
not necessarily scanned for common substructure with their own past
versions or neighbours, not gzipped, not merged.
- the need to support very large files: overcoming the 16mb limit in
the database, and removing any cases in which files are loaded into
memory in their entirety.
- the need, possibly, to change the way files are identified for one
of two reasons: hashing takes too long, and (possibly) there are
unacceptable failure cases in history graphs built from hashes.
I can imagine handling "opaqueness" with a hook: call the hook with a
pathname (or other identifier), and if it returns true, monotone always
stores and sends complete versions (no xdelta or similar-block scanning)
and doesn't bother gzipping. not too much effort to implement. we'd need
to locate all the places we make assumptions about gzip and xdelta, and
predicate them on the hook.
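
a rough sketch of what "predicate them on the hook" could look like, in python. every name here (is_opaque, make_delta, encode_for_storage) is hypothetical, the glob patterns are made up, and make_delta is a trivial common-prefix stand-in, not xdelta. the only point is that one predicate decides between "store the whole thing untouched" and the usual delta-and-gzip path.

```python
import fnmatch
import zlib
from typing import Callable, Optional, Tuple


def is_opaque(path: str) -> bool:
    """Hypothetical user hook: True means 'treat this file as opaque'."""
    return any(fnmatch.fnmatch(path, pat) for pat in ("*.avi", "*.mpg", "*.iso"))


def make_delta(old: bytes, new: bytes) -> bytes:
    """Stand-in for a real delta algorithm (NOT xdelta).

    Encodes the length of the common prefix, then the differing tail.
    """
    n = 0
    limit = min(len(old), len(new))
    while n < limit and old[n] == new[n]:
        n += 1
    return n.to_bytes(8, "big") + new[n:]


def encode_for_storage(
    path: str,
    new: bytes,
    old: Optional[bytes] = None,
    opaque_hook: Callable[[str], bool] = is_opaque,
) -> Tuple[str, bytes]:
    """Return (kind, payload) describing how a version would be stored."""
    if opaque_hook(path):
        # Opaque: complete version, no delta scanning, no gzip.
        return ("full-raw", new)
    if old is not None:
        return ("delta-gz", zlib.compress(make_delta(old, new)))
    return ("full-gz", zlib.compress(new))


video_kind, video_payload = encode_for_storage("clip.avi", b"x" * 100, b"x" * 99)
text_kind, _ = encode_for_storage("notes.txt", b"hello world\n", b"hello\n")
```

the sending side would ask the same hook, so an opaque file is never scanned for similar blocks on the wire either.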
I think we're on our way to supporting large files. breaking the 16mb
barrier is probably the easy part since it can be confined to the
storage system. if we're going to a block-collection model for storage
anyways, that would buy you 16mb of block commands. say each block
command is 128 bits (16 bytes), then you can fit a million of those in
an existing 16mb fragment; if each command names a block of up to 16mb,
you might be able to store files of up to 16tb in size.
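
the arithmetic behind that estimate, spelled out in python. the one assumption not stated above is that each block command refers to a single block that is itself limited to 16mb; binary units (1mb = 2^20 bytes) are also assumed.

```python
MIB = 1024 * 1024
TIB = 1024 ** 4

fragment_bytes = 16 * MIB      # the existing per-item limit in the database
command_bytes = 128 // 8       # one 128-bit block command is 16 bytes

# How many block commands fit in one 16mb fragment: about a million.
commands_per_fragment = fragment_bytes // command_bytes

# Assumption: each command names one block of at most 16mb.
block_bytes = 16 * MIB

max_file_bytes = commands_per_fragment * block_bytes
max_file_tib = max_file_bytes // TIB
```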
removing all the places where we assume we can load a file into memory
might be hard, might not be. if you can live with loading the "top" item
in a file -- the up-to-16mb block-command list -- into memory all at
once, we only need to change places where we "reach inside" that data,
rather than all possible references. or, if even that is too expensive,
we could possibly make the data object lazy, so that it keeps a small
memory cache of its own sections, and loads/flushes them on demand.
complex, but doable.
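
a minimal sketch of the "lazy data object" idea, in python. the class name LazyData, the loader callback and the section size are all hypothetical; the real storage layer would supply the loader. the object never holds more than max_cached sections, and drops the least recently used one when it needs room.

```python
from collections import OrderedDict
from typing import Callable


class LazyData:
    """Byte container that loads fixed-size sections on demand."""

    def __init__(self, size: int, load_section: Callable[[int], bytes],
                 section_size: int = 4096, max_cached: int = 4) -> None:
        self._size = size
        self._load = load_section
        self._section_size = section_size
        self._max_cached = max_cached
        self._cache = OrderedDict()   # section index -> bytes, oldest first
        self.loads = 0                # how many times the loader was called

    def _section(self, index: int) -> bytes:
        if index in self._cache:
            self._cache.move_to_end(index)      # mark as recently used
            return self._cache[index]
        data = self._load(index)
        self.loads += 1
        self._cache[index] = data
        if len(self._cache) > self._max_cached:
            self._cache.popitem(last=False)     # flush least recently used
        return data

    def read(self, offset: int, length: int) -> bytes:
        """Return up to `length` bytes starting at `offset`."""
        end = min(offset + length, self._size)
        out = bytearray()
        pos = offset
        while pos < end:
            index, start = divmod(pos, self._section_size)
            take = min(self._section_size - start, end - pos)
            out += self._section(index)[start:start + take]
            pos += take
        return bytes(out)


# Stand-in backing store: 16 KiB of bytes sliced into 4 KiB sections.
backing = bytes(range(256)) * 64
blob = LazyData(len(backing),
                lambda i: backing[i * 4096:(i + 1) * 4096],
                section_size=4096, max_cached=2)
```

code that only "reaches inside" through read() would not need to know whether the data is resident or not.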
finally, the change of identifier type: again, I am wary of the
indirection-table approach, so I am trying to consider alternatives. I
think this could be done with a hook. the calculate_identifier() calls
could be changed to depend on a hook which optionally picks some
non-SHA1 way of calculating identifiers. then if you have something else
in mind you can use it. it would require all the users of a given
project to have that hook installed, but otherwise monotone would be
completely ignorant of your chosen strategy.
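
a sketch of how an identifier hook might be wired in, in python. this is not the real calculate_identifier(); the signature, the hook type and the sampled_identifier strategy are assumptions for illustration. the hook may decline by returning None, in which case SHA1 is used as usual.

```python
import hashlib
from typing import Callable, Optional

# Hypothetical hook type: returns an identifier, or None to fall back to SHA1.
IdentifierHook = Callable[[bytes], Optional[str]]


def calculate_identifier(data: bytes,
                         hook: Optional[IdentifierHook] = None) -> str:
    """Identifier for `data`: the hook's answer if it gives one, else SHA1."""
    if hook is not None:
        ident = hook(data)
        if ident is not None:
            return ident
    return hashlib.sha1(data).hexdigest()


def sampled_identifier(data: bytes) -> Optional[str]:
    """Example project hook (made up): cheap identifier for files over 1mb.

    Hashes only the length plus the first and last 64 KiB, so two large
    files that differ only in the middle get the SAME identifier. That
    trade-off is the project's to accept, not monotone's.
    """
    if len(data) <= 1024 * 1024:
        return None
    h = hashlib.sha1()
    h.update(str(len(data)).encode("ascii"))
    h.update(data[:65536])
    h.update(data[-65536:])
    return "sampled:" + h.hexdigest()


small = b"abc"
large = b"\x00" * (2 * 1024 * 1024)
small_id = calculate_identifier(small, sampled_identifier)
large_id = calculate_identifier(large, sampled_identifier)
```

everyone on the project has to install the same hook, otherwise two users compute different identifiers for the same file.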
would this set of changes satisfy your needs? I would be happy to
accommodate these as they are mostly hidden from smaller-scale users, and
can be described in "advanced use" sections of the manual. they are the
sort of compromise unlikely to cause mainstream breakage or unnecessary
multiplication of ideas in the easy cases.
-graydon