[Monotone-devel] Re: results of mercurial user survey
From: Graydon Hoare
Subject: [Monotone-devel] Re: results of mercurial user survey
Date: Sat, 29 Apr 2006 10:47:51 -0700
User-agent: Thunderbird 1.5.0.2 (Windows/20060308)
Bruce Stephens wrote:
> I was just doing a quick estimate, and I think it's likely that the
> SHA1 and RSA cost for checking everything in the current venge.net
> repository is a minute or two rather than an hour or two.
> If monotone were to give up verification, then it would have to be
> because that would avoid some other aspects of work: reconstructing
> files, reversing deltas, or whatever.
Two points:
First, boring though it feels, please stick to using profiles; do not
make up performance stories. The profiles sometimes mention SHA1, but
they almost always show other things that account for far more time:
missed inlining opportunities, combinatorial explosions, bad buffering,
pessimistic cache behavior, etc. Please stick to what the profiles tell you.
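As a minimal sketch of what "sticking to the profile" looks like, the toy below profiles a made-up workload with Python's standard cProfile and pstats modules. The functions checksum, rebuild and workload are hypothetical stand-ins and are not monotone code; the only point is that the report, not a guess, says where the time goes.

```python
import cProfile
import hashlib
import io
import pstats


def checksum(chunks):
    """Hash every chunk with SHA-1 (hypothetical stand-in for integrity checks)."""
    return [hashlib.sha1(c).hexdigest() for c in chunks]


def rebuild(chunks):
    """Deliberately wasteful reassembly: each step copies everything so far."""
    out = b""
    for c in chunks:
        out = out + c
    return out


def workload():
    """Made-up workload that mixes hashing with badly buffered copying."""
    chunks = [bytes([i % 256]) * 512 for i in range(1000)]
    checksum(chunks)
    rebuild(chunks)


def profile_report(func, limit=10):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(limit)
    return buf.getvalue()


if __name__ == "__main__":
    print(profile_report(workload))
```

The numbers will differ from machine to machine, which is rather the point: read them off the report instead of predicting them.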
Second, there is no specific part of monotone which you can point to and
say "this is where we do verification"; the concept is spread all
through the program's design. And it's really not so much that we
"verify"; as Nathaniel pointed out, the things specifically marked as
"sanity checking" or "verifying" code rarely dominate any profiles.
However, there's a kernel of truth in here: the fact is that we "do
work" in between the network and the disk. What work?
- Selecting the right information to send.
- Transforming from the format we store in to the format we send.
- Transforming back to the format to store in.
- Integrating the received information into a uniform store.
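The four steps above can be sketched as a toy pipeline. Everything here is an assumption made for illustration: the store is a plain dict of zlib-compressed blobs keyed by SHA-1, and the wire format is a made-up "key, length, raw payload" record. Neither is monotone's real database schema or its netsync protocol.

```python
import hashlib
import struct
import zlib


def store_put(store, payload):
    """Store a payload compressed, keyed by its SHA-1 (toy storage format)."""
    key = hashlib.sha1(payload).hexdigest()
    store[key] = zlib.compress(payload)
    return key


def select(store, wanted):
    """Step 1: pick only the requested items that we actually have."""
    return [k for k in wanted if k in store]


def to_wire(store, keys):
    """Step 2: storage format -> wire format (40-char key, length, raw payload)."""
    out = bytearray()
    for k in keys:
        payload = zlib.decompress(store[k])
        out += k.encode("ascii") + struct.pack(">I", len(payload)) + payload
    return bytes(out)


def from_wire(data):
    """Step 3: wire format -> (key, payload) pairs, rejecting mismatched hashes."""
    pos = 0
    while pos < len(data):
        key = data[pos:pos + 40].decode("ascii")
        (length,) = struct.unpack(">I", data[pos + 40:pos + 44])
        payload = data[pos + 44:pos + 44 + length]
        if hashlib.sha1(payload).hexdigest() != key:
            raise ValueError("payload does not match its key")
        yield key, payload
        pos += 44 + length


def integrate(store, items):
    """Step 4: fold received items into the receiver's single store."""
    for key, payload in items:
        store.setdefault(key, zlib.compress(payload))


def sync(sender, receiver, wanted):
    """Run all four steps; every byte passes through program logic."""
    integrate(receiver, from_wire(to_wire(sender, select(sender, wanted))))
```

Even in this toy, each item is decompressed, re-encoded, parsed, hashed and recompressed on its way from one store to the other. That is the "work" in question, and none of it is labelled verification.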
These design decisions are deeply embedded in the program. The storage
format is intended not to leak out. I'm confident that we can make the
existing structure a fair bit faster -- there is still a lot to tune --
but without extensive redesign there will be a limit to the speed, and
that limit will be lower than our competitors'. The reason is simple:
our competitors chose the opposite design:
- Their transmission format is identical to their storage format.
- Their storage units are pre-separated into bundles representing the
types of transmission you might like to make.
These decisions mean that their networking often reduces to something
like sendfile(). The decisions also imply some negatives:
- They are forced to separate branches into separate locations, and
cannot easily do fine-grained access control or mix branches the
way we can.
- By rarely reconstructing the storage format, they are more likely
to let global or structural inconsistencies sit unnoticed.
- By coupling the storage and transmission formats, they make it
harder to adjust one without adjusting the other. We have more
flexibility there.
- Since we're synthesizing the storage format on the fly anyway,
we can do things like repacking and rearranging the delta graph
as we write.
- Their repositories typically contain lots of files, rather than
our single SQLite file.
You might, by analogy, think of it as the difference between a
CGI-driven website and one serving static content. Which is better? The
CGI-driven site can do more stuff, and do more *detailed* stuff, because
it has more logic in it. The static site can serve the fixed set of
pages it has much faster. Can you make a slow CGI run faster? Often. But
seldom as fast as a static site. The logic of sendfile() is hard to beat.
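For contrast, here is a minimal sketch of the "static" path, assuming the stored file already is the transmission format. It uses Python's socket.sendfile, which hands the copy to os.sendfile where the platform supports it and falls back to plain send() otherwise. The helper names serve_static, recv_exactly and demo are hypothetical, as is the "bundle" content.

```python
import os
import socket
import tempfile


def serve_static(path, sock):
    """Send a stored file as-is; no per-item decoding or re-encoding."""
    with open(path, "rb") as f:
        return sock.sendfile(f)


def recv_exactly(sock, n):
    """Read n bytes from sock, or fewer if the peer closes early."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def demo():
    """Ship a small pre-built file across a local socket pair."""
    payload = b"pre-built bundle bytes\n" * 100
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    server, client = socket.socketpair()
    try:
        sent = serve_static(path, server)
        received = recv_exactly(client, sent)
    finally:
        server.close()
        client.close()
        os.unlink(path)
    return payload, sent, received


if __name__ == "__main__":
    payload, sent, received = demo()
    print(sent, received == payload)
```

The serving side never looks at the bytes it sends, which is why a design that has to inspect and rebuild each item has trouble matching it.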
There is some work -- called "monotone dumb" -- to make monotone have an
"externalization form" which can be retrieved at sendfile() speed. It
will carry some of the same limitations as our competitors, but maybe
those limitations will prove acceptable. The difficulty lies in the fact
that the monotone *client* will still need to integrate the externalized
information into its database. None of the normal monotone commands know
how to work with such externalized forms. They all expect there to be a
database. So the client will remain a bottleneck in such a situation,
though only "half a bottleneck" compared to today.
-graydon