
Re: [Monotone-devel] url schemes


From: Markus Schiltknecht
Subject: Re: [Monotone-devel] url schemes
Date: Mon, 24 Mar 2008 14:14:23 +0100
User-agent: Mozilla-Thunderbird 2.0.0.9 (X11/20080109)

Hi,

Derek Scherger wrote:
Xxdiff does work reasonably well to look over whitespace-polluted diffs if you turn off display of whitespace. ;)

Oh, thanks, I will give that a try.

that that's not the case where nuskool is supposed to be the winner.

I'm assuming that if this does work out it will replace netsync, and it just can't be slower than netsync and still be successful, imho.

I can see two workarounds for that: either partial pull (which will be much easier to implement on top of nuskool), or automating the "full repository download" hack in case the target database is still empty.

However, if we can tweak nuskool to handle initial pulls equally well, that would be even better, yes.
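
Just to sketch what I mean by automating the hack (the snapshot URL and file name here are pure assumptions, nothing the server currently provides):

    import os
    import urllib.request

    def bootstrap_if_empty(local_db, base_url):
        # If there is no local database yet, grab a full snapshot over plain
        # HTTP instead of pulling revision by revision; a normal incremental
        # pull still has to run afterwards to catch up.
        # (Hypothetical: assumes the server exposes <base_url>/snapshot.mtn.)
        if os.path.exists(local_db):
            return False
        urllib.request.urlretrieve(base_url + "/snapshot.mtn", local_db)
        return True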

Yeah, the revision refinement phase is really quick. Side note: I'm not 100% sure it's correct yet. I do recall seeing a push report X outbound revs while a pull with the databases reversed reported some other number of inbound revs. We need to double-check this.

Oh, really? Hm... Well, I've started writing unit tests for gsync. Maybe we should enhance those to be more useful, e.g. test some random DAGs or something.

Oh, another note here. I purposely set things up in run_gsync_protocol so that the client knows exactly which revisions are inbound and outbound, thinking that we really want something like push/pull/sync --check to list (but not transfer) the revisions that would be transferred. The mercurial equivalents are the incoming/outgoing commands.

Yeah, that would certainly be useful as well. Should be straightforward to implement with nuskool. :-)
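
Just to make the check idea concrete (this doesn't mirror the actual run_gsync_protocol code; the parent-map representation is made up for the example), once both sides know each other's heads, inbound and outbound are simply set differences over ancestry:

    def ancestry(heads, parents):
        # All revisions reachable from the given heads;
        # `parents` maps revision id -> list of parent revision ids.
        seen, todo = set(), list(heads)
        while todo:
            rev = todo.pop()
            if rev not in seen:
                seen.add(rev)
                todo.extend(parents.get(rev, []))
        return seen

    def incoming_outgoing(local_heads, local_parents, remote_heads, remote_parents):
        # Revisions only the remote side has (inbound) and only we have (outbound).
        local = ancestry(local_heads, local_parents)
        remote = ancestry(remote_heads, remote_parents)
        return remote - local, local - remote

Since these are symmetric set differences, a push on one side and a pull with the databases reversed should report exactly the same counts - which is also how a unit test could catch the mismatch you saw.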

This may require a bit more information coming back in the descendants response, including author/date/changelog/branch certs for example. The thought of combining author/date/changelog/branch into one commit cert crossed my mind here again. The current certs don't allow us to tie the correct things together. Maybe we should start another branch to combine these certs into a single commit cert.

Absolutely, yes.
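
Just to make that concrete (the field names and the signing interface are invented for illustration, not a proposal for the actual cert format), a combined commit cert would tie all four values to the revision under a single signature:

    import json

    def make_commit_cert(revision_id, author, date, changelog, branch, sign):
        # `sign` stands in for whatever signing primitive the keystore
        # provides; the payload layout is purely illustrative.
        payload = {
            "revision": revision_id,
            "author": author,
            "date": date,
            "changelog": changelog,
            "branch": branch,
        }
        blob = json.dumps(payload, sort_keys=True).encode("utf-8")
        return {"payload": payload, "signature": sign(blob)}

One signature over all four fields means they can no longer drift apart the way four independent certs can.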

That applies to the current http channel. Other channels might or might not use JSON. Or maybe we even want to add different content-types for http, i.e. return JSON or raw binary depending on the HTTP Accept header.

Yeah, both ideas have crossed my mind as well.

:-)
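
Roughly what I have in mind for the Accept-header idea (the handler shape and the binary encoding are made up; it only shows the dispatch):

    import json

    def encode_binary(data):
        # Stand-in for whatever raw binary format we would define:
        # length-prefixed UTF-8 key/value pairs, purely illustrative.
        out = bytearray()
        for key, value in data.items():
            for part in (key, str(value)):
                raw = part.encode("utf-8")
                out += len(raw).to_bytes(4, "big") + raw
        return bytes(out)

    def respond(environ, revision_data):
        # Choose the response content-type from the HTTP Accept header.
        accept = environ.get("HTTP_ACCEPT", "")
        if "application/octet-stream" in accept:
            return "application/octet-stream", encode_binary(revision_data)
        return "application/json", json.dumps(revision_data).encode("utf-8")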

I may just try having get_revision include all of the file data/delta details as well, and see how big these get in the monotone database. If we didn't first encode the json object as a string and subsequently write it to the network, we could just start writing bytes until we were done and not have to hold them all in memory. However, this causes problems with trying to set the Content-Length header.

Yes, plus problems with dumb servers... (I'm getting annoying with that topic, am I not? :-) )
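
For what it's worth, the standard way around buffering the whole encoded object just to set Content-Length is chunked transfer encoding - assuming HTTP/1.1 on both ends and a frontend that passes it through, which a dumb server or plain CGI setup may well not. A minimal sketch:

    def write_chunked(sock, fragments):
        # Stream already-encoded fragments with Transfer-Encoding: chunked,
        # so no Content-Length (and no full in-memory copy) is needed.
        for fragment in fragments:
            if fragment:
                sock.sendall(b"%x\r\n" % len(fragment))
                sock.sendall(fragment + b"\r\n")
        sock.sendall(b"0\r\n\r\n")  # terminating zero-length chunk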

I'm not sure what to think of issuing several requests (one for each file data/delta in a revision, perhaps up to some limit). Actually, I don't think it would help, because the server can only handle one request at a time afaict; otherwise there will be multiple scgi processes running and there will be database lock issues.

Right, however, normal (dumb?) http servers can perfectly well handle multiple requests, so this time, dumb servers don't look that dumb. ;-)

And even if the scgi server serialized requests to the database, it would still reduce waiting on network latency on average, because multiple requests could be issued concurrently.
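
A little sketch of that on the client side (fetch_file and the URL layout are assumptions; the point is only that several requests can be in flight while each one waits on the network):

    from concurrent.futures import ThreadPoolExecutor
    import urllib.request

    def fetch_file(base_url, file_id):
        # Hypothetical URL layout, purely for illustration.
        with urllib.request.urlopen(base_url + "/file/" + file_id) as response:
            return file_id, response.read()

    def fetch_files(base_url, file_ids, max_in_flight=4):
        # Even if the server serializes database access, the network
        # round trips of up to max_in_flight requests overlap.
        with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
            return dict(pool.map(lambda fid: fetch_file(base_url, fid), file_ids))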

So would a hand-optimized sha1 implementation. Would someone just write one of these already! ;)

Uh... isn't the one provided by botan hand-optimized enough? If not, let's please tweak that, so other botan users can take advantage as well.

(The library-build branch should allow us to link against optimized botan code, one day. That's one of the things we should really work on...)

I went with the fine-grained get/put request/response pairs so that neither side would end up having to hold too many files in memory at any one time. If we instead requested all file data/deltas for one rev, the number of round trips would be reduced, but we'd end up having to hold at least one copy (probably more) of the works in memory, which didn't seem so good. I'm open to suggestions. ;)

I don't think files necessarily need to be put together by revision - that would be a rather useless collection for small changes. Instead, we should be able to collect any number of files together - and defer writing the revision until we have all of them.

I'm not really sure where you're going with this.

You were saying that if we requested all file data/deltas *for one revision* in one request, that would reduce round trips.

However, lots of revisions just change one file, so that optimization wouldn't reduce round trips for those.

AFAICT, mercurial transfers so-called changegroups, where they group file data/deltas together across revisions - they simply don't care what revision a file belongs to for those changegroups.

However, such a thing would not be compatible with dumb servers, so I consider it an optimization, at best.
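
To sketch what I mean by collecting files independently of revisions (fetch_batch, write_file and write_revision are placeholders for whatever the real database layer ends up offering):

    def pull_in_batches(needed, fetch_batch, write_file, write_revision,
                        batch_bytes=1 << 20):
        # `needed` maps revision id -> list of (file id, approximate size).
        # Files are requested in ~1 MB groups across revisions; a revision
        # is written only once the last of its files has arrived.
        pending = {rev: set(fid for fid, _ in files) for rev, files in needed.items()}

        def flush(batch):
            for (rev, fid), data in zip(batch, fetch_batch([fid for _, fid in batch])):
                write_file(fid, data)
                pending[rev].discard(fid)
                if not pending[rev]:
                    write_revision(rev)

        batch, size = [], 0
        for rev, files in needed.items():
            for fid, approx_size in files:
                batch.append((rev, fid))
                size += approx_size
                if size >= batch_bytes:
                    flush(batch)
                    batch, size = [], 0
        if batch:
            flush(batch)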

Agreed, however, I'm wondering how popular or useful scripted pushing/pulling is going to be.

Yeah, that's hard to say. But with all the Web 2.0 hype... maybe we can start a VCS 2.0 hype and get surprised by all the new applications built on top of our API... :-)

When I first saw the json format I thought that it might have been nice to have that rather than basic_io, but it probably didn't exist at the time basic_io was invented.

You mean using JSON internally, as a revision/manifest/roster format? It might save some space, but I find basic_io more human-readable, and favor it for that reason.

Yeah, the base64 encoding/decoding of file content is another extra step that shouldn't really be needed.

Yeah, json is not optimal in that respect.

How about other, more space-efficient encodings, like Ascii85? There must be something which encodes binary data in UTF-8 strings, no? Something that's also usable for UTF-8 encoded XML CDATA. Such an encoding must exist, otherwise I'm gonna write one... (ehm, or probably not...).
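
Ascii85 does exist as more or less what I was after (Python's base64 module even ships it as a85encode); a quick comparison, just to put numbers on the overhead:

    import base64, os

    blob = os.urandom(3000)          # pretend this is file data
    b64 = base64.b64encode(blob)     # 4 output bytes per 3 input bytes, ~33% overhead
    a85 = base64.a85encode(blob)     # 5 output bytes per 4 input bytes, ~25% overhead
    print(len(blob), len(b64), len(a85))   # 3000 4000 3750

Both outputs are plain ASCII (and therefore valid UTF-8), but note that Ascii85 uses characters like '<', '>' and ']', so it would still need some care inside XML CDATA.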

Or storing the revisions in the database as binary rather than text, but I guess we don't actually use the revisions themselves that much, do we? Seems like a reasonable idea.

...ehm... yeah, I had quite a debate with Nathaniel on this topic during the last summit. Since then I haven't really dared to bring this idea up again ;-)

His counter-argument was that rosters are cached data and it's nice to be able to regenerate them. He was concerned about not being able to change the rosters format.

And I can certainly see his point. Just recently, for nvm.e.db-compaction, it looks like I've changed the rosters format (I didn't double-check yet, so I'm not quite sure). And it was nice to just force a db regenerate_caches and be done with it, instead of having to write a rosters format converter.

However, I still think dropping the revisions from the database would be worthwhile - because it would not only reduce database size, but also help reduce i/o bandwidth.

In general, I think it would be great if we had a few people working together on all of these things, rather than one poor lonely soul on each of them. You and Zack seem to have been doing a bit of this on the compaction and encapsulation branches and I'm sure it's more fun and produces better results that way.

Sure it is. And I'd say we had some of that fun on nuskool as well. I'd certainly like to continue that. However, spare time is pretty tight sometimes... I just dedicated these Easter days to it, instead of doing my book-keeping. But I cannot defer that endlessly :-(

Regards

Markus




