Re: [Monotone-devel] Project separation


From: Thomas Keller
Subject: Re: [Monotone-devel] Project separation
Date: Thu, 07 Oct 2010 15:28:24 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686; de; rv:1.9.1.11) Gecko/20100714 SUSE/3.0.6 Lightning/1.0b2pre Thunderbird/3.0.6

On 07.10.2010 14:58, Timothy Brownawell wrote:
> On 10/07/2010 04:45 AM, Thomas Keller wrote:
>>
>> Hi all!
>>
>> I already brought up the idea on IRC some time ago - I am looking for a
>> way to restrict allowed incoming revisions on the server side. No, I
>> don't plan to go towards the complexity of policy branches, which Tim has
>> been working on for quite some time already; I'm just looking for a
>> simple way to keep different project trees separated in different
>> databases. (Guess what I'm talking about, right, our IDF setup!)
>>
>> Our merkle sync algorithm right now is solely based on branches -
>> whatever revision has a certain branch name attached gets transferred,
>> including all of the needed history. So in theory all you need to do is
>> attach a wrong branch certificate to a revision of a completely different
>> tree to create some merge chaos (sure, only temporarily, until somebody
>> suspends the wrong head).
>>
>> So what I was maybe heading for was some notion of an "origin" cache which
>> all of our revisions and certificates could get. This could simply be the
>> root revision of a project, which separates different project trees from
>> each other and which is just added as a token to every issued cert and
>> every revision.
>>
>> The merkle trie algorithm would now take this origin into account:
>> Before the actual changes are determined, both nodes first agree on
>> which origins they want to exchange contents for. By default all origins
>> are taken into consideration, but people could override this setting
>> per database with a specific variable. (Yes, this would also mean that
>> you could pull "net.venge.monotone{,.*}" into your monotone-only
>> database and you would really get all monotone branches, and not the
>> guitone, usher, tracmtn, debian, et cetera ones unless wanted!)
>>
>> Now that the origin agreement has led to a specific set of revisions and
>> certs on both sides (keys are not restricted by this), the normal merkle
>> algorithms would apply to find out what actually needs to be transferred
>> to either side.
> 
> Hm... I suppose this would have to go in the anonymous/auth and confirm
> cmds, but then the client would have to wait and not build its merkle
> tree until it got the confirm_cmd.
> 
> I suppose if there was a restriction set, the side with the restriction
> would abort the connection if version negotiation resulted in something
> too old to support it?

For backwards compatibility I thought it might be cool to let the new
server speak both the old and the new netsync chat. The client already
tells the server what netsync version it supports, and we could simply
abort the connection if a restriction is configured / applied somehow
server-side and the client is too old to negotiate this restriction. If
there is no restriction in place, it's probably not necessary to deny old
clients.
Well, maybe I underestimate the effort of making "optional" protocol
parts actually work. Maybe this is not even a good idea and we should
start simple...
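
To make the "abort only if a restriction is in place" rule concrete, here
is a minimal Python sketch. It is not monotone code: the version numbers,
the NetsyncAbort exception and the negotiate_version function are all
hypothetical names made up for illustration.

```python
class NetsyncAbort(Exception):
    """Raised when the connection has to be dropped during negotiation."""


# Hypothetical: first protocol version that understands root restrictions.
MIN_VERSION_WITH_RESTRICTIONS = 8


def negotiate_version(server_max, client_max, restriction_configured):
    """Return the protocol version both sides can speak.

    Old clients are only refused when the server actually has a
    restriction configured that they could not negotiate.
    """
    agreed = min(server_max, client_max)
    if restriction_configured and agreed < MIN_VERSION_WITH_RESTRICTIONS:
        raise NetsyncAbort(
            "client too old to negotiate the configured restriction")
    return agreed
```

An unrestricted server keeps serving old clients at the lower version; only
a restricted one refuses them.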

> What if we had a write-permission hook that took
> (branch_name, set<root_hash>)? This would require another
> exchange/refinement before the current merkle refinements we have (so
> that non-transferable things don't get included), and need some
> mechanism for enforcement (drop illegal branch certs, and
> garbage-collect unreferenced revisions?).

In general this sounds like a good idea, except that I'd decouple this
more from the notion of a "branch" and rather add the notion of a
"project", i.e. a standalone tree. In the hook you described above the
most interesting information for me would probably be the incoming
set<root_hash> - I don't care what branch certs are tacked onto the
individual revisions (well, maybe others do) and I don't want to limit
the namespace for new branches with the same root hash in any way. But
maybe this fulfills the long-standing wish for selective, branch-specific
write permissions.
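
A rough sketch of such a hook, written in Python purely for illustration
(real monotone hooks are Lua, and no hook with this signature exists
today). make_write_hook, allowed_roots and branch_patterns are
hypothetical names. By default only the root hashes decide; the branch
name is checked only if patterns are given.

```python
from fnmatch import fnmatchcase


def make_write_hook(allowed_roots, branch_patterns=None):
    """Build a hook(branch_name, root_hashes) -> bool.

    allowed_roots:   root revision hashes this database accepts.
    branch_patterns: optional glob patterns; None means any branch name.
    """
    allowed_roots = frozenset(allowed_roots)

    def hook(branch_name, root_hashes):
        root_hashes = frozenset(root_hashes)
        # Reject anything that descends from a root we do not host.
        if not root_hashes or not root_hashes <= allowed_roots:
            return False
        if branch_patterns is None:
            return True
        return any(fnmatchcase(branch_name, pat) for pat in branch_patterns)

    return hook
```

With branch_patterns left at None this is the pure "project" check; passing
patterns adds the branch-specific write permissions on top.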

>> I haven't looked at the actual implementation (this would certainly
>> require a netsync flag day) and I have a vague idea how this "project
>> marking" for certs and revisions could be done in a space-efficient way
>> (by using locally unique identifiers which map onto the globally unique
>> ones), but I think it should be doable.
> 
> bytes:
>   full rosters    :  13,846,435
>   roster deltas   :  15,262,168
>   full files      :  46,419,141
>   file deltas     :  90,026,670
>   revisions       :  20,131,203
>   cached ancestry :     842,340
>   certs           :  17,272,341
>   heights         :   2,087,624
>   total           : 205,887,922
> 
> Having a table like revision_ancestry that maps each revision to one or
> more roots should be fine, even with having it use the full hashes.
> Certs don't really need to be directly labeled, since the revision
> they're attached to will be.

You're right, the certs don't need the label. That certainly improves
things.
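
For illustration, the revision-to-roots mapping could be derived from the
ancestry graph roughly like this. It is a Python sketch, not monotone
code; label_roots and the shape of the parents dict are assumptions, and
the graph is assumed to be an acyclic one in which every parent also
appears as a key.

```python
def label_roots(parents):
    """Map every revision to the set of root revisions it descends from.

    parents: dict of revision id -> list of parent ids ([] for a root).
    """
    roots = {}
    stack = list(parents)
    while stack:
        rev = stack[-1]
        if rev in roots:
            stack.pop()
            continue
        pending = [p for p in parents[rev] if p not in roots]
        if pending:
            # Label the parents first, then come back to this revision.
            stack.extend(pending)
            continue
        stack.pop()
        if not parents[rev]:
            roots[rev] = frozenset([rev])  # a root is labeled with itself
        else:
            roots[rev] = frozenset().union(*(roots[p] for p in parents[rev]))
    return roots


# Two project trees joined by a merge (as merge_into_dir would produce).
graph = {"A": [], "a1": ["A"], "B": [], "b1": ["B"], "m": ["a1", "b1"]}
labels = label_roots(graph)
```

Here labels["a1"] holds only A, while the merge revision m carries both A
and B.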

>> One particular issue could however be the handling of merge_into_dir -
>> here the algorithm would probably leave out history of one side of the
>> merge when this side is prohibited to be synced. I have no immediate
>> answer how to solve this...
> 
> The revisions after the merge_into_dir will be labeled with both roots.
> If the semantics are "only sync things descended from these roots" then
> they'll be included and insert_with_parents will drag in the
> non-included side of the ancestry. If the semantics are "prohibit things
> descended from any roots other than these", then the dual-labeled
> merge_into_dir descendants will be excluded and there won't be anything
> to drag in the other half.

Tim, you are the man. That sounds like a very good plan!
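
To spell out the two semantics Tim describes, a small Python sketch (the
function names and the example labels are made up; the labels stand for
the per-revision root sets discussed above):

```python
def wanted_permissive(rev_roots, allowed):
    """'Only sync things descended from these roots': one match is enough."""
    return bool(rev_roots & allowed)


def wanted_strict(rev_roots, allowed):
    """'Prohibit things descended from any other roots': all must match."""
    return bool(rev_roots) and rev_roots <= allowed


# Hypothetical labels: m is a merge_into_dir of trees A and B.
labels = {"a1": {"A"}, "b1": {"B"}, "m": {"A", "B"}}
allowed = {"A"}

permissive = sorted(r for r, roots in labels.items()
                    if wanted_permissive(roots, allowed))
strict = sorted(r for r, roots in labels.items()
                if wanted_strict(roots, allowed))
```

Under the permissive rule m is selected, so transferring it with its
parents drags in b1 and the rest of tree B. Under the strict rule m is
left out and nothing from B comes along.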

Too bad that we always have these ideas so late in the game. We don't
want to introduce a netsync incompatibility in 1.1 and we don't want to
delay 0.99/1.0 much longer either. The list of changes is already
tremendously long...

But let's get a working implementation first before we give ourselves
headaches over which release the good stuff should go into :)

Thomas.

-- 
GPG-Key 0x160D1092 | address@hidden | http://thomaskeller.biz
Please note that according to the EU law on data retention, information
on every electronic information exchange might be retained for a period
of six months or longer: http://www.vorratsdatenspeicherung.de/?lang=en

