monotone-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Monotone-devel] automate sync remote process startup


From: Stephen Leake
Subject: Re: [Monotone-devel] automate sync remote process startup
Date: Thu, 07 Oct 2010 09:17:32 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1 (windows-nt)

Thomas Keller <address@hidden> writes:

> Am 06.10.10 14:56, schrieb Stephen Leake:
>> Thomas Keller <address@hidden> writes:
>> 
>>> Am 02.10.2010 11:30, schrieb Stephen Leake:
>>>> [...] So this is an Emacs DVC problem, not a mtn problem.
>>>>
>>>> Which means that the nvm.netsync.automate branch is ready, except the
>>>> --dryrun stuff is not tested or documented; I'll work on that.
>>>
>>> Coolness, drop me a note when you finished it and I'll review it. This
>>> branch will then contain all the previous work from Tim, right?
>> 
>> This is now complete, and all tests pass on Mingw and Debian (Cygwin
>> tests running, very slowly with some failures).
>> 
>> This includes Tim's work on 'sync --dry-run', and my work on basic_io
>> output from 'automate sync'.
>
> My observations so far:
>
> 1) Even though --dryrun is given, mtn gives a lot of unneeded chatter,
> especially the following:
>
> mtn: bytes in | bytes out | revs in | revs out
> mtn:   10.1 k |    13.4 k |       0 |        0
> mtn: successful exchange with <server>

The bytes in/out is meaningful, since you are talking to the server.
The revs in/out is not needed, but I'm not sure it's harmful; it's at
least familiar :).

> mtn: note: your workspace has not been updated

I vote for eliminating this entirely from sync. Which workspace is it
talking about (I have many that are potentially impacted by every sync),
and why would I expect it to be updated?

Currently, 'pull' does 'maybe_do_update', but 'sync' does not. Why?

> Could we prevent the initialization of the transfer tickers 

I'll see if I can figure this out.

> and hide the "note: your workspace has not been updated" message by
> disabling the optional workspace updater?

That's easy, and certainly desireable.

> 2) If a sync would not push anything, the output is:
>
>    mtn: would send 0 revisions:
>
> It would be better if the colon is removed, so it looks less like "hey,
> something is missing here". 

Good point.

> 3) If multiple branch certs are attached to a single revision, this
> revision of course pops up (and is counted) multiple times in the final
> result. Would it make a lot of work to group these a little smarter,
> i.e. not counting revisions by any single, but multiple branch?
> Something which would lead to an output like
>
> mtn: would send 5 revisions:
> mtn:         2 in branches first.branch,
> mtn:                       second branch
> mtn:         3 in branch first branch

I find this confusing. Mostly because it's not what the current netsync
hooks in examples/display_branches.lua do.

For me, the main use of the sent rev count per branch is to remind me
that my work actually got to the central server, so I can expect other
people to see it. Similarly, if someone tells me they don't see
something, I run sync, and if I see the "sent branch" message, I can say
"oh, I forgot to sync". 

The exact count doesn't matter much.

> 4) For nvm.basic_io-doc Tim wrote in monotone.texi:
>
>   Each "symbol" is followed by one or more 'string's and/or 'hex'es.
>
> The basic_io format for sync / push output is not correct by this
> definition, as the "send" and "receive" symbols are not followed by one
> string / hash. (see also
> http://code.monotone.ca/p/monotone/issues/85/)

Good point. 

This format is supported directly by the basic_io stanza class; it has
push_symbol.

It is used in conflict resolutions, for 'resolved_internal'.

> We could make it ourselves easy now and just change the documentation
> accordingly, but are value-less, symbol-only lines really something we
> want to have / introduce?

I think they are important in the current case. How would you change the
output to avoid them? We'd need 'receive' or 'send' in every line:

receive revision "0"
receive     cert "1"
receive      key "1"
    
send revision "1"
send     cert "6"
send      key "1"
send   branch "foo2" "1"

> 5) Likewise the non-dry-run format "symbol symbol" which is used like
> "receive revision" or "send key" is not defined either. While the set of
> possible values is predefined here, it might be better to just put them
> in double quotes.

This format is supported directly by the basic_io stanza class; it has
push_str_pair (symbol, symbol) (which actually pushes a symbol pair).
    
Conflict resolutions use a symbol pair as well, for the conflict type.

I don't see other uses of symbol pairs in the code.

So all of the violations of the (then undocumented) basic-io format were
written by me :).

Clearly it is useful for the parser to be able to make a distinction
between symbols and strings; symbols have a finite set of values, that
can be checked. If we change the second symbol to a string syntax, the
higher level semantics still says it has a finite set of values, but I
think having the syntax say it as well is a good idea.

It doesn't complicate the parser much.

At this point, I don't think we should change the conflict resolution
format. Although I suspect Emacs DVC is the only thing that actually
parses it, so the change would be manageable.

So I'm recommending changing nvm.basic-io-doc, and apologize for not
realizing it was violated when I reviewed it earlier this week.

> 6) Under what circumstances are received certs listed under "received
> cert" and at what time under "received revision"? If only a certificate
> is transferred, but not the revision of this certificate? 

Correct.

> Do we really need a cert-grouping-by-revision in first instance or
> wouldn't it be enough to have one format where each cert stanza always
> gets an identifying
>
>       revision [123abc]
>
> at the end? 

This was partly motivated by what examples/display_branches.lua
currently does. But it's also the prefered display grouping for my use
case.

My use case for the sync output is to allow the user to determine what
workspaces need to be updated, or newly checked out.

Certs that accompany revisions are irrelevant in that use case; in the
current format, they are easily ignored.

Branch certs that do not accompany revisions indicate new branches, that
need to be newly checked out.

Other certs are simply ignored.

> Maybe it would also be a good idea to make the format a bit more flat,
> i.e.
>
> send_revision [123abc...]
>
> send_cert "branch"
>     value "net.venge.monotone"
>       key [123abc...]
>  revision [123abc...]
>
> receive_revision [123abc...]
>
> receive_cert "author"
> ....
>
> so we and the implementor has to rely less on the particular placement
> of the information. 

With this flat format, the front-end has to do the sorting that is
currently done by the code in cmd_netsync.cc print_info_auto. That code
is almost a duplicate of code in network/netsync_session.cc
netsync_session::on_end.

I did start with a similar flat format, but changed it to the nested
one, because that sorting code was easier to write in C++ than in Emacs
lisp.

> Remember that data hierarchies and dependencies are still quite hard
> to model with such a simple format.

In general, true, but I think I succeeded in this case :).

> 7) Right now there is a line-break between the send and the receive
> stanza implemented, but in monotone.texi this is missing:
>
> +The following is example dry-run main channel data:
> address@hidden
> +  dryrun
> + receive
> +revision "0"
> +    cert "1"
> +     key "1"
> +    send
> +revision "1"
> +    cert "6"
> +     key "1"
> +  branch "foo2" "1"
> address@hidden verbatim

Yes, I changed the format after I wrote the doc; fixed now. Note that
the leading 'dryrun' symbol is also no longer present.

> 8) Please read again over the "Output format" section in monotone.texi -
> some sentences sound incomplete to me, but then I'm not a native speaker
> either, so don't hit me if everything is correct :)

You are right; fixed now. 

Some of the other 'output format' sections have a similar style
(non-sentences), but they are much shorter, so it works better there.

> 9) For the sake of consistency - would it make sense / is it possible to
> unify the basic_io output for the dry-run and non-dry-run version when
> we know what we want to push somewhere? I understand that we have no
> information of incoming items we just haven't received yet, but we
> should surely have the information for the outgoing items, right?

It depends on the use case for dry-run. I'm not clear what that is; the
best I can think of is "do I have time to do this push over this slow
cell-phone modem, or should I wait till I get to a better connection"?
For that, all you need is some sense of how much data is involved. Of
course, we could compute the actual number of bytes to transfer, which
would be more useful.

But I guess for automate, we should be consistent, and err on the side
of outputting too much data; we assume there is a front end that filters
and reformats.

So I'll look at this.

Ah; another use case; put up the list of stuff that would be
transferred, and let the user modify the branch pattern before doing the
actual sync. In this case, the names of the received branches are more
important (for example monotone.ca?* vs
monotone.ca?net.venge.monotone*). Would it be possible to get the
to-be-received branch names?

> The code and tests look otherwise very clean and consistent - 

Thanks.

-- 
-- Stephe



reply via email to

[Prev in Thread] Current Thread [Next in Thread]