From: Larry Hastings
Subject: [Monotone-devel] A Two-Fold Proposal: On Formats And Front-Ends
Date: Tue, 04 Oct 2005 04:21:26 -0700
User-agent: Mozilla Thunderbird 1.0.6 (Windows/20050716)

I have for you two separate but intimately related proposals, and I'm going to describe them in reverse order of implementation because it flows better.

Keep in mind, I'm still quite wet-behind-the-ears with respect to Monotone; I typed my first monotone command less than two weeks ago. It wouldn't surprise me if this was all a bad idea for any number of unforeseen reasons. And yet I bravely post, having read somewhere that the monotone community is "friendly". So... be kind. :)

Right now monotone makes life tough for front-end developers. The main problem I see is the N different output/file formats of monotone, where N is distressingly large:
Additionally: while the monotone automate brace of commands is a reasonably good idea, and a step in the right direction, it is woefully incomplete. When writing m7, I had to call lots of non-automated monotone commands directly because monotone automate doesn't offer any actions. You can't add/delete/rename, you can't commit, you can't create certs, you can't push/pull/sync. All but one of the automate commands are queries; the only exception is stdio.

I suspect this is because monotone doesn't "eat its own dog food": monotone doesn't implement its user-visible commands using the automate interface, it just does operations directly on the database. Seemingly all the automate commands were added ad-hoc, to support the needs of history visualization tools, I'm guessing. But, since people weren't writing general-purpose front-ends that day, monotone automate offers no assistance to people adding files or committing changesets.

Sure, we could add a spate of new monotone automate commands, in an effort to catch it up with the interactive command-line interface. But even if it caught up, it would likely fall back behind later, again because monotone doesn't "eat [its] own dog food".

One unfortunate side effect of this: it denies a front-end any real atomicity. There are many operations where m7 has to run multiple monotone commands, and since each one is a separate monotone instance, there's no way to ensure atomicity between them. As a result, most m7 commands suffer from race conditions. This is particularly bad for commands that change the database; if you ran two m7 commits at once, I rather suspect you'd get two revisions with the same local revision number cert.

To me the solution is clear, and it is here we arrive at my first actual proposal: separate presentation from application logic by breaking the current monotone executable into two pieces. One piece would be the "engine" that did all the actual work.
The second piece would be the "front-end", or "driver", which drives the monotone engine (as gcc drives the front-end, back-end, and linker). The driver would provide the command-line interface of the current monotone executable, convert internal messages to their user-friendly localized equivalents, etc. The communication between the two would be done over pipes in some easy-to-cope-with data specification. This has many advantages:
A moment ago I handwaved what "data representation" I had in mind. I propose there are three main candidates: XML, ASN.1, and JSON. I will immediately dispense with the first two, and show why the third is far more likely. :)
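
To make the engine/driver conversation a little more concrete, here is a minimal Python sketch. It assumes the third candidate, JSON, sent as one message per line. Everything in it is hypothetical: the `echo` command, the `id`/`ok`/`result`/`error` field names, and the toy `engine` function are invented for illustration and are not part of monotone. In-memory streams stand in for the pipes; a real driver would presumably read and write the engine process's stdin and stdout instead.

```python
import io
import json


def encode(message):
    """Serialize one message as a single line of JSON text."""
    return json.dumps(message, ensure_ascii=False) + "\n"


def engine(requests, replies):
    """Toy stand-in for the engine: one JSON request per line in, one reply out."""
    for line in requests:
        request = json.loads(line)
        # Only known fields are read, so extra fields from a newer driver are ignored.
        if request.get("command") == "echo":  # hypothetical command name
            reply = {"id": request["id"], "ok": True, "result": request.get("args", [])}
        else:
            reply = {"id": request["id"], "ok": False, "error": "unknown command"}
        replies.write(encode(reply))


def run_demo():
    # In-memory streams stand in for the pipes between driver and engine.
    to_engine = io.StringIO()
    to_engine.write(encode({"id": 1, "command": "echo", "args": ["héllo"]}))
    # A field the engine has never heard of does not break it.
    to_engine.write(encode({"id": 2, "command": "echo", "args": [], "new_field": True}))
    to_engine.write(encode({"id": 3, "command": "frobnicate"}))
    to_engine.seek(0)

    from_engine = io.StringIO()
    engine(to_engine, from_engine)
    from_engine.seek(0)
    return [json.loads(line) for line in from_engine]


if __name__ == "__main__":
    for reply in run_demo():
        print(reply)
```

Because every request and reply passes through the same long-lived engine, a driver built this way would at least have one place where multi-step operations could be made atomic, which separate command invocations cannot offer.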
http://www.loglibrary.com/show_page/view/106?Multiplier=3600&Interval=6&StartTime=1124773859

JSON is small, easy to parse, easy to generate, and covers all the bases. It has explicit and well-defined quoting rules. It's flexible; we could add new fields to a message and it wouldn't break a receiver who wasn't expecting that field. So I'll go ahead and assume that, if something like this did come to pass, it'd use JSON. Specifically, JSON encoded in UTF-8. You can read more about JSON here: http://www.json.org/

While proposing this I realized that, while this would fix all the output of various commands, it wouldn't fix the commands that were really just dumping monotone internal files--revisions, certs, manifests, and the like. Thus is my second proposal revealed: rework monotone's internal data structures using this data format. This would make the monotone engine itself easier to write and maintain, as we wouldn't have N mini-libraries for reading/writing these N formats. We'd just have one library we used for everything. (For backwards compatibility we could have the front-end massage the data it prints out back into the old format upon request.)

Specifically, revisions/certs/manifests would be stored as JSON, indented by one tab (\t) per indentation level, lines ended with \n (no \r), and children of objects stored in sorted order.

We're already breaking all the eggs when doing rosters; upgrading existing databases will already require rebuilding every relevant data structure. If we were ever going to consider a radical move like this, it seems to me that now, before rosters ship, is the best possible time. Breaking up the monotone monolithic executable would (will?) be nice, but it can wait.

Undertaking any of the above would be a ton of work. And I am sadly not volunteering to do it myself, or even contribute all that much code to it.
(Though I am working up a reasonably-clever JSON library, and I would love to help define the shape of the JSON data, particularly the communications protocol between the front- and back-ends.)

I realize that waltzing into an open-source project, vaguely sketching castles in the air, then saying "now go build it guys!", easily strays into boorishness. So I apologize if I'm stepping on any toes.

What do you think?

Cheers,

larry
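
As an illustration of the storage layout proposed above (one tab per indentation level, \n line endings, children of objects in sorted order, UTF-8), here is a minimal Python sketch using the standard `json` module. The `canonical_json` name and the sample fields are hypothetical; they do not reflect monotone's actual revision, cert, or manifest layout.

```python
import json


def canonical_json(value):
    """Serialize with one tab per level, "\\n" line ends, and sorted object keys."""
    text = json.dumps(value, indent="\t", sort_keys=True, ensure_ascii=False)
    # json.dumps never emits "\r"; add a final newline and encode as UTF-8.
    return (text + "\n").encode("utf-8")


if __name__ == "__main__":
    # Hypothetical cert-like record; the field names are made up for this example.
    first = {"name": "branch", "value": "net.example.project", "signer": "larry"}
    # Same data, different insertion order.
    second = {"signer": "larry", "value": "net.example.project", "name": "branch"}

    print(canonical_json(first).decode("utf-8"), end="")
    # Sorted keys make the stored bytes independent of insertion order.
    print(canonical_json(first) == canonical_json(second))
```

Fixing the indentation, line endings, and key order this way means the same logical data always serializes to the same bytes, which would matter wherever the stored form is compared or hashed.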