[Monotone-devel] url schemes

From:	Markus Schiltknecht
Subject:	[Monotone-devel] url schemes
Date:	Sat, 22 Mar 2008 16:19:03 +0100
User-agent:	Mozilla-Thunderbird 2.0.0.9 (X11/20080109)

Hi,

since I've been critiquing Timothy's current extensions of the URL scheme, I think I need to try coming up with something better. Or at least help in doing so. First of all, I've put together a list of URL schemes we are using in and around monotone, including nuskool, which probably is what we will use someday.

In the first part of the URL, we obviously encode the protocol and database location. Existing samples are:


 * file:/path/to/monotone/db.mtn
 * ssh://host[:port]/path/to/monotone/db.mtn

And for mtndumb, we already have:

 * http[s]://host[:port]/path/to/repo
 * [s]ftp://[user[:address@hidden:port]/path/to/repo
 * file:/path  (or file:///path??)

Upcoming URLs to specify a database location might be:

 * mtn://host[:port]           (as proposed for netsync)
 * http://host[:port]/path/to/scgi       (as in nuskool)
 * xmpp:[//address@hidden/address@hidden
           (as recently proposed on IRC - might somehow
                             work with nuskool, someday)
 * pgsql://user:address@hidden:port/database/schema
                                      (pipe dreaming...)
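
As a rough illustration (not monotone code), the database-location part of most of these URLs can be taken apart with a generic URL parser. The sketch below uses Python's urllib.parse; the function name split_db_url and the example host names are assumptions made up for this example.

```python
from urllib.parse import urlsplit


def split_db_url(url):
    """Split a database URL into (scheme, host, port, path).

    Host and port are None when the URL has no authority part,
    as with file:/path and file:///path.
    """
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.port, parts.path


if __name__ == "__main__":
    # Example URLs follow the shapes listed above; hosts are placeholders.
    for url in ("file:/path/to/monotone/db.mtn",
                "file:///path/to/monotone/db.mtn",
                "ssh://example.org:2222/path/to/monotone/db.mtn",
                "mtn://example.org"):
        print(url, "->", split_db_url(url))
```

With this parser, file:/path and file:///path yield the same path and no host, so either spelling could be accepted.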

Often enough, specifying a database isn't enough, because we want to address only parts of the repository, i.e. only a certain branch, only a revision or even only a single file delta.

Almost all of the above protocols support additional slashes and more path components after the database. The only exception is pgsql, which isn't really much of a standard URL scheme anyway, AFAICT. (In case of an underlying filesystem - i.e. file and ssh - it should be possible to walk down the path and use the first monotone database or monotone dumb data directory you find. That would only prevent you from accessing a monotone database file within a dumb data directory, but that wouldn't make much sense anyway).
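
The "walk down the path" idea could be sketched as follows. This is only an illustration under assumptions: the two detection helpers are hypothetical (a database is taken to be a regular file named *.mtn, a dumb data directory is taken to be a directory containing a DATA file), and find_repo_root is not an existing monotone function.

```python
import os


def looks_like_db(path):
    # Assumption: a monotone database is a regular file named *.mtn.
    return os.path.isfile(path) and path.endswith(".mtn")


def looks_like_dumb_dir(path):
    # Assumption: a dumb data directory contains a file called DATA.
    return os.path.isfile(os.path.join(path, "DATA"))


def find_repo_root(root, components):
    """Walk down `components` below `root`, one path component at a time.

    Returns (kind, repo_path, remaining_components) for the first prefix
    that looks like a database or a dumb data directory, or None if no
    prefix matches. The first match wins, so a database file inside a
    dumb data directory is never reached.
    """
    current = root
    for i, name in enumerate(components):
        current = os.path.join(current, name)
        if looks_like_db(current):
            return "db", current, components[i + 1:]
        if looks_like_dumb_dir(current):
            return "dumb", current, components[i + 1:]
    return None
```

Whatever is left in the remaining components is then the part of the URL that addresses something inside the repository.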

Most protocol types also support an argument list, separated by & - but not all of them. Exceptions are the dumb ones, which cannot parse arguments, because there's no clever server to process them. For pgsql, arguments are often used to specify options for the database connection, but as mentioned above, it's not really a standard - we could certainly use some monotone specific arguments, if needed.

Now, the question which started that discussion is, what should the rest of the URL look like? IMO, we should take a look at existing and planned use cases. Then take care they don't conflict with each other.

The only existing rest-URL-scheme is from mtndumb. However, that one uses a rather meaningless scheme to retrieve data from a repository. It looks like it was designed to resemble the merkle trie, while still providing a good compromise with round trips required:


 $DB/DATA
 $DB/HASHES_
 $DB/HASHES_??  (multiple times, where ?? are the first two hex chars)
 ...
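
Reading the listing above literally, a client interested in a set of hashes would need the top-level HASHES_ file plus one HASHES_?? file per distinct two-character prefix. The sketch below only illustrates that reading; it is not taken from mtndumb, and the function name and the example repository URL are assumptions.

```python
def dumb_hash_files(db_url, hex_ids):
    """Return the HASHES_ URLs covering the given hex ids.

    Assumption: "??" in HASHES_?? stands for the first two hex
    characters of an id, written in lower case.
    """
    prefixes = sorted({h[:2].lower() for h in hex_ids})
    return [f"{db_url}/HASHES_"] + [f"{db_url}/HASHES_{p}" for p in prefixes]


if __name__ == "__main__":
    # Placeholder repository URL and shortened ids, for illustration only.
    for url in dumb_hash_files("http://example.org/repo", ["ab12", "ab34", "cd56"]):
        print(url)
```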

Then, there are the planned nuskool commands. Those are currently encoded entirely in JSON. The HTTP client requests the same URL every time, and encodes the query in JSON. ATM nuskool doesn't support branch inclusion or exclusion patterns. The commands currently are:


 * inquiring revisions: asks the server if it has certain revisions
 * getting descendants: querying the ancestry map of the server
 * getting (pulling) a revision
 * putting (pushing) a revision
 * getting file data
 * putting file data
 * getting file delta
 * putting file delta

These are current facts and observations, or am I missing something important?

Then, there are wishes and feature requests. I personally find the following ones very compelling:


 * mtn itself should be able to talk to dumb servers
 * it should be possible to do checkouts from remote databases
 * mtn should feature a simple API for 3rd party tools
 * faster and firewall compatible protocol (covered by nuskool)

Taking all of that together, to me this smells very much like we need a RESTful API. One which is easy to read, understand and remember, simple to process and universally usable for all supported protocols (as far as possible). What I have in mind would look somewhat like this:


 * GET $DB/capabilities: inquire capabilities of that mtn repository
           (i.e. if arguments are supported or not)
 * GET/PUT $DB/revision/$HASH/data: pull or push a revision
 * GET/PUT $DB/file_data/$HASH: pull or push file data
 * GET/PUT $DB/file_delta/$HASH: pull or push file delta
 * GET $DB/branch/$BRANCHNAME/heads: get heads of $BRANCHNAME
 * GET $DB/revision/$HASH/inquire: inquire *one* revision
 * GET $DB/revision/$HASH/descendants: fetch descendants of a revision
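
To make the proposal concrete, here is a minimal sketch of how a server or client might dispatch on the part of the path after $DB. The route table is transcribed from the list above; the function name route, the tuple it returns and the assumption that hashes and branch names contain no slashes are made up for this example.

```python
# (resource, action) -> allowed methods, transcribed from the list above.
ROUTES = {
    ("capabilities", None): {"GET"},
    ("revision", "data"): {"GET", "PUT"},
    ("revision", "inquire"): {"GET"},
    ("revision", "descendants"): {"GET"},
    ("file_data", None): {"GET", "PUT"},
    ("file_delta", None): {"GET", "PUT"},
    ("branch", "heads"): {"GET"},
}


def route(method, rest_path):
    """Map a method and the path after $DB to (resource, ident, action).

    Raises ValueError for paths or methods the list above does not name.
    Assumes hashes and branch names contain no '/' characters.
    """
    parts = [p for p in rest_path.split("/") if p]
    if not parts or len(parts) > 3:
        raise ValueError(f"unsupported path: {rest_path!r}")
    resource = parts[0]
    ident = parts[1] if len(parts) > 1 else None
    action = parts[2] if len(parts) > 2 else None
    # 'capabilities' takes no identifier; every other resource needs one.
    if (resource == "capabilities") != (ident is None):
        raise ValueError(f"unsupported path: {rest_path!r}")
    allowed = ROUTES.get((resource, action))
    if allowed is None:
        raise ValueError(f"unsupported path: {rest_path!r}")
    if method not in allowed:
        raise ValueError(f"{method} not allowed on {rest_path!r}")
    return resource, ident, action
```

For instance, route("GET", "branch/net.venge.monotone/heads") would resolve to the branch resource with the heads action, while a PUT on the same path would be rejected.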

This might appear http centric, but think about it: ftp, file and ssh, maybe even xmpp, all of these provide put and get methods in a way. (Even if pushing to dumb servers might not work - at least not without some additional processing on the server side. Or maybe with proper authentication support, so clients can update meta data on the dumb server?). And as http is about the best known protocol, what's bad about being http centric? ;-)

For browsable protocols which support index files (like http[s] and ftp[s]) we could offer those for the following URLs:


 * GET $DB/:    a listing of branches in the repo, general purpose
                repository information and statistics, etc..
 * GET $DB/revision/$HASH/: a browsable directory tree
 * GET $DB/branch/$BRANCHNAME/: some branch information, maybe a graph
                with the most recent revisions, links to the branch
                heads and to sub-branches

And others, but you get the point...

What's important for me is that these URL schemes should be compatible with one another. I would find it a waste of opportunity if we now specified:


 $DB/$BRANCHNAME[?$PATTERNS]

..or similar for the mtn (i.e. netsync) protocol, because it certainly conflicts with future extensions for other protocols.

While the following is longer and more to type, it's certainly more cross-protocol compatible and wouldn't prevent future extensions:


 $DB/branch/$BRANCHNAME?PATTERNS

In other words: omitting that "branch" in between there would restrict us from providing other resources. Or force us to use different URL schemes for different protocols, i.e.:


 $DB/$BRANCHNAME for mtn://

but:

 $DB/branch/$BRANCHNAME for http://

..which would certainly confuse people.
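
The conflict can be shown with two toy parsers for the part of the URL after $DB. Both functions and the RESOURCES set are hypothetical names for this sketch; the keywords are the ones from the REST-style list earlier in this mail.

```python
# Resource keywords taken from the REST-style proposal in this mail.
RESOURCES = {"capabilities", "revision", "file_data", "file_delta", "branch"}


def parse_prefixed(rest):
    """Parse the part after $DB in the '$DB/branch/$BRANCHNAME' style."""
    kind, _, name = rest.partition("/")
    if kind not in RESOURCES:
        raise ValueError(f"unknown resource: {kind!r}")
    return kind, name


def parse_flat(rest):
    """Parse the part after $DB in the '$DB/$BRANCHNAME' style.

    Everything after $DB is a branch name, so no other kind of
    resource can be expressed without guessing.
    """
    return "branch", rest
```

With the prefixed style, "revision/abc" and "branch/net.venge.monotone" parse to different resource kinds. With the flat style, "revision/abc" can only be read as a branch called "revision/abc", which is the restriction described above.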

As another, minor point, IMO the second is also easier to read and understand. A good (but admittedly deprecated) example might be:


 http://venge.net/net.venge.monotone

Looks quite confusing to me, whereas:

 http://venge.net/branch/net.venge.monotone

Makes the thing easier to understand. Especially for starters, I think.

So, that got rather longish now. Thanks for being with me so far. I'm curious about your opinions, thoughts and criticism.


Regards

Markus
