#
# add_file "cset.cc"
#
# patch "ChangeLog"
# from [c5b6af776442ffcda8b25b0b94a9785795da0817]
# to [461c25a92640c41f5d01de41ee652f4ba900980b]
#
# patch "Makefile.am"
# from [bedd826d1991853466ab57d77d80be1635014d17]
# to [052baab7e5e3fdc87b29094be9d3f8383c6797f1]
#
# patch "cset.cc"
# from []
# to [1fb425ad8e367a70b3cd9668c69bcf37b4d4418e]
#
# patch "unit_tests.cc"
# from [230b4227d08bed323a2625f593073427bca9fe9b]
# to [8eb81e81294b5e8ac18540b19826656903a725cf]
#
# patch "unit_tests.hh"
# from [b4b20ac190077a08db547eb1dcd79f40c58b2ebc]
# to [0bb9b3b8b1320afaf40c88a254bdd5f5d9546ef4]
#
--- ChangeLog
+++ ChangeLog
@@ -1,3 +1,7 @@
+2005-06-19 graydon hoare
+
+ * cset.cc: Rewrite-in-progress on change_set.cc.
+
2005-06-14 Richard Levitte
* std_hooks.lua (get_preferred_merge2_command,
--- Makefile.am
+++ Makefile.am
@@ -31,6 +31,7 @@
rcs_import.cc rcs_import.hh \
revision.cc revision.hh \
change_set.cc change_set.hh \
+ cset.cc \
mt_version.cc mt_version.hh \
automate.cc automate.hh \
database_check.cc database_check.hh \
--- cset.cc
+++ cset.cc
@@ -0,0 +1,1762 @@
+/*
+
+ an mfest is a collection of nodes, indexed by ident number and
+ also held in a tree of nodes. the bucket of identified nodes is
+ possibly larger than the tree of nodes.
+
+ a change_set is a pair of mfests. every entry in each mfest exists in
+ the other, though some might be detached from the file tree and just
+ known by ident.
+
+ a change_consumer -- and the serial form of a change_set -- is the
+ way a human wants to read about some work: organized into a set of
+ deletes, moves, and adds.
+
+ there is ambiguity in a change_consumer though, regarding the
+ resolution of names and the simultineity of operations. specifically,
+ each entry in a change_set is either looked up in the pre state or
+ post state mfest of a change. a delete is looked up in the pre
+ state. a rename's source in the pre state. a rename's destination in
+ the post state. an add is looked up in the post state.
+
+ when playing back a change_set, there is a canonical order in which
+ entries are emitted: {delete,rename,add}{file,dir}.
+
+ furthermore within each type of entry, the order in which the entries
+ are emitted is specified: all entries are emitted in lexicographic
+ order, which happens to induce topological order as well (A is always
+ emitted before A/B).
+
+ crucially, deletes are ordered by *source space* and adds and renames
+ are ordered by *destination space*. we replay these by walking the
+ tree rooted in each mfest.
+
+*/
+
+
+/*
+
+ aliases, splits, joins, nursery, graveyard:
+
+ each node has a set of alias nodes.
+
+ if N is a node and A is an alias node of N in the old_manifest of a
+ cset C, we say that A split from N in C. if A is an alias of N in
+ the new_manifest of C, we say that A joined N in C.
+
+ if N is not an alias, but actually present in a manifest, we say that N
+ is *live* in that manifest.
+
+ - if N is live in the old_manifest of C, we say N is live-in in C.
+ - if N is live in the new_manifest of C, we say N is live-out in C.
+
+ two special nodes exist *always* in every manifest:
+
+ 1. the nursery
+ 2. the graveyard
+
+ - if A was split from the nursery in C, we say that A was added in C.
+ - if A was joined to the graveyard in C, we say that A was deleted in C.
+
+ for any nodes M, N in C, we must have a proper lifecycle nesting. that is:
+
+ - if M splits from N in C then N must be live-in in C
+ - if M joins to N in C, then N must be live-out in C
+
+ this is normally trivially satisfied by splitting and joining from the
+ nursery and graveyard, respectively.
+
+ splits exist only in order to serve as the inverse of joins. joins
+ exist for only one purpose: when merging two trees which have
+ different (non-identified) files occupying the same name, and the
+ user decides they *want* to identify the files (eg. 2-way merge). in
+ this case we want to join one file to the other, to give future edits
+ on the joiner a target to apply to.
+
+ yes? hmm.. we want to avoid this:
+ ---------------------------------
+
+ bob makes a tree and syncs it with jane.
+ bob adds file foo
+ bob syncs with sid
+ jane adds file foo
+ jane syncs with bob
+ jane merges with bob. jane's foo is not the same as bob's foo. jane's foo wins.
+ sid makes a change to bob's foo.
+ sid syncs with jane.
+ sid's change is silently eaten.
+
+
+
+ a point: every file *does* have a GUID. the (rev,path) pair it was added in.
+ or at least it ought to. the problem is you can add a file in two (rev,path)
+ pairs, and then later decide you *wanted* them to behave -- for the sake of
+ merging -- as though they are the same file. iow, behave like they had a
+ common ancestor immediately before the states they were both added in.
+
+ so how do joins help here? simply this: when you are working on a merge
+ between A and B, the first step is to decide which files will be live
+ in the merge. to do this you take all the files which are live in A and
+ live in B. then for each file you've decided will be live, you pull out
+ a subgraph for it, containing all the nodes in which it's alive. then you
+ add to this subgraph all the nodes *joined* to the live one. that's the
+ central trick: you will be doing graph contraction on the joiner as well,
+ so that you do not lose any of its deltas.
+
+ further restriction: foo/bar can only ever join foo/bar, and this can
+ only happen in a merge in which one of the incoming edges is
+ providing foo/bar already; it's actually written (unambiguously) in
+ the joiner's changeset as 'join "foo/bar"' (inverse: 'split
+ "foo/bar"'). so, scratch previous idea: adds and deletes are still
+ expressed as being renamed to or from the nursery or graveyard, not
+ split/joined with them.
+
+
+ further: suppose in A->C we have "join foo/bar", where there's a B->C
+ edge containing a foo/bar to join with. all well and good. then
+ suppose we append D="rename foo/bar foo/baz" to C. then the B->C->D
+ edge has "rename foo/bar foo/baz", and the A->C->D edge has ...
+ "join foo/baz"? that makes no sense.
+
+ we know a join is essentially a "delete with a forwarding address";
+ so like a delete it has to take effect *before* anything else in a
+ cset. which means it is written in the pre-space of the cset. the
+ problem is that the name it refers to is clearly a name in the
+ post-space of the delta: the name it's colliding with!
+
+ perhaps "join" is not the right word. perhaps we want something more
+ like "doom foo/bar; rename foo/bar foo/baz" in the composite cset: an
+ acknowledgment that the node named by foo/bar at the *beginning* of
+ the cset is doomed, even if it gets moved around thereafter: the
+ proviso of a doomed node eixsting in a revision is that there's a
+ non-doomed node occupying the same node. a doomed node is not yet
+ dead, but it adheres to some other node until that other node dies.
+
+ subsequently if you doom the node foo/baz (which requires some other
+ incoming node in a merge, using that name) then the composite cset
+ is "doom foo/bar; doom foo/baz; rename foo/bar foo/baz" and you have
+ *two* doomed nodes in the doomed set of the third node. this should
+ continue to work until such time as the live node carrying the doomed
+ ones actually dies.
+
+ actually that reads terribly. it would be better to have "delete_foo"
+ have a secondary clause, "with_heir", where the heir is a name in the
+ post-space of the cset. this way the heir can get moved around; and
+ the natural inverse is "add_foo" ... "with_sire", where the sire is
+ in the pre-space of the cset. that feels more likely to work.
+
+ */
+
+
+#define __STDC_CONSTANT_MACROS
+
+#include
+#include
+#include