Re: [Monotone-devel] Testresult


From: Timothy Brownawell
Subject: Re: [Monotone-devel] Testresult
Date: Sat, 26 Dec 2009 16:30:53 -0600
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.5) Gecko/20091204 Thunderbird/3.0

On 11/24/2009 1:48 PM, Judson Lester wrote:
Following the bisect thread brought some questions about testresult back
to mind.  This was one of the features of monotone that drew me to it,
back around 0.14, and I've never really gotten around to
implementing testresult in my code projects.

Is there a best practice on using the mtn testresult command?

No. In fact I'm not sure that it's even used much, probably related to what you mentioned below. I think I know a better (more flexible) approach, but it would need new hooks to be added.

I've been
poking at the idea of a CI-style monotone bot that would update a
project, run the associated test suite, and mark the current revision
with testresults.  I keep butting up against problems of suite changes:
* If I change how a test works, are its results valid on past
revisions, or do I need to re-run that test against past revisions?
* If I change the name of a test, is that trackable?  Won't that break
acceptance of future revisions, since there's an old "pass" value that
isn't on current revisions? (Changeable in Lua, obviously)
* Is it reasonable to use monotone style testresults and keep tests and
code together?

Let's see, what can happen when you run an updated testsuite against a new program version...

/- Old status (Pass/Fail)
|/- New Status (Pass/Fail)
||/- Action (Add/Rename/Modify/Delete/Unchanged)
|||/- Other Action (Rename/Modify)
|||| /- Outcome/Meaning
-FA  Added failing test... probably don't care
-PA  Added passing test... assume new program version is "better" ***
F-D  Deleted failing test... probably don't care
P-D  Deleted passing test... probably don't care
FFU  Failing test... don't care, both versions suck
FPU  Test started passing... new version is better ***
PFU  Test started failing... new version is worse, don't update ***
PPU  Passing test... don't care, both versions are good
??R  Renamed test... ideally treat same as an Unchanged test
FFM  Modified failing test... both versions suck
PFM  Modified test starts failing... test was wrong, both versions suck
FPM  Modified test starts passing... new version *may* be better ***
PPM  Modified passing test... new version *may* be better ***
     (it passes a better test)
??RM Modified/Renamed test... ideally treat same as a Modified test

So:
* Renamed tests have to be distinguishable from an Add/Delete pair (or
  just don't allow tests to be renamed)
* If an unmodified test goes from Pass to Fail, that's bad
* Nothing else really matters.
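
As a rough sketch of that rule (not anything monotone provides today), the check could look like the following, assuming each run is a mapping from a stable test-id to "pass" or "fail". The names `regressions` and `accept_update` are made up for illustration. A renamed test keeps its id and a modified test gets a new one, so a modified test just looks like a Delete plus an Add and never blocks the update.

```python
from typing import Dict, List

PASS, FAIL = "pass", "fail"


def regressions(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """Return the ids of tests that passed in `old` and fail in `new`.

    Tests missing from `old` (added) or missing from `new` (deleted)
    are ignored, matching the "probably don't care" rows of the table.
    """
    return sorted(
        test_id
        for test_id, status in new.items()
        if status == FAIL and old.get(test_id) == PASS
    )


def accept_update(old: Dict[str, str], new: Dict[str, str]) -> bool:
    """Accept the new revision unless an unmodified test went Pass -> Fail."""
    return not regressions(old, new)
```

Here an added failing test is accepted, while `accept_update({"t1": "pass"}, {"t1": "fail"})` is rejected.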


This could be implemented with testresult certs, by having the name of each signing key be the hash of a particular test (or a guid or something embedded in the test, if you want to be able to refactor the test without the bot thinking that the meaning has changed). But each cert includes an RSA signature (which is large), and the monotone testsuites, for example, have 600+ tests. Even with the normal ~4 certs/revision, certs are a significant fraction of database size, so this may not be a good idea.
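
To make the two id choices concrete, here is a minimal sketch. The `test-id:` marker is a hypothetical convention I'm inventing for the example, and SHA-1 is just one possible hash. A test that carries the marker keeps its id across refactoring; one without it gets an id that changes whenever its contents do.

```python
import hashlib
import re

# Hypothetical convention: a test may embed a line such as
#   -- test-id: 3f2c9a1e-aaaa-bbbb-cccc-000000000001
_ID_MARKER = re.compile(rb"test-id:\s*([0-9A-Za-z-]+)")


def compute_test_id(source: bytes) -> str:
    """Return the embedded id if present, else a hash of the test contents."""
    match = _ID_MARKER.search(source)
    if match:
        return match.group(1).decode("ascii")
    # No embedded id: any edit to the test produces a different id.
    return hashlib.sha1(source).hexdigest()
```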

Using a single cert with multiple (test-id, pass/fail) pairs would help, but this would still be rather large since you can't get delta compression. Leaving out 'pass' results would make this smaller, but would make it impossible to tell the difference between an added failing test and an existing test that started failing.
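
A small illustration of that ambiguity, using made-up test ids: two different old revisions produce the same failures-only record, even though the same new result is harmless in one case and a regression in the other.

```python
from typing import Dict, List


def failures_only(results: Dict[str, str]) -> List[str]:
    """Compact encoding that records only the ids of failing tests."""
    return sorted(t for t, status in results.items() if status == "fail")


# Old revision A: test t2 does not exist yet.
old_a = {"t1": "pass"}
# Old revision B: test t2 exists and passes.
old_b = {"t1": "pass", "t2": "pass"}
# New revision: t2 fails.
new = {"t1": "pass", "t2": "fail"}

# Both old revisions encode to the same empty list, so from the compact
# form alone t2 could be an added failing test (A) or a regression (B).
same_encoding = failures_only(old_a) == failures_only(old_b)
```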


What would probably work is a separate branch containing (only?) a file with the list of (test-id, pass/fail) pairs. Revisions in this branch could be given an extra cert indicating which revision the contents refer to (certs are indexable, and this is something you'd use for lookups), and could maybe have an extra file recording which version of the testsuite was used (for external testsuites).
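
One possible layout for that file, purely as an assumption on my part: one "test-id status" line per test, sorted by id, so that consecutive revisions of the file differ by only a few lines and delta compression can do its job. The function names are hypothetical.

```python
from typing import Dict


def format_results(results: Dict[str, str]) -> str:
    """Serialize results as sorted 'test-id status' lines."""
    return "".join(
        f"{test_id} {status}\n" for test_id, status in sorted(results.items())
    )


def parse_results(text: str) -> Dict[str, str]:
    """Parse the format written by format_results, skipping blank lines."""
    results: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        test_id, status = line.rsplit(None, 1)
        if status not in ("pass", "fail"):
            raise ValueError(f"unknown status in line: {line!r}")
        results[test_id] = status
    return results
```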

Doing it this way would require a new "accept_update(old_revision, new_revision)" hook, and helper functions to allow that hook to look up particular revisions (use 'automate select') and then read the contents of those revisions (use 'automate get_file_of'). (We already have an 'mtn_automate' function that would allow these, except that it will E() on its reentrancy check if called this way.)
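
Until such a hook exists, an external bot could do the same lookups from outside. The sketch below assumes it runs inside a workspace with `mtn` on the PATH, that the cert is named `tested-revision`, and that the results file is called `results`; those two names are invented for the example. It relies on the `c:name=value` cert selector with 'automate select', then 'automate get_file_of' with `--revision`. The runner is injectable so the logic can be exercised without a database.

```python
import subprocess
from typing import Callable, List, Optional


def run_mtn(args: List[str]) -> str:
    """Run 'mtn automate ...' and return its standard output."""
    completed = subprocess.run(
        ["mtn", "automate", *args], check=True, capture_output=True, text=True
    )
    return completed.stdout


def results_for(
    revision: str,
    runner: Callable[[List[str]], str] = run_mtn,
    cert: str = "tested-revision",  # hypothetical cert name
    path: str = "results",          # hypothetical file name
) -> Optional[str]:
    """Return the results file recorded for `revision`, or None if untested."""
    # Find results-branch revisions whose cert points at `revision`.
    ids = runner(["select", f"c:{cert}={revision}"]).split()
    if not ids:
        return None
    # Read the results file out of the first matching revision.
    return runner(["get_file_of", path, f"--revision={ids[0]}"])
```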

Does this seem like a reasonable approach?



