Re: [Monotone-devel] Testresult

From:	Timothy Brownawell
Subject:	Re: [Monotone-devel] Testresult
Date:	Sat, 26 Dec 2009 16:30:53 -0600
User-agent:	Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.5) Gecko/20091204 Thunderbird/3.0

On 11/24/2009 1:48 PM, Judson Lester wrote:

> Following the bisect thread brought some questions about testresult back
> to mind.  This was one of the features of monotone that drew me to it,
> back around 0.14, and I've never really gotten around to really
> implementing testresult in my code projects.
>
> Is there a best practice on using the mtn testresult command?

No. In fact I'm not sure that it's even used much, probably related to what you mentioned below. I think I know a better (more flexible) approach, but it would need new hooks to be added.

> I've been
> poking at the idea of a CI-style monotone bot that would update a
> project, run the associated test suite, and mark the current revision
> with testresults.  I keep butting up against problems of suite changes:
> * If I change how a test works, are its results valid on past
> revisions, or do I need to re-run that test against past revisions?
> * If I change the name of a test, is that trackable?  Won't that break
> acceptance of future revisions, since there's an old "pass" value that
> isn't on current revisions? (Changeable in Lua, obviously)
> * Is it reasonable to use monotone style testresults and keep tests and
> code together?

Let's see, what can happen when you run an updated testsuite against a new program version...


/- Old status (Pass/Fail)
|/- New Status (Pass/Fail)
||/- Action (Add/Rename/Modify/Delete/Unchanged)
|||/- Other Action (Rename/Modify)
|||| /- Outcome/Meaning
-FA  Added failing test... probably don't care
-PA  Added passing test... assume new program version is "better" ***
F-D  Deleted failing test... probably don't care
P-D  Deleted passing test... probably don't care
FFU  Failing test... don't care, both versions suck
FPU  Test started passing... new version is better ***
PFU  Test started failing... new version is worse, don't update ***
PPU  Passing test... don't care, both versions are good
??R  Renamed test... ideally treat same as an Unchanged test
FFM  Modified failing test... both versions suck
PFM  Modified test starts failing... test was wrong, both versions suck
FPM  Modified test starts passing... new version *may* be better ***
PPM  Modified passing test... new version *may* be better ***
     (it passes a better test)
??RM Modified/Renamed test... ideally treat same as a Modified test

So:
* Renamed tests have to be distinguishable from an Add/Delete pair (or
  just don't allow tests to be renamed)
* If an unmodified test goes from Pass to Fail, that's bad
* Nothing else really matters.
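
The table and the rules above can be sketched as a small function. This is only an illustration, not monotone code: the names `classify` and `accept_update`, the dict-of-booleans representation, and the `modified` set are assumptions. Renames are assumed to be handled already by keying each test on a stable id rather than on its name.

```python
# Hypothetical sketch of the decision table; not part of monotone.
_LETTER = {True: "P", False: "F", None: "-"}

def classify(test_id, old, new, modified=frozenset()):
    """Return the three-letter code from the table (e.g. "PFU").

    old, new: dict mapping a stable test id -> True (pass) / False (fail).
    modified: ids of tests whose definition changed between the two runs.
    """
    o = _LETTER[old.get(test_id)]
    n = _LETTER[new.get(test_id)]
    if o == "-":
        action = "A"          # test only exists in the new run
    elif n == "-":
        action = "D"          # test only exists in the old run
    elif test_id in modified:
        action = "M"
    else:
        action = "U"
    return o + n + action

def accept_update(old, new, modified=frozenset()):
    """Refuse the update only if an unmodified test went from pass to fail."""
    return all(classify(t, old, new, modified) != "PFU"
               for t in set(old) | set(new))
```

Every other combination in the table is accepted, which matches "nothing else really matters".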

This could be implemented with testresult certs, by having the name of each signing key be the hash of a particular test (or a guid or something embedded in the test, if you want to be able to refactor the test without the bot thinking that the meaning has changed). But each cert includes an RSA signature (which is large), and for example the monotone testsuites have 600+ tests. Even with the normal ~4 certs/revision, certs are a significant fraction of database size, so this may not be a good idea.

Using a single cert with multiple (test-id, pass/fail) pairs would help, but this would still be rather large since you can't get delta compression. Leaving out 'pass' results would make this smaller, but would make it impossible to tell the difference between an added failing test and an existing test that started failing.
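
To make the ambiguity concrete, here is a minimal sketch in which only failing test ids are recorded. The `failures_only` helper and the test ids `t1`/`t2` are made up for illustration.

```python
def failures_only(results):
    """Hypothetical compact encoding: keep only the ids of failing tests."""
    return sorted(t for t, passed in results.items() if not passed)

# Case 1: t2 did not exist before and was added as a failing test.
before_added = {"t1": True}
# Case 2: t2 existed and passed before, then started failing.
before_regressed = {"t1": True, "t2": True}
# Both cases end up with the same new results.
after = {"t1": True, "t2": False}

# The old encodings are identical, so the two cases cannot be told apart.
same_old_encoding = failures_only(before_added) == failures_only(before_regressed)
```

With full (test-id, pass/fail) pairs the two old revisions would differ, which is why the 'pass' entries have to be kept.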

What would probably work would be a separate branch containing (only?) a file with the list of (test-id, pass/fail) pairs; revisions in this branch could be given an extra cert (because certs are indexable, and this is something you'd use for lookups) indicating which revision the contents refer to, and could maybe have an extra file with information on what version of the testsuite was used (for external testsuites).
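
One possible layout for that file, as a sketch only: one `test-id pass|fail` pair per line, sorted by test id so that successive revisions of the file differ in as few lines as possible. The format and the `dump_results`/`parse_results` names are assumptions, not anything monotone defines.

```python
def dump_results(results):
    """Serialize {test_id: bool} as sorted 'test-id pass|fail' lines."""
    lines = [f"{test_id} {'pass' if passed else 'fail'}"
             for test_id, passed in sorted(results.items())]
    return "\n".join(lines) + "\n"

def parse_results(text):
    """Parse the text produced by dump_results back into {test_id: bool}."""
    results = {}
    for line in text.splitlines():
        if not line.strip():
            continue                      # ignore blank lines
        test_id, status = line.rsplit(" ", 1)
        if status not in ("pass", "fail"):
            raise ValueError(f"bad status in line: {line!r}")
        results[test_id] = (status == "pass")
    return results
```

Keeping the order stable means a run where one test changes status touches a single line of the file.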

Doing it this way would require a new "accept_update(old_revision, new_revision)" hook, and helper functions to allow that hook to look up particular revisions (use 'automate select') and then read the contents of those revisions (use 'automate get_file_of'). (We already have an 'mtn_automate' function that would allow these, except that it will E() on its reentrancy check if called this way.)
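
Roughly, the hook logic could look like the sketch below. It is written in Python rather than as a real Lua hook, and the two callables are hypothetical stand-ins for the helpers described above: `select(source_rev)` would wrap 'automate select' to find results revisions referring to a source revision, and `get_file_of(rev, path)` would wrap 'automate get_file_of'. The file name `testresults`, the one-pair-per-line format, and the choice to accept when no results exist are all assumptions.

```python
def make_accept_update(select, get_file_of, results_path="testresults"):
    """Build an accept_update(old_rev, new_rev) function from lookup helpers."""

    def load(source_rev):
        revs = select(source_rev)         # results revisions for source_rev
        if not revs:
            return None                   # this revision was never tested
        text = get_file_of(revs[0], results_path)
        # Assumed format: one "test-id pass|fail" pair per line.
        return dict(line.rsplit(" ", 1) for line in text.splitlines() if line)

    def accept_update(old_rev, new_rev):
        old, new = load(old_rev), load(new_rev)
        if old is None or new is None:
            return True                   # assumption: no data, no objection
        # Reject only when a test that passed before now fails.
        return not any(status == "pass" and new.get(test_id) == "fail"
                       for test_id, status in old.items())

    return accept_update
```

This version does not know which tests were modified, so it treats every pass-to-fail change as a regression.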


Does this seem like a reasonable approach?
