Re: [Monotone-devel] Testresult
From: Timothy Brownawell
Subject: Re: [Monotone-devel] Testresult
Date: Sat, 26 Dec 2009 16:30:53 -0600
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.5) Gecko/20091204 Thunderbird/3.0
On 11/24/2009 1:48 PM, Judson Lester wrote:
> Following the bisect thread brought some questions about testresult back
> to mind. This was one of the features of monotone that drew me to it,
> back around 0.14, and I've never really gotten around to
> implementing testresult in my code projects.
> Is there a best practice on using the mtn testresult command?
No. In fact I'm not sure that it's even used much, probably related to
what you mentioned below. I think I know a better (more flexible)
approach, but it would need new hooks to be added.
> I've been
> poking at the idea of a CI-style monotone bot that would update a
> project, run the associated test suite, and mark the current revision
> with testresults. I keep butting up against problems of suite changes:
> * If I change how a test works, are its results valid on past
> revisions, or do I need to re-run that test against past revisions?
> * If I change the name of a test, is that trackable? Won't that break
> acceptance of future revisions, since there's an old "pass" value that
> isn't on current revisions? (Changeable in Lua, obviously)
> * Is it reasonable to use monotone style testresults and keep tests and
> code together?
Let's see, what can happen when you run an updated testsuite against a
new program version...
/- Old status (Pass/Fail)
|/- New status (Pass/Fail)
||/- Action (Add/Rename/Modify/Delete/Unchanged)
|||/- Other action (Rename/Modify)
||||  /- Outcome/Meaning
-FA   Added failing test... probably don't care
-PA   Added passing test... assume new program version is "better" ***
F-D   Deleted failing test... probably don't care
P-D   Deleted passing test... probably don't care
FFU   Failing test... don't care, both versions suck
FPU   Test started passing... new version is better ***
PFU   Test started failing... new version is worse, don't update ***
PPU   Passing test... don't care, both versions are good
??R   Renamed test... ideally treat same as an Unchanged test
FFM   Modified failing test... both versions suck
PFM   Modified test starts failing... test was wrong, both versions suck
FPM   Modified test starts passing... new version *may* be better ***
PPM   Modified passing test... new version *may* be better ***
      (it passes a better test)
??RM  Modified/Renamed test... ideally treat same as a Modified test
So:
* Renamed tests have to be distinguishable from an Add/Delete pair (or
just don't allow tests to be renamed)
* If an unmodified test goes from Pass to Fail, that's bad
* Nothing else really matters.
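As a rough illustration of that rule, here is a minimal Python sketch (the real hook would be Lua; Python is used only to keep the example self-contained). The function name update_is_acceptable, the dict-of-booleans representation, and the "modified" set are all assumptions made for the example, and it assumes test ids survive renames so a rename does not look like an Add/Delete pair.

```python
from typing import Dict, FrozenSet


def update_is_acceptable(
    old: Dict[str, bool],
    new: Dict[str, bool],
    modified: FrozenSet[str] = frozenset(),
) -> bool:
    """Return False only if an unmodified test went from Pass to Fail.

    old / new map a stable test id to True (pass) or False (fail).
    modified holds the ids of tests whose contents changed between the
    two revisions; per the table above, those never block an update.
    """
    for test_id, old_passed in old.items():
        if test_id in modified:
            continue  # FFM/PFM/FPM/PPM rows: ignore for blocking purposes
        if old_passed and new.get(test_id) is False:
            return False  # PFU row: the new version is worse
    # Added tests, deleted tests, and everything else do not matter.
    return True
```

Deleted tests are simply absent from the new mapping, so new.get() returns None for them and they never block the update.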
This could be implemented with testresult certs, by having the name of
each signing key be the hash of a particular test (or a guid or
something embedded in the test, if you want to be able to refactor the
test without the bot thinking that the meaning has changed). But each
cert includes an RSA signature (which is large), and the monotone
testsuites, for example, have 600+ tests. Even with the normal ~4
certs/revision, certs are a significant fraction of database size, so
this may not be a good idea.
Using a single cert with multiple (test-id, pass/fail) pairs would help,
but this would still be rather large since you can't get delta
compression. Leaving out 'pass' results would make this smaller, but
would make it impossible to tell the difference between an added failing
test and an existing test that started failing.
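To make the ambiguity concrete, a small Python sketch (the names failing_only and newly_failing are made up for the example): once 'pass' results are dropped, an old revision where the test passed and an old revision where the test did not exist encode identically, so the comparison gives the same answer for both.

```python
def failing_only(results):
    """Encode a result set by keeping only the ids of failing tests."""
    return sorted(t for t, passed in results.items() if not passed)


def newly_failing(old_failing, new_failing):
    """Ids that appear as failing in the new list but not in the old one."""
    return sorted(set(new_failing) - set(old_failing))


old_with_passing_test = {"t1": True}  # t1 existed and passed
old_without_test = {}                 # t1 had not been written yet
new = {"t1": False}                   # t1 fails in the new revision

# Both old revisions collapse to the same (empty) encoding...
assert failing_only(old_with_passing_test) == failing_only(old_without_test)

# ...so "started failing" and "added failing" cannot be told apart.
case_a = newly_failing(failing_only(old_with_passing_test), failing_only(new))
case_b = newly_failing(failing_only(old_without_test), failing_only(new))
assert case_a == case_b
```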
What would probably work would be a separate branch containing (only?) a
file with the list of (test-id, pass/fail) pairs; revisions in this
branch could be given an extra cert (because certs are indexable, and
this is something you'd use for lookups) indicating which revision the
contents refer to, and could maybe have an extra file with information
on what version of the testsuite was used (for external testsuites).
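A possible shape for that file, sketched in Python. The format (one "test-id pass" or "test-id fail" line per test, sorted by id) is purely an assumption for illustration; nothing in monotone defines it. Sorting keeps the line order stable between revisions, so a change in one test's result shows up as a one-line difference.

```python
def serialize_results(results):
    """Write one 'test-id pass|fail' line per test, sorted by test id."""
    return "".join(
        f"{test_id} {'pass' if passed else 'fail'}\n"
        for test_id, passed in sorted(results.items())
    )


def parse_results(text):
    """Parse the format written by serialize_results into {test_id: bool}."""
    results = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        # Split on the last run of whitespace: the status is the final word.
        test_id, status = line.rsplit(None, 1)
        if status not in ("pass", "fail"):
            raise ValueError(f"unknown status in line: {line!r}")
        results[test_id] = status == "pass"
    return results
```

For example, serialize_results({"b": False, "a": True}) produces the two lines "a pass" and "b fail", and parse_results turns that text back into the same mapping.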
Doing it this way would require a new "accept_update(old_revision,
new_revision)" hook, and helper functions to allow that hook to look up
particular revisions (use 'automate select') and then read the contents
of those revisions (use 'automate get_file_of'). (We already have an
'mtn_automate' function that would allow these, except that it will E()
on its reentrancy check if called this way.)
Does this seem like a reasonable approach?