
Re: faster tests [was: rhel8 test failure confirmation?]


From: Bogdan
Subject: Re: faster tests [was: rhel8 test failure confirmation?]
Date: Tue, 18 Apr 2023 13:07:08 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.10.0

Karl Berry <karl@freefriends.org>, Mon Apr 17 2023 22:16:38 GMT+0200 (Central European Summer Time)
Hi Bogdan,

     Then, I analysed the files and added the trick from t/backcompat2.sh
     (if possible) and/or removed the extra calls to $ACLOCAL (if possible).

Thanks much for looking into this.

     Short version: after a few hours of testing and modifications, I
     *may* have saved up to 1 minute and 12 seconds of testing...

Well, at least you get kudos for doing all the research :).


 :)


     You may look at the attached patch as a result of the investigation
     and then ... you're free to completely ignore it :). It works for me,
     but I wonder if it won't cause more confusion than it's worth...

I agree. Not worth the complications.

     t/backcompat-acout.sh: 35 -> 24s

That seems to me like the only one that might be worth applying the
patch for. Quite a bit more savings than anything else in the list.


Yes. Aclocal is called in a loop here, always with the same set of (Automake) macros in configure.ac, so it presumably always generates the same aclocal.m4 (no external macros are used either). This duplication can be avoided. It's strange that the trick doesn't work in all cases, but at least it works here.



      # A trick to make the test run muuuch faster, by avoiding repeated
      # runs of aclocal (one order of magnitude improvement in speed!).
      echo 'AC_INIT(x,0) AM_INIT_AUTOMAKE' > configure.ac

Alternatively, I wonder how much this is really saving. Maybe the trick
should not be used anywhere.


The gain is 5 seconds, I just checked: from about 17.5s to about 12.5s for the whole test. So in this particular case the saving was meaningful (it cut 25-30% of the time, even if that is just 5 wallclock seconds). Put differently, without the trick aclocal is called 7 times instead of once, so 6 skipped aclocal calls give about 5s of savings, or roughly 1s per aclocal call. In a loop with tens of iterations the saving would be visible, but otherwise it's just single seconds - or even literally 1 second, as some of my results show.
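
For reference, the pattern boils down to roughly this (a simplified sketch in the style of the test scripts - the loop body is made up for illustration, not the real t/backcompat-acout.sh):

      # Generate a configure.ac that pulls in the Automake macros and run
      # aclocal only once, before the loop:
      echo 'AC_INIT(x,0) AM_INIT_AUTOMAKE' > configure.ac
      $ACLOCAL

      # Inside the loop only configure.ac changes, and it always uses the
      # same set of macros, so the aclocal.m4 generated above stays valid
      # and the per-iteration $ACLOCAL call can be skipped:
      for ac_output in 'AC_OUTPUT' 'AC_OUTPUT(Makefile)'; do
        echo "AC_INIT(x,0) AM_INIT_AUTOMAKE $ac_output" > configure.ac
        $AUTOCONF   # no $ACLOCAL here; the real test runs $AUTOMAKE etc.
      done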


     - having 1277 .sh files in 't/' means that even if each runs in 30
     seconds, you have 10 hours of testing just from the number of tests,

Indeed. The only practical way to run make check is in parallel.  I
discovered that early on :). It still takes painfully long for me
(10-15min at best, on a new and fast machine).


I have 4 vcores and I'm afraid the full set would literally take hours to complete on my machine. 10-15 minutes is a luxury! :)
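
For the record, "in parallel" here just means something like the following, with -j matching however many cores you have (-j4 only because of my 4 vcores):

      # Run the automake test suite through its parallel test harness.
      make -j4 check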


     - it may be better to determine if there are duplicate tests

Sounds awfully hard to do.


I agree. You'd have to compare each test with every other one - sometimes within "a reasonable group" (like tests with the same name and just a number appended), sometimes with all the other tests (like the ones named after a problem ID). That's on the order of n^2 comparisons :).
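
For what it's worth, literal byte-for-byte duplicates would be cheap to find with something like this (assuming GNU md5sum/uniq), but that of course says nothing about the "nearly identical" cases:

      # Group test scripts whose checksums match; this only catches exact
      # duplicates, not near-duplicates.
      md5sum t/*.sh | sort | uniq --check-chars=32 --all-repeated=separate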


My impression is that the (vast?) majority of tests are the direct
result of bug reports. I would not be inclined to tweak, remove, merge,
or change them. Even if two tests are nearly identical, that "nearly"
can make all the difference.


Not sure about the "majority" (I didn't read each and every file), but I totally agree with the last part. Sometimes it's a simple change in one of the files that makes a problem appear and the test succeed (or fail, if that's what is expected), and not porting that change to a new "merged" test may mean we lose the test for an actual problem without realizing it - in other words, we lose coverage for some part of the code.

Furthermore, more "atomic" tests let you check single functionalities and give more "atomic" results: you can narrow down which functionality is failing, instead of having one "t/test-everything.sh", browsing through the log on each failure, and restarting the hours-long test after each fix. We don't want that. That's the "balance slider" here: more tests = more time (with no guarantee that fewer tests would actually merge anything and take less time), but also more tests = more comfort.


     - as you see above, t/pr401b.sh takes 1m42s to run. I wonder if e.g.
     running the 'distcheck' target in tests would be the main factor

Sounds very likely to me. Distcheck is inherently a lengthy process. I
can't imagine how it can be sped up. Although I agree that 1:42 seems
rather long for a trivial package like those in the tests.


It runs '$MAKE distcheck' 3 times plus one '$MAKE check' + '$MAKE distclean' pair, so fortunately it's not a single 'distcheck' that takes so long. At 25-30s per 'distcheck' there isn't a whole minute left to chop off - again, maybe just single seconds. And 25-30s doesn't sound so bad compared to 1:42...
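
To make the arithmetic explicit, the rough shape is this (a sketch of the structure as described above, not the literal script):

      # Approximate breakdown of t/pr401b.sh:
      $MAKE distcheck    # ~25-30s
      $MAKE distcheck    # ~25-30s
      $MAKE distcheck    # ~25-30s
      $MAKE check
      $MAKE distclean
      # 3 x ~25-30s already accounts for ~75-90s of the 1:42 total.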


     Same case for t/pr401c.sh and t/pr401.sh, although shorter times.

At a glance, I see required='cc libtoolize' in 401b, whereas 401c and
401 only have cc. Testing libtool really is different, and really does
take time. So I'm not sure there's any low-hanging fruit here.


I took a quick look at those 3, and you're probably right. All the 'distcheck' runs use different configurations, so they most probably have to stay as they are.


Thanks again for doing all this work,


 :)

--
Regards - Bogdan ('bogdro') D.                 (GNU/Linux & FreeDOS)
X86 assembly (DOS, GNU/Linux):    http://bogdro.evai.pl/index-en.php
Soft(EN): http://bogdro.evai.pl/soft  http://bogdro.evai.pl/soft4asm
www.Xiph.org  www.TorProject.org  www.LibreOffice.org  www.GnuPG.org



