lmi
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lmi] Unit tests hygiene


From: Greg Chicares
Subject: Re: [lmi] Unit tests hygiene
Date: Fri, 3 Jun 2022 16:29:18 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.8.0

On 5/21/22 18:34, Vadim Zeitlin wrote:
> 
>  Some time ago I proposed to run lmi unit tests under UBSAN, the undefined
> behaviour sanitizer, and I've finally tried doing this. I used the
> autotools build, as it was simple to do there because it's enough to just
> append CXX='g++ -fsanitize=undefined' to configure command line and then
> build as usual.

I believe something like this will do the same in 'workhorse.make'
(and, notably, $(debug_flag) gets passed to the linker as well as
to the compiler):

-debug_flag := -ggdb
+# debug_flag := -ggdb -fsanitize=undefined -fno-sanitize-recover=all 
--param=max-vartrack-size=60000000
+
+debug_flag := -ggdb -fsanitize=undefined -fno-sanitize-recover=all 
-fno-var-tracking-assignments

My first question is whether you ever got 'i7702.cpp' to finish
compiling. It's gone for many minutes here already, even with
the second (not '#'-commented) set of flags above.

>  This _almost_ worked out of the box, but not quite: loads and
> mortality_rates tests refused to link with errors complaining about
> undefined references to BasicValues typeinfo.
[...]
> The file loads_test.cpp currently uses a quite horrible hack
[...]
> and so finally I just abandoned the idea and disabled these
> two tests.

The only real solution is a thoroughgoing rewrite of the code
underlying those tests. I have little ambition to do that,
so it's expedient to disable these two tests for UBSAN.

>  And then running "make check" didn't find _any_ problems.

I've seen some things on the terminal, but you're looking
for the return code...

> so I double checked... and found that gcc implementation of UBSAN
> doesn't set the error code even if UBSAN finds any problems (unlike clang
> one, which exits with 127 error code in this case, resulting in the tests
> failing, as would be expected, rather than passing). So, if you ever run
> into this too, you either need to compile with -fno-sanitize-recover=all in
> addition to -fsanitize=undefined itself, or set UBSAN_OPTIONS environment
> variable to contain ""halt_on_error=1", which is what I did.

Thanks, I'm trying that (above).

>  After doing this I did find (only!) 2 errors. First one occurs in
> input_test:
> 
> input_test.cpp:165:5: runtime error: load of value 25000, which is not a 
> valid value for type 'oenum_alb_or_anb'
> input_test.cpp:167:5: runtime error: load of value 25000, which is not a 
> valid value for type 'oenum_alb_or_anb const'
> 
> This happens for the part of the code for which we already had had to
> disable a clang warning and the fact that UBSAN doesn't like it neither
> makes me wonder if the comment saying that "C++ allows that" is actually
> correct.

The specific code is
  return static_cast<T>(bourn_cast<std::underlying_type_t<T>>(d));
whose behavior was once unspecified, but is now undefined:
  https://cplusplus.github.io/CWG/issues/1766.html

> In any case, this is not a real problem, of course, but perhaps it
> would be better to remove this part of the test to avoid to deal with the
> compilation and UB-testing issues arising due to it? Or is it
> representative of what the actual code does and so you consider it
> important to keep it? FWIW I couldn't find any way to avoid UBSAN warning
> about doing this.

It is desirable for UBSAN to complain about this. It would be
even more desirable for the compiler to issue a diagnostic,
but of course it doesn't. The real question is whether we can
write code that detects it; if, as I suspect, that's not possible,
then the tests should be commented out. I'd leave them there, as
comments, to document that such a thing can occur without any
diagnostic.

>  The second error is in irc7702a_test:
> 
> any_member.hpp:177:33: runtime error: load of value 2, which is not a valid 
> value for type 'bool'

Can you easily tell on which line of 'irc7702a_test.cpp' that occurs?

> which happens because mec_state::C3_init_is_mc boolean field is
> uninitialized when we "use" it value_cast() to which it is passed by
> _value_ when we call it from that line (which is in holder::assign()) of
> any_member.hpp. So this isn't a real problem neither, of course, because we
> never actually use this value and an optimizing compiler wouldn't even
> bother reading the value before calling value_cast(), but it does look like
> an uninitialized variable access to UBSAN. I see a couple of potentially
> worthwhile ways to fix it:
> 
> - Either we can actually initialize all the fields of mec_state by adding
>   "{}" to their declarations.

Apparently I was assuming that all code paths would initialize this
field to a valid boolean value, and I'd like to know specifically
why that's untrue. Reconsidering that might lead to the discover of
other latent issues. This is especially important because this seems
to be the only case of UB that UBSAN found, excluding the 'input_test'
case whose intention is to point out UB that can arise undiagnosed.

I hesitate to default-initialize everything just to make the problem
"go away". But ultimately we probably will want to default-initialize
everything, after diagnosing the root cause and remediating it.

> - Or we could change "To" in value_cast() to "To const&" to ensure that the
>   value never needs to be accessed.

Can you readily state the location of the offending value_cast()?

>  To summarize, this was all rather underwhelming as UBSAN didn't find any
> real problems, which is pretty impressive, as I've never seen it fail to
> find anything when used for the first time with any non-trivial project. Of
> course, it's still possible that there are occurrences of UB in lmi not
> covered by the unit tests, but knowing that all of the code covered by the
> existing tests doesn't have any UB in it is already congratulations worthy
> on its own -- so please accept them!

Thank you, that's good news.

>  However I didn't want to stop here and, while I was sanitizing lmi,
> decided to also check the address sanitizer (ASAN).
Let's split this thread; I'll reply to that part later, perhaps after
'i7702.cpp' builds. It hasn't finished yet.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]