lmi
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lmi] Unit tests hygiene


From: Greg Chicares
Subject: Re: [lmi] Unit tests hygiene
Date: Thu, 9 Jun 2022 21:37:00 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.8.0

On 5/21/22 18:34, Vadim Zeitlin wrote:
> 
>  Some time ago I proposed to run lmi unit tests under UBSAN, the undefined
> behaviour sanitizer, and I've finally tried doing this. I used the
> autotools build, as it was simple to do there because it's enough to just
> append CXX='g++ -fsanitize=undefined' to configure command line and then
> build as usual.

I've run all of lmi's nychthemeral tests with these options:

  -fsanitize=address,undefined,leak \
  -Wno-duplicated-branches \
  -fno-omit-frame-pointer \
  -fno-var-tracking \
  -fno-var-tracking-assignments \
  -O3 \
  --param max-gcse-memory=1000000 \

and now everything works, except for that xmlwrapp thing,
and excluding these unit tests (in 'objects.make'):

    actuarial_table_test \
    loads_test \
    math_functions_test \
    mortality_rates_test \

I'll look into 'math_functions_test', which has a dubious shift
operator in some fdblibm function. As for 'actuarial_table_test',
should we just force that to use '-O2', because it seems that
the failure with '-O3' is a gcc defect?

Oh, wait, I forgot one--I also have to look into this:
  /opt/lmi/src/lmi/ihs_acctval.cpp:1486:8: runtime error: load of value 253, 
which is not a valid value for type 'bool'

>  This _almost_ worked out of the box, but not quite: loads and
> mortality_rates tests refused to link with errors complaining about
> undefined references to BasicValues typeinfo.

Excluded. I don't see any easy way to make them work with UBSan:
it would take a lot of drudgerous work for little gain, as the
loads and mortality rates are already tested by 'system_test'.

> input_test.cpp:165:5: runtime error: load of value 25000, which is not a 
> valid value for type 'oenum_alb_or_anb'
> input_test.cpp:167:5: runtime error: load of value 25000, which is not a 
> valid value for type 'oenum_alb_or_anb const'

That tested a deliberate abuse, whose behavior was once unspecified,
but is now undefined--so it's been commented out.

> any_member.hpp:177:33: runtime error: load of value 2, which is not a valid 
> value for type 'bool'
> 
> which happens because mec_state::C3_init_is_mc boolean field is
> uninitialized when we "use" it value_cast() to which it is passed by
> _value_ when we call it from that line (which is in holder::assign()) of
> any_member.hpp. So this isn't a real problem neither, of course, because we
> never actually use this value and an optimizing compiler wouldn't even
> bother reading the value before calling value_cast(), but it does look like
> an uninitialized variable access to UBSAN. I see a couple of potentially
> worthwhile ways to fix it:
> 
> - Either we can actually initialize all the fields of mec_state by adding
>   "{}" to their declarations.

Done.

This makes the initialize-to-zero-using-any_member loop seem redundant,
so you had suggested removing that. I haven't done so yet because the
redundancy is benign, and I think the same idiom is used elsewhere but
I don't want to touch lots of code that doesn't need to be touched,
certainly not before all fixable UBSan and ASan issues are resolved.

>  To summarize, this was all rather underwhelming as UBSAN didn't find any
> real problems, which is pretty impressive, as I've never seen it fail to
> find anything when used for the first time with any non-trivial project. Of
> course, it's still possible that there are occurrences of UB in lmi not
> covered by the unit tests, but knowing that all of the code covered by the
> existing tests doesn't have any UB in it is already congratulations worthy
> on its own -- so please accept them!

Here's an example that I think is UB, though UBSan doesn't flag it:

    // UBSAN doesn't complain about this; shouldn't it?
    auto const b {db.query<oenum_alb_or_anb>(DB_ChildRiderMinAmt)};
    LMI_TEST_EQUAL(25000, b);

Am I missing something, or is this truly a false negative?

>  However I didn't want to stop here and, while I was sanitizing lmi,
> decided to also check the address sanitizer (ASAN). And here I was slightly
> luckier as it did find a real memory-safety problem: the code in
> fill_interval_gaps() in input_sequence.cpp can, in theory, dereference a
> dangling "last" reference because it refers to the last element of the
> "out" vector which may be reallocated after calling push_back() on it.

Fixed, though not yet pushed.

>  Except for this, ASAN only found a memory leak in input_test, but it is
> due to a problem in xmlwrapp that I've fixed there and will include in the
> patch updating wxWidgets and wxPdfDoc to ensure that all the changes get
> tested at once.

Could you update xmlwrapp separately now please? The reason why I ask
is that we're almost sanitizer-clean, and, having figured out all the
compiler options, I balk at taking even a single step back.

I'd be happy enough if I could just enable ASan and UBSan,
except that
  -fsanitize=address,undefined \
seems to imply
  -fsanitize=address,undefined,leak \
unless I failed to 'make clean' at some point where it was necessary.

BTW: reading the gcc documentation gave me the impression that this:
  -fsanitize=address,undefined,leak
doesn't enable all the checks we could have, and maybe this:
  
-fsanitize=address,undefined,leak,pointer-compare,pointer-subtract,float-divide-by-zero,float-cast-overflow
would be better. Do you happen to know whether that's correct?
My goal is to enable all available suboptions that can work
together (excluding incompatible things like TSan).

>  To really summarize now, by doing my best (worst?) I was able to find just
> a single problem in lmi using ASAN and no real problems at all using UBSAN
> which is, again, great news and is, IMO, indicative of high code quality.
> I think it would be worth enabling various sanitizers for the CI builds,
> especially if the problems preventing them from being used out of the box
> can be fixed. For ASAN this would involve just fixing the problematic code
> above and updating xmlwrapp (or disabling memory leak detection for now).
> For UBSAN it's a bit trickier, as I expect that properly using BasicValues
> from the affected tests is not that simple, or otherwise it would have been
> already done, but should hopefully still be possible -- and if not, I could
> always disable building these 2 tests in the CI build using UBSAN.

Let's just exclude those two tests for UBSan.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]