[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: bug#36370: 27.0.50; XFIXNAT called on negative numbers

From: Pip Cet
Subject: Re: bug#36370: 27.0.50; XFIXNAT called on negative numbers
Date: Fri, 28 Jun 2019 21:07:37 +0000

On Fri, Jun 28, 2019 at 7:11 PM Bruno Haible <address@hidden> wrote:
> Pip Cet wrote:
> > Sorry, can't reproduce that here. I'm sure the changes I need to make
> > are obvious once I've found them, but can you let me know your gcc
> > version?
> I reproduce this with GCC versions 5.4.0, 6.5.0, 7.4.0, 8.3.0, and 9.1.0.
> 1. Take the files foo.c and bar.c from
>    <https://lists.gnu.org/archive/html/bug-gnulib/2019-06/msg00096.html>
> 2. Compile and disassemble:
>     gcc -O2 -m32 -flto foo.c bar.c -shared -o libfoo.so && objdump 
> --disassemble
> libfoo.so
>    Observe that f_assume pushes an immediate $0 argument on the stack for the
>    function call.
> 3. Enable the second 'assume' definition instead of the first one.
> 4. Compile and disassemble:
>     gcc -O2 -m32 -flto foo.c bar.c -shared -o libfoo.so && objdump 
> --disassemble
> libfoo.so
>    Observe that f_assume is now an alias of f_generic, that pushes a
>    computed value as argument on the stack for the function call (push %eax).

Thanks. I have no idea what I was doing wrong, but I can confirm that
this case indeed demonstrates a GCC weakness when there is a non-split
eassume (A && B), where A and B are both assumable (in the sense that
__builtin_constant_p(!A == !A).

> > Sorry to be pedantic, but do you disagree that it is better in these
> > cases, or in general?
> I disagree that it is better in general.

> You're apparently setting out a high goal for the 'assume' macro:
> (1) that the programmer may call it with an expression that involves
>     function calls,

That's the status quo in Emacs and many other projects that have
started believing the "an inline function is as fast as a macro"
mantra*, assuming you include inline functions with "function calls".

(* - as have I)

> (2) that the generated code will never include these function calls,
>     because the generated code with the 'assume' invocation should be
>     optimized at least as well as the generated code without the
>     'assume' invocation.

I think it should be the rarest of exceptions for an assume() to
result in slower code, yes. I believe that includes the case where
functions marked inline aren't inlined, because of compiler options,
for example.

> I'm adding certain quality criteria:
>   - It is not "good" if a construct behaves unpredictably, that is,
>     if it hard to document precisely how it will behave. (*)

Uhm, isn't the whole point of assume() to make the program behavior
undefined in a certain case?

int i = 1;
assume(i == 0);

will behave unpredictably. That's why we use the assume(), to speed up
execution by providing the compiler with an expression which "may or
may not be evaluated", and allowing it to do just anything if the
expression evaluates to false. That's the documented API.

If you want your program to behave predictably, in the strict sense,
you cannot ever use the current assume() API. If you want it to behave
predictably in some looser sense, the sense I suggest is that
assume(C) may or may not evaluate C. Again, that's what the
documentation says.

>   - It is not "good" if the behaviour with no LTO is different from
>     the behaviour with LTO.

I agree I should investigate the LTO example further, but please let's
keep in mind that I observed problematic behavior of assume(A && B) in
other contexts, so I'm not convinced LTO is the decisive factor here.

However, optimization can and will influence program behavior in many
ways. One of these ways is whether the argument to assume, which the
programmer knows "may or may not be evaluated", is evaluated or not.

> The implementation with the __builtin_constant, while attaining the
> goals (1) and (2), does not satisfy the two quality criteria.

For the record, my quality criteria are:

(1) implement the documented API, and don't change it
(2) when optimizing for speed, do not produce slower code with
eassume() than we would without it. Even when the programmer wrongly
guessed that a function would be inlined.
(3) when optimizing for size, do not produce larger code with
eassume() than we would without it. Even when inline functions are not
(4) be at least as fast as gnulib assume()

> I believe the only way to attain the goals and the quality criteria
> is, as you suggested, to ask the GCC people to add a __builtin_assume
> built-in.

I think there's a significant probability that the GCC people would
agree to add such a built-in, but insist on its having "may or may not
evaluate its argument" semantics.

> > >   1. The new 'assume' is worse when -flto is in use.
> >
> > Maybe. Even if it is, though, that's a GCC limitation which I consider
> > likely to be fixable
> Yes, *maybe* the GCC people can change the semantics of __builtin_constant_p
> so that it is effectively computed at link-time, rather than when a single
> compilation unit gets compiled. Or maybe not. I don't know...

Hmm. My observations suggest that __builtin_constant_p is effectively
computed at link-time; all I have to do is to split the assume() in
your example.

> > It's way too easy to do something like
> >
> > eassume(ptr->field >= 0 && f(ptr));
> >
> > when what you mean is
> >
> > eassume(ptr->field >= 0);
> > eassume(f(ptr));
> I argue that it's unnatural if the two don't behave exactly the same.

But you're arguing for a __builtin_assume which doesn't evaluate its
argument, still? Because

  __builtin_assume(i == 0);

could validly be optimized to i = 1, while

  __builtin_assume(i == 0 && infloop());

could not be.

Or did I misunderstand the semantics you suggested for __builtin_assume?

> Like everyone expects that
>   x = foo ? yes : no;
> is equivalent to
>   if (foo) x = yes; else x = no;

It is.

> And everyone expects that
>   if (A && B) { ... }
> is equivalent to
>   if (A) if (B) { ... }

It is.

Sorry if I'm being a bit dense here.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]