tinycc-devel

Re: [Tinycc-devel] Weird bitfield size handling, discrepancy with gcc


From: David Mertens
Subject: Re: [Tinycc-devel] Weird bitfield size handling, discrepancy with gcc
Date: Tue, 22 Nov 2016 16:11:29 -0500

Hello avih,

In short, very little is expected to break. Very few libraries present a public API susceptible to this sort of problem. The linked discussion mentions that this could be an issue with GTK+ v2, which apparently exposes these sorts of bitfields in its public API. Obviously parts of the Perl API could have issues. More examples of failures, or lack of failures, are discussed here: http://mingw-users.1079350.n2.nabble.com/mms-bitfields-td2288523.html

To be clear, failures in this case would come in the form of corrupt data and, therefore, segmentation faults when that corrupt data is a pointer. Stack misalignment is also possible, though I'm not sure how that would manifest itself.

David

On Tue, Nov 22, 2016 at 2:21 PM, avih <address@hidden> wrote:
Thanks for the detailed reply. I haven't yet read the linked discussion, but I will.

I like that it will work better on Linux, but the latter part of my question was intended to understand better what would stop working on Windows if you take approach #1 but the flag isn't used when invoking tcc.


On Tuesday, November 22, 2016 8:38 PM, David Mertens <address@hidden> wrote:


Hello,

Folks have been using tcc on Linux for years and have not reported this as an issue. Judging from the evidence, the situations where this is problematic are vanishingly rare. That's because it is only an issue when a library presents the following sort of struct as part of its public API:

struct {
    void * some;
    double data;
    int a : 2;
    char vowel;
}

On 32-bit systems, the "int" type of the bitfield a forces a to be aligned to four bytes. But does the bitfield occupy four bytes, followed by a single char, so that the struct's last two members consume eight bytes? Or does the bitfield occupy a single byte, followed by a one-byte char and two unused bytes? VC++ does the former; gcc without -mms-bitfields does the latter.

Perl uses gcc's close-packing behavior. That is what made me aware of the problem. Even so, it is possible to compile perl.h and call functions from the Perl API with tcc without running into trouble. If you use functions that only need struct info that comes ahead of this bitfield stuff, your layouts will still agree. This means you can directly manipulate s->some, s->data, and s->a, even when the alignment of vowel is wrong. Obviously, if you only pass around pointers to your structs, you can call functions and pass the pointers without any problems. It's really only an issue in two circumstances: (1) if you are allowed/expected to directly manipulate the elements of the struct (or use macros that do so, which was the case with the Perl API), or (2) if you are supposed to pass the struct to a function by value, rather than via a pointer, in which case the stack will not be properly set up.

Here is a discussion of enabling -mms-bitfields by default for another project. You might find it informative: https://groups.google.com/forum/#!topic/cocotron-dev/jwhEynyoZms

To reiterate, there is currently no way to get this to work correctly on Linux. I'm proposing something that'll make it possible for everyone to achieve the correct behavior, albeit possibly with an additional command-line flag if you're on Windows.

David

On Tue, Nov 22, 2016 at 9:40 AM, avih <address@hidden> wrote:
Sounds fine to me to have better interoperability with existing/common tools on Linux.

> If a library was compiled on Linux with gcc, there is no way to compile consuming code using tcc that is binary-compatible.

Could you please elaborate on the scope of the affected use cases? Does it apply to both .a and .so libs? I think the Windows side has the potential to suffer more, if I understand it correctly.

Assuming we take suggestion 1 (so on Windows we'd need to invoke tcc with a special CLI flag to set a non-default alignment) but we _don't_ use that flag, what would work that didn't before? Interoperability with gcc? What wouldn't work? Linking with libs created by native Windows tools (MSVC)? What about using Windows DLLs (which happens almost always, as AFAIK binaries built with tcc use the MS CRT for many things, and tcc comes with interface definitions for major system DLLs)?

etc.

Thanks.


On Tuesday, November 22, 2016 5:16 AM, David Mertens <address@hidden> wrote:


Hello all,

I have finally found a bit of time to work on this. Just to re-iterate, I've found variation in alignment with bitfields across different flavors of compilers, and tcc is incompatible with gcc on Linux. If a library was compiled on Linux with gcc, there is no way to compile consuming code using tcc that is binary-compatible. I've listed the test program and the known results below.

I believe the sensible thing to do is to make tcc more gcc-ish. This would involve (1) making tcc's default behavior align with gcc's default behavior, which appears to be consistent across platforms, and (2) providing the -mms-bitfields command-line option for Windows folks who need it. This makes binary compatibility possible across both Linux and Windows, and asks no more of Windows folks than what gcc asks.

Alternatively, we could make tcc's default behavior configurable. This would require an additional configure flag. It would also require both -mms-bitfields and -no-mms-bitfields (or whatever the proper name for that flag should be). And finally, it would probably be best to make the default configuration OS-dependent.

I prefer the first option because (a) it is simpler and (b) any configuration tools that think they're working with gcc will be smart enough to include the -mms-bitfields flag when it's needed on Windows. (Perl's build chain does this, for example.) Pragmatically, it is what I have the time to accomplish. I think the second approach is in some ways "better", but it'll also add a bunch of configuration code that I think tcc would be better without.

Preferences?
David


Here is a re-iteration of known results, and a new one: mingw on Windows.

--------%<--------
#include <stdint.h>
#include <stdio.h>
struct t1 {
    uint8_t op_type:1;
    uint8_t op_flags;
};
struct t2 {
    uint32_t op_type:1;
    uint8_t op_flags;
};
struct t3 {
    unsigned op_type:1;
    char op_flags;
};

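int main(void) {
    /* sizeof yields size_t; cast so the format specifier matches on every platform */
    printf("t1 struct size: %lu\n", (unsigned long)sizeof(struct t1));
    printf("t2 struct size: %lu\n", (unsigned long)sizeof(struct t2));
    printf("t3 struct size: %lu\n", (unsigned long)sizeof(struct t3));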
    return 0;
}
-------->%--------

With tcc on 64-bit Linux, this prints:
t1 struct size: 2
t2 struct size: 8
t3 struct size: 8

With gcc on 64-bit Linux, this prints:
t1 struct size: 2
t2 struct size: 4
t3 struct size: 4

With i686-w64-mingw32 (i.e. with MinGW on 64-bit Windows), this prints:
t1 struct size: 2
t2 struct size: 4
t3 struct size: 4

According to Christian Jullien, VC++ 32 and 64 both return:
t1 struct size: 2
t2 struct size: 8
t3 struct size: 8

--
 "Debugging is twice as hard as writing the code in the first place.
  Therefore, if you write the code as cleverly as possible, you are,
  by definition, not smart enough to debug it." -- Brian Kernighan

_______________________________________________
Tinycc-devel mailing list
address@hidden
https://lists.nongnu.org/mailman/listinfo/tinycc-devel


