On Thu, Feb 2, 2012 at 8:28 AM, Michael Matz <matz@suse.de> wrote:
>
> Sure.  Simplest example: struct s {int i:24;} __attribute__((packed)).
>
> You must access only three bytes, no matter what.  The basetype (int) is
> four bytes.

Ok, so here's a really *stupid* (but also really really simple) patch attached.

What it does is to simply change the meaning of the SLOW_BYTE_ACCESS thing.

What SLOW_BYTE_ACCESS currently means is not just that byte
accesses are slow: it means that *everything* but full-word accesses
are slow!

Which is (a) not what the name really implies, (b) not even the
historical meaning of that #define (why? the meaning of "full word"
has changed since: it used to be 32 bits, now it is 64 bits) and (c)
stupid, because that's not how hardware with slow byte accesses really
works anyway.

Finally, it's what causes gcc to do 64-bit accesses to bitfields that
aren't 64 bits in size.
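
To make the failure mode concrete, here is a minimal model of it in
plain C. This is not gcc code: the layout (a 32-bit lock at bytes 0-3,
with a one-bit bitfield stored from byte 4) and every name in it are
assumptions picked for illustration. It only computes which bytes a
naturally aligned read-modify-write of a given width would cover.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout, for illustration only: a 32-bit lock occupies
 * bytes [0,4) and the bitfield's storage starts at byte 4. */
enum { LOCK_START = 0, LOCK_END = 4, BITFIELD_BYTE = 4 };

/* Return 1 if a naturally aligned access of `width` bytes that
 * contains byte `field_byte` also overlaps the byte range [lo, hi). */
static int access_touches(size_t field_byte, size_t width,
                          size_t lo, size_t hi)
{
    size_t start = field_byte - (field_byte % width);
    size_t end = start + width;
    return start < hi && lo < end;
}

/* Does a read-modify-write of the bitfield at this width rewrite the
 * neighbouring lock as well? */
static int rmw_clobbers_lock(size_t width)
{
    return access_touches(BITFIELD_BYTE, width, LOCK_START, LOCK_END);
}
```

In this model rmw_clobbers_lock(8) is 1 and rmw_clobbers_lock(4) is 0:
the 64-bit access rewrites the lock, the 32-bit access leaves it alone.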

So because of this, I propose gcc just change the rules for what
SLOW_BYTE_ACCESS means.

I propose to just change it to mean "accesses smaller than 32 bits are
slow". That's actually much closer to what the hardware definition
tends to be.
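
As a rough sketch of the difference between the two rules (a toy
model, not the actual gcc logic, and it ignores alignment and the bit
offset of the field), assume the compiler first picks the smallest
power-of-two width that holds the field and then applies the
SLOW_BYTE_ACCESS rule. The function names and the word_bits parameter
are made up for this example.

```c
#include <assert.h>

/* Smallest power-of-two access width (in bits, at least 8) that can
 * hold a field of `field_bits` bits. */
static unsigned smallest_width(unsigned field_bits)
{
    unsigned w = 8;
    while (w < field_bits)
        w *= 2;
    return w;
}

/* Current meaning: with SLOW_BYTE_ACCESS set, anything but a
 * full-word access is treated as slow, so always use the full word. */
static unsigned old_rule(unsigned field_bits, unsigned word_bits, int slow)
{
    return slow ? word_bits : smallest_width(field_bits);
}

/* Proposed meaning: only accesses narrower than 32 bits are slow, so
 * widen to 32 bits and no further. */
static unsigned new_rule(unsigned field_bits, int slow)
{
    unsigned w = smallest_width(field_bits);
    return (slow && w < 32) ? 32 : w;
}
```

For a one-bit field on a 64-bit target with SLOW_BYTE_ACCESS set, the
old rule gives a 64-bit access and the new rule gives a 32-bit one.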

It doesn't fix the generic issue, it doesn't fix any C++11/C11 issues,
it doesn't really change anything, but what it *does* do is make for a
hell of a lot saner model. And it avoids bugs in practice.

NOTE! On at least some machines with SLOW_BYTE_ACCESS, accesses
smaller than a word cannot possibly be atomic anyway (well, not
without jumping through hoops), so the fact that we still extend to 32
bits and the problem of touching too much still exists with 'char' and
'short' variables that are in the same 32-bit word as a bitfield is
kind of unavoidable.

So this suggested patch doesn't really "fix" anything fundamental, but
it is (a) simple, (b) totally untested and (c) at least fixes *some*
cases.

For example, it might fix the 'sig_atomic_t' shadowing, since that is
usually 'int'. It obviously can never fix the issue with volatile
char/short, but at least it works around it for 'long'.

In other words - I'm not trying to say that this patch really "fixes"
anything (except sig_atomic_t interaction, which really does get fixed
for the normal 'int' case). But what it does do is to limit the damage
of a bad situation.

And it is small, and should hopefully be easy to accept even for
stable versions of gcc.

So can we please do something like this for maintenance releases, and
consider the much more complex C++11/C11 issues to be a separate much
bigger issue for the future?

Because the current SLOW_BYTE_ACCESS meaning really is crap. The
*only* thing that architecture define seems to do is literally the
bitfield extract semantics, and it does that horribly horribly badly
as shown by the bug on both 64-bit POWER and Sparc. For no good reason
- both of those have perfectly fine word operations afaik.

I did not look at other architectures that define SLOW_BYTE_ACCESS,
but if they have a 32-bit integer type, I'm pretty sure they support
fast loads/stores of it.

Hacky? Yes. Pretty? No. But better than the disaster that is there
now? Hell yes.

                                   Linus

PS. The only reason I hardcoded "32" was that I didn't know how to ask
the question "is this mode at least as wide as 'int'" in the proper gcc
way. So I'm really not suggesting you apply this patch as-is, I'm just
sending it out as a "hey, this is a pretty obvious way to work around
part of the problem, somebody who really knows what they are doing
should probably improve on it".