On Thu, Feb 2, 2012 at 8:28 AM, Michael Matz wrote: > > Sure.  Simplest example: struct s {int i:24;} __attribute__((packed)). > > You must access only three bytes, no matter what.  The basetype (int) is > four bytes. Ok, so here's a really *stupid* (but also really really simple) patch attached. What it does is to simply change the meaning of the SLOW_BYTE_ACCESS thing. What SLOW_BYTE_ACCESS means is currently is not just that byte accesses are slow: it means that *everything* but full-word accesses are slow! Which is (a) not what the name really implies, (b) not even the historical meaning of that #define (why? the meaning of "full word" has changed since: it used to be 32 bits, now it is 64 bits) and (c) stupid, because that's not how hardware with slow byte accesses really work anyway. Finally, it's what causes gcc to do 64-bit accesses to bitfields that aren't 64 bits in size. So because of this, I propose gcc just change the rules for what SLOW_BYTE_ACCESS means. I propose to just change it to mean "accesses smaller than 32 bits are slow". That's actually much closer to what the hardware definition tends to be. It doesn't fix the generic issue, it doesn't fix any C++11/C11 issues, it doesn't really change anything, but what it *does* do is make for a hell of a lot saner model. And it avoids bugs in practice. NOTE! On at least some machines with SLOW_BYTE_ACCESS, accesses smaller than a word cannot possibly be atomic anyway (well, not without jumping through hoops), so the fact that we still extend to 32 bits and the problem of touching too much still exists with 'char' and 'short' variables that are in the same 32-bit word as a bitfield is kind of unavoidable. So this suggested patch doesn't really "fix" anything fundamental, but it is (a) simple, (b) totally untested and (c) at least fixes *some* cases. For example, it might fix the 'sig_atomic_t' shadowning, since that is usually 'int'. It obviously can never fix the issue with volatile char/short, but at least it works around it for 'long'. In other words - I'm not trying to say that this patch really "fixes" anything (except sig_atomic_t interaction, which really does get fixed for the normal 'int' case). But what it does do is to limit the damage of a bad situation. And it is small, and should hopefully be easy to accept even for stable versions of gcc. So can we please do something like this for maintenance releases, and consider the much more complex C++11/C11 issues to be a separate much bigger issue for the future? Because the current SLOW_BYTE_ACCESS meaning really is crap. The *only* thing that architecture define seems to do is literally the bitfield extract semantics, and it does that horribly horribly badly as shown by the bug on both 64-bit POWER and Sparc. For no good reason - both of those have perfectly fine word operations afaik. I did not look at other architectures that define SLOW_BYTE_ACCESS, but if they have a 32-bit integer type, I'm pretty sure they support fast loads/stores of it. Hacky? Yes. Pretty? No. But better than the disaster that is there now? Hell yes. Linus PS. The only reason I hardcoded "32" was that I didn't know how to ask the quesion "is this mode at least as wide as 'int'" in the proper gcc way. So I'm really not suggesting you apply this patch as-is, I'm just sending it out as a "hey, this is a pretty obvious way to work around part of the problem, somebody who really knows what they are doing should probably improve on it".