inet: frags: Turn fqdir->dead into an int for old Alphas
diff mbox series

Message ID 20190607140949.tzwyprrhmqdx33iu@gondor.apana.org.au
State New, archived
Headers show
Series
  • inet: frags: Turn fqdir->dead into an int for old Alphas
Related show

Commit Message

Herbert Xu June 7, 2019, 2:09 p.m. UTC
On Tue, Jun 04, 2019 at 09:04:55AM -0700, Linus Torvalds wrote:
>
> In fact, the alpha port was always subtly buggy exactly because of the
> "byte write turns into a read-and-masked-write", even if I don't think
> anybody ever noticed (we did fix cases where people _did_ notice,
> though, and we might still have some cases where we use 'int' for
> booleans because of alpha issues.).

This is in fact a real bug in the code in question that no amount
of READ_ONCE/WRITE_ONCE would have caught.  The field fqdir->dead is
declared as boolean so writing to it is not atomic (on old Alphas).

I don't think it currently matters because padding would ensure
that it is in fact 64 bits long.  However, should someone add another
char/bool/bitfield in this struct in future it could become an issue.

So let's fix it.

---8<--
The field fqdir->dead is meant to be written (and read) atomically.
As old Alpha CPUs can't write a single byte atomically, we need at
least an int for it to work.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Comments

Eric Dumazet June 7, 2019, 3:26 p.m. UTC | #1
On 6/7/19 7:09 AM, Herbert Xu wrote:
> On Tue, Jun 04, 2019 at 09:04:55AM -0700, Linus Torvalds wrote:
>>
>> In fact, the alpha port was always subtly buggy exactly because of the
>> "byte write turns into a read-and-masked-write", even if I don't think
>> anybody ever noticed (we did fix cases where people _did_ notice,
>> though, and we might still have some cases where we use 'int' for
>> booleans because of alpha issues.).
> 
> This is in fact a real bug in the code in question that no amount
> of READ_ONCE/WRITE_ONCE would have caught.  The field fqdir->dead is
> declared as boolean so writing to it is not atomic (on old Alphas).
> 
> I don't think it currently matters because padding would ensure
> that it is in fact 64 bits long.  However, should someone add another
> char/bool/bitfield in this struct in future it could become an issue.
> 
> So let's fix it.


There is common knowledge among us programmers that bit fields
(or bool) sharing a common 'word' need to be protected
with a common lock.

Converting all bit fields to plain int/long would be quite a waste of memory.

In this case, fqdir_exit() is called right before the whole
struct fqdir is dismantled, and the only cpu that could possibly
change the thing is ourself, and we are going to start an RCU grace period.

Note that first cache line in 'struct fqdir' is read-only.
Only ->dead field is flipped to one at exit time.

Your patch would send a strong signal to programmers to not even try using
bit fields.

Do we really want that ?

> 
> ---8<--
> The field fqdir->dead is meant to be written (and read) atomically.
> As old Alpha CPUs can't write a single byte atomically, we need at
> least an int for it to work.
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> 
> diff --git a/include/net/inet_frag.h b/include/net/inet_frag.h
> index e91b79ad4e4a..8c458fba74ad 100644
> --- a/include/net/inet_frag.h
> +++ b/include/net/inet_frag.h
> @@ -14,7 +14,9 @@ struct fqdir {
>  	int			max_dist;
>  	struct inet_frags	*f;
>  	struct net		*net;
> -	bool			dead;
> +
> +	/* We can't use boolean because this needs atomic writes. */
> +	int			dead;
>  
>  	struct rhashtable       rhashtable ____cacheline_aligned_in_smp;
>  
> diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
> index 35e9784fab4e..05aa7c145817 100644
> --- a/net/ipv4/inet_fragment.c
> +++ b/net/ipv4/inet_fragment.c
> @@ -193,7 +193,7 @@ void fqdir_exit(struct fqdir *fqdir)
>  {
>  	fqdir->high_thresh = 0; /* prevent creation of new frags */
>  
> -	fqdir->dead = true;
> +	fqdir->dead = 1;
>  
>  	/* call_rcu is supposed to provide memory barrier semantics,
>  	 * separating the setting of fqdir->dead with the destruction
>
Herbert Xu June 7, 2019, 3:32 p.m. UTC | #2
On Fri, Jun 07, 2019 at 08:26:12AM -0700, Eric Dumazet wrote:
>
> There is common knowledge among us programmers that bit fields
> (or bool) sharing a common 'word' need to be protected
> with a common lock.
> 
> Converting all bit fields to plain int/long would be quite a waste of memory.
> 
> In this case, fqdir_exit() is called right before the whole
> struct fqdir is dismantled, and the only cpu that could possibly
> change the thing is ourself, and we are going to start an RCU grace period.
> 
> Note that first cache line in 'struct fqdir' is read-only.
> Only ->dead field is flipped to one at exit time.
> 
> Your patch would send a strong signal to programmers to not even try using
> bit fields.
> 
> Do we really want that ?

If this were a bitfield then I'd think it would be safer because
anybody adding a new bitfield is unlikely to try modifying both
fields without locking or atomic ops.

However, because this is a boolean, I can certainly see someone
else coming along and adding another bool right next to it and
expecting writes them to still be atomic.

As it stands, my patch has zero impact on memory usage because
it's simply using existing padding.  Should this become an issue
in future, we can always revisit this and use a more appropriate
method of addressing it.

But the point is to alert future developers that this field is
not an ordinary boolean.

Cheers,
Eric Dumazet June 7, 2019, 4:13 p.m. UTC | #3
On 6/7/19 8:32 AM, Herbert Xu wrote:
> On Fri, Jun 07, 2019 at 08:26:12AM -0700, Eric Dumazet wrote:
>>
>> There is common knowledge among us programmers that bit fields
>> (or bool) sharing a common 'word' need to be protected
>> with a common lock.
>>
>> Converting all bit fields to plain int/long would be quite a waste of memory.
>>
>> In this case, fqdir_exit() is called right before the whole
>> struct fqdir is dismantled, and the only cpu that could possibly
>> change the thing is ourself, and we are going to start an RCU grace period.
>>
>> Note that first cache line in 'struct fqdir' is read-only.
>> Only ->dead field is flipped to one at exit time.
>>
>> Your patch would send a strong signal to programmers to not even try using
>> bit fields.
>>
>> Do we really want that ?
> 
> If this were a bitfield then I'd think it would be safer because
> anybody adding a new bitfield is unlikely to try modifying both
> fields without locking or atomic ops.
> 
> However, because this is a boolean, I can certainly see someone
> else coming along and adding another bool right next to it and
> expecting writes them to still be atomic.
> 
> As it stands, my patch has zero impact on memory usage because
> it's simply using existing padding.  Should this become an issue
> in future, we can always revisit this and use a more appropriate
> method of addressing it.
> 
> But the point is to alert future developers that this field is
> not an ordinary boolean.

Okay, but you added a quite redundant comment.

/* We can't use boolean because this needs atomic writes. */

Should we add a similar comment in front of all bit-fields,
or could we factorize this in a proper Documentation perhaps ?

Can we just add a proper bit-field and not the comment ?

unsigned int dead:1;

This way, next programmer can just apply normal rules to add a new bit.

Thanks !
Linus Torvalds June 7, 2019, 4:19 p.m. UTC | #4
On Fri, Jun 7, 2019 at 8:26 AM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> There is common knowledge among us programmers that bit fields
> (or bool) sharing a common 'word' need to be protected
> with a common lock.
>
> Converting all bit fields to plain int/long would be quite a waste of memory.

Yeah, and we really don't care about alpha. So 'char' should be safe.

No compiler actually turns a 'bool' in a struct into a bitfield,
afaik, because you're still supposed to be able to take the address of
a boolean.

But on the whole, I do not believe that we should ever use 'bool' in
structures anyway, because it's such a badly defined type. I think
it's 'char' in practice on just about all architectures, but there
really were traditional use cases where 'bool' was int.

But:

 - we shouldn't turn them into 'int' anyway - alpha is dead, and no
sane architecture will make the same mistake anyway. People learnt.

 - we might want to make sure 'bool' really is 'char' in practice, to
double-check that fthe compiler doesn't do anything stupid.

 - bitfields obviously do need locks. 'char' does not.

If there's somebody who really notices the alpha issue in PRACTICE, we
can then bother to fix it. But there is approximately one user, and
it's not a heavy-duty one.

                   Linus
Paul E. McKenney June 8, 2019, 3:27 p.m. UTC | #5
On Fri, Jun 07, 2019 at 09:19:42AM -0700, Linus Torvalds wrote:
> On Fri, Jun 7, 2019 at 8:26 AM Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >
> > There is common knowledge among us programmers that bit fields
> > (or bool) sharing a common 'word' need to be protected
> > with a common lock.
> >
> > Converting all bit fields to plain int/long would be quite a waste of memory.
> 
> Yeah, and we really don't care about alpha. So 'char' should be safe.
> 
> No compiler actually turns a 'bool' in a struct into a bitfield,
> afaik, because you're still supposed to be able to take the address of
> a boolean.
> 
> But on the whole, I do not believe that we should ever use 'bool' in
> structures anyway, because it's such a badly defined type. I think
> it's 'char' in practice on just about all architectures, but there
> really were traditional use cases where 'bool' was int.
> 
> But:
> 
>  - we shouldn't turn them into 'int' anyway - alpha is dead, and no
> sane architecture will make the same mistake anyway. People learnt.
> 
>  - we might want to make sure 'bool' really is 'char' in practice, to
> double-check that fthe compiler doesn't do anything stupid.
> 
>  - bitfields obviously do need locks. 'char' does not.
> 
> If there's somebody who really notices the alpha issue in PRACTICE, we
> can then bother to fix it. But there is approximately one user, and
> it's not a heavy-duty one.

C11 and later compilers are supposed to use read-modify-write atomic
operations in this sort of situation anyway because they are not supposed
to introduce data races.  So if this problem comes up, the fix should
be in GCC rather than the Linux kernel, right?

								Thanx, Paul
Linus Torvalds June 8, 2019, 5:42 p.m. UTC | #6
On Sat, Jun 8, 2019 at 8:32 AM Paul E. McKenney <paulmck@linux.ibm.com> wrote:
>
> On Fri, Jun 07, 2019 at 09:19:42AM -0700, Linus Torvalds wrote:
> >
> >  - bitfields obviously do need locks. 'char' does not.
> >
> > If there's somebody who really notices the alpha issue in PRACTICE, we
> > can then bother to fix it. But there is approximately one user, and
> > it's not a heavy-duty one.
>
> C11 and later compilers are supposed to use read-modify-write atomic
> operations in this sort of situation anyway because they are not supposed
> to introduce data races.

I don't think that's possible on any common architecture. The
bitfields themselves will need locking, to serialize writes of
different fields against each other.

There are no atomic rmw sequences that have reasonable performance for
the bitfield updates themselves.

The fields *around* the bitfields had better be safe, but that's
something we already depend on, and which falls under the heading of
"we don't accept garbage compilers".

              Linus
Linus Torvalds June 8, 2019, 5:50 p.m. UTC | #7
On Sat, Jun 8, 2019 at 10:42 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> There are no atomic rmw sequences that have reasonable performance for
> the bitfield updates themselves.

Note that this is purely about the writing side. Reads of bitfield
values can be (and generally _should_ be) atomic, and hopefully C11
means that you wouldn't see intermediate values.

But I'm not convinced about that either: one natural way to update a
bitfield is to first do the masking, and then do the insertion of new
bits, so a bitfield assignment very easily exposes non-real values to
a concurrent read on another CPU.

What I think C11 is supposed to protect is from compilers doing
horribly bad things, and accessing bitfields with bigger types than
the field itself, ie when you have

   struct {
       char c;
       int field1:5;
   };

then a write to "field1" had better not touch "char c" as part of the
rmw operation, because that would indeed introduce a data-race with a
completely independent field that might have completely independent
locking rules.

But

   struct {
        int c:8;
        int field1:5;
   };

would not sanely have the same guarantees, even if the layout in
memory might be identical. Once you have bitfields next to each other,
and use a base type that means they can be combined together, they
can't be sanely modified without locking.

(And I don't know if C11 took up the "base type of the bitfield"
thing. Maybe you still need to use the ":0" thing to force alignment,
and maybe the C standards people still haven't made the underlying
type be meaningful other than for sign handling).

            Linus
Paul E. McKenney June 8, 2019, 6:14 p.m. UTC | #8
On Sat, Jun 08, 2019 at 10:42:41AM -0700, Linus Torvalds wrote:
> On Sat, Jun 8, 2019 at 8:32 AM Paul E. McKenney <paulmck@linux.ibm.com> wrote:
> >
> > On Fri, Jun 07, 2019 at 09:19:42AM -0700, Linus Torvalds wrote:
> > >
> > >  - bitfields obviously do need locks. 'char' does not.
> > >
> > > If there's somebody who really notices the alpha issue in PRACTICE, we
> > > can then bother to fix it. But there is approximately one user, and
> > > it's not a heavy-duty one.
> >
> > C11 and later compilers are supposed to use read-modify-write atomic
> > operations in this sort of situation anyway because they are not supposed
> > to introduce data races.

Apologies, I should have explicitly stated that I was talking about char
stores, not bitfield stores.  And last I checked, the C11 standard's
prohibition against data races did not extend to individual fields within
a bitfield.  So, yes, for bitfields, the programmer must use a lock or
similar if it is necessary for updates to fields within a bitfield to
be atomic.

> I don't think that's possible on any common architecture. The
> bitfields themselves will need locking, to serialize writes of
> different fields against each other.

Yes, and again the C standard doesn't make any atomicity guarantees
regarding storing to different fields within a bitfield.  The compiler is
free to assume that nothing else is happening anywhere in the bitfield
when storing to a field within that bitfield.  Which gets back to your
"bitfields obviously do need locks", and it is of course the developer
(not the compiler) who must supply those locks.  Plus a given lock must
cover the entire bitfield -- having one lock for half the fields within
a given bitfield and another lock for the other half will break.

Switching from bitfields to char, the C standard -does- require that
storing to one char must avoid even momentary corruption of adjacent
char, so given an old Alpha the compiler would need to use something
like an LL/SC loop.  If it fails to do so, that compiler is failing to
comply with the standard.

> There are no atomic rmw sequences that have reasonable performance for
> the bitfield updates themselves.

Agreed, in the general case.  In a few specific special cases, we do
sometimes hand-craft bitfields using shifts and masks, and sometimes
we use atomic RMW operations to update them.  I suppose we could use
unions as an alternative, but it is not clear to me that this would
help anything.

> The fields *around* the bitfields had better be safe, but that's
> something we already depend on, and which falls under the heading of
> "we don't accept garbage compilers".

And the C standard does require the compiler to make that guarantee, so
for once the standard is even on our side.  ;-)

							Thanx, Paul
Paul E. McKenney June 8, 2019, 6:50 p.m. UTC | #9
On Sat, Jun 08, 2019 at 10:50:51AM -0700, Linus Torvalds wrote:
> On Sat, Jun 8, 2019 at 10:42 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > There are no atomic rmw sequences that have reasonable performance for
> > the bitfield updates themselves.
> 
> Note that this is purely about the writing side. Reads of bitfield
> values can be (and generally _should_ be) atomic, and hopefully C11
> means that you wouldn't see intermediate values.
> 
> But I'm not convinced about that either: one natural way to update a
> bitfield is to first do the masking, and then do the insertion of new
> bits, so a bitfield assignment very easily exposes non-real values to
> a concurrent read on another CPU.

Agreed on the "not convinced" part (though perhaps most implementations
would handle concurrent reads and writes involving different fields of
the same bitfield).  And the C standard does not guarantee this, because
data races are defined in terms of memory locations.  So as far as the
C standard is concerned, if there are two concurrent accesses to fields
within a bitfield that are not separated by ":0", there is a data race
and so the compiler can do whatever it wants.

But do we really care about this case?

> What I think C11 is supposed to protect is from compilers doing
> horribly bad things, and accessing bitfields with bigger types than
> the field itself, ie when you have
> 
>    struct {
>        char c;
>        int field1:5;
>    };
> 
> then a write to "field1" had better not touch "char c" as part of the
> rmw operation, because that would indeed introduce a data-race with a
> completely independent field that might have completely independent
> locking rules.
> 
> But
> 
>    struct {
>         int c:8;
>         int field1:5;
>    };
> 
> would not sanely have the same guarantees, even if the layout in
> memory might be identical. Once you have bitfields next to each other,
> and use a base type that means they can be combined together, they
> can't be sanely modified without locking.
>
> (And I don't know if C11 took up the "base type of the bitfield"
> thing. Maybe you still need to use the ":0" thing to force alignment,
> and maybe the C standards people still haven't made the underlying
> type be meaningful other than for sign handling).

The C standard draft (n2310) gives similar examples:

	EXAMPLE A structure declared as

		struct {
			char a;
			int b:5, c:11,:0, d:8;
			struct { int ee:8; } e;
		}

	contains four separate memory locations: The member a, and
	bit-fields d and e.ee are each separate memory locations,
	and can be modified concurrently without interfering with each
	other. The bit-fields b and c together constitute the fourth
	memory location. The bit-fields b and c cannot be concurrently
	modified, but b and a, for example, can be.

So yes, ":0" still forces alignment to the next storage unit.  And it
can be used to allow concurrent accesses to fields within a bitfield,
but only when those two fields are separated by ":0".

On the underlying type, according to J.3.9 of the current C working draft,
the following are implementation-specified behavior:

-	Whether a "plain" int bit-field is treated as a signed int
	bit-field or as an unsigned int bit-field (6.7.2, 6.7.2.1).

-	Whether atomic types are permitted for bit-fields (6.7.2.1).

This last is strange because you are not allowed to take the address of
a bit field, and the various operations on atomic types take addresses.
Search me!

							Thanx, Paul

Patch
diff mbox series

diff --git a/include/net/inet_frag.h b/include/net/inet_frag.h
index e91b79ad4e4a..8c458fba74ad 100644
--- a/include/net/inet_frag.h
+++ b/include/net/inet_frag.h
@@ -14,7 +14,9 @@  struct fqdir {
 	int			max_dist;
 	struct inet_frags	*f;
 	struct net		*net;
-	bool			dead;
+
+	/* We can't use boolean because this needs atomic writes. */
+	int			dead;
 
 	struct rhashtable       rhashtable ____cacheline_aligned_in_smp;
 
diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
index 35e9784fab4e..05aa7c145817 100644
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -193,7 +193,7 @@  void fqdir_exit(struct fqdir *fqdir)
 {
 	fqdir->high_thresh = 0; /* prevent creation of new frags */
 
-	fqdir->dead = true;
+	fqdir->dead = 1;
 
 	/* call_rcu is supposed to provide memory barrier semantics,
 	 * separating the setting of fqdir->dead with the destruction