All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net] tcp: md5: refine tcp_md5_do_add()/tcp_md5_hash_key() barriers
@ 2020-07-01 15:50 Eric Dumazet
  2020-07-01 15:57 ` Mathieu Desnoyers
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2020-07-01 15:50 UTC (permalink / raw)
  To: David S . Miller
  Cc: netdev, Eric Dumazet, Eric Dumazet, Mathieu Desnoyers, Herbert Xu

My prior fix went a bit too far, according to Herbert and Mathieu.

Since we accept that concurrent TCP MD5 lookups might see inconsistent
keys, we can use READ_ONCE()/WRITE_ONCE() instead of smp_rmb()/smp_wmb()

Clearing all key->key[] is needed to avoid possible KMSAN reports,
if key->keylen is increased. Since tcp_md5_do_add() is not fast path,
using __GFP_ZERO to clear all struct tcp_md5sig_key is simpler.

data_race() was added in linux-5.8 and will prevent KCSAN reports,
this can safely be removed in stable backports, if data_race() is
not yet backported.

Fixes: 6a2febec338d ("tcp: md5: add missing memory barriers in tcp_md5_do_add()/tcp_md5_hash_key()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
---
 net/ipv4/tcp.c      |  4 +---
 net/ipv4/tcp_ipv4.c | 19 ++++++++++++++-----
 2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index f111660453241692a17c881dd6dc2910a1236263..c3af8180c7049d5c4987bf5c67e4aff2ed6967c9 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4033,11 +4033,9 @@ EXPORT_SYMBOL(tcp_md5_hash_skb_data);
 
 int tcp_md5_hash_key(struct tcp_md5sig_pool *hp, const struct tcp_md5sig_key *key)
 {
-	u8 keylen = key->keylen;
+	u8 keylen = READ_ONCE(key->keylen); /* paired with WRITE_ONCE() in tcp_md5_do_add */
 	struct scatterlist sg;
 
-	smp_rmb(); /* paired with smp_wmb() in tcp_md5_do_add() */
-
 	sg_init_one(&sg, key->key, keylen);
 	ahash_request_set_crypt(hp->md5_req, &sg, NULL, keylen);
 	return crypto_ahash_update(hp->md5_req);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 99916fcc15ca0be12c2c133ff40516f79e6fdf7f..04bfcbbfee83aadf5bca0332275c57113abdbc75 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1111,12 +1111,21 @@ int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
 
 	key = tcp_md5_do_lookup_exact(sk, addr, family, prefixlen, l3index);
 	if (key) {
-		/* Pre-existing entry - just update that one. */
-		memcpy(key->key, newkey, newkeylen);
+		/* Pre-existing entry - just update that one.
+		 * Note that the key might be used concurrently.
+		 * data_race() is telling kcsan that we do not care of
+		 * key mismatches, since changing MD5 key on live flows
+		 * can lead to packet drops.
+		 */
+		data_race(memcpy(key->key, newkey, newkeylen));
 
-		smp_wmb(); /* pairs with smp_rmb() in tcp_md5_hash_key() */
+		/* Pairs with READ_ONCE() in tcp_md5_hash_key().
+		 * Also note that a reader could catch new key->keylen value
+		 * but old key->key[], this is the reason we use __GFP_ZERO
+		 * at sock_kmalloc() time below these lines.
+		 */
+		WRITE_ONCE(key->keylen, newkeylen);
 
-		key->keylen = newkeylen;
 		return 0;
 	}
 
@@ -1132,7 +1141,7 @@ int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
 		rcu_assign_pointer(tp->md5sig_info, md5sig);
 	}
 
-	key = sock_kmalloc(sk, sizeof(*key), gfp);
+	key = sock_kmalloc(sk, sizeof(*key), gfp | __GFP_ZERO);
 	if (!key)
 		return -ENOMEM;
 	if (!tcp_alloc_md5sig_pool()) {
-- 
2.27.0.212.ge8ba1cc988-goog


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH net] tcp: md5: refine tcp_md5_do_add()/tcp_md5_hash_key() barriers
  2020-07-01 15:50 [PATCH net] tcp: md5: refine tcp_md5_do_add()/tcp_md5_hash_key() barriers Eric Dumazet
@ 2020-07-01 15:57 ` Mathieu Desnoyers
  2020-07-01 16:05   ` Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: Mathieu Desnoyers @ 2020-07-01 15:57 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David S. Miller, netdev, Eric Dumazet, Herbert Xu

----- On Jul 1, 2020, at 11:50 AM, Eric Dumazet edumazet@google.com wrote:

> My prior fix went a bit too far, according to Herbert and Mathieu.
> 
> Since we accept that concurrent TCP MD5 lookups might see inconsistent
> keys, we can use READ_ONCE()/WRITE_ONCE() instead of smp_rmb()/smp_wmb()
> 
> Clearing all key->key[] is needed to avoid possible KMSAN reports,
> if key->keylen is increased. Since tcp_md5_do_add() is not fast path,
> using __GFP_ZERO to clear all struct tcp_md5sig_key is simpler.
> 
> data_race() was added in linux-5.8 and will prevent KCSAN reports,
> this can safely be removed in stable backports, if data_race() is
> not yet backported.
> 
> Fixes: 6a2febec338d ("tcp: md5: add missing memory barriers in
> tcp_md5_do_add()/tcp_md5_hash_key()")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> ---
> net/ipv4/tcp.c      |  4 +---
> net/ipv4/tcp_ipv4.c | 19 ++++++++++++++-----
> 2 files changed, 15 insertions(+), 8 deletions(-)
> 
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index
> f111660453241692a17c881dd6dc2910a1236263..c3af8180c7049d5c4987bf5c67e4aff2ed6967c9
> 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -4033,11 +4033,9 @@ EXPORT_SYMBOL(tcp_md5_hash_skb_data);
> 
> int tcp_md5_hash_key(struct tcp_md5sig_pool *hp, const struct tcp_md5sig_key
> *key)
> {
> -	u8 keylen = key->keylen;
> +	u8 keylen = READ_ONCE(key->keylen); /* paired with WRITE_ONCE() in
> tcp_md5_do_add */
> 	struct scatterlist sg;
> 
> -	smp_rmb(); /* paired with smp_wmb() in tcp_md5_do_add() */
> -
> 	sg_init_one(&sg, key->key, keylen);
> 	ahash_request_set_crypt(hp->md5_req, &sg, NULL, keylen);
> 	return crypto_ahash_update(hp->md5_req);

I think we should change this to:

    return data_race(crypto_ahash_update(hp->md5_req));

because both sides can race on the data. Hopefully that would let
KCSAN know that deep within the ->update callback the data race
is OK (?)

Thanks,

Mathieu

> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index
> 99916fcc15ca0be12c2c133ff40516f79e6fdf7f..04bfcbbfee83aadf5bca0332275c57113abdbc75
> 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -1111,12 +1111,21 @@ int tcp_md5_do_add(struct sock *sk, const union
> tcp_md5_addr *addr,
> 
> 	key = tcp_md5_do_lookup_exact(sk, addr, family, prefixlen, l3index);
> 	if (key) {
> -		/* Pre-existing entry - just update that one. */
> -		memcpy(key->key, newkey, newkeylen);
> +		/* Pre-existing entry - just update that one.
> +		 * Note that the key might be used concurrently.
> +		 * data_race() is telling kcsan that we do not care of
> +		 * key mismatches, since changing MD5 key on live flows
> +		 * can lead to packet drops.
> +		 */
> +		data_race(memcpy(key->key, newkey, newkeylen));
> 
> -		smp_wmb(); /* pairs with smp_rmb() in tcp_md5_hash_key() */
> +		/* Pairs with READ_ONCE() in tcp_md5_hash_key().
> +		 * Also note that a reader could catch new key->keylen value
> +		 * but old key->key[], this is the reason we use __GFP_ZERO
> +		 * at sock_kmalloc() time below these lines.
> +		 */
> +		WRITE_ONCE(key->keylen, newkeylen);
> 
> -		key->keylen = newkeylen;
> 		return 0;
> 	}
> 
> @@ -1132,7 +1141,7 @@ int tcp_md5_do_add(struct sock *sk, const union
> tcp_md5_addr *addr,
> 		rcu_assign_pointer(tp->md5sig_info, md5sig);
> 	}
> 
> -	key = sock_kmalloc(sk, sizeof(*key), gfp);
> +	key = sock_kmalloc(sk, sizeof(*key), gfp | __GFP_ZERO);
> 	if (!key)
> 		return -ENOMEM;
> 	if (!tcp_alloc_md5sig_pool()) {
> --
> 2.27.0.212.ge8ba1cc988-goog

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] tcp: md5: refine tcp_md5_do_add()/tcp_md5_hash_key() barriers
  2020-07-01 15:57 ` Mathieu Desnoyers
@ 2020-07-01 16:05   ` Eric Dumazet
  2020-07-01 18:25     ` Marco Elver
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2020-07-01 16:05 UTC (permalink / raw)
  To: Mathieu Desnoyers, Marco Elver
  Cc: David S. Miller, netdev, Eric Dumazet, Herbert Xu

On Wed, Jul 1, 2020 at 8:57 AM Mathieu Desnoyers
<mathieu.desnoyers@efficios.com> wrote:
>
> ----- On Jul 1, 2020, at 11:50 AM, Eric Dumazet edumazet@google.com wrote:
>
> > My prior fix went a bit too far, according to Herbert and Mathieu.
> >
> > Since we accept that concurrent TCP MD5 lookups might see inconsistent
> > keys, we can use READ_ONCE()/WRITE_ONCE() instead of smp_rmb()/smp_wmb()
> >
> > Clearing all key->key[] is needed to avoid possible KMSAN reports,
> > if key->keylen is increased. Since tcp_md5_do_add() is not fast path,
> > using __GFP_ZERO to clear all struct tcp_md5sig_key is simpler.
> >
> > data_race() was added in linux-5.8 and will prevent KCSAN reports,
> > this can safely be removed in stable backports, if data_race() is
> > not yet backported.
> >
> > Fixes: 6a2febec338d ("tcp: md5: add missing memory barriers in
> > tcp_md5_do_add()/tcp_md5_hash_key()")
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> > Cc: Herbert Xu <herbert@gondor.apana.org.au>
> > ---
> > net/ipv4/tcp.c      |  4 +---
> > net/ipv4/tcp_ipv4.c | 19 ++++++++++++++-----
> > 2 files changed, 15 insertions(+), 8 deletions(-)
> >
> > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > index
> > f111660453241692a17c881dd6dc2910a1236263..c3af8180c7049d5c4987bf5c67e4aff2ed6967c9
> > 100644
> > --- a/net/ipv4/tcp.c
> > +++ b/net/ipv4/tcp.c
> > @@ -4033,11 +4033,9 @@ EXPORT_SYMBOL(tcp_md5_hash_skb_data);
> >
> > int tcp_md5_hash_key(struct tcp_md5sig_pool *hp, const struct tcp_md5sig_key
> > *key)
> > {
> > -     u8 keylen = key->keylen;
> > +     u8 keylen = READ_ONCE(key->keylen); /* paired with WRITE_ONCE() in
> > tcp_md5_do_add */
> >       struct scatterlist sg;
> >
> > -     smp_rmb(); /* paired with smp_wmb() in tcp_md5_do_add() */
> > -
> >       sg_init_one(&sg, key->key, keylen);
> >       ahash_request_set_crypt(hp->md5_req, &sg, NULL, keylen);
> >       return crypto_ahash_update(hp->md5_req);
>
> I think we should change this to:
>
>     return data_race(crypto_ahash_update(hp->md5_req));
>
> because both sides can race on the data. Hopefully that would let
> KCSAN know that deep within the ->update callback the data race
> is OK (?)
>

Let's ask Marco.

Before data_race() has been there, using READ_ONCE() or WRITE_ONCE()
was enough to silence KCSAN.
Not sure if this is the same for data_race().

I would prefer not having to add annotations all over the places, to
reduce backport pains.

Unless we have plans to backport data_race() to all stable versions.

> Thanks,
>
> Mathieu
>
> > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > index
> > 99916fcc15ca0be12c2c133ff40516f79e6fdf7f..04bfcbbfee83aadf5bca0332275c57113abdbc75
> > 100644
> > --- a/net/ipv4/tcp_ipv4.c
> > +++ b/net/ipv4/tcp_ipv4.c
> > @@ -1111,12 +1111,21 @@ int tcp_md5_do_add(struct sock *sk, const union
> > tcp_md5_addr *addr,
> >
> >       key = tcp_md5_do_lookup_exact(sk, addr, family, prefixlen, l3index);
> >       if (key) {
> > -             /* Pre-existing entry - just update that one. */
> > -             memcpy(key->key, newkey, newkeylen);
> > +             /* Pre-existing entry - just update that one.
> > +              * Note that the key might be used concurrently.
> > +              * data_race() is telling kcsan that we do not care of
> > +              * key mismatches, since changing MD5 key on live flows
> > +              * can lead to packet drops.
> > +              */
> > +             data_race(memcpy(key->key, newkey, newkeylen));
> >
> > -             smp_wmb(); /* pairs with smp_rmb() in tcp_md5_hash_key() */
> > +             /* Pairs with READ_ONCE() in tcp_md5_hash_key().
> > +              * Also note that a reader could catch new key->keylen value
> > +              * but old key->key[], this is the reason we use __GFP_ZERO
> > +              * at sock_kmalloc() time below these lines.
> > +              */
> > +             WRITE_ONCE(key->keylen, newkeylen);
> >
> > -             key->keylen = newkeylen;
> >               return 0;
> >       }
> >
> > @@ -1132,7 +1141,7 @@ int tcp_md5_do_add(struct sock *sk, const union
> > tcp_md5_addr *addr,
> >               rcu_assign_pointer(tp->md5sig_info, md5sig);
> >       }
> >
> > -     key = sock_kmalloc(sk, sizeof(*key), gfp);
> > +     key = sock_kmalloc(sk, sizeof(*key), gfp | __GFP_ZERO);
> >       if (!key)
> >               return -ENOMEM;
> >       if (!tcp_alloc_md5sig_pool()) {
> > --
> > 2.27.0.212.ge8ba1cc988-goog
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] tcp: md5: refine tcp_md5_do_add()/tcp_md5_hash_key() barriers
  2020-07-01 16:05   ` Eric Dumazet
@ 2020-07-01 18:25     ` Marco Elver
  2020-07-01 18:32       ` Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: Marco Elver @ 2020-07-01 18:25 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Mathieu Desnoyers, David S. Miller, netdev, Eric Dumazet, Herbert Xu

On Wed, 1 Jul 2020 at 18:05, Eric Dumazet <edumazet@google.com> wrote:
>
> On Wed, Jul 1, 2020 at 8:57 AM Mathieu Desnoyers
> <mathieu.desnoyers@efficios.com> wrote:
> >
> > ----- On Jul 1, 2020, at 11:50 AM, Eric Dumazet edumazet@google.com wrote:
> >
> > > My prior fix went a bit too far, according to Herbert and Mathieu.
> > >
> > > Since we accept that concurrent TCP MD5 lookups might see inconsistent
> > > keys, we can use READ_ONCE()/WRITE_ONCE() instead of smp_rmb()/smp_wmb()
> > >
> > > Clearing all key->key[] is needed to avoid possible KMSAN reports,
> > > if key->keylen is increased. Since tcp_md5_do_add() is not fast path,
> > > using __GFP_ZERO to clear all struct tcp_md5sig_key is simpler.
> > >
> > > data_race() was added in linux-5.8 and will prevent KCSAN reports,
> > > this can safely be removed in stable backports, if data_race() is
> > > not yet backported.
> > >
> > > Fixes: 6a2febec338d ("tcp: md5: add missing memory barriers in
> > > tcp_md5_do_add()/tcp_md5_hash_key()")
> > > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> > > Cc: Herbert Xu <herbert@gondor.apana.org.au>
> > > ---
> > > net/ipv4/tcp.c      |  4 +---
> > > net/ipv4/tcp_ipv4.c | 19 ++++++++++++++-----
> > > 2 files changed, 15 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > > index
> > > f111660453241692a17c881dd6dc2910a1236263..c3af8180c7049d5c4987bf5c67e4aff2ed6967c9
> > > 100644
> > > --- a/net/ipv4/tcp.c
> > > +++ b/net/ipv4/tcp.c
> > > @@ -4033,11 +4033,9 @@ EXPORT_SYMBOL(tcp_md5_hash_skb_data);
> > >
> > > int tcp_md5_hash_key(struct tcp_md5sig_pool *hp, const struct tcp_md5sig_key
> > > *key)
> > > {
> > > -     u8 keylen = key->keylen;
> > > +     u8 keylen = READ_ONCE(key->keylen); /* paired with WRITE_ONCE() in
> > > tcp_md5_do_add */
> > >       struct scatterlist sg;
> > >
> > > -     smp_rmb(); /* paired with smp_wmb() in tcp_md5_do_add() */
> > > -
> > >       sg_init_one(&sg, key->key, keylen);
> > >       ahash_request_set_crypt(hp->md5_req, &sg, NULL, keylen);
> > >       return crypto_ahash_update(hp->md5_req);
> >
> > I think we should change this to:
> >
> >     return data_race(crypto_ahash_update(hp->md5_req));
> >
> > because both sides can race on the data. Hopefully that would let
> > KCSAN know that deep within the ->update callback the data race
> > is OK (?)
> >
>
> Let's ask Marco.
>
> Before data_race() has been there, using READ_ONCE() or WRITE_ONCE()
> was enough to silence KCSAN.
> Not sure if this is the same for data_race().
>
> I would prefer not having to add annotations all over the places, to
> reduce backport pains.
>
> Unless we have plans to backport data_race() to all stable versions.

data_race() actually changed its final semantics. The first version of
data_race() still expected all racing locations to be marked (whether
with data_race(), or ONCE, or ..). The current version doesn't, and
strictly speaking only marking the reader location with a data_race()
will silence KCSAN for good. Although it is entirely up to preference,
and marking both sides is certainly valid and documents what's
happening, KCSAN doesn't care.

If you only mark the writers, we may still get reports at the reader
location with KCSAN's default config in the form of "race of unknown
origin" because there is an observable race on the reader side, but
the writer is entirely hidden (syzbot won't report them though,
because we have CONFIG_KCSAN_REPORT_RACE_UNKNOWN_ORIGIN=n).

Also, wrapping entire function calls in data_race() is valid.

Does this answer the question?

> > > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > > index
> > > 99916fcc15ca0be12c2c133ff40516f79e6fdf7f..04bfcbbfee83aadf5bca0332275c57113abdbc75
> > > 100644
> > > --- a/net/ipv4/tcp_ipv4.c
> > > +++ b/net/ipv4/tcp_ipv4.c
> > > @@ -1111,12 +1111,21 @@ int tcp_md5_do_add(struct sock *sk, const union
> > > tcp_md5_addr *addr,
> > >
> > >       key = tcp_md5_do_lookup_exact(sk, addr, family, prefixlen, l3index);
> > >       if (key) {
> > > -             /* Pre-existing entry - just update that one. */
> > > -             memcpy(key->key, newkey, newkeylen);
> > > +             /* Pre-existing entry - just update that one.
> > > +              * Note that the key might be used concurrently.
> > > +              * data_race() is telling kcsan that we do not care of
> > > +              * key mismatches, since changing MD5 key on live flows
> > > +              * can lead to packet drops.
> > > +              */
> > > +             data_race(memcpy(key->key, newkey, newkeylen));
> > >
> > > -             smp_wmb(); /* pairs with smp_rmb() in tcp_md5_hash_key() */
> > > +             /* Pairs with READ_ONCE() in tcp_md5_hash_key().
> > > +              * Also note that a reader could catch new key->keylen value
> > > +              * but old key->key[], this is the reason we use __GFP_ZERO
> > > +              * at sock_kmalloc() time below these lines.
> > > +              */
> > > +             WRITE_ONCE(key->keylen, newkeylen);

From what I read, removing the barriers is safe; but in case it
matters in future, KCSAN is certainly aware of other primitives such
as smp_load_acquire/smp_store_release.

> > > -             key->keylen = newkeylen;
> > >               return 0;
> > >       }
> > >
> > > @@ -1132,7 +1141,7 @@ int tcp_md5_do_add(struct sock *sk, const union
> > > tcp_md5_addr *addr,
> > >               rcu_assign_pointer(tp->md5sig_info, md5sig);
> > >       }
> > >
> > > -     key = sock_kmalloc(sk, sizeof(*key), gfp);
> > > +     key = sock_kmalloc(sk, sizeof(*key), gfp | __GFP_ZERO);
> > >       if (!key)
> > >               return -ENOMEM;
> > >       if (!tcp_alloc_md5sig_pool()) {
> > > --
> > > 2.27.0.212.ge8ba1cc988-goog

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] tcp: md5: refine tcp_md5_do_add()/tcp_md5_hash_key() barriers
  2020-07-01 18:25     ` Marco Elver
@ 2020-07-01 18:32       ` Eric Dumazet
  0 siblings, 0 replies; 5+ messages in thread
From: Eric Dumazet @ 2020-07-01 18:32 UTC (permalink / raw)
  To: Marco Elver
  Cc: Mathieu Desnoyers, David S. Miller, netdev, Eric Dumazet, Herbert Xu

On Wed, Jul 1, 2020 at 11:25 AM Marco Elver <elver@google.com> wrote:
>
> On Wed, 1 Jul 2020 at 18:05, Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Wed, Jul 1, 2020 at 8:57 AM Mathieu Desnoyers
> > <mathieu.desnoyers@efficios.com> wrote:
> > >
> > > ----- On Jul 1, 2020, at 11:50 AM, Eric Dumazet edumazet@google.com wrote:
> > >
> > > > My prior fix went a bit too far, according to Herbert and Mathieu.
> > > >
> > > > Since we accept that concurrent TCP MD5 lookups might see inconsistent
> > > > keys, we can use READ_ONCE()/WRITE_ONCE() instead of smp_rmb()/smp_wmb()
> > > >
> > > > Clearing all key->key[] is needed to avoid possible KMSAN reports,
> > > > if key->keylen is increased. Since tcp_md5_do_add() is not fast path,
> > > > using __GFP_ZERO to clear all struct tcp_md5sig_key is simpler.
> > > >
> > > > data_race() was added in linux-5.8 and will prevent KCSAN reports,
> > > > this can safely be removed in stable backports, if data_race() is
> > > > not yet backported.
> > > >
> > > > Fixes: 6a2febec338d ("tcp: md5: add missing memory barriers in
> > > > tcp_md5_do_add()/tcp_md5_hash_key()")
> > > > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > > > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> > > > Cc: Herbert Xu <herbert@gondor.apana.org.au>
> > > > ---
> > > > net/ipv4/tcp.c      |  4 +---
> > > > net/ipv4/tcp_ipv4.c | 19 ++++++++++++++-----
> > > > 2 files changed, 15 insertions(+), 8 deletions(-)
> > > >
> > > > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > > > index
> > > > f111660453241692a17c881dd6dc2910a1236263..c3af8180c7049d5c4987bf5c67e4aff2ed6967c9
> > > > 100644
> > > > --- a/net/ipv4/tcp.c
> > > > +++ b/net/ipv4/tcp.c
> > > > @@ -4033,11 +4033,9 @@ EXPORT_SYMBOL(tcp_md5_hash_skb_data);
> > > >
> > > > int tcp_md5_hash_key(struct tcp_md5sig_pool *hp, const struct tcp_md5sig_key
> > > > *key)
> > > > {
> > > > -     u8 keylen = key->keylen;
> > > > +     u8 keylen = READ_ONCE(key->keylen); /* paired with WRITE_ONCE() in
> > > > tcp_md5_do_add */
> > > >       struct scatterlist sg;
> > > >
> > > > -     smp_rmb(); /* paired with smp_wmb() in tcp_md5_do_add() */
> > > > -
> > > >       sg_init_one(&sg, key->key, keylen);
> > > >       ahash_request_set_crypt(hp->md5_req, &sg, NULL, keylen);
> > > >       return crypto_ahash_update(hp->md5_req);
> > >
> > > I think we should change this to:
> > >
> > >     return data_race(crypto_ahash_update(hp->md5_req));
> > >
> > > because both sides can race on the data. Hopefully that would let
> > > KCSAN know that deep within the ->update callback the data race
> > > is OK (?)
> > >
> >
> > Let's ask Marco.
> >
> > Before data_race() has been there, using READ_ONCE() or WRITE_ONCE()
> > was enough to silence KCSAN.
> > Not sure if this is the same for data_race().
> >
> > I would prefer not having to add annotations all over the places, to
> > reduce backport pains.
> >
> > Unless we have plans to backport data_race() to all stable versions.
>
> data_race() actually changed its final semantics. The first version of
> data_race() still expected all racing locations to be marked (whether
> with data_race(), or ONCE, or ..). The current version doesn't, and
> strictly speaking only marking the reader location with a data_race()
> will silence KCSAN for good. Although it is entirely up to preference,
> and marking both sides is certainly valid and documents what's
> happening, KCSAN doesn't care.
>
> If you only mark the writers, we may still get reports at the reader
> location with KCSAN's default config in the form of "race of unknown
> origin" because there is an observable race on the reader side, but
> the writer is entirely hidden (syzbot won't report them though,
> because we have CONFIG_KCSAN_REPORT_RACE_UNKNOWN_ORIGIN=n).
>
> Also, wrapping entire function calls in data_race() is valid.

Ok, I can do that, but it might hide other bugs.

Here only one of the inputs (the up to 80 bytes of MD5 key) might be
changed, but we would not expect other stuff to be changed under us.

I guess this is the same observation as adding
READ_ONCE()/WRITE_ONCE() all over the places to silence sanitizers :
Real bugs could be be hidden.

>
> Does this answer the question?

Yes thank you.

>
> > > > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > > > index
> > > > 99916fcc15ca0be12c2c133ff40516f79e6fdf7f..04bfcbbfee83aadf5bca0332275c57113abdbc75
> > > > 100644
> > > > --- a/net/ipv4/tcp_ipv4.c
> > > > +++ b/net/ipv4/tcp_ipv4.c
> > > > @@ -1111,12 +1111,21 @@ int tcp_md5_do_add(struct sock *sk, const union
> > > > tcp_md5_addr *addr,
> > > >
> > > >       key = tcp_md5_do_lookup_exact(sk, addr, family, prefixlen, l3index);
> > > >       if (key) {
> > > > -             /* Pre-existing entry - just update that one. */
> > > > -             memcpy(key->key, newkey, newkeylen);
> > > > +             /* Pre-existing entry - just update that one.
> > > > +              * Note that the key might be used concurrently.
> > > > +              * data_race() is telling kcsan that we do not care of
> > > > +              * key mismatches, since changing MD5 key on live flows
> > > > +              * can lead to packet drops.
> > > > +              */
> > > > +             data_race(memcpy(key->key, newkey, newkeylen));
> > > >
> > > > -             smp_wmb(); /* pairs with smp_rmb() in tcp_md5_hash_key() */
> > > > +             /* Pairs with READ_ONCE() in tcp_md5_hash_key().
> > > > +              * Also note that a reader could catch new key->keylen value
> > > > +              * but old key->key[], this is the reason we use __GFP_ZERO
> > > > +              * at sock_kmalloc() time below these lines.
> > > > +              */
> > > > +             WRITE_ONCE(key->keylen, newkeylen);
>
> From what I read, removing the barriers is safe; but in case it
> matters in future, KCSAN is certainly aware of other primitives such
> as smp_load_acquire/smp_store_release.
>
> > > > -             key->keylen = newkeylen;
> > > >               return 0;
> > > >       }
> > > >
> > > > @@ -1132,7 +1141,7 @@ int tcp_md5_do_add(struct sock *sk, const union
> > > > tcp_md5_addr *addr,
> > > >               rcu_assign_pointer(tp->md5sig_info, md5sig);
> > > >       }
> > > >
> > > > -     key = sock_kmalloc(sk, sizeof(*key), gfp);
> > > > +     key = sock_kmalloc(sk, sizeof(*key), gfp | __GFP_ZERO);
> > > >       if (!key)
> > > >               return -ENOMEM;
> > > >       if (!tcp_alloc_md5sig_pool()) {
> > > > --
> > > > 2.27.0.212.ge8ba1cc988-goog

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-07-01 18:33 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-01 15:50 [PATCH net] tcp: md5: refine tcp_md5_do_add()/tcp_md5_hash_key() barriers Eric Dumazet
2020-07-01 15:57 ` Mathieu Desnoyers
2020-07-01 16:05   ` Eric Dumazet
2020-07-01 18:25     ` Marco Elver
2020-07-01 18:32       ` Eric Dumazet

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.