* [PATCH v6 0/5] net/tcp: Dynamically disable TCP-MD5 static key
@ 2022-11-23 17:38 Dmitry Safonov
  2022-11-23 17:38 ` [PATCH v6 1/5] jump_label: Prevent key->enabled int overflow Dmitry Safonov
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Dmitry Safonov @ 2022-11-23 17:38 UTC (permalink / raw)
  To: linux-kernel, David Ahern, Eric Dumazet, Peter Zijlstra
  Cc: Dmitry Safonov, Ard Biesheuvel, Bob Gilligan, David S. Miller,
	Dmitry Safonov, Francesco Ruggeri, Hideaki YOSHIFUJI,
	Jakub Kicinski, Jason Baron, Josh Poimboeuf, Paolo Abeni,
	Salam Noureddine, Steven Rostedt, netdev

Changes from v5:
- Corrected comment for static_key_fast_inc_not_negative() (Peter)
- Renamed static_key_fast_inc_not_negative() =>
  static_key_fast_inc_not_disabled() (as suggested by Peter)
- static_key_fast_inc_not_disabled() is exported and declared in the
  patch 1 that defines it, rather than in patch 3 that uses it (Peter)

Changes from v4:
- Used rcu_dereference_protected() for tp->md5sig_info in
  tcp_md5_do_add() and tcp_md5_key_copy() fail paths to make sure
  there won't be false-positives from sparse (Jakub)
- Added Acked-by: Jakub Kicinski

Changes from v3:
- Used atomic_try_cmpxchg() as suggested by Peter Zijlstra
- Renamed static_key_fast_inc() => static_key_fast_inc_not_negative()
  (addressing Peter Zijlstra's review)
- Based on linux-tip/master
- tcp_md5_key_copy() now does net_warn_ratelimited()
  (addressing Peter Zijlstra's review);
  tcp_md5_do_add() does not, as it returns -EUSERS from the
  setsockopt() syscall back to userspace
- Corrected WARN_ON_ONCE(!static_key_fast_inc(key))
  (Spotted by Jason Baron)
- Moved declaration of static_key_fast_inc_not_negative() and its
  EXPORT_SYMBOL_GPL() to the patch 3 that uses it,
  "net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction"
  (addressing Peter Zijlstra's review)
- Added patch 4 that destroys the newly created request socket if the
  md5 info allocation or static_key increment was unsuccessful,
  instead of proceeding to add a socket without TCP-MD5 keys.
- Added patch 5 that separates helper tcp_time_wait_init()
  and converts BUG_ON() to WARN_ON_ONCE().

Changes from v2:
- Prevent key->enabled from turning negative by overflow from
  static_key_slow_inc() or static_key_fast_inc()
  (addressing Peter Zijlstra's review)
- Added checks to the TCP-MD5 code for whether static_branch_inc()
  and static_key_fast_inc() were successful.

Changes from v1:
- Add static_key_fast_inc() helper rather than open-coded atomic_inc()
  (as suggested by Eric Dumazet)

Version 5:
https://lore.kernel.org/all/20221122185534.308643-1-dima@arista.com/T/#u
Version 4:
https://lore.kernel.org/all/20221115211905.1685426-1-dima@arista.com/T/#u
Version 3:
https://lore.kernel.org/all/20221111212320.1386566-1-dima@arista.com/T/#u
Version 2: 
https://lore.kernel.org/all/20221103212524.865762-1-dima@arista.com/T/#u
Version 1: 
https://lore.kernel.org/all/20221102211350.625011-1-dima@arista.com/T/#u

The static key introduced by commit 6015c71e656b ("tcp: md5: add
tcp_md5_needed jump label") is a fast-path optimization aimed at
avoiding a cache line miss.
Once an MD5 key is introduced in the system, the static key is enabled
and never disabled. Address this by disabling the static key when the
last tcp_md5sig_info in the system is destroyed.
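The lifetime rule the series introduces can be modeled in a few lines of plain C (a hypothetical userspace sketch, not the kernel API; the function names are illustrative):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the new lifetime rule: the static branch turns
 * on with the first MD5 key in the system and off again when the last
 * tcp_md5sig_info (or MD5-keyed time-wait socket) is destroyed. */
static int tcp_md5_users;		/* models tcp_md5_needed's key->enabled */

static bool md5_branch_enabled(void)	/* models static_branch_unlikely() */
{
	return tcp_md5_users > 0;
}

static void md5_info_created(void)	/* illustrative name */
{
	tcp_md5_users++;		/* 0 -> 1: branch patched on */
}

static void md5_info_destroyed(void)	/* illustrative name */
{
	assert(tcp_md5_users > 0);
	tcp_md5_users--;		/* 1 -> 0: branch patched off again */
}
```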

Previously this was submitted as part of the TCP-AO patch set [1].
Now, in an attempt to split up the 36-patch submission, I'm sending
this independently.

Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Bob Gilligan <gilligan@arista.com>
Cc: David Ahern <dsahern@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Francesco Ruggeri <fruggeri@arista.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Salam Noureddine <noureddine@arista.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

[1]: https://lore.kernel.org/all/20221027204347.529913-1-dima@arista.com/T/#u

Thanks,
            Dmitry

Dmitry Safonov (5):
  jump_label: Prevent key->enabled int overflow
  net/tcp: Separate tcp_md5sig_info allocation into
    tcp_md5sig_info_add()
  net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction
  net/tcp: Do cleanup on tcp_md5_key_copy() failure
  net/tcp: Separate initialization of twsk

 include/linux/jump_label.h | 21 +++++++--
 include/net/tcp.h          | 10 ++--
 kernel/jump_label.c        | 56 +++++++++++++++++-----
 net/ipv4/tcp.c             |  5 +-
 net/ipv4/tcp_ipv4.c        | 96 +++++++++++++++++++++++++++++---------
 net/ipv4/tcp_minisocks.c   | 61 +++++++++++++++---------
 net/ipv4/tcp_output.c      |  4 +-
 net/ipv6/tcp_ipv6.c        | 21 ++++-----
 8 files changed, 194 insertions(+), 80 deletions(-)


base-commit: 736b6d81d93cf61a0601af90bd552103ef997b3f
-- 
2.38.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v6 1/5] jump_label: Prevent key->enabled int overflow
  2022-11-23 17:38 [PATCH v6 0/5] net/tcp: Dynamically disable TCP-MD5 static key Dmitry Safonov
@ 2022-11-23 17:38 ` Dmitry Safonov
  2022-11-25  7:59   ` Peter Zijlstra
  2022-11-23 17:38 ` [PATCH v6 2/5] net/tcp: Separate tcp_md5sig_info allocation into tcp_md5sig_info_add() Dmitry Safonov
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Dmitry Safonov @ 2022-11-23 17:38 UTC (permalink / raw)
  To: linux-kernel, David Ahern, Eric Dumazet, Peter Zijlstra
  Cc: Dmitry Safonov, Ard Biesheuvel, Bob Gilligan, David S. Miller,
	Dmitry Safonov, Francesco Ruggeri, Hideaki YOSHIFUJI,
	Jakub Kicinski, Jason Baron, Josh Poimboeuf, Paolo Abeni,
	Salam Noureddine, Steven Rostedt, netdev

1. With CONFIG_JUMP_LABEL=n static_key_slow_inc() doesn't have any
   protection against key->enabled refcounter overflow.
2. With CONFIG_JUMP_LABEL=y static_key_slow_inc_cpuslocked()
   still may turn the refcounter negative as (v + 1) may overflow.

key->enabled is indeed a ref-counter, as documented in multiple places:
the top comment in jump_label.h, Documentation/staging/static-keys.rst,
etc.

As -1 is reserved for a static key that's in the process of being
enabled, functions would break with a negative key->enabled refcount:
- for CONFIG_JUMP_LABEL=n, a negative return from static_key_count()
  breaks static_key_false() and static_key_true();
- the ref counter may reach 0 from the negative side after too many
  static_key_slow_inc() calls, leading to use-after-free issues.

These flaws force some users to introduce an additional mutex and
prevent the reference counter from overflowing themselves; see
bpf_enable_runtime_stats(), which checks the counter against INT_MAX / 2.

Prevent the reference counter from overflowing by checking whether
(v + 1) > 0. Change the functions' API to return whether the increment
was successful.
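The check can be illustrated with a small userspace model (illustrative only; the kernel's version uses atomic_try_cmpxchg() on key->enabled and tests (v + 1) < 0, which standard C forbids for signed ints, so this sketch compares against INT_MAX instead):

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace model of the overflow-safe increment. */
static bool fast_inc_not_disabled(atomic_int *enabled)
{
	int v = atomic_load(enabled);

	do {
		/* 0 means disabled; negative values are reserved for a
		 * key in the process of being enabled (or an overflowed
		 * counter): refuse rather than wrap past INT_MAX. */
		if (v <= 0 || v == INT_MAX)
			return false;
	} while (!atomic_compare_exchange_weak(enabled, &v, v + 1));
	return true;
}
```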

Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
---
 include/linux/jump_label.h | 21 +++++++++++---
 kernel/jump_label.c        | 56 ++++++++++++++++++++++++++++++--------
 2 files changed, 61 insertions(+), 16 deletions(-)

diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
index 570831ca9951..4e968ebadce6 100644
--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -224,9 +224,10 @@ extern bool arch_jump_label_transform_queue(struct jump_entry *entry,
 					    enum jump_label_type type);
 extern void arch_jump_label_transform_apply(void);
 extern int jump_label_text_reserved(void *start, void *end);
-extern void static_key_slow_inc(struct static_key *key);
+extern bool static_key_slow_inc(struct static_key *key);
+extern bool static_key_fast_inc_not_disabled(struct static_key *key);
 extern void static_key_slow_dec(struct static_key *key);
-extern void static_key_slow_inc_cpuslocked(struct static_key *key);
+extern bool static_key_slow_inc_cpuslocked(struct static_key *key);
 extern void static_key_slow_dec_cpuslocked(struct static_key *key);
 extern int static_key_count(struct static_key *key);
 extern void static_key_enable(struct static_key *key);
@@ -278,11 +279,23 @@ static __always_inline bool static_key_true(struct static_key *key)
 	return false;
 }
 
-static inline void static_key_slow_inc(struct static_key *key)
+static inline bool static_key_fast_inc_not_disabled(struct static_key *key)
 {
+	int v;
+
 	STATIC_KEY_CHECK_USE(key);
-	atomic_inc(&key->enabled);
+	/*
+	 * Prevent key->enabled getting negative to follow the same semantics
+	 * as for CONFIG_JUMP_LABEL=y, see kernel/jump_label.c comment.
+	 */
+	v = atomic_read(&key->enabled);
+	do {
+		if (v < 0 || (v + 1) < 0)
+			return false;
+	} while (!likely(atomic_try_cmpxchg(&key->enabled, &v, v + 1)));
+	return true;
 }
+#define static_key_slow_inc(key)	static_key_fast_inc_not_disabled(key)
 
 static inline void static_key_slow_dec(struct static_key *key)
 {
diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index 4d6c6f5f60db..d9c822bbffb8 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -113,9 +113,40 @@ int static_key_count(struct static_key *key)
 }
 EXPORT_SYMBOL_GPL(static_key_count);
 
-void static_key_slow_inc_cpuslocked(struct static_key *key)
+/*
+ * static_key_fast_inc_not_disabled - adds a user for a static key
+ * @key: static key that must be already enabled
+ *
+ * The caller must make sure that the static key can't get disabled while
+ * in this function. It doesn't patch jump labels, only adds a user to
+ * an already enabled static key.
+ *
+ * Returns true if the increment was done. Unlike refcount_t the ref counter
+ * is not saturated, but will fail to increment on overflow.
+ */
+bool static_key_fast_inc_not_disabled(struct static_key *key)
 {
+	int v;
+
 	STATIC_KEY_CHECK_USE(key);
+	/*
+	 * Negative key->enabled has a special meaning: it sends
+	 * static_key_slow_inc() down the slow path, and it is non-zero
+	 * so it counts as "enabled" in jump_label_update().  Note that
+	 * atomic_inc_unless_negative() checks >= 0, so roll our own.
+	 */
+	v = atomic_read(&key->enabled);
+	do {
+		if (v <= 0 || (v + 1) < 0)
+			return false;
+	} while (!likely(atomic_try_cmpxchg(&key->enabled, &v, v + 1)));
+
+	return true;
+}
+EXPORT_SYMBOL_GPL(static_key_fast_inc_not_disabled);
+
+bool static_key_slow_inc_cpuslocked(struct static_key *key)
+{
 	lockdep_assert_cpus_held();
 
 	/*
@@ -124,15 +155,9 @@ void static_key_slow_inc_cpuslocked(struct static_key *key)
 	 * jump_label_update() process.  At the same time, however,
 	 * the jump_label_update() call below wants to see
 	 * static_key_enabled(&key) for jumps to be updated properly.
-	 *
-	 * So give a special meaning to negative key->enabled: it sends
-	 * static_key_slow_inc() down the slow path, and it is non-zero
-	 * so it counts as "enabled" in jump_label_update().  Note that
-	 * atomic_inc_unless_negative() checks >= 0, so roll our own.
 	 */
-	for (int v = atomic_read(&key->enabled); v > 0; )
-		if (likely(atomic_try_cmpxchg(&key->enabled, &v, v + 1)))
-			return;
+	if (static_key_fast_inc_not_disabled(key))
+		return true;
 
 	jump_label_lock();
 	if (atomic_read(&key->enabled) == 0) {
@@ -144,16 +169,23 @@ void static_key_slow_inc_cpuslocked(struct static_key *key)
 		 */
 		atomic_set_release(&key->enabled, 1);
 	} else {
-		atomic_inc(&key->enabled);
+		if (WARN_ON_ONCE(!static_key_fast_inc_not_disabled(key))) {
+			jump_label_unlock();
+			return false;
+		}
 	}
 	jump_label_unlock();
+	return true;
 }
 
-void static_key_slow_inc(struct static_key *key)
+bool static_key_slow_inc(struct static_key *key)
 {
+	bool ret;
+
 	cpus_read_lock();
-	static_key_slow_inc_cpuslocked(key);
+	ret = static_key_slow_inc_cpuslocked(key);
 	cpus_read_unlock();
+	return ret;
 }
 EXPORT_SYMBOL_GPL(static_key_slow_inc);
 
-- 
2.38.1



* [PATCH v6 2/5] net/tcp: Separate tcp_md5sig_info allocation into tcp_md5sig_info_add()
  2022-11-23 17:38 [PATCH v6 0/5] net/tcp: Dynamically disable TCP-MD5 static key Dmitry Safonov
  2022-11-23 17:38 ` [PATCH v6 1/5] jump_label: Prevent key->enabled int overflow Dmitry Safonov
@ 2022-11-23 17:38 ` Dmitry Safonov
  2022-11-23 17:38 ` [PATCH v6 3/5] net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction Dmitry Safonov
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Dmitry Safonov @ 2022-11-23 17:38 UTC (permalink / raw)
  To: linux-kernel, David Ahern, Eric Dumazet, Peter Zijlstra
  Cc: Dmitry Safonov, Ard Biesheuvel, Bob Gilligan, David S. Miller,
	Dmitry Safonov, Francesco Ruggeri, Hideaki YOSHIFUJI,
	Jakub Kicinski, Jason Baron, Josh Poimboeuf, Paolo Abeni,
	Salam Noureddine, Steven Rostedt, netdev

Add a helper to allocate tcp_md5sig_info; this will make it possible
later to perform work that must happen once per socket, at the point
where the info is first allocated.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
---
 net/ipv4/tcp_ipv4.c | 30 +++++++++++++++++++++---------
 1 file changed, 21 insertions(+), 9 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index f0343538d1f8..2d76d50b8ae8 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1172,6 +1172,24 @@ struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk,
 }
 EXPORT_SYMBOL(tcp_v4_md5_lookup);
 
+static int tcp_md5sig_info_add(struct sock *sk, gfp_t gfp)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+	struct tcp_md5sig_info *md5sig;
+
+	if (rcu_dereference_protected(tp->md5sig_info, lockdep_sock_is_held(sk)))
+		return 0;
+
+	md5sig = kmalloc(sizeof(*md5sig), gfp);
+	if (!md5sig)
+		return -ENOMEM;
+
+	sk_gso_disable(sk);
+	INIT_HLIST_HEAD(&md5sig->head);
+	rcu_assign_pointer(tp->md5sig_info, md5sig);
+	return 0;
+}
+
 /* This can be called on a newly created socket, from other files */
 int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
 		   int family, u8 prefixlen, int l3index, u8 flags,
@@ -1202,17 +1220,11 @@ int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
 		return 0;
 	}
 
+	if (tcp_md5sig_info_add(sk, gfp))
+		return -ENOMEM;
+
 	md5sig = rcu_dereference_protected(tp->md5sig_info,
 					   lockdep_sock_is_held(sk));
-	if (!md5sig) {
-		md5sig = kmalloc(sizeof(*md5sig), gfp);
-		if (!md5sig)
-			return -ENOMEM;
-
-		sk_gso_disable(sk);
-		INIT_HLIST_HEAD(&md5sig->head);
-		rcu_assign_pointer(tp->md5sig_info, md5sig);
-	}
 
 	key = sock_kmalloc(sk, sizeof(*key), gfp | __GFP_ZERO);
 	if (!key)
-- 
2.38.1



* [PATCH v6 3/5] net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction
  2022-11-23 17:38 [PATCH v6 0/5] net/tcp: Dynamically disable TCP-MD5 static key Dmitry Safonov
  2022-11-23 17:38 ` [PATCH v6 1/5] jump_label: Prevent key->enabled int overflow Dmitry Safonov
  2022-11-23 17:38 ` [PATCH v6 2/5] net/tcp: Separate tcp_md5sig_info allocation into tcp_md5sig_info_add() Dmitry Safonov
@ 2022-11-23 17:38 ` Dmitry Safonov
  2022-12-01 19:38   ` Eric Dumazet
  2022-11-23 17:38 ` [PATCH v6 4/5] net/tcp: Do cleanup on tcp_md5_key_copy() failure Dmitry Safonov
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Dmitry Safonov @ 2022-11-23 17:38 UTC (permalink / raw)
  To: linux-kernel, David Ahern, Eric Dumazet, Peter Zijlstra
  Cc: Dmitry Safonov, Ard Biesheuvel, Bob Gilligan, David S. Miller,
	Dmitry Safonov, Francesco Ruggeri, Hideaki YOSHIFUJI,
	Jakub Kicinski, Jason Baron, Josh Poimboeuf, Paolo Abeni,
	Salam Noureddine, Steven Rostedt, netdev

To disable the key on destruction, separate two scenarios:
- the addition of the first MD5 key in the system, where enabling the
  static key may need to sleep;
- copying an existing key from a listening socket to a request socket
  upon receiving a signed TCP segment, where the static key was
  already enabled (when the key was added to the listening socket).

Now the lifetime of the static branch for TCP-MD5 lasts until:
- the last tcp_md5sig_info is destroyed;
- the last socket in time-wait state with an MD5 key is closed.

This means that after all sockets with TCP-MD5 keys are gone, the
system gets back the performance of the disabled md5-key static branch.

While at it, provide the static_key_fast_inc_not_disabled() helper that
increments the ref counter atomically (without grabbing cpus_read_lock()
on CONFIG_JUMP_LABEL=y). This is needed to add a new user for a
static_key when the caller controls the lifetime of another user.
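The two increment paths above can be sketched as follows (a hypothetical userspace model with a single plain counter; slow_inc()/fast_inc() mirror static_branch_inc() and static_key_fast_inc_not_disabled() in spirit only, not in implementation):

```c
#include <assert.h>
#include <stdbool.h>

static int key_enabled;		/* models tcp_md5_needed's refcount */

/* Process context (setsockopt): may sleep, so it is allowed to patch
 * the jump labels when going 0 -> 1. */
static bool slow_inc(void)
{
	if (key_enabled < 0)	/* poisoned/overflowed: refuse */
		return false;
	key_enabled++;		/* 0 -> 1 would patch the branch on */
	return true;
}

/* Softirq context (incoming signed segment): must not sleep, so it may
 * only add a user to an already-enabled key. */
static bool fast_inc(void)
{
	if (key_enabled <= 0)
		return false;
	key_enabled++;
	return true;
}

static void slow_dec(void)	/* tcp_md5sig_info destruction */
{
	assert(key_enabled > 0);
	key_enabled--;		/* 1 -> 0 would patch the branch off */
}
```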

Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
---
 include/net/tcp.h        | 10 ++++--
 net/ipv4/tcp.c           |  5 +--
 net/ipv4/tcp_ipv4.c      | 71 ++++++++++++++++++++++++++++++++--------
 net/ipv4/tcp_minisocks.c | 16 ++++++---
 net/ipv4/tcp_output.c    |  4 +--
 net/ipv6/tcp_ipv6.c      | 10 +++---
 6 files changed, 84 insertions(+), 32 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 6b814e788f00..f925377066fe 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1675,7 +1675,11 @@ int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key,
 			const struct sock *sk, const struct sk_buff *skb);
 int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
 		   int family, u8 prefixlen, int l3index, u8 flags,
-		   const u8 *newkey, u8 newkeylen, gfp_t gfp);
+		   const u8 *newkey, u8 newkeylen);
+int tcp_md5_key_copy(struct sock *sk, const union tcp_md5_addr *addr,
+		     int family, u8 prefixlen, int l3index,
+		     struct tcp_md5sig_key *key);
+
 int tcp_md5_do_del(struct sock *sk, const union tcp_md5_addr *addr,
 		   int family, u8 prefixlen, int l3index, u8 flags);
 struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk,
@@ -1683,7 +1687,7 @@ struct tcp_md5sig_key *tcp_v4_md5_lookup(const struct sock *sk,
 
 #ifdef CONFIG_TCP_MD5SIG
 #include <linux/jump_label.h>
-extern struct static_key_false tcp_md5_needed;
+extern struct static_key_false_deferred tcp_md5_needed;
 struct tcp_md5sig_key *__tcp_md5_do_lookup(const struct sock *sk, int l3index,
 					   const union tcp_md5_addr *addr,
 					   int family);
@@ -1691,7 +1695,7 @@ static inline struct tcp_md5sig_key *
 tcp_md5_do_lookup(const struct sock *sk, int l3index,
 		  const union tcp_md5_addr *addr, int family)
 {
-	if (!static_branch_unlikely(&tcp_md5_needed))
+	if (!static_branch_unlikely(&tcp_md5_needed.key))
 		return NULL;
 	return __tcp_md5_do_lookup(sk, l3index, addr, family);
 }
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 4a69c5fcfedc..267406f199bc 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4465,11 +4465,8 @@ bool tcp_alloc_md5sig_pool(void)
 	if (unlikely(!READ_ONCE(tcp_md5sig_pool_populated))) {
 		mutex_lock(&tcp_md5sig_mutex);
 
-		if (!tcp_md5sig_pool_populated) {
+		if (!tcp_md5sig_pool_populated)
 			__tcp_alloc_md5sig_pool();
-			if (tcp_md5sig_pool_populated)
-				static_branch_inc(&tcp_md5_needed);
-		}
 
 		mutex_unlock(&tcp_md5sig_mutex);
 	}
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 2d76d50b8ae8..2ae6a061f36e 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1064,7 +1064,7 @@ static void tcp_v4_reqsk_destructor(struct request_sock *req)
  * We need to maintain these in the sk structure.
  */
 
-DEFINE_STATIC_KEY_FALSE(tcp_md5_needed);
+DEFINE_STATIC_KEY_DEFERRED_FALSE(tcp_md5_needed, HZ);
 EXPORT_SYMBOL(tcp_md5_needed);
 
 static bool better_md5_match(struct tcp_md5sig_key *old, struct tcp_md5sig_key *new)
@@ -1177,9 +1177,6 @@ static int tcp_md5sig_info_add(struct sock *sk, gfp_t gfp)
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct tcp_md5sig_info *md5sig;
 
-	if (rcu_dereference_protected(tp->md5sig_info, lockdep_sock_is_held(sk)))
-		return 0;
-
 	md5sig = kmalloc(sizeof(*md5sig), gfp);
 	if (!md5sig)
 		return -ENOMEM;
@@ -1191,9 +1188,9 @@ static int tcp_md5sig_info_add(struct sock *sk, gfp_t gfp)
 }
 
 /* This can be called on a newly created socket, from other files */
-int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
-		   int family, u8 prefixlen, int l3index, u8 flags,
-		   const u8 *newkey, u8 newkeylen, gfp_t gfp)
+static int __tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
+			    int family, u8 prefixlen, int l3index, u8 flags,
+			    const u8 *newkey, u8 newkeylen, gfp_t gfp)
 {
 	/* Add Key to the list */
 	struct tcp_md5sig_key *key;
@@ -1220,9 +1217,6 @@ int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
 		return 0;
 	}
 
-	if (tcp_md5sig_info_add(sk, gfp))
-		return -ENOMEM;
-
 	md5sig = rcu_dereference_protected(tp->md5sig_info,
 					   lockdep_sock_is_held(sk));
 
@@ -1246,8 +1240,59 @@ int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
 	hlist_add_head_rcu(&key->node, &md5sig->head);
 	return 0;
 }
+
+int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
+		   int family, u8 prefixlen, int l3index, u8 flags,
+		   const u8 *newkey, u8 newkeylen)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+
+	if (!rcu_dereference_protected(tp->md5sig_info, lockdep_sock_is_held(sk))) {
+		if (tcp_md5sig_info_add(sk, GFP_KERNEL))
+			return -ENOMEM;
+
+		if (!static_branch_inc(&tcp_md5_needed.key)) {
+			struct tcp_md5sig_info *md5sig;
+
+			md5sig = rcu_dereference_protected(tp->md5sig_info, lockdep_sock_is_held(sk));
+			rcu_assign_pointer(tp->md5sig_info, NULL);
+			kfree_rcu(md5sig);
+			return -EUSERS;
+		}
+	}
+
+	return __tcp_md5_do_add(sk, addr, family, prefixlen, l3index, flags,
+				newkey, newkeylen, GFP_KERNEL);
+}
 EXPORT_SYMBOL(tcp_md5_do_add);
 
+int tcp_md5_key_copy(struct sock *sk, const union tcp_md5_addr *addr,
+		     int family, u8 prefixlen, int l3index,
+		     struct tcp_md5sig_key *key)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+
+	if (!rcu_dereference_protected(tp->md5sig_info, lockdep_sock_is_held(sk))) {
+		if (tcp_md5sig_info_add(sk, sk_gfp_mask(sk, GFP_ATOMIC)))
+			return -ENOMEM;
+
+		if (!static_key_fast_inc_not_disabled(&tcp_md5_needed.key.key)) {
+			struct tcp_md5sig_info *md5sig;
+
+			md5sig = rcu_dereference_protected(tp->md5sig_info, lockdep_sock_is_held(sk));
+			net_warn_ratelimited("Too many TCP-MD5 keys in the system\n");
+			rcu_assign_pointer(tp->md5sig_info, NULL);
+			kfree_rcu(md5sig);
+			return -EUSERS;
+		}
+	}
+
+	return __tcp_md5_do_add(sk, addr, family, prefixlen, l3index,
+				key->flags, key->key, key->keylen,
+				sk_gfp_mask(sk, GFP_ATOMIC));
+}
+EXPORT_SYMBOL(tcp_md5_key_copy);
+
 int tcp_md5_do_del(struct sock *sk, const union tcp_md5_addr *addr, int family,
 		   u8 prefixlen, int l3index, u8 flags)
 {
@@ -1334,7 +1379,7 @@ static int tcp_v4_parse_md5_keys(struct sock *sk, int optname,
 		return -EINVAL;
 
 	return tcp_md5_do_add(sk, addr, AF_INET, prefixlen, l3index, flags,
-			      cmd.tcpm_key, cmd.tcpm_keylen, GFP_KERNEL);
+			      cmd.tcpm_key, cmd.tcpm_keylen);
 }
 
 static int tcp_v4_md5_hash_headers(struct tcp_md5sig_pool *hp,
@@ -1591,8 +1636,7 @@ struct sock *tcp_v4_syn_recv_sock(const struct sock *sk, struct sk_buff *skb,
 		 * memory, then we end up not copying the key
 		 * across. Shucks.
 		 */
-		tcp_md5_do_add(newsk, addr, AF_INET, 32, l3index, key->flags,
-			       key->key, key->keylen, GFP_ATOMIC);
+		tcp_md5_key_copy(newsk, addr, AF_INET, 32, l3index, key);
 		sk_gso_disable(newsk);
 	}
 #endif
@@ -2284,6 +2328,7 @@ void tcp_v4_destroy_sock(struct sock *sk)
 		tcp_clear_md5_list(sk);
 		kfree_rcu(rcu_dereference_protected(tp->md5sig_info, 1), rcu);
 		tp->md5sig_info = NULL;
+		static_branch_slow_dec_deferred(&tcp_md5_needed);
 	}
 #endif
 
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index c375f603a16c..6908812d50d3 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -291,13 +291,19 @@ void tcp_time_wait(struct sock *sk, int state, int timeo)
 		 */
 		do {
 			tcptw->tw_md5_key = NULL;
-			if (static_branch_unlikely(&tcp_md5_needed)) {
+			if (static_branch_unlikely(&tcp_md5_needed.key)) {
 				struct tcp_md5sig_key *key;
 
 				key = tp->af_specific->md5_lookup(sk, sk);
 				if (key) {
 					tcptw->tw_md5_key = kmemdup(key, sizeof(*key), GFP_ATOMIC);
-					BUG_ON(tcptw->tw_md5_key && !tcp_alloc_md5sig_pool());
+					if (!tcptw->tw_md5_key)
+						break;
+					BUG_ON(!tcp_alloc_md5sig_pool());
+					if (!static_key_fast_inc_not_disabled(&tcp_md5_needed.key.key)) {
+						kfree(tcptw->tw_md5_key);
+						tcptw->tw_md5_key = NULL;
+					}
 				}
 			}
 		} while (0);
@@ -337,11 +343,13 @@ EXPORT_SYMBOL(tcp_time_wait);
 void tcp_twsk_destructor(struct sock *sk)
 {
 #ifdef CONFIG_TCP_MD5SIG
-	if (static_branch_unlikely(&tcp_md5_needed)) {
+	if (static_branch_unlikely(&tcp_md5_needed.key)) {
 		struct tcp_timewait_sock *twsk = tcp_twsk(sk);
 
-		if (twsk->tw_md5_key)
+		if (twsk->tw_md5_key) {
 			kfree_rcu(twsk->tw_md5_key, rcu);
+			static_branch_slow_dec_deferred(&tcp_md5_needed);
+		}
 	}
 #endif
 }
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 894410dc9293..71d01cf3c13e 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -766,7 +766,7 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb,
 
 	*md5 = NULL;
 #ifdef CONFIG_TCP_MD5SIG
-	if (static_branch_unlikely(&tcp_md5_needed) &&
+	if (static_branch_unlikely(&tcp_md5_needed.key) &&
 	    rcu_access_pointer(tp->md5sig_info)) {
 		*md5 = tp->af_specific->md5_lookup(sk, sk);
 		if (*md5) {
@@ -922,7 +922,7 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
 
 	*md5 = NULL;
 #ifdef CONFIG_TCP_MD5SIG
-	if (static_branch_unlikely(&tcp_md5_needed) &&
+	if (static_branch_unlikely(&tcp_md5_needed.key) &&
 	    rcu_access_pointer(tp->md5sig_info)) {
 		*md5 = tp->af_specific->md5_lookup(sk, sk);
 		if (*md5) {
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index f676be14e6b6..83304d6a6bd0 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -677,12 +677,11 @@ static int tcp_v6_parse_md5_keys(struct sock *sk, int optname,
 	if (ipv6_addr_v4mapped(&sin6->sin6_addr))
 		return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin6->sin6_addr.s6_addr32[3],
 				      AF_INET, prefixlen, l3index, flags,
-				      cmd.tcpm_key, cmd.tcpm_keylen,
-				      GFP_KERNEL);
+				      cmd.tcpm_key, cmd.tcpm_keylen);
 
 	return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin6->sin6_addr,
 			      AF_INET6, prefixlen, l3index, flags,
-			      cmd.tcpm_key, cmd.tcpm_keylen, GFP_KERNEL);
+			      cmd.tcpm_key, cmd.tcpm_keylen);
 }
 
 static int tcp_v6_md5_hash_headers(struct tcp_md5sig_pool *hp,
@@ -1382,9 +1381,8 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
 		 * memory, then we end up not copying the key
 		 * across. Shucks.
 		 */
-		tcp_md5_do_add(newsk, (union tcp_md5_addr *)&newsk->sk_v6_daddr,
-			       AF_INET6, 128, l3index, key->flags, key->key, key->keylen,
-			       sk_gfp_mask(sk, GFP_ATOMIC));
+		tcp_md5_key_copy(newsk, (union tcp_md5_addr *)&newsk->sk_v6_daddr,
+				 AF_INET6, 128, l3index, key);
 	}
 #endif
 
-- 
2.38.1



* [PATCH v6 4/5] net/tcp: Do cleanup on tcp_md5_key_copy() failure
  2022-11-23 17:38 [PATCH v6 0/5] net/tcp: Dynamically disable TCP-MD5 static key Dmitry Safonov
                   ` (2 preceding siblings ...)
  2022-11-23 17:38 ` [PATCH v6 3/5] net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction Dmitry Safonov
@ 2022-11-23 17:38 ` Dmitry Safonov
  2022-12-01 19:42   ` Eric Dumazet
  2022-11-23 17:38 ` [PATCH v6 5/5] net/tcp: Separate initialization of twsk Dmitry Safonov
  2022-12-02  4:10 ` [PATCH v6 0/5] net/tcp: Dynamically disable TCP-MD5 static key patchwork-bot+netdevbpf
  5 siblings, 1 reply; 18+ messages in thread
From: Dmitry Safonov @ 2022-11-23 17:38 UTC (permalink / raw)
  To: linux-kernel, David Ahern, Eric Dumazet, Peter Zijlstra
  Cc: Dmitry Safonov, Ard Biesheuvel, Bob Gilligan, David S. Miller,
	Dmitry Safonov, Francesco Ruggeri, Hideaki YOSHIFUJI,
	Jakub Kicinski, Jason Baron, Josh Poimboeuf, Paolo Abeni,
	Salam Noureddine, Steven Rostedt, netdev

If the kernel is short on (atomic) memory and fails to allocate it,
don't proceed to create the request socket. Otherwise the socket would
be unsigned, and userspace likely doesn't expect the TCP connection to
no longer be MD5-signed.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
---
 net/ipv4/tcp_ipv4.c |  9 ++-------
 net/ipv6/tcp_ipv6.c | 15 ++++++++-------
 2 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 2ae6a061f36e..e214098087fe 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1630,13 +1630,8 @@ struct sock *tcp_v4_syn_recv_sock(const struct sock *sk, struct sk_buff *skb,
 	addr = (union tcp_md5_addr *)&newinet->inet_daddr;
 	key = tcp_md5_do_lookup(sk, l3index, addr, AF_INET);
 	if (key) {
-		/*
-		 * We're using one, so create a matching key
-		 * on the newsk structure. If we fail to get
-		 * memory, then we end up not copying the key
-		 * across. Shucks.
-		 */
-		tcp_md5_key_copy(newsk, addr, AF_INET, 32, l3index, key);
+		if (tcp_md5_key_copy(newsk, addr, AF_INET, 32, l3index, key))
+			goto put_and_exit;
 		sk_gso_disable(newsk);
 	}
 #endif
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 83304d6a6bd0..21486b4a9774 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1376,13 +1376,14 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
 	/* Copy over the MD5 key from the original socket */
 	key = tcp_v6_md5_do_lookup(sk, &newsk->sk_v6_daddr, l3index);
 	if (key) {
-		/* We're using one, so create a matching key
-		 * on the newsk structure. If we fail to get
-		 * memory, then we end up not copying the key
-		 * across. Shucks.
-		 */
-		tcp_md5_key_copy(newsk, (union tcp_md5_addr *)&newsk->sk_v6_daddr,
-				 AF_INET6, 128, l3index, key);
+		const union tcp_md5_addr *addr;
+
+		addr = (union tcp_md5_addr *)&newsk->sk_v6_daddr;
+		if (tcp_md5_key_copy(newsk, addr, AF_INET6, 128, l3index, key)) {
+			inet_csk_prepare_forced_close(newsk);
+			tcp_done(newsk);
+			goto out;
+		}
 	}
 #endif
 
-- 
2.38.1



* [PATCH v6 5/5] net/tcp: Separate initialization of twsk
  2022-11-23 17:38 [PATCH v6 0/5] net/tcp: Dynamically disable TCP-MD5 static key Dmitry Safonov
                   ` (3 preceding siblings ...)
  2022-11-23 17:38 ` [PATCH v6 4/5] net/tcp: Do cleanup on tcp_md5_key_copy() failure Dmitry Safonov
@ 2022-11-23 17:38 ` Dmitry Safonov
  2022-12-01 19:44   ` Eric Dumazet
  2022-12-02  4:10 ` [PATCH v6 0/5] net/tcp: Dynamically disable TCP-MD5 static key patchwork-bot+netdevbpf
  5 siblings, 1 reply; 18+ messages in thread
From: Dmitry Safonov @ 2022-11-23 17:38 UTC (permalink / raw)
  To: linux-kernel, David Ahern, Eric Dumazet, Peter Zijlstra
  Cc: Dmitry Safonov, Ard Biesheuvel, Bob Gilligan, David S. Miller,
	Dmitry Safonov, Francesco Ruggeri, Hideaki YOSHIFUJI,
	Jakub Kicinski, Jason Baron, Josh Poimboeuf, Paolo Abeni,
	Salam Noureddine, Steven Rostedt, netdev

Convert BUG_ON() to WARN_ON_ONCE(), and warn as well on the unlikely
static-key int-overflow error path.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
---
 net/ipv4/tcp_minisocks.c | 61 +++++++++++++++++++++++-----------------
 1 file changed, 35 insertions(+), 26 deletions(-)

diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 6908812d50d3..e002f2e1d4f2 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -240,6 +240,40 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
 }
 EXPORT_SYMBOL(tcp_timewait_state_process);
 
+static void tcp_time_wait_init(struct sock *sk, struct tcp_timewait_sock *tcptw)
+{
+#ifdef CONFIG_TCP_MD5SIG
+	const struct tcp_sock *tp = tcp_sk(sk);
+	struct tcp_md5sig_key *key;
+
+	/*
+	 * The timewait bucket does not have the key DB from the
+	 * sock structure. We just make a quick copy of the
+	 * md5 key being used (if indeed we are using one)
+	 * so the timewait ack generating code has the key.
+	 */
+	tcptw->tw_md5_key = NULL;
+	if (!static_branch_unlikely(&tcp_md5_needed.key))
+		return;
+
+	key = tp->af_specific->md5_lookup(sk, sk);
+	if (key) {
+		tcptw->tw_md5_key = kmemdup(key, sizeof(*key), GFP_ATOMIC);
+		if (!tcptw->tw_md5_key)
+			return;
+		if (!tcp_alloc_md5sig_pool())
+			goto out_free;
+		if (!static_key_fast_inc_not_disabled(&tcp_md5_needed.key.key))
+			goto out_free;
+	}
+	return;
+out_free:
+	WARN_ON_ONCE(1);
+	kfree(tcptw->tw_md5_key);
+	tcptw->tw_md5_key = NULL;
+#endif
+}
+
 /*
  * Move a socket to time-wait or dead fin-wait-2 state.
  */
@@ -282,32 +316,7 @@ void tcp_time_wait(struct sock *sk, int state, int timeo)
 		}
 #endif
 
-#ifdef CONFIG_TCP_MD5SIG
-		/*
-		 * The timewait bucket does not have the key DB from the
-		 * sock structure. We just make a quick copy of the
-		 * md5 key being used (if indeed we are using one)
-		 * so the timewait ack generating code has the key.
-		 */
-		do {
-			tcptw->tw_md5_key = NULL;
-			if (static_branch_unlikely(&tcp_md5_needed.key)) {
-				struct tcp_md5sig_key *key;
-
-				key = tp->af_specific->md5_lookup(sk, sk);
-				if (key) {
-					tcptw->tw_md5_key = kmemdup(key, sizeof(*key), GFP_ATOMIC);
-					if (!tcptw->tw_md5_key)
-						break;
-					BUG_ON(!tcp_alloc_md5sig_pool());
-					if (!static_key_fast_inc_not_disabled(&tcp_md5_needed.key.key)) {
-						kfree(tcptw->tw_md5_key);
-						tcptw->tw_md5_key = NULL;
-					}
-				}
-			}
-		} while (0);
-#endif
+		tcp_time_wait_init(sk, tcptw);
 
 		/* Get the TIME_WAIT timeout firing. */
 		if (timeo < rto)
-- 
2.38.1


* Re: [PATCH v6 1/5] jump_label: Prevent key->enabled int overflow
  2022-11-23 17:38 ` [PATCH v6 1/5] jump_label: Prevent key->enabled int overflow Dmitry Safonov
@ 2022-11-25  7:59   ` Peter Zijlstra
  2022-11-25 14:28     ` Dmitry Safonov
  0 siblings, 1 reply; 18+ messages in thread
From: Peter Zijlstra @ 2022-11-25  7:59 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, David Ahern, Eric Dumazet, Ard Biesheuvel,
	Bob Gilligan, David S. Miller, Dmitry Safonov, Francesco Ruggeri,
	Hideaki YOSHIFUJI, Jakub Kicinski, Jason Baron, Josh Poimboeuf,
	Paolo Abeni, Salam Noureddine, Steven Rostedt, netdev

On Wed, Nov 23, 2022 at 05:38:55PM +0000, Dmitry Safonov wrote:
> 1. With CONFIG_JUMP_LABEL=n static_key_slow_inc() doesn't have any
>    protection against key->enabled refcounter overflow.
> 2. With CONFIG_JUMP_LABEL=y static_key_slow_inc_cpuslocked()
>    still may turn the refcounter negative as (v + 1) may overflow.
> 
> key->enabled is indeed a ref-counter as it's documented in multiple
> places: top comment in jump_label.h, Documentation/staging/static-keys.rst,
> etc.
> 
> As -1 is reserved for a static key that's in the process of being enabled,
> functions would break with negative key->enabled refcount:
> - for CONFIG_JUMP_LABEL=n negative return of static_key_count()
>   breaks static_key_false(), static_key_true()
> - the ref counter may become 0 from negative side by too many
>   static_key_slow_inc() calls and lead to use-after-free issues.
> 
> These flaws mean that some users have to introduce an additional
> mutex and guard against reference counter overflow themselves;
> see bpf_enable_runtime_stats() checking the counter against INT_MAX / 2.
> 
> Prevent the reference counter overflow by checking if (v + 1) > 0.
> Change functions API to return whether the increment was successful.
> 
> Signed-off-by: Dmitry Safonov <dima@arista.com>
> Acked-by: Jakub Kicinski <kuba@kernel.org>

This looks good to me:

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

What is the plan for merging this? I'm assuming it would want to go
through the network tree, but as already noted earlier it depends on a
patch I have in tip/locking/core.

Now I checked, tip/locking/core is *just* that one patch, so it might be
possible to merge that branch and this series into the network tree and
note that during the pull request to Linus.

* Re: [PATCH v6 1/5] jump_label: Prevent key->enabled int overflow
  2022-11-25  7:59   ` Peter Zijlstra
@ 2022-11-25 14:28     ` Dmitry Safonov
  2022-12-01 22:31       ` Jakub Kicinski
  0 siblings, 1 reply; 18+ messages in thread
From: Dmitry Safonov @ 2022-11-25 14:28 UTC (permalink / raw)
  To: Peter Zijlstra, Eric Dumazet, Jakub Kicinski
  Cc: linux-kernel, Ard Biesheuvel, Bob Gilligan, Dmitry Safonov,
	Francesco Ruggeri, Hideaki YOSHIFUJI, Jason Baron,
	Josh Poimboeuf, Paolo Abeni, Salam Noureddine, Steven Rostedt,
	netdev, David S. Miller, David Ahern

On 11/25/22 07:59, Peter Zijlstra wrote:
> On Wed, Nov 23, 2022 at 05:38:55PM +0000, Dmitry Safonov wrote:
>> 1. With CONFIG_JUMP_LABEL=n static_key_slow_inc() doesn't have any
>>    protection against key->enabled refcounter overflow.
>> 2. With CONFIG_JUMP_LABEL=y static_key_slow_inc_cpuslocked()
>>    still may turn the refcounter negative as (v + 1) may overflow.
>>
>> key->enabled is indeed a ref-counter as it's documented in multiple
>> places: top comment in jump_label.h, Documentation/staging/static-keys.rst,
>> etc.
>>
>> As -1 is reserved for static key that's in process of being enabled,
>> functions would break with negative key->enabled refcount:
>> - for CONFIG_JUMP_LABEL=n negative return of static_key_count()
>>   breaks static_key_false(), static_key_true()
>> - the ref counter may become 0 from negative side by too many
>>   static_key_slow_inc() calls and lead to use-after-free issues.
>>
>> These flaws result in that some users have to introduce an additional
>> mutex and prevent the reference counter from overflowing themselves,
>> see bpf_enable_runtime_stats() checking the counter against INT_MAX / 2.
>>
>> Prevent the reference counter overflow by checking if (v + 1) > 0.
>> Change functions API to return whether the increment was successful.
>>
>> Signed-off-by: Dmitry Safonov <dima@arista.com>
>> Acked-by: Jakub Kicinski <kuba@kernel.org>
> 
> This looks good to me:
> 
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Thank you, Peter!

> What is the plan for merging this? I'm assuming it would want to go
> through the network tree, but as already noted earlier it depends on a
> patch I have in tip/locking/core.
> 
> Now I checked, tip/locking/core is *just* that one patch, so it might be
> possible to merge that branch and this series into the network tree and
> note that during the pull request to Linus.

I initially thought it had to go through the tip tree because of the
dependency, but as you say it's just one patch.

I was also asked by Jakub on v4 to wait for Eric's Ack/Review, so once I
get a go from him, I will send all 6 patches for inclusion into the -net
tree, if that is in time before the merge window.

Thanks again for the review and ack,
          Dmitry

* Re: [PATCH v6 3/5] net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction
  2022-11-23 17:38 ` [PATCH v6 3/5] net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction Dmitry Safonov
@ 2022-12-01 19:38   ` Eric Dumazet
  2022-12-02  5:05     ` Eric Dumazet
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2022-12-01 19:38 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, David Ahern, Peter Zijlstra, Ard Biesheuvel,
	Bob Gilligan, David S. Miller, Dmitry Safonov, Francesco Ruggeri,
	Hideaki YOSHIFUJI, Jakub Kicinski, Jason Baron, Josh Poimboeuf,
	Paolo Abeni, Salam Noureddine, Steven Rostedt, netdev

On Wed, Nov 23, 2022 at 6:39 PM Dmitry Safonov <dima@arista.com> wrote:
>
> To do that, separate two scenarios:
> - where it's the first MD5 key on the system, which means that enabling
>   of the static key may need to sleep;
> - copying of an existing key from a listening socket to the request
>   socket upon receiving a signed TCP segment, where static key was
>   already enabled (when the key was added to the listening socket).
>
> Now the life-time of the static branch for TCP-MD5 is until:
> - last tcp_md5sig_info is destroyed
> - last socket in time-wait state with MD5 key is closed.
>
> Which means that after all sockets with TCP-MD5 keys are gone, the
> system gets back the performance of disabled md5-key static branch.
>
> While at it, provide a static_key_fast_inc() helper that does the ref
> counter increment in atomic fashion (without grabbing cpus_read_lock()
> on CONFIG_JUMP_LABEL=y). This is needed to add a new user for
> a static_key when the caller controls the lifetime of another user.
>
> Signed-off-by: Dmitry Safonov <dima@arista.com>
> Acked-by: Jakub Kicinski <kuba@kernel.org>

Reviewed-by: Eric Dumazet <edumazet@google.com>

* Re: [PATCH v6 4/5] net/tcp: Do cleanup on tcp_md5_key_copy() failure
  2022-11-23 17:38 ` [PATCH v6 4/5] net/tcp: Do cleanup on tcp_md5_key_copy() failure Dmitry Safonov
@ 2022-12-01 19:42   ` Eric Dumazet
  0 siblings, 0 replies; 18+ messages in thread
From: Eric Dumazet @ 2022-12-01 19:42 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, David Ahern, Peter Zijlstra, Ard Biesheuvel,
	Bob Gilligan, David S. Miller, Dmitry Safonov, Francesco Ruggeri,
	Hideaki YOSHIFUJI, Jakub Kicinski, Jason Baron, Josh Poimboeuf,
	Paolo Abeni, Salam Noureddine, Steven Rostedt, netdev

On Wed, Nov 23, 2022 at 6:39 PM Dmitry Safonov <dima@arista.com> wrote:
>
> If the kernel was short on (atomic) memory and failed to allocate it,
> don't proceed with creation of the request socket. Otherwise the socket
> would be unsigned, and userspace likely doesn't expect that the TCP
> connection is no longer MD5-signed.
>
> Signed-off-by: Dmitry Safonov <dima@arista.com>
> Acked-by: Jakub Kicinski <kuba@kernel.org>

Reviewed-by: Eric Dumazet <edumazet@google.com>

* Re: [PATCH v6 5/5] net/tcp: Separate initialization of twsk
  2022-11-23 17:38 ` [PATCH v6 5/5] net/tcp: Separate initialization of twsk Dmitry Safonov
@ 2022-12-01 19:44   ` Eric Dumazet
  0 siblings, 0 replies; 18+ messages in thread
From: Eric Dumazet @ 2022-12-01 19:44 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, David Ahern, Peter Zijlstra, Ard Biesheuvel,
	Bob Gilligan, David S. Miller, Dmitry Safonov, Francesco Ruggeri,
	Hideaki YOSHIFUJI, Jakub Kicinski, Jason Baron, Josh Poimboeuf,
	Paolo Abeni, Salam Noureddine, Steven Rostedt, netdev

On Wed, Nov 23, 2022 at 6:39 PM Dmitry Safonov <dima@arista.com> wrote:
>
> Convert BUG_ON() to WARN_ON_ONCE() and warn as well for unlikely
> static key int overflow error-path.
>
> Signed-off-by: Dmitry Safonov <dima@arista.com>
> Acked-by: Jakub Kicinski <kuba@kernel.org>
> ---

Reviewed-by: Eric Dumazet <edumazet@google.com>

Thanks !

* Re: [PATCH v6 1/5] jump_label: Prevent key->enabled int overflow
  2022-11-25 14:28     ` Dmitry Safonov
@ 2022-12-01 22:31       ` Jakub Kicinski
  2022-12-01 23:17         ` Dmitry Safonov
  0 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2022-12-01 22:31 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: Peter Zijlstra, Eric Dumazet, linux-kernel, Ard Biesheuvel,
	Bob Gilligan, Dmitry Safonov, Francesco Ruggeri,
	Hideaki YOSHIFUJI, Jason Baron, Josh Poimboeuf, Paolo Abeni,
	Salam Noureddine, Steven Rostedt, netdev, David S. Miller,
	David Ahern

On Fri, 25 Nov 2022 14:28:30 +0000 Dmitry Safonov wrote:
> > What is the plan for merging this? I'm assuming it would want to go
> > through the network tree, but as already noted earlier it depends on a
> > patch I have in tip/locking/core.
> > 
> > Now I checked, tip/locking/core is *just* that one patch, so it might be
> > possible to merge that branch and this series into the network tree and
> > note that during the pull request to Linus.  
> 
> I initially thought it has to go through tip trees because of the
> dependence, but as you say it's just one patch.
> 
> I was also asked by Jakub on v4 to wait for Eric's Ack/Review, so once I
> get a go from him, I will send all 6 patches for inclusion into -net
> tree, if that will be in time before the merge window.

Looks like we're all set on the networking side (thanks Eric!!)

Should I pull Peter's branch? Or do you want to just resend the patch
Peter already queued? A bit of an unusual situation..

* Re: [PATCH v6 1/5] jump_label: Prevent key->enabled int overflow
  2022-12-01 22:31       ` Jakub Kicinski
@ 2022-12-01 23:17         ` Dmitry Safonov
  2022-12-01 23:36           ` Jakub Kicinski
  0 siblings, 1 reply; 18+ messages in thread
From: Dmitry Safonov @ 2022-12-01 23:17 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Dmitry Safonov, Peter Zijlstra, Eric Dumazet, linux-kernel,
	Ard Biesheuvel, Bob Gilligan, Francesco Ruggeri,
	Hideaki YOSHIFUJI, Jason Baron, Josh Poimboeuf, Paolo Abeni,
	Salam Noureddine, Steven Rostedt, netdev, David S. Miller,
	David Ahern

On Thu, 1 Dec 2022 at 22:31, Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Fri, 25 Nov 2022 14:28:30 +0000 Dmitry Safonov wrote:
> > > What is the plan for merging this? I'm assuming it would want to go
> > > through the network tree, but as already noted earlier it depends on a
> > > patch I have in tip/locking/core.
> > >
> > > Now I checked, tip/locking/core is *just* that one patch, so it might be
> > > possible to merge that branch and this series into the network tree and
> > > note that during the pull request to Linus.
> >
> > I initially thought it has to go through tip trees because of the
> > dependence, but as you say it's just one patch.
> >
> > I was also asked by Jakub on v4 to wait for Eric's Ack/Review, so once I
> > get a go from him, I will send all 6 patches for inclusion into -net
> > tree, if that will be in time before the merge window.
>
> Looks like we're all set on the networking side (thanks Eric!!)

Thanks!

> Should I pull Peter's branch? Or you want to just resent a patch Peter
> already queued. A bit of an unusual situation..

Either way would work for me.
I can send it in a couple of hours if you prefer instead of pulling the branch.

Thank you,
             Dmitry

* Re: [PATCH v6 1/5] jump_label: Prevent key->enabled int overflow
  2022-12-01 23:17         ` Dmitry Safonov
@ 2022-12-01 23:36           ` Jakub Kicinski
  2022-12-02  0:37             ` Dmitry Safonov
  0 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2022-12-01 23:36 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: Dmitry Safonov, Peter Zijlstra, Eric Dumazet, linux-kernel,
	Ard Biesheuvel, Bob Gilligan, Francesco Ruggeri,
	Hideaki YOSHIFUJI, Jason Baron, Josh Poimboeuf, Paolo Abeni,
	Salam Noureddine, Steven Rostedt, netdev, David S. Miller,
	David Ahern

On Thu, 1 Dec 2022 23:17:11 +0000 Dmitry Safonov wrote:
> On Thu, 1 Dec 2022 at 22:31, Jakub Kicinski <kuba@kernel.org> wrote:
> > > I initially thought it has to go through tip trees because of the
> > > dependence, but as you say it's just one patch.
> > >
> > > I was also asked by Jakub on v4 to wait for Eric's Ack/Review, so once I
> > > get a go from him, I will send all 6 patches for inclusion into -net
> > > tree, if that will be in time before the merge window.  
> >
> > Looks like we're all set on the networking side (thanks Eric!!)  
> 
> Thanks!
> 
> > Should I pull Peter's branch? Or you want to just resent a patch Peter
> > already queued. A bit of an unusual situation..  
> 
> Either way would work for me.
> I can send it in a couple of hours if you prefer instead of pulling the branch.

I prefer to pull, seems safer in case Peter does get another patch.

It's this one, right?

https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=locking/core

I've pulled; I'll push out once the build is done.

* Re: [PATCH v6 1/5] jump_label: Prevent key->enabled int overflow
  2022-12-01 23:36           ` Jakub Kicinski
@ 2022-12-02  0:37             ` Dmitry Safonov
  0 siblings, 0 replies; 18+ messages in thread
From: Dmitry Safonov @ 2022-12-02  0:37 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Dmitry Safonov, Peter Zijlstra, Eric Dumazet, linux-kernel,
	Ard Biesheuvel, Bob Gilligan, Francesco Ruggeri,
	Hideaki YOSHIFUJI, Jason Baron, Josh Poimboeuf, Paolo Abeni,
	Salam Noureddine, Steven Rostedt, netdev, David S. Miller,
	David Ahern

On Thu, 1 Dec 2022 at 23:36, Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 1 Dec 2022 23:17:11 +0000 Dmitry Safonov wrote:
> > On Thu, 1 Dec 2022 at 22:31, Jakub Kicinski <kuba@kernel.org> wrote:
> > > > I initially thought it has to go through tip trees because of the
> > > > dependence, but as you say it's just one patch.
> > > >
> > > > I was also asked by Jakub on v4 to wait for Eric's Ack/Review, so once I
> > > > get a go from him, I will send all 6 patches for inclusion into -net
> > > > tree, if that will be in time before the merge window.
> > >
> > > Looks like we're all set on the networking side (thanks Eric!!)
> >
> > Thanks!
> >
> > > Should I pull Peter's branch? Or you want to just resent a patch Peter
> > > already queued. A bit of an unusual situation..
> >
> > Either way would work for me.
> > I can send it in a couple of hours if you prefer instead of pulling the branch.
>
> I prefer to pull, seems safer in case Peter does get another patch.
>
> It's this one, right?

It is the correct one, yes.

> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=locking/core
>
> I'll pulled, I'll push out once the build is done.

Thank you once again,
             Dmitry

* Re: [PATCH v6 0/5] net/tcp: Dynamically disable TCP-MD5 static key
  2022-11-23 17:38 [PATCH v6 0/5] net/tcp: Dynamically disable TCP-MD5 static key Dmitry Safonov
                   ` (4 preceding siblings ...)
  2022-11-23 17:38 ` [PATCH v6 5/5] net/tcp: Separate initialization of twsk Dmitry Safonov
@ 2022-12-02  4:10 ` patchwork-bot+netdevbpf
  5 siblings, 0 replies; 18+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-12-02  4:10 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, dsahern, edumazet, peterz, ardb, gilligan, davem,
	0x7f454c46, fruggeri, yoshfuji, kuba, jbaron, jpoimboe, pabeni,
	noureddine, rostedt, netdev

Hello:

This series was applied to netdev/net-next.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Wed, 23 Nov 2022 17:38:54 +0000 you wrote:
> Changes from v5:
> - Corrected comment for static_key_fast_inc_not_negative() (Peter)
> - Renamed static_key_fast_inc_not_negative() =>
>   static_key_fast_inc_not_disabled() (as suggested by Peter)
> - static_key_fast_inc_not_disabled() is exported and declared in the
>   patch 1 that defines it, rather than in patch 3 that uses it (Peter)
> 
> [...]

Here is the summary with links:
  - [v6,1/5] jump_label: Prevent key->enabled int overflow
    https://git.kernel.org/netdev/net-next/c/eb8c507296f6
  - [v6,2/5] net/tcp: Separate tcp_md5sig_info allocation into tcp_md5sig_info_add()
    https://git.kernel.org/netdev/net-next/c/f62c7517ffa1
  - [v6,3/5] net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction
    https://git.kernel.org/netdev/net-next/c/459837b522f7
  - [v6,4/5] net/tcp: Do cleanup on tcp_md5_key_copy() failure
    https://git.kernel.org/netdev/net-next/c/b389d1affc2c
  - [v6,5/5] net/tcp: Separate initialization of twsk
    https://git.kernel.org/netdev/net-next/c/c5b8b515a211

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



* Re: [PATCH v6 3/5] net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction
  2022-12-01 19:38   ` Eric Dumazet
@ 2022-12-02  5:05     ` Eric Dumazet
  2022-12-02  5:34       ` Eric Dumazet
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2022-12-02  5:05 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, David Ahern, Peter Zijlstra, Ard Biesheuvel,
	Bob Gilligan, David S. Miller, Dmitry Safonov, Francesco Ruggeri,
	Hideaki YOSHIFUJI, Jakub Kicinski, Jason Baron, Josh Poimboeuf,
	Paolo Abeni, Salam Noureddine, Steven Rostedt, netdev

On Thu, Dec 1, 2022 at 8:38 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Wed, Nov 23, 2022 at 6:39 PM Dmitry Safonov <dima@arista.com> wrote:
> >
> > To do that, separate two scenarios:
> > - where it's the first MD5 key on the system, which means that enabling
> >   of the static key may need to sleep;
> > - copying of an existing key from a listening socket to the request
> >   socket upon receiving a signed TCP segment, where static key was
> >   already enabled (when the key was added to the listening socket).
> >
> > Now the life-time of the static branch for TCP-MD5 is until:
> > - last tcp_md5sig_info is destroyed
> > - last socket in time-wait state with MD5 key is closed.
> >
> > Which means that after all sockets with TCP-MD5 keys are gone, the
> > system gets back the performance of disabled md5-key static branch.
> >
> > While at here, provide static_key_fast_inc() helper that does ref
> > counter increment in atomic fashion (without grabbing cpus_read_lock()
> > on CONFIG_JUMP_LABEL=y). This is needed to add a new user for
> > a static_key when the caller controls the lifetime of another user.
> >
> > Signed-off-by: Dmitry Safonov <dima@arista.com>
> > Acked-by: Jakub Kicinski <kuba@kernel.org>
>
> Reviewed-by: Eric Dumazet <edumazet@google.com>

Hmm, I missed two kfree_rcu(key) calls, I will send the following fix:

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 7fae586405cfb10011a0674289280bf400dfa8d8..8320d0ecb13ae1e3e259f3c13a4c2797fbd984a5 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1245,7 +1245,7 @@ int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
 
			md5sig = rcu_dereference_protected(tp->md5sig_info, lockdep_sock_is_held(sk));
			rcu_assign_pointer(tp->md5sig_info, NULL);
-			kfree_rcu(md5sig);
+			kfree_rcu(md5sig, rcu);
			return -EUSERS;
		}
	}
@@ -1271,7 +1271,7 @@ int tcp_md5_key_copy(struct sock *sk, const union tcp_md5_addr *addr,
			md5sig = rcu_dereference_protected(tp->md5sig_info, lockdep_sock_is_held(sk));
			net_warn_ratelimited("Too many TCP-MD5 keys in the system\n");
			rcu_assign_pointer(tp->md5sig_info, NULL);
-			kfree_rcu(md5sig);
+			kfree_rcu(md5sig, rcu);
			return -EUSERS;
		}
	}

* Re: [PATCH v6 3/5] net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction
  2022-12-02  5:05     ` Eric Dumazet
@ 2022-12-02  5:34       ` Eric Dumazet
  0 siblings, 0 replies; 18+ messages in thread
From: Eric Dumazet @ 2022-12-02  5:34 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, David Ahern, Peter Zijlstra, Ard Biesheuvel,
	Bob Gilligan, David S. Miller, Dmitry Safonov, Francesco Ruggeri,
	Hideaki YOSHIFUJI, Jakub Kicinski, Jason Baron, Josh Poimboeuf,
	Paolo Abeni, Salam Noureddine, Steven Rostedt, netdev

On Fri, Dec 2, 2022 at 6:05 AM Eric Dumazet <edumazet@google.com> wrote:
>
> Hmm, I missed two kfree_rcu(key) calls, I will send the following fix:
>

https://patchwork.kernel.org/project/netdevbpf/patch/20221202052847.2623997-1-edumazet@google.com/

end of thread, other threads:[~2022-12-02  5:34 UTC | newest]

Thread overview: 18+ messages
2022-11-23 17:38 [PATCH v6 0/5] net/tcp: Dynamically disable TCP-MD5 static key Dmitry Safonov
2022-11-23 17:38 ` [PATCH v6 1/5] jump_label: Prevent key->enabled int overflow Dmitry Safonov
2022-11-25  7:59   ` Peter Zijlstra
2022-11-25 14:28     ` Dmitry Safonov
2022-12-01 22:31       ` Jakub Kicinski
2022-12-01 23:17         ` Dmitry Safonov
2022-12-01 23:36           ` Jakub Kicinski
2022-12-02  0:37             ` Dmitry Safonov
2022-11-23 17:38 ` [PATCH v6 2/5] net/tcp: Separate tcp_md5sig_info allocation into tcp_md5sig_info_add() Dmitry Safonov
2022-11-23 17:38 ` [PATCH v6 3/5] net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction Dmitry Safonov
2022-12-01 19:38   ` Eric Dumazet
2022-12-02  5:05     ` Eric Dumazet
2022-12-02  5:34       ` Eric Dumazet
2022-11-23 17:38 ` [PATCH v6 4/5] net/tcp: Do cleanup on tcp_md5_key_copy() failure Dmitry Safonov
2022-12-01 19:42   ` Eric Dumazet
2022-11-23 17:38 ` [PATCH v6 5/5] net/tcp: Separate initialization of twsk Dmitry Safonov
2022-12-01 19:44   ` Eric Dumazet
2022-12-02  4:10 ` [PATCH v6 0/5] net/tcp: Dynamically disable TCP-MD5 static key patchwork-bot+netdevbpf
