linux-kernel.vger.kernel.org archive mirror
* [PATCH net 0/4] mptcp: more fixes for v6.5
@ 2023-08-03 16:27 Matthieu Baerts
  2023-08-03 16:27 ` [PATCH net 1/4] selftests: mptcp: join: fix 'delete and re-add' test Matthieu Baerts
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Matthieu Baerts @ 2023-08-03 16:27 UTC
  To: mptcp, Mat Martineau, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Shuah Khan, Geliang Tang
  Cc: Andrea Claudi, netdev, linux-kselftest, linux-kernel,
	Matthieu Baerts, stable, Christoph Paasch

Here is a new batch of fixes related to MPTCP for v6.5 and older.

Patches 1 and 2 fix issues with MPTCP Join selftest when manually
launched with '-i' parameter to use 'ip mptcp' tool instead of the
dedicated one (pm_nl_ctl). The issues have been there since v5.18.

Thank you Andrea for your first contributions to MPTCP code in the
upstream kernel!

Patch 3 avoids corrupting the data stream when trying to reset
connections that have fallen back to TCP. This can happen from v6.1.

Patch 4 fixes a race between a disconnect() and an accept() running in
parallel on a listener socket. The race is rarely hit; it has been
possible since a fix that landed in v6.3 and was backported to v6.1.

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
---
Andrea Claudi (2):
      selftests: mptcp: join: fix 'delete and re-add' test
      selftests: mptcp: join: fix 'implicit EP' test

Paolo Abeni (2):
      mptcp: avoid bogus reset on fallback close
      mptcp: fix disconnect vs accept race

 net/mptcp/protocol.c                            |  2 +-
 net/mptcp/protocol.h                            |  1 -
 net/mptcp/subflow.c                             | 60 ++++++++++++-------------
 tools/testing/selftests/net/mptcp/mptcp_join.sh |  6 ++-
 4 files changed, 35 insertions(+), 34 deletions(-)
---
base-commit: 0f71c9caf26726efea674646f566984e735cc3b9
change-id: 20230803-upstream-net-20230803-misc-fixes-6-5-6046c6ca74b6

Best regards,
-- 
Matthieu Baerts <matthieu.baerts@tessares.net>



* [PATCH net 1/4] selftests: mptcp: join: fix 'delete and re-add' test
  2023-08-03 16:27 [PATCH net 0/4] mptcp: more fixes for v6.5 Matthieu Baerts
@ 2023-08-03 16:27 ` Matthieu Baerts
  2023-08-03 16:27 ` [PATCH net 2/4] selftests: mptcp: join: fix 'implicit EP' test Matthieu Baerts
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Matthieu Baerts @ 2023-08-03 16:27 UTC
  To: mptcp, Mat Martineau, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Shuah Khan, Geliang Tang
  Cc: Andrea Claudi, netdev, linux-kselftest, linux-kernel,
	Matthieu Baerts, stable

From: Andrea Claudi <aclaudi@redhat.com>

mptcp_join 'delete and re-add' test fails when using ip mptcp:

  $ ./mptcp_join.sh -iI
  <snip>
  002 delete and re-add                    before delete[ ok ]
                                           mptcp_info subflows=1         [ ok ]
  Error: argument "ADDRESS" is wrong: invalid for non-zero id address
                                           after delete[fail] got 2:2 subflows expected 1

This happens because endpoint delete includes an ip address while id is
not 0, contrary to what is indicated in the ip mptcp man page:

"When used with the delete id operation, an IFADDR is only included when
the ID is 0."

Fix the issue by passing the $addr variable in pm_nl_del_endpoint()
only when id is 0.

Fixes: 34aa6e3bccd8 ("selftests: mptcp: add ip mptcp wrappers")
Cc: stable@vger.kernel.org
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
---
 tools/testing/selftests/net/mptcp/mptcp_join.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index 3c2096ac97ef..067fabc401f1 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -705,6 +705,7 @@ pm_nl_del_endpoint()
 	local addr=$3
 
 	if [ $ip_mptcp -eq 1 ]; then
+		[ $id -ne 0 ] && addr=''
 		ip -n $ns mptcp endpoint delete id $id $addr
 	else
 		ip netns exec $ns ./pm_nl_ctl del $id $addr

-- 
2.40.1
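
A minimal sketch of the argument handling this patch introduces, written
as a standalone bash function. build_del_args is a hypothetical name used
only for illustration; it prints the arguments that would follow
"ip -n $ns mptcp endpoint delete" instead of running ip, so it can be
tried without a network namespace.

```shell
#!/bin/bash
# Hypothetical helper, not part of mptcp_join.sh: prints the arguments
# for an endpoint delete instead of invoking ip.
build_del_args()
{
	local id=$1
	local addr=$2

	# Same rule as the patch: an address is only passed when the id is 0.
	[ "$id" -ne 0 ] && addr=''

	# ${addr:+ $addr} expands to " $addr" only when addr is non-empty,
	# so no trailing space is left behind.
	echo "id $id${addr:+ $addr}"
}

build_del_args 0 10.0.2.2
build_del_args 2 10.0.2.2
```

With id 0 the address is kept; with any other id only the id is passed,
which is the form the ip mptcp man page describes.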



* [PATCH net 2/4] selftests: mptcp: join: fix 'implicit EP' test
  2023-08-03 16:27 [PATCH net 0/4] mptcp: more fixes for v6.5 Matthieu Baerts
  2023-08-03 16:27 ` [PATCH net 1/4] selftests: mptcp: join: fix 'delete and re-add' test Matthieu Baerts
@ 2023-08-03 16:27 ` Matthieu Baerts
  2023-08-03 16:27 ` [PATCH net 3/4] mptcp: avoid bogus reset on fallback close Matthieu Baerts
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Matthieu Baerts @ 2023-08-03 16:27 UTC
  To: mptcp, Mat Martineau, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Shuah Khan, Geliang Tang
  Cc: Andrea Claudi, netdev, linux-kselftest, linux-kernel,
	Matthieu Baerts, stable

From: Andrea Claudi <aclaudi@redhat.com>

mptcp_join 'implicit EP' test currently fails when using ip mptcp:

  $ ./mptcp_join.sh -iI
  <snip>
  001 implicit EP    creation[fail] expected '10.0.2.2 10.0.2.2 id 1 implicit' found '10.0.2.2 id 1 rawflags 10 '
  Error: too many addresses or duplicate one: -22.
                     ID change is prevented[fail] expected '10.0.2.2 10.0.2.2 id 1 implicit' found '10.0.2.2 id 1 rawflags 10 '
                     modif is allowed[fail] expected '10.0.2.2 10.0.2.2 id 1 signal' found '10.0.2.2 id 1 signal '

This happens for two reasons:
- iproute v6.3.0 does not support the implicit flag, fixed with
  iproute2-next commit 3a2535a41854 ("mptcp: add support for implicit
  flag")
- pm_nl_check_endpoint wrongly expects the ip address to appear twice
  in the iproute output, and does not account for the trailing
  whitespace in it.

Fix the issue by trimming the trailing whitespace from the output string
and removing the duplicated address from the expected string.

Fixes: 69c6ce7b6eca ("selftests: mptcp: add implicit endpoint test case")
Cc: stable@vger.kernel.org
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
---
 tools/testing/selftests/net/mptcp/mptcp_join.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index 067fabc401f1..d01b73a8ed0f 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -796,10 +796,11 @@ pm_nl_check_endpoint()
 	fi
 
 	if [ $ip_mptcp -eq 1 ]; then
+		# get line and trim trailing whitespace
 		line=$(ip -n $ns mptcp endpoint show $id)
+		line="${line% }"
 		# the dump order is: address id flags port dev
-		expected_line="$addr"
-		[ -n "$addr" ] && expected_line="$expected_line $addr"
+		[ -n "$addr" ] && expected_line="$addr"
 		expected_line="$expected_line $id"
 		[ -n "$_flags" ] && expected_line="$expected_line ${_flags//","/" "}"
 		[ -n "$dev" ] && expected_line="$expected_line $dev"

-- 
2.40.1
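
A minimal sketch of the comparison after this patch, as a standalone bash
snippet. build_expected_line is a hypothetical helper that takes a bare
numeric id, and the sample line is the 'found' string from the failure
log above; neither is the script's actual interface.

```shell
#!/bin/bash
# Hypothetical helper, not part of mptcp_join.sh: builds the expected
# "address id flags" string with the address listed once.
build_expected_line()
{
	local addr=$1 id=$2 flags=$3
	local expected_line=''

	[ -n "$addr" ] && expected_line="$addr"
	expected_line="$expected_line id $id"
	# turn a comma-separated flag list into a space-separated one
	[ -n "$flags" ] && expected_line="$expected_line ${flags//,/ }"
	echo "$expected_line"
}

# Sample 'ip mptcp endpoint show' output, including its trailing space.
line='10.0.2.2 id 1 signal '
# ${line% } strips a single trailing space, if present.
line="${line% }"

expected=$(build_expected_line 10.0.2.2 1 signal)
[ "$line" = "$expected" ] && echo "match"
```

Without the trim, the comparison fails on the trailing space alone even
when address, id and flags all agree.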



* [PATCH net 3/4] mptcp: avoid bogus reset on fallback close
  2023-08-03 16:27 [PATCH net 0/4] mptcp: more fixes for v6.5 Matthieu Baerts
  2023-08-03 16:27 ` [PATCH net 1/4] selftests: mptcp: join: fix 'delete and re-add' test Matthieu Baerts
  2023-08-03 16:27 ` [PATCH net 2/4] selftests: mptcp: join: fix 'implicit EP' test Matthieu Baerts
@ 2023-08-03 16:27 ` Matthieu Baerts
  2023-08-03 16:27 ` [PATCH net 4/4] mptcp: fix disconnect vs accept race Matthieu Baerts
  2023-08-05  1:30 ` [PATCH net 0/4] mptcp: more fixes for v6.5 patchwork-bot+netdevbpf
  4 siblings, 0 replies; 6+ messages in thread
From: Matthieu Baerts @ 2023-08-03 16:27 UTC
  To: mptcp, Mat Martineau, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Shuah Khan, Geliang Tang
  Cc: Andrea Claudi, netdev, linux-kselftest, linux-kernel,
	Matthieu Baerts, stable

From: Paolo Abeni <pabeni@redhat.com>

Since the blamed commit, the MPTCP protocol unconditionally sends
TCP resets on all the subflows on disconnect().

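That fits full-blown MPTCP sockets, where it implements the fastclose
mechanism, but on sockets that have fallen back to TCP it causes
unexpected corruption of the data stream, caught as sporadic selftest
failures.

Only force the reset when the msk has not fallen back to TCP.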

Fixes: d21f83485518 ("mptcp: use fastclose on more edge scenarios")
Cc: stable@vger.kernel.org
Tested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/419
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
---
 net/mptcp/protocol.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 3317d1cca156..ac7c11a5cbe5 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2335,7 +2335,7 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk,
 
 	lock_sock_nested(ssk, SINGLE_DEPTH_NESTING);
 
-	if (flags & MPTCP_CF_FASTCLOSE) {
+	if ((flags & MPTCP_CF_FASTCLOSE) && !__mptcp_check_fallback(msk)) {
 		/* be sure to force the tcp_disconnect() path,
 		 * to generate the egress reset
 		 */

-- 
2.40.1



* [PATCH net 4/4] mptcp: fix disconnect vs accept race
  2023-08-03 16:27 [PATCH net 0/4] mptcp: more fixes for v6.5 Matthieu Baerts
                   ` (2 preceding siblings ...)
  2023-08-03 16:27 ` [PATCH net 3/4] mptcp: avoid bogus reset on fallback close Matthieu Baerts
@ 2023-08-03 16:27 ` Matthieu Baerts
  2023-08-05  1:30 ` [PATCH net 0/4] mptcp: more fixes for v6.5 patchwork-bot+netdevbpf
  4 siblings, 0 replies; 6+ messages in thread
From: Matthieu Baerts @ 2023-08-03 16:27 UTC
  To: mptcp, Mat Martineau, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Shuah Khan, Geliang Tang
  Cc: Andrea Claudi, netdev, linux-kselftest, linux-kernel,
	Matthieu Baerts, stable, Christoph Paasch

From: Paolo Abeni <pabeni@redhat.com>

Despite commit 0ad529d9fd2b ("mptcp: fix possible divide by zero in
recvmsg()"), the mptcp protocol is still prone to a race between
disconnect() (or shutdown) and accept.

The root cause is that the mentioned commit checks the msk-level
flag, but mptcp_stream_accept() does not acquire the msk-level lock,
as it can rely directly on the first subflow lock.

As reported by Christoph, that can lead to a race where an msk
socket is accepted after mptcp_subflow_queue_clean() releases
the listener socket lock and just before it takes destructive
actions, leading to the following splat:

BUG: kernel NULL pointer dereference, address: 0000000000000012
PGD 5a4ca067 P4D 5a4ca067 PUD 37d4c067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
CPU: 2 PID: 10955 Comm: syz-executor.5 Not tainted 6.5.0-rc1-gdc7b257ee5dd #37
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
RIP: 0010:mptcp_stream_accept+0x1ee/0x2f0 include/net/inet_sock.h:330
Code: 0a 09 00 48 8b 1b 4c 39 e3 74 07 e8 bc 7c 7f fe eb a1 e8 b5 7c 7f fe 4c 8b 6c 24 08 eb 05 e8 a9 7c 7f fe 49 8b 85 d8 09 00 00 <0f> b6 40 12 88 44 24 07 0f b6 6c 24 07 bf 07 00 00 00 89 ee e8 89
RSP: 0018:ffffc90000d07dc0 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff888037e8d020 RCX: ffff88803b093300
RDX: 0000000000000000 RSI: ffffffff833822c5 RDI: ffffffff8333896a
RBP: 0000607f82031520 R08: ffff88803b093300 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000003e83 R12: ffff888037e8d020
R13: ffff888037e8c680 R14: ffff888009af7900 R15: ffff888009af6880
FS:  00007fc26d708640(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000012 CR3: 0000000066bc5001 CR4: 0000000000370ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 do_accept+0x1ae/0x260 net/socket.c:1872
 __sys_accept4+0x9b/0x110 net/socket.c:1913
 __do_sys_accept4 net/socket.c:1954 [inline]
 __se_sys_accept4 net/socket.c:1951 [inline]
 __x64_sys_accept4+0x20/0x30 net/socket.c:1951
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x47/0xa0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8

Address the issue by temporarily removing the pending request sockets
from the accept queue, so that a racing accept() can't touch them.

After depleting the msk, the ssks still exist as plain TCP sockets:
re-insert them into the accept queue, so that a later
inet_csk_listen_stop() will complete the TCP socket disposal.

Fixes: 2a6a870e44dd ("mptcp: stops worker on unaccepted sockets at listener close")
Cc: stable@vger.kernel.org
Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/423
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
---
 net/mptcp/protocol.h |  1 -
 net/mptcp/subflow.c  | 60 ++++++++++++++++++++++++++--------------------------
 2 files changed, 30 insertions(+), 31 deletions(-)

diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 37fbe22e2433..ba2a873a4d2e 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -325,7 +325,6 @@ struct mptcp_sock {
 	u32		subflow_id;
 	u32		setsockopt_seq;
 	char		ca_name[TCP_CA_NAME_MAX];
-	struct mptcp_sock	*dl_next;
 };
 
 #define mptcp_data_lock(sk) spin_lock_bh(&(sk)->sk_lock.slock)
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 9ee3b7abbaf6..94ae7dd01c65 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1793,34 +1793,21 @@ static void subflow_state_change(struct sock *sk)
 void mptcp_subflow_queue_clean(struct sock *listener_sk, struct sock *listener_ssk)
 {
 	struct request_sock_queue *queue = &inet_csk(listener_ssk)->icsk_accept_queue;
-	struct mptcp_sock *msk, *next, *head = NULL;
-	struct request_sock *req;
-	struct sock *sk;
+	struct request_sock *req, *head, *tail;
+	struct mptcp_subflow_context *subflow;
+	struct sock *sk, *ssk;
 
-	/* build a list of all unaccepted mptcp sockets */
+	/* Due to lock dependencies no relevant lock can be acquired under rskq_lock.
+	 * Splice the req list, so that accept() can not reach the pending ssk after
+	 * the listener socket is released below.
+	 */
 	spin_lock_bh(&queue->rskq_lock);
-	for (req = queue->rskq_accept_head; req; req = req->dl_next) {
-		struct mptcp_subflow_context *subflow;
-		struct sock *ssk = req->sk;
-
-		if (!sk_is_mptcp(ssk))
-			continue;
-
-		subflow = mptcp_subflow_ctx(ssk);
-		if (!subflow || !subflow->conn)
-			continue;
-
-		/* skip if already in list */
-		sk = subflow->conn;
-		msk = mptcp_sk(sk);
-		if (msk->dl_next || msk == head)
-			continue;
-
-		sock_hold(sk);
-		msk->dl_next = head;
-		head = msk;
-	}
+	head = queue->rskq_accept_head;
+	tail = queue->rskq_accept_tail;
+	queue->rskq_accept_head = NULL;
+	queue->rskq_accept_tail = NULL;
 	spin_unlock_bh(&queue->rskq_lock);
+
 	if (!head)
 		return;
 
@@ -1829,13 +1816,19 @@ void mptcp_subflow_queue_clean(struct sock *listener_sk, struct sock *listener_s
 	 */
 	release_sock(listener_ssk);
 
-	for (msk = head; msk; msk = next) {
-		sk = (struct sock *)msk;
+	for (req = head; req; req = req->dl_next) {
+		ssk = req->sk;
+		if (!sk_is_mptcp(ssk))
+			continue;
+
+		subflow = mptcp_subflow_ctx(ssk);
+		if (!subflow || !subflow->conn)
+			continue;
+
+		sk = subflow->conn;
+		sock_hold(sk);
 
 		lock_sock_nested(sk, SINGLE_DEPTH_NESTING);
-		next = msk->dl_next;
-		msk->dl_next = NULL;
-
 		__mptcp_unaccepted_force_close(sk);
 		release_sock(sk);
 
@@ -1859,6 +1852,13 @@ void mptcp_subflow_queue_clean(struct sock *listener_sk, struct sock *listener_s
 
 	/* we are still under the listener msk socket lock */
 	lock_sock_nested(listener_ssk, SINGLE_DEPTH_NESTING);
+
+	/* restore the listener queue, to let the TCP code clean it up */
+	spin_lock_bh(&queue->rskq_lock);
+	WARN_ON_ONCE(queue->rskq_accept_head);
+	queue->rskq_accept_head = head;
+	queue->rskq_accept_tail = tail;
+	spin_unlock_bh(&queue->rskq_lock);
 }
 
 static int subflow_ulp_init(struct sock *sk)

-- 
2.40.1



* Re: [PATCH net 0/4] mptcp: more fixes for v6.5
  2023-08-03 16:27 [PATCH net 0/4] mptcp: more fixes for v6.5 Matthieu Baerts
                   ` (3 preceding siblings ...)
  2023-08-03 16:27 ` [PATCH net 4/4] mptcp: fix disconnect vs accept race Matthieu Baerts
@ 2023-08-05  1:30 ` patchwork-bot+netdevbpf
  4 siblings, 0 replies; 6+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-08-05  1:30 UTC
  To: Matthieu Baerts
  Cc: mptcp, martineau, davem, edumazet, kuba, pabeni, shuah,
	geliang.tang, aclaudi, netdev, linux-kselftest, linux-kernel,
	stable, cpaasch

Hello:

This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Thu, 03 Aug 2023 18:27:26 +0200 you wrote:
> Here is a new batch of fixes related to MPTCP for v6.5 and older.
> 
> Patches 1 and 2 fix issues with MPTCP Join selftest when manually
> launched with '-i' parameter to use 'ip mptcp' tool instead of the
> dedicated one (pm_nl_ctl). The issues have been there since v5.18.
> 
> Thank you Andrea for your first contributions to MPTCP code in the
> upstream kernel!
> 
> [...]

Here is the summary with links:
  - [net,1/4] selftests: mptcp: join: fix 'delete and re-add' test
    https://git.kernel.org/netdev/net/c/aaf2123a5cf4
  - [net,2/4] selftests: mptcp: join: fix 'implicit EP' test
    https://git.kernel.org/netdev/net/c/c8c101ae390a
  - [net,3/4] mptcp: avoid bogus reset on fallback close
    https://git.kernel.org/netdev/net/c/ff18f9ef30ee
  - [net,4/4] mptcp: fix disconnect vs accept race
    https://git.kernel.org/netdev/net/c/511b90e39250

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


