* [PATCH mptcp-net] mptcp: fix disconnect vs accept race
@ 2023-07-31 22:25 Paolo Abeni
  2023-07-31 23:30 ` mptcp: fix disconnect vs accept race: Tests Results MPTCP CI
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Paolo Abeni @ 2023-07-31 22:25 UTC (permalink / raw)
  To: mptcp; +Cc: Christoph Paasch

Despite commit 0ad529d9fd2b ("mptcp: fix possible divide by zero in
recvmsg()"), the mptcp protocol is still prone to a race between
disconnect() (or shutdown()) and accept().

The root cause is that the mentioned commit checks the msk-level
flag, but mptcp_stream_accept() does not acquire the msk-level lock,
as it can rely directly on the first subflow lock.

As reported by Christoph, that can lead to a race where an msk
socket is accepted after mptcp_subflow_queue_clean() releases
the listener socket lock and just before it takes destructive
actions, leading to the following splat:

BUG: kernel NULL pointer dereference, address: 0000000000000012
PGD 5a4ca067 P4D 5a4ca067 PUD 37d4c067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
CPU: 2 PID: 10955 Comm: syz-executor.5 Not tainted 6.5.0-rc1-gdc7b257ee5dd #37
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
RIP: 0010:mptcp_stream_accept+0x1ee/0x2f0 include/net/inet_sock.h:330
Code: 0a 09 00 48 8b 1b 4c 39 e3 74 07 e8 bc 7c 7f fe eb a1 e8 b5 7c 7f fe 4c 8b 6c 24 08 eb 05 e8 a9 7c 7f fe 49 8b 85 d8 09 00 00 <0f> b6 40 12 88 44 24 07 0f b6 6c 24 07 bf 07 00 00 00 89 ee e8 89
RSP: 0018:ffffc90000d07dc0 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff888037e8d020 RCX: ffff88803b093300
RDX: 0000000000000000 RSI: ffffffff833822c5 RDI: ffffffff8333896a
RBP: 0000607f82031520 R08: ffff88803b093300 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000003e83 R12: ffff888037e8d020
R13: ffff888037e8c680 R14: ffff888009af7900 R15: ffff888009af6880
FS:  00007fc26d708640(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000012 CR3: 0000000066bc5001 CR4: 0000000000370ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 do_accept+0x1ae/0x260 net/socket.c:1872
 __sys_accept4+0x9b/0x110 net/socket.c:1913
 __do_sys_accept4 net/socket.c:1954 [inline]
 __se_sys_accept4 net/socket.c:1951 [inline]
 __x64_sys_accept4+0x20/0x30 net/socket.c:1951
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x47/0xa0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8

Address the issue by temporarily removing the pending request sockets
from the accept queue, so that a racing accept() can't touch them.

After depleting the msk, the ssks still exist as plain TCP sockets;
re-insert them into the accept queue, so that the later
inet_csk_listen_stop() will complete the TCP socket disposal.

Fixes: 2a6a870e44dd ("mptcp: stops worker on unaccepted sockets at listener close")
Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/423
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
Note: as a net-next follow-up it would be nice to support mptcp socket
migration, basically duplicating part of the inet_csk_listen_stop() logic
---
 net/mptcp/protocol.h |  1 -
 net/mptcp/subflow.c  | 58 ++++++++++++++++++++++----------------------
 2 files changed, 29 insertions(+), 30 deletions(-)

diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index a7db82eb4eea..1ce9b1a1fb56 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -324,7 +324,6 @@ struct mptcp_sock {
 	u32		subflow_id;
 	u32		setsockopt_seq;
 	char		ca_name[TCP_CA_NAME_MAX];
-	struct mptcp_sock	*dl_next;
 };
 
 #define mptcp_data_lock(sk) spin_lock_bh(&(sk)->sk_lock.slock)
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index ad7080fabb2f..9bf3c7bc1762 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1793,16 +1793,31 @@ static void subflow_state_change(struct sock *sk)
 void mptcp_subflow_queue_clean(struct sock *listener_sk, struct sock *listener_ssk)
 {
 	struct request_sock_queue *queue = &inet_csk(listener_ssk)->icsk_accept_queue;
-	struct mptcp_sock *msk, *next, *head = NULL;
-	struct request_sock *req;
-	struct sock *sk;
+	struct request_sock *req, *head, *tail;
+	struct mptcp_subflow_context *subflow;
+	struct sock *sk, *ssk;
 
-	/* build a list of all unaccepted mptcp sockets */
+	/* Due to lock dependencies no relevant lock can be acquired under rskq_lock.
+	 * Splice the req list, so that accept() can not reach the pending ssk after
+	 * the listener socket is released below.
+	 */
 	spin_lock_bh(&queue->rskq_lock);
-	for (req = queue->rskq_accept_head; req; req = req->dl_next) {
-		struct mptcp_subflow_context *subflow;
-		struct sock *ssk = req->sk;
+	head = queue->rskq_accept_head;
+	tail = queue->rskq_accept_tail;
+	queue->rskq_accept_head = NULL;
+	queue->rskq_accept_tail = NULL;
+	spin_unlock_bh(&queue->rskq_lock);
+
+	if (!head)
+		return;
 
+	/* can't acquire the msk socket lock under the subflow one,
+	 * or will cause ABBA deadlock
+	 */
+	release_sock(listener_ssk);
+
+	for (req = head; req; req = req->dl_next) {
+		ssk = req->sk;
 		if (!sk_is_mptcp(ssk))
 			continue;
 
@@ -1810,32 +1825,10 @@ void mptcp_subflow_queue_clean(struct sock *listener_sk, struct sock *listener_s
 		if (!subflow || !subflow->conn)
 			continue;
 
-		/* skip if already in list */
 		sk = subflow->conn;
-		msk = mptcp_sk(sk);
-		if (msk->dl_next || msk == head)
-			continue;
-
 		sock_hold(sk);
-		msk->dl_next = head;
-		head = msk;
-	}
-	spin_unlock_bh(&queue->rskq_lock);
-	if (!head)
-		return;
-
-	/* can't acquire the msk socket lock under the subflow one,
-	 * or will cause ABBA deadlock
-	 */
-	release_sock(listener_ssk);
-
-	for (msk = head; msk; msk = next) {
-		sk = (struct sock *)msk;
 
 		lock_sock_nested(sk, SINGLE_DEPTH_NESTING);
-		next = msk->dl_next;
-		msk->dl_next = NULL;
-
 		__mptcp_unaccepted_force_close(sk);
 		release_sock(sk);
 
@@ -1859,6 +1852,13 @@ void mptcp_subflow_queue_clean(struct sock *listener_sk, struct sock *listener_s
 
 	/* we are still under the listener msk socket lock */
 	lock_sock_nested(listener_ssk, SINGLE_DEPTH_NESTING);
+
+	/* restore the listener queue, to let the TCP code clean it up */
+	spin_lock_bh(&queue->rskq_lock);
+	WARN_ON_ONCE(queue->rskq_accept_head);
+	queue->rskq_accept_head = head;
+	queue->rskq_accept_tail = tail;
+	spin_unlock_bh(&queue->rskq_lock);
 }
 
 static int subflow_ulp_init(struct sock *sk)
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: mptcp: fix disconnect vs accept race: Tests Results
  2023-07-31 22:25 [PATCH mptcp-net] mptcp: fix disconnect vs accept race Paolo Abeni
@ 2023-07-31 23:30 ` MPTCP CI
  2023-08-01 10:34 ` [PATCH mptcp-net] mptcp: fix disconnect vs accept race Matthieu Baerts
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: MPTCP CI @ 2023-07-31 23:30 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: mptcp

Hi Paolo,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join):
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/6609590159147008
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/6609590159147008/summary/summary.txt

- KVM Validation: normal (only selftest_mptcp_join):
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/4780002810527744
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/4780002810527744/summary/summary.txt

- KVM Validation: debug (except selftest_mptcp_join):
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/5905902717370368
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/5905902717370368/summary/summary.txt

- KVM Validation: debug (only selftest_mptcp_join):
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/5342952763949056
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/5342952763949056/summary/summary.txt

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/2cd41a724bae


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-debug

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts already made to have a stable
test suite when executed on a public CI like here, it is possible that
some reported issues are not due to your modifications. Still, do not
hesitate to help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (Tessares)


* Re: [PATCH mptcp-net] mptcp: fix disconnect vs accept race
  2023-07-31 22:25 [PATCH mptcp-net] mptcp: fix disconnect vs accept race Paolo Abeni
  2023-07-31 23:30 ` mptcp: fix disconnect vs accept race: Tests Results MPTCP CI
@ 2023-08-01 10:34 ` Matthieu Baerts
  2023-08-01 13:59   ` Paolo Abeni
  2023-08-01 14:35 ` Matthieu Baerts
  2023-08-01 16:09 ` mptcp: fix disconnect vs accept race: Tests Results MPTCP CI
  3 siblings, 1 reply; 7+ messages in thread
From: Matthieu Baerts @ 2023-08-01 10:34 UTC (permalink / raw)
  To: Paolo Abeni, mptcp; +Cc: Christoph Paasch

Hi Paolo,

On 01/08/2023 00:25, Paolo Abeni wrote:
> Despite commit 0ad529d9fd2b ("mptcp: fix possible divide by zero in
> recvmsg()"), the mptcp protocol is still prone to a race between
> disconnect() (or shutdown()) and accept().
> 
> The root cause is that the mentioned commit checks the msk-level
> flag, but mptcp_stream_accept() does not acquire the msk-level lock,
> as it can rely directly on the first subflow lock.
> 
> As reported by Christoph, that can lead to a race where an msk
> socket is accepted after mptcp_subflow_queue_clean() releases
> the listener socket lock and just before it takes destructive
> actions, leading to the following splat:

(...)

> Address the issue by temporarily removing the pending request sockets
> from the accept queue, so that a racing accept() can't touch them.
> 
> After depleting the msk, the ssks still exist as plain TCP sockets;
> re-insert them into the accept queue, so that the later
> inet_csk_listen_stop() will complete the TCP socket disposal.

Thank you for working on this fix and sharing this patch!

I just have one question at the end of your patch:

> Fixes: 2a6a870e44dd ("mptcp: stops worker on unaccepted sockets at listener close")
> Reported-by: Christoph Paasch <cpaasch@apple.com>
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/423
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> Note: as a net-next follow-up it would be nice to support mptcp socket
> migration, basically duplicating part of the inet_csk_listen_stop() logic

(As you know, I'm always worried when you talk about duplicating code :) )

> diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
> index a7db82eb4eea..1ce9b1a1fb56 100644
> --- a/net/mptcp/protocol.h
> +++ b/net/mptcp/protocol.h
> @@ -324,7 +324,6 @@ struct mptcp_sock {
>  	u32		subflow_id;
>  	u32		setsockopt_seq;
>  	char		ca_name[TCP_CA_NAME_MAX];
> -	struct mptcp_sock	*dl_next;
>  };
>  
>  #define mptcp_data_lock(sk) spin_lock_bh(&(sk)->sk_lock.slock)
> diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
> index ad7080fabb2f..9bf3c7bc1762 100644
> --- a/net/mptcp/subflow.c
> +++ b/net/mptcp/subflow.c
> @@ -1793,16 +1793,31 @@ static void subflow_state_change(struct sock *sk)
>  void mptcp_subflow_queue_clean(struct sock *listener_sk, struct sock *listener_ssk)
>  {
>  	struct request_sock_queue *queue = &inet_csk(listener_ssk)->icsk_accept_queue;
> -	struct mptcp_sock *msk, *next, *head = NULL;
> -	struct request_sock *req;
> -	struct sock *sk;
> +	struct request_sock *req, *head, *tail;
> +	struct mptcp_subflow_context *subflow;
> +	struct sock *sk, *ssk;
>  
> -	/* build a list of all unaccepted mptcp sockets */
> +	/* Due to lock dependencies no relevant lock can be acquired under rskq_lock.
> +	 * Splice the req list, so that accept() can not reach the pending ssk after
> +	 * the listener socket is released below.
> +	 */
>  	spin_lock_bh(&queue->rskq_lock);
> -	for (req = queue->rskq_accept_head; req; req = req->dl_next) {
> -		struct mptcp_subflow_context *subflow;
> -		struct sock *ssk = req->sk;
> +	head = queue->rskq_accept_head;
> +	tail = queue->rskq_accept_tail;
> +	queue->rskq_accept_head = NULL;
> +	queue->rskq_accept_tail = NULL;
> +	spin_unlock_bh(&queue->rskq_lock);
> +
> +	if (!head)
> +		return;
>  
> +	/* can't acquire the msk socket lock under the subflow one,
> +	 * or will cause ABBA deadlock
> +	 */
> +	release_sock(listener_ssk);
> +
> +	for (req = head; req; req = req->dl_next) {
> +		ssk = req->sk;
>  		if (!sk_is_mptcp(ssk))
>  			continue;
>  
> @@ -1810,32 +1825,10 @@ void mptcp_subflow_queue_clean(struct sock *listener_sk, struct sock *listener_s
>  		if (!subflow || !subflow->conn)
>  			continue;
>  
> -		/* skip if already in list */
>  		sk = subflow->conn;
> -		msk = mptcp_sk(sk);
> -		if (msk->dl_next || msk == head)
> -			continue;
> -
>  		sock_hold(sk);
> -		msk->dl_next = head;
> -		head = msk;
> -	}
> -	spin_unlock_bh(&queue->rskq_lock);
> -	if (!head)
> -		return;
> -
> -	/* can't acquire the msk socket lock under the subflow one,
> -	 * or will cause ABBA deadlock
> -	 */
> -	release_sock(listener_ssk);
> -
> -	for (msk = head; msk; msk = next) {
> -		sk = (struct sock *)msk;
>  
>  		lock_sock_nested(sk, SINGLE_DEPTH_NESTING);
> -		next = msk->dl_next;
> -		msk->dl_next = NULL;
> -
>  		__mptcp_unaccepted_force_close(sk);
>  		release_sock(sk);
>  
> @@ -1859,6 +1852,13 @@ void mptcp_subflow_queue_clean(struct sock *listener_sk, struct sock *listener_s
>  
>  	/* we are still under the listener msk socket lock */
>  	lock_sock_nested(listener_ssk, SINGLE_DEPTH_NESTING);
> +
> +	/* restore the listener queue, to let the TCP code clean it up */
> +	spin_lock_bh(&queue->rskq_lock);
> +	WARN_ON_ONCE(queue->rskq_accept_head);

What prevents the accept queue from being modified between the moment it
has been "nullified" (at the beginning of this function) and here (at
the end)?

I see that both the locks for the queue and the socket have been
released and re-acquired in between: could this accept queue be modified
in between if we are unlucky?

I'm sure I'm missing something :)

> +	queue->rskq_accept_head = head;
> +	queue->rskq_accept_tail = tail;
> +	spin_unlock_bh(&queue->rskq_lock);
>  }
>  
>  static int subflow_ulp_init(struct sock *sk)

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH mptcp-net] mptcp: fix disconnect vs accept race
  2023-08-01 10:34 ` [PATCH mptcp-net] mptcp: fix disconnect vs accept race Matthieu Baerts
@ 2023-08-01 13:59   ` Paolo Abeni
  2023-08-01 14:16     ` Matthieu Baerts
  0 siblings, 1 reply; 7+ messages in thread
From: Paolo Abeni @ 2023-08-01 13:59 UTC (permalink / raw)
  To: Matthieu Baerts, mptcp; +Cc: Christoph Paasch

On Tue, 2023-08-01 at 12:34 +0200, Matthieu Baerts wrote:
> Hi Paolo,
> 
> On 01/08/2023 00:25, Paolo Abeni wrote:
> > Despite commit 0ad529d9fd2b ("mptcp: fix possible divide by zero in
> > recvmsg()"), the mptcp protocol is still prone to a race between
> > disconnect() (or shutdown()) and accept().
> > 
> > The root cause is that the mentioned commit checks the msk-level
> > flag, but mptcp_stream_accept() does not acquire the msk-level lock,
> > as it can rely directly on the first subflow lock.
> > 
> > As reported by Christoph, that can lead to a race where an msk
> > socket is accepted after mptcp_subflow_queue_clean() releases
> > the listener socket lock and just before it takes destructive
> > actions, leading to the following splat:
> 
> (...)
> 
> > Address the issue by temporarily removing the pending request sockets
> > from the accept queue, so that a racing accept() can't touch them.
> > 
> > After depleting the msk, the ssks still exist as plain TCP sockets;
> > re-insert them into the accept queue, so that the later
> > inet_csk_listen_stop() will complete the TCP socket disposal.
> 
> Thank you for working on this fix and sharing this patch!
> 
> I just have one question at the end of your patch:
> 
> > Fixes: 2a6a870e44dd ("mptcp: stops worker on unaccepted sockets at listener close")
> > Reported-by: Christoph Paasch <cpaasch@apple.com>
> > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/423
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > ---
> > Note: as a net-next follow-up it would be nice to support mptcp socket
> > migration, basically duplicating part of the inet_csk_listen_stop() logic
> 
> (As you know, I'm always worried when you talk about duplicating code :) )

I know ;)

Probably I should have written something like "implement quite similar
code to" ... which is actually more accurate.

> > @@ -1859,6 +1852,13 @@ void mptcp_subflow_queue_clean(struct sock *listener_sk, struct sock *listener_s
> >  
> >  	/* we are still under the listener msk socket lock */
> >  	lock_sock_nested(listener_ssk, SINGLE_DEPTH_NESTING);
> > +
> > +	/* restore the listener queue, to let the TCP code clean it up */
> > +	spin_lock_bh(&queue->rskq_lock);
> > +	WARN_ON_ONCE(queue->rskq_accept_head);
> 
> What prevents the accept queue from being modified between the moment
> it has been "nullified" (at the beginning of this function) and here
> (at the end)?
> 
> I see that both the locks for the queue and the socket have been
> released and re-acquired in between: could this accept queue be
> modified in between if we are unlucky?
> 
> I'm sure I'm missing something :)

The tcp listener socket is unhashed by tcp_set_state(TCP_CLOSE) before
mptcp_subflow_queue_clean() starts. Later lookups can't find the
listener socket at all, so they can't attach any more req sockets
there.

Put another way, if the stack were able to reach the tcp listener
socket here, both mptcp and plain tcp could leak incoming request
sockets at destroy time (because new reqs could be appended after
flushing the queue).

The WARN_ON_ONCE() should be really over-defensive.

Cheers,

Paolo



* Re: [PATCH mptcp-net] mptcp: fix disconnect vs accept race
  2023-08-01 13:59   ` Paolo Abeni
@ 2023-08-01 14:16     ` Matthieu Baerts
  0 siblings, 0 replies; 7+ messages in thread
From: Matthieu Baerts @ 2023-08-01 14:16 UTC (permalink / raw)
  To: Paolo Abeni, mptcp; +Cc: Christoph Paasch

Hi Paolo,

On 01/08/2023 15:59, Paolo Abeni wrote:
> On Tue, 2023-08-01 at 12:34 +0200, Matthieu Baerts wrote:
>> Hi Paolo,
>>
>> On 01/08/2023 00:25, Paolo Abeni wrote:
>>> Despite commit 0ad529d9fd2b ("mptcp: fix possible divide by zero in
>>> recvmsg()"), the mptcp protocol is still prone to a race between
>>> disconnect() (or shutdown()) and accept().
>>>
>>> The root cause is that the mentioned commit checks the msk-level
>>> flag, but mptcp_stream_accept() does not acquire the msk-level lock,
>>> as it can rely directly on the first subflow lock.
>>>
>>> As reported by Christoph, that can lead to a race where an msk
>>> socket is accepted after mptcp_subflow_queue_clean() releases
>>> the listener socket lock and just before it takes destructive
>>> actions, leading to the following splat:
>>
>> (...)
>>
>>> Address the issue by temporarily removing the pending request sockets
>>> from the accept queue, so that a racing accept() can't touch them.
>>>
>>> After depleting the msk, the ssks still exist as plain TCP sockets;
>>> re-insert them into the accept queue, so that the later
>>> inet_csk_listen_stop() will complete the TCP socket disposal.
>>
>> Thank you for working on this fix and sharing this patch!
>>
>> I just have one question at the end of your patch:
>>
>>> Fixes: 2a6a870e44dd ("mptcp: stops worker on unaccepted sockets at listener close")
>>> Reported-by: Christoph Paasch <cpaasch@apple.com>
>>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/423
>>> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>>> ---
>>> Note: as a net-next follow-up it would be nice to support mptcp socket
>>> migration, basically duplicating part of the inet_csk_listen_stop() logic
>>
>> (As you know, I'm always worried when you talk about duplicating code :) )
> 
> I know ;)
> 
> Probably I should have written something like "implement quite similar
> code to" ... which is actually more accurate.

Good, sounds better :-)

>>> @@ -1859,6 +1852,13 @@ void mptcp_subflow_queue_clean(struct sock *listener_sk, struct sock *listener_s
>>>  
>>>  	/* we are still under the listener msk socket lock */
>>>  	lock_sock_nested(listener_ssk, SINGLE_DEPTH_NESTING);
>>> +
>>> +	/* restore the listener queue, to let the TCP code clean it up */
>>> +	spin_lock_bh(&queue->rskq_lock);
>>> +	WARN_ON_ONCE(queue->rskq_accept_head);
>>
>> What prevents the accept queue from being modified between the moment
>> it has been "nullified" (at the beginning of this function) and here
>> (at the end)?
>>
>> I see that both the locks for the queue and the socket have been
>> released and re-acquired in between: could this accept queue be
>> modified in between if we are unlucky?
>>
>> I'm sure I'm missing something :)
> 
> The tcp listener socket is unhashed by tcp_set_state(TCP_CLOSE) before
> mptcp_subflow_queue_clean() starts. Later lookups can't find the
> listener socket at all, so they can't attach any more req sockets
> there.
> 
> Put another way, if the stack were able to reach the tcp listener
> socket here, both mptcp and plain tcp could leak incoming request
> sockets at destroy time (because new reqs could be appended after
> flushing the queue).

Thank you for the explanations! All safe then, it looks good to me:

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>

Does it mean we could even drop the locks around the modification of the
queue? (But better to keep them)

> The WARN_ON_ONCE() should be really over-defensive.

Yes but it still makes sense, no problem to keep it!

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net


* Re: [PATCH mptcp-net] mptcp: fix disconnect vs accept race
  2023-07-31 22:25 [PATCH mptcp-net] mptcp: fix disconnect vs accept race Paolo Abeni
  2023-07-31 23:30 ` mptcp: fix disconnect vs accept race: Tests Results MPTCP CI
  2023-08-01 10:34 ` [PATCH mptcp-net] mptcp: fix disconnect vs accept race Matthieu Baerts
@ 2023-08-01 14:35 ` Matthieu Baerts
  2023-08-01 16:09 ` mptcp: fix disconnect vs accept race: Tests Results MPTCP CI
  3 siblings, 0 replies; 7+ messages in thread
From: Matthieu Baerts @ 2023-08-01 14:35 UTC (permalink / raw)
  To: Paolo Abeni, mptcp; +Cc: Christoph Paasch

Hi Paolo,

On 01/08/2023 00:25, Paolo Abeni wrote:
> Despite commit 0ad529d9fd2b ("mptcp: fix possible divide by zero in
> recvmsg()"), the mptcp protocol is still prone to a race between
> disconnect() (or shutdown()) and accept().
> 
> The root cause is that the mentioned commit checks the msk-level
> flag, but mptcp_stream_accept() does not acquire the msk-level lock,
> as it can rely directly on the first subflow lock.
> 
> As reported by Christoph, that can lead to a race where an msk
> socket is accepted after mptcp_subflow_queue_clean() releases
> the listener socket lock and just before it takes destructive
> actions, leading to the following splat:
> 
> BUG: kernel NULL pointer dereference, address: 0000000000000012
> PGD 5a4ca067 P4D 5a4ca067 PUD 37d4c067 PMD 0
> Oops: 0000 [#1] PREEMPT SMP
> CPU: 2 PID: 10955 Comm: syz-executor.5 Not tainted 6.5.0-rc1-gdc7b257ee5dd #37
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
> RIP: 0010:mptcp_stream_accept+0x1ee/0x2f0 include/net/inet_sock.h:330
> Code: 0a 09 00 48 8b 1b 4c 39 e3 74 07 e8 bc 7c 7f fe eb a1 e8 b5 7c 7f fe 4c 8b 6c 24 08 eb 05 e8 a9 7c 7f fe 49 8b 85 d8 09 00 00 <0f> b6 40 12 88 44 24 07 0f b6 6c 24 07 bf 07 00 00 00 89 ee e8 89
> RSP: 0018:ffffc90000d07dc0 EFLAGS: 00010293
> RAX: 0000000000000000 RBX: ffff888037e8d020 RCX: ffff88803b093300
> RDX: 0000000000000000 RSI: ffffffff833822c5 RDI: ffffffff8333896a
> RBP: 0000607f82031520 R08: ffff88803b093300 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000003e83 R12: ffff888037e8d020
> R13: ffff888037e8c680 R14: ffff888009af7900 R15: ffff888009af6880
> FS:  00007fc26d708640(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000012 CR3: 0000000066bc5001 CR4: 0000000000370ee0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <TASK>
>  do_accept+0x1ae/0x260 net/socket.c:1872
>  __sys_accept4+0x9b/0x110 net/socket.c:1913
>  __do_sys_accept4 net/socket.c:1954 [inline]
>  __se_sys_accept4 net/socket.c:1951 [inline]
>  __x64_sys_accept4+0x20/0x30 net/socket.c:1951
>  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>  do_syscall_64+0x47/0xa0 arch/x86/entry/common.c:80
>  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> 
> Address the issue by temporarily removing the pending request sockets
> from the accept queue, so that a racing accept() can't touch them.
> 
> After depleting the msk, the ssks still exist as plain TCP sockets;
> re-insert them into the accept queue, so that the later
> inet_csk_listen_stop() will complete the TCP socket disposal.
> 
> Fixes: 2a6a870e44dd ("mptcp: stops worker on unaccepted sockets at listener close")
> Reported-by: Christoph Paasch <cpaasch@apple.com>
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/423
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Thank you for the patch!

Now in our tree (fix for -net):

New patches for t/upstream-net and t/upstream:
- bd950ed266b6: mptcp: fix disconnect vs accept race
- Results: 7096265245f8..7e8e603725ca (export-net)
- Results: 42fa6a1925b9..829c05bb214b (export)

Tests are now in progress:

https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export-net/20230801T143213
https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export/20230801T143213

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net


* Re: mptcp: fix disconnect vs accept race: Tests Results
  2023-07-31 22:25 [PATCH mptcp-net] mptcp: fix disconnect vs accept race Paolo Abeni
                   ` (2 preceding siblings ...)
  2023-08-01 14:35 ` Matthieu Baerts
@ 2023-08-01 16:09 ` MPTCP CI
  3 siblings, 0 replies; 7+ messages in thread
From: MPTCP CI @ 2023-08-01 16:09 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: mptcp

Hi Paolo,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join):
  - Unstable: 1 failed test(s): selftest_simult_flows 🔴:
  - Task: https://cirrus-ci.com/task/5488088298815488
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/5488088298815488/summary/summary.txt

- KVM Validation: debug (only selftest_mptcp_join):
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/4662034151768064
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/4662034151768064/summary/summary.txt

- KVM Validation: normal (only selftest_mptcp_join):
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/5330855518797824
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/5330855518797824/summary/summary.txt

- KVM Validation: debug (except selftest_mptcp_join):
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/5816899318054912
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/5816899318054912/summary/summary.txt

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/ab577d790497


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-debug

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts already made to have a stable
test suite when executed on a public CI like here, it is possible that
some reported issues are not due to your modifications. Still, do not
hesitate to help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (Tessares)

