netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
@ 2022-11-14 19:16 Jakub Sitnicki
  2022-11-15 11:12 ` Tom Parkin
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Jakub Sitnicki @ 2022-11-14 19:16 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Tom Parkin, Haowei Yan

sk->sk_user_data has multiple users, which are not compatible with each
other. Writers must synchronize by grabbing the sk->sk_callback_lock.

l2tp currently fails to grab the lock when modifying the underlying tunnel
socket fields. Fix it by adding appropriate locking.

We err on the side of safety and grab the sk_callback_lock also inside the
sk_destruct callback overridden by l2tp, even though there should be no
refs allowing access to the sock at the time when sk_destruct gets called.

v4:
- serialize write to sk_user_data in l2tp sk_destruct

v3:
- switch from sock lock to sk_callback_lock
- document write-protection for sk_user_data

v2:
- update Fixes to point to origin of the bug
- use real names in Reported/Tested-by tags

Cc: Tom Parkin <tparkin@katalix.com>
Fixes: 3557baabf280 ("[L2TP]: PPP over L2TP driver core")
Reported-by: Haowei Yan <g1042620637@gmail.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---

This took me forever. Sorry about that.

 include/net/sock.h   |  2 +-
 net/l2tp/l2tp_core.c | 19 +++++++++++++------
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 5db02546941c..e0517ecc6531 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -323,7 +323,7 @@ struct sk_filter;
   *	@sk_tskey: counter to disambiguate concurrent tstamp requests
   *	@sk_zckey: counter to order MSG_ZEROCOPY notifications
   *	@sk_socket: Identd and reporting IO signals
-  *	@sk_user_data: RPC layer private data
+  *	@sk_user_data: RPC layer private data. Write-protected by @sk_callback_lock.
   *	@sk_frag: cached page frag
   *	@sk_peek_off: current peek_offset value
   *	@sk_send_head: front of stuff to transmit
diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 7499c51b1850..754fdda8a5f5 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1150,8 +1150,10 @@ static void l2tp_tunnel_destruct(struct sock *sk)
 	}
 
 	/* Remove hooks into tunnel socket */
+	write_lock_bh(&sk->sk_callback_lock);
 	sk->sk_destruct = tunnel->old_sk_destruct;
 	sk->sk_user_data = NULL;
+	write_unlock_bh(&sk->sk_callback_lock);
 
 	/* Call the original destructor */
 	if (sk->sk_destruct)
@@ -1469,16 +1471,18 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
 		sock = sockfd_lookup(tunnel->fd, &ret);
 		if (!sock)
 			goto err;
-
-		ret = l2tp_validate_socket(sock->sk, net, tunnel->encap);
-		if (ret < 0)
-			goto err_sock;
 	}
 
+	sk = sock->sk;
+	write_lock(&sk->sk_callback_lock);
+
+	ret = l2tp_validate_socket(sk, net, tunnel->encap);
+	if (ret < 0)
+		goto err_sock;
+
 	tunnel->l2tp_net = net;
 	pn = l2tp_pernet(net);
 
-	sk = sock->sk;
 	sock_hold(sk);
 	tunnel->sock = sk;
 
@@ -1504,7 +1508,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
 
 		setup_udp_tunnel_sock(net, sock, &udp_cfg);
 	} else {
-		sk->sk_user_data = tunnel;
+		rcu_assign_sk_user_data(sk, tunnel);
 	}
 
 	tunnel->old_sk_destruct = sk->sk_destruct;
@@ -1518,6 +1522,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
 	if (tunnel->fd >= 0)
 		sockfd_put(sock);
 
+	write_unlock(&sk->sk_callback_lock);
 	return 0;
 
 err_sock:
@@ -1525,6 +1530,8 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
 		sock_release(sock);
 	else
 		sockfd_put(sock);
+
+	write_unlock(&sk->sk_callback_lock);
 err:
 	return ret;
 }
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH net v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
  2022-11-14 19:16 [PATCH net v4] l2tp: Serialize access to sk_user_data with sk_callback_lock Jakub Sitnicki
@ 2022-11-15 11:12 ` Tom Parkin
  2022-11-16 13:30 ` patchwork-bot+netdevbpf
  2022-12-02  9:50 ` Hangbin Liu
  2 siblings, 0 replies; 15+ messages in thread
From: Tom Parkin @ 2022-11-15 11:12 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Haowei Yan

[-- Attachment #1: Type: text/plain, Size: 3971 bytes --]

On  Mon, Nov 14, 2022 at 20:16:19 +0100, Jakub Sitnicki wrote:
> sk->sk_user_data has multiple users, which are not compatible with each
> other. Writers must synchronize by grabbing the sk->sk_callback_lock.
> 
> l2tp currently fails to grab the lock when modifying the underlying tunnel
> socket fields. Fix it by adding appropriate locking.
> 
> We err on the side of safety and grab the sk_callback_lock also inside the
> sk_destruct callback overridden by l2tp, even though there should be no
> refs allowing access to the sock at the time when sk_destruct gets called.
> 
> v4:
> - serialize write to sk_user_data in l2tp sk_destruct
> 
> v3:
> - switch from sock lock to sk_callback_lock
> - document write-protection for sk_user_data
> 
> v2:
> - update Fixes to point to origin of the bug
> - use real names in Reported/Tested-by tags

LGTM, thanks Jakub.

> 
> Cc: Tom Parkin <tparkin@katalix.com>
> Fixes: 3557baabf280 ("[L2TP]: PPP over L2TP driver core")
> Reported-by: Haowei Yan <g1042620637@gmail.com>
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
> 
> This took me forever. Sorry about that.
> 
>  include/net/sock.h   |  2 +-
>  net/l2tp/l2tp_core.c | 19 +++++++++++++------
>  2 files changed, 14 insertions(+), 7 deletions(-)
> 
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 5db02546941c..e0517ecc6531 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -323,7 +323,7 @@ struct sk_filter;
>    *	@sk_tskey: counter to disambiguate concurrent tstamp requests
>    *	@sk_zckey: counter to order MSG_ZEROCOPY notifications
>    *	@sk_socket: Identd and reporting IO signals
> -  *	@sk_user_data: RPC layer private data
> +  *	@sk_user_data: RPC layer private data. Write-protected by @sk_callback_lock.
>    *	@sk_frag: cached page frag
>    *	@sk_peek_off: current peek_offset value
>    *	@sk_send_head: front of stuff to transmit
> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> index 7499c51b1850..754fdda8a5f5 100644
> --- a/net/l2tp/l2tp_core.c
> +++ b/net/l2tp/l2tp_core.c
> @@ -1150,8 +1150,10 @@ static void l2tp_tunnel_destruct(struct sock *sk)
>  	}
>  
>  	/* Remove hooks into tunnel socket */
> +	write_lock_bh(&sk->sk_callback_lock);
>  	sk->sk_destruct = tunnel->old_sk_destruct;
>  	sk->sk_user_data = NULL;
> +	write_unlock_bh(&sk->sk_callback_lock);
>  
>  	/* Call the original destructor */
>  	if (sk->sk_destruct)
> @@ -1469,16 +1471,18 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
>  		sock = sockfd_lookup(tunnel->fd, &ret);
>  		if (!sock)
>  			goto err;
> -
> -		ret = l2tp_validate_socket(sock->sk, net, tunnel->encap);
> -		if (ret < 0)
> -			goto err_sock;
>  	}
>  
> +	sk = sock->sk;
> +	write_lock(&sk->sk_callback_lock);
> +
> +	ret = l2tp_validate_socket(sk, net, tunnel->encap);
> +	if (ret < 0)
> +		goto err_sock;
> +
>  	tunnel->l2tp_net = net;
>  	pn = l2tp_pernet(net);
>  
> -	sk = sock->sk;
>  	sock_hold(sk);
>  	tunnel->sock = sk;
>  
> @@ -1504,7 +1508,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
>  
>  		setup_udp_tunnel_sock(net, sock, &udp_cfg);
>  	} else {
> -		sk->sk_user_data = tunnel;
> +		rcu_assign_sk_user_data(sk, tunnel);
>  	}
>  
>  	tunnel->old_sk_destruct = sk->sk_destruct;
> @@ -1518,6 +1522,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
>  	if (tunnel->fd >= 0)
>  		sockfd_put(sock);
>  
> +	write_unlock(&sk->sk_callback_lock);
>  	return 0;
>  
>  err_sock:
> @@ -1525,6 +1530,8 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
>  		sock_release(sock);
>  	else
>  		sockfd_put(sock);
> +
> +	write_unlock(&sk->sk_callback_lock);
>  err:
>  	return ret;
>  }
> -- 
> 2.38.1
> 

-- 
Tom Parkin
Katalix Systems Ltd
https://katalix.com
Catalysts for your Embedded Linux software development

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
  2022-11-14 19:16 [PATCH net v4] l2tp: Serialize access to sk_user_data with sk_callback_lock Jakub Sitnicki
  2022-11-15 11:12 ` Tom Parkin
@ 2022-11-16 13:30 ` patchwork-bot+netdevbpf
  2022-11-17  9:07   ` Eric Dumazet
  2022-12-02  9:50 ` Hangbin Liu
  2 siblings, 1 reply; 15+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-11-16 13:30 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: netdev, davem, edumazet, kuba, pabeni, tparkin, g1042620637

Hello:

This patch was applied to netdev/net.git (master)
by David S. Miller <davem@davemloft.net>:

On Mon, 14 Nov 2022 20:16:19 +0100 you wrote:
> sk->sk_user_data has multiple users, which are not compatible with each
> other. Writers must synchronize by grabbing the sk->sk_callback_lock.
> 
> l2tp currently fails to grab the lock when modifying the underlying tunnel
> socket fields. Fix it by adding appropriate locking.
> 
> We err on the side of safety and grab the sk_callback_lock also inside the
> sk_destruct callback overridden by l2tp, even though there should be no
> refs allowing access to the sock at the time when sk_destruct gets called.
> 
> [...]

Here is the summary with links:
  - [net,v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
    https://git.kernel.org/netdev/net/c/b68777d54fac

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
  2022-11-16 13:30 ` patchwork-bot+netdevbpf
@ 2022-11-17  9:07   ` Eric Dumazet
  2022-11-17  9:35     ` Jakub Sitnicki
  2022-11-17  9:40     ` Eric Dumazet
  0 siblings, 2 replies; 15+ messages in thread
From: Eric Dumazet @ 2022-11-17  9:07 UTC (permalink / raw)
  To: patchwork-bot+netdevbpf
  Cc: Jakub Sitnicki, netdev, davem, kuba, pabeni, tparkin, g1042620637

On Wed, Nov 16, 2022 at 5:30 AM <patchwork-bot+netdevbpf@kernel.org> wrote:
>
> Hello:
>
> This patch was applied to netdev/net.git (master)
> by David S. Miller <davem@davemloft.net>:
>
> On Mon, 14 Nov 2022 20:16:19 +0100 you wrote:
> > sk->sk_user_data has multiple users, which are not compatible with each
> > other. Writers must synchronize by grabbing the sk->sk_callback_lock.
> >
> > l2tp currently fails to grab the lock when modifying the underlying tunnel
> > socket fields. Fix it by adding appropriate locking.
> >
> > We err on the side of safety and grab the sk_callback_lock also inside the
> > sk_destruct callback overridden by l2tp, even though there should be no
> > refs allowing access to the sock at the time when sk_destruct gets called.
> >
> > [...]
>
> Here is the summary with links:
>   - [net,v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
>     https://git.kernel.org/netdev/net/c/b68777d54fac
>
>

I guess this patch has not been tested with LOCKDEP, right ?

sk_callback_lock always needs _bh safety.

I will send something like:

diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 754fdda8a5f52e4e8e2c0f47331c3b22765033d0..a3b06a3cf68248f5ec7ae8be2a9711d0f482ac36
100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1474,7 +1474,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel
*tunnel, struct net *net,
        }

        sk = sock->sk;
-       write_lock(&sk->sk_callback_lock);
+       write_lock_bh(&sk->sk_callback_lock);

        ret = l2tp_validate_socket(sk, net, tunnel->encap);
        if (ret < 0)
@@ -1522,7 +1522,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel
*tunnel, struct net *net,
        if (tunnel->fd >= 0)
                sockfd_put(sock);

-       write_unlock(&sk->sk_callback_lock);
+       write_unlock_bh(&sk->sk_callback_lock);
        return 0;

 err_sock:
@@ -1531,7 +1531,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel
*tunnel, struct net *net,
        else
                sockfd_put(sock);

-       write_unlock(&sk->sk_callback_lock);
+       write_unlock_bh(&sk->sk_callback_lock);
 err:
        return ret;
 }

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
  2022-11-17  9:07   ` Eric Dumazet
@ 2022-11-17  9:35     ` Jakub Sitnicki
  2022-11-17  9:54       ` Eric Dumazet
  2022-11-17  9:40     ` Eric Dumazet
  1 sibling, 1 reply; 15+ messages in thread
From: Jakub Sitnicki @ 2022-11-17  9:35 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: patchwork-bot+netdevbpf, netdev, davem, kuba, pabeni, tparkin,
	g1042620637

On Thu, Nov 17, 2022 at 01:07 AM -08, Eric Dumazet wrote:
> On Wed, Nov 16, 2022 at 5:30 AM <patchwork-bot+netdevbpf@kernel.org> wrote:
>>
>> Hello:
>>
>> This patch was applied to netdev/net.git (master)
>> by David S. Miller <davem@davemloft.net>:
>>
>> On Mon, 14 Nov 2022 20:16:19 +0100 you wrote:
>> > sk->sk_user_data has multiple users, which are not compatible with each
>> > other. Writers must synchronize by grabbing the sk->sk_callback_lock.
>> >
>> > l2tp currently fails to grab the lock when modifying the underlying tunnel
>> > socket fields. Fix it by adding appropriate locking.
>> >
>> > We err on the side of safety and grab the sk_callback_lock also inside the
>> > sk_destruct callback overridden by l2tp, even though there should be no
>> > refs allowing access to the sock at the time when sk_destruct gets called.
>> >
>> > [...]
>>
>> Here is the summary with links:
>>   - [net,v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
>>     https://git.kernel.org/netdev/net/c/b68777d54fac
>>
>>
>
> I guess this patch has not been tested with LOCKDEP, right ?
>
> sk_callback_lock always needs _bh safety.
>
> I will send something like:
>
> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> index 754fdda8a5f52e4e8e2c0f47331c3b22765033d0..a3b06a3cf68248f5ec7ae8be2a9711d0f482ac36
> 100644
> --- a/net/l2tp/l2tp_core.c
> +++ b/net/l2tp/l2tp_core.c
> @@ -1474,7 +1474,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel
> *tunnel, struct net *net,
>         }
>
>         sk = sock->sk;
> -       write_lock(&sk->sk_callback_lock);
> +       write_lock_bh(&sk->sk_callback_lock);
>
>         ret = l2tp_validate_socket(sk, net, tunnel->encap);
>         if (ret < 0)
> @@ -1522,7 +1522,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel
> *tunnel, struct net *net,
>         if (tunnel->fd >= 0)
>                 sockfd_put(sock);
>
> -       write_unlock(&sk->sk_callback_lock);
> +       write_unlock_bh(&sk->sk_callback_lock);
>         return 0;
>
>  err_sock:
> @@ -1531,7 +1531,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel
> *tunnel, struct net *net,
>         else
>                 sockfd_put(sock);
>
> -       write_unlock(&sk->sk_callback_lock);
> +       write_unlock_bh(&sk->sk_callback_lock);
>  err:
>         return ret;
>  }

Hmm, weird. I double checked - I have PROVE_LOCKING enabled.
Didn't see any lockdep reports when running selftests/net/l2tp.sh.

I my defense - I thought _bh was not needed because
l2tp_tunnel_register() gets called only in the process context. I mean,
it's triggered by Netlink sendmsg, but that gets processed in-line
AFAIU:

netlink_sendmsg
  netlink_unicast
    ->netlink_rcv
      genl_rcv
        genl_rcv_msg
          genl_family_rcv_msg
            genl_family_rcv_msg_doit
              ->doit
                l2tp_nl_cmd_tunnel_create
                  l2tp_tunnel_register

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
  2022-11-17  9:07   ` Eric Dumazet
  2022-11-17  9:35     ` Jakub Sitnicki
@ 2022-11-17  9:40     ` Eric Dumazet
  2022-11-17  9:55       ` Jakub Sitnicki
  1 sibling, 1 reply; 15+ messages in thread
From: Eric Dumazet @ 2022-11-17  9:40 UTC (permalink / raw)
  To: patchwork-bot+netdevbpf
  Cc: Jakub Sitnicki, netdev, davem, kuba, pabeni, tparkin, g1042620637

On Thu, Nov 17, 2022 at 1:07 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Wed, Nov 16, 2022 at 5:30 AM <patchwork-bot+netdevbpf@kernel.org> wrote:
> >
> > Hello:
> >
> > This patch was applied to netdev/net.git (master)
> > by David S. Miller <davem@davemloft.net>:
> >
> > On Mon, 14 Nov 2022 20:16:19 +0100 you wrote:
> > > sk->sk_user_data has multiple users, which are not compatible with each
> > > other. Writers must synchronize by grabbing the sk->sk_callback_lock.
> > >
> > > l2tp currently fails to grab the lock when modifying the underlying tunnel
> > > socket fields. Fix it by adding appropriate locking.
> > >
> > > We err on the side of safety and grab the sk_callback_lock also inside the
> > > sk_destruct callback overridden by l2tp, even though there should be no
> > > refs allowing access to the sock at the time when sk_destruct gets called.
> > >
> > > [...]
> >
> > Here is the summary with links:
> >   - [net,v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
> >     https://git.kernel.org/netdev/net/c/b68777d54fac
> >
> >
>
> I guess this patch has not been tested with LOCKDEP, right ?
>
> sk_callback_lock always needs _bh safety.
>
> I will send something like:
>
> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> index 754fdda8a5f52e4e8e2c0f47331c3b22765033d0..a3b06a3cf68248f5ec7ae8be2a9711d0f482ac36
> 100644
> --- a/net/l2tp/l2tp_core.c
> +++ b/net/l2tp/l2tp_core.c
> @@ -1474,7 +1474,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel
> *tunnel, struct net *net,
>         }
>
>         sk = sock->sk;
> -       write_lock(&sk->sk_callback_lock);
> +       write_lock_bh(&sk->sk_callback_lock);

Unfortunately this might still not work, because
setup_udp_tunnel_sock->udp_encap_enable() probably could sleep in
static_branch_inc() ?

I will release the syzbot report, and let you folks work on a fix, thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
  2022-11-17  9:35     ` Jakub Sitnicki
@ 2022-11-17  9:54       ` Eric Dumazet
  0 siblings, 0 replies; 15+ messages in thread
From: Eric Dumazet @ 2022-11-17  9:54 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: patchwork-bot+netdevbpf, netdev, davem, kuba, pabeni, tparkin,
	g1042620637

On Thu, Nov 17, 2022 at 1:45 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> On Thu, Nov 17, 2022 at 01:07 AM -08, Eric Dumazet wrote:
> > On Wed, Nov 16, 2022 at 5:30 AM <patchwork-bot+netdevbpf@kernel.org> wrote:
> >>
> >> Hello:
> >>
> >> This patch was applied to netdev/net.git (master)
> >> by David S. Miller <davem@davemloft.net>:
> >>
> >> On Mon, 14 Nov 2022 20:16:19 +0100 you wrote:
> >> > sk->sk_user_data has multiple users, which are not compatible with each
> >> > other. Writers must synchronize by grabbing the sk->sk_callback_lock.
> >> >
> >> > l2tp currently fails to grab the lock when modifying the underlying tunnel
> >> > socket fields. Fix it by adding appropriate locking.
> >> >
> >> > We err on the side of safety and grab the sk_callback_lock also inside the
> >> > sk_destruct callback overridden by l2tp, even though there should be no
> >> > refs allowing access to the sock at the time when sk_destruct gets called.
> >> >
> >> > [...]
> >>
> >> Here is the summary with links:
> >>   - [net,v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
> >>     https://git.kernel.org/netdev/net/c/b68777d54fac
> >>
> >>
> >
> > I guess this patch has not been tested with LOCKDEP, right ?
> >
> > sk_callback_lock always needs _bh safety.
> >
> > I will send something like:
> >
> > diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> > index 754fdda8a5f52e4e8e2c0f47331c3b22765033d0..a3b06a3cf68248f5ec7ae8be2a9711d0f482ac36
> > 100644
> > --- a/net/l2tp/l2tp_core.c
> > +++ b/net/l2tp/l2tp_core.c
> > @@ -1474,7 +1474,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel
> > *tunnel, struct net *net,
> >         }
> >
> >         sk = sock->sk;
> > -       write_lock(&sk->sk_callback_lock);
> > +       write_lock_bh(&sk->sk_callback_lock);
> >
> >         ret = l2tp_validate_socket(sk, net, tunnel->encap);
> >         if (ret < 0)
> > @@ -1522,7 +1522,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel
> > *tunnel, struct net *net,
> >         if (tunnel->fd >= 0)
> >                 sockfd_put(sock);
> >
> > -       write_unlock(&sk->sk_callback_lock);
> > +       write_unlock_bh(&sk->sk_callback_lock);
> >         return 0;
> >
> >  err_sock:
> > @@ -1531,7 +1531,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel
> > *tunnel, struct net *net,
> >         else
> >                 sockfd_put(sock);
> >
> > -       write_unlock(&sk->sk_callback_lock);
> > +       write_unlock_bh(&sk->sk_callback_lock);
> >  err:
> >         return ret;
> >  }
>
> Hmm, weird. I double checked - I have PROVE_LOCKING enabled.
> Didn't see any lockdep reports when running selftests/net/l2tp.sh.
>
> I my defense - I thought _bh was not needed because
> l2tp_tunnel_register() gets called only in the process context. I mean,
> it's triggered by Netlink sendmsg, but that gets processed in-line
> AFAIU:
>
> netlink_sendmsg
>   netlink_unicast
>     ->netlink_rcv
>       genl_rcv
>         genl_rcv_msg
>           genl_family_rcv_msg
>             genl_family_rcv_msg_doit
>               ->doit
>                 l2tp_nl_cmd_tunnel_create
>                   l2tp_tunnel_register

Three different syzbot reports will help to better understand the
issue, sorry it is 2am for me, I am not sure in which time zone you
are in ...

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
  2022-11-17  9:40     ` Eric Dumazet
@ 2022-11-17  9:55       ` Jakub Sitnicki
  2022-11-18 10:28         ` Eric Dumazet
  0 siblings, 1 reply; 15+ messages in thread
From: Jakub Sitnicki @ 2022-11-17  9:55 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: patchwork-bot+netdevbpf, netdev, davem, kuba, pabeni, tparkin,
	g1042620637

On Thu, Nov 17, 2022 at 01:40 AM -08, Eric Dumazet wrote:
> On Thu, Nov 17, 2022 at 1:07 AM Eric Dumazet <edumazet@google.com> wrote:
>>
>> On Wed, Nov 16, 2022 at 5:30 AM <patchwork-bot+netdevbpf@kernel.org> wrote:
>> >
>> > Hello:
>> >
>> > This patch was applied to netdev/net.git (master)
>> > by David S. Miller <davem@davemloft.net>:
>> >
>> > On Mon, 14 Nov 2022 20:16:19 +0100 you wrote:
>> > > sk->sk_user_data has multiple users, which are not compatible with each
>> > > other. Writers must synchronize by grabbing the sk->sk_callback_lock.
>> > >
>> > > l2tp currently fails to grab the lock when modifying the underlying tunnel
>> > > socket fields. Fix it by adding appropriate locking.
>> > >
>> > > We err on the side of safety and grab the sk_callback_lock also inside the
>> > > sk_destruct callback overridden by l2tp, even though there should be no
>> > > refs allowing access to the sock at the time when sk_destruct gets called.
>> > >
>> > > [...]
>> >
>> > Here is the summary with links:
>> >   - [net,v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
>> >     https://git.kernel.org/netdev/net/c/b68777d54fac
>> >
>> >
>>
>> I guess this patch has not been tested with LOCKDEP, right ?
>>
>> sk_callback_lock always needs _bh safety.
>>
>> I will send something like:
>>
>> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
>> index 754fdda8a5f52e4e8e2c0f47331c3b22765033d0..a3b06a3cf68248f5ec7ae8be2a9711d0f482ac36
>> 100644
>> --- a/net/l2tp/l2tp_core.c
>> +++ b/net/l2tp/l2tp_core.c
>> @@ -1474,7 +1474,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel
>> *tunnel, struct net *net,
>>         }
>>
>>         sk = sock->sk;
>> -       write_lock(&sk->sk_callback_lock);
>> +       write_lock_bh(&sk->sk_callback_lock);
>
> Unfortunately this might still not work, because
> setup_udp_tunnel_sock->udp_encap_enable() probably could sleep in
> static_branch_inc() ?
>
> I will release the syzbot report, and let you folks work on a fix, thanks.

Ah, the problem is with pppol2tp_connect racing with itself. Thanks for
the syzbot report. I will take a look. I live for debugging deadlocks
:-)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
  2022-11-17  9:55       ` Jakub Sitnicki
@ 2022-11-18 10:28         ` Eric Dumazet
  2022-11-18 10:57           ` Jakub Sitnicki
  0 siblings, 1 reply; 15+ messages in thread
From: Eric Dumazet @ 2022-11-18 10:28 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: patchwork-bot+netdevbpf, netdev, davem, kuba, pabeni, tparkin,
	g1042620637

On Thu, Nov 17, 2022 at 1:57 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> On Thu, Nov 17, 2022 at 01:40 AM -08, Eric Dumazet wrote:
> > On Thu, Nov 17, 2022 at 1:07 AM Eric Dumazet <edumazet@google.com> wrote:
> >>
> >> On Wed, Nov 16, 2022 at 5:30 AM <patchwork-bot+netdevbpf@kernel.org> wrote:
> >> >
> >> > Hello:
> >> >
> >> > This patch was applied to netdev/net.git (master)
> >> > by David S. Miller <davem@davemloft.net>:
> >> >
> >> > On Mon, 14 Nov 2022 20:16:19 +0100 you wrote:
> >> > > sk->sk_user_data has multiple users, which are not compatible with each
> >> > > other. Writers must synchronize by grabbing the sk->sk_callback_lock.
> >> > >
> >> > > l2tp currently fails to grab the lock when modifying the underlying tunnel
> >> > > socket fields. Fix it by adding appropriate locking.
> >> > >
> >> > > We err on the side of safety and grab the sk_callback_lock also inside the
> >> > > sk_destruct callback overridden by l2tp, even though there should be no
> >> > > refs allowing access to the sock at the time when sk_destruct gets called.
> >> > >
> >> > > [...]
> >> >
> >> > Here is the summary with links:
> >> >   - [net,v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
> >> >     https://git.kernel.org/netdev/net/c/b68777d54fac
> >> >
> >> >
> >>
> >> I guess this patch has not been tested with LOCKDEP, right ?
> >>
> >> sk_callback_lock always needs _bh safety.
> >>
> >> I will send something like:
> >>
> >> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> >> index 754fdda8a5f52e4e8e2c0f47331c3b22765033d0..a3b06a3cf68248f5ec7ae8be2a9711d0f482ac36
> >> 100644
> >> --- a/net/l2tp/l2tp_core.c
> >> +++ b/net/l2tp/l2tp_core.c
> >> @@ -1474,7 +1474,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel
> >> *tunnel, struct net *net,
> >>         }
> >>
> >>         sk = sock->sk;
> >> -       write_lock(&sk->sk_callback_lock);
> >> +       write_lock_bh(&sk->sk_callback_lock);
> >
> > Unfortunately this might still not work, because
> > setup_udp_tunnel_sock->udp_encap_enable() probably could sleep in
> > static_branch_inc() ?
> >
> > I will release the syzbot report, and let you folks work on a fix, thanks.
>
> Ah, the problem is with pppol2tp_connect racing with itself. Thanks for
> the syzbot report. I will take a look. I live for debugging deadlocks
> :-)

Hi Jakub, any updates on this issue ? (Other syzbot reports with the
same root cause are piling up)

Thanks

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
  2022-11-18 10:28         ` Eric Dumazet
@ 2022-11-18 10:57           ` Jakub Sitnicki
  2022-11-18 11:09             ` Eric Dumazet
  0 siblings, 1 reply; 15+ messages in thread
From: Jakub Sitnicki @ 2022-11-18 10:57 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: patchwork-bot+netdevbpf, netdev, davem, kuba, pabeni, tparkin,
	g1042620637

On Fri, Nov 18, 2022 at 02:28 AM -08, Eric Dumazet wrote:
> On Thu, Nov 17, 2022 at 1:57 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>>
>> On Thu, Nov 17, 2022 at 01:40 AM -08, Eric Dumazet wrote:
>> > On Thu, Nov 17, 2022 at 1:07 AM Eric Dumazet <edumazet@google.com> wrote:
>> >>
>> >> On Wed, Nov 16, 2022 at 5:30 AM <patchwork-bot+netdevbpf@kernel.org> wrote:
>> >> >
>> >> > Hello:
>> >> >
>> >> > This patch was applied to netdev/net.git (master)
>> >> > by David S. Miller <davem@davemloft.net>:
>> >> >
>> >> > On Mon, 14 Nov 2022 20:16:19 +0100 you wrote:
>> >> > > sk->sk_user_data has multiple users, which are not compatible with each
>> >> > > other. Writers must synchronize by grabbing the sk->sk_callback_lock.
>> >> > >
>> >> > > l2tp currently fails to grab the lock when modifying the underlying tunnel
>> >> > > socket fields. Fix it by adding appropriate locking.
>> >> > >
>> >> > > We err on the side of safety and grab the sk_callback_lock also inside the
>> >> > > sk_destruct callback overridden by l2tp, even though there should be no
>> >> > > refs allowing access to the sock at the time when sk_destruct gets called.
>> >> > >
>> >> > > [...]
>> >> >
>> >> > Here is the summary with links:
>> >> >   - [net,v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
>> >> >     https://git.kernel.org/netdev/net/c/b68777d54fac
>> >> >
>> >> >
>> >>
>> >> I guess this patch has not been tested with LOCKDEP, right ?
>> >>
>> >> sk_callback_lock always needs _bh safety.
>> >>
>> >> I will send something like:
>> >>
>> >> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
>> >> index 754fdda8a5f52e4e8e2c0f47331c3b22765033d0..a3b06a3cf68248f5ec7ae8be2a9711d0f482ac36
>> >> 100644
>> >> --- a/net/l2tp/l2tp_core.c
>> >> +++ b/net/l2tp/l2tp_core.c
>> >> @@ -1474,7 +1474,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel
>> >> *tunnel, struct net *net,
>> >>         }
>> >>
>> >>         sk = sock->sk;
>> >> -       write_lock(&sk->sk_callback_lock);
>> >> +       write_lock_bh(&sk->sk_callback_lock);
>> >
>> > Unfortunately this might still not work, because
>> > setup_udp_tunnel_sock->udp_encap_enable() probably could sleep in
>> > static_branch_inc() ?
>> >
>> > I will release the syzbot report, and let you folks work on a fix, thanks.
>>
>> Ah, the problem is with pppol2tp_connect racing with itself. Thanks for
>> the syzbot report. I will take a look. I live for debugging deadlocks
>> :-)
>
> Hi Jakub, any updates on this issue ? (Other syzbot reports with the
> same root cause are piling up)
>
> Thanks

Sorry, I don't have anything yet. I have reserved time to work on it
this afternoon (I'm in the CET timezone).

Alternatively, I can send a revert right away and come back with fixed
patch once I have that, if you prefer.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
  2022-11-18 10:57           ` Jakub Sitnicki
@ 2022-11-18 11:09             ` Eric Dumazet
  2022-11-19 13:04               ` Jakub Sitnicki
  0 siblings, 1 reply; 15+ messages in thread
From: Eric Dumazet @ 2022-11-18 11:09 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: patchwork-bot+netdevbpf, netdev, davem, kuba, pabeni, tparkin,
	g1042620637

On Fri, Nov 18, 2022 at 3:00 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:

>
> Sorry, I don't have anything yet. I have reserved time to work on it
> this afternoon (I'm in the CET timezone).
>
> Alternatively, I can send a revert right away and come back with fixed
> patch once I have that, if you prefer.

No worries, this can wait for a fix, thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
  2022-11-18 11:09             ` Eric Dumazet
@ 2022-11-19 13:04               ` Jakub Sitnicki
  0 siblings, 0 replies; 15+ messages in thread
From: Jakub Sitnicki @ 2022-11-19 13:04 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: patchwork-bot+netdevbpf, netdev, davem, kuba, pabeni, tparkin,
	g1042620637


On Fri, Nov 18, 2022 at 03:09 AM -08, Eric Dumazet wrote:
> On Fri, Nov 18, 2022 at 3:00 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
>>
>> Sorry, I don't have anything yet. I have reserved time to work on it
>> this afternoon (I'm in the CET timezone).
>>
>> Alternatively, I can send a revert right away and come back with fixed
>> patch once I have that, if you prefer.
>
> No worries, this can wait for a fix, thanks.

Proposed fix now posted:

https://lore.kernel.org/netdev/20221119130317.39158-1-jakub@cloudflare.com/

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
  2022-11-14 19:16 [PATCH net v4] l2tp: Serialize access to sk_user_data with sk_callback_lock Jakub Sitnicki
  2022-11-15 11:12 ` Tom Parkin
  2022-11-16 13:30 ` patchwork-bot+netdevbpf
@ 2022-12-02  9:50 ` Hangbin Liu
  2022-12-05 10:24   ` Jakub Sitnicki
  2 siblings, 1 reply; 15+ messages in thread
From: Hangbin Liu @ 2022-12-02  9:50 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Tom Parkin, Haowei Yan, Roopa Prabhu,
	Nikolay Aleksandrov

On Mon, Nov 14, 2022 at 08:16:19PM +0100, Jakub Sitnicki wrote:
> sk->sk_user_data has multiple users, which are not compatible with each
> other. Writers must synchronize by grabbing the sk->sk_callback_lock.
> 
> l2tp currently fails to grab the lock when modifying the underlying tunnel
> socket fields. Fix it by adding appropriate locking.
> 
> We err on the side of safety and grab the sk_callback_lock also inside the
> sk_destruct callback overridden by l2tp, even though there should be no
> refs allowing access to the sock at the time when sk_destruct gets called.
> 
> v4:
> - serialize write to sk_user_data in l2tp sk_destruct
> 
> v3:
> - switch from sock lock to sk_callback_lock
> - document write-protection for sk_user_data
> 
> v2:
> - update Fixes to point to origin of the bug
> - use real names in Reported/Tested-by tags
> 
> Cc: Tom Parkin <tparkin@katalix.com>
> Fixes: 3557baabf280 ("[L2TP]: PPP over L2TP driver core")
> Reported-by: Haowei Yan <g1042620637@gmail.com>
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
> 
> This took me forever. Sorry about that.
> 
>  include/net/sock.h   |  2 +-
>  net/l2tp/l2tp_core.c | 19 +++++++++++++------
>  2 files changed, 14 insertions(+), 7 deletions(-)
> 
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 5db02546941c..e0517ecc6531 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -323,7 +323,7 @@ struct sk_filter;
>    *	@sk_tskey: counter to disambiguate concurrent tstamp requests
>    *	@sk_zckey: counter to order MSG_ZEROCOPY notifications
>    *	@sk_socket: Identd and reporting IO signals
> -  *	@sk_user_data: RPC layer private data
> +  *	@sk_user_data: RPC layer private data. Write-protected by @sk_callback_lock.
>    *	@sk_frag: cached page frag
>    *	@sk_peek_off: current peek_offset value
>    *	@sk_send_head: front of stuff to transmit
> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> index 7499c51b1850..754fdda8a5f5 100644
> --- a/net/l2tp/l2tp_core.c
> +++ b/net/l2tp/l2tp_core.c
> @@ -1150,8 +1150,10 @@ static void l2tp_tunnel_destruct(struct sock *sk)
>  	}
>  
>  	/* Remove hooks into tunnel socket */
> +	write_lock_bh(&sk->sk_callback_lock);
>  	sk->sk_destruct = tunnel->old_sk_destruct;
>  	sk->sk_user_data = NULL;
> +	write_unlock_bh(&sk->sk_callback_lock);
>  
>  	/* Call the original destructor */
>  	if (sk->sk_destruct)

Hi Jakub,

I have a similar issue with vxlan driver. Similar with commit
ad6c9986bcb6 ("vxlan: Fix GRO cells race condition between receive and link
delete"). There is still a race condition on vxlan that when receive a packet
while deleting a VXLAN device. In vxlan_ecn_decapsulate(), the
vxlan_get_sk_family() call panic as sk is NULL.

 #0 [ffffa25ec6978a38] machine_kexec at ffffffff8c669757
 #1 [ffffa25ec6978a90] __crash_kexec at ffffffff8c7c0a4d
 #2 [ffffa25ec6978b58] crash_kexec at ffffffff8c7c1c48
 #3 [ffffa25ec6978b60] oops_end at ffffffff8c627f2b
 #4 [ffffa25ec6978b80] page_fault_oops at ffffffff8c678fcb
 #5 [ffffa25ec6978bd8] exc_page_fault at ffffffff8d109542
 #6 [ffffa25ec6978c00] asm_exc_page_fault at ffffffff8d200b62
    [exception RIP: vxlan_ecn_decapsulate+0x3b]
    RIP: ffffffffc1014e7b  RSP: ffffa25ec6978cb0  RFLAGS: 00010246
    RAX: 0000000000000008  RBX: ffff8aa000888000  RCX: 0000000000000000
    RDX: 000000000000000e  RSI: ffff8a9fc7ab803e  RDI: ffff8a9fd1168700
    RBP: ffff8a9fc7ab803e   R8: 0000000000700000   R9: 00000000000010ae
    R10: ffff8a9fcb748980  R11: 0000000000000000  R12: ffff8a9fd1168700
    R13: ffff8aa000888000  R14: 00000000002a0000  R15: 00000000000010ae
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffffa25ec6978ce8] vxlan_rcv at ffffffffc10189cd [vxlan]
 #8 [ffffa25ec6978d90] udp_queue_rcv_one_skb at ffffffff8cfb6507
 #9 [ffffa25ec6978dc0] udp_unicast_rcv_skb at ffffffff8cfb6e45
#10 [ffffa25ec6978dc8] __udp4_lib_rcv at ffffffff8cfb8807
#11 [ffffa25ec6978e20] ip_protocol_deliver_rcu at ffffffff8cf76951
#12 [ffffa25ec6978e48] ip_local_deliver at ffffffff8cf76bde
#13 [ffffa25ec6978ea0] __netif_receive_skb_one_core at ffffffff8cecde9b
#14 [ffffa25ec6978ec8] process_backlog at ffffffff8cece139
#15 [ffffa25ec6978f00] __napi_poll at ffffffff8ceced1a
#16 [ffffa25ec6978f28] net_rx_action at ffffffff8cecf1f3
#17 [ffffa25ec6978fa0] __softirqentry_text_start at ffffffff8d4000ca
#18 [ffffa25ec6978ff0] do_softirq at ffffffff8c6fbdc3
--- <IRQ stack> ---

> struct socket ffff8a9fd1168700
struct socket {
  state = SS_FREE,
  type = 0,
  flags = 0,
  file = 0xffff8a9fcb748000,
  sk = 0x0,
  ops = 0x0,

So I'm wondering if we should also have locks in udp_tunnel_sock_release().
Or should we add a checking in sk state before calling vxlan_get_sk_family()?

Thanks
Hangbin

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
  2022-12-02  9:50 ` Hangbin Liu
@ 2022-12-05 10:24   ` Jakub Sitnicki
  2022-12-05 12:37     ` Hangbin Liu
  0 siblings, 1 reply; 15+ messages in thread
From: Jakub Sitnicki @ 2022-12-05 10:24 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Tom Parkin, Haowei Yan, Roopa Prabhu,
	Nikolay Aleksandrov

On Fri, Dec 02, 2022 at 05:50 PM +08, Hangbin Liu wrote:
> On Mon, Nov 14, 2022 at 08:16:19PM +0100, Jakub Sitnicki wrote:
>> sk->sk_user_data has multiple users, which are not compatible with each
>> other. Writers must synchronize by grabbing the sk->sk_callback_lock.
>> 
>> l2tp currently fails to grab the lock when modifying the underlying tunnel
>> socket fields. Fix it by adding appropriate locking.
>> 
>> We err on the side of safety and grab the sk_callback_lock also inside the
>> sk_destruct callback overridden by l2tp, even though there should be no
>> refs allowing access to the sock at the time when sk_destruct gets called.
>> 
>> v4:
>> - serialize write to sk_user_data in l2tp sk_destruct
>> 
>> v3:
>> - switch from sock lock to sk_callback_lock
>> - document write-protection for sk_user_data
>> 
>> v2:
>> - update Fixes to point to origin of the bug
>> - use real names in Reported/Tested-by tags
>> 
>> Cc: Tom Parkin <tparkin@katalix.com>
>> Fixes: 3557baabf280 ("[L2TP]: PPP over L2TP driver core")
>> Reported-by: Haowei Yan <g1042620637@gmail.com>
>> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
>> ---
>> 
>> This took me forever. Sorry about that.
>> 
>>  include/net/sock.h   |  2 +-
>>  net/l2tp/l2tp_core.c | 19 +++++++++++++------
>>  2 files changed, 14 insertions(+), 7 deletions(-)
>> 
>> diff --git a/include/net/sock.h b/include/net/sock.h
>> index 5db02546941c..e0517ecc6531 100644
>> --- a/include/net/sock.h
>> +++ b/include/net/sock.h
>> @@ -323,7 +323,7 @@ struct sk_filter;
>>    *	@sk_tskey: counter to disambiguate concurrent tstamp requests
>>    *	@sk_zckey: counter to order MSG_ZEROCOPY notifications
>>    *	@sk_socket: Identd and reporting IO signals
>> -  *	@sk_user_data: RPC layer private data
>> +  *	@sk_user_data: RPC layer private data. Write-protected by @sk_callback_lock.
>>    *	@sk_frag: cached page frag
>>    *	@sk_peek_off: current peek_offset value
>>    *	@sk_send_head: front of stuff to transmit
>> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
>> index 7499c51b1850..754fdda8a5f5 100644
>> --- a/net/l2tp/l2tp_core.c
>> +++ b/net/l2tp/l2tp_core.c
>> @@ -1150,8 +1150,10 @@ static void l2tp_tunnel_destruct(struct sock *sk)
>>  	}
>>  
>>  	/* Remove hooks into tunnel socket */
>> +	write_lock_bh(&sk->sk_callback_lock);
>>  	sk->sk_destruct = tunnel->old_sk_destruct;
>>  	sk->sk_user_data = NULL;
>> +	write_unlock_bh(&sk->sk_callback_lock);
>>  
>>  	/* Call the original destructor */
>>  	if (sk->sk_destruct)
>
> Hi Jakub,
>
> I have a similar issue with vxlan driver. Similar with commit
> ad6c9986bcb6 ("vxlan: Fix GRO cells race condition between receive and link
> delete"). There is still a race condition on vxlan that when receive a packet
> while deleting a VXLAN device. In vxlan_ecn_decapsulate(), the
> vxlan_get_sk_family() call panic as sk is NULL.
>
>  #0 [ffffa25ec6978a38] machine_kexec at ffffffff8c669757
>  #1 [ffffa25ec6978a90] __crash_kexec at ffffffff8c7c0a4d
>  #2 [ffffa25ec6978b58] crash_kexec at ffffffff8c7c1c48
>  #3 [ffffa25ec6978b60] oops_end at ffffffff8c627f2b
>  #4 [ffffa25ec6978b80] page_fault_oops at ffffffff8c678fcb
>  #5 [ffffa25ec6978bd8] exc_page_fault at ffffffff8d109542
>  #6 [ffffa25ec6978c00] asm_exc_page_fault at ffffffff8d200b62
>     [exception RIP: vxlan_ecn_decapsulate+0x3b]
>     RIP: ffffffffc1014e7b  RSP: ffffa25ec6978cb0  RFLAGS: 00010246
>     RAX: 0000000000000008  RBX: ffff8aa000888000  RCX: 0000000000000000
>     RDX: 000000000000000e  RSI: ffff8a9fc7ab803e  RDI: ffff8a9fd1168700
>     RBP: ffff8a9fc7ab803e   R8: 0000000000700000   R9: 00000000000010ae
>     R10: ffff8a9fcb748980  R11: 0000000000000000  R12: ffff8a9fd1168700
>     R13: ffff8aa000888000  R14: 00000000002a0000  R15: 00000000000010ae
>     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>  #7 [ffffa25ec6978ce8] vxlan_rcv at ffffffffc10189cd [vxlan]
>  #8 [ffffa25ec6978d90] udp_queue_rcv_one_skb at ffffffff8cfb6507
>  #9 [ffffa25ec6978dc0] udp_unicast_rcv_skb at ffffffff8cfb6e45
> #10 [ffffa25ec6978dc8] __udp4_lib_rcv at ffffffff8cfb8807
> #11 [ffffa25ec6978e20] ip_protocol_deliver_rcu at ffffffff8cf76951
> #12 [ffffa25ec6978e48] ip_local_deliver at ffffffff8cf76bde
> #13 [ffffa25ec6978ea0] __netif_receive_skb_one_core at ffffffff8cecde9b
> #14 [ffffa25ec6978ec8] process_backlog at ffffffff8cece139
> #15 [ffffa25ec6978f00] __napi_poll at ffffffff8ceced1a
> #16 [ffffa25ec6978f28] net_rx_action at ffffffff8cecf1f3
> #17 [ffffa25ec6978fa0] __softirqentry_text_start at ffffffff8d4000ca
> #18 [ffffa25ec6978ff0] do_softirq at ffffffff8c6fbdc3
> --- <IRQ stack> ---
>
>> struct socket ffff8a9fd1168700
> struct socket {
>   state = SS_FREE,
>   type = 0,
>   flags = 0,
>   file = 0xffff8a9fcb748000,
>   sk = 0x0,
>   ops = 0x0,
>
> So I'm wondering if we should also have locks in udp_tunnel_sock_release().
> Or should we add a checking in sk state before calling vxlan_get_sk_family()?

This is how like to think about it:

To know when it is safe to load vs->sock->sk->sk_family, we have to ask:

1. What ensures that the objects remain alive/valid in our scope?
2. What protects the objects from being mutated?

In case of vxlan_sock object in the context of vxlan_ecn_decapsulate():

1. We are in an RCU read side section (ip_local_deliver_finish).
2. RCU-protected objects are not to be mutated while readers exist.

The classic "What is RCU, Fundamentally?" article explains it much
better than I ever could:

https://lwn.net/Articles/262464/

As to where the problem lies. I belive udp_tunnel_sock_release() is not
keeping the (2) promise.

After unpublishing the sk_user_data, we should wait for any existing
readers accessing the vxlan_sock to finish with synchronize_rcu(),
before releaseing the socket.

That is:

--- a/net/ipv4/udp_tunnel_core.c
+++ b/net/ipv4/udp_tunnel_core.c
@@ -176,6 +176,7 @@ EXPORT_SYMBOL_GPL(udp_tunnel_xmit_skb);
 void udp_tunnel_sock_release(struct socket *sock)
 {
        rcu_assign_sk_user_data(sock->sk, NULL);
+       synchronize_rcu();
        kernel_sock_shutdown(sock, SHUT_RDWR);
        sock_release(sock);
 }


Otherwise accessing vxlan_sock state doesn't look safe to me.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
  2022-12-05 10:24   ` Jakub Sitnicki
@ 2022-12-05 12:37     ` Hangbin Liu
  0 siblings, 0 replies; 15+ messages in thread
From: Hangbin Liu @ 2022-12-05 12:37 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Tom Parkin, Haowei Yan, Roopa Prabhu,
	Nikolay Aleksandrov

On Mon, Dec 05, 2022 at 11:24:39AM +0100, Jakub Sitnicki wrote:
> > Hi Jakub,
> >
> > I have a similar issue with vxlan driver. Similar with commit
> > ad6c9986bcb6 ("vxlan: Fix GRO cells race condition between receive and link
> > delete"). There is still a race condition on vxlan that when receive a packet
> > while deleting a VXLAN device. In vxlan_ecn_decapsulate(), the
> > vxlan_get_sk_family() call panic as sk is NULL.
> >
> > So I'm wondering if we should also have locks in udp_tunnel_sock_release().
> > Or should we add a checking in sk state before calling vxlan_get_sk_family()?
> 
> This is how like to think about it:
> 
> To know when it is safe to load vs->sock->sk->sk_family, we have to ask:
> 
> 1. What ensures that the objects remain alive/valid in our scope?
> 2. What protects the objects from being mutated?
> 
> In case of vxlan_sock object in the context of vxlan_ecn_decapsulate():
> 
> 1. We are in an RCU read side section (ip_local_deliver_finish).
> 2. RCU-protected objects are not to be mutated while readers exist.
> 
> The classic "What is RCU, Fundamentally?" article explains it much
> better than I ever could:
> 
> https://lwn.net/Articles/262464/
> 
> As to where the problem lies. I belive udp_tunnel_sock_release() is not
> keeping the (2) promise.
> 
> After unpublishing the sk_user_data, we should wait for any existing
> readers accessing the vxlan_sock to finish with synchronize_rcu(),
> before releaseing the socket.
> 
> That is:
> 
> --- a/net/ipv4/udp_tunnel_core.c
> +++ b/net/ipv4/udp_tunnel_core.c
> @@ -176,6 +176,7 @@ EXPORT_SYMBOL_GPL(udp_tunnel_xmit_skb);
>  void udp_tunnel_sock_release(struct socket *sock)
>  {
>         rcu_assign_sk_user_data(sock->sk, NULL);
> +       synchronize_rcu();
>         kernel_sock_shutdown(sock, SHUT_RDWR);
>         sock_release(sock);
>  }
> 
> 
> Otherwise accessing vxlan_sock state doesn't look safe to me.

Hi Jakub,

Thanks for your explain. As it's a little on my side, I will read your
comments and try your suggestion tomorrow. Currently, I use the following
draft patch to fix the vxlan issue.

diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
index 2122747a0224..53259b0b07f3 100644
--- a/drivers/net/vxlan/vxlan_core.c
+++ b/drivers/net/vxlan/vxlan_core.c
@@ -4234,6 +4234,7 @@ static void vxlan_dellink(struct net_device *dev, struct list_head *head)
        struct vxlan_dev *vxlan = netdev_priv(dev);

        vxlan_flush(vxlan, true);
+       vxlan_sock_release(vxlan);

        list_del(&vxlan->next);
        unregister_netdevice_queue(dev, head);


Cheers
Hangbin

^ permalink raw reply related	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2022-12-05 12:38 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-14 19:16 [PATCH net v4] l2tp: Serialize access to sk_user_data with sk_callback_lock Jakub Sitnicki
2022-11-15 11:12 ` Tom Parkin
2022-11-16 13:30 ` patchwork-bot+netdevbpf
2022-11-17  9:07   ` Eric Dumazet
2022-11-17  9:35     ` Jakub Sitnicki
2022-11-17  9:54       ` Eric Dumazet
2022-11-17  9:40     ` Eric Dumazet
2022-11-17  9:55       ` Jakub Sitnicki
2022-11-18 10:28         ` Eric Dumazet
2022-11-18 10:57           ` Jakub Sitnicki
2022-11-18 11:09             ` Eric Dumazet
2022-11-19 13:04               ` Jakub Sitnicki
2022-12-02  9:50 ` Hangbin Liu
2022-12-05 10:24   ` Jakub Sitnicki
2022-12-05 12:37     ` Hangbin Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).