netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net] vrf: Fix lockdep splat in output path
@ 2023-07-15 15:36 Ido Schimmel
  2023-07-15 17:20 ` David Ahern
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Ido Schimmel @ 2023-07-15 15:36 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, pabeni, edumazet, dsahern, naresh.kamboju, Ido Schimmel

Cited commit converted the neighbour code to use the standard RCU
variant instead of the RCU-bh variant, but the VRF code still uses
rcu_read_lock_bh() / rcu_read_unlock_bh() around the neighbour lookup
code in its IPv4 and IPv6 output paths, resulting in lockdep splats
[1][2]. Can be reproduced using [3].

Fix by switching to rcu_read_lock() / rcu_read_unlock().

[1]
=============================
WARNING: suspicious RCU usage
6.5.0-rc1-custom-g9c099e6dbf98 #403 Not tainted
-----------------------------
include/net/neighbour.h:302 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
2 locks held by ping/183:
 #0: ffff888105ea1d80 (sk_lock-AF_INET){+.+.}-{0:0}, at: raw_sendmsg+0xc6c/0x33c0
 #1: ffffffff85b46820 (rcu_read_lock_bh){....}-{1:2}, at: vrf_output+0x2e3/0x2030

stack backtrace:
CPU: 0 PID: 183 Comm: ping Not tainted 6.5.0-rc1-custom-g9c099e6dbf98 #403
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0xc1/0xf0
 lockdep_rcu_suspicious+0x211/0x3b0
 vrf_output+0x1380/0x2030
 ip_push_pending_frames+0x125/0x2a0
 raw_sendmsg+0x200d/0x33c0
 inet_sendmsg+0xa2/0xe0
 __sys_sendto+0x2aa/0x420
 __x64_sys_sendto+0xe5/0x1c0
 do_syscall_64+0x38/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

[2]
=============================
WARNING: suspicious RCU usage
6.5.0-rc1-custom-g9c099e6dbf98 #403 Not tainted
-----------------------------
include/net/neighbour.h:302 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
2 locks held by ping6/182:
 #0: ffff888114b63000 (sk_lock-AF_INET6){+.+.}-{0:0}, at: rawv6_sendmsg+0x1602/0x3e50
 #1: ffffffff85b46820 (rcu_read_lock_bh){....}-{1:2}, at: vrf_output6+0xe9/0x1310

stack backtrace:
CPU: 0 PID: 182 Comm: ping6 Not tainted 6.5.0-rc1-custom-g9c099e6dbf98 #403
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0xc1/0xf0
 lockdep_rcu_suspicious+0x211/0x3b0
 vrf_output6+0xd32/0x1310
 ip6_local_out+0xb4/0x1a0
 ip6_send_skb+0xbc/0x340
 ip6_push_pending_frames+0xe5/0x110
 rawv6_sendmsg+0x2e6e/0x3e50
 inet_sendmsg+0xa2/0xe0
 __sys_sendto+0x2aa/0x420
 __x64_sys_sendto+0xe5/0x1c0
 do_syscall_64+0x38/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

[3]
#!/bin/bash

ip link add name vrf-red up numtxqueues 2 type vrf table 10
ip link add name swp1 up master vrf-red type dummy
ip address add 192.0.2.1/24 dev swp1
ip address add 2001:db8:1::1/64 dev swp1
ip neigh add 192.0.2.2 lladdr 00:11:22:33:44:55 nud perm dev swp1
ip neigh add 2001:db8:1::2 lladdr 00:11:22:33:44:55 nud perm dev swp1
ip vrf exec vrf-red ping 192.0.2.2 -c 1 &> /dev/null
ip vrf exec vrf-red ping6 2001:db8:1::2 -c 1 &> /dev/null

Fixes: 09eed1192cec ("neighbour: switch to standard rcu, instead of rcu_bh")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Link: https://lore.kernel.org/netdev/CA+G9fYtEr-=GbcXNDYo3XOkwR+uYgehVoDjsP0pFLUpZ_AZcyg@mail.gmail.com/
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
Using the "Link" tag instead of "Closes" since there are two reports in
the link, but I can only reproduce the second.

I believe that the rcu_read_lock_bh() / rcu_read_unlock_bh() in
vrf_finish_direct() can be removed since dev_queue_xmit_nit() uses
rcu_read_lock() / rcu_read_unlock(). I will send a patch to net-next
after confirming it.
---
 drivers/net/vrf.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
index bdb3a76a352e..6043e63b42f9 100644
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -664,7 +664,7 @@ static int vrf_finish_output6(struct net *net, struct sock *sk,
 	skb->protocol = htons(ETH_P_IPV6);
 	skb->dev = dev;
 
-	rcu_read_lock_bh();
+	rcu_read_lock();
 	nexthop = rt6_nexthop((struct rt6_info *)dst, &ipv6_hdr(skb)->daddr);
 	neigh = __ipv6_neigh_lookup_noref(dst->dev, nexthop);
 	if (unlikely(!neigh))
@@ -672,10 +672,10 @@ static int vrf_finish_output6(struct net *net, struct sock *sk,
 	if (!IS_ERR(neigh)) {
 		sock_confirm_neigh(skb, neigh);
 		ret = neigh_output(neigh, skb, false);
-		rcu_read_unlock_bh();
+		rcu_read_unlock();
 		return ret;
 	}
-	rcu_read_unlock_bh();
+	rcu_read_unlock();
 
 	IP6_INC_STATS(dev_net(dst->dev),
 		      ip6_dst_idev(dst), IPSTATS_MIB_OUTNOROUTES);
@@ -889,7 +889,7 @@ static int vrf_finish_output(struct net *net, struct sock *sk, struct sk_buff *s
 		}
 	}
 
-	rcu_read_lock_bh();
+	rcu_read_lock();
 
 	neigh = ip_neigh_for_gw(rt, skb, &is_v6gw);
 	if (!IS_ERR(neigh)) {
@@ -898,11 +898,11 @@ static int vrf_finish_output(struct net *net, struct sock *sk, struct sk_buff *s
 		sock_confirm_neigh(skb, neigh);
 		/* if crossing protocols, can not use the cached header */
 		ret = neigh_output(neigh, skb, is_v6gw);
-		rcu_read_unlock_bh();
+		rcu_read_unlock();
 		return ret;
 	}
 
-	rcu_read_unlock_bh();
+	rcu_read_unlock();
 	vrf_tx_error(skb->dev, skb);
 	return -EINVAL;
 }
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH net] vrf: Fix lockdep splat in output path
  2023-07-15 15:36 [PATCH net] vrf: Fix lockdep splat in output path Ido Schimmel
@ 2023-07-15 17:20 ` David Ahern
  2023-07-17 16:20 ` Eric Dumazet
  2023-07-18 11:00 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 4+ messages in thread
From: David Ahern @ 2023-07-15 17:20 UTC (permalink / raw)
  To: Ido Schimmel, netdev; +Cc: davem, kuba, pabeni, edumazet, naresh.kamboju

On 7/15/23 9:36 AM, Ido Schimmel wrote:
> Cited commit converted the neighbour code to use the standard RCU
> variant instead of the RCU-bh variant, but the VRF code still uses
> rcu_read_lock_bh() / rcu_read_unlock_bh() around the neighbour lookup
> code in its IPv4 and IPv6 output paths, resulting in lockdep splats
> [1][2]. Can be reproduced using [3].
> 
> Fix by switching to rcu_read_lock() / rcu_read_unlock().
> 

...

> [3]
> #!/bin/bash
> 
> ip link add name vrf-red up numtxqueues 2 type vrf table 10
> ip link add name swp1 up master vrf-red type dummy
> ip address add 192.0.2.1/24 dev swp1
> ip address add 2001:db8:1::1/64 dev swp1
> ip neigh add 192.0.2.2 lladdr 00:11:22:33:44:55 nud perm dev swp1
> ip neigh add 2001:db8:1::2 lladdr 00:11:22:33:44:55 nud perm dev swp1
> ip vrf exec vrf-red ping 192.0.2.2 -c 1 &> /dev/null
> ip vrf exec vrf-red ping6 2001:db8:1::2 -c 1 &> /dev/null
> 
> Fixes: 09eed1192cec ("neighbour: switch to standard rcu, instead of rcu_bh")
> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
> Link: https://lore.kernel.org/netdev/CA+G9fYtEr-=GbcXNDYo3XOkwR+uYgehVoDjsP0pFLUpZ_AZcyg@mail.gmail.com/
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
> Using the "Link" tag instead of "Closes" since there are two reports in
> the link, but I can only reproduce the second.
> 
> I believe that the rcu_read_lock_bh() / rcu_read_unlock_bh() in
> vrf_finish_direct() can be removed since dev_queue_xmit_nit() uses
> rcu_read_lock() / rcu_read_unlock(). I will send a patch to net-next
> after confirming it.
> ---
>  drivers/net/vrf.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@kernel.org>

Thanks, Ido.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH net] vrf: Fix lockdep splat in output path
  2023-07-15 15:36 [PATCH net] vrf: Fix lockdep splat in output path Ido Schimmel
  2023-07-15 17:20 ` David Ahern
@ 2023-07-17 16:20 ` Eric Dumazet
  2023-07-18 11:00 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 4+ messages in thread
From: Eric Dumazet @ 2023-07-17 16:20 UTC (permalink / raw)
  To: Ido Schimmel; +Cc: netdev, davem, kuba, pabeni, dsahern, naresh.kamboju

On Sat, Jul 15, 2023 at 5:37 PM Ido Schimmel <idosch@nvidia.com> wrote:
>
> Cited commit converted the neighbour code to use the standard RCU
> variant instead of the RCU-bh variant, but the VRF code still uses
> rcu_read_lock_bh() / rcu_read_unlock_bh() around the neighbour lookup
> code in its IPv4 and IPv6 output paths, resulting in lockdep splats
> [1][2]. Can be reproduced using [3].
>
> Fix by switching to rcu_read_lock() / rcu_read_unlock().

We might have other bugs like that.

We should get rid of all rcu_read_lock_bh()

Reviewed-by: Eric Dumazet <edumazet@google.com>

Thanks.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH net] vrf: Fix lockdep splat in output path
  2023-07-15 15:36 [PATCH net] vrf: Fix lockdep splat in output path Ido Schimmel
  2023-07-15 17:20 ` David Ahern
  2023-07-17 16:20 ` Eric Dumazet
@ 2023-07-18 11:00 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 4+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-07-18 11:00 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, kuba, pabeni, edumazet, dsahern, naresh.kamboju

Hello:

This patch was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Sat, 15 Jul 2023 18:36:05 +0300 you wrote:
> Cited commit converted the neighbour code to use the standard RCU
> variant instead of the RCU-bh variant, but the VRF code still uses
> rcu_read_lock_bh() / rcu_read_unlock_bh() around the neighbour lookup
> code in its IPv4 and IPv6 output paths, resulting in lockdep splats
> [1][2]. Can be reproduced using [3].
> 
> Fix by switching to rcu_read_lock() / rcu_read_unlock().
> 
> [...]

Here is the summary with links:
  - [net] vrf: Fix lockdep splat in output path
    https://git.kernel.org/netdev/net/c/2033ab90380d

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-07-18 11:00 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-15 15:36 [PATCH net] vrf: Fix lockdep splat in output path Ido Schimmel
2023-07-15 17:20 ` David Ahern
2023-07-17 16:20 ` Eric Dumazet
2023-07-18 11:00 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).