* [PATCH net-next] net: Add lockdep asserts to ____napi_schedule().
@ 2022-03-11 15:03 Sebastian Andrzej Siewior
  2022-03-14 10:40 ` patchwork-bot+netdevbpf
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2022-03-11 15:03 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jakub Kicinski, Eric Dumazet, Thomas Gleixner,
	Peter Zijlstra

____napi_schedule() needs to be invoked with disabled interrupts due to
__raise_softirq_irqoff() (in order not to corrupt the per-CPU list).
____napi_schedule() also needs to be invoked from interrupt context so
that the raised softirq is processed when the interrupt context is
left.

Add lockdep asserts for both conditions.
Since this is the second time the irq/softirq check is needed, provide
a generic lockdep_assert_softirq_will_run() which is used by both
callers.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
This is my todo from
   https://lkml.kernel.org/r/20220203170901.52ccfd09@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com

- It was suggested to add this to __napi_schedule() and
  __napi_schedule_irqoff(), but both end up here in ____napi_schedule().

- It was suggested to do lockdep_assert_softirq_will_run(). Done, plus
  moved the other caller over to it.

While adding this, I stumbled over lockdep_assert_in_softirq(). It
really special-cases things and builds upon assumptions, and it took me
a while to figure out what is going on. I would suggest removing it
and, as a replacement, adding lockdep annotations (as in
spin_acquire()) around the `napi_alloc_cache' access, which is the
thing the annotation cares about. Then lockdep_assert_in_softirq() can
be replaced with a might_lock(), so in the end we know why we do what
we do. Lockdep will yell if the lock has been observed in-hardirq and
in-softirq without disabling interrupts.
Sounds good?

---
 include/linux/lockdep.h | 7 +++++++
 net/core/dev.c          | 5 ++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 467b94257105e..0cc65d2167015 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -329,6 +329,12 @@ extern void lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie);
 
 #define lockdep_assert_none_held_once()		\
 	lockdep_assert_once(!current->lockdep_depth)
+/*
+ * Ensure that softirq is handled within the callchain and not delayed and
+ * handled by chance.
+ */
+#define lockdep_assert_softirq_will_run()	\
+	lockdep_assert_once(hardirq_count() | softirq_count())
 
 #define lockdep_recursing(tsk)	((tsk)->lockdep_recursion)
 
@@ -414,6 +420,7 @@ extern int lockdep_is_held(const void *);
 #define lockdep_assert_held_read(l)		do { (void)(l); } while (0)
 #define lockdep_assert_held_once(l)		do { (void)(l); } while (0)
 #define lockdep_assert_none_held_once()	do { } while (0)
+#define lockdep_assert_softirq_will_run()	do { } while (0)
 
 #define lockdep_recursing(tsk)			(0)
 
diff --git a/net/core/dev.c b/net/core/dev.c
index ba69ddf85af6b..dbda85879f6c1 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4267,6 +4267,9 @@ static inline void ____napi_schedule(struct softnet_data *sd,
 {
 	struct task_struct *thread;
 
+	lockdep_assert_softirq_will_run();
+	lockdep_assert_irqs_disabled();
+
 	if (test_bit(NAPI_STATE_THREADED, &napi->state)) {
 		/* Paired with smp_mb__before_atomic() in
 		 * napi_enable()/dev_set_threaded().
@@ -4874,7 +4877,7 @@ int __netif_rx(struct sk_buff *skb)
 {
 	int ret;
 
-	lockdep_assert_once(hardirq_count() | softirq_count());
+	lockdep_assert_softirq_will_run();
 
 	trace_netif_rx_entry(skb);
 	ret = netif_rx_internal(skb);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next] net: Add lockdep asserts to ____napi_schedule().
  2022-03-11 15:03 [PATCH net-next] net: Add lockdep asserts to ____napi_schedule() Sebastian Andrzej Siewior
@ 2022-03-14 10:40 ` patchwork-bot+netdevbpf
  2022-03-17 19:21 ` Saeed Mahameed
  2022-03-18  1:48 ` Jason A. Donenfeld
  2 siblings, 0 replies; 9+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-03-14 10:40 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: netdev, davem, kuba, edumazet, tglx, peterz

Hello:

This patch was applied to netdev/net-next.git (master)
by David S. Miller <davem@davemloft.net>:

On Fri, 11 Mar 2022 16:03:42 +0100 you wrote:
> ____napi_schedule() needs to be invoked with disabled interrupts due to
> __raise_softirq_irqoff (in order not to corrupt the per-CPU list).
> ____napi_schedule() needs also to be invoked from an interrupt context
> so that the raised-softirq is processed while the interrupt context is
> left.
> 
> Add lockdep asserts for both conditions.
> While this is the second time the irq/softirq check is needed, provide a
> generic lockdep_assert_softirq_will_run() which is used by both caller.
> 
> [...]

Here is the summary with links:
  - [net-next] net: Add lockdep asserts to ____napi_schedule().
    https://git.kernel.org/netdev/net-next/c/fbd9a2ceba5c

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH net-next] net: Add lockdep asserts to ____napi_schedule().
  2022-03-11 15:03 [PATCH net-next] net: Add lockdep asserts to ____napi_schedule() Sebastian Andrzej Siewior
  2022-03-14 10:40 ` patchwork-bot+netdevbpf
@ 2022-03-17 19:21 ` Saeed Mahameed
  2022-03-18 10:05   ` Sebastian Andrzej Siewior
  2022-03-18  1:48 ` Jason A. Donenfeld
  2 siblings, 1 reply; 9+ messages in thread
From: Saeed Mahameed @ 2022-03-17 19:21 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: netdev, David S. Miller, Jakub Kicinski, Eric Dumazet,
	Thomas Gleixner, Peter Zijlstra

On 11 Mar 16:03, Sebastian Andrzej Siewior wrote:
>____napi_schedule() needs to be invoked with disabled interrupts due to
>__raise_softirq_irqoff (in order not to corrupt the per-CPU list).

This patch is likely causing the following call trace when RPS is enabled:

  [  690.429122] WARNING: CPU: 0 PID: 0 at net/core/dev.c:4268 rps_trigger_softirq+0x21/0xb0
  [  690.431236] Modules linked in: bonding ib_ipoib ipip tunnel4 geneve ib_umad ip6_gre ip6_tunnel tunnel6 rdma_ucm nf_tables ip_gre gre mlx5_ib ib_uverbs mlx5_core iptable_raw openvswitch nsh xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink rpcrdma ib_iser xt_addrtype libiscsi scsi_transport_iscsi rdma_cm iw_cm iptable_nat nf_nat ib_cm br_netfilter ib_core overlay fuse [last unloaded: ib_uverbs]
  [  690.439693] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.17.0-rc7_net_next_4303f9c #1
  [  690.441709] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  [  690.444587] RIP: 0010:rps_trigger_softirq+0x21/0xb0
  [  690.445971] Code: ff 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 8b 05 b1 ea 21 01 53 48 89 fb 85 c0 74 1b 65 8b 05 16 56 71 7e a9 00 ff 0f 00 75 02 <0f> 0b 65 8b 05 4e 5f 72 7e 85 c0 74 5b 48 8b 83 e0 01 00 00 f6 c4
  [  690.450682] RSP: 0018:ffffffff82803e70 EFLAGS: 00010046
  [  690.452106] RAX: 0000000000000001 RBX: ffff88852ca3d400 RCX: ffff88852ca3d540
  [  690.453972] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88852ca3d400
  [  690.455860] RBP: 0000000000000000 R08: ffff88852ca3d400 R09: 0000000000000001
  [  690.457684] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
  [  690.459548] R13: ffff88852ca3d540 R14: ffffffff82829628 R15: 0000000000000000
  [  690.461429] FS:  0000000000000000(0000) GS:ffff88852ca00000(0000) knlGS:0000000000000000
  [  690.463653] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [  690.465180] CR2: 00007ff718200b98 CR3: 000000013b4de003 CR4: 0000000000370eb0
  [  690.467022] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [  690.468915] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [  690.470742] Call Trace:
  [  690.471544]  <TASK>
  [  690.472242]  flush_smp_call_function_queue+0xe5/0x1e0
  [  690.473639]  flush_smp_call_function_from_idle+0x5f/0xa0


For some reason, which I haven't looked into yet, net_rps_send_ipi()
will eventually invoke ____napi_schedule() only after enabling IRQs;
see net_rps_action_and_irq_enable().



* Re: [PATCH net-next] net: Add lockdep asserts to ____napi_schedule().
  2022-03-11 15:03 [PATCH net-next] net: Add lockdep asserts to ____napi_schedule() Sebastian Andrzej Siewior
  2022-03-14 10:40 ` patchwork-bot+netdevbpf
  2022-03-17 19:21 ` Saeed Mahameed
@ 2022-03-18  1:48 ` Jason A. Donenfeld
  2022-03-18 10:57   ` Sebastian Andrzej Siewior
  2 siblings, 1 reply; 9+ messages in thread
From: Jason A. Donenfeld @ 2022-03-18  1:48 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: netdev, David S. Miller, Jakub Kicinski, Eric Dumazet,
	Thomas Gleixner, Peter Zijlstra, toke

Hi Sebastian,

On Fri, Mar 11, 2022 at 04:03:42PM +0100, Sebastian Andrzej Siewior wrote:
> ____napi_schedule() needs to be invoked with disabled interrupts due to
> __raise_softirq_irqoff (in order not to corrupt the per-CPU list).
> ____napi_schedule() needs also to be invoked from an interrupt context
> so that the raised-softirq is processed while the interrupt context is
> left.
> 
> Add lockdep asserts for both conditions.
> While this is the second time the irq/softirq check is needed, provide a
> generic lockdep_assert_softirq_will_run() which is used by both caller.

I stumbled upon this commit when noticing a new failure in WireGuard's
test suite:

[    1.338823] ------------[ cut here ]------------
[    1.339289] WARNING: CPU: 0 PID: 11 at ../../../../../../../../net/core/dev.c:4268 __napi_schedule+0xa1/0x300
[    1.340222] CPU: 0 PID: 11 Comm: kworker/0:1 Not tainted 5.17.0-rc8-debug+ #1
[    1.340896] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS d55cb5a 04/01/2014
[    1.341669] Workqueue: wg-crypt-wg0 wg_packet_decrypt_worker
[    1.342207] RIP: 0010:__napi_schedule+0xa1/0x300
[    1.342655] Code: c0 03 0f b6 14 11 38 d0 7c 08 84 d2 0f 85 eb 01 00 00 8b 05 cd a9 0d 01 85 c0 74 1f 65 8b 05 d6 87 7d 7e a9 00 ff 0f 00 75 02 <0f> 0b 65 8b 05 96 8e 7d 7e 85 c0 0f 84 86 01 00 00 4c 8d 73 10 be
[    1.344366] RSP: 0018:ffff888004bc7c98 EFLAGS: 00010046
[    1.344861] RAX: 0000000080000000 RBX: ffff888007570750 RCX: 1ffffffff05251e5
[    1.345532] RDX: 0000000000000000 RSI: ffffffff822e1060 RDI: ffffffff8244c700
[    1.346189] RBP: ffff888036400000 R08: 0000000000000001 R09: ffff888007570767
[    1.346847] R10: ffffed1000eae0ec R11: 0000000000000001 R12: 0000000000000200
[    1.347504] R13: 00000000000364c0 R14: ffff8880078231c0 R15: ffff888007570750
[    1.348193] FS:  0000000000000000(0000) GS:ffff888036400000(0000) knlGS:0000000000000000
[    1.348973] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.349505] CR2: 00007ffec7b8ed3c CR3: 0000000002625005 CR4: 0000000000370eb0
[    1.350207] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    1.350921] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    1.351587] Call Trace:
[    1.351822]  <TASK>
[    1.352026]  ? napi_schedule_prep+0x37/0x90
[    1.352417]  wg_packet_decrypt_worker+0x2ac/0x470
[    1.352859]  ? lock_is_held_type+0xd7/0x130
[    1.353251]  process_one_work+0x839/0x1380
[    1.353651]  ? rcu_read_unlock+0x40/0x40
[    1.354023]  ? pwq_dec_nr_in_flight+0x230/0x230
[    1.354448]  ? __rwlock_init+0x140/0x140
[    1.354826]  worker_thread+0x593/0xf60
[    1.355180]  ? process_one_work+0x1380/0x1380
[    1.355593]  ? process_one_work+0x1380/0x1380
[    1.356002]  kthread+0x262/0x300
[    1.356308]  ? kthread_exit+0xc0/0xc0
[    1.356656]  ret_from_fork+0x1f/0x30
[    1.357011]  </TASK>

Sounds like wg_packet_decrypt_worker() might be doing something wrong? I
vaguely recall a thread where you started looking into some things there
that seemed non-optimal, but I didn't realize there were correctness
issues. If your patch is correct, and wg_packet_decrypt_worker() is
wrong, do you have a concrete idea of how we should approach fixing
wireguard? Or do you want to send a patch for that?

Jason


* Re: [PATCH net-next] net: Add lockdep asserts to ____napi_schedule().
  2022-03-17 19:21 ` Saeed Mahameed
@ 2022-03-18 10:05   ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2022-03-18 10:05 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: netdev, David S. Miller, Jakub Kicinski, Eric Dumazet,
	Thomas Gleixner, Peter Zijlstra

On 2022-03-17 12:21:45 [-0700], Saeed Mahameed wrote:
> On 11 Mar 16:03, Sebastian Andrzej Siewior wrote:
> > ____napi_schedule() needs to be invoked with disabled interrupts due to
> > __raise_softirq_irqoff (in order not to corrupt the per-CPU list).
> 
> This patch is likely causing the following call trace when RPS is enabled:
> 
>  [  690.429122] WARNING: CPU: 0 PID: 0 at net/core/dev.c:4268 rps_trigger_softirq+0x21/0xb0
>  [  690.431236] Modules linked in: bonding ib_ipoib ipip tunnel4 geneve ib_umad ip6_gre ip6_tunnel tunnel6 rdma_ucm nf_tables ip_gre gre mlx5_ib ib_uverbs mlx5_core iptable_raw openvswitch nsh xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink rpcrdma ib_iser xt_addrtype libiscsi scsi_transport_iscsi rdma_cm iw_cm iptable_nat nf_nat ib_cm br_netfilter ib_core overlay fuse [last unloaded: ib_uverbs]
>  [  690.439693] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.17.0-rc7_net_next_4303f9c #1
>  [  690.441709] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
>  [  690.444587] RIP: 0010:rps_trigger_softirq+0x21/0xb0
>  [  690.445971] Code: ff 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 8b 05 b1 ea 21 01 53 48 89 fb 85 c0 74 1b 65 8b 05 16 56 71 7e a9 00 ff 0f 00 75 02 <0f> 0b 65 8b 05 4e 5f 72 7e 85 c0 74 5b 48 8b 83 e0 01 00 00 f6 c4
>  [  690.450682] RSP: 0018:ffffffff82803e70 EFLAGS: 00010046
>  [  690.452106] RAX: 0000000000000001 RBX: ffff88852ca3d400 RCX: ffff88852ca3d540
>  [  690.453972] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88852ca3d400
>  [  690.455860] RBP: 0000000000000000 R08: ffff88852ca3d400 R09: 0000000000000001
>  [  690.457684] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
>  [  690.459548] R13: ffff88852ca3d540 R14: ffffffff82829628 R15: 0000000000000000
>  [  690.461429] FS:  0000000000000000(0000) GS:ffff88852ca00000(0000) knlGS:0000000000000000
>  [  690.463653] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  [  690.465180] CR2: 00007ff718200b98 CR3: 000000013b4de003 CR4: 0000000000370eb0
>  [  690.467022] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>  [  690.468915] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>  [  690.470742] Call Trace:
>  [  690.471544]  <TASK>
>  [  690.472242]  flush_smp_call_function_queue+0xe5/0x1e0
>  [  690.473639]  flush_smp_call_function_from_idle+0x5f/0xa0
> 
> 
> For some reason - that i haven't looked into yet -
> net_rps_send_ipi() will eventually ____napi_schedule()
> only after enabling IRQ. see net_rps_action_and_irq_enable()

Perfect. There is a do_softirq() in flush_smp_call_function_from_idle(),
so it is fine.
PeterZ, any idea how to quiet lockdep here? Playing with the preemption
counter would do the trick… I'm worried that by disabling BH
unconditionally here it will need to be done by the upper caller, and
this in turn will force a BH flush on PREEMPT_RT. While that looks
harmless in the idle case, it looks bad for migration_cpu_stop().

Sebastian


* Re: [PATCH net-next] net: Add lockdep asserts to ____napi_schedule().
  2022-03-18  1:48 ` Jason A. Donenfeld
@ 2022-03-18 10:57   ` Sebastian Andrzej Siewior
  2022-03-18 18:19     ` Jason A. Donenfeld
  0 siblings, 1 reply; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2022-03-18 10:57 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: netdev, David S. Miller, Jakub Kicinski, Eric Dumazet,
	Thomas Gleixner, Peter Zijlstra, toke

On 2022-03-17 19:48:51 [-0600], Jason A. Donenfeld wrote:
> Hi Sebastian,
Hi Jason,

> I stumbled upon this commit when noticing a new failure in WireGuard's
> test suite:
> [    1.339289] WARNING: CPU: 0 PID: 11 at ../../../../../../../../net/core/dev.c:4268 __napi_schedule+0xa1/0x300
> [    1.352417]  wg_packet_decrypt_worker+0x2ac/0x470
> Sounds like wg_packet_decrypt_worker() might be doing something wrong? I
> vaguely recall a thread where you started looking into some things there
> that seemed non-optimal, but I didn't realize there were correctness
> issues. If your patch is correct, and wg_packet_decrypt_worker() is
> wrong, do you have a concrete idea of how we should approach fixing
> wireguard? Or do you want to send a patch for that?

In your case it is "okay" since ptr_ring_consume_bh() does a BH
disable/enable which forces the softirq to run. But it is not obvious.
What about the following:

diff --git a/drivers/net/wireguard/receive.c b/drivers/net/wireguard/receive.c
index 7b8df406c7737..26ffa3afa542e 100644
--- a/drivers/net/wireguard/receive.c
+++ b/drivers/net/wireguard/receive.c
@@ -502,15 +502,21 @@ void wg_packet_decrypt_worker(struct work_struct *work)
 	struct crypt_queue *queue = container_of(work, struct multicore_worker,
 						 work)->ptr;
 	struct sk_buff *skb;
+	unsigned int packets = 0;
 
-	while ((skb = ptr_ring_consume_bh(&queue->ring)) != NULL) {
+	local_bh_disable();
+	while ((skb = ptr_ring_consume(&queue->ring)) != NULL) {
 		enum packet_state state =
 			likely(decrypt_packet(skb, PACKET_CB(skb)->keypair)) ?
 				PACKET_STATE_CRYPTED : PACKET_STATE_DEAD;
 		wg_queue_enqueue_per_peer_rx(skb, state);
-		if (need_resched())
+		if (!(++packets % 4)) {
+			local_bh_enable();
 			cond_resched();
+			local_bh_disable();
+		}
 	}
+	local_bh_enable();
 }
 
 static void wg_packet_consume_data(struct wg_device *wg, struct sk_buff *skb)

It would decrypt 4 packets in a row and then, after local_bh_enable(),
invoke wg_packet_rx_poll() (assuming so, since it is the only NAPI
handler in WireGuard), and after that it will attempt cond_resched()
and then continue with the next batch.

> Jason

Sebastian


* Re: [PATCH net-next] net: Add lockdep asserts to ____napi_schedule().
  2022-03-18 10:57   ` Sebastian Andrzej Siewior
@ 2022-03-18 18:19     ` Jason A. Donenfeld
  2022-03-18 18:59       ` Jakub Kicinski
  0 siblings, 1 reply; 9+ messages in thread
From: Jason A. Donenfeld @ 2022-03-18 18:19 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Netdev, David S. Miller, Jakub Kicinski, Eric Dumazet,
	Thomas Gleixner, Peter Zijlstra,
	Toke Høiland-Jørgensen

Hi Sebastian,

On Fri, Mar 18, 2022 at 4:57 AM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
> > Hi Sebastian,
> Hi Jason,
>
> > I stumbled upon this commit when noticing a new failure in WireGuard's
> > test suite:
> …
> > [    1.339289] WARNING: CPU: 0 PID: 11 at ../../../../../../../../net/core/dev.c:4268 __napi_schedule+0xa1/0x300
> …
> > [    1.352417]  wg_packet_decrypt_worker+0x2ac/0x470
> …
> > Sounds like wg_packet_decrypt_worker() might be doing something wrong? I
> > vaguely recall a thread where you started looking into some things there
> > that seemed non-optimal, but I didn't realize there were correctness
> > issues. If your patch is correct, and wg_packet_decrypt_worker() is
> > wrong, do you have a concrete idea of how we should approach fixing
> > wireguard? Or do you want to send a patch for that?
>
> In your case it is "okay" since that ptr_ring_consume_bh() will do BH
> disable/enable which forces the softirq to run. It is not obvious.

In that case, isn't the lockdep assertion you added wrong and should
be reverted? If correct code is hitting it, something seems wrong...

> What
> about the following:
>
> diff --git a/drivers/net/wireguard/receive.c b/drivers/net/wireguard/receive.c
> index 7b8df406c7737..26ffa3afa542e 100644
> --- a/drivers/net/wireguard/receive.c
> +++ b/drivers/net/wireguard/receive.c
> @@ -502,15 +502,21 @@ void wg_packet_decrypt_worker(struct work_struct *work)
>         struct crypt_queue *queue = container_of(work, struct multicore_worker,
>                                                  work)->ptr;
>         struct sk_buff *skb;
> +       unsigned int packets = 0;
>
> -       while ((skb = ptr_ring_consume_bh(&queue->ring)) != NULL) {
> +       local_bh_disable();
> +       while ((skb = ptr_ring_consume(&queue->ring)) != NULL) {
>                 enum packet_state state =
>                         likely(decrypt_packet(skb, PACKET_CB(skb)->keypair)) ?
>                                 PACKET_STATE_CRYPTED : PACKET_STATE_DEAD;
>                 wg_queue_enqueue_per_peer_rx(skb, state);
> -               if (need_resched())
> +               if (!(++packets % 4)) {
> +                       local_bh_enable();
>                         cond_resched();
> +                       local_bh_disable();
> +               }
>         }
> +       local_bh_enable();
>  }
>
>  static void wg_packet_consume_data(struct wg_device *wg, struct sk_buff *skb)
>
> It would decrypt 4 packets in a row and then after local_bh_enable() it
> would invoke wg_packet_rx_poll() (assuming since it is the only napi
> handler in wireguard) and after that it will attempt cond_resched() and
> then continue with the next batch.

I'm willing to consider batching and all sorts of heuristics in there,
though probably for 5.19 rather than 5.18. Indeed there's some
interesting optimization work to be done. But if you want to propose a
change like this, can you send some benchmarks with it, preferably
taken with something like flent so we can see if it negatively affects
latency?

Regards,
Jason


* Re: [PATCH net-next] net: Add lockdep asserts to ____napi_schedule().
  2022-03-18 18:19     ` Jason A. Donenfeld
@ 2022-03-18 18:59       ` Jakub Kicinski
  2022-03-19  0:41         ` Jason A. Donenfeld
  0 siblings, 1 reply; 9+ messages in thread
From: Jakub Kicinski @ 2022-03-18 18:59 UTC (permalink / raw)
  To: Jason A. Donenfeld, Sebastian Andrzej Siewior
  Cc: Netdev, David S. Miller, Eric Dumazet, Thomas Gleixner,
	Peter Zijlstra, Toke Høiland-Jørgensen

On Fri, 18 Mar 2022 12:19:45 -0600 Jason A. Donenfeld wrote:
> > In your case it is "okay" since that ptr_ring_consume_bh() will do BH
> > disable/enable which forces the softirq to run. It is not obvious.  
> 
> In that case, isn't the lockdep assertion you added wrong and should
> be reverted? If correct code is hitting it, something seems wrong...

FWIW I'd lean towards a revert as well; I can't think of a simple
fix that won't require workarounds in callers.


* Re: [PATCH net-next] net: Add lockdep asserts to ____napi_schedule().
  2022-03-18 18:59       ` Jakub Kicinski
@ 2022-03-19  0:41         ` Jason A. Donenfeld
  0 siblings, 0 replies; 9+ messages in thread
From: Jason A. Donenfeld @ 2022-03-19  0:41 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Sebastian Andrzej Siewior, Netdev, David S. Miller, Eric Dumazet,
	Thomas Gleixner, Peter Zijlstra,
	Toke Høiland-Jørgensen

Hey Jakub,

On Fri, Mar 18, 2022 at 12:59 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Fri, 18 Mar 2022 12:19:45 -0600 Jason A. Donenfeld wrote:
> > > In your case it is "okay" since that ptr_ring_consume_bh() will do BH
> > > disable/enable which forces the softirq to run. It is not obvious.
> >
> > In that case, isn't the lockdep assertion you added wrong and should
> > be reverted? If correct code is hitting it, something seems wrong...
>
> FWIW I'd lean towards revert as well, I can't think of a simple
> fix that won't require work arounds in callers.

I just got an email from syzbot about this too:
https://lore.kernel.org/wireguard/0000000000000eaff805da869d5b@google.com/

And no word from Sebastian, and now it's the weekend, so I suspect
you're probably right. I'll send in a revert.

Jason


end of thread, other threads:[~2022-03-19  0:41 UTC | newest]

Thread overview: 9+ messages
2022-03-11 15:03 [PATCH net-next] net: Add lockdep asserts to ____napi_schedule() Sebastian Andrzej Siewior
2022-03-14 10:40 ` patchwork-bot+netdevbpf
2022-03-17 19:21 ` Saeed Mahameed
2022-03-18 10:05   ` Sebastian Andrzej Siewior
2022-03-18  1:48 ` Jason A. Donenfeld
2022-03-18 10:57   ` Sebastian Andrzej Siewior
2022-03-18 18:19     ` Jason A. Donenfeld
2022-03-18 18:59       ` Jakub Kicinski
2022-03-19  0:41         ` Jason A. Donenfeld
