bpf.vger.kernel.org archive mirror
* [PATCH net v3] ixgbe: Panic during XDP_TX with > 64 CPUs
@ 2023-03-08 22:07 John Hickey
  2023-03-25 18:29 ` Rout, ChandanX
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: John Hickey @ 2023-03-08 22:07 UTC (permalink / raw)
  To: anthony.l.nguyen
  Cc: John Hickey, Jesse Brandeburg, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Shujin Li, Jason Xing,
	intel-wired-lan, netdev, linux-kernel, bpf

In commit 'ixgbe: let the xdpdrv work with more than 64 cpus'
(4fe815850bdc), support was added to allow XDP programs to run on systems
with more than 64 CPUs by locking the XDP TX rings and indexing them
using cpu % 64 (IXGBE_MAX_XDP_QS).
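
For reference, the ring selection in ixgbe.h boils down to roughly the
following (paraphrased for readers of this thread, not copied verbatim
from the driver):

  static inline int ixgbe_determine_xdp_q_idx(int cpu)
  {
          if (static_key_enabled(&ixgbe_xdp_locking_key))
                  return cpu % IXGBE_MAX_XDP_QS;
          else
                  return cpu;
  }

  static inline
  struct ixgbe_ring *ixgbe_determine_xdp_ring(struct ixgbe_adapter *adapter)
  {
          /* with the key disabled this is the raw CPU number */
          int index = ixgbe_determine_xdp_q_idx(smp_processor_id());

          return adapter->xdp_ring[index];
  }

So if ixgbe_xdp_locking_key is unexpectedly disabled, any CPU >= 64
indexes past the end of xdp_ring[], which is what the splat below shows.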

Upon trying this out patch via the Intel 5.18.6 out of tree driver
on a system with more than 64 cores, the kernel paniced with an
array-index-out-of-bounds at the return in ixgbe_determine_xdp_ring in
ixgbe.h, which means ixgbe_determine_xdp_q_idx was just returning the
cpu instead of cpu % IXGBE_MAX_XDP_QS.  An example splat:

 ==========================================================================
 UBSAN: array-index-out-of-bounds in
 /var/lib/dkms/ixgbe/5.18.6+focal-1/build/src/ixgbe.h:1147:26
 index 65 is out of range for type 'ixgbe_ring *[64]'
 ==========================================================================
 BUG: kernel NULL pointer dereference, address: 0000000000000058
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] SMP NOPTI
 CPU: 65 PID: 408 Comm: ksoftirqd/65
 Tainted: G          IOE     5.15.0-48-generic #54~20.04.1-Ubuntu
 Hardware name: Dell Inc. PowerEdge R640/0W23H8, BIOS 2.5.4 01/13/2020
 RIP: 0010:ixgbe_xmit_xdp_ring+0x1b/0x1c0 [ixgbe]
 Code: 3b 52 d4 cf e9 42 f2 ff ff 66 0f 1f 44 00 00 0f 1f 44 00 00 55 b9
 00 00 00 00 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 08 <44> 0f b7
 47 58 0f b7 47 5a 0f b7 57 54 44 0f b7 76 08 66 41 39 c0
 RSP: 0018:ffffbc3fcd88fcb0 EFLAGS: 00010282
 RAX: ffff92a253260980 RBX: ffffbc3fe68b00a0 RCX: 0000000000000000
 RDX: ffff928b5f659000 RSI: ffff928b5f659000 RDI: 0000000000000000
 RBP: ffffbc3fcd88fce0 R08: ffff92b9dfc20580 R09: 0000000000000001
 R10: 3d3d3d3d3d3d3d3d R11: 3d3d3d3d3d3d3d3d R12: 0000000000000000
 R13: ffff928b2f0fa8c0 R14: ffff928b9be20050 R15: 000000000000003c
 FS:  0000000000000000(0000) GS:ffff92b9dfc00000(0000)
 knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000058 CR3: 000000011dd6a002 CR4: 00000000007706e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 PKRU: 55555554
 Call Trace:
  <TASK>
  ixgbe_poll+0x103e/0x1280 [ixgbe]
  ? sched_clock_cpu+0x12/0xe0
  __napi_poll+0x30/0x160
  net_rx_action+0x11c/0x270
  __do_softirq+0xda/0x2ee
  run_ksoftirqd+0x2f/0x50
  smpboot_thread_fn+0xb7/0x150
  ? sort_range+0x30/0x30
  kthread+0x127/0x150
  ? set_kthread_struct+0x50/0x50
  ret_from_fork+0x1f/0x30
  </TASK>

I think this is how it happens:

Upon loading the first XDP program on a system with more than 64 CPUs,
ixgbe_xdp_locking_key is incremented in ixgbe_xdp_setup.  However,
immediately after this, the rings are reconfigured by ixgbe_setup_tc.
ixgbe_setup_tc calls ixgbe_clear_interrupt_scheme which calls
ixgbe_free_q_vectors which calls ixgbe_free_q_vector in a loop.
ixgbe_free_q_vector decrements ixgbe_xdp_locking_key once per call if
it is non-zero.  Commenting out the decrement in ixgbe_free_q_vector
stopped my system from panicing.
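
As a toy model of that imbalance (made-up names, not driver code): the
key is bumped once per program load but dropped once per freed queue
vector, so the first ring reconfiguration already takes it back to zero:

  /* key_count stands in for ixgbe_xdp_locking_key; the modulo/locked
   * TX path is only used while key_count > 0. */
  static int key_count;

  static void load_xdp_prog(void)
  {
          key_count++;                    /* one inc per program load */
  }

  static void free_one_q_vector(void)
  {
          if (key_count)
                  key_count--;            /* one dec per freed vector */
  }

  static void reconfigure_rings(int num_q_vectors)
  {
          int i;

          for (i = 0; i < num_q_vectors; i++)
                  free_one_q_vector();    /* already 0 after the first call */
  }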

I suspect to make the original patch work, I would need to load an XDP
program and then replace it in order to get ixgbe_xdp_locking_key back
above 0 since ixgbe_setup_tc is only called when transitioning between
XDP and non-XDP ring configurations, while ixgbe_xdp_locking_key is
incremented every time ixgbe_xdp_setup is called.

Also, ixgbe_setup_tc can be called via ethtool --set-channels, so this
becomes another path to decrement ixgbe_xdp_locking_key to 0 on systems
with greater than 64 CPUs.

For this patch, I have changed static_branch_inc to static_branch_enable
in ixgbe_xdp_setup.  We weren't counting references.  The
ixgbe_xdp_locking_key only protects code in the XDP_TX path, which is
not run unless an XDP program is loaded.  The other condition for setting
it on is the number of CPUs, which I assume is static.
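
For anyone less familiar with static keys, a minimal sketch of the two
usage patterns involved (only the generic <linux/jump_label.h> API is
assumed here, this is not ixgbe code):

  #include <linux/jump_label.h>

  DEFINE_STATIC_KEY_FALSE(example_key);

  /* Counted pattern: enabled only while incs outnumber decs, so every
   * static_branch_inc() needs a matching static_branch_dec(). */
  static void counted_on(void)  { static_branch_inc(&example_key); }
  static void counted_off(void) { static_branch_dec(&example_key); }

  /* Boolean pattern: on until explicitly disabled, no matter how many
   * times enable was called. */
  static void boolean_on(void)  { static_branch_enable(&example_key); }

  static bool guarded_path(void)
  {
          return static_branch_unlikely(&example_key);
  }

Since the driver only needs an on/off decision tied to the CPU count,
the boolean pattern is sufficient here.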

Fixes: 4fe815850bdc ("ixgbe: let the xdpdrv work with more than 64 cpus")
Signed-off-by: John Hickey <jjh@daedalian.us>
---
v1 -> v2:
	Added Fixes and net tag.  No code changes.
v2 -> v3:
	Added splat.  Slight clarification as to why ixgbe_xdp_locking_key
	is not turned off.  Based on feedback from Maciej Fijalkowski.
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c  | 3 ---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 2 +-
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
index f8156fe4b1dc..0ee943db3dc9 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
@@ -1035,9 +1035,6 @@ static void ixgbe_free_q_vector(struct ixgbe_adapter *adapter, int v_idx)
 	adapter->q_vector[v_idx] = NULL;
 	__netif_napi_del(&q_vector->napi);
 
-	if (static_key_enabled(&ixgbe_xdp_locking_key))
-		static_branch_dec(&ixgbe_xdp_locking_key);
-
 	/*
 	 * after a call to __netif_napi_del() napi may still be used and
 	 * ixgbe_get_stats64() might access the rings on this vector,
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index ab8370c413f3..cd2fb72c67be 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -10283,7 +10283,7 @@ static int ixgbe_xdp_setup(struct net_device *dev, struct bpf_prog *prog)
 	if (nr_cpu_ids > IXGBE_MAX_XDP_QS * 2)
 		return -ENOMEM;
 	else if (nr_cpu_ids > IXGBE_MAX_XDP_QS)
-		static_branch_inc(&ixgbe_xdp_locking_key);
+		static_branch_enable(&ixgbe_xdp_locking_key);
 
 	old_prog = xchg(&adapter->xdp_prog, prog);
 	need_reset = (!!prog != !!old_prog);
-- 
2.37.2



* RE: [PATCH net v3] ixgbe: Panic during XDP_TX with > 64 CPUs
  2023-03-08 22:07 [PATCH net v3] ixgbe: Panic during XDP_TX with > 64 CPUs John Hickey
@ 2023-03-25 18:29 ` Rout, ChandanX
  2023-03-26 15:03 ` [Intel-wired-lan] " Paul Menzel
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Rout, ChandanX @ 2023-03-25 18:29 UTC (permalink / raw)
  To: John Hickey, Nguyen, Anthony L, intel-wired-lan
  Cc: Brandeburg, Jesse, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Shujin Li, Xing, Wanli,
	netdev, linux-kernel, bpf, Kuruvinakunnel, George, Nagraj,
	Shravan, Nagaraju, Shwetha



>-----Original Message-----
>From: John Hickey <jjh@daedalian.us>
>Sent: 09 March 2023 03:38
>To: Nguyen, Anthony L <anthony.l.nguyen@intel.com>
>Cc: John Hickey <jjh@daedalian.us>; Brandeburg, Jesse
><jesse.brandeburg@intel.com>; David S. Miller <davem@davemloft.net>;
>Eric Dumazet <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>;
>Paolo Abeni <pabeni@redhat.com>; Alexei Starovoitov <ast@kernel.org>;
>Daniel Borkmann <daniel@iogearbox.net>; Jesper Dangaard Brouer
><hawk@kernel.org>; John Fastabend <john.fastabend@gmail.com>; Shujin Li
><lishujin@kuaishou.com>; Xing, Wanli <xingwanli@kuaishou.com>; intel-
>wired-lan@lists.osuosl.org; netdev@vger.kernel.org; linux-
>kernel@vger.kernel.org; bpf@vger.kernel.org
>Subject: [PATCH net v3] ixgbe: Panic during XDP_TX with > 64 CPUs
>
>In commit 'ixgbe: let the xdpdrv work with more than 64 cpus'
>(4fe815850bdc), support was added to allow XDP programs to run on systems
>with more than 64 CPUs by locking the XDP TX rings and indexing them using
>cpu % 64 (IXGBE_MAX_XDP_QS).
>
>Upon trying this out patch via the Intel 5.18.6 out of tree driver on a system
>with more than 64 cores, the kernel paniced with an array-index-out-of-
>bounds at the return in ixgbe_determine_xdp_ring in ixgbe.h, which means
>ixgbe_determine_xdp_q_idx was just returning the cpu instead of cpu %
>IXGBE_MAX_XDP_QS.  An example splat:
>
> ==========================================================================
> UBSAN: array-index-out-of-bounds in
> /var/lib/dkms/ixgbe/5.18.6+focal-1/build/src/ixgbe.h:1147:26
> index 65 is out of range for type 'ixgbe_ring *[64]'
> ==========================================================================
> BUG: kernel NULL pointer dereference, address: 0000000000000058
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0
> Oops: 0000 [#1] SMP NOPTI
> CPU: 65 PID: 408 Comm: ksoftirqd/65
> Tainted: G          IOE     5.15.0-48-generic #54~20.04.1-Ubuntu
> Hardware name: Dell Inc. PowerEdge R640/0W23H8, BIOS 2.5.4 01/13/2020
> RIP: 0010:ixgbe_xmit_xdp_ring+0x1b/0x1c0 [ixgbe]
> Code: 3b 52 d4 cf e9 42 f2 ff ff 66 0f 1f 44 00 00 0f 1f 44 00 00 55 b9
> 00 00 00 00 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 08 <44> 0f b7
> 47 58 0f b7 47 5a 0f b7 57 54 44 0f b7 76 08 66 41 39 c0
> RSP: 0018:ffffbc3fcd88fcb0 EFLAGS: 00010282
> RAX: ffff92a253260980 RBX: ffffbc3fe68b00a0 RCX: 0000000000000000
> RDX: ffff928b5f659000 RSI: ffff928b5f659000 RDI: 0000000000000000
> RBP: ffffbc3fcd88fce0 R08: ffff92b9dfc20580 R09: 0000000000000001
> R10: 3d3d3d3d3d3d3d3d R11: 3d3d3d3d3d3d3d3d R12: 0000000000000000
> R13: ffff928b2f0fa8c0 R14: ffff928b9be20050 R15: 000000000000003c
> FS:  0000000000000000(0000) GS:ffff92b9dfc00000(0000)
> knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000058 CR3: 000000011dd6a002 CR4: 00000000007706e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> PKRU: 55555554
> Call Trace:
>  <TASK>
>  ixgbe_poll+0x103e/0x1280 [ixgbe]
>  ? sched_clock_cpu+0x12/0xe0
>  __napi_poll+0x30/0x160
>  net_rx_action+0x11c/0x270
>  __do_softirq+0xda/0x2ee
>  run_ksoftirqd+0x2f/0x50
>  smpboot_thread_fn+0xb7/0x150
>  ? sort_range+0x30/0x30
>  kthread+0x127/0x150
>  ? set_kthread_struct+0x50/0x50
>  ret_from_fork+0x1f/0x30
>  </TASK>
>
>I think this is how it happens:
>
>Upon loading the first XDP program on a system with more than 64 CPUs,
>ixgbe_xdp_locking_key is incremented in ixgbe_xdp_setup.  However,
>immediately after this, the rings are reconfigured by ixgbe_setup_tc.
>ixgbe_setup_tc calls ixgbe_clear_interrupt_scheme which calls
>ixgbe_free_q_vectors which calls ixgbe_free_q_vector in a loop.
>ixgbe_free_q_vector decrements ixgbe_xdp_locking_key once per call if it is
>non-zero.  Commenting out the decrement in ixgbe_free_q_vector stopped
>my system from panicing.
>
>I suspect to make the original patch work, I would need to load an XDP
>program and then replace it in order to get ixgbe_xdp_locking_key back
>above 0 since ixgbe_setup_tc is only called when transitioning between XDP
>and non-XDP ring configurations, while ixgbe_xdp_locking_key is
>incremented every time ixgbe_xdp_setup is called.
>
>Also, ixgbe_setup_tc can be called via ethtool --set-channels, so this becomes
>another path to decrement ixgbe_xdp_locking_key to 0 on systems with
>greater than 64 CPUs.
>
>For this patch, I have changed static_branch_inc to static_branch_enable in
>ixgbe_setup_xdp.  We weren't counting references.  The
>ixgbe_xdp_locking_key only protects code in the XDP_TX path, which is not
>run when an XDP program is loaded.  The other condition for setting it on is
>the number of CPUs, which I assume is static.
>
>Fixes: 4fe815850bdc ("ixgbe: let the xdpdrv work with more than 64 cpus")
>Signed-off-by: John Hickey <jjh@daedalian.us>
>---
>v1 -> v2:
>	Added Fixes and net tag.  No code changes.
>v2 -> v3:
>	Added splat.  Slight clarification as to why ixgbe_xdp_locking_key
>	is not turned off.  Based on feedback from Maciej Fijalkowski.
>---
> drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c  | 3 ---
>drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 2 +-
> 2 files changed, 1 insertion(+), 4 deletions(-)
>

Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)



* Re: [Intel-wired-lan] [PATCH net v3] ixgbe: Panic during XDP_TX with > 64 CPUs
  2023-03-08 22:07 [PATCH net v3] ixgbe: Panic during XDP_TX with > 64 CPUs John Hickey
  2023-03-25 18:29 ` Rout, ChandanX
@ 2023-03-26 15:03 ` Paul Menzel
  2023-03-26 15:10   ` Paul Menzel
  2023-03-27  3:00 ` Jason Xing
  2023-04-05 14:40 ` Maciej Fijalkowski
  3 siblings, 1 reply; 10+ messages in thread
From: Paul Menzel @ 2023-03-26 15:03 UTC (permalink / raw)
  To: John Hickey
  Cc: Tony Nguyen, Shujin Li, Alexei Starovoitov,
	Jesper Dangaard Brouer, Daniel Borkmann, netdev, John Fastabend,
	Jesse Brandeburg, Eric Dumazet, intel-wired-lan, Jakub Kicinski,
	bpf, Paolo Abeni, David S. Miller, linux-kernel, Jason Xing

Dear John,


Thank you for your patch.

I’d recommend, to use a statement in the git commit message/summary by 
adding a verb (in imperative mood). Maybe:

Fix panic during XDP_TX with > 64 CPUs

Am 08.03.23 um 23:07 schrieb John Hickey:
> In commit 'ixgbe: let the xdpdrv work with more than 64 cpus'
> (4fe815850bdc), support was added to allow XDP programs to run on systems

I think it’s more common to write it like:

In commit 4fe815850bdc (ixgbe: let the xdpdrv work with more than 64 cpus) …

Even shorter

Commit 4fe815850bdc (ixgbe: let the xdpdrv work with more than 64 cpus) 
adds support to allow XDP programs …

> with more than 64 CPUs by locking the XDP TX rings and indexing them
> using cpu % 64 (IXGBE_MAX_XDP_QS).
> 
> Upon trying this out patch via the Intel 5.18.6 out of tree driver

Upon trying this patch out via …

> on a system with more than 64 cores, the kernel paniced with an
> array-index-out-of-bounds at the return in ixgbe_determine_xdp_ring in
> ixgbe.h, which means ixgbe_determine_xdp_q_idx was just returning the
> cpu instead of cpu % IXGBE_MAX_XDP_QS.  An example splat:

Please add, that you have UBSAN  enabled, or does it happen without?

> 
>   ==========================================================================
>   UBSAN: array-index-out-of-bounds in
>   /var/lib/dkms/ixgbe/5.18.6+focal-1/build/src/ixgbe.h:1147:26
>   index 65 is out of range for type 'ixgbe_ring *[64]'
>   ==========================================================================
>   BUG: kernel NULL pointer dereference, address: 0000000000000058
>   #PF: supervisor read access in kernel mode
>   #PF: error_code(0x0000) - not-present page
>   PGD 0 P4D 0
>   Oops: 0000 [#1] SMP NOPTI
>   CPU: 65 PID: 408 Comm: ksoftirqd/65
>   Tainted: G          IOE     5.15.0-48-generic #54~20.04.1-Ubuntu
>   Hardware name: Dell Inc. PowerEdge R640/0W23H8, BIOS 2.5.4 01/13/2020
>   RIP: 0010:ixgbe_xmit_xdp_ring+0x1b/0x1c0 [ixgbe]
>   Code: 3b 52 d4 cf e9 42 f2 ff ff 66 0f 1f 44 00 00 0f 1f 44 00 00 55 b9
>   00 00 00 00 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 08 <44> 0f b7
>   47 58 0f b7 47 5a 0f b7 57 54 44 0f b7 76 08 66 41 39 c0

If you do not know it yet, `scripts/decode_stacktrace.sh` helps decode
these traces.

>   RSP: 0018:ffffbc3fcd88fcb0 EFLAGS: 00010282
>   RAX: ffff92a253260980 RBX: ffffbc3fe68b00a0 RCX: 0000000000000000
>   RDX: ffff928b5f659000 RSI: ffff928b5f659000 RDI: 0000000000000000
>   RBP: ffffbc3fcd88fce0 R08: ffff92b9dfc20580 R09: 0000000000000001
>   R10: 3d3d3d3d3d3d3d3d R11: 3d3d3d3d3d3d3d3d R12: 0000000000000000
>   R13: ffff928b2f0fa8c0 R14: ffff928b9be20050 R15: 000000000000003c
>   FS:  0000000000000000(0000) GS:ffff92b9dfc00000(0000)
>   knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 0000000000000058 CR3: 000000011dd6a002 CR4: 00000000007706e0
>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>   PKRU: 55555554
>   Call Trace:
>    <TASK>
>    ixgbe_poll+0x103e/0x1280 [ixgbe]
>    ? sched_clock_cpu+0x12/0xe0
>    __napi_poll+0x30/0x160
>    net_rx_action+0x11c/0x270
>    __do_softirq+0xda/0x2ee
>    run_ksoftirqd+0x2f/0x50
>    smpboot_thread_fn+0xb7/0x150
>    ? sort_range+0x30/0x30
>    kthread+0x127/0x150
>    ? set_kthread_struct+0x50/0x50
>    ret_from_fork+0x1f/0x30
>    </TASK>
> 
> I think this is how it happens:
> 
> Upon loading the first XDP program on a system with more than 64 CPUs,
> ixgbe_xdp_locking_key is incremented in ixgbe_xdp_setup.  However,
> immediately after this, the rings are reconfigured by ixgbe_setup_tc.
> ixgbe_setup_tc calls ixgbe_clear_interrupt_scheme which calls
> ixgbe_free_q_vectors which calls ixgbe_free_q_vector in a loop.
> ixgbe_free_q_vector decrements ixgbe_xdp_locking_key once per call if
> it is non-zero.  Commenting out the decrement in ixgbe_free_q_vector
> stopped my system from panicing.
> 
> I suspect to make the original patch work, I would need to load an XDP
> program and then replace it in order to get ixgbe_xdp_locking_key back
> above 0 since ixgbe_setup_tc is only called when transitioning between
> XDP and non-XDP ring configurations, while ixgbe_xdp_locking_key is
> incremented every time ixgbe_xdp_setup is called.
> 
> Also, ixgbe_setup_tc can be called via ethtool --set-channels, so this
> becomes another path to decrement ixgbe_xdp_locking_key to 0 on systems
> with greater than 64 CPUs.

… with more than 64 CPUs.

> For this patch, I have changed static_branch_inc to static_branch_enable
> in ixgbe_setup_xdp.  We weren't counting references.  The
> ixgbe_xdp_locking_key only protects code in the XDP_TX path, which is
> not run when an XDP program is loaded.  The other condition for setting
> it on is the number of CPUs, which I assume is static.
> 
> Fixes: 4fe815850bdc ("ixgbe: let the xdpdrv work with more than 64 cpus")
> Signed-off-by: John Hickey <jjh@daedalian.us>
> ---
> v1 -> v2:
> 	Added Fixes and net tag.  No code changes.
> v2 -> v3:
> 	Added splat.  Slight clarification as to why ixgbe_xdp_locking_key
> 	is not turned off.  Based on feedback from Maciej Fijalkowski.
> ---
>   drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c  | 3 ---
>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 2 +-
>   2 files changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> index f8156fe4b1dc..0ee943db3dc9 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> @@ -1035,9 +1035,6 @@ static void ixgbe_free_q_vector(struct ixgbe_adapter *adapter, int v_idx)
>   	adapter->q_vector[v_idx] = NULL;
>   	__netif_napi_del(&q_vector->napi);
>   
> -	if (static_key_enabled(&ixgbe_xdp_locking_key))
> -		static_branch_dec(&ixgbe_xdp_locking_key);
> -
>   	/*
>   	 * after a call to __netif_napi_del() napi may still be used and
>   	 * ixgbe_get_stats64() might access the rings on this vector,
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index ab8370c413f3..cd2fb72c67be 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -10283,7 +10283,7 @@ static int ixgbe_xdp_setup(struct net_device *dev, struct bpf_prog *prog)
>   	if (nr_cpu_ids > IXGBE_MAX_XDP_QS * 2)
>   		return -ENOMEM;
>   	else if (nr_cpu_ids > IXGBE_MAX_XDP_QS)
> -		static_branch_inc(&ixgbe_xdp_locking_key);
> +		static_branch_enable(&ixgbe_xdp_locking_key);
>   
>   	old_prog = xchg(&adapter->xdp_prog, prog);
>   	need_reset = (!!prog != !!old_prog);


Kind regards,

Paul


* Re: [Intel-wired-lan] [PATCH net v3] ixgbe: Panic during XDP_TX with > 64 CPUs
  2023-03-26 15:03 ` [Intel-wired-lan] " Paul Menzel
@ 2023-03-26 15:10   ` Paul Menzel
  2023-03-26 15:13     ` Paul Menzel
  2023-03-27  2:45     ` Jason Xing
  0 siblings, 2 replies; 10+ messages in thread
From: Paul Menzel @ 2023-03-26 15:10 UTC (permalink / raw)
  To: John Hickey
  Cc: Shujin Li, Jesper Dangaard Brouer, Daniel Borkmann, netdev,
	John Fastabend, Alexei Starovoitov, Jesse Brandeburg,
	Eric Dumazet, Tony Nguyen, Jakub Kicinski, intel-wired-lan, bpf,
	Paolo Abeni, David S. Miller, linux-kernel, Jason Xing

[Cc: Remove undeliverable <xingwanli@kuaishou.com>]

Am 26.03.23 um 17:03 schrieb Paul Menzel:
> Dear John,
> 
> 
> Thank you for your patch.
> 
> I’d recommend, to use a statement in the git commit message/summary by 
> adding a verb (in imperative mood). Maybe:
> 
> Fix panic during XDP_TX with > 64 CPUs
> 
> Am 08.03.23 um 23:07 schrieb John Hickey:
>> In commit 'ixgbe: let the xdpdrv work with more than 64 cpus'
>> (4fe815850bdc), support was added to allow XDP programs to run on systems
> 
> I think it’s more common to write it like:
> 
> In commit 4fe815850bdc (ixgbe: let the xdpdrv work with more than 64 
> cpus) …
> 
> Even shorter
> 
> Commit 4fe815850bdc (ixgbe: let the xdpdrv work with more than 64 cpus) 
> adds support to allow XDP programs …
> 
>> with more than 64 CPUs by locking the XDP TX rings and indexing them
>> using cpu % 64 (IXGBE_MAX_XDP_QS).
>>
>> Upon trying this out patch via the Intel 5.18.6 out of tree driver
> 
> Upon trying this patch out via …
> 
>> on a system with more than 64 cores, the kernel paniced with an
>> array-index-out-of-bounds at the return in ixgbe_determine_xdp_ring in
>> ixgbe.h, which means ixgbe_determine_xdp_q_idx was just returning the
>> cpu instead of cpu % IXGBE_MAX_XDP_QS.  An example splat:
> 
> Please add, that you have UBSAN  enabled, or does it happen without?
> 
>>
>>   ==========================================================================
>>   UBSAN: array-index-out-of-bounds in
>>   /var/lib/dkms/ixgbe/5.18.6+focal-1/build/src/ixgbe.h:1147:26
>>   index 65 is out of range for type 'ixgbe_ring *[64]'
>>   ==========================================================================
>>   BUG: kernel NULL pointer dereference, address: 0000000000000058
>>   #PF: supervisor read access in kernel mode
>>   #PF: error_code(0x0000) - not-present page
>>   PGD 0 P4D 0
>>   Oops: 0000 [#1] SMP NOPTI
>>   CPU: 65 PID: 408 Comm: ksoftirqd/65
>>   Tainted: G          IOE     5.15.0-48-generic #54~20.04.1-Ubuntu
>>   Hardware name: Dell Inc. PowerEdge R640/0W23H8, BIOS 2.5.4 01/13/2020
>>   RIP: 0010:ixgbe_xmit_xdp_ring+0x1b/0x1c0 [ixgbe]
>>   Code: 3b 52 d4 cf e9 42 f2 ff ff 66 0f 1f 44 00 00 0f 1f 44 00 00 55 b9
>>   00 00 00 00 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 08 <44> 0f b7
>>   47 58 0f b7 47 5a 0f b7 57 54 44 0f b7 76 08 66 41 39 c0
> 
> If you do not know it yet, `scripts/decode_stacktrace.sh` helps decode
> these traces.
> 
>>   RSP: 0018:ffffbc3fcd88fcb0 EFLAGS: 00010282
>>   RAX: ffff92a253260980 RBX: ffffbc3fe68b00a0 RCX: 0000000000000000
>>   RDX: ffff928b5f659000 RSI: ffff928b5f659000 RDI: 0000000000000000
>>   RBP: ffffbc3fcd88fce0 R08: ffff92b9dfc20580 R09: 0000000000000001
>>   R10: 3d3d3d3d3d3d3d3d R11: 3d3d3d3d3d3d3d3d R12: 0000000000000000
>>   R13: ffff928b2f0fa8c0 R14: ffff928b9be20050 R15: 000000000000003c
>>   FS:  0000000000000000(0000) GS:ffff92b9dfc00000(0000)
>>   knlGS:0000000000000000
>>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>   CR2: 0000000000000058 CR3: 000000011dd6a002 CR4: 00000000007706e0
>>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>   PKRU: 55555554
>>   Call Trace:
>>    <TASK>
>>    ixgbe_poll+0x103e/0x1280 [ixgbe]
>>    ? sched_clock_cpu+0x12/0xe0
>>    __napi_poll+0x30/0x160
>>    net_rx_action+0x11c/0x270
>>    __do_softirq+0xda/0x2ee
>>    run_ksoftirqd+0x2f/0x50
>>    smpboot_thread_fn+0xb7/0x150
>>    ? sort_range+0x30/0x30
>>    kthread+0x127/0x150
>>    ? set_kthread_struct+0x50/0x50
>>    ret_from_fork+0x1f/0x30
>>    </TASK>
>>
>> I think this is how it happens:
>>
>> Upon loading the first XDP program on a system with more than 64 CPUs,
>> ixgbe_xdp_locking_key is incremented in ixgbe_xdp_setup.  However,
>> immediately after this, the rings are reconfigured by ixgbe_setup_tc.
>> ixgbe_setup_tc calls ixgbe_clear_interrupt_scheme which calls
>> ixgbe_free_q_vectors which calls ixgbe_free_q_vector in a loop.
>> ixgbe_free_q_vector decrements ixgbe_xdp_locking_key once per call if
>> it is non-zero.  Commenting out the decrement in ixgbe_free_q_vector
>> stopped my system from panicing.
>>
>> I suspect to make the original patch work, I would need to load an XDP
>> program and then replace it in order to get ixgbe_xdp_locking_key back
>> above 0 since ixgbe_setup_tc is only called when transitioning between
>> XDP and non-XDP ring configurations, while ixgbe_xdp_locking_key is
>> incremented every time ixgbe_xdp_setup is called.
>>
>> Also, ixgbe_setup_tc can be called via ethtool --set-channels, so this
>> becomes another path to decrement ixgbe_xdp_locking_key to 0 on systems
>> with greater than 64 CPUs.
> 
> … with more than 64 CPUs.
> 
>> For this patch, I have changed static_branch_inc to static_branch_enable
>> in ixgbe_setup_xdp.  We weren't counting references.  The
>> ixgbe_xdp_locking_key only protects code in the XDP_TX path, which is
>> not run when an XDP program is loaded.  The other condition for setting
>> it on is the number of CPUs, which I assume is static.
>>
>> Fixes: 4fe815850bdc ("ixgbe: let the xdpdrv work with more than 64 cpus")
>> Signed-off-by: John Hickey <jjh@daedalian.us>
>> ---
>> v1 -> v2:
>>     Added Fixes and net tag.  No code changes.
>> v2 -> v3:
>>     Added splat.  Slight clarification as to why ixgbe_xdp_locking_key
>>     is not turned off.  Based on feedback from Maciej Fijalkowski.
>> ---
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c  | 3 ---
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 2 +-
>>   2 files changed, 1 insertion(+), 4 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c 
>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>> index f8156fe4b1dc..0ee943db3dc9 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>> @@ -1035,9 +1035,6 @@ static void ixgbe_free_q_vector(struct 
>> ixgbe_adapter *adapter, int v_idx)
>>       adapter->q_vector[v_idx] = NULL;
>>       __netif_napi_del(&q_vector->napi);
>> -    if (static_key_enabled(&ixgbe_xdp_locking_key))
>> -        static_branch_dec(&ixgbe_xdp_locking_key);
>> -
>>       /*
>>        * after a call to __netif_napi_del() napi may still be used and
>>        * ixgbe_get_stats64() might access the rings on this vector,
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c 
>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> index ab8370c413f3..cd2fb72c67be 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> @@ -10283,7 +10283,7 @@ static int ixgbe_xdp_setup(struct net_device 
>> *dev, struct bpf_prog *prog)
>>       if (nr_cpu_ids > IXGBE_MAX_XDP_QS * 2)
>>           return -ENOMEM;
>>       else if (nr_cpu_ids > IXGBE_MAX_XDP_QS)
>> -        static_branch_inc(&ixgbe_xdp_locking_key);
>> +        static_branch_enable(&ixgbe_xdp_locking_key);
>>       old_prog = xchg(&adapter->xdp_prog, prog);
>>       need_reset = (!!prog != !!old_prog);
> 
> 
> Kind regards,
> 
> Paul


* Re: [Intel-wired-lan] [PATCH net v3] ixgbe: Panic during XDP_TX with > 64 CPUs
  2023-03-26 15:10   ` Paul Menzel
@ 2023-03-26 15:13     ` Paul Menzel
  2023-03-27  2:45     ` Jason Xing
  1 sibling, 0 replies; 10+ messages in thread
From: Paul Menzel @ 2023-03-26 15:13 UTC (permalink / raw)
  To: John Hickey
  Cc: Jesper Dangaard Brouer, Daniel Borkmann, netdev, John Fastabend,
	Alexei Starovoitov, Jesse Brandeburg, Eric Dumazet, Paolo Abeni,
	Tony Nguyen, Jakub Kicinski, bpf, intel-wired-lan,
	David S. Miller, linux-kernel

[Cc: Remove undeliverable <lishujin@kuaishou.com>]

Am 26.03.23 um 17:10 schrieb Paul Menzel:
> [Cc: Remove undeliverable <xingwanli@kuaishou.com>]
> 
> Am 26.03.23 um 17:03 schrieb Paul Menzel:
>> Dear John,
>>
>>
>> Thank you for your patch.
>>
>> I’d recommend, to use a statement in the git commit message/summary by 
>> adding a verb (in imperative mood). Maybe:
>>
>> Fix panic during XDP_TX with > 64 CPUs
>>
>> Am 08.03.23 um 23:07 schrieb John Hickey:
>>> In commit 'ixgbe: let the xdpdrv work with more than 64 cpus'
>>> (4fe815850bdc), support was added to allow XDP programs to run on 
>>> systems
>>
>> I think it’s more common to write it like:
>>
>> In commit 4fe815850bdc (ixgbe: let the xdpdrv work with more than 64 
>> cpus) …
>>
>> Even shorter
>>
>> Commit 4fe815850bdc (ixgbe: let the xdpdrv work with more than 64 
>> cpus) adds support to allow XDP programs …
>>
>>> with more than 64 CPUs by locking the XDP TX rings and indexing them
>>> using cpu % 64 (IXGBE_MAX_XDP_QS).
>>>
>>> Upon trying this out patch via the Intel 5.18.6 out of tree driver
>>
>> Upon trying this patch out via …
>>
>>> on a system with more than 64 cores, the kernel paniced with an
>>> array-index-out-of-bounds at the return in ixgbe_determine_xdp_ring in
>>> ixgbe.h, which means ixgbe_determine_xdp_q_idx was just returning the
>>> cpu instead of cpu % IXGBE_MAX_XDP_QS.  An example splat:
>>
>> Please add, that you have UBSAN  enabled, or does it happen without?
>>
>>>
>>> ==========================================================================
>>>   UBSAN: array-index-out-of-bounds in
>>>   /var/lib/dkms/ixgbe/5.18.6+focal-1/build/src/ixgbe.h:1147:26
>>>   index 65 is out of range for type 'ixgbe_ring *[64]'
>>> ==========================================================================
>>>   BUG: kernel NULL pointer dereference, address: 0000000000000058
>>>   #PF: supervisor read access in kernel mode
>>>   #PF: error_code(0x0000) - not-present page
>>>   PGD 0 P4D 0
>>>   Oops: 0000 [#1] SMP NOPTI
>>>   CPU: 65 PID: 408 Comm: ksoftirqd/65
>>>   Tainted: G          IOE     5.15.0-48-generic #54~20.04.1-Ubuntu
>>>   Hardware name: Dell Inc. PowerEdge R640/0W23H8, BIOS 2.5.4 01/13/2020
>>>   RIP: 0010:ixgbe_xmit_xdp_ring+0x1b/0x1c0 [ixgbe]
>>>   Code: 3b 52 d4 cf e9 42 f2 ff ff 66 0f 1f 44 00 00 0f 1f 44 00 00 55 b9
>>>   00 00 00 00 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 08 <44> 0f b7
>>>   47 58 0f b7 47 5a 0f b7 57 54 44 0f b7 76 08 66 41 39 c0
>>
>> If you do not know it yet, `scripts/decode_stacktrace.sh` helps decode
>> these traces.
>>
>>>   RSP: 0018:ffffbc3fcd88fcb0 EFLAGS: 00010282
>>>   RAX: ffff92a253260980 RBX: ffffbc3fe68b00a0 RCX: 0000000000000000
>>>   RDX: ffff928b5f659000 RSI: ffff928b5f659000 RDI: 0000000000000000
>>>   RBP: ffffbc3fcd88fce0 R08: ffff92b9dfc20580 R09: 0000000000000001
>>>   R10: 3d3d3d3d3d3d3d3d R11: 3d3d3d3d3d3d3d3d R12: 0000000000000000
>>>   R13: ffff928b2f0fa8c0 R14: ffff928b9be20050 R15: 000000000000003c
>>>   FS:  0000000000000000(0000) GS:ffff92b9dfc00000(0000)
>>>   knlGS:0000000000000000
>>>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>   CR2: 0000000000000058 CR3: 000000011dd6a002 CR4: 00000000007706e0
>>>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>>   PKRU: 55555554
>>>   Call Trace:
>>>    <TASK>
>>>    ixgbe_poll+0x103e/0x1280 [ixgbe]
>>>    ? sched_clock_cpu+0x12/0xe0
>>>    __napi_poll+0x30/0x160
>>>    net_rx_action+0x11c/0x270
>>>    __do_softirq+0xda/0x2ee
>>>    run_ksoftirqd+0x2f/0x50
>>>    smpboot_thread_fn+0xb7/0x150
>>>    ? sort_range+0x30/0x30
>>>    kthread+0x127/0x150
>>>    ? set_kthread_struct+0x50/0x50
>>>    ret_from_fork+0x1f/0x30
>>>    </TASK>
>>>
>>> I think this is how it happens:
>>>
>>> Upon loading the first XDP program on a system with more than 64 CPUs,
>>> ixgbe_xdp_locking_key is incremented in ixgbe_xdp_setup.  However,
>>> immediately after this, the rings are reconfigured by ixgbe_setup_tc.
>>> ixgbe_setup_tc calls ixgbe_clear_interrupt_scheme which calls
>>> ixgbe_free_q_vectors which calls ixgbe_free_q_vector in a loop.
>>> ixgbe_free_q_vector decrements ixgbe_xdp_locking_key once per call if
>>> it is non-zero.  Commenting out the decrement in ixgbe_free_q_vector
>>> stopped my system from panicing.
>>>
>>> I suspect to make the original patch work, I would need to load an XDP
>>> program and then replace it in order to get ixgbe_xdp_locking_key back
>>> above 0 since ixgbe_setup_tc is only called when transitioning between
>>> XDP and non-XDP ring configurations, while ixgbe_xdp_locking_key is
>>> incremented every time ixgbe_xdp_setup is called.
>>>
>>> Also, ixgbe_setup_tc can be called via ethtool --set-channels, so this
>>> becomes another path to decrement ixgbe_xdp_locking_key to 0 on systems
>>> with greater than 64 CPUs.
>>
>> … with more than 64 CPUs.
>>
>>> For this patch, I have changed static_branch_inc to static_branch_enable
>>> in ixgbe_setup_xdp.  We weren't counting references.  The
>>> ixgbe_xdp_locking_key only protects code in the XDP_TX path, which is
>>> not run when an XDP program is loaded.  The other condition for setting
>>> it on is the number of CPUs, which I assume is static.
>>>
>>> Fixes: 4fe815850bdc ("ixgbe: let the xdpdrv work with more than 64 cpus")
>>> Signed-off-by: John Hickey <jjh@daedalian.us>
>>> ---
>>> v1 -> v2:
>>>     Added Fixes and net tag.  No code changes.
>>> v2 -> v3:
>>>     Added splat.  Slight clarification as to why ixgbe_xdp_locking_key
>>>     is not turned off.  Based on feedback from Maciej Fijalkowski.
>>> ---
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c  | 3 ---
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 2 +-
>>>   2 files changed, 1 insertion(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c 
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>>> index f8156fe4b1dc..0ee943db3dc9 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>>> @@ -1035,9 +1035,6 @@ static void ixgbe_free_q_vector(struct 
>>> ixgbe_adapter *adapter, int v_idx)
>>>       adapter->q_vector[v_idx] = NULL;
>>>       __netif_napi_del(&q_vector->napi);
>>> -    if (static_key_enabled(&ixgbe_xdp_locking_key))
>>> -        static_branch_dec(&ixgbe_xdp_locking_key);
>>> -
>>>       /*
>>>        * after a call to __netif_napi_del() napi may still be used and
>>>        * ixgbe_get_stats64() might access the rings on this vector,
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c 
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> index ab8370c413f3..cd2fb72c67be 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> @@ -10283,7 +10283,7 @@ static int ixgbe_xdp_setup(struct net_device *dev, struct bpf_prog *prog)
>>>       if (nr_cpu_ids > IXGBE_MAX_XDP_QS * 2)
>>>           return -ENOMEM;
>>>       else if (nr_cpu_ids > IXGBE_MAX_XDP_QS)
>>> -        static_branch_inc(&ixgbe_xdp_locking_key);
>>> +        static_branch_enable(&ixgbe_xdp_locking_key);
>>>       old_prog = xchg(&adapter->xdp_prog, prog);
>>>       need_reset = (!!prog != !!old_prog);
>>
>>
>> Kind regards,
>>
>> Paul


* Re: [Intel-wired-lan] [PATCH net v3] ixgbe: Panic during XDP_TX with > 64 CPUs
  2023-03-26 15:10   ` Paul Menzel
  2023-03-26 15:13     ` Paul Menzel
@ 2023-03-27  2:45     ` Jason Xing
  1 sibling, 0 replies; 10+ messages in thread
From: Jason Xing @ 2023-03-27  2:45 UTC (permalink / raw)
  To: Paul Menzel
  Cc: John Hickey, Shujin Li, Jesper Dangaard Brouer, Daniel Borkmann,
	netdev, John Fastabend, Alexei Starovoitov, Jesse Brandeburg,
	Eric Dumazet, Tony Nguyen, Jakub Kicinski, intel-wired-lan, bpf,
	Paolo Abeni, David S. Miller, linux-kernel, Jason Xing

On Sun, Mar 26, 2023 at 11:50 PM Paul Menzel <pmenzel@molgen.mpg.de> wrote:
>
> [Cc: Remove undeliverable <xingwanli@kuaishou.com>]

Ah, it's my previous company email. Sorry for noticing this email so late.

>
> Am 26.03.23 um 17:03 schrieb Paul Menzel:
> > Dear John,
> >
> >
> > Thank you for your patch.
> >
> > I’d recommend, to use a statement in the git commit message/summary by
> > adding a verb (in imperative mood). Maybe:
> >
> > Fix panic during XDP_TX with > 64 CPUs
> >
> > Am 08.03.23 um 23:07 schrieb John Hickey:
> >> In commit 'ixgbe: let the xdpdrv work with more than 64 cpus'
> >> (4fe815850bdc), support was added to allow XDP programs to run on systems
> >
> > I think it’s more common to write it like:
> >
> > In commit 4fe815850bdc (ixgbe: let the xdpdrv work with more than 64
> > cpus) …
> >
> > Even shorter
> >
> > Commit 4fe815850bdc (ixgbe: let the xdpdrv work with more than 64 cpus)
> > adds support to allow XDP programs …
> >
> >> with more than 64 CPUs by locking the XDP TX rings and indexing them
> >> using cpu % 64 (IXGBE_MAX_XDP_QS).
> >>
> >> Upon trying this out patch via the Intel 5.18.6 out of tree driver
> >
> > Upon trying this patch out via …
> >
> >> on a system with more than 64 cores, the kernel paniced with an
> >> array-index-out-of-bounds at the return in ixgbe_determine_xdp_ring in
> >> ixgbe.h, which means ixgbe_determine_xdp_q_idx was just returning the
> >> cpu instead of cpu % IXGBE_MAX_XDP_QS.  An example splat:
> >
> > Please add, that you have UBSAN  enabled, or does it happen without?
> >
> >>
> >>   ==========================================================================
> >>   UBSAN: array-index-out-of-bounds in
> >>   /var/lib/dkms/ixgbe/5.18.6+focal-1/build/src/ixgbe.h:1147:26
> >>   index 65 is out of range for type 'ixgbe_ring *[64]'
> >>   ==========================================================================
> >>   BUG: kernel NULL pointer dereference, address: 0000000000000058
> >>   #PF: supervisor read access in kernel mode
> >>   #PF: error_code(0x0000) - not-present page
> >>   PGD 0 P4D 0
> >>   Oops: 0000 [#1] SMP NOPTI
> >>   CPU: 65 PID: 408 Comm: ksoftirqd/65
> >>   Tainted: G          IOE     5.15.0-48-generic #54~20.04.1-Ubuntu
> >>   Hardware name: Dell Inc. PowerEdge R640/0W23H8, BIOS 2.5.4 01/13/2020
> >>   RIP: 0010:ixgbe_xmit_xdp_ring+0x1b/0x1c0 [ixgbe]
> >>   Code: 3b 52 d4 cf e9 42 f2 ff ff 66 0f 1f 44 00 00 0f 1f 44 00 00 55 b9
> >>   00 00 00 00 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 08 <44> 0f b7
> >>   47 58 0f b7 47 5a 0f b7 57 54 44 0f b7 76 08 66 41 39 c0
> >
> > If you do not know it yet, `scripts/decode_stacktrace.sh` helps decode
> > these traces.
> >
> >>   RSP: 0018:ffffbc3fcd88fcb0 EFLAGS: 00010282
> >>   RAX: ffff92a253260980 RBX: ffffbc3fe68b00a0 RCX: 0000000000000000
> >>   RDX: ffff928b5f659000 RSI: ffff928b5f659000 RDI: 0000000000000000
> >>   RBP: ffffbc3fcd88fce0 R08: ffff92b9dfc20580 R09: 0000000000000001
> >>   R10: 3d3d3d3d3d3d3d3d R11: 3d3d3d3d3d3d3d3d R12: 0000000000000000
> >>   R13: ffff928b2f0fa8c0 R14: ffff928b9be20050 R15: 000000000000003c
> >>   FS:  0000000000000000(0000) GS:ffff92b9dfc00000(0000)
> >>   knlGS:0000000000000000
> >>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>   CR2: 0000000000000058 CR3: 000000011dd6a002 CR4: 00000000007706e0
> >>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >>   PKRU: 55555554
> >>   Call Trace:
> >>    <TASK>
> >>    ixgbe_poll+0x103e/0x1280 [ixgbe]
> >>    ? sched_clock_cpu+0x12/0xe0
> >>    __napi_poll+0x30/0x160
> >>    net_rx_action+0x11c/0x270
> >>    __do_softirq+0xda/0x2ee
> >>    run_ksoftirqd+0x2f/0x50
> >>    smpboot_thread_fn+0xb7/0x150
> >>    ? sort_range+0x30/0x30
> >>    kthread+0x127/0x150
> >>    ? set_kthread_struct+0x50/0x50
> >>    ret_from_fork+0x1f/0x30
> >>    </TASK>
> >>
> >> I think this is how it happens:
> >>
> >> Upon loading the first XDP program on a system with more than 64 CPUs,
> >> ixgbe_xdp_locking_key is incremented in ixgbe_xdp_setup.  However,
> >> immediately after this, the rings are reconfigured by ixgbe_setup_tc.
> >> ixgbe_setup_tc calls ixgbe_clear_interrupt_scheme which calls
> >> ixgbe_free_q_vectors which calls ixgbe_free_q_vector in a loop.
> >> ixgbe_free_q_vector decrements ixgbe_xdp_locking_key once per call if
> >> it is non-zero.  Commenting out the decrement in ixgbe_free_q_vector
> >> stopped my system from panicing.
> >>
> >> I suspect to make the original patch work, I would need to load an XDP
> >> program and then replace it in order to get ixgbe_xdp_locking_key back
> >> above 0 since ixgbe_setup_tc is only called when transitioning between
> >> XDP and non-XDP ring configurations, while ixgbe_xdp_locking_key is
> >> incremented every time ixgbe_xdp_setup is called.
> >>
> >> Also, ixgbe_setup_tc can be called via ethtool --set-channels, so this
> >> becomes another path to decrement ixgbe_xdp_locking_key to 0 on systems
> >> with greater than 64 CPUs.
> >
> > … with more than 64 CPUs.
> >
> >> For this patch, I have changed static_branch_inc to static_branch_enable
> >> in ixgbe_setup_xdp.  We weren't counting references.  The
> >> ixgbe_xdp_locking_key only protects code in the XDP_TX path, which is
> >> not run when an XDP program is loaded.  The other condition for setting
> >> it on is the number of CPUs, which I assume is static.
> >>
> >> Fixes: 4fe815850bdc ("ixgbe: let the xdpdrv work with more than 64 cpus")
> >> Signed-off-by: John Hickey <jjh@daedalian.us>
> >> ---
> >> v1 -> v2:
> >>     Added Fixes and net tag.  No code changes.
> >> v2 -> v3:
> >>     Added splat.  Slight clarification as to why ixgbe_xdp_locking_key
> >>     is not turned off.  Based on feedback from Maciej Fijalkowski.
> >> ---
> >>   drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c  | 3 ---
> >>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 2 +-
> >>   2 files changed, 1 insertion(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> >> b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> >> index f8156fe4b1dc..0ee943db3dc9 100644
> >> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> >> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> >> @@ -1035,9 +1035,6 @@ static void ixgbe_free_q_vector(struct
> >> ixgbe_adapter *adapter, int v_idx)
> >>       adapter->q_vector[v_idx] = NULL;
> >>       __netif_napi_del(&q_vector->napi);
> >> -    if (static_key_enabled(&ixgbe_xdp_locking_key))
> >> -        static_branch_dec(&ixgbe_xdp_locking_key);
> >> -
> >>       /*
> >>        * after a call to __netif_napi_del() napi may still be used and
> >>        * ixgbe_get_stats64() might access the rings on this vector,
> >> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> >> b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> >> index ab8370c413f3..cd2fb72c67be 100644
> >> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> >> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> >> @@ -10283,7 +10283,7 @@ static int ixgbe_xdp_setup(struct net_device
> >> *dev, struct bpf_prog *prog)
> >>       if (nr_cpu_ids > IXGBE_MAX_XDP_QS * 2)
> >>           return -ENOMEM;
> >>       else if (nr_cpu_ids > IXGBE_MAX_XDP_QS)
> >> -        static_branch_inc(&ixgbe_xdp_locking_key);
> >> +        static_branch_enable(&ixgbe_xdp_locking_key);
> >>       old_prog = xchg(&adapter->xdp_prog, prog);
> >>       need_reset = (!!prog != !!old_prog);
> >
> >
> > Kind regards,
> >
> > Paul


* Re: [PATCH net v3] ixgbe: Panic during XDP_TX with > 64 CPUs
  2023-03-08 22:07 [PATCH net v3] ixgbe: Panic during XDP_TX with > 64 CPUs John Hickey
  2023-03-25 18:29 ` Rout, ChandanX
  2023-03-26 15:03 ` [Intel-wired-lan] " Paul Menzel
@ 2023-03-27  3:00 ` Jason Xing
  2023-04-05 14:40 ` Maciej Fijalkowski
  3 siblings, 0 replies; 10+ messages in thread
From: Jason Xing @ 2023-03-27  3:00 UTC (permalink / raw)
  To: John Hickey
  Cc: anthony.l.nguyen, Jesse Brandeburg, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	intel-wired-lan, netdev, linux-kernel, bpf

On Thu, Mar 9, 2023 at 6:22 AM John Hickey <jjh@daedalian.us> wrote:
>
> In commit 'ixgbe: let the xdpdrv work with more than 64 cpus'
> (4fe815850bdc), support was added to allow XDP programs to run on systems
> with more than 64 CPUs by locking the XDP TX rings and indexing them
> using cpu % 64 (IXGBE_MAX_XDP_QS).
>
> Upon trying this out patch via the Intel 5.18.6 out of tree driver
> on a system with more than 64 cores, the kernel paniced with an
> array-index-out-of-bounds at the return in ixgbe_determine_xdp_ring in
> ixgbe.h, which means ixgbe_determine_xdp_q_idx was just returning the
> cpu instead of cpu % IXGBE_MAX_XDP_QS.  An example splat:
>
>  ==========================================================================
>  UBSAN: array-index-out-of-bounds in
>  /var/lib/dkms/ixgbe/5.18.6+focal-1/build/src/ixgbe.h:1147:26
>  index 65 is out of range for type 'ixgbe_ring *[64]'
>  ==========================================================================
>  BUG: kernel NULL pointer dereference, address: 0000000000000058
>  #PF: supervisor read access in kernel mode
>  #PF: error_code(0x0000) - not-present page
>  PGD 0 P4D 0
>  Oops: 0000 [#1] SMP NOPTI
>  CPU: 65 PID: 408 Comm: ksoftirqd/65
>  Tainted: G          IOE     5.15.0-48-generic #54~20.04.1-Ubuntu
>  Hardware name: Dell Inc. PowerEdge R640/0W23H8, BIOS 2.5.4 01/13/2020
>  RIP: 0010:ixgbe_xmit_xdp_ring+0x1b/0x1c0 [ixgbe]
>  Code: 3b 52 d4 cf e9 42 f2 ff ff 66 0f 1f 44 00 00 0f 1f 44 00 00 55 b9
>  00 00 00 00 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 08 <44> 0f b7
>  47 58 0f b7 47 5a 0f b7 57 54 44 0f b7 76 08 66 41 39 c0
>  RSP: 0018:ffffbc3fcd88fcb0 EFLAGS: 00010282
>  RAX: ffff92a253260980 RBX: ffffbc3fe68b00a0 RCX: 0000000000000000
>  RDX: ffff928b5f659000 RSI: ffff928b5f659000 RDI: 0000000000000000
>  RBP: ffffbc3fcd88fce0 R08: ffff92b9dfc20580 R09: 0000000000000001
>  R10: 3d3d3d3d3d3d3d3d R11: 3d3d3d3d3d3d3d3d R12: 0000000000000000
>  R13: ffff928b2f0fa8c0 R14: ffff928b9be20050 R15: 000000000000003c
>  FS:  0000000000000000(0000) GS:ffff92b9dfc00000(0000)
>  knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 0000000000000058 CR3: 000000011dd6a002 CR4: 00000000007706e0
>  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>  PKRU: 55555554
>  Call Trace:
>   <TASK>
>   ixgbe_poll+0x103e/0x1280 [ixgbe]
>   ? sched_clock_cpu+0x12/0xe0
>   __napi_poll+0x30/0x160
>   net_rx_action+0x11c/0x270
>   __do_softirq+0xda/0x2ee
>   run_ksoftirqd+0x2f/0x50
>   smpboot_thread_fn+0xb7/0x150
>   ? sort_range+0x30/0x30
>   kthread+0x127/0x150
>   ? set_kthread_struct+0x50/0x50
>   ret_from_fork+0x1f/0x30
>   </TASK>
>
> I think this is how it happens:
>
> Upon loading the first XDP program on a system with more than 64 CPUs,
> ixgbe_xdp_locking_key is incremented in ixgbe_xdp_setup.  However,
> immediately after this, the rings are reconfigured by ixgbe_setup_tc.
> ixgbe_setup_tc calls ixgbe_clear_interrupt_scheme which calls
> ixgbe_free_q_vectors which calls ixgbe_free_q_vector in a loop.
> ixgbe_free_q_vector decrements ixgbe_xdp_locking_key once per call if
> it is non-zero.  Commenting out the decrement in ixgbe_free_q_vector
> stopped my system from panicing.
>
> I suspect to make the original patch work, I would need to load an XDP
> program and then replace it in order to get ixgbe_xdp_locking_key back
> above 0 since ixgbe_setup_tc is only called when transitioning between
> XDP and non-XDP ring configurations, while ixgbe_xdp_locking_key is
> incremented every time ixgbe_xdp_setup is called.
>
> Also, ixgbe_setup_tc can be called via ethtool --set-channels, so this
> becomes another path to decrement ixgbe_xdp_locking_key to 0 on systems
> with greater than 64 CPUs.
>
> For this patch, I have changed static_branch_inc to static_branch_enable
> in ixgbe_setup_xdp.  We weren't counting references.  The
> ixgbe_xdp_locking_key only protects code in the XDP_TX path, which is
> not run when an XDP program is loaded.  The other condition for setting
> it on is the number of CPUs, which I assume is static.
>
> Fixes: 4fe815850bdc ("ixgbe: let the xdpdrv work with more than 64 cpus")
> Signed-off-by: John Hickey <jjh@daedalian.us>
> ---
> v1 -> v2:
>         Added Fixes and net tag.  No code changes.
> v2 -> v3:
>         Added splat.  Slight clarification as to why ixgbe_xdp_locking_key
>         is not turned off.  Based on feedback from Maciej Fijalkowski.
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c  | 3 ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 2 +-
>  2 files changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> index f8156fe4b1dc..0ee943db3dc9 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> @@ -1035,9 +1035,6 @@ static void ixgbe_free_q_vector(struct ixgbe_adapter *adapter, int v_idx)
>         adapter->q_vector[v_idx] = NULL;
>         __netif_napi_del(&q_vector->napi);
>
> -       if (static_key_enabled(&ixgbe_xdp_locking_key))
> -               static_branch_dec(&ixgbe_xdp_locking_key);
> -

It seems you still didn't add the matching static_branch_disable() (the
enable/disable pair) to switch off this static key, as Maciej
Fijalkowski suggested on top of your 1st patch.

Since we switch this static key on at the very beginning, we also need
to switch it off in turn when its life cycle comes to an end.

Personally, I would recommend using _inc/_dec to control the whole
thing, even though it is relatively more complicated to keep the inc
and dec pairs balanced.

That said, enable/disable can work well if the maintainers don't have
objections to this.

Thanks,
Jason

>         /*
>          * after a call to __netif_napi_del() napi may still be used and
>          * ixgbe_get_stats64() might access the rings on this vector,
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index ab8370c413f3..cd2fb72c67be 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -10283,7 +10283,7 @@ static int ixgbe_xdp_setup(struct net_device *dev, struct bpf_prog *prog)
>         if (nr_cpu_ids > IXGBE_MAX_XDP_QS * 2)
>                 return -ENOMEM;
>         else if (nr_cpu_ids > IXGBE_MAX_XDP_QS)
> -               static_branch_inc(&ixgbe_xdp_locking_key);
> +               static_branch_enable(&ixgbe_xdp_locking_key);
>
>         old_prog = xchg(&adapter->xdp_prog, prog);
>         need_reset = (!!prog != !!old_prog);
> --
> 2.37.2
>


* Re: [PATCH net v3] ixgbe: Panic during XDP_TX with > 64 CPUs
  2023-03-08 22:07 [PATCH net v3] ixgbe: Panic during XDP_TX with > 64 CPUs John Hickey
                   ` (2 preceding siblings ...)
  2023-03-27  3:00 ` Jason Xing
@ 2023-04-05 14:40 ` Maciej Fijalkowski
  2023-04-13 23:03   ` [PATCH net v4] ixgbe: Fix panic " John Hickey
  3 siblings, 1 reply; 10+ messages in thread
From: Maciej Fijalkowski @ 2023-04-05 14:40 UTC (permalink / raw)
  To: John Hickey
  Cc: anthony.l.nguyen, Jesse Brandeburg, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Shujin Li, Jason Xing, intel-wired-lan, netdev, linux-kernel,
	bpf

On Wed, Mar 08, 2023 at 02:07:57PM -0800, John Hickey wrote:
> In commit 'ixgbe: let the xdpdrv work with more than 64 cpus'
> (4fe815850bdc), support was added to allow XDP programs to run on systems
> with more than 64 CPUs by locking the XDP TX rings and indexing them
> using cpu % 64 (IXGBE_MAX_XDP_QS).
> 
> Upon trying this out patch via the Intel 5.18.6 out of tree driver
> on a system with more than 64 cores, the kernel paniced with an
> array-index-out-of-bounds at the return in ixgbe_determine_xdp_ring in
> ixgbe.h, which means ixgbe_determine_xdp_q_idx was just returning the
> cpu instead of cpu % IXGBE_MAX_XDP_QS.  An example splat:

I am assuming that the in-tree driver is also affected, so the info about
the OOT driver is irrelevant here :)

> 
>  ==========================================================================
>  UBSAN: array-index-out-of-bounds in
>  /var/lib/dkms/ixgbe/5.18.6+focal-1/build/src/ixgbe.h:1147:26
>  index 65 is out of range for type 'ixgbe_ring *[64]'
>  ==========================================================================
>  BUG: kernel NULL pointer dereference, address: 0000000000000058
>  #PF: supervisor read access in kernel mode
>  #PF: error_code(0x0000) - not-present page
>  PGD 0 P4D 0
>  Oops: 0000 [#1] SMP NOPTI
>  CPU: 65 PID: 408 Comm: ksoftirqd/65
>  Tainted: G          IOE     5.15.0-48-generic #54~20.04.1-Ubuntu
>  Hardware name: Dell Inc. PowerEdge R640/0W23H8, BIOS 2.5.4 01/13/2020
>  RIP: 0010:ixgbe_xmit_xdp_ring+0x1b/0x1c0 [ixgbe]
>  Code: 3b 52 d4 cf e9 42 f2 ff ff 66 0f 1f 44 00 00 0f 1f 44 00 00 55 b9
>  00 00 00 00 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 08 <44> 0f b7
>  47 58 0f b7 47 5a 0f b7 57 54 44 0f b7 76 08 66 41 39 c0
>  RSP: 0018:ffffbc3fcd88fcb0 EFLAGS: 00010282
>  RAX: ffff92a253260980 RBX: ffffbc3fe68b00a0 RCX: 0000000000000000
>  RDX: ffff928b5f659000 RSI: ffff928b5f659000 RDI: 0000000000000000
>  RBP: ffffbc3fcd88fce0 R08: ffff92b9dfc20580 R09: 0000000000000001
>  R10: 3d3d3d3d3d3d3d3d R11: 3d3d3d3d3d3d3d3d R12: 0000000000000000
>  R13: ffff928b2f0fa8c0 R14: ffff928b9be20050 R15: 000000000000003c
>  FS:  0000000000000000(0000) GS:ffff92b9dfc00000(0000)
>  knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 0000000000000058 CR3: 000000011dd6a002 CR4: 00000000007706e0
>  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>  PKRU: 55555554
>  Call Trace:
>   <TASK>
>   ixgbe_poll+0x103e/0x1280 [ixgbe]
>   ? sched_clock_cpu+0x12/0xe0
>   __napi_poll+0x30/0x160
>   net_rx_action+0x11c/0x270
>   __do_softirq+0xda/0x2ee
>   run_ksoftirqd+0x2f/0x50
>   smpboot_thread_fn+0xb7/0x150
>   ? sort_range+0x30/0x30
>   kthread+0x127/0x150
>   ? set_kthread_struct+0x50/0x50
>   ret_from_fork+0x1f/0x30
>   </TASK>
> 
> I think this is how it happens:
> 
> Upon loading the first XDP program on a system with more than 64 CPUs,
> ixgbe_xdp_locking_key is incremented in ixgbe_xdp_setup.  However,
> immediately after this, the rings are reconfigured by ixgbe_setup_tc.
> ixgbe_setup_tc calls ixgbe_clear_interrupt_scheme which calls
> ixgbe_free_q_vectors which calls ixgbe_free_q_vector in a loop.
> ixgbe_free_q_vector decrements ixgbe_xdp_locking_key once per call if
> it is non-zero.  Commenting out the decrement in ixgbe_free_q_vector
> stopped my system from panicing.
> 
> I suspect to make the original patch work, I would need to load an XDP
> program and then replace it in order to get ixgbe_xdp_locking_key back
> above 0 since ixgbe_setup_tc is only called when transitioning between
> XDP and non-XDP ring configurations, while ixgbe_xdp_locking_key is
> incremented every time ixgbe_xdp_setup is called.
> 
> Also, ixgbe_setup_tc can be called via ethtool --set-channels, so this
> becomes another path to decrement ixgbe_xdp_locking_key to 0 on systems
> with greater than 64 CPUs.
> 
> For this patch, I have changed static_branch_inc to static_branch_enable
> in ixgbe_setup_xdp.  We weren't counting references.  The
> ixgbe_xdp_locking_key only protects code in the XDP_TX path, which is
> not run when an XDP program is loaded.  The other condition for setting
> it on is the number of CPUs, which I assume is static.

From a technical POV I think the contents here are fine. Maybe we could even
move this check to the probe path because, as you said, this is static. I saw
that others had some wording/style comments, so if you decide to fix them
and push a v4, then you can add:

Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>

> 
> Fixes: 4fe815850bdc ("ixgbe: let the xdpdrv work with more than 64 cpus")
> Signed-off-by: John Hickey <jjh@daedalian.us>
> ---
> v1 -> v2:
> 	Added Fixes and net tag.  No code changes.
> v2 -> v3:
> 	Added splat.  Slight clarification as to why ixgbe_xdp_locking_key
> 	is not turned off.  Based on feedback from Maciej Fijalkowski.
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c  | 3 ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 2 +-
>  2 files changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> index f8156fe4b1dc..0ee943db3dc9 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
> @@ -1035,9 +1035,6 @@ static void ixgbe_free_q_vector(struct ixgbe_adapter *adapter, int v_idx)
>  	adapter->q_vector[v_idx] = NULL;
>  	__netif_napi_del(&q_vector->napi);
>  
> -	if (static_key_enabled(&ixgbe_xdp_locking_key))
> -		static_branch_dec(&ixgbe_xdp_locking_key);
> -
>  	/*
>  	 * after a call to __netif_napi_del() napi may still be used and
>  	 * ixgbe_get_stats64() might access the rings on this vector,
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index ab8370c413f3..cd2fb72c67be 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -10283,7 +10283,7 @@ static int ixgbe_xdp_setup(struct net_device *dev, struct bpf_prog *prog)
>  	if (nr_cpu_ids > IXGBE_MAX_XDP_QS * 2)
>  		return -ENOMEM;
>  	else if (nr_cpu_ids > IXGBE_MAX_XDP_QS)
> -		static_branch_inc(&ixgbe_xdp_locking_key);
> +		static_branch_enable(&ixgbe_xdp_locking_key);
>  
>  	old_prog = xchg(&adapter->xdp_prog, prog);
>  	need_reset = (!!prog != !!old_prog);
> -- 
> 2.37.2
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH net v4] ixgbe: Fix panic during XDP_TX with > 64 CPUs
  2023-04-05 14:40 ` Maciej Fijalkowski
@ 2023-04-13 23:03   ` John Hickey
  2023-04-24  4:08     ` [Intel-wired-lan] " Rout, ChandanX
  0 siblings, 1 reply; 10+ messages in thread
From: John Hickey @ 2023-04-13 23:03 UTC (permalink / raw)
  To: maciej.fijalkowski
  Cc: John Hickey, Jesse Brandeburg, Tony Nguyen, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Shujin Li, Jason Xing, intel-wired-lan, netdev, linux-kernel,
	bpf

Commit 4fe815850bdc ("ixgbe: let the xdpdrv work with more than 64 cpus")
adds support to allow XDP programs to run on systems with more than
64 CPUs by locking the XDP TX rings and indexing them using cpu % 64
(IXGBE_MAX_XDP_QS).
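
For reference, the indexing described above amounts to roughly the
following (paraphrasing ixgbe.h; a sketch, not the exact driver source):

	static inline int ixgbe_determine_xdp_q_idx(int cpu)
	{
		/* with more CPUs than XDP TX rings, fold the CPU id into
		 * the 64-entry xdp_ring array; the transmit path then
		 * takes a per-ring lock since rings can be shared
		 */
		if (static_key_enabled(&ixgbe_xdp_locking_key))
			return cpu % IXGBE_MAX_XDP_QS;
		else
			return cpu;
	}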

Upon trying out this patch on a system with more than 64 cores,
the kernel panicked with an array-index-out-of-bounds at the return in
ixgbe_determine_xdp_ring in ixgbe.h, which means ixgbe_determine_xdp_q_idx
was just returning the cpu instead of cpu % IXGBE_MAX_XDP_QS.  An example
splat:

 ==========================================================================
 UBSAN: array-index-out-of-bounds in
 /var/lib/dkms/ixgbe/5.18.6+focal-1/build/src/ixgbe.h:1147:26
 index 65 is out of range for type 'ixgbe_ring *[64]'
 ==========================================================================
 BUG: kernel NULL pointer dereference, address: 0000000000000058
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] SMP NOPTI
 CPU: 65 PID: 408 Comm: ksoftirqd/65
 Tainted: G          IOE     5.15.0-48-generic #54~20.04.1-Ubuntu
 Hardware name: Dell Inc. PowerEdge R640/0W23H8, BIOS 2.5.4 01/13/2020
 RIP: 0010:ixgbe_xmit_xdp_ring+0x1b/0x1c0 [ixgbe]
 Code: 3b 52 d4 cf e9 42 f2 ff ff 66 0f 1f 44 00 00 0f 1f 44 00 00 55 b9
 00 00 00 00 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 08 <44> 0f b7
 47 58 0f b7 47 5a 0f b7 57 54 44 0f b7 76 08 66 41 39 c0
 RSP: 0018:ffffbc3fcd88fcb0 EFLAGS: 00010282
 RAX: ffff92a253260980 RBX: ffffbc3fe68b00a0 RCX: 0000000000000000
 RDX: ffff928b5f659000 RSI: ffff928b5f659000 RDI: 0000000000000000
 RBP: ffffbc3fcd88fce0 R08: ffff92b9dfc20580 R09: 0000000000000001
 R10: 3d3d3d3d3d3d3d3d R11: 3d3d3d3d3d3d3d3d R12: 0000000000000000
 R13: ffff928b2f0fa8c0 R14: ffff928b9be20050 R15: 000000000000003c
 FS:  0000000000000000(0000) GS:ffff92b9dfc00000(0000)
 knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000058 CR3: 000000011dd6a002 CR4: 00000000007706e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 PKRU: 55555554
 Call Trace:
  <TASK>
  ixgbe_poll+0x103e/0x1280 [ixgbe]
  ? sched_clock_cpu+0x12/0xe0
  __napi_poll+0x30/0x160
  net_rx_action+0x11c/0x270
  __do_softirq+0xda/0x2ee
  run_ksoftirqd+0x2f/0x50
  smpboot_thread_fn+0xb7/0x150
  ? sort_range+0x30/0x30
  kthread+0x127/0x150
  ? set_kthread_struct+0x50/0x50
  ret_from_fork+0x1f/0x30
  </TASK>

I think this is how it happens:

Upon loading the first XDP program on a system with more than 64 CPUs,
ixgbe_xdp_locking_key is incremented in ixgbe_xdp_setup.  However,
immediately after this, the rings are reconfigured by ixgbe_setup_tc.
ixgbe_setup_tc calls ixgbe_clear_interrupt_scheme which calls
ixgbe_free_q_vectors which calls ixgbe_free_q_vector in a loop.
ixgbe_free_q_vector decrements ixgbe_xdp_locking_key once per call if
it is non-zero.  Commenting out the decrement in ixgbe_free_q_vector
stopped my system from panicking.
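
To make the imbalance concrete, here is a tiny standalone model of the
counting (plain C, not driver code; the names only mirror the driver
functions involved):

	#include <stdio.h>

	#define IXGBE_MAX_XDP_QS 64

	static int locking_key;	/* stands in for ixgbe_xdp_locking_key */

	static void xdp_setup(int nr_cpu_ids)	/* ixgbe_xdp_setup */
	{
		if (nr_cpu_ids > IXGBE_MAX_XDP_QS)
			locking_key++;		/* static_branch_inc() */
	}

	static void free_q_vector(void)		/* ixgbe_free_q_vector */
	{
		if (locking_key)
			locking_key--;		/* static_branch_dec() */
	}

	static int xdp_q_idx(int cpu)	/* ixgbe_determine_xdp_q_idx */
	{
		return locking_key ? cpu % IXGBE_MAX_XDP_QS : cpu;
	}

	int main(void)
	{
		xdp_setup(72);			/* first program, 72 CPUs: key -> 1 */
		for (int i = 0; i < 64; i++)	/* ixgbe_setup_tc frees the q_vectors */
			free_q_vector();	/* key drops back to 0 */
		/* CPU 65 now indexes a 64-entry ring array out of bounds */
		printf("cpu 65 -> xdp_ring[%d]\n", xdp_q_idx(65));
		return 0;
	}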

I suspect that, to make the original patch work, I would need to load an XDP
program and then replace it in order to get ixgbe_xdp_locking_key back
above 0 since ixgbe_setup_tc is only called when transitioning between
XDP and non-XDP ring configurations, while ixgbe_xdp_locking_key is
incremented every time ixgbe_xdp_setup is called.

Also, ixgbe_setup_tc can be called via ethtool --set-channels, so this
becomes another path to decrement ixgbe_xdp_locking_key to 0 on systems
with more than 64 CPUs.
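
For example, a plain channel change (the interface name here is just a
placeholder, and any channel count will do) takes that path and clears the
key the same way:

	ethtool --set-channels eth0 combined 32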

Since ixgbe_xdp_locking_key only protects the XDP_TX path and is tied
to the number of CPUs present, there is no reason to disable it upon
unloading an XDP program.  To avoid confusion, I have moved enabling
ixgbe_xdp_locking_key into ixgbe_sw_init, which is part of the probe path.
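
For context, the transmit-side consumers of the key are roughly of this
shape (a sketch of the locked XDP_TX path, not the literal driver code;
the lock field name is illustrative):

	if (static_branch_unlikely(&ixgbe_xdp_locking_key))
		spin_lock(&ring->tx_lock);	/* ring may be shared by several CPUs */
	/* ... post the frame to the XDP TX ring ... */
	if (static_branch_unlikely(&ixgbe_xdp_locking_key))
		spin_unlock(&ring->tx_lock);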

Fixes: 4fe815850bdc ("ixgbe: let the xdpdrv work with more than 64 cpus")
Signed-off-by: John Hickey <jjh@daedalian.us>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
v1 -> v2:
	Added Fixes and net tag.  No code changes.
v2 -> v3:
	Added splat.  Slight clarification as to why ixgbe_xdp_locking_key
	is not turned off.  Based on feedback from Maciej Fijalkowski.
v3 -> v4:
	Moved setting ixgbe_xdp_locking_key into the probe path.
	Commit message cleanup.
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c  | 3 ---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 6 ++++--
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
index f8156fe4b1dc..0ee943db3dc9 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
@@ -1035,9 +1035,6 @@ static void ixgbe_free_q_vector(struct ixgbe_adapter *adapter, int v_idx)
 	adapter->q_vector[v_idx] = NULL;
 	__netif_napi_del(&q_vector->napi);
 
-	if (static_key_enabled(&ixgbe_xdp_locking_key))
-		static_branch_dec(&ixgbe_xdp_locking_key);
-
 	/*
 	 * after a call to __netif_napi_del() napi may still be used and
 	 * ixgbe_get_stats64() might access the rings on this vector,
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 773c35fecace..d7c247e46dfc 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -6495,6 +6495,10 @@ static int ixgbe_sw_init(struct ixgbe_adapter *adapter,
 	set_bit(0, adapter->fwd_bitmask);
 	set_bit(__IXGBE_DOWN, &adapter->state);
 
+	/* enable locking for XDP_TX if we have more CPUs than queues */
+	if (nr_cpu_ids > IXGBE_MAX_XDP_QS)
+		static_branch_enable(&ixgbe_xdp_locking_key);
+
 	return 0;
 }
 
@@ -10290,8 +10294,6 @@ static int ixgbe_xdp_setup(struct net_device *dev, struct bpf_prog *prog)
 	 */
 	if (nr_cpu_ids > IXGBE_MAX_XDP_QS * 2)
 		return -ENOMEM;
-	else if (nr_cpu_ids > IXGBE_MAX_XDP_QS)
-		static_branch_inc(&ixgbe_xdp_locking_key);
 
 	old_prog = xchg(&adapter->xdp_prog, prog);
 	need_reset = (!!prog != !!old_prog);
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* RE: [Intel-wired-lan] [PATCH net v4] ixgbe: Fix panic during XDP_TX with > 64 CPUs
  2023-04-13 23:03   ` [PATCH net v4] ixgbe: Fix panic " John Hickey
@ 2023-04-24  4:08     ` Rout, ChandanX
  0 siblings, 0 replies; 10+ messages in thread
From: Rout, ChandanX @ 2023-04-24  4:08 UTC (permalink / raw)
  To: John Hickey, Fijalkowski, Maciej, intel-wired-lan
  Cc: Shujin Li, Alexei Starovoitov, Jesper Dangaard Brouer,
	Daniel Borkmann, John Fastabend, Brandeburg, Jesse, Eric Dumazet,
	Nguyen, Anthony L, netdev, Jakub Kicinski, bpf, Paolo Abeni,
	David S. Miller, linux-kernel, Xing, Wanli



>-----Original Message-----
>From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
>John Hickey
>Sent: 14 April 2023 04:33
>To: Fijalkowski, Maciej <maciej.fijalkowski@intel.com>
>Cc: Shujin Li <lishujin@kuaishou.com>; Alexei Starovoitov <ast@kernel.org>;
>Jesper Dangaard Brouer <hawk@kernel.org>; Daniel Borkmann
><daniel@iogearbox.net>; intel-wired-lan@lists.osuosl.org; John Fastabend
><john.fastabend@gmail.com>; Brandeburg, Jesse
><jesse.brandeburg@intel.com>; John Hickey <jjh@daedalian.us>; Eric
>Dumazet <edumazet@google.com>; Nguyen, Anthony L
><anthony.l.nguyen@intel.com>; netdev@vger.kernel.org; Jakub Kicinski
><kuba@kernel.org>; bpf@vger.kernel.org; Paolo Abeni
><pabeni@redhat.com>; David S. Miller <davem@davemloft.net>; linux-
>kernel@vger.kernel.org; Xing, Wanli <xingwanli@kuaishou.com>
>Subject: [Intel-wired-lan] [PATCH net v4] ixgbe: Fix panic during XDP_TX with > 64 CPUs
>
>Commit 4fe815850bdc ("ixgbe: let the xdpdrv work with more than 64 cpus")
>adds support to allow XDP programs to run on systems with more than
>64 CPUs by locking the XDP TX rings and indexing them using cpu % 64
>(IXGBE_MAX_XDP_QS).
>
>Upon trying out this patch on a system with more than 64 cores, the kernel
>panicked with an array-index-out-of-bounds at the return in
>ixgbe_determine_xdp_ring in ixgbe.h, which means
>ixgbe_determine_xdp_q_idx was just returning the cpu instead of cpu %
>IXGBE_MAX_XDP_QS.  An example
>splat:
>
>
>==========================================================================
> UBSAN: array-index-out-of-bounds in
> /var/lib/dkms/ixgbe/5.18.6+focal-1/build/src/ixgbe.h:1147:26
> index 65 is out of range for type 'ixgbe_ring *[64]'
>
>==========================================================================
> BUG: kernel NULL pointer dereference, address: 0000000000000058
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0
> Oops: 0000 [#1] SMP NOPTI
> CPU: 65 PID: 408 Comm: ksoftirqd/65
> Tainted: G          IOE     5.15.0-48-generic #54~20.04.1-Ubuntu
> Hardware name: Dell Inc. PowerEdge R640/0W23H8, BIOS 2.5.4 01/13/2020
> RIP: 0010:ixgbe_xmit_xdp_ring+0x1b/0x1c0 [ixgbe]
> Code: 3b 52 d4 cf e9 42 f2 ff ff 66 0f 1f 44 00 00 0f 1f 44 00 00 55 b9
> 00 00 00 00 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 08 <44> 0f b7
> 47 58 0f b7 47 5a 0f b7 57 54 44 0f b7 76 08 66 41 39 c0
> RSP: 0018:ffffbc3fcd88fcb0 EFLAGS: 00010282
> RAX: ffff92a253260980 RBX: ffffbc3fe68b00a0 RCX: 0000000000000000
> RDX: ffff928b5f659000 RSI: ffff928b5f659000 RDI: 0000000000000000
> RBP: ffffbc3fcd88fce0 R08: ffff92b9dfc20580 R09: 0000000000000001
> R10: 3d3d3d3d3d3d3d3d R11: 3d3d3d3d3d3d3d3d R12: 0000000000000000
> R13: ffff928b2f0fa8c0 R14: ffff928b9be20050 R15: 000000000000003c
> FS:  0000000000000000(0000) GS:ffff92b9dfc00000(0000)
> knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000058 CR3: 000000011dd6a002 CR4: 00000000007706e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> PKRU: 55555554
> Call Trace:
>  <TASK>
>  ixgbe_poll+0x103e/0x1280 [ixgbe]
>  ? sched_clock_cpu+0x12/0xe0
>  __napi_poll+0x30/0x160
>  net_rx_action+0x11c/0x270
>  __do_softirq+0xda/0x2ee
>  run_ksoftirqd+0x2f/0x50
>  smpboot_thread_fn+0xb7/0x150
>  ? sort_range+0x30/0x30
>  kthread+0x127/0x150
>  ? set_kthread_struct+0x50/0x50
>  ret_from_fork+0x1f/0x30
>  </TASK>
>
>I think this is how it happens:
>
>Upon loading the first XDP program on a system with more than 64 CPUs,
>ixgbe_xdp_locking_key is incremented in ixgbe_xdp_setup.  However,
>immediately after this, the rings are reconfigured by ixgbe_setup_tc.
>ixgbe_setup_tc calls ixgbe_clear_interrupt_scheme which calls
>ixgbe_free_q_vectors which calls ixgbe_free_q_vector in a loop.
>ixgbe_free_q_vector decrements ixgbe_xdp_locking_key once per call if it is
>non-zero.  Commenting out the decrement in ixgbe_free_q_vector stopped
>my system from panicking.
>
>I suspect that, to make the original patch work, I would need to load an XDP
>program and then replace it in order to get ixgbe_xdp_locking_key back
>above 0 since ixgbe_setup_tc is only called when transitioning between XDP
>and non-XDP ring configurations, while ixgbe_xdp_locking_key is
>incremented every time ixgbe_xdp_setup is called.
>
>Also, ixgbe_setup_tc can be called via ethtool --set-channels, so this becomes
>another path to decrement ixgbe_xdp_locking_key to 0 on systems with
>more than 64 CPUs.
>
>Since ixgbe_xdp_locking_key only protects the XDP_TX path and is tied to the
>number of CPUs present, there is no reason to disable it upon unloading an
>XDP program.  To avoid confusion, I have moved enabling
>ixgbe_xdp_locking_key into ixgbe_sw_init, which is part of the probe path.
>
>Fixes: 4fe815850bdc ("ixgbe: let the xdpdrv work with more than 64 cpus")
>Signed-off-by: John Hickey <jjh@daedalian.us>
>Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>---
>v1 -> v2:
>	Added Fixes and net tag.  No code changes.
>v2 -> v3:
>	Added splat.  Slight clarification as to why ixgbe_xdp_locking_key
>	is not turned off.  Based on feedback from Maciej Fijalkowski.
>v3 -> v4:
>	Moved setting ixgbe_xdp_locking_key into the probe path.
>	Commit message cleanup.
>---
> drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c  | 3 ---
>drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 6 ++++--
> 2 files changed, 4 insertions(+), 5 deletions(-)
>

Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-04-24  4:08 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-08 22:07 [PATCH net v3] ixgbe: Panic during XDP_TX with > 64 CPUs John Hickey
2023-03-25 18:29 ` Rout, ChandanX
2023-03-26 15:03 ` [Intel-wired-lan] " Paul Menzel
2023-03-26 15:10   ` Paul Menzel
2023-03-26 15:13     ` Paul Menzel
2023-03-27  2:45     ` Jason Xing
2023-03-27  3:00 ` Jason Xing
2023-04-05 14:40 ` Maciej Fijalkowski
2023-04-13 23:03   ` [PATCH net v4] ixgbe: Fix panic " John Hickey
2023-04-24  4:08     ` [Intel-wired-lan] " Rout, ChandanX
