From: "Rout, ChandanX" <chandanx.rout@intel.com>
To: "Fijalkowski, Maciej" <maciej.fijalkowski@intel.com>,
	"intel-wired-lan@lists.osuosl.org"
	<intel-wired-lan@lists.osuosl.org>
Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	"Nguyen, Anthony L" <anthony.l.nguyen@intel.com>,
	"Karlsson, Magnus" <magnus.karlsson@intel.com>,
	"Kuruvinakunnel, George" <george.kuruvinakunnel@intel.com>,
	"Nagaraju, Shwetha" <shwetha.nagaraju@intel.com>,
	"Nagraj, Shravan" <shravan.nagraj@intel.com>,
	"Sanigani, SarithaX" <sarithax.sanigani@intel.com>
Subject: RE: [Intel-wired-lan] [PATCH intel-net] ice: xsk: disable txq irq before flushing hw
Date: Mon, 13 Mar 2023 03:27:13 +0000
Message-ID: <MN2PR11MB4045E5FC83D6D3EFC35A0AD6EAB99@MN2PR11MB4045.namprd11.prod.outlook.com>
In-Reply-To: <20230216122839.6878-1-maciej.fijalkowski@intel.com>



>-----Original Message-----
>From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
>Fijalkowski, Maciej
>Sent: 16 February 2023 17:59
>To: intel-wired-lan@lists.osuosl.org
>Cc: netdev@vger.kernel.org; bpf@vger.kernel.org; Nguyen, Anthony L
><anthony.l.nguyen@intel.com>; Karlsson, Magnus
><magnus.karlsson@intel.com>
>Subject: [Intel-wired-lan] [PATCH intel-net] ice: xsk: disable txq irq before
>flushing hw
>
>ice_qp_dis() intends to stop a given queue pair that is a target of xsk pool
>attach/detach. One of the steps is to disable interrupts on these queues. This
>is currently broken: the txq irq is turned off *after* the HW flush, so
>disabling it takes no effect.
>
>ice_qp_dis():
>-> ice_qvec_dis_irq()
>--> disable rxq irq
>--> flush hw
>-> ice_vsi_stop_tx_ring()
>--> disable txq irq
>
>The splat below can be triggered by the following steps:
>- start xdpsock WITHOUT loading xdp prog
>- run xdp_rxq_info with XDP_TX action on this interface
>- start traffic
>- terminate xdpsock
>
>[  256.312485] BUG: kernel NULL pointer dereference, address: 0000000000000018
>[  256.319560] #PF: supervisor read access in kernel mode
>[  256.324775] #PF: error_code(0x0000) - not-present page
>[  256.329994] PGD 0 P4D 0
>[  256.332574] Oops: 0000 [#1] PREEMPT SMP NOPTI
>[  256.337006] CPU: 3 PID: 32 Comm: ksoftirqd/3 Tainted: G           OE      6.2.0-rc5+ #51
>[  256.345218] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
>[  256.355807] RIP: 0010:ice_clean_rx_irq_zc+0x9c/0x7d0 [ice]
>[  256.361423] Code: b7 8f 8a 00 00 00 66 39 ca 0f 84 f1 04 00 00 49 8b 47 40 4c 8b 24 d0 41 0f b7 45 04 66 25 ff 3f 66 89 04 24 0f 84 85 02 00 00 <49> 8b 44 24 18 0f b7 14 24 48 05 00 01 00 00 49 89 04 24 49 89 44
>[  256.380463] RSP: 0018:ffffc900088bfd20 EFLAGS: 00010206
>[  256.385765] RAX: 000000000000003c RBX: 0000000000000035 RCX: 000000000000067f
>[  256.393012] RDX: 0000000000000775 RSI: 0000000000000000 RDI: ffff8881deb3ac80
>[  256.400256] RBP: 000000000000003c R08: ffff889847982710 R09: 0000000000010000
>[  256.407500] R10: ffffffff82c060c0 R11: 0000000000000004 R12: 0000000000000000
>[  256.414746] R13: ffff88811165eea0 R14: ffffc9000d255000 R15: ffff888119b37600
>[  256.421990] FS:  0000000000000000(0000) GS:ffff8897e0cc0000(0000) knlGS:0000000000000000
>[  256.430207] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>[  256.436036] CR2: 0000000000000018 CR3: 0000000005c0a006 CR4: 00000000007706e0
>[  256.443283] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>[  256.450527] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>[  256.457770] PKRU: 55555554
>[  256.460529] Call Trace:
>[  256.463015]  <TASK>
>[  256.465157]  ? ice_xmit_zc+0x6e/0x150 [ice]
>[  256.469437]  ice_napi_poll+0x46d/0x680 [ice]
>[  256.473815]  ? _raw_spin_unlock_irqrestore+0x1b/0x40
>[  256.478863]  __napi_poll+0x29/0x160
>[  256.482409]  net_rx_action+0x136/0x260
>[  256.486222]  __do_softirq+0xe8/0x2e5
>[  256.489853]  ? smpboot_thread_fn+0x2c/0x270
>[  256.494108]  run_ksoftirqd+0x2a/0x50
>[  256.497747]  smpboot_thread_fn+0x1c1/0x270
>[  256.501907]  ? __pfx_smpboot_thread_fn+0x10/0x10
>[  256.506594]  kthread+0xea/0x120
>[  256.509785]  ? __pfx_kthread+0x10/0x10
>[  256.513597]  ret_from_fork+0x29/0x50
>[  256.517238]  </TASK>
>
>In fact, irqs were not disabled and napi managed to be scheduled and run
>while xsk_pool pointer was still valid, but SW ring of xdp_buff pointers was
>already freed.
>
>To fix this, call ice_qvec_dis_irq() after ice_vsi_stop_tx_ring(). Also while at it,
>remove redundant ice_clean_rx_ring() call - this is handled in
>ice_qp_clean_rings().
>
>Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
>Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>---
> drivers/net/ethernet/intel/ice/ice_xsk.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>

Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)

