All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Ivan Vecera <ivecera@redhat.com>,
	Helena Anna Dubel <helena.anna.dubel@intel.com>,
	Tony Nguyen <anthony.l.nguyen@intel.com>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.10 58/79] i40e: Fix kernel crash during module removal
Date: Tue, 13 Sep 2022 16:05:03 +0200	[thread overview]
Message-ID: <20220913140353.018779173@linuxfoundation.org> (raw)
In-Reply-To: <20220913140350.291927556@linuxfoundation.org>

From: Ivan Vecera <ivecera@redhat.com>

[ Upstream commit fb8396aeda5872369a8ed6d2301e2c86e303c520 ]

The driver incorrectly frees client instance and subsequent
i40e module removal leads to kernel crash.

Reproducer:
1. Do ethtool offline test followed immediately by another one
host# ethtool -t eth0 offline; ethtool -t eth0 offline
2. Remove recursively irdma module that also removes i40e module
host# modprobe -r irdma

Result:
[ 8675.035651] i40e 0000:3d:00.0 eno1: offline testing starting
[ 8675.193774] i40e 0000:3d:00.0 eno1: testing finished
[ 8675.201316] i40e 0000:3d:00.0 eno1: offline testing starting
[ 8675.358921] i40e 0000:3d:00.0 eno1: testing finished
[ 8675.496921] i40e 0000:3d:00.0: IRDMA hardware initialization FAILED init_state=2 status=-110
[ 8686.188955] i40e 0000:3d:00.1: i40e_ptp_stop: removed PHC on eno2
[ 8686.943890] i40e 0000:3d:00.1: Deleted LAN device PF1 bus=0x3d dev=0x00 func=0x01
[ 8686.952669] i40e 0000:3d:00.0: i40e_ptp_stop: removed PHC on eno1
[ 8687.761787] BUG: kernel NULL pointer dereference, address: 0000000000000030
[ 8687.768755] #PF: supervisor read access in kernel mode
[ 8687.773895] #PF: error_code(0x0000) - not-present page
[ 8687.779034] PGD 0 P4D 0
[ 8687.781575] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 8687.785935] CPU: 51 PID: 172891 Comm: rmmod Kdump: loaded Tainted: G        W I        5.19.0+ #2
[ 8687.794800] Hardware name: Intel Corporation S2600WFD/S2600WFD, BIOS SE5C620.86B.0X.02.0001.051420190324 05/14/2019
[ 8687.805222] RIP: 0010:i40e_lan_del_device+0x13/0xb0 [i40e]
[ 8687.810719] Code: d4 84 c0 0f 84 b8 25 01 00 e9 9c 25 01 00 41 bc f4 ff ff ff eb 91 90 0f 1f 44 00 00 41 54 55 53 48 8b 87 58 08 00 00 48 89 fb <48> 8b 68 30 48 89 ef e8 21 8a 0f d5 48 89 ef e8 a9 78 0f d5 48 8b
[ 8687.829462] RSP: 0018:ffffa604072efce0 EFLAGS: 00010202
[ 8687.834689] RAX: 0000000000000000 RBX: ffff8f43833b2000 RCX: 0000000000000000
[ 8687.841821] RDX: 0000000000000000 RSI: ffff8f4b0545b298 RDI: ffff8f43833b2000
[ 8687.848955] RBP: ffff8f43833b2000 R08: 0000000000000001 R09: 0000000000000000
[ 8687.856086] R10: 0000000000000000 R11: 000ffffffffff000 R12: ffff8f43833b2ef0
[ 8687.863218] R13: ffff8f43833b2ef0 R14: ffff915103966000 R15: ffff8f43833b2008
[ 8687.870342] FS:  00007f79501c3740(0000) GS:ffff8f4adffc0000(0000) knlGS:0000000000000000
[ 8687.878427] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8687.884174] CR2: 0000000000000030 CR3: 000000014276e004 CR4: 00000000007706e0
[ 8687.891306] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 8687.898441] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 8687.905572] PKRU: 55555554
[ 8687.908286] Call Trace:
[ 8687.910737]  <TASK>
[ 8687.912843]  i40e_remove+0x2c0/0x330 [i40e]
[ 8687.917040]  pci_device_remove+0x33/0xa0
[ 8687.920962]  device_release_driver_internal+0x1aa/0x230
[ 8687.926188]  driver_detach+0x44/0x90
[ 8687.929770]  bus_remove_driver+0x55/0xe0
[ 8687.933693]  pci_unregister_driver+0x2a/0xb0
[ 8687.937967]  i40e_exit_module+0xc/0xf48 [i40e]

Two offline tests cause IRDMA driver failure (ETIMEDOUT) and this
failure is indicated back to i40e_client_subtask() that calls
i40e_client_del_instance() to free client instance referenced
by pf->cinst and sets this pointer to NULL. During the module
removal i40e_remove() calls i40e_lan_del_device() that dereferences
pf->cinst that is NULL -> crash.
Do not remove client instance when client open callbacks fails and
just clear __I40E_CLIENT_INSTANCE_OPENED bit. The driver also needs
to take care about this situation (when netdev is up and client
is NOT opened) in i40e_notify_client_of_netdev_close() and
calls client close callback only when __I40E_CLIENT_INSTANCE_OPENED
is set.

Fixes: 0ef2d5afb12d ("i40e: KISS the client interface")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Tested-by: Helena Anna Dubel <helena.anna.dubel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/intel/i40e/i40e_client.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_client.c b/drivers/net/ethernet/intel/i40e/i40e_client.c
index 32f3facbed1a5..b3cb5d1033260 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_client.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_client.c
@@ -178,6 +178,10 @@ void i40e_notify_client_of_netdev_close(struct i40e_vsi *vsi, bool reset)
 			"Cannot locate client instance close routine\n");
 		return;
 	}
+	if (!test_bit(__I40E_CLIENT_INSTANCE_OPENED, &cdev->state)) {
+		dev_dbg(&pf->pdev->dev, "Client is not open, abort close\n");
+		return;
+	}
 	cdev->client->ops->close(&cdev->lan_info, cdev->client, reset);
 	clear_bit(__I40E_CLIENT_INSTANCE_OPENED, &cdev->state);
 	i40e_client_release_qvlist(&cdev->lan_info);
@@ -374,7 +378,6 @@ void i40e_client_subtask(struct i40e_pf *pf)
 				/* Remove failed client instance */
 				clear_bit(__I40E_CLIENT_INSTANCE_OPENED,
 					  &cdev->state);
-				i40e_client_del_instance(pf);
 				return;
 			}
 		}
-- 
2.35.1




  parent reply	other threads:[~2022-09-13 14:52 UTC|newest]

Thread overview: 90+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-13 14:04 [PATCH 5.10 00/79] 5.10.143-rc1 review Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 01/79] NFSD: Fix verifier returned in stable WRITEs Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 02/79] xen-blkfront: Cache feature_persistent value before advertisement Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 03/79] tty: n_gsm: initialize more members at gsm_alloc_mux() Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 04/79] tty: n_gsm: avoid call of sleeping functions from atomic context Greg Kroah-Hartman
2022-09-14 12:38   ` Pavel Machek
2022-09-13 14:04 ` [PATCH 5.10 05/79] efi: libstub: Disable struct randomization Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 06/79] efi: capsule-loader: Fix use-after-free in efi_capsule_write Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 07/79] wifi: iwlegacy: 4965: corrected fix for potential off-by-one overflow in il4965_rs_fill_link_cmd() Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 08/79] net: mvpp2: debugfs: fix memory leak when using debugfs_lookup() Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 09/79] fs: only do a memory barrier for the first set_buffer_uptodate() Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 10/79] Revert "mm: kmemleak: take a full lowmem check in kmemleak_*_phys()" Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 11/79] scsi: qla2xxx: Disable ATIO interrupt coalesce for quad port ISP27XX Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 12/79] scsi: megaraid_sas: Fix double kfree() Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 13/79] drm/gem: Fix GEM handle release errors Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 14/79] drm/amdgpu: Move psp_xgmi_terminate call from amdgpu_xgmi_remove_device to psp_hw_fini Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 15/79] drm/amdgpu: Check num_gfx_rings for gfx v9_0 rb setup Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 16/79] drm/radeon: add a force flush to delay work when radeon Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 17/79] parisc: ccio-dma: Handle kmalloc failure in ccio_init_resources() Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 18/79] parisc: Add runtime check to prevent PA2.0 kernels on PA1.x machines Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 19/79] arm64: cacheinfo: Fix incorrect assignment of signed error value to unsigned fw_level Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 20/79] arm64/signal: Raise limit on stack frames Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 21/79] net/core/skbuff: Check the return value of skb_copy_bits() Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 22/79] fbdev: chipsfb: Add missing pci_disable_device() in chipsfb_pci_init() Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 23/79] drm/amdgpu: mmVM_L2_CNTL3 register not initialized correctly Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 24/79] ALSA: emu10k1: Fix out of bounds access in snd_emu10k1_pcm_channel_alloc() Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 25/79] ALSA: aloop: Fix random zeros in capture data when using jiffies timer Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 26/79] ALSA: usb-audio: Fix an out-of-bounds bug in __snd_usb_parse_audio_interface() Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 27/79] kprobes: Prohibit probes in gate area Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 28/79] debugfs: add debugfs_lookup_and_remove() Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 29/79] nvmet: fix a use-after-free Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 30/79] drm/i915: Implement WaEdpLinkRateDataReload Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 31/79] scsi: mpt3sas: Fix use-after-free warning Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 32/79] scsi: lpfc: Add missing destroy_workqueue() in error path Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 33/79] cgroup: Elide write-locking threadgroup_rwsem when updating csses on an empty subtree Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 34/79] cgroup: Fix threadgroup_rwsem <-> cpus_read_lock() deadlock Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 35/79] cifs: remove useless parameter is_fsctl from SMB2_ioctl() Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 36/79] smb3: missing inode locks in punch hole Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 37/79] ARM: dts: imx6qdl-kontron-samx6i: remove duplicated node Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 38/79] regulator: core: Clean up on enable failure Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 39/79] tee: fix compiler warning in tee_shm_register() Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 40/79] RDMA/cma: Fix arguments order in net device validation Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 41/79] soc: brcmstb: pm-arm: Fix refcount leak and __iomem leak bugs Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 42/79] RDMA/hns: Fix supported page size Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 43/79] RDMA/hns: Fix wrong fixed value of qp->rq.wqe_shift Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 44/79] ARM: dts: at91: sama5d27_wlsom1: specify proper regulator output ranges Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 45/79] ARM: dts: at91: sama5d2_icp: " Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 46/79] ARM: dts: at91: sama5d27_wlsom1: dont keep ldo2 enabled all the time Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 47/79] ARM: dts: at91: sama5d2_icp: dont keep vdd_other " Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 48/79] netfilter: br_netfilter: Drop dst references before setting Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 49/79] netfilter: nf_tables: clean up hook list when offload flags check fails Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 50/79] netfilter: nf_conntrack_irc: Fix forged IP logic Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 51/79] ALSA: usb-audio: Inform the delayed registration more properly Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 52/79] ALSA: usb-audio: Register card again for iface over delayed_register option Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 53/79] rxrpc: Fix an insufficiently large sglist in rxkad_verify_packet_2() Greg Kroah-Hartman
2022-09-13 14:04 ` [PATCH 5.10 54/79] afs: Use the operation issue time instead of the reply time for callbacks Greg Kroah-Hartman
2022-09-13 14:05 ` [PATCH 5.10 55/79] sch_sfb: Dont assume the skb is still around after enqueueing to child Greg Kroah-Hartman
2022-09-13 14:05 ` [PATCH 5.10 56/79] tipc: fix shift wrapping bug in map_get() Greg Kroah-Hartman
2022-09-13 14:05 ` [PATCH 5.10 57/79] ice: use bitmap_free instead of devm_kfree Greg Kroah-Hartman
2022-09-13 14:05 ` Greg Kroah-Hartman [this message]
2022-09-13 14:05 ` [PATCH 5.10 59/79] net: fec: Use a spinlock to guard `fep->ptp_clk_on` Greg Kroah-Hartman
2022-09-13 15:57   ` Marc Kleine-Budde
2022-09-13 14:05 ` [PATCH 5.10 60/79] xen-netback: only remove hotplug-status when the vif is actually destroyed Greg Kroah-Hartman
2022-09-13 14:05 ` [PATCH 5.10 61/79] RDMA/siw: Pass a pointer to virt_to_page() Greg Kroah-Hartman
2022-09-13 14:05 ` [PATCH 5.10 62/79] ipv6: sr: fix out-of-bounds read when setting HMAC data Greg Kroah-Hartman
2022-09-13 14:05 ` [PATCH 5.10 63/79] IB/core: Fix a nested dead lock as part of ODP flow Greg Kroah-Hartman
2022-09-13 14:05 ` [PATCH 5.10 64/79] RDMA/mlx5: Set local port to one when accessing counters Greg Kroah-Hartman
2022-09-13 14:05 ` [PATCH 5.10 65/79] nvme-tcp: fix UAF when detecting digest errors Greg Kroah-Hartman
2022-09-13 14:05 ` [PATCH 5.10 66/79] nvme-tcp: fix regression that causes sporadic requests to time out Greg Kroah-Hartman
2022-09-13 14:05 ` [PATCH 5.10 67/79] tcp: fix early ETIMEDOUT after spurious non-SACK RTO Greg Kroah-Hartman
2022-09-13 14:05 ` [PATCH 5.10 68/79] sch_sfb: Also store skb len before calling child enqueue Greg Kroah-Hartman
2022-09-13 14:05 ` [PATCH 5.10 69/79] ASoC: mchp-spdiftx: remove references to mchp_i2s_caps Greg Kroah-Hartman
2022-09-13 14:05 ` [PATCH 5.10 70/79] ASoC: mchp-spdiftx: Fix clang -Wbitfield-constant-conversion Greg Kroah-Hartman
2022-09-13 14:05 ` [PATCH 5.10 71/79] MIPS: loongson32: ls1c: Fix hang during startup Greg Kroah-Hartman
2022-09-13 14:05 ` [PATCH 5.10 72/79] swiotlb: avoid potential left shift overflow Greg Kroah-Hartman
2022-09-13 14:05 ` [PATCH 5.10 73/79] iommu/amd: use full 64-bit value in build_completion_wait() Greg Kroah-Hartman
2022-09-13 14:05 ` [PATCH 5.10 74/79] hwmon: (mr75203) fix VM sensor allocation when "intel,vm-map" not defined Greg Kroah-Hartman
2022-09-13 14:05 ` [PATCH 5.10 75/79] hwmon: (mr75203) update pvt->v_num and vm_num to the actual number of used sensors Greg Kroah-Hartman
2022-09-13 14:05 ` [PATCH 5.10 76/79] hwmon: (mr75203) fix voltage equation for negative source input Greg Kroah-Hartman
2022-09-13 14:05 ` [PATCH 5.10 77/79] hwmon: (mr75203) fix multi-channel voltage reading Greg Kroah-Hartman
2022-09-13 14:05 ` [PATCH 5.10 78/79] hwmon: (mr75203) enable polling for all VM channels Greg Kroah-Hartman
2022-09-13 14:05 ` [PATCH 5.10 79/79] arm64: errata: add detection for AMEVCNTR01 incrementing incorrectly Greg Kroah-Hartman
2022-09-14  9:38 ` [PATCH 5.10 00/79] 5.10.143-rc1 review Sudip Mukherjee
2022-09-14  9:51 ` Pavel Machek
2022-09-14 11:04 ` Naresh Kamboju
2022-09-14 15:27 ` Jon Hunter
2022-09-14 20:58 ` Florian Fainelli
2022-09-15  0:13 ` Guenter Roeck
2022-09-15  7:28 ` Rudi Heitbaum
2022-09-17  3:18 ` zhouzhixiu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220913140353.018779173@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=anthony.l.nguyen@intel.com \
    --cc=helena.anna.dubel@intel.com \
    --cc=ivecera@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.