linux-kernel.vger.kernel.org archive mirror
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: James Smart <jsmart2021@gmail.com>,
	Himanshu Madhani <hmadhani@marvell.com>,
	"Ewan D . Milne" <emilne@redhat.com>,
	Keith Busch <kbusch@kernel.org>, Sasha Levin <sashal@kernel.org>,
	linux-nvme@lists.infradead.org
Subject: [PATCH AUTOSEL 5.4 03/52] nvme-fc: fix double-free scenarios on hw queues
Date: Fri, 20 Dec 2019 09:29:05 -0500
Message-ID: <20191220142954.9500-3-sashal@kernel.org>
In-Reply-To: <20191220142954.9500-1-sashal@kernel.org>

From: James Smart <jsmart2021@gmail.com>

[ Upstream commit c869e494ef8b5846d9ba91f1e922c23cd444f0c1 ]

If one of the ios used to create an association fails, the
create routine's error paths are invoked by the command
failure and free the controller resources created up to
that point.

However, the io failure was ultimately detected by an
asynchronous completion routine, which unconditionally invokes
the error_recovery path, which in turn calls delete_association.
Delete association terminates all outstanding io and then tears
down the controller resources. So the create_association thread
can be running in parallel with the error_recovery thread. What
was seen was that the LLDD received a call to delete a queue,
causing the LLDD to free a resource, and then the transport
called delete queue again, causing the driver to repeat the
free. The second free corrupted the allocator. The transport
shouldn't be making the duplicate call, and the queue is just
one of the resources being freed.

The fix relies on the fact that the create_association path is
completely serialized, with one command outstanding at a time.
So the failed io completion will always be seen by the
create_association path and, as of the failure, there are no ios
to terminate and there is no reason to manipulate queue freeze
states, etc. The serialized condition stays true until the
controller is transitioned to the LIVE state. Thus the fix is to
change the error recovery path to check the controller state and
only invoke the teardown path if the controller is not already in
the CONNECTING state.
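
The effect of the state check can be illustrated with a small
userspace model. This is only a sketch under stated assumptions:
every sketch_* name below is hypothetical, the types are simplified
stand-ins rather than the real struct nvme_fc_ctrl or LLDD
interfaces, and the two paths are run one after the other instead of
in parallel. It counts how often the modeled LLDD is asked to free a
queue when a connect attempt fails.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the controller states. */
enum sketch_state { SKETCH_LIVE, SKETCH_CONNECTING };

struct sketch_ctrl {
	enum sketch_state state;
	bool queue_allocated;	/* models one LLDD hw queue resource */
	int queue_frees;	/* times the LLDD was asked to free it */
};

/* Models the LLDD delete-queue callback: every call performs a free. */
static void sketch_delete_queue(struct sketch_ctrl *c)
{
	c->queue_frees++;
	c->queue_allocated = false;
}

/* Models delete_association: tears down the controller resources. */
static void sketch_delete_association(struct sketch_ctrl *c)
{
	sketch_delete_queue(c);
}

/* Models create_association failing: its error path frees what it made. */
static void sketch_create_association_fail(struct sketch_ctrl *c)
{
	c->queue_allocated = true;	/* queue created */
	sketch_delete_queue(c);		/* command failed, error path frees */
}

/* Error recovery without the check: teardown is unconditional. */
static void sketch_terminate_io_unguarded(struct sketch_ctrl *c)
{
	sketch_delete_association(c);
	c->state = SKETCH_CONNECTING;
}

/* Error recovery with the check: skip teardown while CONNECTING. */
static void sketch_terminate_io_guarded(struct sketch_ctrl *c)
{
	if (c->state != SKETCH_CONNECTING)
		sketch_delete_association(c);
	c->state = SKETCH_CONNECTING;
}

/* Runs the failed-connect scenario and returns the number of frees. */
static int sketch_frees_on_connect_failure(bool guarded)
{
	struct sketch_ctrl c = { .state = SKETCH_CONNECTING };

	sketch_create_association_fail(&c);
	if (guarded)
		sketch_terminate_io_guarded(&c);
	else
		sketch_terminate_io_unguarded(&c);
	return c.queue_frees;
}
```

In this model, sketch_frees_on_connect_failure(false) reports two
frees of the same queue, while sketch_frees_on_connect_failure(true)
reports one, since the guarded path leaves cleanup to the
create_association error path.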

Reviewed-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/nvme/host/fc.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 3f102d9f39b83..59474bd0c728d 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2910,10 +2910,22 @@ nvme_fc_reconnect_or_delete(struct nvme_fc_ctrl *ctrl, int status)
 static void
 __nvme_fc_terminate_io(struct nvme_fc_ctrl *ctrl)
 {
-	nvme_stop_keep_alive(&ctrl->ctrl);
+	/*
+	 * if state is connecting - the error occurred as part of a
+	 * reconnect attempt. The create_association error paths will
+	 * clean up any outstanding io.
+	 *
+	 * if it's a different state - ensure all pending io is
+	 * terminated. Given this can delay while waiting for the
+	 * aborted io to return, we recheck adapter state below
+	 * before changing state.
+	 */
+	if (ctrl->ctrl.state != NVME_CTRL_CONNECTING) {
+		nvme_stop_keep_alive(&ctrl->ctrl);
 
-	/* will block will waiting for io to terminate */
-	nvme_fc_delete_association(ctrl);
+		/* will block will waiting for io to terminate */
+		nvme_fc_delete_association(ctrl);
+	}
 
 	if (ctrl->ctrl.state != NVME_CTRL_CONNECTING &&
 	    !nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING))
-- 
2.20.1


