linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Steffen Maier <maier@linux.vnet.ibm.com>,
	Benjamin Block <bblock@linux.vnet.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>
Subject: [PATCH 4.13 30/43] scsi: zfcp: fix erp_action use-before-initialize in REC action trace
Date: Tue, 31 Oct 2017 10:55:50 +0100	[thread overview]
Message-ID: <20171031095531.718767700@linuxfoundation.org> (raw)
In-Reply-To: <20171031095530.520746935@linuxfoundation.org>

4.13-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Steffen Maier <maier@linux.vnet.ibm.com>

commit ab31fd0ce65ec93828b617123792c1bb7c6dcc42 upstream.

v4.10 commit 6f2ce1c6af37 ("scsi: zfcp: fix rport unblock race with LUN
recovery") extended accessing parent pointer fields of struct
zfcp_erp_action for tracing.  If an erp_action has never been enqueued
before, these parent pointer fields are uninitialized and NULL. Examples
are zfcp objects freshly added to the parent object's children list,
before enqueueing their first recovery subsequently. In
zfcp_erp_try_rport_unblock(), we iterate such list. Accessing erp_action
fields can cause a NULL pointer dereference.  Since the kernel can read
from lowcore on s390, it does not immediately cause a kernel page
fault. Instead it can cause hangs on trying to acquire the wrong
erp_action->adapter->dbf->rec_lock in zfcp_dbf_rec_action_lvl()
                      ^bogus^
while holding already other locks with IRQs disabled.

Real life example from attaching lots of LUNs in parallel on many CPUs:

crash> bt 17723
PID: 17723  TASK: ...               CPU: 25  COMMAND: "zfcperp0.0.1800"
 LOWCORE INFO:
  -psw      : 0x0404300180000000 0x000000000038e424
  -function : _raw_spin_lock_wait_flags at 38e424
...
 #0 [fdde8fc90] zfcp_dbf_rec_action_lvl at 3e0004e9862 [zfcp]
 #1 [fdde8fce8] zfcp_erp_try_rport_unblock at 3e0004dfddc [zfcp]
 #2 [fdde8fd38] zfcp_erp_strategy at 3e0004e0234 [zfcp]
 #3 [fdde8fda8] zfcp_erp_thread at 3e0004e0a12 [zfcp]
 #4 [fdde8fe60] kthread at 173550
 #5 [fdde8feb8] kernel_thread_starter at 10add2

zfcp_adapter
 zfcp_port
  zfcp_unit <address>, 0x404040d600000000
  scsi_device NULL, returning early!
zfcp_scsi_dev.status = 0x40000000
0x40000000 ZFCP_STATUS_COMMON_RUNNING

crash> zfcp_unit <address>
struct zfcp_unit {
  erp_action = {
    adapter = 0x0,
    port = 0x0,
    unit = 0x0,
  },
}

zfcp_erp_action is always fully embedded into its container object. Such
container object is never moved in its object tree (only add or delete).
Hence, erp_action parent pointers can never change.

To fix the issue, initialize the erp_action parent pointers before
adding the erp_action container to any list and thus before it becomes
accessible from outside of its initializing function.

In order to also close the time window between zfcp_erp_setup_act()
memsetting the entire erp_action to zero and setting the parent pointers
again, drop the memset and instead explicitly initialize individually
all erp_action fields except for parent pointers. To be extra careful
not to introduce any other unintended side effect, even keep zeroing the
erp_action fields for list and timer. Also double-check with
WARN_ON_ONCE that erp_action parent pointers never change, so we get to
know when we would deviate from previous behavior.

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 6f2ce1c6af37 ("scsi: zfcp: fix rport unblock race with LUN recovery")
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/s390/scsi/zfcp_aux.c  |    5 +++++
 drivers/s390/scsi/zfcp_erp.c  |   18 +++++++++++-------
 drivers/s390/scsi/zfcp_scsi.c |    5 +++++
 3 files changed, 21 insertions(+), 7 deletions(-)

--- a/drivers/s390/scsi/zfcp_aux.c
+++ b/drivers/s390/scsi/zfcp_aux.c
@@ -358,6 +358,8 @@ struct zfcp_adapter *zfcp_adapter_enqueu
 
 	adapter->next_port_scan = jiffies;
 
+	adapter->erp_action.adapter = adapter;
+
 	if (zfcp_qdio_setup(adapter))
 		goto failed;
 
@@ -514,6 +516,9 @@ struct zfcp_port *zfcp_port_enqueue(stru
 	port->dev.groups = zfcp_port_attr_groups;
 	port->dev.release = zfcp_port_release;
 
+	port->erp_action.adapter = adapter;
+	port->erp_action.port = port;
+
 	if (dev_set_name(&port->dev, "0x%016llx", (unsigned long long)wwpn)) {
 		kfree(port);
 		goto err_out;
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -193,9 +193,8 @@ static struct zfcp_erp_action *zfcp_erp_
 		atomic_or(ZFCP_STATUS_COMMON_ERP_INUSE,
 				&zfcp_sdev->status);
 		erp_action = &zfcp_sdev->erp_action;
-		memset(erp_action, 0, sizeof(struct zfcp_erp_action));
-		erp_action->port = port;
-		erp_action->sdev = sdev;
+		WARN_ON_ONCE(erp_action->port != port);
+		WARN_ON_ONCE(erp_action->sdev != sdev);
 		if (!(atomic_read(&zfcp_sdev->status) &
 		      ZFCP_STATUS_COMMON_RUNNING))
 			act_status |= ZFCP_STATUS_ERP_CLOSE_ONLY;
@@ -208,8 +207,8 @@ static struct zfcp_erp_action *zfcp_erp_
 		zfcp_erp_action_dismiss_port(port);
 		atomic_or(ZFCP_STATUS_COMMON_ERP_INUSE, &port->status);
 		erp_action = &port->erp_action;
-		memset(erp_action, 0, sizeof(struct zfcp_erp_action));
-		erp_action->port = port;
+		WARN_ON_ONCE(erp_action->port != port);
+		WARN_ON_ONCE(erp_action->sdev != NULL);
 		if (!(atomic_read(&port->status) & ZFCP_STATUS_COMMON_RUNNING))
 			act_status |= ZFCP_STATUS_ERP_CLOSE_ONLY;
 		break;
@@ -219,7 +218,8 @@ static struct zfcp_erp_action *zfcp_erp_
 		zfcp_erp_action_dismiss_adapter(adapter);
 		atomic_or(ZFCP_STATUS_COMMON_ERP_INUSE, &adapter->status);
 		erp_action = &adapter->erp_action;
-		memset(erp_action, 0, sizeof(struct zfcp_erp_action));
+		WARN_ON_ONCE(erp_action->port != NULL);
+		WARN_ON_ONCE(erp_action->sdev != NULL);
 		if (!(atomic_read(&adapter->status) &
 		      ZFCP_STATUS_COMMON_RUNNING))
 			act_status |= ZFCP_STATUS_ERP_CLOSE_ONLY;
@@ -229,7 +229,11 @@ static struct zfcp_erp_action *zfcp_erp_
 		return NULL;
 	}
 
-	erp_action->adapter = adapter;
+	WARN_ON_ONCE(erp_action->adapter != adapter);
+	memset(&erp_action->list, 0, sizeof(erp_action->list));
+	memset(&erp_action->timer, 0, sizeof(erp_action->timer));
+	erp_action->step = ZFCP_ERP_STEP_UNINITIALIZED;
+	erp_action->fsf_req_id = 0;
 	erp_action->action = need;
 	erp_action->status = act_status;
 
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -115,10 +115,15 @@ static int zfcp_scsi_slave_alloc(struct
 	struct zfcp_unit *unit;
 	int npiv = adapter->connection_features & FSF_FEATURE_NPIV_MODE;
 
+	zfcp_sdev->erp_action.adapter = adapter;
+	zfcp_sdev->erp_action.sdev = sdev;
+
 	port = zfcp_get_port_by_wwpn(adapter, rport->port_name);
 	if (!port)
 		return -ENXIO;
 
+	zfcp_sdev->erp_action.port = port;
+
 	unit = zfcp_unit_find(port, zfcp_scsi_dev_lun(sdev));
 	if (unit)
 		put_device(&unit->dev);

  parent reply	other threads:[~2017-10-31 10:04 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-31  9:55 [PATCH 4.13 00/43] 4.13.11-stable review Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 01/43] workqueue: replace pool->manager_arb mutex with a flag Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 03/43] ALSA: hda/realtek - Add support for ALC236/ALC3204 Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 04/43] ALSA: hda - fix headset mic problem for Dell machines with alc236 Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 05/43] ceph: unlock dangling spinlock in try_flush_caps() Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 06/43] Fix tracing sample code warning Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 07/43] KVM: PPC: Fix oops when checking KVM_CAP_PPC_HTM Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 08/43] KVM: PPC: Book3S HV: POWER9 more doorbell fixes Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 09/43] KVM: PPC: Book3S: Protect kvmppc_gpa_to_ua() with SRCU Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 10/43] s390/kvm: fix detection of guest machine checks Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 11/43] nbd: handle interrupted sendmsg with a sndtimeo set Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 12/43] spi: uapi: spidev: add missing ioctl header Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 13/43] spi: a3700: Return correct value on timeout detection Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 14/43] spi: bcm-qspi: Fix use after free in bcm_qspi_probe() in error path Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 15/43] spi: armada-3700: Fix failing commands with quad-SPI Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 16/43] ovl: add NULL check in ovl_alloc_inode Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 17/43] ovl: fix EIO from lookup of non-indexed upper Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 18/43] ovl: handle ENOENT on index lookup Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 19/43] ovl: do not cleanup unsupported index entries Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 20/43] fuse: fix READDIRPLUS skipping an entry Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 21/43] xen/gntdev: avoid out of bounds access in case of partial gntdev_mmap() Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 22/43] xen: fix booting ballooned down hvm guest Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 23/43] cifs: Select all required crypto modules Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 25/43] Input: elan_i2c - add ELAN0611 to the ACPI table Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 26/43] Input: gtco - fix potential out-of-bound access Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 27/43] Fix encryption labels and lengths for SMB3.1.1 Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 28/43] SMB3: Validate negotiate request must always be signed Greg Kroah-Hartman
2017-10-31 13:02   ` Thomas Backlund
2017-11-01 15:17     ` Greg Kroah-Hartman
2017-11-01 15:18     ` Greg Kroah-Hartman
2018-01-04  2:15       ` Srivatsa S. Bhat
2018-01-18 21:25         ` Srivatsa S. Bhat
2018-01-19 13:23           ` Aurélien Aptel
2018-01-30  3:31             ` Srivatsa S. Bhat
2018-02-27  3:44         ` Srivatsa S. Bhat
2018-02-27  8:54           ` Greg Kroah-Hartman
2018-02-27  9:22             ` Srivatsa S. Bhat
2018-02-27 12:40               ` Greg Kroah-Hartman
2018-02-27 17:45                 ` Srivatsa S. Bhat
2018-02-27 17:55                   ` Steve French
2018-02-27 17:56                   ` Steve French
2018-02-27 18:33                     ` Srivatsa S. Bhat
2018-03-12  2:37                       ` Steve French
2018-03-13  9:21                         ` Greg Kroah-Hartman
2018-03-13 15:21                           ` Steve French
2018-03-16 13:32                             ` Greg Kroah-Hartman
2018-03-16 16:19                               ` Steve French
2018-03-22  2:02                               ` Steve French
2018-03-22  5:12                                 ` Srivatsa S. Bhat
2018-03-22  5:15                                   ` Srivatsa S. Bhat
2018-03-22 10:32                                     ` Greg Kroah-Hartman
2018-03-22 19:14                                   ` Pavel Shilovsky
2018-03-01 20:12                     ` Steve French
2018-03-01 20:51                       ` Srivatsa S. Bhat
2017-10-31  9:55 ` [PATCH 4.13 29/43] assoc_array: Fix a buggy node-splitting case Greg Kroah-Hartman
2017-10-31  9:55 ` Greg Kroah-Hartman [this message]
2017-10-31  9:55 ` [PATCH 4.13 31/43] scsi: aacraid: Fix controller initialization failure Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 32/43] scsi: qla2xxx: Initialize Work element before requesting IRQs Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 33/43] scsi: sg: Re-fix off by one in sg_fill_request_table() Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 34/43] x86/cpu/AMD: Apply the Erratum 688 fix when the BIOS doesnt Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 35/43] drm/amd/powerplay: fix uninitialized variable Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 36/43] drm/i915/perf: fix perf enable/disable ioctls with 32bits userspace Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 37/43] can: sun4i: fix loopback mode Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 38/43] can: kvaser_usb: Correct return value in printout Greg Kroah-Hartman
2017-10-31  9:55 ` [PATCH 4.13 39/43] can: kvaser_usb: Ignore CMD_FLUSH_QUEUE_REPLY messages Greg Kroah-Hartman
2017-10-31  9:56 ` [PATCH 4.13 40/43] cfg80211: fix connect/disconnect edge cases Greg Kroah-Hartman
2017-10-31  9:56 ` [PATCH 4.13 41/43] ipsec: Fix aborted xfrm policy dump crash Greg Kroah-Hartman
2017-10-31  9:56 ` [PATCH 4.13 42/43] regulator: fan53555: fix I2C device ids Greg Kroah-Hartman
2017-10-31 17:20 ` [PATCH 4.13 00/43] 4.13.11-stable review Guenter Roeck
2017-11-01 15:21   ` Greg Kroah-Hartman
2017-10-31 20:04 ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171031095531.718767700@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=bblock@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maier@linux.vnet.ibm.com \
    --cc=martin.petersen@oracle.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).